Abstract: Computing has dramatically changed nearly every aspect of our lives, from business and agriculture to communication and entertainment. As a nation, we rely on computing in the design of systems for energy, transportation and defense, and computing fuels scientific discoveries that will improve our fundamental understanding of the world and help develop solutions to major challenges in health and the environment. Computing has changed our world, in part, because our innovations can run on computers whose performance and cost-performance have improved a million-fold over the last few decades. A driving force behind this has been a repeated doubling of the transistors per chip, dubbed Moore's Law. A concomitant enabler has been Dennard scaling, which has permitted these performance doublings at roughly constant power, but, as we will see, both trends face challenges. Consider for a moment the impact of these two trends over the past 30 years. A 1980s supercomputer (e.g., a Cray 2) was rated at nearly 2 Gflops and consumed nearly 200 kW of power. At the time, it was used for high-performance and national-scale applications ranging from weather forecasting to nuclear weapons research. A computer of similar performance now fits in our pocket and consumes less than 10 watts. What would be the implications of a similar computing/power reduction over the next 30 years: that is, taking a petaflop-scale machine (e.g., the Cray XK7, which requires about 500 kW for 1 Pflop (= 10^15 operations/sec) performance) and repeating that process? What is possible with such a computer in your pocket? How would it change the landscape of high-capacity computing? In the remainder of this paper, we articulate some opportunities and challenges for dramatic performance improvements in computing from the personal to the national scale, and discuss some "out of the box" possibilities for achieving computing at this scale.
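
The arithmetic behind the abstract's question can be made explicit with a rough back-of-the-envelope sketch. It uses only the round figures quoted above and treats "less than 10 watts" as exactly 10 W, so the resulting reduction factor is a lower bound. The constant and function names are illustrative assumptions, and the projected figure is a naive extrapolation, not a result claimed by the authors.

```python
# Round figures quoted in the abstract; names are illustrative only.
CRAY2_WATTS = 200e3    # Cray 2: nearly 200 kW for nearly 2 Gflops
POCKET_WATTS = 10.0    # pocket device: "less than 10 watts", taken as 10 W (assumption)
XK7_WATTS = 500e3      # Cray XK7: about 500 kW for 1 Pflop


def power_reduction(old_watts: float, new_watts: float) -> float:
    """Factor by which power fell at roughly constant performance."""
    return old_watts / new_watts


def projected_watts(current_watts: float, reduction: float) -> float:
    """Power needed if the same reduction factor were repeated."""
    return current_watts / reduction


# Past 30 years: ~2 Gflops moved from ~200 kW to ~10 W.
reduction = power_reduction(CRAY2_WATTS, POCKET_WATTS)

# Next 30 years (hypothetical): apply the same factor to a 1 Pflop machine.
future_watts = projected_watts(XK7_WATTS, reduction)

print(f"Power reduction over the past 30 years: {reduction:,.0f}x")
print(f"Projected power for 1 Pflop after a repeat: {future_watts:.0f} W")
```

Under these assumptions the reduction factor is about 20,000, which would put a petaflop-scale machine at a few tens of watts, the scale of a handheld or laptop power budget.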

10-millisecond Computing

An Overview of Computing-in-Memory Interfaces

A Dozen Essential Issues of Computing for the Masses

Latency Optimization for Resource Allocation in Mobile-Edge Computation Offloading

BlueJay: A Platform to Quantifying the Impact of Memory Latency on Datacenter Application Performance

Opportunities and Challenges for Next Generation Computing

Delay-Optimal Computation Offloading for Computation-Constrained Mobile Edge Networks

In-memory Computing to Break the Memory Wall

Timeliness of Information for Computation-Intensive Status Updates in Task-Oriented Communications

NO2: Speeding Up Parallel Processing of Massive Compute-Intensive Tasks

A Low-Latency Computing Framework for Time-Evolving Graphs

Workload Behavior Driven Memory Subsystem Design for Hyperscale

X10-ft

Towards Real-Time Inference Offloading with Distributed Edge Computing: the Framework and Algorithms

Scaling OLTP Applications on Commodity Multi-Core Platforms

10 Years Later: Cloud Computing is Closing the Performance Gap

Age of Computing: A Metric of Computation Freshness in Communication and Computation Cooperative Networks

Finally, how many efficiencies supercomputers have? And, what do they measure?

An empirical study of latency in an emerging class of edge computing applications for wearable cognitive assistance

Optimizing the Response Time of Memcached Systems Via Model and Quantitative Analysis