The MIT Angstrom Project: Universal Technologies for Exascale Computing

The MIT-led Angstrom team will rethink computing and create a fundamentally new computing architecture to meet the challenges of extreme-scale computing. Project Angstrom’s goal is to create the fundamental technologies necessary for extreme-scale computers. Extreme-scale computers face several major challenges, the four most difficult being energy efficiency, scalability, programmability, and dependability. We address these challenges through work ranging from basic hardware/software research to chip and system fabrication, with a team that includes MIT's CSAIL, MTL, RLE, and MPhC labs; industry partners Freescale Semiconductor, Mercury Federal Systems, and Lockheed ATL; and the University of Maryland Department of ECE. Angstrom is funded by the DARPA UHPC (Ubiquitous High-Performance Computing) program.

Project Angstrom’s vision for addressing the four major challenges of extreme-scale computing rests on two key foundations: a revolutionary SElf-awarE Computational model called SEEC, and a fully distributed factored architecture for both hardware and software.
 
SElf-awarE Computational model (SEEC)
SEEC is a goal-oriented computational model that radically increases developer productivity by abstracting traditional procedural programming into goals (e.g., “achieve the best possible chess move within 10 s while burning less than 20 W”) that are actuated in our self-aware, factored system. SEEC will also enable systems that are orders of magnitude more energy efficient and dependable by explicitly incorporating energy and resiliency goals into the hardware, operating system, compiler, and languages. A major goal of our research is to create and evaluate algorithms and interfaces for SEEC using methods based on machine learning and control theory.
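
The goal-oriented idea above can be illustrated with a minimal sketch. It is not the SEEC interface: the names `Goal`, `Config`, `choose_config`, and `control_loop` are hypothetical, and the simple observe-decide-act loop stands in for the machine-learning and control-theoretic methods the project proposes. The application states a performance target and a power cap, and the runtime picks the lowest-power configuration that meets the target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One hypothetical system configuration (e.g. a voltage/clock setting)."""
    name: str
    speedup: float   # performance relative to the baseline configuration
    power_w: float   # power drawn in this configuration, in watts


@dataclass(frozen=True)
class Goal:
    """What the application asks for, instead of how to achieve it."""
    target_rate: float   # desired work units per second
    power_cap_w: float   # power budget in watts


def choose_config(goal, configs, base_rate):
    """Return the lowest-power config that meets the rate goal under the cap.

    If no config within the power cap meets the rate goal, fall back to the
    fastest config that still respects the cap.
    """
    allowed = [c for c in configs if c.power_w <= goal.power_cap_w]
    if not allowed:
        raise ValueError("no configuration satisfies the power cap")
    meeting = [c for c in allowed if base_rate * c.speedup >= goal.target_rate]
    if meeting:
        return min(meeting, key=lambda c: c.power_w)
    return max(allowed, key=lambda c: c.speedup)


def control_loop(goal, configs, measure_rate, steps=5):
    """Observe the achieved rate, re-estimate the baseline, and re-decide."""
    current = min(configs, key=lambda c: c.power_w)  # start in the cheapest config
    history = []
    for _ in range(steps):
        observed = measure_rate(current)             # observe
        base_estimate = observed / current.speedup   # estimate baseline rate
        current = choose_config(goal, configs, base_estimate)  # decide and act
        history.append(current.name)
    return history


configs = [
    Config("low", speedup=1.0, power_w=5.0),
    Config("mid", speedup=2.0, power_w=12.0),
    Config("high", speedup=4.0, power_w=30.0),
]
goal = Goal(target_rate=15.0, power_cap_w=20.0)
# Simulated application: 10 work units per second at baseline speed.
print(control_loop(goal, configs, lambda c: 10.0 * c.speedup))
```

In this toy setup the loop settles on the "mid" configuration: "low" misses the rate target and "high" exceeds the power cap. A real system would measure the rate from the running application rather than from a simulated function.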
 
Distributed Factored Architecture
Our factored approach will result in energy-efficient multicores that scale to thousands of cores. For example, distributed power converters will scale because they eliminate centralized control bottlenecks and allow fine-grained voltage control; they also facilitate SEEC, which demands individual control of the voltage, clock, and body bias of each core. Similarly, our factored software will be radically more resilient and scalable to meet the demands of billions of threads. For example, our revolutionary SElf-aware Factored OS (SEFOS) will factor OS functions into services (e.g., a scheduling service or a fault tolerance service) that are each implemented by a dynamic fleet of cooperating servers. Accordingly, our second major goal is to invent fully distributed architectural mechanisms and factored software approaches.
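
As a rough illustration of an OS service backed by a dynamic fleet of servers, the sketch below routes requests to servers by hashing a request key and resizes the fleet to match load. Everything here is an assumption made for illustration: the `ServiceFleet` class, its `route` and `rebalance` methods, and the per-server load threshold are hypothetical and are not taken from SEFOS.

```python
import zlib


class ServiceFleet:
    """Hypothetical fleet of servers that together implement one OS service."""

    def __init__(self, service_name, initial_servers=1, max_load_per_server=100):
        if initial_servers < 1:
            raise ValueError("a fleet needs at least one server")
        self.service_name = service_name
        self.num_servers = initial_servers
        self.max_load_per_server = max_load_per_server

    def route(self, key):
        """Map a request key to a server index with a deterministic hash.

        No central dispatcher is consulted: any client can compute the
        target server locally from the key and the current fleet size.
        """
        return zlib.crc32(key.encode("utf-8")) % self.num_servers

    def rebalance(self, requests_per_interval):
        """Grow or shrink the fleet so no server exceeds its load threshold."""
        # Ceiling division, with a floor of one server.
        needed = -(-requests_per_interval // self.max_load_per_server)
        self.num_servers = max(1, needed)
        return self.num_servers


fleet = ServiceFleet("scheduling", initial_servers=2)
server = fleet.route("thread-42")   # index of the server handling this thread
size = fleet.rebalance(950)         # fleet grows to 10 servers for 950 requests
```

Modulo hashing keeps the sketch short, but it remaps most keys whenever the fleet size changes; a scheme such as consistent hashing would reduce that churn.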
SEEC and factoring are the two overarching themes of Project Angstrom. These two concepts, instantiated in several novel mechanisms, will provide solutions to the four major extreme-scale challenges.