We conduct research in computer engineering, broadly defined. We design new hardware architectures for emerging platforms, from datacenters to mobile devices. We also propose new hardware management strategies that ensure service quality for diverse users in complex systems. In both design and management, we navigate fundamental relationships between performance, energy efficiency, and fairness.

1 Apr 20

IEEE Spectrum publishes our perspective on game theory for datacenter management. A great article for a high-level introduction.

30 Mar 20

Pengfei Zheng defends his PhD dissertation on machine learning for datacenter operations. Congratulations!

23 Mar 20

Calvin Ma presents his thesis, on time series analysis for straggler prediction, for graduation with distinction in computer science. Well done!

8 Mar 20

Our SOSP 2009 paper, "Better I/O through byte-addressable persistent memory," receives the Persistent Impact Prize for its exceptional impact on non-volatile memory research.


Computer Systems and Machine Learning

We adapt and invent methods in statistical machine learning to understand and optimize distributed systems. We focus on interpretable frameworks such as causal inference and natural-language processing. We also focus on dynamic frameworks such as reinforcement learning.

Computer Systems and Economics

With the democratization of cloud computing, diverse users demand computation from complex datacenters. In this setting, we study mechanisms for hardware allocation and scheduling. Our interdisciplinary research spans computer architecture, economic mechanism design, and game theory. First, we examine welfare maximization and markets in which autonomic agents bid for hardware on behalf of users. Second, we investigate fairness and algorithms that equitably divide hardware among strategic agents. Finally, we navigate tensions between welfare and fairness.
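As a rough illustration of a market in which agents bid for hardware, the sketch below uses proportional sharing, a well-known textbook mechanism chosen here only as an example. The source does not say which mechanisms are studied, and the function name, agent names, and bid values are all hypothetical.

```python
from typing import Dict


def proportional_share(bids: Dict[str, float], capacity: float) -> Dict[str, float]:
    """Divide `capacity` units of a resource in proportion to each agent's bid.

    Illustrative assumption: this is the classic proportional-share rule,
    not necessarily a mechanism used in the research described above.
    """
    if any(bid < 0 for bid in bids.values()):
        raise ValueError("bids must be non-negative")
    total = sum(bids.values())
    if total <= 0:
        raise ValueError("at least one bid must be positive")
    return {agent: capacity * bid / total for agent, bid in bids.items()}


if __name__ == "__main__":
    # Hypothetical agents bidding on behalf of users for 16 cores.
    allocation = proportional_share({"agent_a": 3.0, "agent_b": 1.0}, capacity=16)
    for agent, cores in allocation.items():
        print(f"{agent}: {cores} cores")
```

Under this rule an agent that bids more receives more hardware, which favors welfare. Giving every agent the same budget is one simple way to move the outcome toward fairness.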

Datacenter Design for Efficiency

To keep pace with big data computing, datacenters must provide more capability within today’s power budgets. Toward this goal, we architect servers using hardware originally intended for mobile systems. For some datacenter applications, mobile-class processors and memories are suitable and far more energy-efficient than their server-class counterparts. For others, heterogeneous datacenters, with a mix of server- and mobile-class hardware, mitigate latency penalties and ensure service quality.

Specialized and Adaptive Architecture

With the end of Dennard scaling, Moore’s Law provides more transistors but increases power densities. Moreover, Amdahl’s Law says that a multi-core strategy alone is insufficient. We study hardware specialization and its benefits for energy efficiency. Specialization tailors resources to application requirements, whether through heterogeneous processors, adaptive microarchitectures, or application-specific accelerators. We study automated design space exploration to reduce the non-recurring engineering costs of specialization. We also study policies and mechanisms for managing adaptive microarchitectures.

  • Sources of inefficiency in general-purpose design [ISCA'10]

  • Efficiency from adaptive microarchitectures [ASPLOS'08]

  • Efficiency from heterogeneous microarchitectures [HPCA'07]
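To make the Amdahl's Law argument in this section concrete, here is a minimal sketch of the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction of a workload and n is the core count. The function name and the example value p = 0.95 are illustrative assumptions, not figures from our work.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup predicted by Amdahl's Law when running on `cores` cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is not accelerated; the parallel part divides across cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload that is 95% parallelizable.
    for cores in (1, 4, 16, 64, 1024):
        print(f"{cores:5d} cores -> {amdahl_speedup(0.95, cores):.2f}x")
```

With p = 0.95, the speedup can never exceed 1 / 0.05 = 20x no matter how many cores are added, which is why adding cores alone is insufficient.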

Scalable Technology

We coordinate architecture and circuit design and identify new system capabilities enabled by emerging technologies. We study phase change memory (PCM), which relies on programmable resistance to provide qualitatively better scaling trajectories than today’s DRAM. We architect PCM on the memory bus to expose its fast non-volatility. Our architectures increase capacity and reduce power yet offer performance that is competitive with existing DRAM-based systems. Our research spans the hardware-software interface, from links to file systems.

Design Methodology

We apply statistical inference to capture broad relationships within parameter spaces for microarchitectures and multiprocessors, providing new answers to previously intractable questions. Inference defines a design space, simulates sparsely sampled designs, and derives predictive models to act as surrogates for more expensive architectural simulators. Moreover, inference is applicable across the hardware-software interface, whether estimating the impact of process variations or estimating the performance of parallel applications.
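The three steps above (define a design space, simulate a sparse sample, derive a surrogate model) can be sketched as follows. This is a toy under stated assumptions: `simulate` is a hypothetical stand-in for an expensive simulator, the two design parameters and their ranges are invented, and the linear least-squares model is chosen only so the result is easy to check. The source does not specify the model class.

```python
import itertools
import random


def simulate(width: int, cache_kb: int) -> float:
    """Hypothetical stand-in for a slow architectural simulator (toy formula)."""
    return 0.4 + 0.15 * width + 0.002 * cache_kb


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_surrogate(samples):
    """Fit y ~ w0 + w1*width + w2*cache_kb by ordinary least squares."""
    rows = [[1.0, float(w), float(c)] for (w, c), _ in samples]
    ys = [y for _, y in samples]
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    w = solve(xtx, xty)
    return lambda width, cache_kb: w[0] + w[1] * width + w[2] * cache_kb


# 1. Define the design space (hypothetical parameters and ranges).
design_space = list(itertools.product([1, 2, 4, 8], [32, 64, 128, 256, 512]))

# 2. Simulate only a sparse random sample of the 20 designs.
rng = random.Random(0)
sampled = rng.sample(design_space, 8)
training = [(design, simulate(*design)) for design in sampled]

# 3. Derive a predictive model and query it instead of the simulator.
predict = fit_surrogate(training)

if __name__ == "__main__":
    for design in design_space:
        print(design, round(predict(*design), 3))
```

Because the toy simulator is exactly linear, the surrogate reproduces it across the whole space from 8 of 20 designs. Real performance responses are nonlinear, so a practical surrogate would need a richer model and validation on held-out designs.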

High-Performance Software

Performance is often correlated with energy efficiency. For example, locality-enhancing optimizations reduce not only execution time but also communication energy. We characterize the determinants of performance for datacenter and supercomputing software, and this characterization drives our research in performance optimization. We study software runtimes that mitigate communication costs. We also study heuristics that automatically tune algorithms, data structures, and code to reflect evolving compiler and hardware technologies.

Technology Policy

The increasing centralization of compute resources suggests that environmental effects from IT infrastructure are most effectively monitored and optimized in large-scale datacenters. Within datacenters, new hardware architectures can provide energy efficiency. Equally important is validating claims of net environmental benefits from adopting digital practices and forgoing conventional ones, which requires understanding substitution effects. Our current research in digital sustainability links fundamental technology, business management, and public policy.