Conference Publication

A Many-core Architecture for In-Memory Data Processing
March 2017

We live in an information age, with data and analytics guiding a large portion of our daily decisions. Data is being generated at a tremendous pace from connected cars, connected homes and connected workplaces, and extracting useful knowledge from this data is quickly becoming an impractical task. Single-threaded performance has saturated in the last decade, and there is a growing need for custom solutions that keep pace with these workloads in a scalable and efficient manner. A large portion of the power consumed by analytics workloads goes to bringing data to the processing cores, and we aim to optimize that. We present the Database Processing Unit, or DPU, a shared-memory many-core that is specifically designed for in-memory analytics workloads. The DPU contains a unique Data Movement System (DMS), which provides hardware acceleration for data movement and preprocessing operations. The DPU also provides acceleration for core-to-core communication via a unique hardware RPC mechanism called the Atomic Transaction Engine, or ATE. Comparison of a fabricated DPU chip with a variety of state-of-the-art x86 applications shows a performance/Watt advantage of 3x to 16x.

Authors: Sandeep Agrawal, Sam Idicula, Arun Raghavan, Evangelos Vlachos, Venkat Govindaraju, Venkatanathan Varadarajan, Cagri Balkesen, Georgios Giannikis, Erik Schlanger, Charlie Roth, Nipun Agarwal, Eric Sedlar

Venue: The 50th Annual IEEE/ACM International Symposium on Microarchitecture

