Temperature-Aware Workload Scheduling for Optimal Energy

Project

Temperature-Aware Workload Scheduling for Optimal Energy

Principal Investigator

Assistant Professor Ayse Coskun

Boston University

Oracle Fellowship Recipients

Can Hankendi, Marina Zapater, Tiansheng Zhang

Oracle Principal Investigators

Kalyan Vaidyanathan
Kenny Gross

Summary

In Phase 1, working with Boston University collaborators Prof. Ayse Coskun and three of her students, we demonstrated that intelligent run-time management enables Oracle multi-core chip systems to provide more stable thermal dynamics (which will benefit long-term system reliability) while simultaneously saving energy. Specifically, we demonstrated 9% energy savings in Oracle T3 servers that are already in the field. We expect the energy savings for new T5 and post-T5 servers to be at least 20%, because the leakage component in the newer chips is much greater (we will measure this on prototype T5 systems during Phase 2). For end customers the electricity savings are even greater, because energy saved inside the IT assets reduces the burden on the data center HVAC systems: every kilowatt saved inside the IT assets saves approximately two kilowatts for the end customer.

For Phase 1, concept feasibility was established using empirically measured leakage power and fan motor power functions in the form of a lookup table (LUT). The LUT approach works well for one server in a data center with a fixed ambient temperature and a fixed altitude. However, Oracle servers are manufactured for data centers across a range of ambient temperatures (15-35 °C) and a range of altitudes (sea level to 10,000 feet). To maximize value across that range, in Phase 2 we will develop and demonstrate a multivariate optimal control formalism that minimizes CPU leakage for servers at any allowable ambient temperature and any allowable altitude.
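The Phase 1 lookup-table idea can be illustrated with a minimal Python sketch. It assumes a table that maps each fan speed to a measured fan motor power and CPU leakage power, and it picks the speed with the lowest combined power. The table values, the names `LUT`, `best_fan_speed`, and `facility_savings_kw`, and the table layout are all hypothetical. They are not the project's measurements or implementation. The only figure taken from the summary is the approximate factor of two between IT-side and customer-side savings.

```python
# Hypothetical lookup table for ONE server at a fixed ambient temperature
# and altitude: fan speed (RPM) -> (fan motor power in W, leakage power in W).
# All numbers are invented for illustration; they are not measurements.
LUT = {
    1800: (10.0, 60.0),
    2400: (18.0, 48.0),
    3000: (30.0, 41.0),
    3600: (48.0, 37.0),
    4200: (72.0, 35.0),
}


def best_fan_speed(lut):
    """Return the fan speed whose fan power plus leakage power is smallest."""
    return min(lut, key=lambda rpm: lut[rpm][0] + lut[rpm][1])


def facility_savings_kw(it_savings_kw, multiplier=2.0):
    """Scale IT-side savings to customer-side savings.

    The default multiplier reflects the summary's rule of thumb that each
    kilowatt saved inside the IT assets saves about two for the customer.
    """
    return it_savings_kw * multiplier


if __name__ == "__main__":
    rpm = best_fan_speed(LUT)
    fan_w, leak_w = LUT[rpm]
    print(f"Chosen fan speed: {rpm} RPM ({fan_w + leak_w:.1f} W combined)")
    print(f"1 kW saved in IT ~ {facility_savings_kw(1.0):.1f} kW for the customer")
```

In this toy table, running the fans faster lowers leakage but raises fan power, so the minimum combined power falls at an intermediate speed. A table like this is only valid for the conditions under which it was measured, which is why a single LUT does not carry over to other ambient temperatures or altitudes.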