Pitt CS 3150 - Mercury and Freon

Mercury and Freon: Temperature Emulation and Management for Server Systems∗

Taliver Heath, Dept. of Computer Science, Rutgers [email protected]
Paula Centeno, Dept. of Computer Science, Rutgers [email protected]
George, Dept. of Mechanical Engineering, Rutgers [email protected]
Ramos, Dept. of Computer Science, Rutgers [email protected]
Jaluria, Dept. of Mechanical Engineering, Rutgers [email protected]
Bianchini, Dept. of Computer Science, Rutgers [email protected]

Abstract

Power densities have been increasing rapidly at all levels of server systems. To counter the high temperatures resulting from these densities, systems researchers have recently started work on software-based thermal management. Unfortunately, research in this new area has been hindered by the limitations imposed by simulators and real measurements. In this paper, we introduce Mercury, a software suite that avoids these limitations by accurately emulating temperatures based on simple layout, hardware, and component-utilization data. Most importantly, Mercury runs the entire software stack natively, enables repeatable experiments, and allows the study of thermal emergencies without harming hardware reliability. We validate Mercury using real measurements and a widely used commercial simulator. We use Mercury to develop Freon, a system that manages thermal emergencies in a server cluster without unnecessary performance degradation. Mercury will soon become available from http://www.darklab.rutgers.edu.

Categories and Subject Descriptors: D.4 [Operating systems]: Miscellaneous

General Terms: Design, experimentation

Keywords: Temperature modeling, thermal management, energy conservation, server clusters

1. Introduction

Power densities have been increasing rapidly at all levels of server systems, from individual devices to server enclosures to machine rooms.
For example, modern microprocessors, high-performance disk drives, blade server enclosures, and highly populated computer racks exhibit power densities that have never been seen before. These increasing densities are due to increasingly power-hungry hardware components, decreasing form factors, and tighter packing. High power densities entail high temperatures that now must be countered by substantial cooling infrastructures. In fact, when hundreds, sometimes thousands, of these components are racked close together in machine rooms, appropriate cooling becomes the main concern.

∗This research has been supported by NSF under grant #CCR-0238182 (CAREER award).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ASPLOS'06, October 21–25, 2006, San Jose, California, USA.
Copyright © 2006 ACM 1-59593-451-0/06/0010...$5.00

The reason for this concern is that high temperatures decrease the reliability of the affected components to the point that they start to behave unpredictably or fail altogether. Even when components do not misbehave, operation outside the range of acceptable temperatures causes mean times between failures (MTBFs) to decrease exponentially [1, 7]. Several factors may cause high temperatures: hot spots at the top sections of computer racks, poor design of the cooling infrastructure or air distribution system, failed fans or air conditioners, accidental overload due to hardware upgrades, or degraded operation during brownouts. We refer to these problems as “thermal emergencies”.
Some of these emergencies may go undetected for a long time, generating corresponding losses in reliability and, when components eventually fail, performance.

Recognizing this state of affairs, systems researchers have recently started work on software-based thermal management. Specifically, researchers from Duke University and Hewlett-Packard have examined temperature-aware workload placement policies for data centers, using modeling and a commercial simulator [21]. Another effort has begun investigating temperature-aware disk-scheduling policies, using thermal models and detailed simulations of disk drives [12, 16]. Rohou and Smith have implemented and experimentally evaluated the throttling of activity-intensive tasks to control processor temperature [26]. Taking a different approach, Weissel and Bellosa have studied the throttling of energy-intensive tasks to control processor temperature in multi-tier services [32].

Despite these early initiatives, the infrastructure for software-based thermal management research severely hampers new efforts. In particular, both real temperature experiments and temperature simulators have several deficiencies. Working with real systems requires heavy instrumentation with (internal or external) sensors collecting temperature information for every hardware component and air space of interest. Hardware sensors with low resolution and poor precision make matters worse. Furthermore, the environment where the experiments take place needs to be isolated from unrelated computations or even trivial “external thermal disruptions”, such as somebody opening the door and walking into the machine room. Under these conditions, it is very difficult to produce repeatable experiments. Worst of all, real experiments are inappropriate for studying thermal emergencies.
The reason is that repeatedly inducing emergencies to exercise some piece of thermal management code may significantly decrease the reliability of the hardware.

In contrast, temperature simulators do not require instrumentation or environment isolation. Further, several production-quality simulators are available commercially, e.g., Fluent [9]. Unfortunately, these simulators are typically expensive and may take several hours to days to simulate a realistic system. Worst of all, these simulators are not capable of executing applications or any type of systems software; they typically compute steady-state temperatures based on a fixed power consumption for each hardware component. Other simulators, such as HotSpot [30], do execute applications (bypassing the systems software) but only model the processor, rather than the entire system.

To counter these problems, we introduce

