Temperature-Aware Microarchitecture

Kevin Skadron, Mircea R. Stan, Wei Huang, Sivakumar Velusamy, Karthik Sankaranarayanan, and David Tarjan
Dept. of Computer Science and Dept. of Electrical and Computer Engineering
University of Virginia, Charlottesville, VA
{skadron,siva,karthick,dtarjan}@cs.virginia.edu, {mircea,wh6p}@virginia.edu

Abstract

With power density and hence cooling costs rising exponentially, processor packaging can no longer be designed for the worst case, and there is an urgent need for runtime processor-level techniques that can regulate operating temperature when the package's capacity is exceeded. Evaluating such techniques, however, requires a thermal model that is practical for architectural studies. This paper describes HotSpot, an accurate yet fast model based on an equivalent circuit of thermal resistances and capacitances that correspond to microarchitecture blocks and essential aspects of the thermal package. Validation was performed using finite-element simulation. The paper also introduces several effective methods for dynamic thermal management (DTM): "temperature-tracking" frequency scaling, localized toggling, and migrating computation to spare hardware units. Modeling temperature at the microarchitecture level also shows that power metrics are poor predictors of temperature, and that sensor imprecision has a substantial impact on the performance of DTM.

1 Introduction

In recent years, power density in microprocessors has doubled every three years [3, 17], and this rate is expected to increase within one to two generations as feature sizes and frequencies scale faster than operating voltages [25]. Because energy consumed by the microprocessor is converted into heat, the corresponding exponential rise in heat density is creating vast difficulties in reliability and manufacturing costs. At any power-dissipation level, the heat being generated must be removed from the surface of the microprocessor die, and for all but the lowest-power designs today, these cooling solutions have become expensive.
For high-performance processors, the cost of cooling solutions is rising at $1-3 or more per watt of heat dissipated [3, 12], meaning that cooling costs are rising exponentially and threaten the computer industry's ability to deploy new systems. Power-aware design alone has failed to stem this tide, requiring temperature-aware design at all system levels, including the processor architecture.

Temperature-aware design will make use of power-management techniques, but probably in ways that are different from those used to improve battery life or regulate peak power. Localized heating occurs much faster than chip-wide heating; since power dissipation is spatially non-uniform across the chip, this leads to "hot spots" and spatial gradients that can cause timing errors or even physical damage. These effects evolve over time scales of hundreds of microseconds or milliseconds. This means that power-management techniques, in order to be used for thermal management, must directly target the spatial and temporal behavior of operating temperature. In fact, many low-power techniques have little or no effect on operating temperature, because they do not reduce power density in hot spots, or because they only reclaim slack and do not reduce power and temperature when no slack is present. Temperature-aware design is therefore a distinct, albeit related, area of study.

(Footnote: This work was conducted while David Tarjan visited U.Va. during his diploma program at the Swiss Federal Institute of Technology Zürich.)

Temperature-specific design techniques to date have mostly focused on the thermal package (heat sink, fan, etc.). If the package is designed for worst-case power dissipation, it must be designed for the most severe hot spot that could arise, which is prohibitively expensive. Yet these worst-case scenarios are rare: the majority of applications, especially for the desktop, do not induce sufficient power dissipation to produce the worst-case temperatures. A package designed for the worst case is therefore excessive. To reduce packaging cost
without unnecessarily limiting performance, it has been suggested [4, 12, 13] that the package should be designed for the worst typical application. Any applications that dissipate more heat than this cheaper package can manage should engage an alternative, runtime thermal-management technique (dynamic thermal management, or DTM). Since typical high-power applications still operate 20% or more below the worst case [12], this can lead to dramatic savings. This is the philosophy behind the thermal design of the Intel Pentium 4 [12]. It uses a thermal package designed for a typical high-power application, reducing the package's cooling requirement by 20% and its cost accordingly. Should operating temperature ever exceed a safe temperature, the clock is stopped (we refer to this as global clock gating) until the temperature returns to a safe zone. This protects against both timing errors and physical damage that might result from sustained high-power operation, from operation at higher-than-expected ambient temperatures, or from some failure in the package. As long as the threshold temperature that stops the clock (the trigger threshold) is based on the hottest temperature in the system, this approach successfully regulates temperature.

The Need for Architecture-Level Thermal Management. These chip-level hardware techniques illustrate both the benefits and challenges of runtime thermal management: while they can substantially reduce cooling costs and still allow typical applications to run at peak performance, these techniques also reduce performance for any applications that exceed the thermal design point. Such performance losses can be substantial with chip-wide techniques like global clock gating, with a 27% slowdown for our hottest application, art.

Instead of using chip-level thermal-management techniques, we argue that the microarchitecture has an essential role to play. The microarchitecture is unique in its ability to use runtime knowledge of application behavior and the current thermal status of
different units of the chip to adjust execution and distribute the workload in order to control thermal behavior. In this paper, we show that architecture-level thermal modeling exposes architectural techniques that regulate temperature with lower performance cost than chip-wide techniques by exploiting instruction-level parallelism (ILP). For example, one of the best techniques we found, with only an 8% slowdown, was a "local toggling" scheme that varies the rate at which only the hot unit, typically the integer
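
As an illustration of the trigger-threshold mechanism described in the introduction, the following is a minimal sketch and not the HotSpot model itself. It treats the whole chip as a single thermal resistance and capacitance and stops the clock whenever the simulated temperature reaches the trigger threshold. The function name `simulate_clock_gating` and all parameter values (0.8 K/W, 0.05 J/K, 60 W, 45 °C ambient, 85 °C trigger) are hypothetical, chosen only for illustration, and the sketch assumes that a gated chip dissipates no power.

```python
import math


def simulate_clock_gating(power_w, r_th, c_th, t_amb, t_trigger, dt, steps):
    """Forward-Euler simulation of one thermal RC node with global clock gating.

    power_w   -- power dissipated while the clock runs (W), assumed constant
    r_th      -- lumped thermal resistance to ambient (K/W), hypothetical value
    c_th      -- lumped thermal capacitance (J/K), hypothetical value
    t_amb     -- ambient temperature (deg C)
    t_trigger -- temperature at or above which the clock is stopped (deg C)
    dt        -- time step (s); should be much smaller than r_th * c_th
    steps     -- number of time steps to simulate

    Returns (temperatures, duty_cycle), where duty_cycle is the fraction of
    steps during which the clock was running.
    """
    temp = t_amb
    temps = []
    active_steps = 0
    for _ in range(steps):
        gated = temp >= t_trigger
        # Assumption: a gated chip dissipates no power at all.
        p = 0.0 if gated else power_w
        if not gated:
            active_steps += 1
        # Lumped heat balance: C * dT/dt = P - (T - T_amb) / R
        temp += dt * (p - (temp - t_amb) / r_th) / c_th
        temps.append(temp)
    return temps, active_steps / steps


if __name__ == "__main__":
    params = dict(power_w=60.0, r_th=0.8, c_th=0.05, t_amb=45.0,
                  dt=1e-4, steps=20000)
    free_temps, _ = simulate_clock_gating(t_trigger=math.inf, **params)
    dtm_temps, duty = simulate_clock_gating(t_trigger=85.0, **params)
    print(f"peak temperature without gating: {max(free_temps):.2f} C")
    print(f"peak temperature with gating:    {max(dtm_temps):.2f} C")
    print(f"fraction of time clock running:  {duty:.3f}")
```

Because it lumps the chip into one node, this sketch cannot show the localized hot spots that motivate a per-block model. It only illustrates how a threshold-triggered clock stop trades execution time for a bounded temperature: without gating the node settles at the ambient temperature plus power times thermal resistance, while with gating the temperature stays near the trigger and the clock runs only part of the time.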