MULTIPROCESSORS ON A CHIP
Leon Gu, Dipti Motiani
15-740: Computer Architecture, Fall 2003

Papers
• K. Sankaralingam, R. Nagarajan, H. Liu, C. Kim, J. Huh, D. Burger, S. W. Keckler, C. R. Moore, "Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture," in ISCA, 2003.
• Paramjit S. Oberoi and Gurindar S. Sohi, "Out-of-Order Instruction Fetch using Multiple Sequencers," The 2002 International Conference on Parallel Processing (ICPP-31), Aug. 18-21, 2002.

Paper 1: Motivation
• Increasingly specialized architectures
  ○ Processor fragility
• On-chip communication latencies
• Choose processor granularity
  ○ Different types of parallelism
    • Instruction
    • Thread
    • Data (Streaming)

Processor Granularity
[Figure labels: TLP, ILP; Logically, Physically; Polymorphous…?]

Polymorphous Architecture

TRIPS Architecture

Polymorphous Resources
• Frame Space
  ○ Manage reservation stations
• Register File Banks
  ○ Extra registers used in different ways
• Block Sequencing Controls
  ○ Policies to allocate processor to blocks
• Memory Tiles
  ○ Tiles closer to ALUs provide special high-bandwidth memory

Modes of Execution
[Table: columns S-Morph, D-Morph, T-Morph; rows Frames, Registers, Block Control, Memory Tiles]

Discussion
• Granularity
• Stress on compiler and OS
• When and how to initiate reconfiguration
• Propose to build by 2005…

Paper 2: Motivation…
• Previous work
  ○ Fetching multiple discontinuous I-cache lines
  ○ Trace caches
• Instruction parallelism in traces
  ○ Only a small fraction are executed immediately
  ○ Parallelism between several traces
• Applications require fetching multiple threads

Multiple Sequencers
• Fetch contiguous instructions from multiple points in a program
• Multiple trace-granularity sequencers
  ○ Fetch bandwidth of a trace cache
  ○ Storage efficiency of an instruction cache

Design Details
• Trace selection
  ○ A trace is terminated at a call, return, or indirect branch, or when it grows too long
• Returns and indirect branches
  ○ Return address stack (RAS)
• Trace prediction
  ○ Hash function of trace identifier
• Out-of-order renaming

Trace Reuse
[Figure: instructions fetched, normalized w.r.t. instructions executed]

Sequencer Width
Too many sequencers lead to incorrect predictions and hence a loss in performance.

Scaling
Multiple sequencers (MS) are more tolerant of cache misses.

Discussion
• Trace cache vs. multiple sequencers
  ○ Performance
  ○ Storage efficiency
  ○ Implementation
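
The trace selection rule and the return address stack from the Design Details slide can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `Instr` record, the instruction kind strings, the 16-instruction length cap (the slides only say "too long"), and the RAS depth and overflow policy.

```python
from dataclasses import dataclass

# Hypothetical instruction kinds; the slides name only the three terminators.
TERMINATORS = {"call", "return", "indirect_branch"}
MAX_TRACE_LEN = 16  # assumed cap; the slides only say traces can be "too long"


@dataclass
class Instr:
    pc: int
    kind: str  # e.g. "alu", "load", "branch", "call", "return", "indirect_branch"


def select_traces(instrs, max_len=MAX_TRACE_LEN):
    """Split a dynamic instruction stream into traces.

    A trace ends right after a call, return, or indirect branch,
    or as soon as it holds max_len instructions.
    """
    traces, current = [], []
    for ins in instrs:
        current.append(ins)
        if ins.kind in TERMINATORS or len(current) >= max_len:
            traces.append(current)
            current = []
    if current:  # flush a trailing partial trace
        traces.append(current)
    return traces


class ReturnAddressStack:
    """Fixed-depth stack used to predict return targets (illustrative only)."""

    def __init__(self, depth=8):
        self.depth = depth
        self.stack = []

    def push(self, return_pc):
        # Called on a call instruction; drop the oldest entry when full.
        if len(self.stack) == self.depth:
            self.stack.pop(0)
        self.stack.append(return_pc)

    def predict_return(self):
        # Called on a return; None means no prediction is available.
        return self.stack.pop() if self.stack else None


if __name__ == "__main__":
    stream = [
        Instr(0, "alu"), Instr(4, "call"),
        Instr(100, "alu"), Instr(104, "return"),
        Instr(8, "alu"),
    ]
    print([len(t) for t in select_traces(stream)])  # prints [2, 2, 1]
```

In this toy stream the call and the return each close a trace, so the five instructions split into traces of length 2, 2, and 1. A real sequencer would also use the trace predictor to choose where each sequencer starts fetching, which this sketch leaves out.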