CMU CS 15740 - ultrasparc


MICROPROCESSOR REPORT
Vol. 8, No. 13, October 3, 1994 © 1994 MicroDesign Resources

UltraSparc Unleashes SPARC Performance
Next-Generation Design Could Put Sun Back in Race
by Linley Gwennap

Flexible Instruction Alignment
Figure 1. UltraSparc includes five floating-point/graphics units …
Long Pipeline Includes FPU
New Register File Saves Space
Figure 2. UltraSparc uses a nine-stage pipeline …
FPU Includes Multimedia Support
Figure 3. UltraSparc uses fast synchronous SRAM …
High-Speed System Interface
Sun, TI Jettison BiCMOS
Can UltraSparc Save Sun?
Price & Availability
MicroSparc-3 Revamped

High-end SPARC performance, languishing at sub-Pentium levels, is set to receive a big boost next year when UltraSparc debuts. Sun expects this next-generation RISC chip to triple the performance of a 60-MHz SuperSparc, moving SPARC from the back of the pack to within hailing distance of the lead. The key to this incredible increase is a complete redesign of the processor pipeline to eliminate the constrictions of the SuperSparc design. The result: a projected clock speed of 167 MHz, a huge jump for Sun and a respectable rate compared with other next-generation RISC chips.

Unlike Digital, which has already measured the performance of the 21164, Sun’s performance estimates are conjecture, as UltraSparc has not yet seen first silicon. Sun has built test chips to verify the speed of its design and has performed extensive timing simulations, hoping to avoid the embarrassment of its SuperSparc launch. The design avoids SuperSparc’s fatal flaws (the double-pumped register file and TLB), but it remains to be seen whether Sun can deliver on its promises and turn a paper tiger into a real man-eater.

The first announced processor to implement the SPARC version 9 architecture (see 070201.PDF), UltraSparc is a full 64-bit design.
It can issue as many as four instructions per cycle to nine function units: two integer ALUs, one load/store unit, one branch unit, and five special-purpose units for floating-point and graphics calculations. The chip has moderate on-chip caches for a processor of its generation: 16K for instructions and 16K for data, less than SuperSparc. To make up for these modest caches, UltraSparc connects directly to a synchronous external cache that can return one result per cycle. In addition to SPARC V9, the design implements a unique set of graphics and multimedia instructions.

Sun has not announced price or availability for the new processor, which will be fabricated by Texas Instruments. We expect UltraSparc to begin shipping in volume in 3Q95, six to nine months later than the 21164.

Flexible Instruction Alignment

Sun, with the largest installed base of any RISC system vendor, has always been concerned about the performance of existing (unrecompiled) binaries on new processors. UltraSparc implements a simple scheme that avoids the instruction-alignment restrictions that prevent the 21164 and other highly superscalar processors from achieving maximum performance without recompilation. The SPARC chip fetches instructions into a 12-entry FIFO buffer; the instruction dispatcher simply issues up to four instructions from the bottom of the buffer.

This scheme works well as long as the buffer is kept reasonably full. For starters, the instruction cache can deliver four instructions (128 bits) per cycle to the buffer, but branches can disrupt this flow. To counter this problem, the cache includes a “next” field that can redirect the fetch stream if the current instruction group contains a predicted-taken branch. For cache lines that do not contain such branches, this field contains the next sequential address.
The contents of this field direct the next instruction fetch, eliminating any penalty for correctly predicted taken branches.

As they are loaded into the cache, instructions are partially decoded to determine if they contain a branch and, if so, what the target address is. This information is used to initialize the “next” field. In what is becoming a common superscalar design technique, the instruction cache stores four bits of decode information with each instruction as well as two bits of branch history per cache line. Sun’s simulations show an 88% prediction accuracy on SPECint92 using these two history bits.

As Figure 1 shows, instructions are further decoded before being placed in the instruction buffer. Each entry in the buffer is 62 bits wide to contain all the decode information. This extensive information allows the dispatch unit to quickly decide which instructions can be issued and even allows time for a register file access, all in a single clock cycle.

Instructions are always issued in order; if an instruction cannot be issued due to a resource conflict or a register dependency, no subsequent instructions are issued on that cycle. Unlike SuperSparc, the new design does not cascade the ALUs; this change prevents dependent integer instructions from being paired but helps support the high clock rate. One special case is that a store can be dispatched in the same cycle as the instruction that calculates the store data; this case is handled by forwarding the result to the store queue.

There is one flaw that breaks the “no alignment” strategy.
The first three instructions can be dispatched to any function unit, but the fourth can be sent to only the branch or floating-point units. Sun says that allowing the fourth slot to contain a general integer instruction would have greatly increased the amount of dependency checking but added little performance. Restricting the fourth slot also reduces the number of ports in the integer register file.

Long Pipeline Includes FPU

UltraSparc uses a nine-stage pipeline, as Figure 2 shows. The basic integer pipeline is actually six stages, two more than in SuperSparc; the additional stages at the back end support the floating-point and graphics units.

The first two stages perform instruction fetch and decode. As noted above, the decoded instructions are placed in the instruction buffer. If the buffer is not empty (the typical situation), instructions may wait one or more cycles before being dispatched to the function units in the G (grouping) stage. The next two stages are the classic RISC
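
The in-order dispatch rules described under "Flexible Instruction Alignment" can be illustrated with a toy model. The following Python sketch is not Sun's logic; it is a minimal illustration under stated assumptions. The names `Instr`, `form_group`, `UNITS`, and the instruction kinds are hypothetical. The five floating-point/graphics units are treated as an interchangeable pool (the article calls them special-purpose, so this is a simplification), and results from earlier cycles are assumed to be ready, so only dependencies inside one issue group are checked.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Function-unit counts taken from the article; pooling the five
# FP/graphics units into one class is a simplifying assumption.
UNITS = {"int": 2, "ldst": 1, "branch": 1, "fp": 5}
MAX_ISSUE = 4                        # up to four instructions per cycle
FOURTH_SLOT_KINDS = {"branch", "fp"}  # fourth slot restriction


@dataclass(frozen=True)
class Instr:
    """Hypothetical decoded instruction for this sketch."""
    kind: str                          # "int", "ldst", "branch", or "fp"
    dest: Optional[str] = None         # register written, if any
    srcs: Tuple[str, ...] = ()         # operand/address registers read
    store_data: Optional[str] = None   # data register of a store, if any


def form_group(buffer: List[Instr]) -> List[Instr]:
    """Return the instructions issued this cycle, oldest first.

    Issue is strictly in order: the first instruction that cannot
    issue stops the group, and nothing behind it issues either.
    """
    group: List[Instr] = []
    free = dict(UNITS)       # function units still free this cycle
    written = set()          # registers written earlier in this group
    for slot, ins in enumerate(buffer[:MAX_ISSUE]):
        # The fourth slot accepts only branch or floating-point work.
        if slot == MAX_ISSUE - 1 and ins.kind not in FOURTH_SLOT_KINDS:
            break
        # Resource conflict: no free unit of the required class.
        if free[ins.kind] == 0:
            break
        # No cascaded ALUs: an operand produced in this group blocks issue.
        if any(reg in written for reg in ins.srcs):
            break
        # store_data is deliberately not checked: a store may issue with
        # the instruction that computes its data (store-queue forwarding).
        free[ins.kind] -= 1
        if ins.dest is not None:
            written.add(ins.dest)
        group.append(ins)
    return group


if __name__ == "__main__":
    demo = [
        Instr("int", dest="r1", srcs=("r2", "r3")),      # add r1 = r2 + r3
        Instr("ldst", srcs=("r4",), store_data="r1"),    # store r1 -> [r4]
        Instr("int", dest="r5", srcs=("r1", "r6")),      # add r5 = r1 + r6
    ]
    print([ins.kind for ins in form_group(demo)])  # prints ['int', 'ldst']
```

In the demo, the store issues alongside the add that produces its data, while the second add waits a cycle because it reads `r1` from the same group.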

