CMU CS 15740 - MICROPROCESSOR


PPC 604 Powers Past Pentium
PowerPC Chip Will Open Performance Gap, Possibly Permanently

by Linley Gwennap
MICROPROCESSOR REPORT, Vol. 8, No. 5, April 18, 1994. © 1994 MicroDesign Resources

Contents:
Emphasis on Integer Performance
Speculative and Out-of-Order Execution
Figure 1. The 604 can fetch and dispatch four instructions …
Dynamic Branch Prediction
Double-Precision FPU
Memory Design Avoids Stalls
Table 1. On a per-cycle basis, the 604 delivers twice …
Table 2. The cache and TLB design of the 604 is similar …
Compatible System Interface
Figure 2. Load and store queues prevent memory accesses …
Table 3. The 601, 603, and 604 provide a similar system …
Figure 3. The PowerPC 604 incorporates …
Performance Advantage over Intel
Carving a Slice for PowerPC
Figure 4. With chips shipping around the end of 1994 …
Price and Availability
PowerPC 601 Hits 100 MHz

IBM and Motorola have taken the wraps off their next processor, the 604, revealing the true performance potential of the PowerPC architecture. The chip, which the companies expect to ship in volume by the end of the year, is not only far faster than current Pentium processors but is likely to exceed the performance of all future Pentium chips as well. According to each company’s roadmap, Intel will not surpass the 604’s estimated 160 SPECint92 performance until the P6 generation, by which time PowerPC will have moved to even faster processors. Thus, the 604 will open a performance gap that Intel will find difficult to close.

While the 604 bears some resemblance to its little sibling, the 603, most aspects of the design have been pumped up to improve performance. For example, the 604 can issue four instructions per cycle, twice as many as the 603, and includes an extra integer ALU. The 604 uses register renaming and out-of-order execution more extensively than the 603. Its on-chip caches are twice as large, weighing in at 16K for instructions and 16K for data.
Pushing the clock rate to 100 MHz and beyond also lifts the 604’s performance.

As usual, this added performance comes at a price. The die size of the 604 is more than twice that of the 603 and 20% larger than the P54C Pentium. Neither vendor has released pricing information for the new chip, but it appears that the 604 will not replace the 601, as originally planned. Rather, the 604 will be positioned above the new 100-MHz 601, offering significantly more performance than Intel’s Pentium for a similar price.

The 604 will appear in PCs and workstations from IBM as well as in Apple’s second-generation Power Macs. The workstations will probably ship in 4Q94, with Macs and PCs shipping in 1Q95. Low-cost 604-based systems, running Apple’s Mac OS and IBM’s Workplace OS, will give the PowerPC boxes a performance advantage over Pentium PCs, but the 604 systems will be too expensive for most users, at least at first.

Emphasis on Integer Performance

Like the PowerPC 603, the 604 was designed from scratch at the jointly owned Somerset design center, and both Motorola and IBM will manufacture and market the processor. First silicon was received in January, and these chips are currently being tested by Apple and IBM, the two lead customers. The companies say that general sampling will not begin until 3Q94, but Canon’s PowerHouse subsidiary (see 0804MSB.PDF) and other interested parties probably will receive early samples. Volume production is planned for 4Q94.

The 160-SPECint92 estimate assumes a 100-MHz 604 with a 1M external cache and a 66-MHz system bus. Ultimately, the chip may do even better, as 100 MHz is the “center frequency” of the design; some parts may run at higher clock rates. The 601, for example, was designed to a center frequency of 66 MHz and is now shipping at speeds up to 80 MHz in the original process. A hypothetical 120-MHz 604 could reach 190 SPECint92.

For floating-point applications, the 100-MHz 604 is estimated to achieve 165 SPECfp92.
This represents a major improvement in SPECfp92 per MHz compared with earlier designs, although it does not match the larger relative increase in integer performance. The 604 designers spent more effort and die area on improving integer performance but did not neglect floating-point.

By the time the 604 is shipping, Intel expects 100-MHz Pentium chips to be available in volume desktop systems. At the same clock rate, the 604 will deliver 60% better integer performance and nearly twice the floating-point performance of Pentium, according to Somerset’s estimates. Typical desktop systems will not include the expensive caches used to generate the quoted performance figures, but the ratio between the 604 and Pentium should be similar in less expensive designs.

Speculative and Out-of-Order Execution

Because the 603 and 604 were designed in parallel, the 604 is not derived from the 603, but the microarchitectures are similar in many ways. The 603 (see 071402.PDF) can fetch and issue two instructions per cycle to four function units: an integer unit, floating-point unit, load/store unit, and system-register unit. As Figure 1 shows, the 604 adds a second integer unit and combines the integer multiplier and divider with the system registers to create a complex integer unit. The designers claim that this configuration includes three integer units, but in fact only two units handle typical (single-cycle) integer operations; the third unit handles less common multicycle integer operations and accesses to the special registers.

The two chips handle branch instructions somewhat differently.
The 603 detects and handles branches early in the pipeline, removing them from the instruction stream; thus, the 603 can essentially execute three instructions per cycle if one is a branch. The 604 dispatches up to four instructions per cycle but, like the 601, has a separate unit to handle branches.

As in the 603, when the 604 encounters a branch, it predicts the outcome and begins to speculatively issue and execute instructions based on the prediction. The results of speculative operations are kept in rename registers or other temporary storage until the branch prediction is verified. The 604 includes 12 integer rename registers and 8 FP rename registers, about twice as many as the 603. Load instructions can be speculatively executed; speculative
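
The rename-register scheme described above can be illustrated with a toy software model. This is only a sketch of the general idea (buffer speculative results, then commit or discard them once the branch resolves), not a description of the 604's actual logic. The class name, method names, and the stall-when-full behavior are assumptions made for illustration; only the count of 12 integer rename registers comes from the article.

```python
from dataclasses import dataclass, field


@dataclass
class SpeculativeRegisterFile:
    """Toy model: speculative results wait in rename entries until the branch resolves."""

    capacity: int = 12  # integer rename registers quoted in the article
    committed: dict = field(default_factory=dict)  # architectural register state
    pending: list = field(default_factory=list)    # (register, value) in program order

    def write_speculative(self, reg: str, value: int) -> bool:
        """Buffer a result; return False (assumed stall) when no rename entry is free."""
        if len(self.pending) >= self.capacity:
            return False
        self.pending.append((reg, value))
        return True

    def read(self, reg: str):
        """Return the newest value for reg, preferring buffered speculative results."""
        for name, value in reversed(self.pending):
            if name == reg:
                return value
        return self.committed.get(reg)

    def resolve_branch(self, predicted_correctly: bool) -> None:
        """Commit buffered results on a correct prediction, otherwise discard them."""
        if predicted_correctly:
            for name, value in self.pending:
                self.committed[name] = value
        self.pending.clear()


regs = SpeculativeRegisterFile()
regs.committed["r3"] = 10

regs.write_speculative("r3", 99)   # executed past a predicted branch
assert regs.read("r3") == 99       # younger instructions see the speculative value
regs.resolve_branch(predicted_correctly=False)
assert regs.read("r3") == 10       # misprediction: architectural state is untouched
```

In this model a misprediction costs nothing to undo, because the architectural registers are never written until the prediction is verified; the price is that speculation must pause once every rename entry is occupied.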

