MICROPROCESSOR REPORT
THE INSIDERS' GUIDE TO MICROPROCESSOR HARDWARE
VOLUME 8 NUMBER 5, APRIL 18, 1994

PPC 604 Powers Past Pentium
PowerPC Chip Will Open Performance Gap, Possibly Permanently

by Linley Gwennap

IBM and Motorola have taken the wraps off their next processor, the 604, revealing the true performance potential of the PowerPC architecture. The chip, which the companies expect to ship in volume by the end of the year, is not only far faster than current Pentium processors but is likely to exceed the performance of all future Pentium chips as well. According to each company's roadmap, Intel will not surpass the 604's estimated 160 SPECint92 performance until the P6 generation, by which time PowerPC will have moved to even faster processors. Thus, the 604 will open a performance gap that Intel will find difficult to close.

While the 604 bears some resemblance to its little sibling, the 603, most aspects of the design have been pumped up to improve performance. For example, the 604 can issue four instructions per cycle, twice as many as the 603, and includes an extra integer ALU. The 604 uses register renaming and out-of-order execution more extensively than the 603. Its on-chip caches are twice as large, weighing in at 16K for instructions and 16K for data. Pushing the clock rate to 100 MHz and beyond also lifts the 604's performance.

As usual, this added performance comes at a price. The die size of the 604 is more than twice that of the 603 and 20% larger than the P54C Pentium. Neither vendor has released pricing information for the new chip, but it appears that the 604 will not replace the 601, as originally planned. Rather, the 604 will be positioned above the new 100-MHz 601, offering significantly more performance than Intel's Pentium for a similar price.

The 604 will appear in PCs and workstations from IBM as well as in Apple's second-generation Power Macs. The workstations will probably ship in 4Q94, with Macs and PCs shipping in 1Q95. Low-cost 604-based systems
running Apple's Mac OS and IBM's Workplace OS will give the PowerPC boxes a performance advantage over Pentium PCs, but the 604 systems will be too expensive for most users, at least at first.

Emphasis on Integer Performance

Like the PowerPC 603, the 604 was designed from scratch at the jointly owned Somerset design center, and both Motorola and IBM will manufacture and market the processor. First silicon was received in January, and these chips are currently being tested by Apple and IBM, the two lead customers. The companies say that general sampling will not begin until 3Q94, but Canon's PowerHouse subsidiary (see 0804MSB.PDF) and other interested parties probably will receive early samples. Volume production is planned for 4Q94.

The 160 SPECint92 estimate assumes a 100-MHz 604 with a 1M external cache and a 66-MHz system bus. Ultimately, the chip may do even better, as 100 MHz is the center frequency of the design; some parts may run at higher clock rates. The 601, for example, was designed to a center frequency of 66 MHz and is now shipping at speeds up to 80 MHz in the original process. A hypothetical 120-MHz 604 could reach 190 SPECint92.

For floating-point applications, the 100-MHz 604 is estimated to achieve 165 SPECfp92. This represents a major improvement in SPECfp92 per MHz compared with earlier designs, although it does not match the larger relative increase in integer performance. The 604 designers spent more effort and die area on improving integer performance but did not neglect floating point.

By the time the 604 is shipping, Intel expects 100-MHz Pentium chips to be available in volume desktop systems. At the same clock rate, the 604 will deliver 60% better integer performance and nearly twice the floating-point performance of Pentium, according to Somerset's estimates. Typical desktop systems will not include the expensive caches used to generate the quoted performance figures, but the ratio between the 604 and Pentium should be
similar in less-expensive designs.

Speculative and Out-of-Order Execution

Because the 603 and 604 were designed in parallel, the 604 is not derived from the 603, but the microarchitectures are similar in many ways. The 603 (see 071402.PDF) can fetch and issue two instructions per cycle to four function units: an integer unit, floating-point unit, load/store unit, and system register unit. As Figure 1 shows, the 604 adds a second integer unit and combines the integer multiplier and divider with the system registers to create a complex integer unit. The designers claim that this configuration includes three integer units, but in fact only two units handle typical single-cycle integer operations; the third unit handles less-common multicycle integer operations and accesses to the special registers.

The two chips handle branch instructions somewhat differently. The 603 detects and handles branches early in the pipeline, removing them from the instruction stream; thus, the 603 can essentially execute three instructions per cycle if one is a branch. The 604 dispatches up to four instructions per cycle but, like the 601, has a separate unit to handle branches.

As in the 603, when the 604 encounters a branch, it predicts the outcome and begins to speculatively issue and execute instructions based on the prediction. The results of speculative operations are kept in rename registers or other temporary storage until the branch prediction is verified. The 604 includes 12 integer rename registers and 8 FP rename registers, about twice as many as the 603. Load instructions can be speculatively executed; speculative stores are kept in a six-entry queue and are not written to the cache until the branch prediction is verified.

Up to four instructions per cycle are read from the instruction cache into the four-entry decode buffer. These instructions are decoded in a single cycle and the

© 1994 MicroDesign Resources

[Figure 1: block diagram. Labeled blocks: Fetch Unit, BTAC, Branch Unit, BHT (Dynamic Branch Prediction), MMU and 16K Instruction Cache, Decode and Dispatch, 4 Instruction Buses, Completion Unit, Completion Buses, Dual Integer Units, Complex Integer Unit, Double-Precision FPU, Load/Store Unit with Queues, Integer Registers and FP Registers with Rename Buffers, MMU and 16K Data Cache, Bus Interface (Addr 32, Data 64).]

Figure 1. The 604 can fetch and dispatch four instructions per cycle to six function units. Each function unit has two reservation stations for instructions that cannot be executed immediately.
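
The handling of speculative stores described in the text (held in a six-entry queue and written to the cache only after the branch prediction is verified) can be illustrated with a toy software model. This is a minimal sketch, not a description of the 604's actual logic: the class and method names are hypothetical, a plain dictionary stands in for the data cache, only one unresolved branch is modeled, and refusing a store when the queue is full is an assumed behavior that the article does not specify.

```python
from collections import deque


class SpeculativeStoreQueue:
    """Toy model: stores wait in a bounded queue until the branch resolves."""

    def __init__(self, capacity=6):
        self.capacity = capacity      # six entries, as quoted in the article
        self.pending = deque()        # (address, value) pairs, oldest first
        self.cache = {}               # stand-in for the data cache

    def store(self, address, value):
        """Queue a speculative store; return False if the queue is full."""
        if len(self.pending) >= self.capacity:
            return False              # assumed behavior: the store must wait
        self.pending.append((address, value))
        return True

    def resolve_branch(self, predicted_correctly):
        """Commit queued stores in order, or discard them on a misprediction."""
        if predicted_correctly:
            while self.pending:
                address, value = self.pending.popleft()
                self.cache[address] = value
        else:
            self.pending.clear()


queue = SpeculativeStoreQueue()
queue.store(0x100, 1)
queue.resolve_branch(predicted_correctly=True)    # store to 0x100 reaches the cache
queue.store(0x104, 2)
queue.resolve_branch(predicted_correctly=False)   # store to 0x104 is discarded
```

The point of the model is that a mispredicted path leaves no trace in the cache, which is why stores, unlike loads, must be buffered rather than executed outright.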
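
The performance estimates quoted in the section on integer performance can be cross-checked with simple arithmetic. The sketch below uses only figures from the article; the linear clock-scaling model and the variable names are assumptions introduced for illustration, and the implied Pentium score is derived from the 60% claim rather than stated in the text.

```python
# Figures quoted in the article (vendor estimates, not measurements).
SPECINT_604_AT_100 = 160      # SPECint92 at 100 MHz
SPECINT_604_AT_120 = 190      # SPECint92 for a hypothetical 120-MHz part
INTEGER_ADVANTAGE = 0.60      # 604 vs. Pentium at the same clock rate


def linear_scale(score, base_mhz, target_mhz):
    """Scale a score in proportion to clock rate (an idealized assumption)."""
    return score * target_mhz / base_mhz


# Performance per MHz at the design's center frequency.
per_mhz = SPECINT_604_AT_100 / 100

# Ideal score at 120 MHz if performance tracked the clock exactly.
ideal_120 = linear_scale(SPECINT_604_AT_100, 100, 120)

# Fraction of the ideal score that the article's 120-MHz estimate represents.
scaling_efficiency = SPECINT_604_AT_120 / ideal_120

# SPECint92 implied for a 100-MHz Pentium by the 60% advantage claim.
implied_pentium_100 = SPECINT_604_AT_100 / (1 + INTEGER_ADVANTAGE)

print(per_mhz, ideal_120, round(scaling_efficiency, 3), round(implied_pentium_100, 1))
```

Under these assumptions, the 190 SPECint92 estimate sits slightly below the 192 that pure linear scaling would give, and the 60% claim implies roughly 100 SPECint92 for a 100-MHz Pentium. The article does not say why the 120-MHz estimate falls just short of linear.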