High-speed interconnect and packaging design of the IBM System z9 processor cage

H. Harrer, D. M. Dreps, T. M. Winkel, W. Scholz, B. G. Truong, A. Huber, T. Zhou, K. L. Christian, G. F. Goth

This paper describes the system packaging and technologies of the IBM System z9 enterprise-class server. The central electronic complex of the system consists of four nodes, each housing a multichip module (MCM) with 16 chips consuming up to 1,200 W. The z9 server doubles the multiprocessor performance of the System z990 by increasing the central processing unit (CPU) configuration and using an internally developed elastic interface to increase interconnect speed on all high-speed buses. In contrast to all previous zSeries designs, which were running at half of the processor speed, the packaging interconnects on the multichip module run at the same speed as the processor (1.72 GHz). High frequencies and massively parallel connectivity lead to a raw packaging bandwidth of up to 1,764 GB/s between processors and cache within a single frame for a fully configured four-node z9 system.

Copyright 2007 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor. 0018-8646/07/$5.00 © 2007 IBM

IBM J. Res. & Dev., Vol. 51, No. 1/2, January/March 2007

1. Introduction

The IBM System z9 server follows the IBM System z family as described in [1–3]. It was developed using System z990 packaging technology [2] but now includes 90-nm chip technology, which results in higher signal frequency and leakage current growth. It reuses the modular concept in which up to four processor nodes can be plugged into a single computing machine. This results in a maximum configuration of 64 processors per frame. The complete system consists of two racks housing the processor nodes, three I/O cages, a modular refrigeration cooling unit, and the bulk power supplies. In contrast to the z990 server, a flexible I/O and memory configuration supports the on-demand business concept, in which up to eight memory cards and up to eight I/O cards can be plugged into a single node. New reliability features such as concurrent node upgrade allow the customer to add processor nodes without a system reboot, and dynamic oscillator card switching improves reliability and serviceability (RAS). The details of the system structure are described in Section 2.

Section 3 describes the multichip module (MCM) technology and design. The 375-µm glass-ceramic pitch of the previous system generation [2, 3] was reduced to 350 µm to contain the routing within 102 layers, as in the z990 system. The C4 solder bump pitch of the chips was adapted to match the via pitch of the module. The high chip power dissipation made it necessary to develop a new sort strategy. The processor chips were sorted into low-leakage-current chips and high-leakage-current chips so that the eight processor chips on the MCM would comprise a combination of high-leakage and low-leakage chips and would limit worst-case power consumption.

Section 4 compares the technology and design of the cards and boards. Despite running at higher frequencies and using net lengths for the point-to-point nets of the ring, which connects the MCM between the processor nodes, that are similar to those of the System z990, the System z9 does not use low-loss dielectric material in its circuit cards. In a two- or three-node configuration, those circuit card nets reach a line length of more than 80 cm and join to four very-high-density metric (VHDM) connectors. This was achieved by the design features of the Elastic Interface 2 (EI-2). These features include Vref forwarding, where the receiver threshold is generated directly from the transmitted clock signal, drivers with de-emphasis, and restricted placement of driving and receiving circuits to minimize loss of the on-chip wiring [4].

Furthermore, although all high-speed signals on the z990 were bidirectional, they are unidirectional on the System z9. On the z990, the bidirectional nets had to be quiesced by sending zero data before switching the direction. Since the net topology and the net lengths did not change, the number of zero cycles would have had to be increased with the reduction of the cycle time, and this was not acceptable for the overall system performance. Because the packaging technology did not support a doubling of the I/O count, an increase in frequency was the only solution. Thus, all buses on the MCM run at the same speed as the processor, which, between the processor and cache chips, results in a total raw packaging bandwidth of 441 GB/s for each node. This can be achieved by improvements in the source-synchronous I/O circuits (EI-2) on the chips, as shown in Section 5.

Another major challenge was chip power delivery, described in Section 6. Leakage currents in 90-nm technology resulted in a significant power increase, especially in the air-cooled backup mode at high temperatures. New dc drop analysis methodologies were required for robust power delivery up to 1,200 W to the processor, cache, and bus adapter chips on the multichip module.

Cooling is handled by a modular refrigeration unit (MRU) that cools the central electronic complex (CEC) chips to 45°C. This low operating temperature enables high reliability and reduced leakage power. An air-cooled backup mode at lower chip frequencies ensures system operation in case of an MRU failure. The cooling of the MCM did require improvements in the technology of the z990 [5]; i.e., the thermal resistance between the processor chips and the hat was reduced. This was achieved by using a small-gap technology (SGT) [6], in which the MCM hat has special cooling pistons for each CPU, allowing reduction of the gap between chip and cooling hat, as shown in Section 7.

2. Logical system structure

The node-based server design of the System z9 accommodates up to 32 processor chips, or 64 processor cores, per system. The system memory size can be increased up to a maximum of 512 GB, and the system I/O connectivity has been enhanced to a maximum of 64 self-timed interface (STI) I/O paths, each with a capability of 2 27 Gb s. The increase in number of processors, memory size
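
The headline figures quoted in the text can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and it assumes that the system-level raw packaging bandwidth is simply the per-node figure summed over all nodes, and that the processor cores are spread evenly over the processor chips.

```python
# Figures quoted in the text.
NODES_PER_SYSTEM = 4             # fully configured system
PROCESSOR_CHIPS_PER_MCM = 8      # processor chips on each multichip module
MAX_CORES_PER_SYSTEM = 64        # processor cores per system
PER_NODE_BANDWIDTH_GB_S = 441    # raw packaging bandwidth per node


def system_bandwidth_gb_s(nodes: int, per_node_gb_s: int) -> int:
    """Assumption: system bandwidth is the per-node bandwidth summed over nodes."""
    return nodes * per_node_gb_s


def processor_chips(nodes: int, chips_per_mcm: int) -> int:
    """Total processor chips, with one MCM per node."""
    return nodes * chips_per_mcm


def cores_per_chip(total_cores: int, total_chips: int) -> int:
    """Assumption: cores are distributed evenly over the processor chips."""
    return total_cores // total_chips


if __name__ == "__main__":
    chips = processor_chips(NODES_PER_SYSTEM, PROCESSOR_CHIPS_PER_MCM)
    print(system_bandwidth_gb_s(NODES_PER_SYSTEM, PER_NODE_BANDWIDTH_GB_S))
    print(chips)
    print(cores_per_chip(MAX_CORES_PER_SYSTEM, chips))
```

Under these assumptions the results agree with the stated 1,764 GB/s system bandwidth and the 32 processor chips per system, and they imply two cores per processor chip.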

