MIT 3 11 - BlueGene/L Supercomputer


Slide 1: BlueGene/L Supercomputer
George Chiu, IBM Research
01/14/19

Slide 2: Supercomputer Peak Performance
[Chart: peak speed (flops, 1E+2 to 1E+17) versus year introduced (1940 to 2010). Doubling time = 1.5 yr. Machines plotted: ENIAC (vacuum tubes), UNIVAC, IBM 701, IBM 704, IBM 7090 (transistors), IBM Stretch, CDC 6600 (ICs), CDC 7600, CDC STAR-100 (vectors), ILLIAC IV, CRAY-1, Cyber 205, X-MP2 (parallel vectors), CRAY-2, X-MP4, Y-MP8, S-810/20, SX-2, SX-3/44, SX-4, SX-5, VP2600/10, i860 (MPPs), Delta, CM-5, Paragon, NWT, T3D, T3E, CP-PACS, ASCI Red, ASCI Red Option, Blue Pacific, ASCI White, ASCI Q, Earth, Red Storm, Blue Gene/L, Petaflop, multi-Petaflop.]

Slide 3: BlueGene/L

Level          Contents                                        Torus      Peak            Memory
Chip           2 processors                                               2.8/5.6 GF/s    4 MB
Compute Card   2 chips                                         2x1x1      5.6/11.2 GF/s   0.5 GB DDR
Node Board     32 chips (16 Compute Cards)                     4x4x2      90/180 GF/s     8 GB DDR
Cabinet        32 Node Boards                                  8x8x16     2.9/5.7 TF/s    256 GB DDR
System         64 cabinets                                     64x32x32   180/360 TF/s    16 TB DDR

Slide 4: 512 Way BG/L Prototype

Slide 5: BlueGene/L Interconnection Networks

3 Dimensional Torus
  - Interconnects all compute nodes (65,536)
  - Virtual cut-through hardware routing
  - 1.4 Gb/s on all 12 node links (2.1 GB/s per node)
  - Communications backbone for computations
  - 0.7/1.4 Tb/s bisection bandwidth, 67 TB/s total bandwidth

Global Tree
  - One-to-all broadcast functionality
  - Reduction operations functionality
  - 2.8 Gb/s of bandwidth per link
  - Latency of tree traversal 2.5 µs
  - ~23 TB/s total binary tree bandwidth (64k machine)
  - Interconnects all compute and I/O nodes (1024)

Ethernet
  - Incorporated into every node ASIC
  - Active in the I/O nodes (1:64)
  - All external comm. (file I/O, control, user interaction, etc.)

Slide 6: Complete BlueGene/L System at LLNL
[Diagram: 65,536 BG/L compute nodes and 1,024 BG/L I/O nodes connect through a federated Gigabit Ethernet switch (2,048 ports) to the front-end nodes, service node, WAN, visualization, archive, and CWFS; a separate control network is also shown. Per-link port counts are not recoverable from the extracted text.]

Slide 7: Summary of performance results

DGEMM:
  - 92.3% of dual core peak on 1 node
  - Observed performance at 500 MHz: 3.7 GFlops
  - Projected performance at 700 MHz: 5.2 GFlops (tested in lab up to 650 MHz)

LINPACK:
  - 77% of peak on 1 node
  - 70% of peak on 512 nodes (1435 GFlops at 500 MHz)

sPPM, UMT2000:
  - Single processor performance roughly on par with POWER3 at 375 MHz
  - Tested on up to 128 nodes (also NAS Parallel Benchmarks)

FFT:
  - Up to 508 MFlops on single processor at 444 MHz (TU Vienna)
  - Pseudo-ops performance (5N log N) @ 700 MHz of 1300 MFlops (65% of peak)

STREAM: impressive results even at 444 MHz
  - Tuned: Copy: 2.4 GB/s, Scale: 2.1 GB/s, Add: 1.8 GB/s, Triad: 1.9 GB/s
  - Standard: Copy: 1.2 GB/s, Scale: 1.1 GB/s, Add: 1.2 GB/s, Triad: 1.2 GB/s
  - At 700 MHz: would beat STREAM numbers for most high end microprocessors

MPI:
  - Latency: < 4000 cycles (5.5 µs at 700 MHz)
  - Bandwidth: full link bandwidth demonstrated on up to 6 links

Slide 8: Applications

BG/L is a general purpose technical supercomputer

N-body simulation
  - molecular dynamics (classical and quantum)
  - plasma physics
  - stellar dynamics for star clusters, galaxies

Complex multiphysics code
  - Computational Fluid Dynamics (weather, climate, sPPM...)
  - Accretion
  - Rayleigh-Jeans instability
  - planetary formation and evolution
  - radiative transport
  - Magnetohydrodynamics

Modeling thermonuclear events in/on astrophysical objects
  - neutron stars
  - white dwarfs
  - supernovae

Radiotelescope

FFT

Slide 9: Summary
  - Embedded technology promises to be an efficient path toward building massively parallel computers optimized at the system level.
  - Cost/performance is ~20x better than standard methods to get to TFlops.
  - Low power is critical to achieving a dense, simple, inexpensive packaging solution.
  - Blue Gene/L will have a scientific reach far beyond existing limits for a large class of important scientific problems.
  - Blue Gene/L will give insight into possible future product directions.
  - Blue Gene/L hardware will be quite flexible. A mature, sophisticated software environment needs to be developed to really determine the reach (both scientific and commercial) of this
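
The packaging figures on the BlueGene/L slide (chip, compute card, node board, cabinet, system) and the torus link figures on the interconnect slide can be cross-checked with simple arithmetic. The following is a minimal sketch, not code from the presentation: the names `LEVELS`, `peak_gflops`, `torus_nodes`, and `node_bandwidth_gbytes` are hypothetical, and it assumes the higher of the two per-chip peak figures quoted on the slide (5.6 GF/s) scales linearly with chip count.

```python
# Cross-check of the BlueGene/L packaging and torus figures quoted on the slides.
# All names here are illustrative; only the input numbers come from the slides.

CHIP_PEAK_GFLOPS = 5.6  # higher of the two per-chip figures (2.8/5.6 GF/s)

# (level name, number of chips contained at that level)
LEVELS = [
    ("Chip", 1),
    ("Compute card", 2),
    ("Node board", 32),
    ("Cabinet", 32 * 32),       # 32 node boards x 32 chips each
    ("System", 64 * 32 * 32),   # 64 cabinets
]


def peak_gflops(chips, per_chip=CHIP_PEAK_GFLOPS):
    """Peak GF/s for a given chip count, assuming linear scaling."""
    return chips * per_chip


def torus_nodes(x, y, z):
    """Number of nodes in an x-by-y-by-z torus."""
    return x * y * z


def node_bandwidth_gbytes(links=12, gbit_per_link=1.4):
    """Aggregate per-node torus bandwidth in GB/s (8 bits per byte)."""
    return links * gbit_per_link / 8


if __name__ == "__main__":
    for name, chips in LEVELS:
        print(f"{name:13s} {chips:6d} chips  {peak_gflops(chips):12.1f} GF/s")
    print("Torus nodes (64x32x32):", torus_nodes(64, 32, 32))
    print("Per-node torus bandwidth (GB/s):", node_bandwidth_gbytes())
```

Under these assumptions the node board works out to 179.2 GF/s, the cabinet to about 5.73 TF/s, and the full system to about 367 TF/s, close to the 180 GF/s, 5.7 TF/s, and 360 TF/s quoted on the slide. The 64x32x32 torus gives 65,536 nodes, and 12 links at 1.4 Gb/s give the quoted 2.1 GB/s per node.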

