GT ECE 4893 - Architectural Comparison: Xbox 360 vs. Playstation 3

10/29/09

Architectural Comparison: Xbox 360 vs. Playstation 3
Prof. Aaron Lanterman
School of Electrical and Computer Engineering
Georgia Institute of Technology

2. Memory: Xbox 360 vs. Playstation 3
• Xbox 360 - 512 MB, 700 MHz, GDDR3, shared by CPU and GPU
  • CPU accesses memory through the GPU!
  • GPU has 10 MB RAM embedded frame buffer
• PS3 - 512 MB total
  • 256 MB 3.2 GHz XDR main RAM for the CPU
  • 256 MB 700 MHz GDDR3 video RAM for the GPU

3. Xbox 360 high-level architecture
Image from J. Andrews and N. Baker, “Xbox 360 System Architecture,” Hot Chips Presentation
Custom-designed XMA (Xbox Media Audio) decoder for on-the-fly decoding of compressed audio streams

4. Xbox 360’s Xenon vs. Playstation 3’s Cell
Xenon CPU image from “The Microsoft Xbox 360 CPU story” www-128.ibm.com/developerworks/power/library/pa-fpfxbox
Cell processor image from “IBM’s Cell Processor: Preview to Greatness?” www.pcstats.com/articleview.cfm?articleid=1727
Images not to scale
Both chips are clocked at 3.2 GHz

5. Xenon architecture
Image from J. Andrews and N. Baker, “Xbox 360 System Architecture,” Hot Chips Presentation
Front Side Bus runs at 10.8 Gbit/sec read/write

6. Cell BE architecture
Figure labels: Local Stores; Synergistic Execution Units; Direct Memory Access; Synergistic Processing Elements; Power Processing Element (64-bit PowerPC)
Image from J. Andrews and N. Baker, “Xbox 360 System Architecture,” Hot Chips Presentation

7. What the PowerPC cores have in common
The PPE on Cell and each core on the Xbox 360 have:
• 64-bit PowerPC architecture
• Two simultaneous multithreading (SMT), fine-grained hardware threads (6 total in Xbox 360)
• Integer arithmetic, single and double precision floating point, single cycle for most instructions
• VMX128 “Altivec” vector processor
Information from Andrews & Baker and Kahle et al.
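
The memory and thread figures quoted above can be tallied in a few lines. The following is only an illustrative sketch: the `MemoryPool` class and the helper names are hypothetical, and the numbers are copied from the slides rather than measured. The Xbox 360's 10 MB embedded frame buffer is deliberately left out of the main-memory total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryPool:
    """One pool of console RAM (hypothetical helper type for this sketch)."""
    name: str
    size_mb: int
    shared_by_cpu_and_gpu: bool


# Figures taken from the memory slide; the 10 MB embedded frame buffer
# on the Xbox 360 GPU is not counted here.
XBOX_360_POOLS = [MemoryPool("GDDR3 (unified)", 512, True)]
PS3_POOLS = [
    MemoryPool("XDR main RAM", 256, False),
    MemoryPool("GDDR3 video RAM", 256, False),
]


def total_mb(pools):
    """Sum the sizes of all pools, in MB."""
    return sum(pool.size_mb for pool in pools)


def hardware_threads(cores, threads_per_core=2):
    """Hardware threads exposed by `cores` two-way SMT PowerPC cores."""
    return cores * threads_per_core


if __name__ == "__main__":
    print("Xbox 360 main memory (MB):", total_mb(XBOX_360_POOLS))
    print("PS3 main + video memory (MB):", total_mb(PS3_POOLS))
    print("Xenon hardware threads:", hardware_threads(3))
    print("Cell PPE hardware threads:", hardware_threads(1))
```

Both consoles end up with the same 512 MB total; the difference the slides emphasize is that the Xbox 360 exposes it as one pool shared by CPU and GPU, while the PS3 splits it into two pools of different memory types.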
8. VMX128 “Altivec” vector processor
• 128 registers of 128 bits each (4 single-precision elements per register) per hardware thread
  – 6 total Altivec-style register files in Xbox 360
• Floating point arithmetic, dot product, permute
• On Xbox 360, CPU can convert 3D data to Direct3D compressed data formats before storing in L2 cache or main memory
  – Typically 50% savings in bandwidth and memory footprint
Information from Andrews & Baker and Kahle et al.

9. Set-associative caches
From Wikipedia entry on “cache algorithms”

10. Caches
• Each PowerPC core on Xenon has:
  – 32 KB L1 two-way set-associative instruction cache
  – 32 KB L1 four-way set-associative, write-through data cache
• L1 data cache doesn’t allocate cache lines on write misses
• xDCBT “extended data cache block touch” instruction for prefetching data directly into the L1 cache, rather than into the L2 cache as usual
  – Avoids thrashing L2 cache
• PowerPC core on Cell has:
  – 32 KB L1 instruction cache
  – 32 KB L1 data cache
  – 512 KB L2 cache
Information from Andrews & Baker and Kahle et al.

11. Xenon’s L2 cache
• All three PowerPC cores share a 1 MB, 8-way set-associative L2 cache
• Cache set locking: “common in embedded systems, but not PCs”
• Lets the cores dynamically allocate L2 usage
• Facilitates communication between cores
• GPU can read directly from the L2 cache
Information from J. Andrews and N. Baker, “Xbox 360 System Architecture,” IEEE Micro, March-April 2006, pp. 25-37.

12. Xenon’s L2 cache architecture
Image from J. Brown, “The Microsoft Xbox 360 CPU story” www-128.ibm.com/developerworks/power/library/pa-fpfxbox

13. Xenon core architecture
Image from J. Brown, “The Microsoft Xbox 360 CPU story” www-128.ibm.com/developerworks/power/library/pa-fpfxbox

14. Cell PPE architecture
Image from J.A. Kahle et al., “Introduction to the Cell Processor,” IBM J. Res. & Dev., Vol. 49, No. 4/5, July/Sept. 2005, pp. 589-604.

15. Cell PPE pipeline
Image from J.A. Kahle et al., “Introduction to the Cell Processor,” IBM J. Res. & Dev., Vol. 49, No. 4/5, July/Sept. 2005, pp. 589-604.

16. Cell SPE architecture
“The SPEs are not coprocessors.” – Mike Acton, Engine Director, Insomniac Games, and keeper of www.cellperformance.com
Image from J.A. Kahle et al., “Introduction to the Cell Processor,” IBM J. Res. & Dev., Vol. 49, No. 4/5, July/Sept. 2005, pp. 589-604.

17. Cell SPE pipeline
Image from J.A. Kahle et al., “Introduction to the Cell Processor,” IBM J. Res. & Dev., Vol. 49, No. 4/5, July/Sept. 2005, pp. 589-604.

18. GPUs: Xbox 360 Xenos vs. PS3 RSX
Images not to scale
Xenos image from Wikipedia
RSX image from www.pctuning.cz/index.php?option=com_content&task=view&id=7787&Itemid=88&limit=1&limitstart=2

19. Xbox 360 GPU architecture
Image from J. Andrews and N. Baker, “Xbox 360 System Architecture,” Hot Chips Presentation

20. Xbox 360 GPU layout
Image from J. Andrews and N. Baker, “Xbox 360 System Architecture,” Hot Chips Presentation
Images not to scale

21. GPUs: Xbox 360 Xenos vs. PS3 RSX (1)
• Xbox 360: ATI Xenos
  • 500 MHz
  • Precursor to Radeon HD 2000 series
  • 16 vertex fetch units with built-in tessellation
  • 48 unified shaders (can do vertices or pixels)
    – All 48 have to be doing either vertices or pixels in one clock cycle
    – Can change from cycle to cycle
    – Rumored to have more than 48 per chip, which gives higher yields
  • 16 texture interpolating (filtering) units
  • 16 texture fetch (addressing) units
  • 8 render output units
• PS3: NVIDIA RSX “Reality Synthesizer”
  • 550 MHz
  • Somewhat like 7800 (G70)
  • 24 pixel shaders
  • 8 vertex shaders
  • 24 texture filtering units
  • 8 texture addressing units
  • 8 render output units

22. GPUs: Xbox 360 Xenos vs. PS3 RSX (2)
• 10 MB video buffer eRAM die includes some custom logic for color, alpha compositing, Z/stencil buffering, and anti-aliasing
  – Does not include textures
  – 256 GB/sec bandwidth to GPU
  – Currently on a separate die in the same package
  – Will probably be moved onto the same die later (a guess)
  – Buffer in eRAM is copied to main memory for output
• Video buffer part of 256 MB video RAM
• Cell FlexIO bus interface
  – 20 GB/s read to the Cell and XDR memory
  – 15 GB/s write to the Cell and XDR memory

23. Xbox 360 CPU/GPU/memory synergy
• GPU can read data directly from CPU’s L2 cache through the FSB without going through main
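
The cache sizes and associativities listed for Xenon (32 KB two-way L1 instruction, 32 KB four-way L1 data, 1 MB eight-way shared L2) determine how many sets each cache has and how an address splits into tag, set index, and byte offset. The sketch below works through that arithmetic. The slides do not state a cache line size, so the 128-byte value here is an assumption, and the function names are made up for this illustration.

```python
KB = 1024

# ASSUMPTION: the slides do not give a line size; 128 bytes is used
# here purely for illustration.
LINE_BYTES = 128

# (size in bytes, associativity) as listed on the cache slides.
XENON_CACHES = {
    "L1 instruction": (32 * KB, 2),
    "L1 data": (32 * KB, 4),
    "L2 (shared)": (1024 * KB, 8),
}


def num_sets(size_bytes, ways, line_bytes):
    """Number of sets = total lines / lines per set (the associativity)."""
    total_lines = size_bytes // line_bytes
    return total_lines // ways


def split_address(addr, sets, line_bytes):
    """Split an address into (tag, set index, byte offset).

    Both `sets` and `line_bytes` must be powers of two.
    """
    offset_bits = line_bytes.bit_length() - 1  # log2(line_bytes)
    index_bits = sets.bit_length() - 1         # log2(sets)
    offset = addr & (line_bytes - 1)
    index = (addr >> offset_bits) & (sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


if __name__ == "__main__":
    for name, (size, ways) in XENON_CACHES.items():
        sets = num_sets(size, ways, LINE_BYTES)
        print(f"{name}: {ways}-way, {sets} sets")
    # Example lookup in the four-way L1 data cache.
    l1d_sets = num_sets(32 * KB, 4, LINE_BYTES)
    print(split_address(0x12345, l1d_sets, LINE_BYTES))
```

Under the assumed line size, a lookup only has to compare the tag against the lines in one set (two, four, or eight of them), which is the trade-off between direct-mapped and fully associative designs that the set-associative cache slide illustrates.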

