UCF COT 4810 - Game Processor Architectures

This preview shows page 1-2-17-18-19-35-36 out of 36 pages.

Game Processor Architectures
Dr. Mark Heinrich
COT 4810
UCF EECS
University of Central Florida

PlayStation 3

Cell Broadband Engine (CBE)

Cell Origins and Acknowledgments
- Cell is the result of a partnership between Sony, Toshiba, and IBM
- Cell represents the work of more than 400 people starting in 2001
- More detailed papers on the Cell implementation and the SPE micro-architecture can be found in the ISSCC 2005 proceedings

CBE Architecture
- Effectively a 9-way multiprocessor
  - 8-way CMP plus one control processor, designed by IBM
- One main 64-bit PPE processor
  - Power Processor Element, 2 hardware threads
  - Good at control tasks, task switching, OS-level code
- 8 SPE processors
  - Synergistic Processor Element
  - Good at compute-intensive tasks
- Like SIMD multiprocessors of old... sort of

Attributes of Cell
- Cell is Multi-Core
  - Contains a 64-bit Power Architecture(TM) core
  - Contains 8 Synergistic Processor Elements (SPE)
- Cell is a Flexible Architecture
  - Multi-OS support (including Linux) with virtualization technology
  - Path for OS, legacy apps, and software development
- Cell is a Broadband Architecture
  - SPE is a RISC architecture with SIMD organization and Local Store
  - 128+ concurrent transactions to memory per processor (16 per SPE)
- Cell is a Real-Time Architecture
  - Resource allocation (for Bandwidth Measurement)
  - Locking caches (via Replacement Management Tables)

CBE Block Diagram
[Figure: CBE block diagram]

Another CBE Block Diagram
[Figure: PPE (PXU with L1 and L2) and eight SPEs (each an SXU with Local Store and MFC) on the EIB (up to 96 bytes/cycle), with the MIC to dual XDR(TM) memory and the BIC to FlexIO(TM)]

Power Processor Element
- 64-bit PowerPC Architecture
- In-order, 2-way hardware multi-threaded RISC core
- Coherent load/store with 32 KB I & D L1 and 512 KB L2
- Traditional virtual memory subsystem
- Supports Vector/SIMD instruction set
- Runs the OS, manages system resources, etc.

Synergistic Processor Element
- RISC core
- Dual issue, up to 16-way 128-bit SIMD
- 128-bit, 128-entry register file
- 256 KB local store
- Vector/SIMD
- MFC controls DMAs to/from Local Store over EIB

Synergistic Processor Element
- User-mode architecture
  - No translation/protection within SPU
  - DMA is full Power Arch protect/x-late
- Direct programmer control
  - DMA/DMA-list
  - Branch hint
- VMX-like SIMD dataflow
  - Broad set of operations
  - Graphics SP-Float
  - IEEE DP-Float (BlueGene-like)
- 256 KB Local Store
  - Combined I & D
  - 16 B/cycle L/S bandwidth
  - 128 B/cycle DMA bandwidth
[Figure: SPE floorplan showing LS, GPR, FXU ODD, FXU EVN, SFP, DP, CONTROL, CHANNEL, DMA, SMM, ATO, SBI, RTB, BEB, FWD (SPU and SMF); 14.5 mm² in 90 nm SOI]

DMA Transfers
- Primary method of transferring data to/from the SPU's local store
- Maximum size 16 KB
- Can be initiated by either PPE or SPE, but typically initiated by the SPE
- Offloads data transfer work to the DMA controller
  - SPU continues with computation
- Double buffer for efficient use

Usage models
- Multistage pipeline
- Parallel services

Cell Can Support Many Systems
- Game console systems
- Blade systems (QS20)
- HDTV
- Home media servers
- Supercomputers
[Figure: single- and multi-processor Cell configurations; each Cell processor has two XDR(TM) memories, and processors connect through IOIF/BIF interfaces and a switch (SW)]

Programming the CBE
- C/C++ with PPU/SPU vector intrinsics
- SDK available from IBM
  - System Simulator
  - GNU toolchain
  - Documentation
  - Sample code
- More later

Targeting the SPU
- Two pipelines, dual-issue
  - Even (execute: fixed- and floating-point arithmetic)
  - Odd (load & store, shuffle, branch)
- Design for maximum SIMD operation
- No SPU hardware branch prediction
  - Programmer/compiler specified branch hints
  - ~20 cycle penalty for misdirected branch hints
- Maximum use of register file
  - Loop unrolling

Cell Characteristics
- Clock speed: > 3.2 GHz
- Peak performance (single precision): > 204.8 GFLOPS
- Peak performance (double precision): > 20.8 GFLOPS
- Area: 221 mm²
- Technology: 90 nm SOI
- Total # of transistors: 234M (slightly less than Core 2 Duo)

Peak GFLOPs (SPEs only)
[Figure: bar chart of peak GFLOPS (axis 0 to 200), single precision and double precision, for FreeScale DC 1.5 GHz, PPC 970 2.2 GHz, AMD DC 2.2 GHz, Intel SC 3.6 GHz, and Cell 3.0 GHz]

Xbox 360 – "Xenon" processor
- Provides game developers with a balanced, powerful platform
  - Three SMT processors, 32 KB L1 D$ & I$, 1 MB UL2 cache
  - 165M transistors total
  - 3.2 GHz near-POWER ISA
  - 2-issue, 21-stage pipeline, with 128 128-bit registers
  - Weak branch prediction, supported by software hinting
  - In-order instructions
  - Narrow cores: 2 INT units, 2 128-bit VMX units, 1 of anything else
- An ATI-designed 500 MHz GPU with 512 MB of GDDR3 DRAM
  - 337M transistors, 10 MB framebuffer
  - 48 pixel shader cores, each with 4 ALUs

Xenon Diagram
[Figure: three cores (Core 0 to Core 2, each with L1D and L1I) sharing a 1 MB UL2; GPU with BIU/IO interface, 3D core, 10 MB EDRAM, video out, and memory controllers MC0/MC1 to 512 MB DRAM; analog chip, XMA decoder, SMC, DVD, HDD port, front USBs (2), wireless, MU ports (2 USBs), rear USB (1), Ethernet, IR, audio out, flash, systems control]

Xbox 360 CPU
- Custom-designed IBM PowerPC-based CPU with 3 symmetrical cores running at 3.2 GHz each; 2 hardware threads per core and 6 hardware threads total

Graphics Processor
- 500 MHz custom-designed chip
  - Developed by Microsoft and ATI
  - 48 parallel processing units
  - 10 MB of embedded DRAM
  - Unified Shader Architecture: one unit can execute both pixel and vertex shader instructions

Memory/Hard Drive
- Total memory
  - 512 MB GDDR3 RAM
- Hard drive
  - Detachable and upgradeable 20 GB hard drive
  - Serial ATA interface for data and power connector
  - Same as any other SATA notebook HDD

Hardware Abstraction Layer
- There is virtually no hardware abstraction layer on the Xbox 360. Everything has direct access to the hardware
- This eliminates much of the lag and software overhead you might see on a PC

I/O Ports
- 3 USB 2.0 ports
  - Used for controllers,
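The DMA Transfers slide gives two constraints that shape SPE code: a single transfer is at most 16 KB, and double buffering keeps the SPU computing while the next transfer is in flight. The sketch below only models that pattern in plain Python and is not Cell SDK code. The names `dma_chunks`, `process_double_buffered`, and `fetch` are hypothetical, and `fetch` is an ordinary synchronous copy standing in for an asynchronous MFC "get", so nothing truly overlaps here; only the chunking and the buffer rotation are illustrated.

```python
from typing import Callable, Iterator, List, Optional, Tuple

MAX_DMA_BYTES = 16 * 1024  # slide: maximum size of one DMA transfer is 16 KB


def dma_chunks(total_bytes: int,
               max_bytes: int = MAX_DMA_BYTES) -> Iterator[Tuple[int, int]]:
    """Yield (offset, size) pairs covering total_bytes, each at most max_bytes."""
    offset = 0
    while offset < total_bytes:
        size = min(max_bytes, total_bytes - offset)
        yield offset, size
        offset += size


def process_double_buffered(data: bytes,
                            compute: Callable[[bytes], int]) -> List[int]:
    """Apply compute() to each chunk of data, rotating between two buffers.

    Assumption: fetch() is a stand-in for an asynchronous DMA 'get'. It is a
    plain copy here, so the overlap of transfer and compute is only modeled.
    """
    chunks = list(dma_chunks(len(data)))
    if not chunks:
        return []

    def fetch(index: int) -> bytes:
        offset, size = chunks[index]
        return data[offset:offset + size]

    buffers: List[Optional[bytes]] = [None, None]
    buffers[0] = fetch(0)  # prime the first buffer before the loop starts
    results = []
    for i in range(len(chunks)):
        current = i % 2
        if i + 1 < len(chunks):
            # Request the next chunk into the idle buffer...
            buffers[1 - current] = fetch(i + 1)
        # ...and compute on the buffer that is already filled.
        results.append(compute(buffers[current]))
    return results
```

For example, `list(dma_chunks(40000))` splits a 40,000-byte transfer into two full 16 KB pieces and one 7,232-byte remainder. On real hardware the request for chunk `i + 1` would return immediately and the code would wait on its completion only just before using that buffer.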
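The single-precision peak on the Cell Characteristics slide (204.8 GFLOPS at 3.2 GHz) can be reproduced with simple arithmetic. The sketch below assumes, beyond what the slides state, that each SPE completes one fused multiply-add per 32-bit lane per cycle and that this counts as two floating-point operations; the lane count follows from the 128-bit register width. The function name `peak_sp_gflops` and its parameters are hypothetical.

```python
def peak_sp_gflops(num_spes: int,
                   clock_ghz: float,
                   register_bits: int = 128,
                   element_bits: int = 32,
                   flops_per_lane_per_cycle: int = 2) -> float:
    """Estimate peak single-precision GFLOPS for the SPEs alone.

    Assumption: one fused multiply-add (counted as 2 flops) per SIMD lane
    per cycle on every SPE. The PPE's contribution is ignored.
    """
    lanes = register_bits // element_bits  # 128-bit register / 32-bit float = 4
    flops_per_cycle = num_spes * lanes * flops_per_lane_per_cycle
    return flops_per_cycle * clock_ghz


# 8 SPEs at the 3.2 GHz clock quoted on the slide
at_3_2_ghz = peak_sp_gflops(8, 3.2)

# 8 SPEs at the 3.0 GHz clock used in the "Peak GFLOPs (SPEs only)" chart
at_3_0_ghz = peak_sp_gflops(8, 3.0)
```

Under these assumptions the estimate is 8 × 4 × 2 × 3.2 = 204.8 GFLOPS, matching the slide, and 192 GFLOPS at 3.0 GHz. The double-precision figure is not modeled here because the slides do not give the double-precision issue rate.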

