CORNELL CS 3410 - Lecture Notes

What does the Future Hold?
Hakim Weatherspoon
CS 3410, Spring 2011
Computer Science
Cornell University

Outline
Announcements; Moore's Law; Processor Performance Increase; What's next; Cloud Computing; Cloud Computing = Network of Datacenters; Cloud Computing; Cloud Computing; Example: Energy and Performance; What's next; Brief History; Faster than Moore's Law; NVIDIA Tesla Architecture; Why are GPUs so fast?; General computing with GPUs; AMD's Hybrid CPU/GPU; Cell; Parallelism; What's next; Where is the Market? (three slides); Where to?; Security?; Survey Questions; Why?; Where to?

Announcements
Final Project Demo Sign-Up:
• Signup sheet in front of room now.
• On desk in front of my office later today.
Sign up for Monday, May 16th, Tuesday, May 17th, or Wednesday, May 18th.
CMS submission due:
• Due 11:59pm Wednesday, May 18th
• Grace period until 4:59pm, May 19th

More of Moore

Moore's Law
Moore's Law introduced in 1965
• Number of transistors that can be integrated on a single die would double every 18 to 24 months (i.e., grow exponentially with time).
Amazingly visionary
• 2300 transistors, 1 MHz clock (Intel 4004) - 1971
• 16 million transistors (Ultra Sparc III)
• 42 million transistors, 2 GHz clock (Intel Xeon) - 2001
• 55 million transistors, 3 GHz, 130 nm technology, 250 mm^2 die (Intel Pentium 4) - 2004
• 290+ million transistors, 3 GHz (Intel Core 2 Duo) - 2007
• 731 million transistors, 2-3 GHz (Intel Nehalem) - 2009

Processor Performance Increase
[Figure: performance (SPEC Int, log scale from 1 to 10000) by year, 1987-2003. Labeled machines: SUN-4/260, MIPS M/120, MIPS M2000, IBM RS6000, HP 9000/750, DEC AXP/500, IBM POWER 100, DEC Alpha 4/266, DEC Alpha 5/500, DEC Alpha 21264/600, DEC Alpha 5/300, DEC Alpha 21264A/667, Intel Xeon/2000, Intel Pentium 4/3000. Slope ~1.7x/year.]

What's next
Cloud Computing

Cloud Computing
Datacenters are becoming a commodity
Order online and have it delivered
• Datacenter in a box: already set up with commodity hardware & software (Intel, Linux, petabyte of storage)
• Plug in data, power & cooling and turn on
  – typically connected via optical fiber
  – may have network of such datacenters

Cloud Computing = Network of Datacenters

Cloud Computing
Enable datacenters to coordinate over vast distances
• Optimize availability, disaster tolerance, energy
• Without sacrificing performance
• "cloud computing"
Drive underlying technological innovations.

Cloud Computing
Vision
The promise of the Cloud
• A computer utility; a commodity
• Catalyst for technology economy
• Revolutionizing health care, financial systems, scientific research, and society
However, cloud platforms today
• Entail significant risk: vendor lock-in vs control
• Entail inefficient processes: energy vs performance
• Entail poor communication: fiber optics vs COTS endpoints

Example: Energy and Performance
Why don't we save more energy in the cloud?
No one deletes data anymore!
• Huge amounts of seldom-accessed data
Data deluge
• Google (YouTube, Picasa, Gmail, Docs), Facebook, Flickr
• 100 GB per second is faster than hard disk capacity growth!
• Max amount of data accessible at one time << Total data
New scalable approach needed to store this data
• Energy footprint proportional to number of HDDs is not sustainable

What's next
Graphics Processing Units

Brief History
Pipeline stages: Display, Rasterization, Projection & Clipping, Transform & Lighting, Application
• The dark ages (early-mid 1990s), when there were only frame buffers for normal PCs.
• Some accelerators were no more than a simple chip that sped up linear interpolation along a single span, so increasing fill rate.
• This is where pipelines start for PC commodity graphics, prior to Fall of 1999.
• This part of the pipeline reaches the consumer level with the introduction of the NVIDIA GeForce256.
• Hardware today is moving traditional application processing (surface generation, occlusion culling) into the graphics accelerator.

FIGURE A.2.1 Historical PC. VGA controller drives graphics display from framebuffer memory. Copyright © 2009 Elsevier, Inc. All rights reserved.

Faster than Moore's Law
[Figure: peak performance ('s/sec, log scale from 10^4 to 10^9) by year, 1986-2000. Data points run from workstation graphics systems (e.g., HP CRX, SGI Iris, UNC Pxpl4, UNC Pxpl5, UNC/HP PixelFlow) through PC graphics (e.g., Voodoo, Glint, Nvidia TNT, GeForce, nVidia G70). Annotated eras: flat shading, Gouraud shading, antialiasing, textures. Reference level: one-pixel polygons (~10M polygons @ 30Hz). Slope ~2.4x/year (Moore's Law ~1.7x/year). Graph courtesy of Professor John Poulton (from Eric Haines).]

NVIDIA Tesla Architecture

FIGURE A.3.1 Direct3D 10 graphics pipeline. Each logical pipeline stage maps to GPU hardware or to a GPU processor. Programmable shader stages are blue, fixed-function blocks are white, and memory objects are grey. Each stage processes a vertex, geometric primitive, or pixel in a streaming dataflow fashion. Copyright © 2009 Elsevier, Inc. All rights reserved.

Why are GPUs so fast?
Pipelined and parallel
Very, very parallel: 128 to 1000 cores

FIGURE A.2.5 Basic unified GPU architecture. Example GPU with 112 streaming processor (SP) cores organized in 14 streaming multiprocessors (SMs); the cores are highly multithreaded. It has the basic Tesla architecture of an NVIDIA GeForce 8800. The processors connect with four 64-bit-wide DRAM partitions via an interconnection network. Each SM has eight SP cores, two special function units (SFUs), instruction and constant caches, a multithreaded instruction unit, and a shared memory. Copyright © 2009 Elsevier, Inc. All rights reserved.

General computing with GPUs
Can we use these for general computation?
Scientific Computing
• MATLAB codes
• Convex hulls
• Molecular Dynamics
• Etc.
NVIDIA's answer: Compute Unified Device Architecture (CUDA)
• MATLAB/Fortran/etc. -> "C for CUDA" -> GPU Codes

AMD's Hybrid CPU/GPU
AMD's answer: Hybrid CPU/GPU

Cell
IBM/Sony/Toshiba
Sony PlayStation 3
PPE
SPEs (synergistic)

Parallelism
Must exploit parallelism for performance
• Lots of parallelism in graphics applications
• Lots of parallelism in scientific computing
SIMD: single instruction, multiple data
• Perform same operation in parallel on many data items
• Data parallelism
MIMD: multiple instruction, multiple data
• Run separate programs in parallel (on different data)
• Task parallelism

What's next
Embedded Processors

Where is the Market?
1998
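The SIMD/MIMD distinction from the Parallelism slide can be sketched in a few lines of Python. This is only an illustration of the two programming models, not of real hardware: the function names `simd_style`, `mimd_style`, and `square` are hypothetical, and Python threads do not execute a single instruction across many data lanes the way a GPU or vector unit does.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """The single operation reused for every data item."""
    return x * x


def simd_style(op, data):
    """Data parallelism: the SAME operation applied to many data items."""
    with ThreadPoolExecutor() as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(op, data))


def mimd_style(tasks):
    """Task parallelism: DIFFERENT functions run on different data.

    `tasks` is a list of (function, argument) pairs.
    """
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn, arg) for fn, arg in tasks]
        # Collect results in submission order.
        return [f.result() for f in futures]


if __name__ == "__main__":
    # One operation, many data items.
    print(simd_style(square, [1, 2, 3, 4]))
    # Separate "programs", each with its own data.
    print(mimd_style([(sum, [1, 2, 3]), (max, [4, 9, 2]), (len, "gpu")]))
```

In the first call every worker runs `square`; in the second each worker runs a different function, which is the sense in which the slide contrasts data parallelism with task parallelism.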

