Berkeley COMPSCI 252 - Lec 5 – Projects + Prerequisite Quiz

EECS 252 Graduate Computer Architecture
Lec 5 – Projects + Prerequisite Quiz

David Patterson
Electrical Engineering and Computer Sciences
University of California, Berkeley
http://www.eecs.berkeley.edu/~pattrsn
http://www-inst.eecs.berkeley.edu/~cs252

01/13/19, CS252-s06, Lec 05-projects + prereq

Slide titles:
Review from last lecture #1/3: The Cache Design Space
Review from last lecture #2/3: Caches
Review from last lecture #3/3: TLB, Virtual Memory
Problems with Sea Change
Build Academic MPP from FPGAs
Characteristics of Ideal Academic CS Research Supercomputer?
Why RAMP Good for Research MPP?
RAMP 1 Hardware
Multiple Module RAMP 1 Systems
Quick Sanity Check
RAMP Development Plan
RAMP Milestones
Gateware Design Framework
RAMP FAQ
RAMP Status
RAMP uses (internal)
Multiprocessing Watering Hole
Supporters (wrote letters to NSF)
RAMP Summary
CS 252 Projects
CS252: Administrivia
4 Papers
Computers in the News
SOTU Transcript

Review from last lecture #1/3: The Cache Design Space
• Several interacting dimensions
  – cache size
  – block size
  – associativity
  – replacement policy
  – write-through vs. write-back
  – write allocation
• The optimal choice is a compromise
  – depends on access characteristics
    » workload
    » use (I-cache, D-cache, TLB)
  – depends on technology / cost
• Simplicity often wins
[Figure: design-space axes Associativity, Cache Size, Block Size; tradeoff curve from Bad to Good as Factor A and Factor B vary from Less to More]

Review from last lecture #2/3: Caches
• The Principle of Locality:
  – Programs access a relatively small portion of the address space at any instant of time.
    » Temporal Locality: Locality in Time
    » Spatial Locality: Locality in Space
• Three Major Categories of Cache Misses:
  – Compulsory Misses: sad facts of life. Example: cold start misses.
  – Capacity Misses: increase cache size
  – Conflict Misses: increase cache size and/or associativity. Nightmare Scenario: ping pong effect!
• Write Policy: Write Through vs. Write Back
• Today CPU time is a function of (ops, cache misses) vs. just f(ops): affects Compilers, Data structures, and Algorithms

Review from last lecture #3/3: TLB, Virtual Memory
• Page tables map virtual address to physical address
• TLBs are important for fast translation
• TLB misses are significant in processor performance
  – funny times, as most systems can’t access all of 2nd level cache without TLB misses!
• Caches, TLBs, Virtual Memory all understood by examining how they deal with 4 questions:
  1) Where can block be placed?
  2) How is block found?
  3) What block is replaced on miss?
  4) How are writes handled?
• Today VM allows many processes to share a single memory without having to swap all processes to disk; today VM protection is more important than memory hierarchy benefits, but computers are insecure

Problems with Sea Change
1. Algorithms, Programming Languages, Compilers, Operating Systems, Architectures, Libraries, … not ready for 1000 CPUs / chip
2. Software people don’t start working hard until hardware arrives
  • 3 months after HW arrives, SW people list everything that must be fixed, then we all wait 4 years for next iteration of HW/SW
3. How to get 1000-CPU systems into the hands of researchers to innovate in a timely fashion in algorithms, compilers, languages, OS, architectures, … ?
4. Skip the waiting years between HW/SW iterations?

Build Academic MPP from FPGAs
• As ~25 CPUs fit in a Field Programmable Gate Array (FPGA), 1000-CPU system from ~40 FPGAs?
• 16 32-bit simple “soft core” RISC at 150 MHz in 2004 (Virtex-II)
• FPGA generations every 1.5 yrs; ~2X CPUs, ~1.2X clock rate
• HW research community does logic design (“gate shareware”) to create out-of-the-box MPP
  – E.g., 1000 processor, standard ISA binary-compatible, 64-bit, cache-coherent supercomputer @ 200 MHz/CPU in 2007
  – RAMPants: Arvind (MIT), Krste Asanović (MIT), Derek Chiou (Texas), James Hoe (CMU), Christos Kozyrakis (Stanford), Shih-Lien Lu (Intel), Mark Oskin (Washington), David Patterson (Berkeley, Co-PI), Jan Rabaey (Berkeley), and John Wawrzynek (Berkeley, PI)
• “Research Accelerator for Multiple Processors”

Characteristics of Ideal Academic CS Research Supercomputer?
• Scale – Hard problems at 1000 CPUs
• Cheap – 2006 funding of academic research
• Cheap to operate, Small, Low Power – $ again
• Community – share SW, training, ideas, …
• Simplifies debugging – high SW churn rate
• Reconfigurable – test many parameters, imitate many ISAs, many organizations, …
• Credible – results translate to real computers
• Performance – run real OS and full apps, results overnight

Why RAMP Good for Research MPP?
                                 SMP                   Cluster               Simulate                RAMP
Scalability (1k CPUs)            C                     A                     A                       A
Cost (1k CPUs)                   F ($40M)              C ($2-3M)             A+ ($0M)                A ($0.1-0.2M)
Cost of ownership                A                     D                     A                       A
Power/Space (kilowatts, racks)   D (120 kw, 12 racks)  D (120 kw, 12 racks)  A+ (.1 kw, 0.1 racks)   A (1.5 kw, 0.3 racks)
Community                        D                     A                     A                       A
Observability                    D                     C                     A+                      A+
Reproducibility                  B                     D                     A+                      A+
Reconfigurability                D                     C                     A+                      A+
Credibility                      A+                    A+                    F                       A
Perform. (clock)                 A (2 GHz)             A (3 GHz)             F (0 GHz)               C (0.1-.2 GHz)
GPA                              C                     B-                    B                       A-

RAMP 1 Hardware
• Completed Dec. 2004 (14x17 inch 22-layer PCB)
• Module:
  – 5 Virtex II FPGAs, 18 banks DDR2-400 memory, 20 10GigE conn.
  – Administration/maintenance ports:
    » 10/100 Enet
    » HDMI/DVI
    » USB
  – ~$4K in Bill of Materials (w/o FPGAs or DRAM)
BEE2: Berkeley Emulation Engine 2
By John Wawrzynek and Bob Brodersen with students Chen Chang and Pierre Droz

Multiple Module RAMP 1 Systems
• 8 compute modules (plus power supplies) in 8U rack mount chassis
• 2U single module tray for developers
• Many topologies possible
• Disk storage: via disk emulator + Network Attached Storage

Quick Sanity Check
• BEE2 uses old FPGAs (Virtex II), 4 banks DDR2-400/cpu
• 16 32-bit Microblazes per Virtex II FPGA, 0.75 MB memory for caches
  – 32 KB direct mapped Icache, 16 KB direct mapped Dcache
• Assume 150 MHz, CPI is 1.5 (4-stage pipe)
  – I$ Miss rate is 0.5% for SPECint2000
  – D$ Miss rate is 2.8% for SPECint2000, 40% Loads/stores
• BW need/CPU = 150/1.5*4B*(0.5% + 40%*2.8%) = 6.4 MB/sec
• BW need/FPGA = 16*6.4 =
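
The bandwidth arithmetic in the Quick Sanity Check can be reproduced with a short Python sketch. It only restates the slide's formula with the slide's inputs (150 MHz, CPI 1.5, 4 bytes per miss, 0.5% I$ miss rate, 2.8% D$ miss rate, 40% loads/stores, 16 CPUs per FPGA). The function and variable names are made up for illustration, and the per-FPGA figure simply multiplies the per-CPU figure by 16, as the slide's last line sets out to do.

```python
def bw_need_per_cpu_mb_s(clock_mhz, cpi, bytes_per_miss,
                         icache_miss_rate, dcache_miss_rate, mem_ref_fraction):
    """Memory bandwidth one CPU needs, in MB/sec, following the slide's formula."""
    # Millions of instructions executed per second.
    minstr_per_sec = clock_mhz / cpi
    # Every instruction is fetched (I$ miss rate); only loads/stores touch the D$.
    misses_per_instr = icache_miss_rate + mem_ref_fraction * dcache_miss_rate
    return minstr_per_sec * bytes_per_miss * misses_per_instr


# Inputs taken from the slide; names are hypothetical.
per_cpu = bw_need_per_cpu_mb_s(
    clock_mhz=150, cpi=1.5, bytes_per_miss=4,
    icache_miss_rate=0.005, dcache_miss_rate=0.028, mem_ref_fraction=0.40,
)
cpus_per_fpga = 16
per_fpga = cpus_per_fpga * per_cpu

print(f"BW need/CPU  = {per_cpu:.2f} MB/sec")
print(f"BW need/FPGA = {per_fpga:.2f} MB/sec")
```

With these inputs the unrounded per-CPU product is 6.48 MB/sec, which the slide quotes as 6.4 MB/sec; 16 CPUs then need on the order of 100 MB/sec per FPGA.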

