UW-Madison ME 964 - High Performance Computing for Engineering Applications


ME964: High Performance Computing for Engineering Applications

"I have traveled the length and breadth of this country and talked with the best people, and I can assure you that data processing is a fad that won't last out the year."
The editor in charge of business books for Prentice Hall, 1957

© Dan Negrut, 2012, ME964 UW-Madison

The Eclipse IDE
Parallel Computing: Why and Why Now?
February 2, 2012

Before We Get Started…
- Last time:
  - Wrap up of the quick overview of C programming
  - Super quick intro to gdb (debugging tool under Linux)
  - How to log in to Euler
  - Quick intro to Mercurial, the revision control tool used for handling your assignments
- Today:
  - Getting started with Eclipse, an integrated development environment (Andrew)
  - Parallel computing: why and why now? (Dan)
- First assignment sent out last week, available on the class website
  - HW 1 due tonight at 11:59 PM
  - Post related questions to the forum

Eclipse
~ An Integrated Development Environment ~

Eclipse on Euler
- Eclipse 3.7 (Indigo)
- Includes Parallel Tools Platform, Linux Tools, CMakeEditor
- Will be installed into your home directory (there were issues installing it system-wide)
- Other versions are available – just ask
- Managed by Environment Modules

Enabling Eclipse
- Open a terminal
- Load the Eclipse module by typing
      >> module load eclipse/3.7
  (the first time will take a while, since it is installing)
- Tell modules to load Eclipse by default:
      >> module initadd eclipse/3.7
- Start Eclipse:
      >> eclipse

Creating a Project
- File > New > C (C++) Project
- Select the Linux GCC toolchain
- Preferably put the source code in your repo, or copy it in by hand later
- Enable both the Debug and Release configurations
- All of this can be managed by CMake (later…)

Build/Run/Debug
- Build with the hammer: problems are displayed at the bottom, under 'Problems' and 'Console'
- Run with the 'play' button: output is shown under 'Console'
- Debug with the bug: switches to the 'Debug' perspective
  - A frontend to GDB, but not to cuda-gdb (yet…)
[Screenshot of the Debug perspective: source code, stack trace, variables in scope, breakpoints, etc.]

Parallel Computing: Why? & Why Now?

The Long View…
- Sequential computing has been losing steam recently…
- The rest of the decade seems to belong to parallel computing

High Performance Computing (HPC): Why, and Why Now
Objectives of this course segment:
- Discuss some barriers facing the traditional sequential computation model
- Discuss some solutions suggested by recent trends in the hardware and software industries
- Overview of hardware and software solutions in relation to parallel computing

Acknowledgements
- This presentation includes material due to:
  - Hennessy and Patterson (Computer Architecture, 4th edition)
  - John Owens, UC-Davis
  - Darío Suárez, Universidad de Zaragoza
  - John Cavazos, University of Delaware
  - Others, as indicated on various slides
- I apologize if I included a slide and didn't give credit where it was due

CPU Speed Evolution [log scale]
[Figure: CPU speed evolution on a log scale. Courtesy of Elsevier, Computer Architecture, Hennessy and Patterson, fourth edition]

"…we can expect very little improvement in serial performance of general purpose CPUs. So if we are to continue to enjoy improvements in software capability at the rate we have become accustomed to, we must use parallel computing. This will have a profound effect on commercial software development including the languages, compilers, operating systems, and software development tools, which will in turn have an equally profound effect on computer and computational scientists."
John L. Manferdelli, Microsoft Corporation Distinguished Engineer, leads the eXtreme Computing Group (XCG) System, Security and Quantum Computing Research Group
Three Walls to Serial Performance
- Memory Wall
- Instruction Level Parallelism (ILP) Wall
- Power Wall
Source: "The Many-Core Inflection Point for Mass Market Computer Systems", by John L. Manferdelli, Microsoft Corporation
http://www.ctwatch.org/quarterly/articles/2007/02/the-many-core-inflection-point-for-mass-market-computer-systems/

Memory Wall
- Memory Wall: what is it?
  - The growing disparity in speed between the CPU and the memory outside the CPU chip
  - Memory latency is a barrier to computer performance improvements
  - Current architectures have ever-growing caches to improve the "average memory reference" time to fetch or write instructions or data
- The Memory Wall is due to latency and limited communication bandwidth beyond chip boundaries
  - From 1986 to 2000, CPU speed improved at an annual rate of 55%, while memory access speed improved at only 10%

Memory Bandwidths [typical embedded, desktop, and server computers]
[Figure: memory bandwidths of typical embedded, desktop, and server computers. Courtesy of Elsevier, Computer Architecture, Hennessy and Patterson, fourth edition]

Memory Speed: Widening of the Processor-DRAM Performance Gap
- The processor: a victim of its own success; so fast it left the memory behind
- The CPU-memory duo can't move as fast as you'd like (based on CPU top speeds) when paired with a sluggish memory
- The plot on the next slide shows, on a log scale, the increasing gap between CPU and memory
  - The memory baseline: 64 KB DRAM in 1980
  - Memory speed increased at a rate of approximately 1.07x per year
  - However, processors improved at:
    - 1.25x per year (1980-1986)
    - 1.52x per year (1986-2004)
    - 1.20x per year (2004-2010)
  (a short C example compounding these rates appears at the end of this section)

Memory Speed: Widening of the Processor-DRAM Performance Gap
[Figure: processor vs. DRAM performance over time, log scale. Courtesy of Elsevier, Computer Architecture, Hennessy and Patterson, fourth edition]

Memory Latency vs. Memory Bandwidth
- Latency: the amount of time it takes for an operation to complete
  - Measured in seconds
  - The utility "ping" in Linux measures the latency of a network
  - For memory transactions: send 32 bits to the destination and back, and measure how much time it takes; that gives you the latency
- Bandwidth: how much data can be transferred per second
  - You can talk about bandwidth for memory, but also for a network (Ethernet, Infiniband, modem, DSL, etc.)
- Improving latency and bandwidth
  - The job of our colleagues in Electrical Engineering
  - Once in a while, our friends in Materials Science deliver a breakthrough
  - Promising technologies: optical networks and memory layered on top of the chip

Memory Latency vs. Memory Bandwidth
- Memory access latency is significantly more challenging to improve than memory bandwidth (a short C sketch contrasting the two appears at the end of this section)
  - Improving bandwidth: add more "pipes". This requires more pins coming out of the chip for DRAM, for instance. Tricky…
  - Improving latency: not …
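
The following is a minimal C sketch, not taken from the slides, meant to make the latency-versus-bandwidth distinction above concrete. A dependent pointer-chasing loop is limited by memory latency (each load must wait for the previous one to return), while a simple streaming sum over a large array is limited mainly by memory bandwidth and the hardware prefetchers. The array size, the use of clock() for timing, and all names are arbitrary choices made for this illustration.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N ((size_t)1 << 23)   /* 8M elements; chosen to be far larger than the caches */

    int main(void) {
        size_t *next = malloc(N * sizeof *next);   /* permutation for pointer chasing */
        double *data = malloc(N * sizeof *data);   /* plain array for streaming */
        if (!next || !data) return 1;

        for (size_t i = 0; i < N; i++) { next[i] = i; data[i] = 1.0; }

        /* Sattolo's variant of the Fisher-Yates shuffle (j strictly less than i)
         * turns 'next' into one single long cycle, so the chase below cannot
         * settle into a short, cache-resident loop. */
        for (size_t i = N - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;
            size_t t = next[i]; next[i] = next[j]; next[j] = t;
        }

        clock_t t0 = clock();
        size_t p = 0;
        for (size_t i = 0; i < N; i++) p = next[p];    /* dependent chain: latency-bound */
        clock_t t1 = clock();

        double sum = 0.0;
        for (size_t i = 0; i < N; i++) sum += data[i]; /* independent loads: bandwidth-bound */
        clock_t t2 = clock();

        printf("pointer chase: %.3f s (last index %zu)\n",
               (double)(t1 - t0) / CLOCKS_PER_SEC, p);
        printf("streaming sum: %.3f s (sum %.1f)\n",
               (double)(t2 - t1) / CLOCKS_PER_SEC, sum);
        return 0;
    }

On typical hardware the pointer chase takes far longer than the streaming sum even though both loops touch the same number of elements, which is one way to see why latency, not bandwidth, is the harder number to improve.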
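
As a rough, back-of-the-envelope companion to the processor-DRAM gap discussion above, the short C snippet below simply compounds the per-year improvement rates quoted on the slide (1.25x, 1.52x, and 1.20x for the processor over 1980-1986, 1986-2004, and 2004-2010, and roughly 1.07x per year for DRAM). The resulting factors are illustrative arithmetic on those quoted rates, not data taken from Hennessy and Patterson.

    /* Compile with the math library, e.g.:  cc gap.c -lm   (file name is arbitrary) */
    #include <math.h>
    #include <stdio.h>

    int main(void) {
        /* Processor: 1.25x/yr for 6 yrs, 1.52x/yr for 18 yrs, 1.20x/yr for 6 yrs */
        double cpu  = pow(1.25, 6) * pow(1.52, 18) * pow(1.20, 6);
        /* DRAM: roughly 1.07x/yr over the same 30 years */
        double dram = pow(1.07, 30);

        printf("Processor improvement, 1980-2010: ~%.0fx\n", cpu);   /* roughly 20,000x */
        printf("DRAM speed improvement, 1980-2010: ~%.1fx\n", dram); /* roughly 7.6x */
        printf("Gap grew by a factor of about %.0f\n", cpu / dram);  /* a few thousand */
        return 0;
    }

Even with these coarse numbers, the processor curve outruns the memory curve by roughly three orders of magnitude over three decades, which is the widening gap the log-scale plot illustrates.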

