Berkeley COMPSCI C267 - Introduction


Slide titles:
CS267 / E233 Applications of Parallel Computers, Lecture 1: Introduction, 1/18/99
Outline
Administrative
Why we need powerful computers
Units of High Performance Computing
Why we need powerful computers
Some Challenge Computations
Global Climate Modeling
Heart Simulation
Parallel Computing in Web Search
Application: Document Retrieval
LSI Challenges
Transaction Processing - it’s all parallel at some scale
Why powerful computers are parallel
How fast can a serial computer be?
Trends in Parallel Computing Performance
Empirical Trends: Microprocessor Performance
Microprocessor Clock Rate
Microprocessor Transistors
Microprocessor Transistors & Parallelism
Processor-DRAM Gap (latency)
1st Principles
Principles of Parallel Computing
“Automatic” Parallelism in Modern Machines
Finding Enough Parallelism
Overhead of Parallelism
Locality and Parallelism
Load Imbalance
Parallel Programming for Performance is Challenging
Course Organization
Schedule of Topics
Reading Materials
Computing Resources
Requirements
Projects
What you should get out of the course
First Assignment

CS267 L1 Intro Demmel Sp 1999

CS267 / E233
Applications of Parallel Computers
Lecture 1: Introduction
1/18/99
James Demmel
[email protected]
www.cs.berkeley.edu/~demmel/cs267_Spr99

Outline
° Introductions
° Why large important problems require the capabilities of powerful computers
° Why powerful computers must be parallel processors
° Structure of the course

Administrative
° Instructors
  • Prof. Jim Demmel, 737 Soda, [email protected]
  • TA: Fred Wong, 533 Soda, [email protected]
° Office hours
  • T Th 2:15 - 3:30, and by appointment
° Accounts and others -- fill out online registration!
° Class survey -- fill out online!
° Discussion section: TBD, based on survey
° Most class material will be on class home page (including these notes):
  • www.cs.berkeley.edu/~demmel/cs267_Spr99

Why we need powerful computers

Units of High Performance Computing
1 Mflop   1 Megaflop   10^6 Flop/sec
1 Gflop   1 Gigaflop   10^9 Flop/sec
1 Tflop   1 Teraflop   10^12 Flop/sec
1 MB      1 Megabyte   10^6 Bytes
1 GB      1 Gigabyte   10^9 Bytes
1 TB      1 Terabyte   10^12 Bytes
1 PB      1 Petabyte   10^15 Bytes

Why we need powerful computers
° Traditional scientific and engineering paradigm
  • Do theory or paper design
  • Perform experiments or build system
° Replacing both by numerical experiments
  • Real phenomena are too complicated to model by hand
  • Real experiments are:
    - too hard, e.g., build large wind tunnels
    - too expensive, e.g., build a throw-away passenger jet
    - too slow, e.g., wait for climate or galactic evolution
    - too dangerous, e.g., weapons, drug design
° Why parallel computers for this? Serial computers are too slow

Some Challenge Computations
° Global Climate Modeling
° Dyna3D -- crash simulation
° Astrophysical modeling
° Earthquake (structures) modeling
° Heart simulation
° Web search
° Transaction processing
° Drug design
° Phylogeny -- History of species
° Nuclear Weapons
° now.cs.berkeley.edu/Millennium

Global Climate Modeling
° Climate is a function of 4 arguments: Climate(longitude, latitude, elevation, time)
° Which returns a vector of 6 values: temperature, pressure, humidity, and wind velocity
° To model this on a computer we
  • discretize the domain using a finite grid, e.g., points 1 kilometer apart
    - roughly .1 TB of data
  • devise an algorithm to predict weather at time t+1 from weather at time t
    - e.g., solving Navier-Stokes equations for fluid flow of gases in the atmosphere
    - say this is roughly 100 Flops per grid point with a timestep of 1 minute
  • to at least match real time (bare minimum)
    - 5*10^11 flops / 60 secs = 8 Gflop/s
  • weather prediction (7 days in 24 hours) => 7x faster => 56 Gflop/s
  • climate prediction (50 years in 30 days) => 50*12 = 600x faster => 4.8 Tflop/s
° Current models use much coarser grids
    - www-fp.mcs.anl.gov/chammp

Heart Simulation
° Many biological structures can be modeled as an elastic structure in an incompressible fluid.
° Using the “immersed boundary method” this involves solving Navier-Stokes equations plus some feature-specific computation on the bodies [Peskin&McQueen]
° 20 years of development in model, used to design artificial valves
° 64^3 was possible on Cray YMP, but 128^3 required for accurate model (would have taken 3 years)
° Done on a Cray C90 -- could use 100x faster and 100x more memory
More computing power => more accurate (usable) model

Parallel Computing in Web Search
° Functional parallelism
  • crawling, indexing, sorting
° Parallelism between queries
  • multiple users
° Finding information amidst junk
° Preprocessing of the web data set to help find information
° General themes of sifting through large, unstructured data sets
  • when to put white socks on sale
  • what kind of junk mail should you receive
  • finding medical problems in a community

Application: Document Retrieval
° Finding useful documents on the Web
° One algorithm, Latent Semantic Indexing (LSI), needs large sparse matrix-vector multiply
  [Figure: sparse matrix times vector x; dimensions # keywords ~100K, # documents ~= 10 M]
  • Matrix is compressed
  • “Random” memory access
  • Scatter/gather vs. cache miss per 2 Flops
° 10 Million documents in typical matrix.
° Web storage increasing 2x every 5 months.
° Similar ideas may apply to image retrieval.

LSI Challenges
° On conventional microprocessor node
  • UltraSparc 166 MHz, 330 Mflops peak, cache miss is 300 ns
  • Matrix-vector multiply does roughly 3 loads and 2 flops, with 1.37 cache misses on average
  • ~4.5 Mflops (2-5 Mflops measured)
  • Memory accesses are irregular
° On T3E
  • Osni Marques at LBNL parallelized for the T3E
° Implementation is also I/O intensive

Transaction Processing - it’s all parallel at some scale
° Parallelism is natural in relational operators
  • select, join, ...
° Many difficult issues
  • data partitioning, locking, threading
[Figure: Throughput (tpmC) vs. Processors, Mar. 15, 1996; series: other, Tandem Himalaya, IBM PowerPC, DEC Alpha, SGI PowerChallenge, HP PA]

Why powerful computers are parallel

How fast can a serial computer be?
° Consider the 1 Tflop sequential machine
  • data must travel some distance, r, to get from memory to CPU
  • to get
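The arithmetic on the Global Climate Modeling slide can be rechecked with a short script. This is only a sketch under the slide's own assumptions (5*10^11 flops per timestep, 100 flops per grid point, a 1-minute timestep); the constant and function names are made up for illustration. Without the slide's rounding of 8.3 down to 8 Gflop/s, the three rates come out to about 8.3 Gflop/s, 58 Gflop/s and 5 Tflop/s, close to the 8, 56 and 4.8 quoted.

```python
# Back-of-the-envelope check of the Global Climate Modeling slide.
# All three constants are taken from the slide; the names are illustrative.
FLOPS_PER_TIMESTEP = 5e11   # work to advance the whole grid by one timestep
FLOPS_PER_POINT = 100       # "roughly 100 Flops per grid point"
TIMESTEP_SECONDS = 60       # "timestep of 1 minute"


def required_rate(speedup: float = 1.0) -> float:
    """Flop/s needed to run `speedup` times faster than real time."""
    return speedup * FLOPS_PER_TIMESTEP / TIMESTEP_SECONDS


# Grid size implied by the two work figures on the slide.
grid_points = FLOPS_PER_TIMESTEP / FLOPS_PER_POINT

real_time = required_rate()       # just keeping up with the atmosphere
weather = required_rate(7)        # 7 days simulated in 24 hours
climate = required_rate(50 * 12)  # 50 years in 30 days, using 12 "months" per year

print(f"implied grid points: {grid_points:.1e}")
print(f"real time:           {real_time / 1e9:.1f} Gflop/s")
print(f"weather prediction:  {weather / 1e9:.1f} Gflop/s")
print(f"climate prediction:  {climate / 1e12:.1f} Tflop/s")
```

The implied grid of 5*10^9 points, with 6 values per point, is also consistent with the slide's "roughly .1 TB of data" if each value takes a few bytes.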

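The Document Retrieval slide describes a compressed sparse matrix multiplied by a vector with "random" memory access. The slides do not say which storage format was used, so the sketch below assumes compressed sparse row (CSR) storage; `csr_matvec` and its argument names are hypothetical. The inner loop performs three loads (value, column index, vector entry) and two flops (multiply, add), which lines up with the per-element cost quoted on the LSI Challenges slide.

```python
from typing import List, Sequence


def csr_matvec(values: Sequence[float], col_index: Sequence[int],
               row_ptr: Sequence[int], x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form (assumed format)."""
    y = []
    for row in range(len(row_ptr) - 1):
        total = 0.0
        # Nonzeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x[col_index[k]] is the gather: an irregular access into x.
            total += values[k] * x[col_index[k]]
        y.append(total)
    return y


# Small 3x4 example:
# [[2, 0, 0, 1],
#  [0, 3, 0, 0],
#  [4, 0, 5, 0]]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
col_index = [0, 3, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0, 4.0]

y = csr_matvec(values, col_index, row_ptr, x)
print(y)  # prints [6.0, 6.0, 19.0]
```

Because `col_index` can point anywhere in `x`, consecutive iterations touch unrelated memory locations, which is why cache misses rather than arithmetic tend to dominate the running time.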

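The ~4.5 Mflops figure on the LSI Challenges slide can be roughly reproduced with a simple latency model. The sketch below assumes that running time is dominated entirely by cache misses, which is a simplification not stated on the slide; the variable names are illustrative. Under that assumption the estimate is about 4.9 Mflop/s, slightly above the slide's figure, as expected for a model that ignores every other cost.

```python
# Latency-bound estimate for sparse matrix-vector multiply on the
# UltraSparc node described on the LSI Challenges slide.
CACHE_MISS_SECONDS = 300e-9   # from the slide: cache miss is 300 ns
MISSES_PER_ELEMENT = 1.37     # from the slide: average misses per nonzero
FLOPS_PER_ELEMENT = 2         # from the slide: one multiply and one add
PEAK_FLOPS = 330e6            # from the slide: 330 Mflops peak

# Assumption: each nonzero costs only its cache-miss time.
seconds_per_element = MISSES_PER_ELEMENT * CACHE_MISS_SECONDS
estimated_flops = FLOPS_PER_ELEMENT / seconds_per_element
fraction_of_peak = estimated_flops / PEAK_FLOPS

print(f"estimated rate:   {estimated_flops / 1e6:.2f} Mflop/s")
print(f"fraction of peak: {fraction_of_peak:.1%}")
```

Even this optimistic estimate is under 2% of the processor's peak rate, which is the point of the slide: irregular memory access, not arithmetic, limits performance.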