Berkeley COMPSCI C267 - Building a Reliable Software Infrastructure for Scientific Computing

Building a Reliable Software Infrastructure for Scientific Computing
Osni Marques
Lawrence Berkeley National Laboratory (LBNL)
[email protected]
UC Berkeley - CS267, 03/31/2004

Outline
• Keeping pace with the software and the hardware
• Hardware evolution
• Performance tuning
• Software selection
• What is missing?
• The DOE ACTS Collection Project
• Goals
• Related activities
• Current features
• Lessons learned

High Performance Computers (Sustainable Performance)
• ~20 years ago → 1x10^6 Floating Point Ops/sec (Mflop/s)
• Scalar based
• ~10 years ago → 1x10^9 Floating Point Ops/sec (Gflop/s)
• Vector & shared memory computing, bandwidth aware
• Block partitioned, latency tolerant
• Today → 1x10^12 Floating Point Ops/sec (Tflop/s)
• Highly parallel, distributed processing, message passing, network based
• Data decomposition, communication/computation
• ~10 years away → 1x10^15 Floating Point Ops/sec (Pflop/s)
• Many more levels of memory hierarchy, combination of grids & HPC
• More adaptive, latency and bandwidth aware, fault tolerant, extended precision, attention to SMP nodes

Architectures
[Chart: architecture share over time, Jun-93 to Jun-01 — single processor, SMP, MPP, SIMD, constellation, cluster (NOW); systems include Y-MP C90, Sun HPC, Paragon, CM5, T3D, T3E, SP2, cluster of Sun HPC, ASCI Red, CM2, VP500, SX3]

Automatic Tuning
• For each kernel:
1. Identify and generate a space of algorithms
2. Search for the fastest one, by running them
• What is a space of algorithms? Depending on the kernel and input, it may vary in:
• instruction mix and order
• memory access patterns
• data structures
• mathematical formulation
• When do we search?
• Once per kernel and architecture
• At compile time
• At run time
• All of the above
• Projects:
• PHiPAC: www.icsi.berkeley.edu/~bilmes/phipac
• ATLAS: www.netlib.org/atlas
• XBLAS: www.nersc.gov/~xiaoye/XBLAS
• Sparsity: www.cs.berkeley.edu/~yelick/sparsity
• FFTs and signal processing:
• FFTW: www.fftw.org (won the 1999 Wilkinson Prize for Numerical Software)
• SPIRAL: www.ece.cmu.edu/~spiral (extensions to other transforms, DSPs)
• UHFFT (extensions to higher dimensions, parallelism)

Tuning pays off!
[Chart: PHiPAC matrix multiply C = A*B performance]

What About Software Selection?
Example: Ax = b
• Use a direct solver (A = LU) if:
• time and storage space are acceptable
• iterative methods don't converge
• there are many b's for the same A
• Criteria for choosing a direct solver:
• matrix type: symmetric positive definite (SPD), symmetric, symmetric-pattern, unsymmetric
• row/column ordering schemes available: MMD, AMD, ND, graph partitioning
• hardware
• Otherwise, build a preconditioning matrix K such that Kx = b is much easier to solve than Ax = b and K is somehow "close" to A (incomplete LU decompositions, sparse approximate inverses, polynomial preconditioners, preconditioning by blocks or domains, element-by-element, etc.). See Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods.

Bugs…
• On February 25, 1991, during the Gulf War, an American Patriot missile battery in Dhahran, Saudi Arabia, failed to track and intercept an incoming Iraqi Scud missile. The Scud struck an American Army barracks, killing 28 soldiers and injuring around 100 other people. The problem was an inaccurate calculation of the time since boot due to computer arithmetic errors. (http://wwwzenger.informatik.tu-muenchen.de/persons/huckle/bugse.html)
• On June 4, 1996, an Ariane 5 rocket launched by the European Space Agency exploded just forty seconds after lift-off from Kourou, French Guiana. The rocket was on its first voyage, after a decade of development costing $7 billion. The problem was a software error in the inertial reference system: a 64-bit floating-point number relating to the horizontal velocity of the rocket with respect to the platform was converted to a 16-bit signed integer.
• On August 23, 1991, the first concrete base structure for the Sleipner A platform sprang a leak and sank during a controlled ballasting operation in preparation for deck mating in Gandsfjorden outside Stavanger, Norway. The post-accident investigation traced the error to an inaccurate finite element approximation of the linear elastic model of the tricell (using the popular finite element program NASTRAN). The shear stresses were underestimated by 47%, leading to an insufficient design; in particular, certain concrete walls were not thick enough.

Challenges in the Development of Scientific Codes
• Productivity
• Time to the first solution (prototype)
• Time to solution (production)
• Other requirements
• Complexity
• Increasingly sophisticated models
• Model coupling
• Interdisciplinarity
• Performance
• Increasingly complex algorithms
• Increasingly complex architectures
• Increasingly demanding applications
• Libraries written in different languages.
• Discussions about standardizing interfaces are often sidetracked into implementation issues.
• Difficulties managing multiple libraries developed by third parties.
• Need to use more than one language in one application.
• The code is long-lived and different pieces evolve at different rates.
• Swapping competing implementations of the same idea and
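The Ariane 5 failure described on the "Bugs…" slide comes down to a narrowing conversion whose input exceeds the target type's range. A minimal sketch of the effect, using Python's ctypes rather than the original Ada, and a made-up velocity value chosen only to exceed the 16-bit range:

```python
import ctypes

# Hypothetical horizontal-velocity-like value that exceeds the
# 16-bit signed range of [-32768, 32767].
velocity = 65535.0

# Force the 64-bit float into a 16-bit signed integer; ctypes
# truncates modulo 2**16, so the value silently wraps around.
narrowed = ctypes.c_int16(int(velocity)).value

print(narrowed)  # -1: silent wraparound
```

In the actual Ada code the conversion raised an unhandled operand-error exception rather than wrapping silently, but the root cause is the same: a 16-bit target type cannot represent the 64-bit value.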

