UMD CMSC 714 - An Industry-Standard API for Shared-Memory Programming


FEATURE ARTICLE
1070-9924/98/$10.00 © 1998 IEEE, IEEE COMPUTATIONAL SCIENCE & ENGINEERING

Application developers have long recognized that scalable hardware and software are necessary for parallel scalability in application performance. Both have existed for some time in their lowest common denominator form, and scalable hardware, built as physically distributed memories connected through a scalable interconnection network (such as a multistage interconnect, k-ary n-cube, or fat tree), has been commercially available since the 1980s. When developers build such systems without any provision for cache coherence, the systems are essentially "zeroth order" scalable architectures: they provide only a scalable interconnection network, and the burden of scalability falls on the software. As a result, scalable software for such systems exists, at some level, only in a message-passing model. Message passing is the native model for these architectures, and developers can only build higher-level models on top of it.

Unfortunately, many in the high-performance computing world implicitly assume that the only way to achieve scalability in parallel software is with a message-passing programming model. This is not necessarily true. A class of multiprocessor architectures is now emerging that offers scalable hardware support for cache coherence. These are generally called scalable shared-memory multiprocessor (SSMP) architectures.[1] For SSMP systems, the native programming model is shared memory, and message passing is built on top of the shared-memory model.
On such systems, software scalability is straightforward to achieve with a shared-memory programming model. In a shared-memory system, every processor has direct access to the memory of every other processor, meaning it can directly load or store any shared address. The programmer also can declare certain pieces of memory as private to the processor, which provides a simple yet powerful model for expressing and managing parallelism in an application.

Despite its simplicity and scalability, many parallel applications developers have resisted adopting a shared-memory programming model for one reason: portability. Shared-memory system vendors have created their own proprietary extensions to Fortran or C for parallel-software development. However, the absence of portability has forced many developers to adopt a portable message-passing model such as the Message Passing Interface (MPI) or Parallel Virtual Machine (PVM). This article presents a portable alternative to message passing: OpenMP.

OpenMP: An Industry-Standard API for Shared-Memory Programming
LEONARDO DAGUM AND RAMESH MENON, SILICON GRAPHICS INC.

OpenMP, the portable alternative to message passing, offers a powerful new way to achieve scalability in software. This article compares OpenMP to existing parallel-programming models.

OpenMP was designed to exploit certain characteristics of shared-memory architectures. The ability to directly access memory throughout the system (with minimum latency and no explicit address mapping), combined with fast shared-memory locks, makes shared-memory architectures best suited for supporting OpenMP.

Why a new standard?

The closest approximation to a standard shared-memory programming model is the now-dormant ANSI X3H5 standards effort.[2] X3H5 was never formally adopted as a standard, largely because interest waned as distributed-memory message-passing systems (MPPs) came into vogue.
However, even though hardware vendors support it to varying degrees, X3H5 has limitations that make it unsuitable for anything other than loop-level parallelism. Consequently, applications adopting this model are often limited in their parallel scalability.

MPI has effectively standardized the message-passing programming model. It is a portable, widely available, and accepted standard for writing message-passing programs. Unfortunately, message passing is generally a difficult way to program. It requires that the program's data structures be explicitly partitioned, so the entire application must be parallelized to work with the partitioned data structures. There is no incremental path to parallelize an application. Furthermore, modern multiprocessor architectures increasingly provide hardware support for cache coherence; therefore, message passing is becoming unnecessary and overly restrictive for these systems.

Pthreads is an accepted standard for shared memory in low-end systems. However, it is not targeted at the technical, HPC space. There is little Fortran support for pthreads, and it is not a scalable approach. Even for C applications, the pthreads model is awkward, because it is lower-level than necessary for most scientific applications and is targeted more at providing task parallelism, not data parallelism. Also, portability to unsupported platforms requires a stub library or equivalent workaround.

Researchers have defined many new languages for parallel computing, but these have not found mainstream acceptance.
High-Performance Fortran (HPF) is the most popular multiprocessing derivative of Fortran, but it is mostly geared toward distributed-memory systems.

Independent software developers of scientific applications, as well as government laboratories, have a large volume of Fortran 77 code that needs to be parallelized in a portable fashion. The rapid and widespread acceptance of shared-memory multiprocessor architectures, from the desktop to "glass houses," has created a pressing demand for a portable way to program these systems. Developers need to parallelize existing code without completely rewriting it, but this is not possible with most existing parallel-language standards. Only OpenMP and X3H5 allow incremental parallelization of existing code, and of the two only OpenMP is scalable (see Table 1). OpenMP is targeted at developers who need to quickly parallelize existing scientific code, but it remains flexible enough to support a much broader application set. OpenMP provides an incremental path for parallel conversion of any existing software. It also provides scalability and performance for a complete rewrite or entirely new development.

What is OpenMP?

At its most elemental level, OpenMP is a set of compiler directives and callable runtime library routines that extend Fortran (and separately, C and C++) to express shared-memory parallelism. It leaves the base language unspecified, and vendors can implement OpenMP in any Fortran compiler. Naturally, to support pointers and allocatables,

