Berkeley COMPSCI C267 - Unified Parallel C (UPC)


3/1/2004 CS267 Lecture 20

CS 267
Unified Parallel C (UPC)
Kathy Yelick
http://www.cs.berkeley.edu/~yelick/cs267
Slides adapted from some by Tarek El-Ghazawi (GWU)

UPC Outline
1. Background and Philosophy
2. UPC Execution Model
3. UPC Memory Model
4. Data and Pointers
5. Dynamic Memory Management
6. Programming Examples
8. Synchronization
9. Performance Tuning and Early Results
10. Concluding Remarks

Context
• Most parallel programs are written using either:
  • Message passing with a SPMD model
    • Usually for scientific applications with C++/Fortran
    • Scales easily
  • Shared memory with threads in OpenMP, Threads+C/C++/F or Java
    • Usually for non-scientific applications
    • Easier to program, but less scalable performance
• Global Address Space (GAS) languages take the best of both
  • global address space like threads (programmability)
  • SPMD parallelism like MPI (performance)
  • local/global distinction, i.e., layout matters (performance)

Partitioned Global Address Space Languages
• Explicitly-parallel programming model with SPMD parallelism
  • Fixed at program start-up, typically 1 thread per processor
• Global address space model of memory
  • Allows programmer to directly represent distributed data structures
• Address space is logically partitioned
  • Local vs. remote memory (two-level hierarchy)
• Programmer control over performance-critical decisions
  • Data layout and communication
• Performance transparency and tunability are goals
  • Initial implementation can use fine-grained shared memory
• Base languages differ: UPC (C), CAF (Fortran), Titanium (Java)

Global Address Space Eases Programming
• The languages share the global address space abstraction
  • Shared memory is partitioned by processors
  • Remote memory may stay remote: no automatic caching implied
  • One-sided communication through reads/writes of shared variables
  • Both individual and bulk memory copies
• Differ on details
  • Some models have a separate private memory area
  • Distributed array generality and how they are constructed
[Figure: global address space diagram. Thread 0, Thread 1, ..., Thread n each hold a private "ptr:"; the shared region holds X[0], X[1], ..., X[P].]

One-Sided Communication May Improve Performance
[Figure: bar chart, y-axis in usec (0 to 25). Platforms: T3E/Shm, T3E/E-Reg, T3E/MPI, IBM/LAPI, IBM/MPI, Quadrics/Shm, Quadrics/MPI, Myrinet/GM, Myrinet/MPI, GigE/VIPL, GigE/MPI. Series: Added Latency, Send Overhead (Alone), Send & Rec Overhead, Rec Overhead (Alone).]
• Potential performance advantage for fine-grained, one-sided programs
• Potential productivity advantage for irregular applications

Current Implementations
• A successful language/library must run everywhere
• UPC
  • Commercial compilers available on Cray, SGI, HP machines
  • Open source compiler from LBNL/UCB (and another from MTU)
• CAF
  • Commercial compiler available on Cray machines
  • Open source compiler available from Rice
• Titanium (Friday)
  • Open source compiler from UCB runs on most machines
• Common tools
  • Open64 open source research compiler infrastructure
  • ARMCI, GASNet for distributed memory implementations
  • Pthreads, System V shared memory

UPC Overview and Design Philosophy
• Unified Parallel C (UPC) is:
  • An explicit parallel extension of ANSI C
  • A partitioned global address space language
  • Sometimes called a GAS language
• Similar to the C language philosophy
  • Programmers are clever and careful, and may need to get close to hardware
    • to get performance, but
    • can get in trouble
  • Concise and efficient syntax
• Common and familiar syntax and semantics for parallel C with simple extensions to ANSI C
• Based on ideas in Split-C, AC, and PCP

UPC Execution Model
• A number of threads working independently in a SPMD fashion
  • Number of threads specified at compile-time or run-time; available as program variable THREADS
  • MYTHREAD specifies thread index (0..THREADS-1)
  • upc_barrier is a global synchronization: all wait
  • There is a form of parallel loop that we will see later
• There are two compilation modes
  • Static threads mode:
    • THREADS is specified at compile time by the user
    • The program may use THREADS as a compile-time constant
  • Dynamic threads mode:
    • Compiled code may be run with varying numbers of threads

Hello World in UPC
• Any legal C program is also a legal UPC program
• If you compile and run it as UPC with P threads, it will run P copies of the program.
• Using this fact, plus the identifiers from the previous slides, we can write a parallel hello world:

    #include <upc.h> /* needed for UPC extensions */
    #include <stdio.h>

    int main() {
        printf("Thread %d of %d: hello UPC world\n", MYTHREAD, THREADS);
        return 0;
    }

Example: Monte Carlo Pi Calculation
• Estimate pi by throwing darts at a unit square
• Calculate the percentage that fall in the unit circle
  • Area of square = r^2 = 1
  • Area of circle quadrant = 1/4 * π r^2 = π/4
• Randomly throw darts at x,y positions
• If x^2 + y^2 < 1, then the point is inside the circle
• Compute ratio:
  • # points inside / # points total
  • π = 4*ratio
[Figure: quarter circle of radius r = 1 inscribed in the unit square.]

Pi in UPC
• Independent estimates of pi:

    int main(int argc, char **argv) {
        /* Each thread gets its own copy of these variables */
        int i, hits = 0, trials = 0;
        double pi;

        /* Each thread can use input arguments */
        if (argc != 2) trials = 1000000;
        else trials = atoi(argv[1]);

        /* Initialize the random number generator, seeded per thread */
        srand(MYTHREAD*17);

        /* Each thread calls "hit" separately */
        for (i = 0; i < trials; i++) hits += hit();
        pi = 4.0*hits/trials;
        printf("PI estimated to %f.", pi);
        return 0;
    }

Helper Code for Pi in UPC (hidden slide)
• Required includes:

    #include <stdio.h>
    #include <stdlib.h> /* rand, srand, atoi, RAND_MAX */
    #include <math.h>
    #include <upc.h>

• Function to throw a dart and calculate where it hits:

    int hit() {
        /* rand() returns a value in [0, RAND_MAX]; scale it to [0, 1] */
        double x = (double) rand() / RAND_MAX;
        double y = (double) rand() / RAND_MAX;
        if ((x*x + y*y) <= 1.0) return(1);
        else return(0);
    }

UPC Memory Model
• Scalar Variables
• Distributed Arrays
• Pointers to shared data

Private vs. Shared Variables in UPC
• Normal C variables and objects are allocated in the private memory space for each thread.
• Shared variables are allocated only once, with thread 0

    shared int ours;
    int mine;

• Simple shared variables of this kind may not occur within a function
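The Monte Carlo pi example from the earlier slides can also be tried without a UPC compiler. The sketch below is plain C, not UPC: it emulates the SPMD structure serially by looping over simulated thread ids, seeding each one as the slides do (thread id times 17) so that every simulated thread produces its own independent estimate. The names estimate_pi and average_pi and the thread_id parameter are illustrative and do not come from the slides; averaging the per-thread estimates is an added assumption about how they might be combined, not something the slides show.

```c
#include <assert.h>
#include <stdlib.h>

/* Throw one dart at the unit square; return 1 if it lands inside the
   quarter circle of radius 1, otherwise 0. */
static int hit(void) {
    double x = (double) rand() / RAND_MAX;
    double y = (double) rand() / RAND_MAX;
    return (x * x + y * y <= 1.0) ? 1 : 0;
}

/* One simulated thread's independent estimate of pi.
   thread_id is a hypothetical stand-in for UPC's MYTHREAD. */
double estimate_pi(int thread_id, int trials) {
    int i, hits = 0;
    assert(trials > 0);
    srand((unsigned) thread_id * 17u);   /* per-thread seed, as in the slides */
    for (i = 0; i < trials; i++) hits += hit();
    return 4.0 * hits / trials;
}

/* Serial stand-in for running nthreads copies of the program:
   average the independent per-thread estimates (an assumption). */
double average_pi(int nthreads, int trials) {
    int t;
    double sum = 0.0;
    assert(nthreads > 0);
    for (t = 0; t < nthreads; t++) sum += estimate_pi(t, trials);
    return sum / nthreads;
}
```

Calling average_pi(4, 1000000) from a main function would mimic a four-thread run. In real UPC each thread would instead run the whole program concurrently with its own private copies of hits and trials.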

