UMD CMSC 714 - The Sisal Model of Functional Programming and its Implementation - D652890

Home> Schools> University of Maryland, College Park> Computer Science (CMSC) > CMSC 714> The Sisal Model of Functional Programming and its Implementation

DOC PREVIEW

UMD CMSC 714 - The Sisal Model of Functional Programming and its Implementation

School name University of Maryland, College Park

Course Cmsc 714- High Performance Computing Systems

Pages 12

This preview shows page 1-2-3-4 out of 12 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 12 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 12 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 12 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 12 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 12 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

1The Sisal Model of Functional Programming and its ImplementationJean-Luc Gauditot, et al.Presented by Nick RutarSisal FAQ What does Sisal stand for? Stream and Iteration in a Single Assignment Language What kind of language is it? Sisal is a functional programming language Where did it come from? Developed by Lawrence Livermore National Laboratory Colorado State University University of Manchester Digital Equipment Corporation Why are we talking about it in this class? Sisal was developed with a goal to “design a general-purpose implicitly parallel language for a wide range of parallel platforms”2Sisal overview User-defined names are “identifiers”, not variables Refer to values, not memory locations All values produced and used are dynamic Identifiers bound only for duration of execution All expressions return values based on Values bound to formal arguments Constituent identifiers No side effects Allows righer analyses of program then for imperative languagesSample Codetype OneDim = array [ real ];type TwoDim = array [ oneDim ];function generate( n: integerreturns TwoDim, TwoDim )for i in 1, n cross j in 1, nt1 := real(i) * real(j);t2 := real(i) / real(j);returns array of t1array of t2end forend function % generate3Parallelism in sample code All Sisal expressions evaluate to value sets Function evaluates to two arrays for expression indicates potential parallelism to the compiler Body of loop instantiated as many times as values in index range Each body instantiation is independent No data dependencies Independent loop bodies may be done in parallelOSC: Optimizing Sisal Compiler Compilation proceeds through various stages Translates Sisal source to data flow graph language, IF1 Translate IF1 to annotated memory management graph language, IF2 C code is produced from IF2 Target machine compiler invoked Various options for compilation & execution Optimization  Output (Syntax error messages) Available for most architectures4Sisal compiler issues Update in place and copy elimination Build in Place Reference Counting Optimization Vectorization Loop Fusion Double Buffering Pointer Swap InversionUpdate in place/copy eliminationletA: = array[1: 1,2,3];B: = A[2 : 999];inA, Bend letReturn [1:1,2,3],[1:1,999,3]Problem: Multiple copies of the arraySwapping two elements even worseAlternate replacementC := array[1: 1,2,3,4,5]T0 := C[3]; T1 := C[4];D := C[3: T1];E := D[4: T0];Use containers instead of throwing them away5Build in PlaceL := F(0,A[1],A[2]);R := F(A[N-1],A[N],0);III:= for i in 2, n-1 …LIII = array_addl(III,L);LIIIR = array_addh(LIII,R);…..L….L…RLIIIRCreate a BufferRequires allocation and unneeded data movmentReference Counting & Vectorization Reference Counting Reference counts show when A value can be updated in place Value’s memory can be recycled Can be expensive, especially on parallel machines OSC elminates most reference counting Lifetime analysis Operation merging Vectorization Sisal’s underlying dataflow representation make loops easy to move This means OSC can vectorize extremely well6Loop FusionT0 := for i in 1, n returnsarray of A[i]*2end forT1 := for i in 1, n returnsarray of B[i]*3end forX := for i in 1, n returnsarray of T0[i] + T1[i]end forX := for i in 1, n returnsarray of A[i]*2+B[i]*3end forDouble Buffering Pointer Swapfor initialA := start_values()while not done(A) repeatA := time_step(old A)Allocates new buffer eachtime stepAllocate initial bufferoutside loop and pointerswaps initial and secondarybuffers7InversionX:= for i in 1,nv:= if i = 1 then %Leftelseif i = n then %Rightelse %innerend ifIf-tests introduce large overhead and inhibits parallelismX0:= for i in 1,max(0,n) %leftX1:= for i in 1,max(0,n) %rightX2:= for i in 2,n-1 %innerX:= X0 || X2 || X1D-OSC Extension of OSC for distributed-memory machines Code generation produces C plus MPI calls Master process dvides parallel loops into slices Slices executed in parallel by designated processes D-OSC implemented in four phases8D-OSC Phases Phase 1: Base No analysis, naïve code generation Arrays and loops distributed among processors Unique array identifiers explicity created with table on each processor Phase 2: Rectangular Arrays Higher dimensional arrays of arrays replaced by rectangular arrays  Phase 3: Block Messages Algorithm to obtain block messages instead of individual array elements Phase 4: Multiple Alignment Reduces number of messages by creating overlapping array sectionsD-OSC Results Livermore loops on network of 4 workstations 1st number - messages passed, 2nd num - volume9Multithreaded Execution Sisal suited as source for multithreaded code Two models considered Blocking Thread Model Thread may be suspended and resumed later Architecture must support context switching Rely on Frame model Storage statement associated with each invocation of a code-blockNon-blocking Thread Model Thread starts and runs until termination Relies on Framelet model Fixed size unit of storage associated with each thread instance Data values shared between threads are replicatedBlocking & Non-Blocking Example10Multithreaded Code Generation Converts programs into two intermediate forms MIDC-2 (non-blocking) MIDC-3 (blocking) Both derived from MIDC (Machine Independent Dataflow Code) Guided by Minimize synchronization overhead Maximize intra-thread locality Assure deadlock-free threads Preserve functional and loop parallelism in programsMultithread Code Gen Phases Phase 1 Same for blocking & non-blocking Compile Sisal program to IF2 using OSC Phase 2 Handle remote memory accesses as split-phase or single-phase11Effect of Network &

View Full Document