Berkeley COMPSCI 258 - Evaluation of Release Consistent Software Distributed Shared Memory



Evaluation of Release Consistent Software Distributed Shared Memory on Emerging Network Technology

Sandhya Dwarkadas, Pete Keleher, Alan L. Cox, and Willy Zwaenepoel
Department of Computer Science
Rice University *

* This work was supported in part by NSF Grants CCR-9116343 and CCR-9211004, Texas ATP Grant No. 0036404013 and by a NASA Graduate Fellowship.

Abstract

We evaluate the effect of processor speed, network characteristics, and software overhead on the performance of release-consistent software distributed shared memory. We examine five different protocols for implementing release consistency: eager update, eager invalidate, lazy update, lazy invalidate, and a new protocol called lazy hybrid. This lazy hybrid protocol combines the benefits of both lazy update and lazy invalidate.

Our simulations indicate that with the processors and networks that are becoming available, coarse-grained applications such as Jacobi and TSP perform well, more or less independent of the protocol used. Medium-grained applications, such as Water, can achieve good performance, but the choice of protocol is critical. For sixteen processors, the best protocol, lazy hybrid, performed more than three times better than the worst, the eager update. Fine-grained applications such as Cholesky achieve little speedup regardless of the protocol used because of the frequency of synchronization operations and the high latency involved.

While the use of relaxed memory models, lazy implementations, and multiple-writer protocols has reduced the impact of false sharing, synchronization latency remains a serious problem for software distributed shared memory systems. These results suggest that future work on software DSMs should concentrate on reducing the amount of synchronization or its effect.

1 Introduction

Although several models and algorithms for software distributed shared memory (DSM) have been published, performance reports have been relatively rare. The few performance results that have been published consist of measurements of a particular implementation in a particular hardware and software environment [3, 5, 6, 13]. Since the cost of communication is very important to the performance of a DSM, these results are highly sensitive to the implementation of the communication software. Furthermore, the hardware environments of many of these implementations are by now obsolete. Much faster processors are commonplace, and much faster networks are becoming available.

We are focusing on DSMs that support release consistency [9], i.e., where memory is guaranteed to be consistent only following certain synchronization operations. The goals of this paper are two-fold: (1) to gain an understanding of how the performance of release consistent software DSM depends on processor speed, network characteristics, and software overhead, and (2) to compare the performance of several protocols for supporting release consistency in a software DSM.

The evaluation is done by execution-driven simulation [7]. The application programs we use have been written for (hardware) shared memory multiprocessors. Our results may therefore be viewed as an indication of the possibility of "porting" shared memory programs to software DSMs, but it should be recognized that better results may be obtained by tuning the programs to a DSM environment. The application programs are Jacobi, Traveling Salesman Problem (TSP), and Water and Cholesky from the SPLASH benchmark suite [14]. Jacobi and TSP exhibit coarse-grained parallelism, with little synchronization relative to the amount of computation, whereas Water may be characterized as medium-grained, and Cholesky as fine-grained.

We find that, with current processors, the bandwidth of the 10-megabit Ethernet becomes a bottleneck, limiting the speedups even for a coarse-grained application such as Jacobi to about 5 on 16 processors.

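As a rough illustration of why a shared 10-megabit Ethernet can saturate, the sketch below computes the raw wire time for one page-sized transfer and the resulting upper bound on transfers per second. The 4096-byte page size, the function names, and the omission of headers, protocol overhead, latency, and contention are all assumptions made for illustration; none of these values is taken from the paper's simulation parameters.

```python
def wire_time_seconds(payload_bytes, bandwidth_bits_per_s):
    """Time to push payload_bytes onto a link, ignoring headers and latency."""
    return payload_bytes * 8 / bandwidth_bits_per_s


def max_transfers_per_second(payload_bytes, bandwidth_bits_per_s):
    """Upper bound on back-to-back transfers one shared medium can carry."""
    return 1.0 / wire_time_seconds(payload_bytes, bandwidth_bits_per_s)


PAGE_BYTES = 4096      # assumed page size, not taken from the paper
ETHERNET_BPS = 10e6    # the 10-megabit Ethernet named in the text
PROCESSORS = 16        # processor count used in the text

if __name__ == "__main__":
    per_page = wire_time_seconds(PAGE_BYTES, ETHERNET_BPS)
    total_rate = max_transfers_per_second(PAGE_BYTES, ETHERNET_BPS)
    print(f"wire time per page: {per_page * 1e3:.2f} ms")
    print(f"pages per second, whole network: {total_rate:.0f}")
    print(f"pages per second, per processor: {total_rate / PROCESSORS:.0f}")
```

Under these assumptions a single page occupies the wire for about 3.3 ms, so the shared medium carries roughly 305 page transfers per second in total, or about 19 per processor when 16 processors share it. Real traffic would fare worse once headers and contention are included.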
With a 100-megabit point-to-point network, representative of the ATM LANs now appearing on the market, we get good speedups even for small sizes of coarse-grained problems such as Jacobi and TSP, moderate speedups for Water, and very little speedup for Cholesky. Regardless of the considerable bandwidth available on these networks, Cholesky's performance is constrained by the very high number of synchronization operations.

0884-7495/93 $3.00 © 1993 IEEE

Among the protocols for implementing software release consistency, we distinguish between eager and lazy protocols. Eager protocols push modifications to all cachers at synchronization variable releases [5]. In contrast, lazy protocols [11] pull the modifications at synchronization variable acquires, and communicate only with the acquirer. Both eager and lazy release consistency can be implemented using either invalidate or update protocols. We present a new lazy hybrid protocol that combines the benefits of update and invalidate: few access misses, low data and message counts, and low lock acquisition latency.

Our simulations indicate that the lazy algorithm and the hybrid protocol significantly improve the performance of medium-grained programs, those on the boundary of what can be supported efficiently by a software DSM. Communication in coarse-grained programs is sufficiently rare that the choice of protocols becomes less important. The eager algorithms perform slightly better for TSP because the branch-and-bound algorithm benefits from the early updates in the eager protocols (see Section 6.2). For the fine-grained programs, lazy release consistency and the hybrid protocol reduce the number of messages and the amount of data drastically, but the communication requirements are still beyond what can be supported efficiently on a software DSM.

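The distinction drawn above between eager protocols, which push modifications to all cachers at a release, and lazy protocols, which pull them at an acquire and involve only the acquirer, can be sketched with a toy message counter. This is an illustrative model under strong assumptions, not the protocols simulated in the paper: it counts one message per destination that receives modifications, ignores acknowledgements, lock requests, and data sizes, and all function names are hypothetical.

```python
def eager_messages(cachers, releaser):
    """Eager: at a release, push modifications to every other cacher."""
    return len(set(cachers) - {releaser})


def lazy_messages(acquirer, releaser):
    """Lazy: at an acquire, only the acquirer pulls from the last releaser."""
    return 0 if acquirer == releaser else 1


def count_lock_handoffs(handoffs, cachers):
    """Count modification-carrying messages for (releaser, acquirer) pairs.

    Returns a tuple (eager_total, lazy_total).
    """
    eager = sum(eager_messages(cachers, releaser) for releaser, _ in handoffs)
    lazy = sum(lazy_messages(acquirer, releaser) for releaser, acquirer in handoffs)
    return eager, lazy


if __name__ == "__main__":
    cachers = set(range(16))              # assume all 16 processors cache the data
    handoffs = [(0, 1), (1, 2), (2, 3)]   # the lock passes 0 -> 1 -> 2 -> 3
    eager_total, lazy_total = count_lock_handoffs(handoffs, cachers)
    print(f"eager pushes: {eager_total}, lazy pulls: {lazy_total}")
```

In this toy setting, three lock handoffs among 16 cachers cost 45 pushes under the eager model and 3 pulls under the lazy model. The counts only illustrate the direction of the difference and say nothing about the measured results reported in the paper.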
For these kinds of applications, techniques such as multithreading and code restructuring may prove useful.

The outline of the rest of this paper is as follows. Section 2 briefly reviews release consistency, and the eager and lazy implementation algorithms. Section 3 describes the hybrid protocol. Section 4 details the implementation of the protocols we simulated. Section 5 discusses our simulation methodology, and Section 6 presents the simulation results. We briefly survey related work in Section 7 and conclude in Section 8.

2 Release Consistency

For completeness, we reiterate in this section the main concepts behind release consistency (RC) [9], eager release consistency (ERC) [5], and lazy release consistency (LRC) [11].

RC [9] is a form of relaxed memory consistency that allows the effects of shared memory

