
Some Important Observations

In addition to assignment/orchestration, many important properties of a parallel program depend on:
• Application parameters and number of processors
• Working sets and cache/replication size
Should cover realistic regimes of operation

Operating Points Based on Working Sets

Many applications have a hierarchy of working sets:
• A working set may consist of local and/or nonlocal data
  – Not fitting it may dramatically increase local miss rate or even communication
• Some working sets scale with application parameters and p, some don't
• Some operating points are realistic, some aren't
  – operating point = f(cache/replication size, application parameters, p)
[Figure: miss rate or communication volume vs. size of cache or replication store, marking one unrealistic operating point and the realistic operating points]

Evaluating an Idea or Tradeoff

Typically many things change from one generation to the next
Building prototypes for evaluation is too expensive
Build a simulator
Case I: Want to examine in the context of a given machine
• Can assume technological and architectural parameters
  – Simulate with feature turned off and turned on to examine impact
  – Perhaps examine sensitivity to some parameters that were fixed
• Building accurate simulators is complex
  – Contention is difficult to model correctly
  – Processors are becoming increasingly complex themselves
Case II: Want to examine benefit of idea in a more general context
• Now machine parameters are also variable
  – Various sizes, granularities and organizations, performance characteristics

Multiprocessor Simulation

Simulation runs on a uniprocessor (can be parallelized too)
• Simulated processes are interleaved on the processor
Two parts to a simulator:
• Reference generator: plays role of simulated processors
  – And schedules simulated processes based on simulated time
• Simulator of extended memory hierarchy
  – Simulates operations (references, commands) issued by reference generator
Coupling or information flow between the two parts varies
• Trace-driven simulation: from generator to simulator
• Execution-driven simulation: in both directions (more accurate)
Simulator keeps track of simulated time and detailed statistics

Execution-driven Simulation

Memory hierarchy simulator returns simulated time information to reference generator, which is used to schedule simulated processes
[Figure: reference generator (processors P1 ... Pp) coupled to the memory and interconnect simulator (caches $1 ... $p, memories Mem1 ... Memp, network)]

Difficulties in Simulation-based Evaluation

Two major problems, beyond accuracy and reliability:
• Cost of simulation (in time and memory)
  – cannot simulate the problem/machine sizes we care about
  – have to use scaled-down problem and machine sizes
    • how to scale down and stay representative?
• Huge design space
  – application parameters (as before)
  – machine parameters (depending on generality of evaluation context)
    • number of processors
    • cache/replication size
    • associativity
    • granularities of allocation, transfer, coherence
    • communication parameters (latency, bandwidth, occupancies)
  – cost of simulation makes it all the more critical to prune the space

Scaling Down Parameters for Simulation

Want scaled-down machine running scaled-down problem to be representative of full-sized scenario
• No good formulas exist
• But very important, since this is the reality of most evaluation
• Should understand limitations and guidelines to avoid pitfalls
First examine scaling down problem size and no. of processors
Then lower-level machine parameters
Focus on cache-coherent SAS for concreteness

Scaling Down Problem Parameters

Some parameters don't affect parallel performance much, but do affect runtime, and can be scaled down
• Common example is no. of time-steps in many scientific applications
  – need a few to allow settling down, but don't need more
  – may need to omit cold-start when recording time and statistics
• First look for such parameters
• Others can be scaled according to earlier scaling arguments
But many application parameters affect key characteristics
Scaling them down requires scaling down no. of processors too
• Otherwise can obtain highly unrepresentative behavior

Difficulties in Scaling N, p Representatively

Many goals, difficult individually and often impossible to reconcile
Want to preserve many aspects of full-scale scenario
• Distribution of time in different phases
• Key behavioral characteristics
• Scaling relationships among application parameters
• Contention and communication parameters
Can't really hope for full representativeness, but can
• Cover range of realistic operating points
• Avoid unrealistic scenarios
• Gain insights and estimates of performance

Scaling Down Other Machine Parameters

Often necessary when scaling down problem size
• E.g. may not represent the case of a working set not fitting if the cache is not scaled
More difficult to do with confidence
• Cache/replication size: guide by scaling of working sets, not data set
• Associativity and granularities: more difficult
  – should try to keep unchanged since hard to predict effects, but ...
  – greater impact with scaled-down application and system parameters
  – difficult to find good solutions for both communication and local access
Solutions and confidence levels are application-specific
• Require detailed understanding of application-system interactions
Should try to use as realistic sizes as possible
• Use guidelines to cover key operating points, and extrapolate with caution

Dealing with the Parameter Space

Steps in an evaluation study
• Determine which parameters are relevant to evaluation
• Identify values of interest for them
  – context of evaluation may be restricted
• Analyze effects where possible
• Look for knees and flat regions to prune where possible
• Understand growth rate of characteristic with parameter
• Perform sensitivity analysis where necessary

An Example Evaluation

Goal of study: To determine the value of adding a block transfer facility to a cache-coherent SAS machine with distributed memory
Workloads: Choose at least some that have communication that is amenable to block transfer (e.g. grid solver)
Choosing parameters is more difficult. 3 goals:
• Avoid unrealistic execution characteristics
• Obtain good coverage of realistic characteristics
• Prune the parameter space based on
  – goals of study
  – restrictions imposed by technology or assumptions
  – understanding of parameter interactions
Let's use equation solver as example

Choosing Parameters

Problem size and number of processors
• Use inherent characteristics considerations as
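
The idea of operating points based on working sets, and the advice to look for knees and flat regions when pruning the parameter space, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the slides: the synthetic reference trace, the two working-set sizes (8 and 64 blocks), the fully associative LRU cache model, and the helper names `synthetic_trace`, `lru_miss_rate`, and `find_knees`.

```python
from collections import OrderedDict


def synthetic_trace(small=8, large=64, sweeps=20):
    """Hypothetical trace with a two-level working-set hierarchy.

    Each sweep touches every block of a large set once, and after each
    such touch re-reads the whole small set.
    """
    trace = []
    for _ in range(sweeps):
        for b in range(large):
            trace.append(("large", b))
            trace.extend(("small", s) for s in range(small))
    return trace


def lru_miss_rate(trace, capacity):
    """Miss rate of a fully associative LRU cache holding `capacity` blocks."""
    cache = OrderedDict()
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return misses / len(trace)


def find_knees(sizes, rates, drop=0.5):
    """Sizes at which the miss rate falls below `drop` times the previous value."""
    return [cur for prev, cur in zip(sizes, sizes[1:])
            if rates[cur] < rates[prev] * drop]


trace = synthetic_trace()
sizes = [4, 8, 16, 32, 64, 128]
rates = {c: lru_miss_rate(trace, c) for c in sizes}
knees = find_knees(sizes, rates)

for c in sizes:
    print(f"cache = {c:4d} blocks  miss rate = {rates[c]:.4f}")
print("knees at cache sizes:", knees)
```

With these assumed sizes the curve has one knee between 8 and 16 blocks (the small working set starts to fit) and another between 64 and 128 blocks (the large one fits), with a flat region in between. Under this toy model, a study would need only one cache size from each flat region rather than every size, which is the kind of pruning the slides recommend.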



TAMU ECEN 676 - ch4_3
