Berkeley COMPSCI C267 - Performance Analysis Tools - D2662495


School name University of California, Berkeley

Course Compsci C267- Applications of Parallel Computers

Pages 103


Slide titles: Performance Analysis Tools; Outline; Motivation; Slide 4; Concepts and Definitions; Instrumentation; Instrumentation – Examples (1); Instrumentation – Examples (2); Instrumentation – Examples (3); Instrumentation – Examples (4); Instrumentation – Examples (5); Measurement; Measurement: Profiling; Profiling: Inclusive vs. Exclusive; Tracing Example: Instrumentation, Monitor, Trace; Tracing: Timeline Visualization; Measurement: Tracing; Performance Data Analysis; Trace File Visualization; Slide 20; 3D performance data exploration; Automated Performance Analysis; Automation Example; Slide 24; Slide 25; What is PAPI; PAPI Hardware Events; Where is PAPI; PAPI Counter Interfaces; PAPI High-level Interface; PAPI High-level Example; PAPI Low-level Interface; Many tools in the HPC space are built on top of PAPI; Component PAPI (PAPI-C); Component PAPI Design; Slide 37; OpenMP; OpenMP Performance Analysis with ompP; Usage example; ompP’s Profiling Report; Profiling Data; Flat Region Profile (2); Callgraph; Callgraph (2); Overhead Analysis (1); Overhead Analysis (2); ompP’s Overhead Analysis Report; OpenMP Scalability Analysis; SPEC OpenMP Benchmarks (1); SPEC OpenMP Benchmarks (2); Incremental Profiling (1); Incremental Profiling (2); Incremental Profiling (3); Incremental Profiling; Incremental Profiling; Profiling: Data Views (2); Slide 59; Slide 60; IPM: Design Goals; IPM: Methodology; How to use IPM : basics; Want more detail? IPM_REPORT=full; Slide 65; IPM: XML log files; Message Sizes : CAM 336 way; Scalability: Required; More than a pretty picture; Scalability: Insight; Portability: Profoundly Interesting; Slide 72; Vampir overview statistics; Timeline display; Timeline display – message details; Communication statistics; Message histograms; Collective operations; Activity chart; Process–local displays; Effects of zooming; Slide 86; Basic Idea; MPI-1 Pattern: Wait at Barrier; MPI-1 Pattern: Late Sender / Receiver; Slide 90; KOJAK: sPPM run on (8x16x14) 1792 PEs; Slide 92; TAU Parallel Performance System; ParaProf – 3D Scatterplot (Miranda); ParaProf – 3D Scatterplot (SWEEP3D CUBE); PerfExplorer - Cluster Analysis; PerfExplorer - Correlation Analysis (Flash); Slide 98; Documentation, Manuals, User Guides; The space is big; Slide 101; Sharks and Fish II; Sharks and Fish II : Program; Sharks and Fish II: How fast?; Scaling: Good 1st Step: Do runtimes make sense?; Scaling: Walltimes; Scaling: Definitions; Scaling: Speedups; Scaling: Efficiencies; Scaling: Analysis

Performance Analysis Tools
Karl Fuerlinger
[email protected]
With slides from David Skinner, Sameer Shende, Shirley Moore, Bernd Mohr, Felix Wolf, Hans Christian Hoppe and others.
CS267 - Performance Analysis Tools | Karl Fuerlinger

Outline (slide 2)
Motivation
– Why do we care about performance
Concepts and definitions
– The performance analysis cycle
– Instrumentation
– Measurement: profiling vs. tracing
– Analysis: manual vs. automated
Tools
– PAPI: Access to hardware performance counters
– ompP: Profiling of OpenMP applications
– IPM: Profiling of MPI apps
– Vampir: Trace visualization
– KOJAK/Scalasca: Automated bottleneck detection of MPI/OpenMP applications
– TAU: Toolset for profiling and tracing of MPI/OpenMP/Java/Python applications

Motivation (slide 3)
Performance Analysis is important
– Large investments in HPC systems
  • Procurement: ~$40 million
  • Operational costs: ~$5 million per year
  • Electricity: 1 MW-year ~$1 million
– Goal: solve larger problems
– Goal: solve problems faster

Concepts and Definitions (slide 5)
The typical performance optimization cycle
[Figure: cycle diagram. Labels: Code Development; functionally complete and correct program; instrumentation; Measure; Analyze; Modify / Tune; complete, correct and well-performing program; Usage / Production]

Instrumentation (slide 6)
Instrumentation = adding measurement probes to the code to observe its execution
Can be done on several levels
Different techniques for different levels
Different overheads and levels of accuracy with each technique
No instrumentation: run in a simulator. E.g., Valgrind
[Figure: instrumentation can be inserted at each level on the way from the problem domain to performance data. Labels: user-level abstractions / problem domain; source code; preprocessor; compiler; object code; libraries; linker; executable; OS; runtime image; VM; run; performance data]

Instrumentation – Examples (1) (slide 7)
Source code instrumentation
– User-added time measurement, etc. (e.g., printf(), gettimeofday())
– Many tools expose mechanisms for source code instrumentation in addition to the automatic instrumentation facilities they offer
– Instrument program phases:
  • initialization / main iteration loop / data post-processing
– Pragma and preprocessor based
    #pragma pomp inst begin(foo)
    #pragma pomp inst end(foo)
– Macro / function call based
    ELG_USER_START("name");
    ...
    ELG_USER_END("name");

Instrumentation – Examples (2) (slide 8)
Preprocessor Instrumentation
– Example: Instrumenting OpenMP constructs with Opari
– Preprocessor operation
  [Figure: Original source code; Pre-processor; Modified (instrumented) source code]
– Example: Instrumentation of a parallel region
  [Figure: code listing in which instrumentation added by Opari surrounds /* ORIGINAL CODE in parallel region */]
This is used for OpenMP analysis in tools such as KOJAK/Scalasca/ompP

Instrumentation – Examples (3) (slide 9)
Compiler Instrumentation
– Many compilers can instrument functions automatically
– GNU compiler flag: -finstrument-functions
– Automatically calls functions on function entry/exit that a tool can capture
– Not standardized across compilers, often undocumented flags, sometimes not available at all
– GNU compiler example:
    void __cyg_profile_func_enter(void *this, void *callsite) {
        /* called on function entry */
    }
    void __cyg_profile_func_exit(void *this, void *callsite) {
        /* called just before returning from function */
    }
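
The user-added time measurement mentioned under source code instrumentation (slide 7) could look roughly like the sketch below. It assumes a POSIX system with gettimeofday(). The names region_timer, REGION_START, REGION_END, region_report and run_phases are hypothetical, made up for this illustration, and are not part of the slides or of any tool named in them. The three timed phases mirror the "initialization / main iteration loop / data post-processing" split.

```c
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical per-region record: accumulated wall time and call count. */
typedef struct {
    const char *name;
    double start;   /* time stamp of the most recent REGION_START */
    double total;   /* accumulated seconds spent inside the region */
    long calls;     /* number of completed START/END pairs */
} region_timer;

/* Wall-clock time in seconds, read with gettimeofday(). */
static double wall_seconds(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
}

/* Hand-written probes (assumed names), in the macro-based style. */
#define REGION_START(r) do { (r)->start = wall_seconds(); } while (0)
#define REGION_END(r) \
    do { (r)->total += wall_seconds() - (r)->start; (r)->calls++; } while (0)

static void region_report(const region_timer *r) {
    printf("%-16s calls=%ld total=%.6f s\n", r->name, r->calls, r->total);
}

#define N_ITEMS 1000
static double data[N_ITEMS];

/* Toy workload with the three program phases instrumented by hand. */
static double run_phases(region_timer *init, region_timer *loop,
                         region_timer *post) {
    double sum = 0.0;

    REGION_START(init);                 /* phase 1: initialization */
    for (int i = 0; i < N_ITEMS; i++)
        data[i] = (double)i;
    REGION_END(init);

    for (int iter = 0; iter < 10; iter++) {
        REGION_START(loop);             /* phase 2: main iteration loop */
        for (int i = 0; i < N_ITEMS; i++)
            sum += data[i];
        REGION_END(loop);
    }

    REGION_START(post);                 /* phase 3: data post-processing */
    sum /= 10.0;
    REGION_END(post);

    region_report(init);
    region_report(loop);
    region_report(post);
    return sum;
}
```

To try it, declare three timers such as `region_timer init = {"initialization", 0.0, 0.0, 0};` in a main function and pass their addresses to run_phases(). Each probe costs one gettimeofday() call, so very short regions mostly measure the probe itself, which is the overhead and accuracy trade-off noted on the Instrumentation slide.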
