DOC PREVIEW
WUSTL CSE 567M - Performance of Alternative Topologies

This preview shows page 1-2-3-4 out of 11 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Joe Wingbermuehle, [email protected] (A paper written under theguidance of Prof. Raj Jain) DownloadThe Auto-Pipe system allows one to evaluate various resource mappings and topologies for streamingapplications. In this paper we use Auto-Pipe to gather timing information for individual blocks in a streamingapplication to solve Laplace's equation. We then use the timing information to seed a queueing model topredict the performance of various topologies and resource mappings. Finally, we use the Auto-Pipe system tomeasure the performance of the topology to verify the model and rank the topologies and resource mappingsbased on throughput.Keywords: Auto-Pipe, Laplace's Equation, Performance Analysis, Queueing Theory, Monte-CarloSimulation, X-Language1. Introduction1.1 Auto-Pipe1.2 Laplace's Equation1.3 Methods for Solving Laplace's Equation1.4 The Application2. Performance Analysis Methodology2.1 Metrics2.2 Parameters2.3 Factors2.4 Experimental Design3. Analytic Model3.1 The Model3.2 Block Timings3.3 Mappings4. Measurement4.1 Results4.2 Analysis4.3 Interpretation5. ConclusionReferencesAcronymsWe attempt to determine the best topology and resource mapping for an Auto-Pipe application to solveLaplace's equation. Auto-Pipe allows one to experiment with various topologies of the application and obtaintiming information. We use the Monte-Carlo method for solving Laplace's equation since it represents a small,Performance of Alternative Topologies for Solving Laplace's Equation usi... http://www1.cse.wustl.edu/~jain/cse567-11/ftp/laplace/index.html1 of 11 5/4/2011 4:49 PMbut useful application that fits nicely into the streaming framework.1.1 Auto-PipeAuto-Pipe is a system for creating computing pipelines across heterogeneous compute platforms [Tyson06,Franklin06, Chamberlain07]. A heterogeneous compute platform may contain a combination of CPUs, GPUs,FPGAs, or other accelerator devices. Auto-Pipe provides the user with the ability to author compute blocks tooperate on a data stream. The user can then specify the topology and resource mapping of the applicationusing Auto-Pipe's X language. This X language description is fed to the Auto-Pipe X compiler to build theapplication. The Auto-Pipe approach allows one to access timing information for individual blocks and toeasily modify the application topology and block-to-resource mapping of the application.1.2 Laplace's EquationThe goal of the application is to solve Laplace's equation in two dimensions (shown in figure 1.1). Laplace'sequation is a second-order partial differential equation (PDE) [Strauss92]. This equation has several usesincluding modeling stationary diffusion (such as heat) and Brownian motion. For heat, given the temperaturesat the boundaries of a two-dimensional object, solutions to Laplace's equation provide the interiortemperatures after the temperatures have stabilized. The ease of solving Laplace's equation depends on thenature of the boundary conditions. An analytic solution exists for simple boundary conditions, however, formany boundaries conditions, no analytic solution exists and numerical solutions must be sought [Farlow93].Figure 1.1: Laplace's Equation in Two Dimensions1.3 Methods for Solving Laplace's EquationThere are numerous ways to solve Laplace's equation. As previously stated, for certain simple boundaryconditions a straight-forward analytical solution exists. However, it is often the case that a numerical methodis required. One numerical approach is Gauss-Seidel iteration [Strauss92]. This method converges to thecorrect solution quickly, but it requires that the complete grid be stored in memory. Another method isMonte-Carlo simulation. This technique is provably correct [Reynolds65], but converges slowly.Nevertheless, this method is useful if only a few interior points are needed. This is because the Monte-Carlomethod does not require storing the whole grid since the grid is implicit and only those points that are ofinterest need be computed. Further, Monte-Carlo methods work well in a pipelined framework such asAuto-Pipe [Singla08].1.4 The ApplicationThe Auto-Pipe application to solve Laplace's equation is made up of several compute blocks. We canreplicate some blocks to achieve better performance via parallelism by dividing the work across multiplecompute resources. All of the blocks considered here are implemented in C for a general purpose processor.However, Auto-Pipe would allow any of these to be re-implemented for an FPGA. Table 1.1 lists the blocksused in this application. Note that there are a few other blocks used to seed the random number generator andto read the boundary conditions, but these blocks don't contribute much to the running time so we don'tconsider them here.Table 1.1: BlocksPerformance of Alternative Topologies for Solving Laplace's Equation usi... http://www1.cse.wustl.edu/~jain/cse567-11/ftp/laplace/index.html2 of 11 5/4/2011 4:49 PMBlock DescriptionRNG Random number generator.Split Block to distribute numbers equally among two outputs.Walk Block to perform random walks.AVG Block to take the average of two inputs.Print Block to output the result.The application works by generating random numbers which are then fed to a block to perform a randomwalk starting from the coordinates of interest. For this application we consider all grid coordinates. Therandom numbers determine the direction for each stage of the random walk. The walk continues until aboundary is crossed. Once the boundary is crossed, the temperature at the boundary point is forwarded toeither a print block, which saves the result to a file, or an average block, which averages the results of twoinputs. The application continues this process many times for each coordinate in a rectangular region. Notethat in a real use of this application one would probably only be interested in few locations, but for thepurpose of performance analysis, we consider the whole grid.There are multiple ways to divide the random walks among compute resources. One method is to configurethe block used to generate random numbers to feed a block that splits the output among multiple walk blocks.A second method is to replicate the random number generation blocks along with the walk blocks by usingdifferent random number seeds for each random number generation block. In either case, we feed the resultsfrom the random walk blocks to a block that averages them before we send the averages to a print block to berecorded. Given a single compute


View Full Document

WUSTL CSE 567M - Performance of Alternative Topologies

Documents in this Course
Load more
Download Performance of Alternative Topologies
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Performance of Alternative Topologies and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Performance of Alternative Topologies 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?