WUSTL CSE 567M - Analysis of Sorting as a Streaming Application

Unformatted text preview:

Analysis of Sorting as a Streaming ApplicationGreg Galloway, [email protected] (A class project report writtenunder the guidance of Prof. Raj Jain)DownloadAbstract Expressing concurrency of applications has always been a challenging and error-prone task, yeteffective use of multi-core processors and hardware implementations requires that the concurrency ofapplications be well represented and understood. One approach to the expression of concurrency is streaming.In this paper I express the classic problem of sorting in the streaming paradigm, and explore variousalgorithmic and architectural design parameters.Keywords: Sorting, Auto-Pipe, Sorting, X-Language, Performance ModelingTable of Contents:1 Introduction2 Auto-Pipe Streaming Application Development Environment2.1 Analysis Techniques3 Sorting Application3.1 Benefits of Streaming Approach3.2 Mapping Alternatives4 Performance Results4.1 Benefits and Drawbacks of Initial Presentation4.2 Application of Performance Techniques4.3 Improvements to Analysis5 Applied Performance Techniques5.1 Result Analysis6 Summary7 References8 Acronyms1 IntroductionWith multi-core processors becoming the standard for general-purpose computing, there has been an increaseof interest in parallel processing topics, specifically parallel algorithms. A relatively new approach is thestream programming paradigm. Expanding upon the traditional shared-memory programming paradigm andthe message-passing programming paradigm, stream computing has been introduced as a more data-centricapproach to authoring parallel applications.There are many languages that support stream computing. It has been argued that languages expressingstreams represent a better mechanism for reasoning about concurrency than thread-based approaches. The XLanguage [Franklin06, Tyson06] is a stream-based coordination language for hybrid systems (systems witharchitecturally diverse components, such as FPGAs, GPUs, etc.).This paper describes the classic sorting problem in terms of a streaming computation. Variations examinedinclude the degree of pipelining vs. data parallelism, the performance of the sorting application whenSorting as a Streaming Application Using Auto-Pipe1 of 10deployed on multi-processors, and shared vs. common memory systems. The Auto-pipe design environmentwas used to generate the data, which is also briefly described.2 Auto-Pipe Streaming Application DevelopmentEnvironmentAuto pipe is a performance-oriented development environment for hybrid systems. It concentrates onapplications that are represented primarily as dataflow graphs and is especially useful in dealing withstreaming applications placed on pipelined architectures. In Auto-Pipe, applications are expressed in theX-language [Tyson06] as acyclic dataflow graphs. These graphs contain individual tasks called "blocks" andare connected with interconnections called "edges". Blocks represent the computational tasks, and edgesindicate the type and flow of data between the blocks.The implementations of the blocks are written in various languages for any subset of the available platforms.Currently Auto-Pipe supports C for general-purpose processors, HDL (Hardware Description Language) forFPGAs (Field Programmable Gate Arrays)[Gayen07], and assembly for network processors and DSPs (DigitalSignal Processors) [Chamberlain07]. Auto-Pipe further provides an extensible infrastructure for supporting awider variety of interconnection and computational devices, simulators, and languages.2.1 Analysis TechniquesBuilt in analysis is lacking in the toolset itself, so making use of a good reference on systems performanceanalysis [ex: Jain91] assist in making the best use of the results. Anyone who makes use of computer systemsshould be able to analyze the performance of the systems they design, and be able to state the requirements.There is a strong need to be able to consider the provided alternatives and to choose the solution that bestmeets their needs [Jain91]. More on this topic will be discussed in a later section.3 Sorting ApplicationFrequently large data sets are split into smaller groups before being sorted, and then merged in a differentstep. Split blocks are used to quickly divide an incoming data stream, routing half of the incoming records toanother block, usually another split or a sort. After each group of records is sorted, they are then routed to amerge block, which uses the simple merge sort to recombine the data. The particular sorting algorithm used bythe sort blocks is not significant; however in the results presented, comb sort [Lacey91] was used. Comb sortwas selected due to it being a reasonably efficient O(nlogn) in-place algorithm. The process of splitting thedata stream up can be done in a few different ways, shown below. The first figure demonstrates using two sortblocks in combination with a single merge and split block. The second figure shows a topology that uses foursort blocks, and figure 3 shows a topology consisting of eight sorts.Sorting as a Streaming Application Using Auto-Pipe2 of 10Figure 1: Example Two Sort TopologyFigure 2: Example Four Sort TopologyFigure 3: Example Eight Sort Topology3.1 Benefits of Streaming ApproachSorting as a Streaming Application Using Auto-Pipe3 of 10The power of expressing the sorting application in these ways is that the computation supports a streamingdata model, where pipelining is used to enable the sort blocks to work on one group of records while themerge blocks concurrently work on another group. In this model, pipeline-based parallelism and dataparallelism are both explicitly represented[Franklin06].There are a few clear benefits to authoring applications using this approach. First, it is possible to build alibrary of blocks that can be re-used [Tyson05]. This would enable application development primarily in thecoordination language without requiring implementation of individual blocks. Second, the data movementbetween blocks is not something that needs to be explicitly managed by the application developer. TheX-coordination language already determines where data is to be delivered [Tyson06]. Third, algorithmdecomposition is known to the system, making it straightforward to adjust the mapping. This makes theprocess of distributing the blocks to various compute resources much easier. Finally, the streaming dataparadigm is a natural approach to reasoning about the correctness of an application, which reduces thechances of making programming errors [Gayen07].


View Full Document

WUSTL CSE 567M - Analysis of Sorting as a Streaming Application

Documents in this Course
Load more
Download Analysis of Sorting as a Streaming Application
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Analysis of Sorting as a Streaming Application and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Analysis of Sorting as a Streaming Application 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?