DOC PREVIEW
UW-Madison ECE 734 - Mapping DSP algorithms

This preview shows page 1-2 out of 7 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Mapping DSP algorithms to a general purpose out-of-order processorOutlineIntroductionOut-of-order execution overviewUnderstanding DG change for mapping to GPPTo-do listExpected ResultsMapping DSP algorithms to a general purpose out-of-order processorECE 734Ilhyun KimDonghyun BaikOutlineIntroductionOut-of-order execution overviewDependence graph change for mapping to GPPTo-do listExpected resultsIntroductionWhy DSP applications are implemented on GPP–Lower development cost•commodity part, lower maintenance cost, faster turn-around–GPP meets performance requirement•GPP becomes faster. Only DSP chips could do it in the pastProblems with algorithm transformations–Used for faster operations / efficient hardware–On GPP, no control over hardware configurations–Some of them are effective while others notProblems with extracting parallelism–Duplicated efforts on each of layers (source code, compiler, processor)–Narrow machine scope over independent operationsWhat are the efficient ways to map algorithm to GPP?–Understanding how a GPP executes instructions–How can software improve the performance?Out-of-order execution overviewDynamic parallelism extraction–Instruction re-ordering–Dynamically searching independent operations within a limited scope–Trying to keep all available hardware resources busyfor(i=1~m) for(j=1~n) c(i,j)=c(i,j-1) +c(i-1,j)functionalunitsUnderstanding DG change for mapping to GPPfor(i=1~m) for(j=1~n) c(i,j)=c(i,j-1) + c(i-1,j)for(i=1~m) for(j=1~n) c(i,j)=c(i,j-1) + c(i-1,j)for(i=1~m) for(j=1~n) c(i,j)=c(i,j-1) + c(i-1,j)computationmem accesscontrol-relatedinstruction windowinstruction windowinstruction windowinstruction windowinstruction windowinstruction windowinstruction windowinstruction windowinstruction windowinstruction windowTo-do listInfrastructure–Building a perfect machine model simulator that keeps track of only computations in the algorithm, measuring ideal execution time of a compiled binary assuming perfect parallelism–Building a profile tool that locates an instruction that we are interested in among instructions in the binaryThe effect of single assignment–Characterizations on various machine configurationsThe effect of unfoldingThe effect of SIMD parallelismOptimization techniques for the Alpha architecture based on characterization dataOptimizing an existing DSP application: MPEG-2 decoderExpected ResultsSingle assignment transformation doesn’t work–Rather, try to recycle storage space whenever possible–HW-based single assignment is performedUnfolding transformation–It works on iteration-independent loops w/ trivial computations–It reduces the loop indices overhead (even on iteration-dependent loops)–Do not unfold loops w/ non-trivial computationsSIMD parallelism to reduce memory communications–Alpha doesn’t support SIMD instruction sets but it has 64-bit datapath and instructions–read/write 64 bits at a time by splitting/merging narrow words–There are more computing units (4) than memory units (2) : reducing memory operations helps performancePerformance improvement of MPEG-2 decoder based on the optimizations that we


View Full Document

UW-Madison ECE 734 - Mapping DSP algorithms

Documents in this Course
Load more
Download Mapping DSP algorithms
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Mapping DSP algorithms and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Mapping DSP algorithms 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?