Berkeley COMPSCI C267 - Advances in the Parallelization of Music and Audio Applications - D2393426

School: University of California, Berkeley
Course: COMPSCI C267 - Applications of Parallel Computers
Pages: 19

Slide titles:
1. Slide 1
2. Overview
3. Current Support for Parallelism is Copy-Based
4. Partitioned Convolution
5. Partitioned Convolution
6. Uniform Partitioned Convolution
7. Frequency Delay Line Convolution
8. Multiple FDL Convolution
9. Scheduling Multiple FDLs
10. Auto-Tuning for Real-Time
11. Accelerating Non-Negative Matrix Factorization (NMF)
12. Slide 12
13. Slide 13
14. Slide 14
15. Example of Music Application
16. A plea for more flexible GPU I/O
17. Thanks for your attention.
18. Reserve Slides
19. Tessellation in Server Environment

Advances in the Parallelization of Music and Audio Applications
Eric Battenberg, David Wessel & Juan Colmenares

Overview
- Parallelism today in the popular interactive music languages
- Parallel Partitioned Convolution
- Accelerating Non-Negative Matrix Factorization (NMF) for use in audio source separation and music information retrieval, and the importance of Selective, Embedded Just-In-Time Specialization (SEJITS)
- Real-time in the Tessellation OS
- A plea for more flexible I/O with GPUs

Current Support for Parallelism is Copy-Based
- The widely used languages for music and audio applications are fundamentally sequential in character; this includes Max/MSP, Pd, SuperCollider, and ChucK, among others.
- Limited multithreading
- One approach to exploiting multi-core processors is to run copies of the applications on separate cores.
- Max/MSP provides a useful multi-threading mechanism called poly~.
- Pd provides pd~, each instance of which runs in a separate thread inside a Pd patch.

Partitioned Convolution
- First real-time app in the Par Lab.
- Partitioned convolution: an efficient way to do low-latency filtering with a long (> 1 sec) impulse response.
- Important in real-time reverb processing for environment simulation.
- Sound examples: acoustic guitar … in a giant mausoleum … convolved with a sine sweep
[Figure: two impulse response plots]

Partitioned Convolution
- Convolution: a way to do linear filtering with a finite impulse response (FIR) filter.
$$y[n] = \sum_{k} h[k]\, x[n-k]$$
- Direct convolution: for a length-L filter, O(L) ops per output point, zero delay.
- L can be greater than 100,000 samples (> 3 sec of audio).
- Block FFT convolution: only O(log(L)) ops per output point, but a delay of L.
- How can we trade off between complexity and latency?
[Diagram: x -> FFT -> Complex Mult (with H = FFT(h)) -> IFFT -> y]

Uniform Partitioned Convolution
- We would like the latency to be less than 10 ms (512 samples).
- Cut an impulse response up into equal-sized blocks.
- Then we can use a parallel layout of Block FFT convolvers with delays to implement the filter.
- The latency is now N, and we still get complexity savings.
[Diagram: impulse response of length L cut into blocks 1 to 5 of size N; input x, delay(N) elements, Block FFT Convolvers 1 to 5, summed to output y]

Frequency Delay Line Convolution
- We can also exploit linearity of the FFT so that only one FFT/IFFT is required.
- So the parallel Block FFT Convolver above becomes a Frequency Delay Line (FDL) Convolver.
[Diagram: Block FFT Convolver (input x, delay(N) elements, convolvers 1 to 3, each with FFT, Complex Mult by H1 and IFFT, summed to output y) next to Frequency Delay Line Convolver (input x, one FFT, delay(N) elements, Complex Mult by H1, H2 and H3, sum, one IFFT, output y)]

Multiple FDL Convolution
- If L is big (e.g. > 100,000) and N is small (e.g. < 1000), our FDL will have hundreds of partitions to handle.
- We can connect multiple FDLs in parallel to get the best of both worlds.
[Diagram: input x, FDL 1, FDL 2 and FDL 3 with delay(Nx6) and delay(4Nx4) elements, summed to output y]

Scheduling Multiple FDLs
- FDLs are run in separate threads.
- Each is allowed to compute for a length of time corresponding to its block size.
- Synchronization is performed at the vertical lines in the schedule diagram.

Auto-Tuning for Real-Time
- We are not trying only to maximize throughput.
- We are trying to improve our ability to make real-time guarantees.
- For now, we estimate a Worst-Case Execution Time (WCET) for each size of FDL.
- Then we combine the FDLs that are most likely to meet their scheduling deadlines.
- In the future, we will use a notion of predictability along with more robust scheduling.
- We are finishing development on a Max/MSP object, an Audio Unit plugin, and a portable standalone version of this.

Accelerating Non-Negative Matrix Factorization (NMF)
- NMF is widely used in audio source separation.
- The idea is to factor the time/frequency representation (spectrogram) into source-coupled spectral (W) and gain (H) matrices.

The Importance of SEJITS in Developing a Music Information Retrieval (MIR) Application
- Rather than using a domain-restricted language, developers write in a full-blown scripting language such as Python or Ruby.
- Functions are selected by annotation as performance critical.
- If efficiency-layer implementations of these functions are available, appropriate code is generated and JIT compiled.
- If not, the selected function is executed in the scripting language itself.
- The scripted implementation remains as the portable reference implementation.

With this simple music computer application we expect to initially show that Tessellation can provide acceptable performance and time predictability.
In cooperation with the OS Group.
[Diagram labels: 2nd-level RT scheduler A, Cell A; 2nd-level RT scheduler B, Cell B; Initial Cell; Sound card; Shell; F; Output; Input; Music Program; End-to-end Deadline; Intermediate Deadline; Audio Processing & Synthesis Engine; Channel; F: Most of the engine's functionality; Filter: Parallel version of a partition-based convolution algorithm; Audio Input; Additional Cells]

A real-time application in Tessellation
[Diagram labels: 2nd-level Scheduling; Cell; Tessellation Kernel (Partition Support)]
(*) Bottom part of the diagram was adapted from Liu and Asanovic, “Mitosys: ParLab Manycore OS Architecture,” Jan. 2008.

1.A) Cell and Space Partitioning
- A Spatial Partition (or Cell) comprises a group of processors acting within a hardware boundary.
- Each cell receives a vector of basic resources:
  - Some number of processors, a portion of physical memory, a portion of shared cache memory, and potentially a fraction of memory bandwidth
- A cell may also receive:
  - Exclusive access to other resources (e.g., certain hardware devices and raw storage partition)
  - Guaranteed fractional services (i.e., QoS guarantees) from other partitions (e.g., network service and file service)
[Diagram labels: CPU, L1, L2 Bank, DRAM, DRAM & I/O Interconnect, L1 Interconnect, L1
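
The uniform partitioned and frequency delay line (FDL) convolution slides can be illustrated with a short Python sketch. It is not the authors' implementation: the function names (fft, ifft, direct_convolve, fdl_convolve), the overlap-save framing with 2N-point FFTs, and the small pure-Python radix-2 FFT are all assumptions chosen to keep the example self-contained, and block_size is assumed to be a power of two. The impulse response h is cut into blocks of N samples, each input block is transformed once, the spectra are held in a delay line, and the per-partition products are summed before a single inverse FFT.

```python
import cmath


def fft(a, inverse=False):
    """Unscaled recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    sign = 1 if inverse else -1
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(a):
    """Inverse FFT with the 1/n scaling applied once at the top level."""
    n = len(a)
    return [v / n for v in fft(a, inverse=True)]


def direct_convolve(x, h):
    """Reference: y[n] = sum_k h[k] x[n-k], O(L) multiplies per output."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += hk * xn
    return y


def fdl_convolve(x, h, block_size):
    """Uniformly partitioned overlap-save convolution with one FFT and
    one IFFT per block (hypothetical helper; block_size a power of two)."""
    N = block_size
    num_parts = -(-len(h) // N)  # ceil(len(h) / N)
    # Spectrum of each N-sample partition of h, zero-padded to 2N.
    H = []
    for p in range(num_parts):
        part = list(h[p * N:(p + 1) * N])
        part += [0.0] * (2 * N - len(part))
        H.append(fft(part))

    out_len = len(x) + len(h) - 1
    num_blocks = -(-out_len // N)
    x_padded = list(x) + [0.0] * (num_blocks * N - len(x))

    # fdl[p] holds the spectrum of the input frame from p blocks ago.
    fdl = [[0j] * (2 * N) for _ in range(num_parts)]
    prev = [0.0] * N
    y = []
    for m in range(num_blocks):
        cur = x_padded[m * N:(m + 1) * N]
        fdl.pop()                        # drop the oldest spectrum
        fdl.insert(0, fft(prev + cur))   # one forward FFT per block
        prev = cur
        acc = [0j] * (2 * N)
        for Hp, Xp in zip(H, fdl):       # complex multiply and accumulate
            for k in range(2 * N):
                acc[k] += Hp[k] * Xp[k]
        # One inverse FFT per block; the last N samples are the valid output.
        y.extend(v.real for v in ifft(acc)[N:])
    return y[:out_len]
```

Calling fdl_convolve(x, h, 2) on a short signal should agree with direct_convolve(x, h) up to floating-point rounding. Each output block depends only on input received up to that block, which is why the latency is N rather than L. A real-time implementation would use an optimized FFT library instead of this recursive one.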
