Unformatted text preview:

Fast Fourier Transform:Practical aspects and Basic ArchitecturesMultiplication complexity per output pointMultiplies and addsStructural considerationsInverse FFTIn-place computationRegularity, parallelismQuantization noiseParticular casesDFT pruningRelated transformsRelated transforms: DHTRelated transforms: DCTRelationship with FFTImplementation issuesImplementation on general purpose computersDigital Signal ProcessorsVector and multi-processorsASICsArchitecturesThe naive approach1 multiply-add cellCascade FFTFFT networkFFT networkThe perfect-shuffle networkThe Mesh implementationPerformance summaryReadings6.973 Communication System Design – Spring 2006Massachusetts Institute of TechnologyVladimir StojanovićCite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].Fast Fourier Transform:Practical aspects and Basic ArchitecturesLecture 9Multiplication complexity per output point CTFFT and SRFFTCite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].6.973 Communication System Design2CTFFT and SRFFT0.01.02.03.04.05.06.07.08.09.010.03 4 5 6 7 8 9 10n = log2 NM/N- radix 2- radix 4- split-radix- lower boundMrMcFigure by MIT OpenCoursWare.3Multiplies and addsReal multipliesReal adds Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].6.973 Communication System DesignN Radix 2Radix 2Radix 4Radix 4SRFFTSRFFTPFAPFAWinogradWinograd16326412825651210242048163264128256512102420483060120240504100825203060120240504100825202488264712180043601024823560202081392785620686819651612843076717216388136276632157235489492N15240810322504589613566307286861614897654882833614838896423085380122922765261444384888207648121338829548840763848882076501614540346689962810020046011002524580417660Figure by MIT OpenCourseWare.Structural considerations How to compare different FFT algorithms? Many metrics to choose from The ease of obtaining the inverse FFT In-place computation Regularity Computation Interconnect Parallelism and pipelining Quantization noiseCite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].6.973 Communication System Design4Inverse FFT FFTs often used for computing FIR filtering Fast convolution (FFT + pointwise multiply + IFFT) In some applications (like 802.11a) Can reuse FFT block to do the IFFT (half-duplex scheme) Simple trick [Duhamel88] Swap the real and imaginary inputs and outputs If FFT(xR,xI,N) computes the FFT of sequence xR(k)+jxI(k) Then FFT(xI,xR,N) computes the IFFT of jxR(k)+xI(k)Exchange the real and imag partCite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].6.973 Communication System Design5In-place computation Most algorithms allow in-place computation Cooley-Tukey, SRFFT, PFA No auxilary storage of size dependent on N is needed WFTA (Winograd Fourier Transform Algorithm) does not allow in-place computation A drawback for large sequences Cooley-Tukey and SRFFT are most compatible with longer size FFTsCite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].6.973 Communication System Design6Regularity, parallelism Regularity Cooley-Tukey FFT very regular Repeat butterflies of same type Sums and twiddle multiplies SRFFT slightly more involved Different butterfly types in parallel e.g. radix-2 and radix-4 used in parallel on even/odd samples PFA even more involved Repetitive use of more complicated modules (like cyclic convolution, for prime length DFTs) WFTA most involved Repetition of parts of the cyclic conv. modules from PFA Parallelization Fairly easy for C-TFFT and SRFFT Small modules applied on sets of data that are separable and contiguous More difficult for PFA Data required for each module not in contiguous locationsCite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].6.973 Communication System Design76.973 Communication System Design8Quantization noise Roundoff noise generated by finite precision of operations inside FFT (adds, multiplies) CTFFT (lengths 2n) Four error sources per butterfly (variance 2-2B/12) Total variance per butterfly 2-2B/3 Each output node receives signals from a total of N-1 butterflies in the flow graph (N/2 from the first stage, N/4 in the second, …) Total variance for each output ~ N/3*2-2B Assuming input power 1/3N2(|x(n)|<1/N to avoid overflow) Output power is 1/3N Error-to-signal ratio is then N22-2B(needs 1 additional bit per stage to maintain SER) Since a maximum magnitude increases by less than 2x from stage to stage we can prevent the overflow by requiring that |x(n)|<1 and scaling by ½ from stage-to-stage The output will be 1/N of the previous case, but the input magnitude can be Nx larger, improving the SER Error-to-signal ratio is then 4N*2-2B(needs 1/2 additional bit per stage to maintain SER) Radix-4 and SRFFT generate less roundoff noise than radix-2 WFTA Fewer multiplications (hence fewer noise sources) More difficult to include proper rescaling in the algorithm Error-to-signal ratio is higher than in CTFFT or SRFFT Two more bits are necessary to represent data in WFTA for same error order as CTFFTCite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].Particular cases DFT algorithms for real data sequence xk Xkhas Hermitian


View Full Document

MIT 6 973 - Practical aspects and Basic Architectures

Download Practical aspects and Basic Architectures
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Practical aspects and Basic Architectures and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Practical aspects and Basic Architectures 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?