CUSTOM-FPGA DESIGN AND MAPPING FOR DSP TRANSFORMSOUTLINEDSP TRANSFORMSMATH BACKGROUNDSlide 5Slide 6Slide 7LUT ENTRIESProposed ArchitectureSYSTOLIC ARRAY ARCHITECTURESlide 11Slide 12Slide 13LOGIC BLOCK OF THE FPGAPIPELINING OF THE DFGSlide 16Slide 17Slide 18WHAT’s NEW IN THIS?MAPPING ALGORITHMS TO FPGADFT SOFTWARE CODEDFT – Higher levelSlide 23Slide 24DCT/DST SOFTWARE CODEDCT/DST – HIGHER LEVELSlide 27Slide 28Slide 29DHT – HIGHER LEVELDHT – HIGHER LEVELSlide 32MAPPING AT LOGIC BLOCK LEVELLOGIC BLOCK LEVELSlide 35Slide 36Slide 37DHT – LOGIC BLOCK LEVELFPGA OR DSPSTATISTICSTHANK YOU! QUESTIONS?CUSTOM-FPGA DESIGN AND MAPPING FOR DSP TRANSFORMSBHARATHWAJ V SANKARASHUBHIKA TANEJAOUTLINEMathematics of different DSP algorithmsGeneralization of DSP algorithmsSystolic array architectureLogic block architecture & pipeliningMapping different algorithms to hardwareStatisticsDSP TRANSFORMSDFT: X(k) =DCT: X(k) =DST: X(k) =DHT: X(k) = 10)2/)12(*cos()(Nnk Nknpinx10)/*2*exp(*)(NnNnkpijnx10))1/()1)(1(sin(*)(NnNknpinx10))/*2sin()/*2(cos(*)(NnNnkpiNn kpinx1/√N, k=0 √2/N, k= 1 to N-1αk =MATH BACKGROUNDLet the Transform function be δ(n,k)then X(k)=Where For k=0;X(0) = ),()(),( knnxkn),(10knNn10)0,3()0,2()0,1()0,0()0,(NnnFor a 4 point transform:X(0) = x(0)δ(0,0) + x(1)δ(1,0) + x(2)δ(2,0) + x(3)δ(3,0)X(1) = x(0)δ(0,1) + x(1)δ(1,1) + x(2)δ(2,1) + x(3)δ(3,1)X(2) = x(0)δ(0,2) + x(1)δ(1,2) + x(2)δ(2,2) + x(3)δ(3,2)X(3) = x(0)δ(0,3) + x(1)δ(1,3) + x(2)δ(2,3) + x(3)δ(3,3)Consider DFT:X(k) = ==This is simillar to Hartley Transform except that the second term is multiplied by –j coefficient. Xh(k) = 10))/*2sin()/*2(cos(*)(NnNnkpijNnkpinx10)/*2*exp(*)(NnNnkpijnx10)/*2sin()()/*2cos()(NnNnkpinjxNnkpinx10)/*2sin()()/*2cos()(NnNnkpinxNnkpinxGeneralizing the transforms:X(k) = Where: cos(2*pi*nk/2N) – jsin(2*pi*nk/2N) - DFT αkcos[pi(2n+1)k/2N] - DCTδ(n,k) = sin[pi(n+1)(k+1)/(N+1)] - DST cos(2*pi*nk/N) + sin(2*pi*nk/N) - DHT10),()(NnknnxLUT ENTRIESTRANSFORMδ1(n,k) δ2(n,k)DFT cos(2*pi*nk/N) -jsin(2*pi*nk/N)DCT αkcos*[pi(n+1)k/2N] 0DSTcos[90-pi(n+1)(k+1)/(N+1)]0DHT cos(2*pi*nk/N) sin(2*pi*nk/N)SAD 1 -1δ(n,k) = δ1(n,k) + δ2(n,k)Proposed ArchitectureSYSTOLIC ARRAY ARCHITECTURESYSTOLIC ARRAY ARCHITECTUREδ(0,0) δ(0,1)δ(3,2)δ(2,0)δ(3,1)δ(2,3)δ(0,2)δ(1,0) δ(1,1)δ(2,2)δ(2,1)δ(3,0)δ(1,2) δ(1,3)δ(0,3)δ(3,3)δ(n,k)= δ1(n,k)+ δ2(n,k)SYSTOLIC ARRAY ARCHITECTUREδ(0,0) δ(0,1)δ(3,2)δ(2,0)δ(3,1)δ(2,3)δ(0,2)δ(1,0) δ(1,1)δ(2,2)δ(2,1)δ(3,0)δ(1,2) δ(1,3)δ(0,3)δ(3,3)x(0)x(1)x(2)x(3)δ(n,k)= δ1(n,k)+ δ2(n,k)SYSTOLIC ARRAY ARCHITECTUREδ(0,0) δ(0,1)δ(3,2)δ(2,0)δ(3,1)δ(2,3)δ(0,2)δ(1,0) δ(1,1)δ(2,2)δ(2,1)δ(3,0)δ(1,2) δ(1,3)δ(0,3)δ(3,3)x(0)x(1)x(2)x(3)X(0) X(1) X(2) X(3)δ(n,k)= δ1(n,k)+ δ2(n,k)LOGIC BLOCK OF THE FPGALUTk1k2××+/-+From other LBMode Mode δ(n,k)= δ1(n,k1)+ δ2(n,k2)Mode= SAD selectionModex(n)x(n)/y(n)PIPELINING OF THE DFGL×× MUXMUX+ +MUXPIPELINING OF THE DFGL×× MUXMUX+ +MUXPIPELINING OF THE DFGL×× MUXMUX+ +MUXDDDDDDDDDPIPELINING OF THE DFGL×× MUXMUX+ +MUXTcritial = max{ TL, TM, TA+TMUX} = TMDDDDDDDDDWHAT’s NEW IN THIS?Customized for transformsC-Code CAD - Systolic array architecture – suited for transformsEasier routingSpecific Data-Path that supports tranformsBetter performance Better utilization of resources.intuitiveMAPPING ALGORITHMS TO FPGADFT SOFTWARE CODE//PSEUDOCODEfor (i=0;i<N;i++) {X(i) = 0; X*(i) = 0;for (j=0;j<N;j++) {X(i) = X(i) + x(j)*cos(2*pi*i*j/N);X*(i) = X*(i) + x(j)*sin(2*pi*i*j/N);}}DFT – Higher levelδ1(0,2)δ1(1,2)δ2(0,2)δ2(1,2)δ1(0,3)δ1(1,3)δ1(2,2)δ1(3,2)δ2(2,2)δ2(3,2)δ1(2,3)δ1(3,3)δ2(2,3)δ2(3,3)δ2(0,3)δ2(1,3)δ1(0,0)δ1(1,0)δ2(0,0)δ2(1,0)δ1(0,1)δ1(1,1)δ1(2,0)δ1(3,0)δ2(2,0)δ2(3,0)δ1(2,1)δ1(3,1)δ2(2,1)δ2(3,1)δ2(0,1)δ2(1,1)DFT – Higher levelδ1(0,2)δ1(1,2)δ2(0,2)δ2(1,2)δ1(0,3)δ1(1,3)δ1(2,2)δ1(3,2)δ2(2,2)δ2(3,2)δ1(2,3)δ1(3,3)δ2(2,3)δ2(3,3)δ2(0,3)δ2(1,3)x(0)x(1)x(2)x(3)δ1(0,0)δ1(1,0)δ2(0,0)δ2(1,0)δ1(0,1)δ1(1,1)δ1(2,0)δ1(3,0)δ2(2,0)δ2(3,0)δ1(2,1)δ1(3,1)δ2(2,1)δ2(3,1)δ2(0,1)δ2(1,1)x(0)x(1)x(2)x(3)DFT – Higher levelδ1(0,2)δ1(1,2)δ2(0,2)δ2(1,2)δ1(0,3)δ1(1,3)δ1(2,2)δ1(3,2)δ2(2,2)δ2(3,2)δ1(2,3)δ1(3,3)δ2(2,3)δ2(3,3)δ2(0,3)δ2(1,3)x(0)x(1)x(2)x(3)δ1(0,0)δ1(1,0)δ2(0,0)δ2(1,0)δ1(0,1)δ1(1,1)δ1(2,0)δ1(3,0)δ2(2,0)δ2(3,0)δ1(2,1)δ1(3,1)δ2(2,1)δ2(3,1)δ2(0,1)δ2(1,1)x(0)x(1)x(2)x(3) X(2) X*(2) X(3) X*(3)X(0) X*(0) X(1) X*(1)DCT/DST SOFTWARE CODE//PSEUDOCODEfor (i=0;i<N;i++) {X(i) = 0; X*(i) = 0;for (j=0;j<N;j++) {X(i) = X(i) + x(j)*cos(pi*(2n+1)k/2N);}}DCT/DST – HIGHER LEVELδ(0,0)δ(1,0)δ(0,1)δ(1,1)δ(0,2)δ(1,2)δ(2,0)δ(3,0)δ(2,1)δ(3,1)δ(2,2)δ(3,2)δ(2,3)δ(2,3)δ(0,3)δ(1,3)δ(0,0)δ(1,0)δ(0,1)δ(1,1)δ(0,2)δ(1,2)δ(2,0)δ(3,0)δ(2,1)δ(3,1)δ(2,2)δ(3,2)δ(2,3)δ(2,3)δ(0,3)δ(1,3)DCT/DST – HIGHER LEVELδ(0,0)δ(1,0)δ(0,1)δ(1,1)δ(0,2)δ(1,2)δ(2,0)δ(3,0)δ(2,1)δ(3,1)δ(2,2)δ(3,2)δ(2,3)δ(2,3)δ(0,3)δ(1,3)x(0)x(1)x(2)x(3)δ(0,0)δ(1,0)δ(0,1)δ(1,1)δ(0,2)δ(1,2)δ(2,0)δ(3,0)δ(2,1)δ(3,1)δ(2,2)δ(3,2)δ(2,3)δ(2,3)δ(0,3)δ(1,3)x(1)x(2)x(3)x(0)DCT/DST – HIGHER LEVELδ(0,0)δ(1,0)δ(0,1)δ(1,1)δ(0,2)δ(1,2)δ(2,0)δ(3,0)δ(2,1)δ(3,1)δ(2,2)δ(3,2)δ(2,3)δ(2,3)δ(0,3)δ(1,3)x(0)x(1)x(2)x(3)X(0) X(1) X(2) X(3)δ(0,0)δ(1,0)δ(0,1)δ(1,1)δ(0,2)δ(1,2)δ(2,0)δ(3,0)δ(2,1)δ(3,1)δ(2,2)δ(3,2)δ(2,3)δ(2,3)δ(0,3)δ(1,3)x(1)x(2)x(3)x(0)X(0) X(1) X(2) X(3)DCT/DST SOFTWARE CODE//PSEUDOCODEfor (i=0;i<N;i++) {X(i) = 0; X*(i) = 0;for (j=0;j<N;j++) {X(i) = X(i) + x(j)*(cos(2*pi*nk/N)+sin(2*pi*nk/N);}}DHT – HIGHER LEVELδ(0,0) δ(0,1)δ(3,2)δ(2,0)δ(3,1)δ(2,3)δ(0,2)δ(1,0) δ(1,1)δ(2,2)δ(2,1)δ(3,0)δ(1,2) δ(1,3)δ(0,3)δ(3,3)DHT – HIGHER LEVELδ(0,0) δ(0,1)δ(3,2)δ(2,0)δ(3,1)δ(2,3)δ(0,2)δ(1,0) δ(1,1)δ(2,2)δ(2,1)δ(3,0)δ(1,2) δ(1,3)δ(0,3)δ(3,3)x(0)x(1)x(2)x(3)DHT – HIGHER LEVELδ(0,0) δ(0,1)δ(3,2)δ(2,0)δ(3,1)δ(2,3)δ(0,2)δ(1,0) δ(1,1)δ(2,2)δ(2,1)δ(3,0)δ(1,2) δ(1,3)δ(0,3)δ(3,3)x(0)x(1)x(2)x(3)X(0) X(1) X(2) X(3)MAPPING AT LOGIC BLOCK
View Full Document