Verilog Implementation of CORDIC adaptive lattice filter (CALF)Pranay Koka ([email protected])Abstract In this project the cordic based algorithm for an adaptive lattice filter was mapped into hardware. It involved applying techniques and algorithms like systolic array mapping, pipelining and retiming to achieve the desired objective of an efficient implementation of a the filter. The basic CALF algorithm was first analysed in-terms of the computation dependence and then the areas of improvement were spotted. Thealgorithm was then modified so that it could be pipelined and re-timed for speed. Systolic array mapping was also performed. This project also involved implementation of the improved version of the algorithm in verilog HDL. The filter modeled here is a 16-bit 2 – cordic stage, 2-section filter with saturation arithmetic.Introduction and related work: Lattice filters are used in various areas and are an important class of filters which need to be optimized for performance. This project aims at an efficient verilog implementation and synthesis of an adaptive lattice filter. It is important to quote the related work done in this field. The design of lattice filters have been improved with the advent of CORDIC (Co-ordinate rotation digital computer) algorithms[1]. Cordic algorithms are a class of algorithms that use basic shift and add techniques to realize complex trignometric, hyperbolic and logarithmic functions. All trignometric functions can be computed using vector rotations. The cordic algorithms provide efficient methods to perform the vector rotations by arbitrary angles using only shift and adds. Because of the shift and add properties these algorithms have become attractive for an FPGA implementation. A cordic based algorithm for an adaptive was proposed [2],[3], which used basic cordic iterative units to form a lattice section. The adaptive feedback was also provided inusing a simple implementation as discussed in the later sections. The Basic CALF algorithm: The calf algorithm proposed in [2] is given below: While(true) txtbtfoo;For p = 1 to Pl tfxp 10; 101tbyp;For i = 0 to n-1 Do 12 pdvii iyixvviyixiiii1 2 2 11111End nytbnxtfpp 11,If 120 nptd nynxtdtdppsgn.sgn.1Elseif 12or 0 npptdtd tdtdpp1Endt = t+1if t > tmax then break loopEndThe important point to note here is that the logic required to generate the ‘d’ vectors can be as simple as an up-down counter. The block diagram of the lattice filter is shown in figure 1Figure 1 block diagram of cordic adaptive lattice filterRepresentation of the CALF algorithm in dependence graph: The basic algorithm is a 3 nested loop. Analysing the inner loop and the next outer loop P,The dependence graph in Figure 2 shows a dependency between the iterations inside a section and also inter-section dependence. The DG represents a four section lattice filter with four cordic iterations in each section. As shown in the figure ith cordic iteration is dependent on the (i-1)th iteration and the first iteration of the p+1th lattice section is dependent on the last iteration of the pth section. DbDDCORDICSectionCORDICsectionUp/Down binary counterfFigure 2 Dependence graph between iterationsFigure 3 shows the dependence between the sections of the lattice filter with respect to time. The dependence of X and Y tallies with the placement of the extra delay in the case of y as shown in the figure. The iterations in p+1th section depend on the pth section.Figure 3 Dependence graph for inter-section dependenceiPX(0) y(-1) d(p,0)X(0,1,N) y(0,1,n)X(0,2,N) y(0,2,N)X(0,3,N) y(0,3,N)X(0)Y(-1)X(1)Y(0)d(0)d(1)d(2)d(3)tPSystolic array mapping: Before deriving the mapping for systolic arrays, let us see the 3 variable form of the above described CALF algorithm. For t = 0,1,2….. txtx ,0,0 1,0,0 txtyFor P = 1 to PL doFor I = 0 to N-1 do 1,*2,, iptdiptv 12*,,,,iiptviptc iptciptyiptxiptx ,,*,,,,1,, iptciptxiptyipty ,,*,,,,1,, End do NptyNptxtpdtpd ,,sgn*,,sgn,1,or tpdtpd ,1, Nptxptx ,,0,1, Nptypty ,,10,1, end doend doFrom the DG in figure 3 we have the following dependence matrix1 0 10 1 0DWe choose a scheduling vector 1 1TSand a projection vector: 1 0TDCausality constraint: 0vST 1101 11vST 1011 12vST 2111 13vSTThis satisfies the causality constraint.Resource conflict avoidance: 0dST 1101 1 dSTNow 0dpT 0 1TpPE assignment (node mapping) : ptp0 1ARC mapping:The linear schedule is given by1 1 02 1 11 0 11 1 0 0 11 1DpSttThe systolic array with this mapping is shown in figure 4Figure 4 Systolic array mapping after inter-section pipeliningThe systolic array in Figure4 is after coarse grain pipelining i.e. after pipelining between the sections of the lattice filter. Piplining and retiming inter-cordic interations. This being a stochastic gradient filter, we can add more delays to the computation of the d vector in the up-down counter without much changing the characteristics of the filter. Hence the algorithm will now change as follows:D D D D2D 2D 2DD D D…X(1) X(0)….Y(1) Y(0)For t = 0,1,2….. txtx ,0,0 1,0,0 txtyFor P = 1 to PL doFor I = 0 to N-1 do 1,*2,, iptdiptv 12*,,,,iiptviptc iptciptyiptxiptx ,,*,,,,1,, iptciptxiptyipty ,,*,,,,1,, End do NptyNptxtpdtpd ,,sgn*,,sgn3,1,or 3,1, tpdtpd
View Full Document