UA ECE 274A - RTL Design



Slide 1: ECE 274 - Digital Logic, Lecture 16
Lecture 16 – RTL Design
- RTL Examples
- RTL Design Pitfalls and Good Practices
- Control and Data Dominated RTL Design

Slide 2: RTL Example: Video Compression – Sum of Absolute Differences
- Video is a series of frames (e.g., 30 per second)
- Most frames similar to previous frame
- Compression idea: just send difference from previous frame
[Figure: (a) digitized frame 1 and digitized frame 2, 1 Mbyte each; (b) only difference is the ball moving, so just send the difference of frame 2 from frame 1 (0.01 Mbyte) instead of the full 1 Mbyte frame 2]

Slide 3: RTL Example: Video Compression – Sum of Absolute Differences
- Need to quickly determine whether two frames are similar enough to just send difference for second frame
- Compare corresponding 16x16 "blocks"
  - Treat 16x16 block as 256-byte array
  - Compute the absolute value of the difference of each array item
  - Sum those differences: if above a threshold, send complete frame for second frame; if below, can use difference method (using another technique, not described)
- Each element is a pixel, assumed to be represented as 1 byte (actually, a color picture might have 3 bytes per pixel, for intensity of red, green, and blue components of pixel)
[Figure: corresponding blocks of frame 1 and frame 2 being compared]

Slide 4: RTL Example: Video Compression – Sum of Absolute Differences
- Want fast sum-of-absolute-differences (SAD) component
- When go=1, sums the differences of element pairs in arrays A and B, outputs that sum
[Figure: SAD block with inputs A (256-byte array), B (256-byte array), and go; output sad (integer)]

Slide 5: RTL Example: Video Compression – Sum of Absolute Differences
- S0: wait for go
- S1: initialize sum and index
- S2: check if done (i>=256)
- S3: add difference to sum, increment index
- S4: done, write to output sad_reg
Inputs: A, B (256 byte memory); go (bit)
Outputs: sad (32 bits)
Local registers: sum, sad_reg (32 bits); i (9 bits)
[Figure: high-level state machine. S0 stays on !go and advances to S1 on go. S1: sum = 0, i = 0. S2: go to S3 when i<256, go to S4 when (i<256)'. S3: sum = sum + abs(A[i] - B[i]), i = i + 1. S4: sad_reg = sum]

Slide 6: RTL Example: Video Compression – Sum of Absolute Differences
- Step 2: Create datapath
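As a behavioral cross-check before building the datapath, the state machine S0 to S4 described above can be sketched in Python. This is only a minimal software model, not the circuit itself: the function name `sad_hlsm` is hypothetical, the model assumes go=1 has already been seen in S0, and it assumes (as the slide's two-states-per-element count suggests) that S3 returns to S2.

```python
def sad_hlsm(a, b):
    """Behavioral model of the SAD high-level state machine (states S1..S4).

    Assumptions (not stated verbatim in the slides): go=1 was already seen
    in S0, and S3 always returns to S2.
    """
    assert len(a) == 256 and len(b) == 256, "A and B are 256-byte arrays"
    state = "S1"
    total = 0       # models the 32-bit register sum
    i = 0           # models the 9-bit index register i
    sad_reg = None  # models the output register sad_reg
    while state != "done":
        if state == "S1":            # initialize sum and index
            total, i = 0, 0
            state = "S2"
        elif state == "S2":          # check if done
            state = "S3" if i < 256 else "S4"
        elif state == "S3":          # add difference to sum, increment index
            total += abs(a[i] - b[i])
            i += 1
            state = "S2"
        elif state == "S4":          # done, write to output register
            sad_reg = total
            state = "done"
    return sad_reg


if __name__ == "__main__":
    frame1_block = list(range(256))
    frame2_block = [0] * 256
    print(sad_hlsm(frame1_block, frame2_block))
```

The result can be compared against the one-line reference `sum(abs(x - y) for x, y in zip(a, b))` to confirm that the state sequencing computes the intended sum.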
[Figure: datapath with registers i (9 bits), sum (32 bits), and sad_reg (32 bits); a "<256" comparator producing i_lt_256; a subtractor, an abs unit, and an adder; control inputs i_inc, i_clr, sum_ld, sum_clr, sad_reg_ld; memory signals AB_addr, A_data, B_data; output sad. The high-level state machine, with its inputs, outputs, and local registers, is shown alongside.]

Slide 7: RTL Example: Video Compression – Sum of Absolute Differences
- Step 3: Connect to controller
- Step 4: Replace high-level state machine by FSM

  State  High-level action                        FSM outputs / condition
  S0     wait for go                              stay on go', advance on go
  S1     sum = 0; i = 0                           sum_clr=1; i_clr=1
  S2     test i<256                               branch on i_lt_256
  S3     sum = sum + abs(A[i] - B[i]); i = i + 1  sum_ld=1; AB_rd=1; i_inc=1
  S4     sad_reg = sum                            sad_reg_ld=1

[Figure: controller connected to the datapath through i_lt_256, i_inc, i_clr, sum_ld, sum_clr, sad_reg_ld]

Slide 8: RTL Example: Video Compression – Sum of Absolute Differences
- Comparing software and custom circuit SAD
  - Circuit: two states (S2 & S3) for each i, 256 values of i → 512 clock cycles
  - Software: loop (for i = 1 to 256), but for each i, must move memory to local registers, subtract, compute absolute value, add to sum, increment i; say about 6 cycles per array item → 256*6 = 1536 cycles
  - Circuit is about 3 times as fast (1536/512 = 3)
  - Later, we'll see how to build a SAD circuit that is even faster

Slide 9: RTL Design Pitfalls and Good Practice
- Common pitfall: assuming a register is updated in the state in which it's written
- Final value of Q? Final state?
- Answers may surprise you
  - Value of Q unknown
  - Final state is C, not D
- Why?
  - State A: R=99 and Q=R happen simultaneously, so Q receives the old (unknown) value of R
  - State B: R not updated with R+1 until next clock cycle, simultaneously with state register being updated, so the condition R<100 still sees 99
[Figure: (a) state machine: A (R = 99, Q = R), then B (R = R + 1), then C if R<100 or D if R>=100; (b) timing diagram of clk, R, and Q. Local registers: R, Q (8 bits)]

Slide 10: RTL Design Pitfalls and Good Practice
- Solutions
  - Read register in following state (Q=R)
  - Insert extra state so that conditions use updated value
  - Other solutions are possible, depending on the example
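The register-update pitfall and its fix can be checked with a small simulation in which every register write takes effect only at the end of the state, together with the state change. This is a sketch under stated assumptions: the names `simulate`, `PITFALL`, and `FIXED` are hypothetical, `None` stands for an unknown power-up value, and the placement of Q=R in state B with an empty extra state B2 is one reading of the solution figure, not a confirmed transcription.

```python
def simulate(fsm, start="A", final=("C", "D")):
    """Run an FSM whose register writes commit at the clock edge.

    Each state maps the CURRENT register values to (updates, next_state),
    so reads and branch conditions always see the old values.
    """
    regs = {"R": None, "Q": None}  # None models an unknown initial value
    state = start
    while state not in final:
        updates, next_state = fsm[state](regs)
        regs = {**regs, **updates}  # all writes land simultaneously
        state = next_state
    return state, regs


# Pitfall version: Q=R is written in the same state as R=99, and the
# branch out of B tests R in the same state that writes R=R+1.
PITFALL = {
    "A": lambda r: ({"R": 99, "Q": r["R"]}, "B"),
    "B": lambda r: ({"R": r["R"] + 1}, "C" if r["R"] < 100 else "D"),
}

# Assumed fix: read R one state later (Q=R in B) and add an extra
# state B2 so the branch condition uses the updated R.
FIXED = {
    "A": lambda r: ({"R": 99}, "B"),
    "B": lambda r: ({"R": r["R"] + 1, "Q": r["R"]}, "B2"),
    "B2": lambda r: ({}, "C" if r["R"] < 100 else "D"),
}

if __name__ == "__main__":
    print(simulate(PITFALL))  # ends in C with Q unknown (None)
    print(simulate(FIXED))    # ends in D with Q = 99
```

In the pitfall version the machine ends in C with Q unknown, matching the slide; in the assumed fix it ends in D because the condition is evaluated one state after R is incremented.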
[Figure: (a) revised state machine with extra state B2; (b) timing diagram of clk, R, and Q. Local registers: R, Q (8 bits)]

Slide 11: RTL Design Pitfalls and Good Practice
- Common pitfall: reading outputs
  - Outputs can only be written
  - Solution: introduce additional register, which can be written and read
[Figure: (a) Inputs: A, B (8 bits); Outputs: P (8 bits); states S and T with actions P = A and P = P + B. (b) Inputs: A, B (8 bits); Outputs: P (8 bits); Local register: R (8 bits); states S and T with actions R = A, P = A, and P = R + B]

Slide 12: RTL Design Pitfalls and Good Practice
- Good practice: register all data outputs
  - In fig (a), output P would show spurious values as addition computes
    - Furthermore, longest register-to-register path, which determines clock period, is not known until that output is connected to another component
  - In fig (b), spurious outputs reduced, and longest register-to-register path is clear
[Figure: (a) adder with inputs R and B driving output P directly; (b) adder output stored in register Preg, which drives P]

Slide 13: Control vs. Data Dominated RTL Design
- Designs often categorized as control-dominated or data-dominated
  - Control-dominated design: controller contains most of the complexity
  - Data-dominated design: datapath contains most of the complexity
  - General, descriptive terms; no hard rule separates the two types of designs
- Laser-based distance measurer: control dominated
- Bus interface, SAD circuit: mix of control and data
- Now let's do a data dominated design

Slide 14: Data Dominated RTL Design Example: FIR Filter
- Filter concept
  - Suppose X is data from a temperature sensor, and particular input sequence is 180, 180, 181, 240, 180, 181 (one per clock cycle)
  - That 240 is probably wrong! Could be electrical noise
  - Filter should remove such noise in its output Y
  - Simple filter: output average of last N values
    - Small N: less filtering
    - Large N: more filtering, but less sharp output
[Figure: digital filter block with 12-bit input X, 12-bit output Y, and clk]

Slide 15: Data Dominated RTL Design Example: FIR Filter
- FIR filter
  - "Finite Impulse Response"
  - Simply a configurable weighted sum of past input values
      y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2)
  - Above known as "3 tap"
  - Tens of taps more common
  - Very general filter: user sets the constants (c0, c1, c2) to define specific filter
- RTL design
  - Step 1: Create high-level state machine
    - But there really is none! Data dominated indeed.
    - Go straight to step 2

Slide 16: Data Dominated RTL Design Example: FIR Filter
- Step 2: Create datapath
  - Begin by creating

