Purdue CS 63600 - Implementation Models - D677447

Home> Schools> Purdue University> Computer Sciences (CS) > CS 63600> Implementation Models

Purdue CS 63600 - Implementation Models

Pages 36

Download Save

Unformatted text preview:

Ramana Kompella Lecture 2: Implementation Models CS 636 Internetworking Spring 2009 1CS 636 Internetworking Spring 2009  Before we can play with improvements, we must know “the rules of the game”.  Algorithmics uses four separate areas: protocols, architecture, OS and algorithms  Question: How can a logic designer understand protocols and how can an algorithm designer understand hardware ◦ Use simple models that have explanatory and predictive power ◦ Leave details to the specialist 2CS 636 Internetworking Spring 2009  Networking (protocols)  Hardware  Architecture  Operating systems 3CS 636 Internetworking Spring 2009  Interfaces ◦ Used by applications or higher level protocols  Packet (message) format ◦ Examples: IP,TCP, UDP  Typical protocol operations ◦ Data manipulation ◦ Demultiplexing ◦ Looking up state ◦ Set timers and receive alarms 4CS 636 Internetworking Spring 2009  Throughput: number of jobs per second ◦ Focus of industry, ISPs wish to maximize  Latency: time to complete a job ◦ E.g. time for a packet to go from one end to another (RTT) ◦ Ideal interactive time 100ms 5CS 636 Internetworking Spring 2009  Backbone link speeds: ◦ Optical carrier (OC)-n  n x 51.8Mbits ◦ OC48 ~ 2.5 Gbps, OC192~10Gbps  TCP and web dominance : Web (70%) and TCP (90%)  Small transfers: 50% of accessed files < 50Kbytes  Latencies large, e.g. 100 ms for wide area  ~ half of packets 40 bytes, many 1500 bytes ◦ Takes 8 ns to send 40 bytes at 40 Gbps 6CS 636 Internetworking Spring 2009  Networking (protocols)  Hardware  Architecture  Operating systems 7CS 636 Internetworking Spring 2009  Combinatorial logic ◦ From transistors to gates ◦ Timing and power ◦ High level building blocks (standard cells)  Memories ◦ Registers, SRAM, DRAM ◦ Fancy memory tricks (page mode, interleaved) 8CS 636 Internetworking Spring 2009  A function from digital inputs to outputs ◦ Wires, transistors, resistors, capacitors ◦ Gates: AND, NAND, NOT, etc. (60 ps) ◦ More complex blocks: adders, multiplexers  Different designs for same boolean function offer different tradeoffs ◦ Time for the output to stabilize ◦ Number of transistors ◦ Power dissipation P=CV2F 9 Input I and outputs O are N-bit vectors such that ◦ O[j] = 1 iff I[j] = 1 and I[k] = 0, for all k<j ◦ Find first one, represent in one-hot notation  Design1: Implement using N AND gates where each gate takes all 1 to N inputs. ◦ Space - O(N2) since each AND gate needs O(N) transistors ◦ Time - O(1) CS 636 Internetworking Spring 2009 10 Design 2: Every output bit requires the AND of complment of first j-1 bits ◦ Construct recursively P[j] = P[j-1]. ~I[j] ◦ Using two N two input AND gates in series ◦ O(N) transistors but takes O(N)  Design 3: Tradeoff design. Use balanced binary tree to compute P[j] and combine partial results. CS 636 Internetworking Spring 2009 11 Build four input mux from 2 input muxes. CS 636 Internetworking Spring 2009 12 PPE: like a priority encoder but with rotating priority ◦ Find first 1 beyond P  Arises in switch arbitration  Design 1: ◦ Use a barrel shifter to shift I to the left by P bits. ◦ Apply PE ◦ Shift back by P bits.  Two shifts + PE ~ 3 log N gate delays. ◦ Each barrel shift using tree of 2-muxes ~ log (N) CS 636 Internetworking Spring 2009 13 Smarter scheme due to Gupta/McKeown  First check if any bit set at or after P  If yes, apply PE on bits after P  If no, apply PE on bits before P  Used in Tiny Tera and Abrizio  Tested using TI libraries ◦ 2x fast and 3x less area ◦ For a 32 port router. CS 636 Internetworking Spring 2009 14 Router forwarding done using logic but packets stored in memories  Memory are significant bottlenecks ◦ Slower than logic delays  Different subsystems require different types ◦ 200ms worth of packet buffering (DRAM) ◦ <1Mbyte of fast SRAM for forwarding tables  Need different models CS 636 Internetworking Spring 2009 15 Need to store bit such that it stays indefinitely (absence of writes/power failures)  Register is ordered collection of flip-flops  Access from logic to a register ~ 0.5-1ns CS 636 Internetworking Spring 2009 16 N registers addressed by log N bits Addresses  Self refreshing.  Needs a decoder to translate A to unary so add decode delay.  On chip SRAM 1-2ns,  Off-chip 5-10ns. CS 636 Internetworking Spring 2009 17 SRAM flip-flop needs 5 transistors.  DRAM needs one transistor plus capacitor (so more dense)  Leakage fixed by refresh.  Slower because output not driven by the power supply as in SRAM  40-60ns CS 636 Internetworking Spring 2009 18CS 636 Internetworking Spring 2009  Page mode DRAM (latency 40-60ns)  Direct decoding hard: use divide and conquer ◦ Decode in stages, row first and column next ◦ Reduces decoder from O(N) to O(√N) gates  Accesses within rows do not require Row address strobe. 19CS 636 Internetworking Spring 2009  Increase DRAM throughput (same latency)  While Bank 1 works, send addresses to Bank 2, etc.  SDRAM uses 2 banks, RDRAM uses 16 banks.  If consecutive accesses to different banks, bandwidth increased by a factor of B 20CS 636 Internetworking Spring 2009  Example: Router with five 10GBps links  Overall buffering – 200ms * 50Gbps  Want to use DRAM ◦ Bw required 2 * 50 = 100 Gbits/sec + overhead ~ 200 Gbps  Use single RDRAM with 16 banks ◦ Peak bw of 1.6 Gbytes or 13Gbps  Accessing each RDRAM requires 64 pins for data, 25 for address/control ~90 ◦ 200 Gbps memory requires 16 RDRAMs ~ 1440 pins  Chips have pin limtiations (typically <1000pins)  Thus requires atleast one more chip. 21 Box architect partitions functions between chips  Design teams write specifications, logic designers write RTL using verilog  Synthesize gates to generate circuits.  Timing checks to change design.  Finally chip tapes out, manufactured, yield inspected  Easy to add

View Full Document


School:
Email:
New Password:
Confirm Password:

Purdue CS 63600 - Implementation Models

Sign up for free to view:

Please select your school