1February 17, 2009 http://csg.csail.mit.edu/arvind L06-1IP LookupArvind Computer Science & Artificial Intelligence LabMassachusetts Institute of TechnologyFebruary 17, 2009L06-2http://csg.csail.mit.edu/arvindIP Lookup block in a routerQueueManagerPacket ProcessorExit functionsControlProcessorLine Card (LC)IP LookupSRAM(lookup table)ArbitrationSwitchLCLCLCA packet is routed based on the “Longest Prefix Match”(LPM) of it’s IP address with entries in a routing tableLine rate and the order of arrival must be maintainedline rate ⇒15Mpps for 10GE2February 17, 2009L06-3http://csg.csail.mit.edu/arvind1823M RefResultIP addressE5.13.7.2C10.18.200.77.14.7.2F10.18.201.5F7.13.7.3Sparse tree representation3A…A…BC…C…5DF…F…14A…A…7F…F…200F…F…F*E5.*.*.*D10.18.200.5C10.18.200.*B7.14.7.3A7.14.*.*F…F…FF…E5710255014February 17, 2009L06-4http://csg.csail.mit.edu/arvind“C” version of LPMintlpm (IPA ipa) /* 3 memory lookups */{ int p;/* Level 1: 16 bits */p = RAM [ipa[31:16]]; if (isLeaf(p)) return value(p);/* Level 2: 8 bits */p = RAM [ptr(p) + ipa [15:8]]; if (isLeaf(p)) return value(p);/* Level 3: 8 bits */p = RAM [ptr(p) + ipa [7:0]]; return value(p); /* must be a leaf */}Not obvious from the C code how to deal with -memory latency- pipelining…216 -10……28 -10…28 -10Must process a packet every 1/15 μs or 67 nsMust sustain 3 memory dependent lookups in 67 nsMemory latency ~30ns to 40ns3February 17, 2009L06-5http://csg.csail.mit.edu/arvindLongest Prefix Match for IP lookup:3 possible implementation architecturesRigid pipelineInefficient memory usage but simple designLinear pipelineEfficient memory usage through memory port replicatorCircular pipelineEfficient memory with most complex controlDesigner’s Ranking:Which is “best”?Arvind, Nikhil, Rosenband & Dave ICCAD 2004February 17, 2009L06-6http://csg.csail.mit.edu/arvindCircular pipelineThe fifo holds the request while the memory access is in progressThe architecture has been simplified for the sake of the lecture. Otherwise, a “completion buffer” has to be added at the exit to make sure that packets leave in order.enter?enter?done?done?RAMyesinQfifonooutQ4February 17, 2009L06-7http://csg.csail.mit.edu/arvindinterface FIFO#(type t);method Action enq(t x); // enqueue an itemmethod Action deq(); // remove oldest entrymethod t first(); // inspect oldest itemendinterfaceFIFOn = # of bits neededto represent avalue of type tnot fullnot emptynot emptyrdyenabnnrdyenabrdyenqdeqfirstFIFOmoduleFebruary 17, 2009L06-8http://csg.csail.mit.edu/arvindRequest-Response Interface for Synchronous MemorySynch MemLatency NMaking a synchronous component latency-insensitive5February 17, 2009L06-9http://csg.csail.mit.edu/arvindCircular Pipeline Coderule enter (True);IP ip = inQ.first(); ram.req(ip[31:16]);fifo.enq(ip[15:0]); inQ.deq();endruleenter?enter?done?done?RAMinQfiforule recirculate (True);TableEntry p = ram.peek(); ram.deq();IP rip = fifo.first();if (isLeaf(p)) outQ.enq(p);else beginfifo.enq(rip << 8);ram.req(p + rip[15:8]);endfifo.deq();endruleWhen can enter fire?done? Is the same as isLeafFebruary 17, 2009L06-10http://csg.csail.mit.edu/arvindCircular Pipeline Code: discussionrule enter (True);IP ip = inQ.first(); ram.req(ip[31:16]);fifo.enq(ip[15:0]); inQ.deq();endruleenter?enter?done?done?RAMinQfiforule recirculate (True);TableEntry p = ram.peek(); ram.deq();IP rip = fifo.first();if (isLeaf(p)) outQ.enq(p);else beginfifo.enq(rip << 8);ram.req(p + rip[15:8]);endfifo.deq();endruleWhen can recirculatefire?6February 17, 2009L06-11http://csg.csail.mit.edu/arvindmodule mkFIFO1 (FIFO#(t));Reg#(t) data <- mkRegU(); Reg#(Bool) full <- mkReg(False);method Action enq(t x) if (!full);full <= True; data <= x;endmethodmethod Action deq() if (full);full <= False;endmethodmethod t first() if (full);return (data);endmethodmethod Action clear();full <= False;endmethodendmoduleOne Element FIFOnnot emptynot fullrdyenabrdyenabenqdeqFIFOmoduleFebruary 17, 2009L06-12http://csg.csail.mit.edu/arvindDead cyclesrule enter (True);IP ip = inQ.first(); ram.req(ip[31:16]);fifo.enq(ip[15:0]); inQ.deq();endruleenter?enter?done?done?RAMinQfiforule recirculate (True);TableEntry p = ram.peek(); ram.deq();IP rip = fifo.first();if (isLeaf(p)) outQ.enq(p);else beginfifo.enq(rip << 8);ram.req(p + rip[15:8]);endfifo.deq();endruleCan a new request enter the system when an old one is leaving?Is this worth worrying about?assume simultaneous enq& deqis allowed7February 17, 2009L06-13http://csg.csail.mit.edu/arvindThe Effect of Dead Cyclesenterenterdone?done?RAMyesinfifonoWhat is the performance loss if “exit” and “enter” don’t ever happen in the same cycle?Circular Pipeline RAM takes several cycles to respond to a request Each IP request generates 1-3 RAM requests FIFO entries hold base pointer for next lookup and unprocessed part of the IP addressFebruary 17, 2009L06-14http://csg.csail.mit.edu/arvindThe compiler issueCan the compiler detect all the conflicting conditions?Does the compiler detect conflicts that do not exist in reality?8February 17, 2009L06-15http://csg.csail.mit.edu/arvindScheduling conflicting rulesWhen two rules conflict on a shared resource, they cannot both execute in the same clockThe compiler produces logic that ensures that, when both rules are applicable, only one will fire Which one?source annotations(* descending_urgency = “recirculate, enter” *)February 17, 2009L06-16http://csg.csail.mit.edu/arvindSo is there a dead cycle?rule enter (True);IP ip = inQ.first(); ram.req(ip[31:16]);fifo.enq(ip[15:0]); inQ.deq();endruleenter?enter?done?done?RAMinQfiforule recirculate (True);TableEntry p = ram.peek(); ram.deq();IP rip = fifo.first();if (isLeaf(p)) outQ.enq(p);else beginfifo.enq(rip << 8);ram.req(p + rip[15:8]);endfifo.deq();endruleIn general these two rules conflict but when isLeaf(p) is true there is no apparent conflict!9February 17, 2009L06-17http://csg.csail.mit.edu/arvindRule Splitingrule foo (True);if (p) r1 <= 5;else r2 <= 7;endrulerule fooT (p);r1 <= 5;endrulerule fooF (!p);r2 <= 7;endrule≡rule fooT and fooF can be scheduled independently with some other ruleFebruary 17, 2009L06-18http://csg.csail.mit.edu/arvindSpliting the recirculate rulerule recirculate (!isLeaf(ram.peek()));IP rip = fifo.first(); fifo.enq(rip << 8);ram.req(ram.peek() + rip[15:8]);fifo.deq(); ram.deq();endrulerule exit (isLeaf(ram.peek()));outQ.enq(ram.peek()); fifo.deq(); ram.deq();endrulerule
View Full Document