Slide 1IP Lookup block in a routerSparse tree representationTable representation issues“C” version of LPMIP Lookup:Static PipelineStatic (Synchronous) Pipeline MicroarcitectureStatic codeThe next functionThe nextReq functionAnother Static OrganizationCode for Static-2 OrganizationImplementations of Static pipelines Two designers, two resultsSlide 15Circular pipelineCircular Pipeline CodeCompletion bufferSlide 19Longest Prefix Match for IP lookup: 3 possible implementation architecturesSynthesis resultsA problem ...One Element FIFOAnother Problem: Dead cycle eliminationFebruary 21, 2007 http://csg.csail.mit.edu/6.375/ L07-1Bluespec-4:Architectural exploration usingIP lookupArvind Computer Science & Artificial Intelligence LabMassachusetts Institute of TechnologyFebruary 21, 2007L07-2http://csg.csail.mit.edu/6.375/IP Lookup block in a routerQueueManagerPacket ProcessorExit functionsControlProcessorLine Card (LC)IP LookupSRAM(lookup table)ArbitrationSwitchLCLCLCA packet is routed based on the “Longest Prefix Match” (LPM) of it’s IP address with entries in a routing tableLine rate and the order of arrival must be maintainedline rate 15Mpps for 10GEFebruary 21, 2007L07-3http://csg.csail.mit.edu/6.375/1823IP address Result M Ref7.13.7.3 F10.18.201.5 F7.14.7.25.13.7.2 E10.18.200.7 CSparse tree representation3A…A…BC…C…5DF…F…14A…A…7F…F…200F…F…F*E5.*.*.*D10.18.200.5C10.18.200.*B7.14.7.3A7.14.*.*F…F…FF…E57102550144AReal-world lookup algorithms are more complex but all make a sequence of dependent memory references.February 21, 2007L07-4http://csg.csail.mit.edu/6.375/Table representation issuesTable sizeDepends on the number of entries: 10K to 100KToo big to fit on chip memory SRAM DRAM latency, cost, power issuesNumber of memory accesses for an LPM?Too many difficult to do table lookup at line rate (say at 10Gbps)Control-plane issues:incremental table updatesize, speed of table maintenance softwareIn this lecture (to fit the code on slides!):Level 1: 16 bits, Level 2: 8 bits, Level 3: 8 bits from 1 to 3 memory accesses for an LPMFebruary 21, 2007L07-5http://csg.csail.mit.edu/6.375/“C” version of LPMintlpm (IPA ipa) /* 3 memory lookups */{ int p;/* Level 1: 16 bits */ p = RAM [ipa[31:16]]; if (isLeaf(p)) return value(p);/* Level 2: 8 bits */ p = RAM [ptr(p) + ipa [15:8]]; if (isLeaf(p)) return value(p);/* Level 3: 8 bits */ p = RAM [ptr(p) + ipa [7:0]]; return value(p); /* must be a leaf */}Not obvious from the C code how to deal with - memory latency - pipelining…216 -10……28 -10…28 -10Must process a packet every 1/15 s or 67 nsMust sustain 3 memory dependent lookups in 67 nsMemory latency ~30ns to 40nsFebruary 21, 2007 http://csg.csail.mit.edu/6.375/ L07-6IP Lookup:Microarchitecture -1Static PipelineFebruary 21, 2007L07-7http://csg.csail.mit.edu/6.375/Static PipelineAssume the memory has a latency of n (4) cycles and can accept a request every cycleAssume every IP look up takes exactly m (3) memory reads Assuming there is always an input to process: Pipelining to deal with latencyNumber of lookups per packetThe system needs space for at least n packets for full pipeliningInefficient memory usage – unused memory slots represent wasted bandwidthDifficult to schedule table updatesFebruary 21, 2007L07-8http://csg.csail.mit.edu/6.375/Static (Synchronous) Pipeline MicroarcitectureProvide n (> latency) registers; mark all of them as EmptyLet a new message enter the system when the last register is empty or an old request leavesEach Register r hold either the result value or the remainder of the IP address. r5 also has to hold the next address for the memorytypedef union tagged { Value Result; struct{Bit#(16) remainingIP; Bit#(19) ptr;} IPptr} regDataThe state c of each register istypedef enum { Empty , Level1 , Level2 , Level3} State;RAMreqIP addrrespRAM latency=4ri, ciFebruary 21, 2007L07-9http://csg.csail.mit.edu/6.375/Static coderule static (True); if (next(c5) == Empty) if (inQ.notEmpty) begin IP ip = inQ.first(); inQ.deq(); ram.req(ext(ip[31:16])); r1 <= IPptr{ip[15:0],?}; c1 <= Level1; end else c1 <= Empty; else begin r1 <= r5; c1 <= next(c5); if(!isResult(r5)) ram.req(ptr(r5));end r2 <= r1; c2 <= c1; r3 <= r2; c3 <= c2; r4 <= r3; c4 <= c3; TableEntry p; if((c4 != Empty)&& !isResult(r4)) p <- ram.resp(); r5 <= nextReq(p, r4); c5 <= c4; if (c5 == Level3) outQ.enq(result(r5)); endruleRAMreqIP addrrespRAM latency=4r1, c1inQoutQr2, c2r3, c3r4, c4r5, c5action value methodFebruary 21, 2007L07-10http://csg.csail.mit.edu/6.375/The next functionfunction State next (State c); case (c) Empty : return(Empty); Level1 : return(Level2); Level2 : return(Level3); Level3 : return(Empty); endcaseendfunctionFebruary 21, 2007L07-11http://csg.csail.mit.edu/6.375/The nextReq functionfunction RegData nextReq(TableEntry p, RegData r); case (r) matches tagged Result .*: return r; tagged IPptr .ip: if (isLeaf(p)) return tagged Result value(p), else return tagged IPptr{remainingIP: ip.remainingIP << 8, ptr: ptr(p) + ip.remainingIP[15:8] };endcaseendfunctionFebruary 21, 2007L07-12http://csg.csail.mit.edu/6.375/Another Static OrganizationRAMFSMMUX / De-MUXFSM FSM FSMCounterMUX / De-MUXresultIP addrEach packet is processed by its own FSM Counter determines which FSM gets to goFebruary 21, 2007L07-13http://csg.csail.mit.edu/6.375/Code for Static-2 Organizationfunction Action doFSM(r,c); action if (c == Empty) else if (c == Level1 || c == Level2) begin else if (c == Level3) beginendaction endfunctionrule static2(True); cnt <= cnt + 1; for (Integer i=0; i<maxLat; i=i+1) if(fromInteger(i)==cnt) doFSM(r[cnt],c[cnt]); endruleRAMFSMMUX / De-MUXFSM FSM FSMMUX / De-MUXresultIP addrCounterif (inQ.notEmpty) begin IP ip = in.first(); inQ.deq(); ram.req(ext(ip[31:16])); c <= Level1; r <= IPptr{ip[15:0],?}; end else c <= Empty;if (!isResult(r)) p <- ram.resp(); RegData nextr = nextReq(p, r); if (!isResult(nextr)) ram.req(ptr(nextr)); c <= next(c); r <= nextr; endif (!isResult(r)) p <- ram.resp(); RegData nextr = nextReq(p, r); outQ.enq(result(nextr)); if (inQ.notEmpty) begin IP ip = in.first(); inQ.deq(); ram.req(ext(ip[31:16])); c <= Level1; r <= IPptr{ip[15:0],?}; end else c <= Empty;February 21,
View Full Document