DOC PREVIEW
Pitt CS 3150 - A Search Memory Substrate

This preview shows page 1-2-3-25-26-27 out of 27 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Slide 1BackgroundNetwork packet processing tasksLookup performance scalabilityIP lookup using a trieIP lookup using TCAMRecap: Why is TCAM inefficient?CA-RAM–a hybrid approachCA-RAM–Content Addressable RAMVery simple, yet efficientPipelined CA-RAM operationDealing w/ bucket overflowsCA-RAM reconfig. opportunitiesAdapting key sizeCA-RAM reconfig. opportunitiesSupporting binary/ternary matchingCA-RAM reconfig. opportunitiesSimult. key matching & data accessCA-RAM reconfig. opportunitiesSupporting range checkingEvaluationMapping a large IP routing tableMapping a large IP routing tableComparing CA-RAM and TCAMConclusionsMapping a large IP routing tableCA-RAM componentsA Search Memory Substrate forHigh Throughput and Low PowerPacket ProcessingSangyeun Cho, Michel Hanna and Rami MelhemDept. of Computer ScienceUniversity of PittsburghBackgroundISPInternetSubnetSubnetISPISPend userend userrouterrouterend userend userend userend userNetwork packet processing tasksPacket forwarding•Given an IP address•Look up in a table (IP table) a matching prefix•Make sure the chosen prefix is longest  LPM (Longest Prefix Matching) requirementRule-based packet filtering•Given a set of packet fields (src/dst IP, src/dst port, protocol, …)•Look up in a rule database matching entriesDeep packet inspection•Given a string in packet payload•Look up in a signature database matching entriesLookup performance scalabilityLookup performance must match increasing line speeds•For OC-768, up to 104M packets must be processed per second•Network traffic has doubled every year [McKeown03]•Router capacity doubles every 18 monthsCapacity pressure•Routing tables (~200K prefixes in a core router) are growing [RIS]•# of firewall rules increases; 100K rules are practical [Baboescu04]•IPv6Power and thermal issues already a critical limiting factor in network processing device design [McKeown03]Two conventional lookup solutions•Software methods (tries, hash table, …)•Hardware methods (TCAM, Bloom filter, …)IP lookup using a trie Consider an IP address: 0 1 0 0 0 1 1 0 “flexibility” high memory capacity requirement low memory bandwidth utilization not SCALABLEIP lookup using TCAM Consider an IP address: 0 1 0 0 0 1 1 0110100*110101*110111*01000*01100*01101*11011*0100*0110*1101*10*0*sort beforestoringchoose the firstamong the matched high bandwidth, constant time lookup TCAMs are relatively small, expen-sive power consumption very high not SCALABLERecap: Why is TCAM inefficient? all bits are involved in matching large embedded match logic “large” means more work in this caseCA-RAM–a hybrid approachCan we do better than the existing schemes?•Flexibility and search performance•Exploit optimized RAM designs•Hardware approach (software too slow)CA-RAM combines hashing w/ hardware parallel matchingCA-RAM design goals•High lookup performance•Low power consumption•Smaller chip area per stored datum•Straightforward system-level integrationCA-RAM–Content Addressable RAMSeparate match logic and memoryÞMatch logic for a single row, not every rowÞAllows the use of dense RAM technologyÞEnables highly reconfigurable match logicÞ(Keep keys sorted in each row, not in entire array)Match logicMemory cellsConventional CAM/TCAM CA-RAMVery simple, yet efficientUse hashing to store keys in a particular rowTo look up, hash the key and retrieve one rowPerform matching on entire row in parallelÞAchieve (full) content addressability w/o paying overhead!Index generatorKeyi1Match processor1……Keyi2Keyj2Keyj1Match processor2…keyPipelined CA-RAM operationIndex generatorSearch keyKeyi1Match processor1Keyi2Keyj2Keyj1Match processor2ResultMatch processor3Keyi3Keyj3Step 1 Step 2 Step 3 Step 4IndexKeyj2Keyj1Keyj3Search key Match processor2Index generationMemory accessKey matchingResult forwardingDealing w/ bucket overflowsCareful design of hash functionIncrease bucket size•Reduce load factor ();  = # of occupied entries / # of total entries•Trade-off space for performanceUse “chaining”; store overflows in subsequent rows•Multiple accesses per lookupUse a small overflow CAM, accessed in parallel•Similar to popular “victim caching” in computer architectureUse two-level hashing and employ multiple CA-RAM banks……CA-RAM reconfig. opportunitiesReconfigurable match logic allows:Adapting key size to apps•Same hardware to support multiple apps or standards……Adapting key sizeKeyi1Reconfigurable match logicKeyi2Keyj2Keyj1Keyi3Keyj3Match informationKeyi1Keyi2Keyj2Keyj1 Adapting key size is straightforward Will benefit supporting multiple apps/ standardsSelect key bitsfor matchingCA-RAM reconfig. opportunitiesReconfigurable match logic allows:Adapting key size to apps•Same hardware to support multiple apps or standardsBinary and ternary matching•Some apps require ternary matching, some don’t……Supporting binary/ternary matchingReconfigurable match logicMatch informationKeyi1Keyi2Keyj2Keyj1Search keyMaskj1Maski1 Developed configurable compara-tor T-matching requires 2 bits / 1 symbol Supporting different types of matching in different bit positions feasibleConsider maskbits or notCA-RAM reconfig. opportunitiesReconfigurable match logic allows:Adapting key size to apps•Same hardware to support multiple apps or standardsBinary and ternary matching•Some apps require ternary matching, some don’tStoring data and keys in a CA-RAM module•Cuts # of memory accesses for IP lookup by half……Simult. key matching & data accessReconfigurable match logicMatch informationKeyi1Keyi2Keyj2Keyj1Search keyDataj1Datai1 Data access follows TCAM lookup CA-RAM supports data embedding Cuts memory traffic & latency by halfMatch information & DataMatch key &bypass dataCA-RAM reconfig. opportunitiesReconfigurable match logic allows:Adapting key size to apps•Same hardware to support multiple apps or standardsBinary and ternary matching•Some apps require ternary matching, some don’tStoring data and keys in a CA-RAM module•Cuts # of memory accesses for IP lookup by halfProviding range checking capabilities•Beneficial for rule-based packet filtering……Supporting range checkingReconfigurable match logicMatch informationKeyi1Rangei1Rangej1Keyj1Search key (Range checking causes troubles) (Entries must be


View Full Document

Pitt CS 3150 - A Search Memory Substrate

Documents in this Course
JouleSort

JouleSort

12 pages

Load more
Download A Search Memory Substrate
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view A Search Memory Substrate and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view A Search Memory Substrate 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?