Purdue CS 63600 - Protocol Processing


CS 636 Internetworking
Ramana Kompella

END NODE ALGORITHMICS
Lecture 13: Protocol Processing

Outline
• Buffer management
• CRCs and checksums
• High speed protocol processing
• Packet reassembly

TCP protocol processing
• TCP input processing used to take too long (1100 lines of code)
  ◦ TCP has many flags and parameters
  ◦ Too slow for high performance transport
• Several high performance transport protocols
  ◦ HTPNET (High performance transport protocol)
  ◦ XTP (Xpress transport protocol)
  ◦ TP++
  ◦ RTP (Rapid transport protocol)

HTPNET [Chan & Gorton]
• Designed around a highly parallel architecture
• Several concurrent finite state machines, in contrast to one FSM in TCP/TP4
• Four main functions
  ◦ Tx/Rx data/control packet processor
• Out of band control processing
• Selective retransmit instead of go-back-N

Performance comparison
[Figure: HTPNET performance comparison (Chan & Gorton 94)]

Xpress Transport Protocol (XTP)
• XTP is designed with a hardware protocol engine in mind
• Represents a compromise between
  ◦ Bandwidth
  ◦ Latency
  ◦ Protocol complexity
  ◦ VLSI constraints

XTP Features
• All numbers are 32 bits (optionally 64 bits)
• Compromise between simplicity and features
• Lightweight core based on header address + sequence number
• Bit flags, fields, and modes managed by the protocol above XTP

XTP Frames
• Each XTP packet contains
  ◦ Fixed-size header
  ◦ Fixed-size trailer
• Two types: information and control segments
• Inbound packet processing
  ◦ Identify the packet
  ◦ Associate with local object
  ◦ Reference state
  ◦ Steer data to a destination

XTP
• Header
  ◦ Cmd: indicates packet type, version, etc.
  ◦ Key: address
  ◦ Route: destination information
  ◦ Seq: byte position in the stream
• Trailer
  ◦ Two checksum fields
  ◦ Flags
• Only the first packet has all the information
  ◦ Subsequent packets carry a key that is used for lookup

XTP state machine
• Contexts (akin to connection blocks)
  ◦ Initially in quiescent state
  ◦ First packet contains all information (moves to active)
  ◦ Subsequent packets don’t contain all the connection information
  ◦ Only a unique key that allows demultiplexing to the appropriate context
  ◦ After data transfer is finished, returns to quiescent

Approach 2
• TCP already has so much penetration
• Don’t want to invent a new protocol
• How to speed up the TCP implementation?

A look at TCP processing
• First operation: look up the protocol control block (PCB)
  ◦ Contains state (rx/tx sequence numbers)
  ◦ If there are few active connections, PCBs can be cached
• One optimization
  ◦ Exploit temporal locality
  ◦ Expect the next packet to match the same PCB as the current packet (so cache it)

TCP header processing
• Low information fields
  ◦ Source/dest ports fixed
  ◦ Sequence number likely next in sequence
  ◦ Control bits off (except ACK)
  ◦ Window doesn’t change, urgent pointer irrelevant
• Ack number and checksum vary the most
[Figure: TCP header layout (source port, destination port, sequence number, ack number, offset, control bits, window, checksum, urgent pointer) with the high-information fields marked]

TCP header prediction
• The expected case: receive an ACK or data segment in order

Machine word optimization
• Processing individual bits is considerably slower than operating on machine words
  ◦ Checking each flag (SYN/FIN/RESET, etc.) individually is too taxing
  ◦ Pre-compute the expected flags + window and make the check in one cycle
• With these optimizations, receiver processing can be performed in 30 SPARC instructions

TCP transmit side optimizations
• Build a new header by copying a template and overwriting the fields that change
• Used in Linux today
• Server environments:
  ◦ Packets experience less temporal locality
  ◦ Use hashing to quickly obtain PCBs
  ◦ Shown to be an order of magnitude more efficient in OLTP environments

UDP processing
• UDP is stateless; header prediction is irrelevant
• Two time-consuming tasks
  ◦ PCB lookups and checksums
• Caching of PCBs is not so straightforward
  ◦ PCBs contain wildcards
  ◦ E.g., local port L, local port L & IP address X
• Only entries that do not have shadows can be cached (87% hit rate)

Outline
• Buffer management
• CRCs and checksums
• TCP header prediction
• Packet reassembly

Fragments
• Different links have different packet sizes
• Routers slice up individual IP packets into fragments
• Fragments are identified by IP id, start offset, and fragment length
• The last fragment sets a bit to indicate that it is the last
• Intermediate routers can fragment a fragment further

Receiver side
• First fragment sets up state indexed by IP id
• Subsequent fragments are looked up in a list of IP ids
• After all fragments are received, the packet is reassembled
• If not all fragments are received within a timeout, the state is flushed

Problems with fragments
• Expensive for a router to create new headers and do other processing
• Reassembly at receivers is expensive
  ◦ Determining when a complete packet has been received requires sorting the fragments
• Loss of one fragment can lead to loss of the entire packet
• Today, path MTU is used
  ◦ Insufficient, since UDP does not use path MTU

Slow reassembly
• Linked list of fragments indexed by packet ID and sorted by fragment start offset
• Insert requires traversing the list
• Why not use a byte array instead of the list?

Fast reassembly
• AAL-5 does reassembly at gigabit speeds because ATM guarantees cell order
• Predict which fragment is going to show up
• If everything arrives in order, things work fine
• Can also cache the packet id to avoid list traversal
• Avoid an extra copy by storing data at its byte offset
• [CV98a] show an implementation in 38 instructions

Counterexample
• Linux sends fragments in reverse order
• The idea is to allow the receiver to create a buffer of the appropriate size
• Causes a problem since the prediction fails
• Trick: use the first fragment to determine whether to store in reverse order or correct order!
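The fast reassembly idea above (store each fragment directly at its byte offset and predict that the next fragment starts where the last one ended) can be sketched as follows. This is a minimal illustration, not the [CV98a] implementation: the class name `Reassembler`, its fields, and the 65535-byte default are assumptions made for the example, offsets are plain byte offsets rather than IP's 8-byte units, and only forward-order prediction is shown (the reverse-order trick from the counterexample slide is not implemented).

```python
class Reassembler:
    """Hypothetical per-packet reassembly buffer (illustration only)."""

    def __init__(self, max_size=65535):
        self.buf = bytearray(max_size)  # fragment data lives at its byte offset
        self.expected = 0               # predicted start offset of the next fragment
        self.total_len = None           # known once the last fragment has arrived
        self.pending = []               # (start, end) spans that arrived out of order
        self.fast_hits = 0              # how often the prediction was right

    def add(self, offset, data, last=False):
        """Store one fragment; return the full packet once it is complete."""
        end = offset + len(data)
        if end > len(self.buf):
            raise ValueError("fragment exceeds reassembly buffer")
        self.buf[offset:end] = data     # single copy into the final position
        if last:
            self.total_len = end
        if offset == self.expected:
            # Fast path: the fragment is exactly the one we predicted.
            self.expected = end
            self.fast_hits += 1
        else:
            # Slow path: remember the span so it can be merged later.
            self.pending.append((offset, end))
        if self.pending:
            self._absorb_pending()
        if self.total_len is not None and self.expected >= self.total_len:
            return bytes(self.buf[:self.total_len])
        return None

    def _absorb_pending(self):
        """Advance `expected` over any stored spans that are now contiguous."""
        self.pending.sort()
        while self.pending and self.pending[0][0] <= self.expected:
            _, end = self.pending.pop(0)
            self.expected = max(self.expected, end)
```

When fragments arrive in order, every call takes the fast path and no list is touched. When they arrive in reverse order (as in the Linux counterexample), the data is still copied only once, but every fragment except the one at offset 0 falls into the slow path, which is why the slides suggest choosing the storage order based on the first fragment seen.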

