Placement of Function in a Best Effort World

Course Logistics Update
• 45 total in class. Still off target.
• Prioritized waitlist now available. See me after class.

Internet Architecture Redux
• The network is stateless (w.r.t. e2e connection state), for survivability
• The network is unreliable
  – No guarantees; 1%ish drop rate OK; may burst to higher or have temp. outages
  – End-hosts must do reliability/retransmission
• The network provides no QoS
  – End-hosts must do congestion control

Things the Transport Faces
• Loss
  – Congestion, corruption, routing probs, failures
• Congestion? Yup: Stat mux! Finite buffers for store-and-forward. Bursts or excess traffic.
  – Queue size is tricky: Too large == too much delay, too small == bad statmux. More later
• Variable delay
• Reordering (how can this happen?)
  – Bugs, multipath
• Duplication
  – Bugs, lower-layer spurious retransmissions

So: TCP?
• A reliable, congestion-controlled, in-order bytestream
  – As mentioned last time: Not a perfect fit for everybody. Drawbacks?
    • Unreliable apps don’t need it
    • May delay app processing, causing CPU/memory/disk to be more bursty than necessary
  – Known problem. Nice paper @ SIGCOMM 2006 on DCCP, datagram congestion control: Unreliable, congestion controlled datagrams
  – Also mentioned ITP, image transport protocol: reliable *out of order*
  – Relation to end-to-end arguments?

Reliable Transfers
• Forward Error Correction: (redundancy in-band)
• Automatic Repeat reQuest (ARQ): retransmissions. How does it know?
  – Acknowledgements!
  – In TCP: cumulative ACKs
• How do you detect a loss with ACKs?
  – Timeout
  – Or (diagram) notice that you’re getting dup ACKs

TCP Gets Older
• V1: “go-back-N” retransmissions based on timeouts with a fixed-size sliding window
• V2: Congestion collapse!
  – Van Jacobson and Mike Karels: congestion avoidance (congestion control), better RTT estimators (why? To set the timeout), and slow start. (Coming up later)
• Karn & Partridge: Retransmission Ambiguity
  – How do you tell if an ACK is for the original or the retransmitted packet?
    If you can’t tell, your RTT estimator can break
  – Next step: TCP timestamp options

TCP Variants
• TCP Tahoe: Fast retransmission + Jacobson/Karels (slow start/cong avoid)
  – First data-driven, not timeout-driven, retransmit
• TCP Reno: Fast recovery: The now-classic sawtooth
• TCP NewReno: improves fast recovery to survive multiple data losses in large windows
• TCP SACK: Pretty close to the state of the art. Instead of just ACKing last received seq, tell sender about other recv’d packets too (easier to recover from weird loss patterns)
• Most machines today are NewReno / SACK (Sally Floyd 2005 study)

TCP Principles
• Don’t retransmit early; avoid spurious retransmissions
  – A response to congestion collapse. Part of cong. collapse was retransmission of packets that were queued, not lost.
• Conserve packets
  – Don’t blast network with extra traffic during retransmissions.
  – General technique: Count # dup ACKs to know how many packets have left the network.

Timers
• How do you know when packets were lost? If no other ACKs, must timeout.
• How long to wait? Well, depends on the RTT.
  – Easy to measure: segment -> ACK
  – Problems: Variable delay, variable processing times at receiver
  – Solution: Averaging

EWMA
• Exponential Weighted Moving Averages
  – A low-pass filter by another name
  – srtt = alpha * r + (1 - alpha) * srtt
    • r is the current sample
    • srtt is the smoothed average
    • TCP uses alpha = 1/8. Doesn’t matter too much.
  – Great! We’re done.
  – Or not.
• EWMA is great to put in your toolbox

Variance
• If we use srtt as the timeout, then about half of the RTT samples exceed it, so about half of our transmissions trigger spurious retransmissions.
• TCP early solution: beta * srtt (beta =~ 2)
  – But what if the path has high variance?
• Real solution:
  – RTO = srtt + 4 * rttdev
  – rttdev = mean linear deviation (an easy-to-compute approximation of stddev)
    • rttdev = gamma * dev + (1 - gamma) * rttdev
    • dev = |r - srtt|. TCP uses gamma = 1/4
• Final note: What to do on timeout?
  – Exponential backoff (another good toolkit thing)

Using more information: Fast Retransmission
• Data-driven retransmission
• What happens on loss?
    Duplicate ACKs
  – Imagine you sent (with 1701:2500 lost in between):
  – 1:1000, 1001:1700, 2501:3000, 3001:4000, 4001:4576
  – And got ACKs 1001 1701 1701 1701 1701
  – Clear that something’s going wrong!

Whither dup ACKs
• Why dup ACKs?
  – Window updates: TCP receivers have limited space in socket buffer, so they can tell the source to “shut up” (flow control)
  – Segment loss
  – Or … segment re-ordering
• TCP says three dup ACKs == not re-ordered
• Works pretty well, but now forces network design to abide by it because it works pretty well!
  – Example: per-packet load balancing, which reorders packets

Congestion Control
• Okay. Great. We know we had a loss because of a timeout or dup ACKs. What do we do?
  – 1: Retransmit
  – 2: Adjust our congestion window
• TCP isn’t just doing reliability…

The basics
• Slow start: Ramp up
• Congestion avoidance: Be conservative when you’re near the limit

Fast Recovery
• There’s still more information floating in the net.
• If you did fast retransmit, you got dup ACKs, and will probably keep getting more.
• Retransmit lost packet. Cut cong window in half. Wait until half of the window has been ACKed, and then send _new_ data.
• Basic idea in TCP Reno.
• TCP NewReno adds more tricks to deal with multiple losses in a window (which kill Reno).

SACK
• You can get still more information!
• 1:1000 1001:1700 2501:3000 3001:4000 4577:5062 6000:7019
• Send ACKs
• 1001 1701 1701 [2501-3000] 1701 [2501-4000] 1701 [4577-5062; 2501-4000] etc.
• Aimed at Long Fat Networks (LFNs; pronounced “Elephants”). Standardized in RFC 2018 after years of debate.
• SACK isn’t perfect! If TCP window is very small, not enough Rx to deal with it.

Other tricks in TCP
• Three way handshake: Establishes the sequence # space with both sender and receiver
• Segment size: How big can the network support?
  – Path MTU discovery
  – Set IP “Don’t Fragment” bit.
    • Routers with smaller MTU send back ICMP error message containing their MTU
• Low-bandwidth links: TCP Header Compression.
  – Most fields in TCP header stay the same.
    Can be compressed on a link-by-link basis.
    • 40 byte TCP+IP header in ~3-6 bytes.

ALF
• Example: Streaming video protocol
• Consider MPEG:
  – Reference frames
  – Difference frames (for simplicity, vs previous frame)
• Problem: Propagation of errors

Video over TCP?
• Completely reliable delivery.
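
Going back to the EWMA and Variance slides, the timeout computation can be sketched in a few lines of Python. This is only an illustration of the slide formulas, not any real TCP stack: the class name `RttEstimator`, the seeding of srtt and rttdev from the first sample, the 60-second cap, and the choice to compute dev against the old srtt are all assumptions made here.

```python
class RttEstimator:
    """Sketch of srtt / rttdev / RTO tracking using the slide formulas."""

    ALPHA = 1 / 8  # gain for srtt (value given on the EWMA slide)
    GAMMA = 1 / 4  # gain for rttdev (value given on the Variance slide)

    def __init__(self, first_sample, max_rto=60.0):
        # Assumption: seed srtt with the first sample and rttdev with half of it.
        self.srtt = first_sample
        self.rttdev = first_sample / 2
        self.backoff = 1        # multiplier doubled on every timeout
        self.max_rto = max_rto  # assumed upper bound, not from the slides

    def update(self, r):
        """Fold in a new RTT sample r (seconds) from an unambiguous ACK."""
        dev = abs(r - self.srtt)  # assumption: deviation against the old srtt
        self.rttdev = self.GAMMA * dev + (1 - self.GAMMA) * self.rttdev
        self.srtt = self.ALPHA * r + (1 - self.ALPHA) * self.srtt
        self.backoff = 1          # a fresh sample ends the backoff

    def on_timeout(self):
        """Exponential backoff: double the effective timeout."""
        self.backoff *= 2

    @property
    def rto(self):
        return min(self.max_rto, (self.srtt + 4 * self.rttdev) * self.backoff)


est = RttEstimator(0.100)
est.update(0.100)
est.update(0.200)  # one slow sample raises both srtt and rttdev
print(round(est.srtt, 4), round(est.rttdev, 4), round(est.rto, 4))
```

Per the Karn & Partridge slide, `update` should only be fed samples from segments that were not retransmitted, since an ACK for a retransmitted segment is ambiguous.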
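
The duplicate-ACK example from the Fast Retransmission slide can be replayed with a toy receiver. This is a sketch under simplifying assumptions: segments are inclusive byte ranges that never overlap, and the helper names `cumulative_acks` and `fast_retransmit_point` are made up for this illustration.

```python
def cumulative_acks(arrivals, first_byte=1):
    """Return the cumulative ACK sent after each arriving (first, last) segment."""
    next_expected = first_byte
    buffered = {}  # first byte -> last byte of segments held out of order
    acks = []
    for first, last in arrivals:
        if first == next_expected:
            next_expected = last + 1
            # Drain any buffered segments that are now in order.
            while next_expected in buffered:
                next_expected = buffered.pop(next_expected) + 1
        elif first > next_expected:
            buffered[first] = last
        acks.append(next_expected)
    return acks


def fast_retransmit_point(acks, threshold=3):
    """Index of the ACK that is the threshold-th duplicate, or None."""
    last, dups = None, 0
    for i, ack in enumerate(acks):
        if ack == last:
            dups += 1
            if dups == threshold:
                return i
        else:
            last, dups = ack, 0
    return None


# Segments that arrive in the slide's example (1701:2500 never shows up).
arrivals = [(1, 1000), (1001, 1700), (2501, 3000), (3001, 4000), (4001, 4576)]
acks = cumulative_acks(arrivals)
print(acks)                         # [1001, 1701, 1701, 1701, 1701]
print(fast_retransmit_point(acks))  # 4
```

The third duplicate of ACK 1701 arrives with the fifth ACK, which is where a sender using the three-dup-ACK rule would retransmit the segment starting at byte 1701.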