Stanford CS 144 - Data Center Networking


Data Center Networking
Stanford CS144 Lecture 17
Philip Levis, 11/30/11

• Low latencies: µs
• High capacity: GigE, 10 GigE
• Specialized traffic
• Centrally managed

Topology
(picture courtesy of Al-Fares et al., “A Scalable, Commodity Data Center Network Architecture”)

Storage Workload
(picture courtesy of Phanishayee et al., “Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems”)

Query Workload
(picture courtesy of Alizadeh et al., “Data Center TCP (DCTCP)”)

Problems

Per-Pair Bandwidth
(picture courtesy of Al-Fares et al., “A Scalable, Commodity Data Center Network Architecture”)

Incast
(from Phanishayee et al., “Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems”)

Incast Details
(from Phanishayee et al., “Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems”)

Mixed traffic
• Low latency for short flows
• High burst tolerance (incast)
• High throughput for long flows

Recent Research
• New switching topology: Al-Fares et al.
• Fix TCP incast: Vasudevan et al.
• Data Center TCP: Alizadeh et al.

Per-Pair Bandwidth
(picture courtesy of Al-Fares et al., “A Scalable, Commodity Data Center Network Architecture”)

Fat Tree

Fat Tree
(figure labels: (k/2)^2, k/2, k/2, k)

Switching

Prefix         Port
10.2.0.0/24    0
10.2.1.0/24    1
0.0.0.0/0

Suffix         Port
0.0.0.2/8      2
0.0.0.3/8      3

TCAM: 10.2.0.X, 10.2.1.X, X.X.X.2, X.X.X.3
Encoder

Prefix    Next Hop    Port
00        10.2.0.1    0
01        10.2.1.1    1
10        10.4.1.1    2
11        10.4.1.2    3

Not Perfect
(figure labels: (k/2)^2, k/2, k/2, k)

Fat-Tree Status

Incast
• RTO = SRTT + (4 x RTTVAR)

Behavior
(from Phanishayee et al., “Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems”)

RFC 6298
(2.4) Whenever RTO is computed, if it is less than 1 second, then the RTO SHOULD be rounded up to 1 second.
- in practice, often 200ms

RFC 2581
The delayed ACK algorithm specified in [Bra89] SHOULD be used by a TCP receiver. When used, a TCP receiver MUST NOT excessively delay acknowledgments. Specifically, an ACK SHOULD be generated for at least every second full-sized segment, and MUST be generated within 500 ms of the arrival of the first unacknowledged packet.
- in practice, often 40ms

Solutions
• Proposal 1: Adjust RTO (Vasudevan et al.)
• Proposal 2: DCTCP (Alizadeh et al.)

RTT

RTT 2

RTO
• Make RTOmin 200µs
• Timeout = (RTO + (rand(0.5) x RTO))

Improvement

Wide Area

DCTCP
• Three goals
  • Low latency for short flows
  • High burst tolerance (incast)
  • High throughput for long flows
• Basic approach: keep switch queues short

Queue Length
• RTT measurements are noisy
• At high speeds, very small
  • GigE: 10 packets is 120µs
  • 10GigE: 10 packets is 12µs
• Use ECN (explicit congestion notification)
  • RFC 3168

Setting ECN
(figure labels: K, Set ECN bit)

Monitoring α
• Per RTT, measure F, the fraction of packets sent that had the ECN bit set
• DCTCP acks copy the ECN bit of the corresponding data packets into the ECN-Echo field
• Compute α, EWMA of F

Adjusting cwnd
• cwnd = cwnd x (1 - α/2)

DCTCP Caveat
“We stress that DCTCP is designed for the data center environment. In this paper, we make no claims about suitability of DCTCP for wide area networks.”

Data Center Networks
• Very different than wide area Internet
• Tiny RTTs
• Different traffic patterns
• Single administrative domain
• Standards (e.g., IETF) much less important
• A lot of very novel network
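
The formulas on the RTO and DCTCP slides can be tried out with a short Python sketch. It is an illustration under stated assumptions, not code from the lecture or the cited papers: the function names are hypothetical, rand(0.5) is read as a uniform draw between 0 and 0.5, and the EWMA gain g = 1/16 is an assumed default that the slides do not give.

```python
import random


def rto_with_floor(srtt, rttvar, rto_min=1.0):
    """RTO = SRTT + 4 x RTTVAR, rounded up to rto_min (all values in seconds).

    rto_min=1.0 is the RFC 6298 floor quoted above; 0.2 matches the
    "in practice" note and 200e-6 matches the RTOmin 200µs proposal.
    """
    return max(rto_min, srtt + 4.0 * rttvar)


def randomized_timeout(rto, rng=random):
    """Timeout = RTO + rand(0.5) x RTO.

    Assumption: rand(0.5) means a uniform random value between 0 and 0.5.
    """
    return rto + rng.uniform(0.0, 0.5) * rto


def update_alpha(alpha, marked, sent, g=1.0 / 16):
    """Fold F, the fraction of ECN-marked packets in the last RTT, into alpha.

    alpha is an exponentially weighted moving average of F.
    Assumption: the gain g is not stated on the slides.
    """
    f = marked / sent if sent else 0.0
    return (1.0 - g) * alpha + g * f


def adjust_cwnd(cwnd, alpha):
    """cwnd = cwnd x (1 - alpha/2)."""
    return cwnd * (1.0 - alpha / 2.0)


if __name__ == "__main__":
    # Hypothetical data center RTT of 100µs with 25µs variance.
    rto = rto_with_floor(100e-6, 25e-6, rto_min=200e-6)
    print("RTO:", rto, "randomized timeout:", randomized_timeout(rto))

    alpha = 0.0
    for marked in (0, 2, 10):  # marked packets out of 10 sent in each RTT
        alpha = update_alpha(alpha, marked, 10)
    print("alpha:", alpha, "cwnd after adjustment:", adjust_cwnd(100.0, alpha))
```

With α = 1 (every packet marked) the window is halved, and with α near 0 it is barely reduced, so the cut scales with the measured fraction of marked packets.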

