FAST TCP: From Theory to Experiments∗

C. Jin, D. Wei, S. H. Low, G. Buhrmaster, J. Bunn, D. H. Choe, R. L. A. Cottrell, J. C. Doyle, W. Feng, O. Martin, H. Newman, F. Paganini, S. Ravot, S. Singh†

http://netlab.caltech.edu/FAST/

March 30, 2003

Abstract

We describe a variant of TCP, called FAST, that can sustain high throughput and utilization at multi-Gbps over large distances. We present the motivation, review the background theory, summarize key features of FAST TCP, and report preliminary experimental results.

Keywords: FAST TCP, large bandwidth-delay product, high speed TCP, multi-Gbps experiment

Contents

1 Motivation
2 Background theory
  2.1 Equilibrium and performance
  2.2 Stability
3 Implementation
4 Experimental results
  4.1 Infrastructure
  4.2 Throughput and utilization
  4.3 Fairness

∗Submitted to IEEE Communications Magazine, Internet Technology Series, April 1, 2003.
†G. Buhrmaster and L. Cottrell are with SLAC (Stanford Linear Accelerator Center), Stanford, CA. W. Feng is with LANL (Los Alamos National Lab). O. Martin is with CERN (European Organization for Nuclear Research), Geneva. F. Paganini is with the EE Department, UCLA. All other authors are with Caltech, Pasadena, CA.

1 Motivation

One of the key drivers of ultrascale networking is the High Energy and Nuclear Physics (HENP) community, whose explorations at the high energy frontier are breaking new ground in our understanding of the fundamental interactions, structures and symmetries that govern the nature of matter and spacetime in our universe. The largest HENP projects each encompass 2,000 physicists from 150 universities and laboratories in more than 30 countries.
Collaborations on this global scale would not have been attempted if the physicists could not count on excellent network performance. Rapid and reliable data transport, at speeds of 1 to 10 Gbps and 100 Gbps in the future, is a key enabler of the global collaborations in physics and other fields. The ability to analyze and share many terabyte-scale data collections, accessed and transported in minutes, on the fly, rather than over hours or days as is the current practice, is at the heart of the process of search and discovery for new scientific knowledge.

For instance, the CMS (Compact Muon Solenoid) Collaboration, now building next-generation experiments scheduled to begin operation at CERN's (European Organization for Nuclear Research) Large Hadron Collider (LHC) in 2007, along with the other LHC Collaborations, is facing unprecedented challenges in managing, processing and analyzing massive data volumes, rising from the petabyte (10^15 bytes) to the exabyte (10^18 bytes) scale over the coming decade. The current generation of experiments now in operation and taking data at SLAC (Stanford Linear Accelerator Center) and Fermilab face similar challenges. SLAC's experiment has already accumulated more than a petabyte of stored data. Effective data sharing will require 10 Gbps of sustained throughput on the major HENP network links within the next 2 to 3 years, rising to terabit/sec within the coming decade.

Continued advances in computing, communication, and storage technologies, combined with the development of national and global Grid systems, hold the promise of providing the required capacities and an effective environment for computing and science.
The key challenge we face, and intend to overcome, is that the current congestion control algorithm of TCP does not scale to this regime. Our goal is to develop the theory and algorithms for ultrascale networking, implement and test them in state-of-the-art testbeds, and deploy them in communities that need them urgently.

2 Background theory

There is now a preliminary theory to understand large-scale networks, such as the Internet, under end-to-end control. The theory clarifies how control algorithms and network parameters determine the equilibrium and stability properties of the network, and how these properties affect its performance. It is useful both in understanding the performance problems of the current congestion control algorithm and in designing better algorithms to solve these problems, while maintaining fairness in resource allocation.

Congestion control consists of two components: a source algorithm, implemented in TCP, that adapts the sending rate (or window) to congestion information in the source's path, and a link algorithm, implemented in routers, that updates and feeds back a measure of congestion to sources that traverse the link. Typically, the link algorithm is implicit and the measure of congestion is either loss probability or queueing delay. For example, the current protocol TCP Reno and its variants use loss probability as a congestion measure, and TCP Vegas primarily uses queueing delay as a congestion measure [1, 2]. Both are implicitly updated by the queueing process and implicitly fed back to sources via end-to-end loss and delay, respectively.

The source-link algorithm pair, referred to here as TCP/AQM (active queue management) algorithms,¹ forms a distributed feedback system, the largest man-made feedback system in deployment.
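To make the distinction between the two congestion measures concrete, the following is a minimal sketch, not the FAST TCP algorithm itself, of how a source algorithm might adapt its window to loss (in the spirit of TCP Reno) versus queueing delay (in the spirit of TCP Vegas). All function names and constants here are illustrative assumptions, not taken from any implementation.

```python
def reno_update(cwnd, loss_detected):
    """Loss-based update in the spirit of TCP Reno (AIMD):
    halve the window on loss, otherwise grow by ~1 packet per RTT."""
    if loss_detected:
        return cwnd / 2.0            # multiplicative decrease
    return cwnd + 1.0                # additive increase per RTT

def vegas_update(cwnd, base_rtt, rtt, alpha=2.0, beta=4.0):
    """Delay-based update in the spirit of TCP Vegas: estimate the
    number of packets this flow has queued in the path and keep that
    estimate between alpha and beta."""
    queued = cwnd * (1.0 - base_rtt / rtt)   # packets buffered in the network
    if queued < alpha:
        return cwnd + 1.0            # path underused: grow
    if queued > beta:
        return cwnd - 1.0            # queue building: back off
    return cwnd                      # within the target band: hold

# Example: a 100-packet window, 10 ms propagation delay, 12 ms measured RTT.
print(reno_update(100.0, loss_detected=True))      # loss halves the window
print(vegas_update(100.0, base_rtt=0.010, rtt=0.012))  # delay signal trims it
```

The contrast illustrates the point in the text: Reno only reacts after the network drops a packet, while a delay-based source senses queue buildup continuously and can adjust before any loss occurs.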
In this system, hundreds of millions of TCP sources and hundreds of thousands of network devices interact with each other, each executing a simple local algorithm, implicitly or explicitly, based on local information. Their interactions result in a collective behavior, whose equilibrium and stability properties we now discuss.

¹We will henceforth refer to it as a "TCP algorithm" even though we really mean the congestion control algorithm in TCP.

2.1 Equilibrium and performance

We can interpret TCP/AQM as a distributed algorithm over the Internet to solve a global optimization problem [3, 2]. The solution of the optimization problem and that of an associated problem determine the equilibrium and performance of the network. Different TCP and AQM algorithms all solve the same prototypical problem. They differ in the objective function of the underlying optimization problem and the iterative procedure to solve it.

Even though historically TCP and AQM algorithms have not been designed as an optimization procedure, this interpretation is valid under fairly general conditions, and useful in understanding network performance, such as throughput, utilization, delay, loss, and fairness.
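The optimization interpretation can be sketched in a few lines. The toy example below, an assumption-laden illustration in the spirit of the framework in [3, 2] rather than any deployed algorithm, has one link of capacity c shared by two sources with logarithmic utilities U_i(x) = w_i log x. Each source reacts only to the link "price" (a congestion measure such as delay), and the link adjusts its price only from its local excess demand; yet together they converge to the global utility-maximizing allocation. All parameter values are made up for illustration.

```python
c = 100.0            # link capacity (hypothetical units, e.g. Mbps)
w = [1.0, 3.0]       # per-source utility weights
p = 0.01             # link "price" (congestion measure)
gamma = 1e-4         # price step size

for _ in range(10_000):
    # Source algorithm: each source maximizes U_i(x) - p*x given only
    # the path price p, which for U_i = w_i*log(x) gives x_i = w_i / p.
    x = [wi / p for wi in w]
    # Link algorithm: raise the price when demand exceeds capacity,
    # lower it otherwise (a gradient step on the dual problem).
    p = max(p + gamma * (sum(x) - c), 1e-9)

# Equilibrium: the link is fully utilized and bandwidth splits in
# proportion to the weights.
print([round(xi, 1) for xi in x], round(sum(x), 1))
```

Different choices of utility function and price dynamics yield different TCP/AQM pairs, which is exactly the sense in which, per the text, the algorithms "differ in the objective function of the underlying optimization problem and the iterative procedure to solve it."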