Scalability & Stability of the Internet Infrastructure
Context
Motivation
Imminent Collapse of the Internet
Internet Growth
Infrastructure Topological Evolution
Internet Evolution: NSFNet
Internet Evolution: Today
Impact of Instability & Failures
Background: Internet Architecture
Background: BGP Routing Protocol
Background: Internet Core
Roadmap
Internet Routing Instability Results
Instability Results (Continued)
Growth in Routing State
Initial Findings (SIGCOMM’97)
More Initial Observations
BGP Updates
30 Second Frequency Components 1997
Origins of Pathological Updates (INFOCOM99)
After Initial Publication of Results
BGP Announcements and Withdraws
Frequency Components
BGP Failures -- Congestion Collapse (BGP Frequency)
BGP Congestion Collapse Hypothesis
What about Failures?
Internet Infrastructure Failures (FTCS99)
Definitions
Route Failures: How long before a network is unreachable?
Route Repairs: How long before a network is reachable again?
Failover: How long before traffic is re-routed?
Conventional Wisdom on Convergence
18-Month Study of Convergence Behavior
Terminology
Withdraw Convergence (Tdown)
Withdraw Convergence
Failovers and Repairs
End2End Connectivity
Impact of Convergence Delay on End-to-End Path
What is Happening?
BGP Bad News
Internet vs. Telephone Network
Acknowledgements

Scalability & Stability of the Internet Infrastructure
Farnam Jahanian
Department of EECS
University of Michigan
<[email protected]>

Context
[Diagram. Box labels: Network Infrastructure; Active Response Capabilities; Analysis Engines; Anomalous Network Events; Coarse and Fine Grained Measurement Tools]
• Network Attacks
• S/H Failures
• Operational Faults
• Windmill Probes
• Netflow Statistics
• Protocol Scrubbers
• Event Aggregation
• Data Mining
• Replication schemes
• Routers
• Name Servers
• Critical Services
• Countermeasures
LIGHTHOUSE: Survivable Network Infrastructure
Joint projects between U.
Michigan & Merit Network

Motivation
• Increasing reliance of financial and national utility infrastructures on interconnected IP-based networks
• Explosive growth in both size and topological complexity of the underlying communication infrastructure
• Reliance on off-the-shelf infrastructure & shrink-wrapped code
• Network infrastructure is vulnerable:
  – inherent instability and transient oscillations
  – delayed convergence and long failover
  – coordinated denial of service attacks on network resources
  – hardware and software failures
  – operational faults and misconfigurations

Imminent Collapse of the Internet
[Figure labels: Collapse of the Internet; Now?]

Internet Growth
Explosive growth in both size and topological complexity
• Internet end-system growth
• Traffic volume & characteristics
• Infrastructure topological evolution

Infrastructure Topological Evolution
Between 1995 and 1999:
• Decentralization: from a single backbone network to a conglomeration of hundreds of backbones and thousands of ISPs.
• Loss of hierarchy and abstraction: from a strict hierarchical network to an increasingly full-mesh interconnection.
• Significant bandwidth increase: from a single T3 (45 Mb/s) circuit and T1 (1.5 Mb/s) links to multiple OC-48 (2.4 Gb/s) circuits and OC-12 (622 Mb/s) lines between nodes.

Internet Evolution: NSFNet
[Diagram: NSFNet Backbone, Regional networks, and Campus networks, with links labeled Hello/EGP]
Hierarchical network with a single central backbone

Internet Evolution: Today
[Diagram: AS1, AS2, AS3, AS4 and C1, C2, C3, C4]
Full-mesh interconnection of ISP backbones and customers

Impact of Instability & Failures
– Increased end-to-end loss/latency
– Increased delay in convergence & network reachability
– Backbone infrastructure CPU/memory requirements
– Backbone “route flap storms”
– Network management complexity

Background: Internet Architecture
[Diagram: links labeled BGP]

Background: Internet Routing
• Two major categories
  – Inter-domain (BGP between autonomous systems)
  – Intra-domain (OSPF, IS-IS, IGRP inside an AS)
• BGP
  – Incremental: announcements and withdrawals
  – Updates include policy (e.g.
MED, ASPath)
  – Maintain multiple possible routes

Background: BGP Routing Protocol
• BGP is an incremental protocol that sends update information only upon changes in network topology or routing policy.
• Two forms of messages:
  – announcements: a new network is accessible, or another route to a network destination is now preferred
  – withdrawals: a destination network is no longer accessible
• Routing policies vs. shortest number of hops

Background: Internet Core
• Networks are aggregated into CIDR (Classless Inter-Domain Routing) prefixes
• A prefix represents a set of destination IP addresses
• At the Internet “core” all routers maintain paths to “default-free” routes
• Originally 5 major Internet Exchange Points (IXPs)
• In 1996, approximately 30,000 default-free routes

Roadmap
• Study of stability of routing in the Internet backbone
  – Transient oscillations, pathological redundant updates
  – Congestion collapse and correlation to network usage
  – SIGCOMM’97 and INFOCOM’99
• Study of route availability and failover rates
  – Long-term availability of Internet backbone routes
  – Case study of a regional provider
  – FTCS’99
• Study of convergence behavior of routing protocols
  – Injection of route changes into the Internet backbone
  – Impact of convergence delay on end-to-end path
  – 18-month study & ongoing

Internet Exchange Points
• Deployed probe machines at five public exchange points
• Collected all routing updates at IXPs over a four-year period

Internet Routing Instability Results
• The number of BGP routing updates exchanged per day in the Internet core is orders of magnitude larger than expected.
• Most routing information is dominated by pathological, or redundant, updates, which do not directly reflect changes in routing policy or topology.
• Instability and redundant updates exhibit a specific periodicity of 30 and 60 seconds.
• Instability and redundant updates show a surprising correlation to network usage and exhibit corresponding daily and weekly cyclic trends.

Instability Results (Continued)
• Instability is not dominated by a small set of autonomous systems or routes.
• Instability
is not disproportionately dominated by prefixes of specific lengths, i.e., it is independent of aggregation.
• Discounting policy fluctuation and pathological behavior, there remains a significant level of Internet forwarding instability.
• Details: SIGCOMM’97 & INFOCOM’99

Growth in Routing State
Linear growth in routing table

Initial Findings (SIGCOMM’97)
• Up to 60 million BGP updates/day for only 30,000 default-free routes!
  – On avg. 2-6 Million
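
Sketch: telling redundant updates apart from real changes

The slides distinguish updates that reflect a change in topology or policy from pathological, or redundant, updates that do not. The Python sketch below shows one simple way to draw that line. It assumes that an announcement repeating the route already held for a (peer, prefix) pair, or a withdrawal for a prefix that peer has not announced, counts as redundant. The record layout, the function name classify_updates, and the use of the AS path as the only compared attribute are hypothetical choices made for illustration; this is not the classification used in the cited studies.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical update record: (peer, prefix, as_path).
# as_path is None for a withdrawal, a tuple of AS numbers for an announcement.
Update = Tuple[str, str, Optional[Tuple[int, ...]]]


def classify_updates(updates: List[Update]) -> List[str]:
    """Label each update 'change' or 'redundant' from per-(peer, prefix) state."""
    current: Dict[Tuple[str, str], Tuple[int, ...]] = {}
    labels: List[str] = []
    for peer, prefix, as_path in updates:
        key = (peer, prefix)
        if as_path is None:
            # Withdrawal: meaningful only if the peer had announced the prefix.
            if key in current:
                del current[key]
                labels.append("change")
            else:
                labels.append("redundant")
        elif current.get(key) == as_path:
            # Announcement identical to the route already held (assumption:
            # only the AS path is compared, other attributes are ignored).
            labels.append("redundant")
        else:
            current[key] = as_path
            labels.append("change")
    return labels


if __name__ == "__main__":
    sample: List[Update] = [
        ("peer1", "192.0.2.0/24", (64500, 64501)),  # new route
        ("peer1", "192.0.2.0/24", (64500, 64501)),  # same route repeated
        ("peer1", "192.0.2.0/24", None),            # withdrawal
        ("peer1", "192.0.2.0/24", None),            # withdrawal repeated
    ]
    print(classify_updates(sample))
```

Keeping the state per (peer, prefix) pair means a repeat is only judged against what the same peer said earlier, which fits the incremental nature of BGP described in the background slides.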