MIT 6.829 - Interdomain Routing Correctness and Stability


Outline:
Interdomain Routing Correctness and Stability
Slide 2
Is correctness really that important?
Why does routing go wrong?
What can go wrong?
Review: Simple operation…
…but complex configuration!
Configuration Semantics
What types of problems does configuration cause?
Slide 10
These problems are real
Several “Big” Problems a Week
Why is routing hard to get right?
Correctness Specification
Safety: No Persistent Oscillation
Strawman: Global Policy Check
Think Globally, Act Locally
Main Idea of Today’s Paper
Relationship #1: Customer-Provider
Relationship #2: Peering
Rankings
Additional Assumption: Hierarchy
Safety: Proof Sketch
Activation Sequence: Intuition
Slide 25
Proof, Step 1: Customer Routes
Proof, Step 2: Peer & Provider Routes
Ranking and Filtering Interactions
Some problems
Other Possible Local Rankings
What Rankings Violate Safety?
What about properties of resulting paths, after the protocol has converged?
Slide 33
Path Visibility: Internal BGP (iBGP)
iBGP Signaling: Static Check
How do we guarantee these additional properties in practice?
Today: Reactive Operation
Goal: Proactive Operation
rcc Overview
rcc Implementation
Summary: Faults across 17 ASes
rcc: Take-home lessons
Two Philosophies
Preventing Errors in the First Place
Slide 45
Configuration Syntax (Example)
Why is Routing Hard to Get Right?
Which faults does rcc detect?
Normalizing Router Configuration
Route Validity: Consistent Export
Inconsistent Export Observed at AT&T
Example: “Bogon” Routes
rcc Interface
Parsing Configuration
List of Faults

Slide 1: Interdomain Routing Correctness and Stability
Nick Feamster

Slide 2: Is correctness really that important?

Slide 3: Is correctness really that important?
• The Internet is increasingly becoming part of the mission-critical infrastructure (a public utility!).
Big problem: Very poor understanding of how to manage it.

Slide 4: Why does routing go wrong?
• Complex policies
  – Competing / cooperating networks
  – Each with only limited visibility
• Large scale
  – Tens of thousands of networks
  – …each with hundreds of routers
  – …each routing to hundreds of thousands of IP
    prefixes

Slide 5: What can go wrong?
Two-thirds of the problems are caused by configuration of the routing protocol.
Some things are out of the hands of networking research.
But…

Slide 6: Review: Simple operation…
[Figure labels: Route Advertisement, Autonomous Systems (ASes), Session, MIT]

Destination     Next-hop         AS Path
18.0.0.0/8      192.5.89.89      1 3356 3
18.0.0.0/8      66.250.252.44    174 3

Slide 7: …but complex configuration!
• Which neighboring networks can send traffic
• Where traffic enters and leaves the network
• How routers within the network learn routes to external destinations
Flexibility for realizing goals in complex business landscape
[Figure labels: Flexibility, Complexity, Traffic, Route, No Route]

Slide 8: Configuration Semantics
Ranking: route selection
Dissemination: internal route advertisement
Filtering: route advertisement
[Figure labels: Customer, Competitor, Primary, Backup]

Slide 9: What types of problems does configuration cause?
• Persistent oscillation (today’s reading)
• Forwarding loops
• Partitions
• “Blackholes”
• Route instability
• …

Slide 10: These problems are real
“…a glitch at a small ISP… triggered a major outage in Internet access across the country. The problem started when MAI Network Services...passed bad router information from one of its customers onto Sprint.” -- news.com, April 25, 1997
[Figure labels: UUNet, Florida Internet Barn, Sprint]

Slide 11: These problems are real
“Microsoft's websites were offline for up to 23 hours...because of a [router] misconfiguration…it took nearly a day to determine what was wrong and undo the changes.” -- wired.com, January 25, 2001

“WorldCom Inc…suffered a widespread outage on its Internet backbone that affected roughly 20 percent of its U.S. customer base. The network problems…affected millions of computer users worldwide. A spokeswoman attributed the outage to "a route table issue."”
-- cnn.com, October 3, 2002

“A number of Covad customers went out from 5pm today due to, supposedly, a DDOS (distributed denial of service attack) on a key Level3 data center, which later was described as a route leak (misconfiguration).” -- dslreports.com, February 23, 2004

Slide 12: Several “Big” Problems a Week

Slide 13: Why is routing hard to get right?
• Defining correctness is hard
• Interactions cause unintended consequences
  – Each network independently configured
  – Unintended policy interactions
• Operators make mistakes
  – Configuration is difficult
  – Complex policies, distributed configuration

Slide 14: Correctness Specification
Safety
The protocol converges to a stable path assignment for every possible initial state and message ordering.
The protocol does not oscillate.

Slide 15: Safety: No Persistent Oscillation
[Figure: ASes 1, 2, and 3, each with ranked paths to destination 0. Paths as listed in the figure: AS 1: 1 3 0, 1 0; AS 3: 3 2 0, 3 0; AS 2: 2 1 0, 2 0.]
Varadhan, Govindan, & Estrin, “Persistent Route Oscillations in Interdomain Routing”, 1996

Slide 16: Strawman: Global Policy Check
• Require each AS to publish its policies
• Detect and resolve conflicts
Problems:
• ASes typically unwilling to reveal policies
• Checking for convergence is NP-complete
• Failures may still cause oscillations

Slide 17: Think Globally, Act Locally
• Key features of a good solution
  – Safety: guaranteed convergence
  – Expressiveness: allow diverse policies for each AS
  – Autonomy: do not require revelation/coordination
  – Backwards-compatibility: no changes to BGP
• Local restrictions on configuration semantics
  – Ranking
  – Filtering

Slide 18: Main Idea of Today’s Paper
• Permit only two business arrangements
  – Customer-provider
  – Peering
• Constrain both filtering and ranking based on these arrangements to guarantee safety
• Surprising result: these arrangements correspond to today’s (common) behavior
Gao & Rexford, “Stable Internet Routing without Global Coordination”, IEEE/ACM ToN, 2001

Slide 19: Relationship #1: Customer-Provider
Filtering
  – Routes from customer: to everyone
  – Routes from provider: only to customers
[Figure: advertisements and traffic between a customer and its providers, in two panels: “From the customer / To other destinations” and “From other destinations / To the customer”.]

Slide 20: Relationship #2: Peering
Filtering
  – Routes from peer: only to customers
  – No routes from other peers or providers
[Figure labels: advertisements, traffic, customer, customer, peer, peer]

Slide 21: Rankings
• Routes from customers over routes from peers
• Routes from peers over routes from providers
[Figure labels: provider, peer, customer]

Slide 22: Additional Assumption: Hierarchy
[Figure annotation: Disallowed!]

Slide 23: Safety: Proof
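
The filtering and ranking restrictions from the Relationship #1, Relationship #2, and Rankings slides can be written down compactly. The Python sketch below is an illustration only, not code from the lecture or the paper: the names `Rel`, `Route`, `may_export`, and `best_route` are hypothetical, the AS numbers and prefix are documentation-reserved placeholders, and breaking ties by shorter AS path is an added assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Rel(Enum):
    """Business relationship of a neighbor, seen from the local AS."""
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


# Ranking rule: customer routes over peer routes over provider routes.
PREFERENCE = {Rel.CUSTOMER: 2, Rel.PEER: 1, Rel.PROVIDER: 0}


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: Tuple[int, ...]
    learned_from: Rel  # relationship of the neighbor that advertised it


def may_export(route: Route, to_neighbor: Rel) -> bool:
    """Filtering rule: routes learned from customers go to everyone;
    routes learned from peers or providers go only to customers."""
    return route.learned_from is Rel.CUSTOMER or to_neighbor is Rel.CUSTOMER


def best_route(candidates: List[Route]) -> Optional[Route]:
    """Pick the most preferred route by relationship class.
    Tie-break on shorter AS path (an assumption for this sketch)."""
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda r: (PREFERENCE[r.learned_from], -len(r.as_path)),
    )


# Hypothetical candidate routes for one prefix.
routes = [
    Route("192.0.2.0/24", (64501,), Rel.PROVIDER),
    Route("192.0.2.0/24", (64502, 64510), Rel.PEER),
    Route("192.0.2.0/24", (64503, 64504, 64510), Rel.CUSTOMER),
]
chosen = best_route(routes)
print("selected:", chosen)
for neighbor in Rel:
    print("export to", neighbor.value, "->", may_export(chosen, neighbor))
```

Note that the ranking looks only at who advertised the route, so a longer customer route wins over a shorter peer or provider route; the AS-path length matters only within one relationship class.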

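The oscillation on the Safety: No Persistent Oscillation slide can be reproduced with a small simulation. The sketch below assumes that the figure lists each AS's paths to destination 0 in order of preference (for example, AS 1 prefers 1 3 0 over 1 0); that reading, the round-robin activation order, and the names `RANKINGS`, `best_path`, and `simulate` are all assumptions made for illustration. It shows oscillation under one activation order only, whereas safety requires convergence under every ordering.

```python
from typing import Dict, List, Optional, Tuple

Path = Tuple[int, ...]

# Assumed reading of the slide's figure: ranked paths to destination 0,
# most preferred first.
RANKINGS: Dict[int, List[Path]] = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 2, 0), (3, 0)],
}


def best_path(node: int,
              rankings: Dict[int, List[Path]],
              selected: Dict[int, Optional[Path]]) -> Optional[Path]:
    """Return the highest-ranked path that is currently available.
    A path via neighbor n is available only if n has selected its tail."""
    for path in rankings[node]:
        next_hop = path[1]
        if next_hop == 0 or selected.get(next_hop) == path[1:]:
            return path
    return None


def simulate(rankings: Dict[int, List[Path]],
             max_rounds: int = 50) -> Tuple[str, Dict[int, Optional[Path]]]:
    """Activate nodes one at a time in a fixed round-robin order.
    Returns "stable" if a full round changes nothing, "oscillates" if an
    end-of-round state repeats, and "undecided" if max_rounds runs out."""
    selected: Dict[int, Optional[Path]] = {node: None for node in rankings}
    seen = set()
    for _ in range(max_rounds):
        changed = False
        for node in sorted(rankings):
            new = best_path(node, rankings, selected)
            if new != selected[node]:
                selected[node] = new
                changed = True
        if not changed:
            return "stable", dict(selected)
        state = tuple(sorted(selected.items()))
        if state in seen:
            return "oscillates", dict(selected)
        seen.add(state)
    return "undecided", dict(selected)


status, final = simulate(RANKINGS)
print(status, final)
```

Because the dynamics are deterministic under a fixed activation order, a repeated end-of-round state means the same sequence of route changes recurs indefinitely. Reversing each preference list, so that every AS prefers its direct path, makes the same simulation settle after one round.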
