Princeton COS 461 - A Scalable Large-file Transfer Service



Slide titles:
CoBlitz: A Scalable Large-file Transfer Service (COS 461)
Large-file Distribution
What CDNs Are Optimized For
Why Not Web CDNs?
Peer-to-Peer?
Slide 6
What We’d Like Is
CoBlitz: Scalable Large-file CDN
How It Works
Smart Agent
Chunk Indexing: Consistent Hashing
Operation & Challenges
Unilateral Peering
Peering Set Difference
Slide 15
Slide 16
Reducing Origin Load
Scale Experiments
Throughput Distribution
Downloading Times
Why Is BitTorrent Slow?
Synchronized Workload Congestion
Addressing Congestion
Number of Failures
Performance After Flash Crowds
Data Reuse
Real-world Usage
Fedora Core 6 Release
On Fedora Core Mirror List
Conclusion
Thank you!

CoBlitz: A Scalable Large-file Transfer Service (COS 461)
KyoungSoo Park
Princeton University

Large-file Distribution
• Increasing demand for large files
• Movies or software release
• On-line movie downloads
• Linux distribution
• Files are 100MB ~ tens of GB
• One-to-many downloads
How to serve large files to many clients?
• Content Distribution Network (CDN)?
• Peer-to-peer system?

What CDNs Are Optimized For
Most Web files are small (1KB ~ 100KB)

Why Not Web CDNs?
• Whole file caching in participating proxy
• Optimized for 10KB objects
• 2GB = 200,000 x 10KB
• Memory pressure
• Working sets do not fit in memory
• Disk access is 1000 times slower
• Waste of resources
• More servers needed
• Provisioning is a must

Peer-to-Peer?
• BitTorrent takes up ~30% of Internet BW
1. Download a “torrent” file
2. Contact the tracker
3. Enter the “swarm” network
4. Chunk exchange policy
 - Rarest chunk first or random
 - Tit-for-tat: incentive to upload
 - Optimistic unchoking
5. Validate the checksums
[Diagram labels: torrent, tracker, peers, up, down]
Benefit: extremely good use of resources!

Peer-to-Peer?
• Custom software
• Deployment is a must
• Configurations needed
• Companies may want managed service
• Handles flash crowds
• Handles long-lived objects
• Performance problem
• Hard to guarantee the service quality
• Others are discussed later

What We’d Like Is
Large-file service with
• No custom client
• No custom server
• No prepositioning
• No rehosting
• No manual provisioning

CoBlitz: Scalable Large-file CDN
• Reducing the problem to small-file CDN
• Split large files into chunks
• Distribute chunks at proxies
• Aggregate memory/cache
• HTTP needs no deployment
• Benefits
• Faster than BitTorrent by 55-86% (~500%)
• One copy from origin serves 43-55 nodes
• Incremental build on existing CDNs

How It Works
• Only the reverse proxy (CDN) caches the chunks!
• CDN = Redirector + Reverse Proxy
[Diagram: clients, agents, CDN nodes holding chunks 1-5, DNS (coblitz.codeen.org), and the origin server, which is reached by HTTP range query]

Smart Agent
• Preserves HTTP semantics
• Parallel chunk requests
[Diagram: the agent sits between the client (HTTP) and the CDN nodes and keeps a sliding window of “chunks”; chunk states shown: done, waiting, no action]

Chunk Indexing: Consistent Hashing
Problem: How to find the node responsible for a specific chunk?
Static hashing: f(x) = some_f(x) % n
But n is dynamic for servers
 - node can go down
 - new node can join
Consistent hashing: F(x) = some_F(x) % N (N is a large but fixed number)
Find a live node k, where |F(k) - F(URL)| is minimum
[Diagram: hash space drawn as a ring (… N-1, 0, …) with CDN nodes (proxies) and chunk requests X1, X2, X3 (Xk: chunk request)]

Operation & Challenges
• Public service for over 2.5 years
• http://coblitz.codeen.org:3125/URL
• Challenges
• Scalability & robustness
• Peering set difference
• Load to the origin server

Unilateral Peering
• Independent proximity-aware peering
• Pick “n” close nodes around me
• Cf. BitTorrent picks “n” nodes randomly
• Motivation
• Partial network connectivity
• Internet2, CANARIE nodes
• Routing disruption
• Isolated nodes
• Benefits
• No synchronized maintenance problem
• Improves both scalability & robustness

Peering Set Difference
• No perfect clustering by design
• Assumption
• Close nodes share common peers
[Diagram labels: Both can reach, Only can reach, Only can reach]

Peering Set Difference
• Highly variable app-level RTTs
• 10x the variance of ICMP
• High rate of change in peer set
• Close nodes share less than 50%
• Low cache hit
• Low memory utility
• Excessive load to the origin

Peering Set Difference
• How to fix?
• Avg RTT → min RTT
• Increase # of samples
• Increase # of peers
• Hysteresis
• Close nodes share more than 90%

Reducing Origin Load
• Still have peering set difference
• Critical in traffic to origin
• Proximity-based routing
• Converges exponentially fast
• 3-15% do one more hop
• Implicit overlay tree
• Result
• Origin load reduction by 5x
[Diagram labels: Origin server, Rerun hashing]

Scale Experiments
• Use all live PlanetLab nodes as clients
• 380~400 live nodes at any time
• Simultaneous fetch of 50MB file
• Test scenarios
• Direct
• BitTorrent Total/Core
• CoBlitz uncached/cached/staggered
• Out-of-order numbers in paper

Throughput Distribution
[Figure: CDF of per-node throughput. x-axis: Throughput (Kbps), 0 to 10000; y-axis: Fraction of Nodes <= X (CDF). Series: Direct, BT-total, BT-core, In-order uncached, In-order staggered, In-order cached, Out-of-order staggered. Annotation: 55-86%, BT-Core]

Downloading Times
[Figure: CDF of download time. x-axis: Download Time (sec), 0 to 2000; y-axis: Fraction of Nodes <= X. Series: In-order cached, In-order staggered, In-order uncached, BT-core, BT-total, Direct. Annotation: 95th percentile: 1000+ secs faster]

Why Is BitTorrent Slow?
• In the experiments
• No locality: randomly choose peers
• Chunk indexing: extra communication
• Trackerless BitTorrent: Kademlia DHT
• In practice
• Upload capacity of typical peers is low
• 10 to a few 100 Kbps for cable/DSL users
• Tit-for-tat may not be fair
• A few high-capacity uploaders help the most
• BitTyrant [NSDI ’07]

Synchronized Workload Congestion
[Diagram label: Origin Server]

Addressing Congestion
• Proximity-based multi-hop routing
• Overlay tree for each chunk
• Dynamic chunk-window resizing
• Increase by 1/log(x), where x is the window size, if a chunk finishes faster than average
• Decrease by 1 if a retry kills the first chunk

Number of Failures
[Figure: bar chart. y-axis: Failure Percentage (%), 0 to 6. Categories as extracted: Direct, BitTorrent, CoBlitz. Bar values as extracted: 4.3, 5.7, 2.1]

Performance After
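
The "split large files into chunks" and "HTTP range query" ideas from the slides can be illustrated with a small sketch. This is not CoBlitz's code: the function names `chunk_ranges` and `range_header` are hypothetical, and the chunk size is an assumed parameter, since the preview does not state the size CoBlitz uses. The only fixed fact relied on is the standard HTTP `Range: bytes=start-end` syntax, where both offsets are inclusive.

```python
def chunk_ranges(file_size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return inclusive (start, end) byte offsets that cover the whole file.

    chunk_size is an assumed parameter; the slides do not give a value.
    """
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(start + chunk_size, file_size) - 1)
        for start in range(0, file_size, chunk_size)
    ]


def range_header(start: int, end: int) -> str:
    """Format the value of an HTTP Range header for one chunk."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    # A 10-byte file split into 4-byte chunks; the last chunk is short.
    for start, end in chunk_ranges(10, 4):
        print(range_header(start, end))
```

With decimal units, `len(chunk_ranges(2_000_000_000, 10_000))` gives 200,000, which matches the "2GB = 200,000 x 10KB" arithmetic on the "Why Not Web CDNs?" slide.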
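
The rule on the "Chunk Indexing: Consistent Hashing" slide (hash into a large fixed space of size N, then pick the live node k that minimizes |F(k) - F(URL)|) can be sketched as follows. Everything concrete here is an assumption made for illustration: SHA-1 stands in for the unspecified `some_F`, N is set to 2**32, and the node names and chunk URL are made up. The distance is the plain absolute difference written on the slide; a ring-style wrap-around distance would be a variation.

```python
import hashlib

N = 2**32  # assumed size of the fixed hash space; the slide only says "large but fixed"


def F(key: str) -> int:
    """Map a string into [0, N). SHA-1 is a stand-in for the slide's some_F."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % N


def pick_node(chunk_url: str, live_nodes: list[str]) -> str:
    """Return the live node k with the smallest |F(k) - F(chunk_url)|."""
    if not live_nodes:
        raise ValueError("no live nodes")
    target = F(chunk_url)
    # Ties are broken by node name so the choice does not depend on list order.
    return min(live_nodes, key=lambda node: (abs(F(node) - target), node))


if __name__ == "__main__":
    nodes = ["proxy-a.example.org", "proxy-b.example.org", "proxy-c.example.org"]
    print(pick_node("http://origin.example.org/big.iso#chunk=7", nodes))
```

Because N does not depend on the number of servers, a node joining or leaving only changes the owner of the chunks nearest to that node; a chunk whose chosen node stays alive keeps the same owner, which is the property static `% n` hashing lacks.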
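
The dynamic chunk-window rule on the "Addressing Congestion" slide (grow by 1/log(x) when a chunk finishes faster than average, shrink by 1 when a retry kills the first chunk) might look like the sketch below. Several details are assumptions, not taken from the slides: the logarithm is taken as natural, the window is kept as a float, a minimum window of 2 is imposed so that log(x) is never zero, and "retry kills the first chunk" is read as a boolean flag supplied by the caller. The function name `resize_window` is hypothetical.

```python
import math

MIN_WINDOW = 2.0  # assumed floor; keeps log(window) strictly positive


def resize_window(window: float, chunk_time: float, average_time: float,
                  first_chunk_retried: bool) -> float:
    """Return the new chunk-window size after one chunk event."""
    window = max(window, MIN_WINDOW)
    if first_chunk_retried:
        # Congestion signal: back off by one chunk, but not below the floor.
        return max(MIN_WINDOW, window - 1.0)
    if chunk_time < average_time:
        # Faster than average: grow slowly, and more slowly as the window gets larger.
        return window + 1.0 / math.log(window)
    return window


if __name__ == "__main__":
    w = 4.0
    w = resize_window(w, chunk_time=1.0, average_time=2.0, first_chunk_retried=False)
    print(w)
```

The asymmetry is the point of the design as stated on the slide: increases are fractional and shrink as the window grows, while a decrease removes a whole chunk at once.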

