15-441: Computer Networking
Lecture 23: HTTP
Lecture 24: 11-29-01

Overview
• HTTP Basics
• HTTP Fixes
• Web Caches
• Content Distribution Networks

HTTP Basics
• HTTP layered over bidirectional byte stream
  • Almost always TCP
• Interaction
  • Client sends request to server, followed by response from server to client
  • Requests/responses are encoded in text
• How to mark end of message?
  • Size of message: Content-Length
    • Must know size of transfer in advance
  • Delimiter: MIME style Content-Type
    • Server must “byte-stuff”
  • Close connection
    • Only server can do this

HTTP Request
• Request line
  • Method
    • GET – return URI
    • HEAD – return headers only of GET response
    • POST – send data to the server (forms, etc.)
  • URI
    • E.g. http://www.seshan.org/index.html with a proxy
    • E.g. /index.html if no proxy
  • HTTP version

HTTP Request
• Request headers
  • Authorization – authentication info
  • Acceptable document types/encodings
  • From – user email
  • If-Modified-Since
  • Referer – what caused this page to be requested
  • User-Agent – client software
• Blank-line
• Body

HTTP Request Example
    GET / HTTP/1.1
    Accept: */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
    Host: www.seshan.org
    Connection: Keep-Alive

HTTP Response
• Status-line
  • HTTP version
  • 3 digit response code
    • 1XX – informational
    • 2XX – success
    • 3XX – redirection
    • 4XX – client error
    • 5XX – server error
  • Reason phrase

HTTP Response
• Headers
  • Location – for redirection
  • Server – server software
  • WWW-Authenticate – request for authentication
  • Allow – list of methods supported (GET, HEAD, etc.)
  • Content-Encoding – e.g. x-gzip
  • Content-Length
  • Content-Type
  • Expires
  • Last-Modified
• Blank-line
• Body

HTTP Response Example
    HTTP/1.1 200 OK
    Date: Tue, 27 Mar 2001 03:49:38 GMT
    Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24
    Last-Modified: Mon, 29 Jan 2001 17:54:18 GMT
    ETag: "7a11f-10ed-3a75ae4a"
    Accept-Ranges: bytes
    Content-Length: 4333
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Content-Type: text/html
    …

Typical Workload
• Multiple (typically small) objects per page
• Request sizes
  • In one measurement paper median 1946 bytes, mean 13767 bytes
  • Why such a difference? Heavy-tailed distribution
    • Pareto – $p(x) = a k^{a} x^{-(a+1)}$
• File sizes
  • Why different than request sizes?
  • Also heavy-tailed
    • Pareto distribution for tail
    • Lognormal for body of distribution

Typical Workload
• Popularity
  • Zipf distribution ($P = k r^{-1}$)
  • Surprisingly common
• Embedded references
  • Number of embedded objects = Pareto
• Temporal locality
  • Modeled as distance into push-down stack
  • Lognormal distribution of stack distances
• Request interarrival
  • Bursty request patterns

HTTP Caching
• Clients often cache documents
  • Challenge: update of documents
  • If-Modified-Since requests to check
    • HTTP 0.9/1.0 used just date
    • HTTP 1.1 has file signature as well
• When/how often should the original be checked for changes?
  • Check every time?
  • Check each session? Day? Etc?
  • Use Expires header
  • If no Expires, often use Last-Modified as estimate

Example Cache Check Request
    GET / HTTP/1.1
    Accept: */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    If-Modified-Since: Mon, 29 Jan 2001 17:54:18 GMT
    If-None-Match: "7a11f-10ed-3a75ae4a"
    User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
    Host: www.seshan.org
    Connection: Keep-Alive

Example Cache Check Response
    HTTP/1.1 304 Not Modified
    Date: Tue, 27 Mar 2001 03:50:51 GMT
    Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24
    Connection: Keep-Alive
    Keep-Alive: timeout=15, max=100
    ETag: "7a11f-10ed-3a75ae4a"

HTTP 0.9/1.0
• One request/response per TCP connection
  • Simple to implement
• Disadvantages
  • Multiple connection setups: three-way handshake each time
    • Several extra round trips added to transfer
  • Multiple slow starts

Single Transfer Example
[Figure: client/server message timeline from 0 RTT to 4 RTT, exchanging SYN, ACK, DAT, and FIN segments. Event labels: server reads from disk (twice); client opens TCP connection; client sends HTTP request for HTML; client parses HTML; client opens TCP connection; client sends HTTP request for image; image begins to arrive.]

More Problems
• Short transfers are hard on TCP
  • Stuck in slow start
  • Loss recovery is poor when windows are small
• Lots of extra connections
  • Increases server state/processing
• Server also forced to keep TIME_WAIT connection state
  • Why must server keep these?
  • Tends to be an order of magnitude greater than # of active connections, why?

Overview
• HTTP Basics
• HTTP Fixes
• Web Caches
• Content Distribution Networks

Netscape Solution
• Use multiple concurrent connections to improve response time
  • Different parts of Web page arrive independently
  • Can grab more of the network bandwidth than other users
• Doesn’t necessarily improve response time
  • TCP loss recovery ends up being timeout dominated because windows are small

Persistent Connection Solution
• Multiplex multiple transfers onto one TCP connection
• Serialize transfers: client makes next request only after previous response
• How to demultiplex requests/responses
  • Content-length and delimiter
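
The demultiplexing question above can be illustrated with a small sketch. It assumes every response on the persistent connection carries a Content-Length header (no delimiter-based or close-delimited bodies), and it works on an in-memory byte string rather than a real socket. The function name `split_responses` and the sample bytes are hypothetical, chosen only for illustration.

```python
def split_responses(stream: bytes) -> list[tuple[str, dict[str, str], bytes]]:
    """Split back-to-back HTTP responses using Content-Length framing.

    Returns a list of (status_line, headers, body) tuples.
    """
    responses = []
    pos = 0
    while pos < len(stream):
        # The header section ends at the first blank line (CRLF CRLF).
        header_end = stream.index(b"\r\n\r\n", pos)
        head = stream[pos:header_end].decode("iso-8859-1")
        status_line, *header_lines = head.split("\r\n")

        headers = {}
        for line in header_lines:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()

        # Assumption: Content-Length is always present. Real responses may
        # omit it (for example 304 Not Modified), which this sketch ignores.
        length = int(headers["content-length"])
        body_start = header_end + 4
        body = stream[body_start:body_start + length]
        if len(body) < length:
            raise ValueError("stream ended before the full body arrived")

        responses.append((status_line, headers, body))
        # The next response starts immediately after this body.
        pos = body_start + length
    return responses


if __name__ == "__main__":
    # Two hypothetical responses sent back to back on one connection.
    sample = (
        b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
        b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nabc"
    )
    for status, _headers, body in split_responses(sample):
        print(status, body)
```

Because the length is known before the body is read, the receiver can tell where one response ends and the next begins without closing the connection, which is what makes serialized transfers on a single TCP connection workable.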