15-441 Computer Networking
The Web

Web history
• 1945: Vannevar Bush, “As we may think”, Atlantic Monthly, July 1945.
  • describes the idea of a distributed hypertext system.
  • a “memex” that mimics the “web of trails” in our minds.
• 1989: Tim Berners-Lee (CERN) writes an internal proposal to develop a distributed hypertext system.
  • connects “a web of notes with links”.
  • intended to help CERN physicists in large projects share and manage information.
• 1990: Tim Berners-Lee writes a graphical browser for NeXT machines.

Web history (cont)
• 1992
  • NCSA server released
  • 26 WWW servers worldwide
• 1993
  • Marc Andreessen releases the first version of NCSA Mosaic
  • Mosaic versions released for Windows, Mac, and Unix
  • Web (port 80) traffic at 1% of NSFNET backbone traffic
  • Over 200 WWW servers worldwide
• 1994
  • Andreessen and colleagues leave NCSA to form "Mosaic Communications Corp" (Netscape)

Design the Web
• How would a computer scientist do it?
• What are the important considerations?
• What is NOT important?
• What should be the basic architecture?
• What are the components?
• What are the interfaces of components?

Basic Concepts
• client/server model
  • client: browser that requests, receives, “displays” Web objects
  • server: Web server sends objects in response to requests
• HTTP: Web’s application layer protocol
  • HTTP 1.0: RFC 1945
  • HTTP 1.1: RFC 2068
[Figure: a PC running Explorer and a Mac running Navigator each send an HTTP request to a server running the Apache Web server, which returns an HTTP response to each.]

Basic Concepts
• Web page consists of objects
• Web page consists of a base HTML file which includes several referenced objects
• Object can be an HTML file, JPEG image, Java applet, audio file, …
• Each page or object is addressable by a URL

Overview of Concepts in This Lecture
• HTTP
• Interaction between HTTP and TCP
• Persistent HTTP
• Caching
• Content Distribution Network (CDN)
• State
  • What is a stateless protocol? Advantages and disadvantages?
  • What types of state are used in the Web?
  • Issues of maintaining state

HTTP Basics
• HTTP layered over bidirectional byte stream
  • Almost always TCP
• Interaction
  • Client sends request to server, followed by response from server to client
  • Requests/responses are encoded in text
• Stateless
  • Server maintains no information about past client requests

HTTP Request Example
  GET / HTTP/1.1
  Accept: */*
  Accept-Language: en-us
  Accept-Encoding: gzip, deflate
  User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
  Host: www.intel-iris.net
  Connection: Keep-Alive

HTTP Response Example
  HTTP/1.1 200 OK
  Date: Tue, 27 Mar 2001 03:49:38 GMT
  Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24
  Last-Modified: Mon, 29 Jan 2001 17:54:18 GMT
  ETag: "7a11f-10ed-3a75ae4a"
  Accept-Ranges: bytes
  Content-Length: 4333
  Keep-Alive: timeout=15, max=100
  Connection: Keep-Alive
  Content-Type: text/html
  …..

HTTP Request
• Request line
  • Method
    • GET – return the object named by the URI
    • HEAD – return only the headers of the GET response
    • POST – send data to the server (forms, etc.)
  • URL (relative)
    • E.g., /index.html
  • HTTP version

HTTP Request (cont.)
• Request headers
  • Authorization – authentication info
  • Accept – acceptable document types/encodings
  • From – user email
  • If-Modified-Since
  • Referer – what caused this page to be requested
  • User-Agent – client software
• Blank line
• Body

HTTP Response
• Status line
  • HTTP version
  • 3-digit response code
    • 1XX – informational
    • 2XX – success
      • 200 OK
    • 3XX – redirection
      • 301 Moved Permanently
      • 302 Moved Temporarily
      • 304 Not Modified
    • 4XX – client error
      • 404 Not Found
    • 5XX – server error
      • 505 HTTP Version Not Supported
  • Reason phrase

HTTP Response (cont.)
• Headers
  • Location – for redirection
  • Server – server software
  • WWW-Authenticate – request for authentication
  • Allow – list of methods supported (GET, HEAD, etc.)
  • Content-Encoding – e.g., x-gzip
  • Content-Length
  • Content-Type
  • Expires
  • Last-Modified
• Blank line
• Body

How to Mark End of Message?
• Size of message → Content-Length
  • Implications:
    • must know size of transfer in advance
    • What applications are not appropriate?
• Close connection
  • Only server can do this

Cookies: Keeping “State” (Cont.)
[Figure: message exchange between a client and the Amazon server.
  • The client's cookie file initially holds only "ebay: 8734".
  • The client sends a usual HTTP request; the server creates ID 1678 for the user and an entry in its backend database.
  • The server replies with a usual HTTP response plus "Set-cookie: 1678"; the client's cookie file now holds "amazon: 1678" and "ebay: 8734".
  • The next request carries "cookie: 1678"; the server accesses the backend database, takes a cookie-specific action, and sends a usual HTTP response.
  • One week later: a request again carries "cookie: 1678", and the server again accesses the database and takes a cookie-specific action.]

Cookies: Keeping “state”
Many major Web sites use cookies
Four components:
1) Cookie header line (Set-cookie) in the HTTP response message
2) Cookie header line in the HTTP request message
3) Cookie file kept on the user’s host and managed by the user’s browser
4) Back-end database at the Web site
Example:
• Susan always accesses the Internet from the same PC
• She visits a specific e-commerce site for the first time
• When the initial HTTP request arrives at the site, the site creates a unique ID and creates an entry in its backend database for that ID

Outline
• Web intro, HTTP
• Persistent HTTP
• HTTP caching
• Content distribution networks

Typical Workload (Web Pages)
• Multiple (typically small) objects per page
• File sizes
  • Heavy-tailed
  • Pareto distribution for tail
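
The "HTTP Response" and "How to Mark End of Message?" slides can be made concrete with a small sketch. The function below is illustrative only, not part of the lecture: the name parse_http_response and the Response tuple are hypothetical, and it assumes the whole message is already in memory as bytes. It splits the status line, reads headers up to the blank line, and then uses Content-Length to find where the body ends; when no length is given it falls back to treating everything that follows as the body, which corresponds to the "close connection" option.

```python
from typing import Dict, NamedTuple


class Response(NamedTuple):
    version: str
    code: int
    reason: str
    headers: Dict[str, str]
    body: bytes
    leftover: bytes  # bytes after this message (e.g. the next response)


def parse_http_response(raw: bytes) -> Response:
    """Parse one HTTP response from raw bytes (hypothetical helper)."""
    # The blank line (CRLF CRLF) separates the headers from the body.
    head, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete message: no blank line after headers")

    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: HTTP version, 3-digit response code, reason phrase.
    parts = lines[0].split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    # Header names are case-insensitive, so store them lowercased.
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    if "content-length" in headers:
        # Size of message is known in advance: take exactly that many bytes.
        length = int(headers["content-length"])
        if len(rest) < length:
            raise ValueError("incomplete message: body shorter than Content-Length")
        body, leftover = rest[:length], rest[length:]
    else:
        # No length given: the body runs until the server closes the connection.
        body, leftover = rest, b""

    return Response(version, code, reason, headers, body, leftover)
```

With Content-Length, two responses can share one connection, because the parser knows where the first one stops and can hand the remaining bytes back to the caller. Without it, nothing but the connection closing marks the end, which is why that approach rules out reusing the connection.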
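
The four cookie components listed on the "Cookies: Keeping “state”" slide can be sketched as a small in-memory simulation. Everything here is an assumption made for illustration: the class names Site and Browser are hypothetical, headers are modelled as plain dictionaries, no real network traffic is involved, and the cookie is a bare ID as on the slide (real cookies are name=value pairs with attributes). The starting ID 1678 and the existing "ebay: 8734" entry are taken from the slide's figure.

```python
import itertools


class Site:
    """Web site with a back-end database keyed by cookie ID (hypothetical)."""

    def __init__(self, first_id: int = 1678):
        self._ids = itertools.count(first_id)
        self.backend_db = {}  # cookie ID -> per-user state

    def handle(self, request_headers: dict) -> dict:
        cookie = request_headers.get("Cookie")
        response_headers = {}
        if cookie is None or cookie not in self.backend_db:
            # First visit: create a unique ID and a database entry for it,
            # and tell the browser about it in the response.
            cookie = str(next(self._ids))
            self.backend_db[cookie] = {"visits": 0}
            response_headers["Set-Cookie"] = cookie
        # Cookie-specific action: here, just count this user's requests.
        self.backend_db[cookie]["visits"] += 1
        return response_headers


class Browser:
    """Browser that manages the cookie file on the user's host (hypothetical)."""

    def __init__(self):
        self.cookie_file = {}  # site name -> cookie value

    def visit(self, name: str, site: Site) -> dict:
        request_headers = {}
        if name in self.cookie_file:
            # Later requests carry the cookie back to the site.
            request_headers["Cookie"] = self.cookie_file[name]
        response_headers = site.handle(request_headers)
        if "Set-Cookie" in response_headers:
            self.cookie_file[name] = response_headers["Set-Cookie"]
        return response_headers


# Usage: Susan's browser already holds an eBay cookie, then visits Amazon twice.
browser = Browser()
browser.cookie_file["ebay"] = "8734"
amazon = Site()
first = browser.visit("amazon", amazon)   # response carries Set-Cookie
second = browser.visit("amazon", amazon)  # request carries Cookie
```

HTTP itself stays stateless in this sketch: the server recognises a returning user only because the browser resends the ID, and all per-user state lives in the site's database rather than in the protocol.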