Carnegie Mellon

Web Services
15-213: Introduction to Computer Systems
21st Lecture, Nov. 4, 2010
Instructors: Randy Bryant and Dave O'Hallaron

Web History
"Consider a future device for individual use, which is a sort of mechanized private file and library. It needs a name, and to coin one at random, 'memex' will do. A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory."
- 1945: Vannevar Bush, "As we may think", Atlantic Monthly, July 1945
  - Describes the idea of a distributed hypertext system
  - A "memex" that mimics the "web of trails" in our minds

Web History
- 1989: Tim Berners-Lee (CERN) writes internal proposal to develop a distributed hypertext system
  - Connects "a web of notes with links"
  - Intended to help CERN physicists in large projects share and manage information
- 1990: Tim BL writes a graphical browser for Next machines

Web History (cont.)
- 1992:
  - NCSA server released
  - 26 WWW servers worldwide
- 1993:
  - Marc Andreessen releases first version of NCSA Mosaic browser
  - Mosaic version released for Windows, Mac, and Unix
  - Web (port 80) traffic at 1% of NSFNET backbone traffic
  - Over 200 WWW servers worldwide
- 1994: Andreessen and colleagues leave NCSA to form "Mosaic Communications Corp" (predecessor to Netscape)

Internet Hosts
- How many of the 2^32 IP addresses have registered domain names?

Web Servers
- Clients and servers communicate using the HyperText Transfer Protocol (HTTP)
  - Client and server establish TCP connection
  - Client requests content
  - Server responds with requested content
  - Client and server close connection (eventually)
- Current version is HTTP/1.1
  - RFC 2616, June 1999
  - http://www.w3.org/Protocols/rfc2616/rfc2616.html
[Figure: the Web client (browser) sends an HTTP request to the Web server, which returns an HTTP response (content). Protocol layers: HTTP carries Web content, TCP carries streams, IP carries datagrams.]

Web Content
- Web servers return
  content to clients
  - content: a sequence of bytes with an associated MIME (Multipurpose Internet Mail Extensions) type
- Example MIME types:
    text/html                HTML document
    text/plain               Unformatted text
    application/postscript   PostScript document
    image/gif                Binary image encoded in GIF format
    image/jpeg               Binary image encoded in JPEG format

Static and Dynamic Content
- The content returned in HTTP responses can be either static or dynamic
  - Static content: content stored in files and retrieved in response to an HTTP request
    - Examples: HTML files, images, audio clips
    - Request identifies content file
  - Dynamic content: content produced on the fly in response to an HTTP request
    - Example: content produced by a program executed by the server on behalf of the client
    - Request identifies file containing executable code
- Bottom line: all Web content is associated with a file that is managed by the server

URLs
- Each file managed by a server has a unique name called a URL (Uniform Resource Locator)
- URLs for static content:
    http://www.cs.cmu.edu:80/index.html
    http://www.cs.cmu.edu/index.html
    http://www.cs.cmu.edu
  - Each identifies a file called index.html, managed by a Web server at www.cs.cmu.edu that is listening on port 80
- URLs for dynamic content:
    http://www.cs.cmu.edu:8000/cgi-bin/proc?15000&213
  - Identifies an executable file called proc, managed by a Web server at www.cs.cmu.edu that is listening on port 8000, that should be called with two argument strings: 15000 and 213

How Clients and Servers Use URLs
- Example URL: http://www.cmu.edu:80/index.html
- Clients use the prefix (http://www.cmu.edu:80) to infer:
  - What kind of server to contact (Web server)
  - Where the server is (www.cmu.edu)
  - What port it is listening on (80)
- Servers use the suffix (/index.html) to:
  - Determine if the request is for static or dynamic content
    - No hard and fast rules for this
    - Convention: executables reside in the cgi-bin directory
  - Find the file on the file system
    - Initial "/" in suffix denotes home directory for requested content
    - Minimal suffix is "/", which all
      servers expand to some default home page (e.g., index.html)

Anatomy of an HTTP Transaction
    unix> telnet www.cmu.edu 80                  Client: open connection to server
    Trying 128.2.10.162...                       Telnet prints 3 lines to the terminal
    Connected to www.cmu.edu.
    Escape character is '^]'.
    GET / HTTP/1.1                               Client: request line
    host: www.cmu.edu                            Client: required HTTP/1.1 HOST header
                                                 Client: empty line terminates headers
    HTTP/1.1 301 Moved Permanently               Server: response line
    Location: http://www.cmu.edu/index.shtml     Client should try again
    Connection closed by foreign host.           Server: closes connection
    unix>                                        Client: closes connection and terminates

Anatomy of an HTTP Transaction, Take 2
    unix> telnet www.cmu.edu 80                  Client: open connection to server
    Trying 128.2.10.162...                       Telnet prints 3 lines to the terminal
    Connected to www.cmu.edu.
    Escape character is '^]'.
    GET /index.shtml HTTP/1.1                    Client: request line
    host: www.cmu.edu                            Client: required HTTP/1.1 HOST header
                                                 Client: empty line terminates headers
    HTTP/1.1 200 OK                              Server: responds with web page
    Date: Fri, 29 Oct 2010 19:41:08 GMT
    Server: Apache/1.3.39 (Unix) mod_pubcookie/3.3.3
    Transfer-Encoding: chunked
    Content-Type: text/html
    ...                                          Lots of stuff
    Connection closed by foreign host.           Server: closes connection
    unix>                                        Client: closes connection and terminates

HTTP Requests
- An HTTP request is a request line, followed by zero or more request headers
- Request line: <method> <uri> <version>
  - <version> is the HTTP version of the request (HTTP/1.0 or HTTP/1.1)
  - <uri> is typically the URL for proxies, the URL suffix for servers
    - A URL is a type of URI (Uniform Resource Identifier)
    - See http://www.ietf.org/rfc/rfc2396.txt
  - <method> is either GET, POST, OPTIONS, HEAD, PUT, DELETE, or TRACE

HTTP Requests (cont.)
- HTTP methods:
  - GET: Retrieve static or dynamic content
    - Arguments for dynamic content are in the URI
    - Workhorse method (99% of requests)
  - POST: Retrieve dynamic content
    - Arguments for dynamic content are in the request body
  - OPTIONS: Get server or file attributes
  - HEAD: Like GET, but no data in response body
  - PUT: Write a
    file to the server
  - DELETE: Delete a file on the server
  - TRACE: Echo request in response body
    - Useful for debugging
- Request headers: <header name>: <header data>
  - Provide additional information to the server

HTTP Versions
- Major differences between HTTP/1.1 and HTTP/1.0:
  - HTTP/1.0 uses a new connection for each transaction
  - HTTP/1.1 also supports persistent connections
    - Multiple transactions over the same connection
    - Connection: Keep-Alive
  - HTTP/1.1 requires HOST header
    - Host: www.cmu.edu
    - Makes it
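
The request format described in these slides (a request line of method, URI, and version, then zero or more "name: data" headers, then an empty line) can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the lecture: the function names `build_get` and `parse_request` are hypothetical, and the parser ignores many details that RFC 2616 covers.

```python
def build_get(host, uri="/"):
    """Build a minimal HTTP/1.1 GET request (hypothetical helper).

    The request line is "<method> <uri> <version>", followed by the
    Host header that HTTP/1.1 requires, and an empty line that
    terminates the headers.
    """
    return f"GET {uri} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")


def parse_request(raw):
    """Split a raw request into (method, uri, version, headers).

    Simplified sketch: assumes one space between request-line fields
    and one header per line; any request body is ignored.
    """
    # Everything before the first empty line is the request line plus headers.
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    method, uri, version = lines[0].split(" ")

    headers = {}
    for line in lines[1:]:
        name, _, data = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = data.strip()

    # HTTP/1.1 requires the Host header; HTTP/1.0 does not.
    if version == "HTTP/1.1" and "host" not in headers:
        raise ValueError("HTTP/1.1 request is missing the Host header")

    return method, uri, version, headers


if __name__ == "__main__":
    request = build_get("www.cmu.edu", "/index.shtml")
    print(parse_request(request))
```

The bytes returned by `build_get("www.cmu.edu", "/index.shtml")` correspond to what is typed into telnet in the "Take 2" transaction: the request line, the host header, and the empty line.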