The World Wide Web

  EE 122: Intro to Communication Networks
  Fall 2007 (WF 4-5:30 in Cory 277)
  Lisa Fowler, Vern Paxson
  TAs: Lisa Fowler, Daniel Killebrew, Jorge Ortiz
  http://inst.eecs.berkeley.edu/~ee122/
  Materials with thanks to Vern Paxson, Jennifer Rexford, Ion Stoica,
  and colleagues at Princeton and UC Berkeley

Announcements

  - Project 1, Milestone 1
      - Due 11pm tonight
      - No slip days
  - Homework 2
      - Out now
      - Hard deadline of 11PM, Tues Oct 9
  - First Midterm coming soon
      - Fri Oct 12

Goals of Today's Lecture

  - Main ingredients of the Web
      - URIs, HTML, HTTP
  - Key properties of HTTP
      - Request-response, stateless, and resource meta-data
  - Performance of HTTP
      - Parallel connections, persistent connections, pipelining
  - Web components
      - Clients, proxies, and servers
      - Caching vs. replication

The World Wide Web

Web Components

  - Content
      - Objects
  - Clients
      - Send requests
      - Receive responses
  - Servers
      - Receive requests
      - Send responses
      - Store or generate the responses
  - Proxies
      - Placed between clients and servers
      - Act as a server for the client, and a client to the server
      - Provide extra functions
          - Caching, anonymization, logging, transcoding, filtering access
      - Explicit or transparent ("interception")

HTML (Content)

  - A Web page has several components
      - Base HTML file
      - Referenced objects (e.g., images)
  - HyperText Markup Language (HTML)
      - Representation of hypertext documents in ASCII format
      - Web browsers interpret HTML when rendering a page
      - Several functions:
          - Format text, reference images, embed hyperlinks (HREF)
  - Straightforward to learn
      - Syntax easy to understand
      - Authoring programs can auto-generate HTML
      - Source almost always available

URI

  - Uniform Resource Identifier (URI)
  - Uniform Resource Locator (URL)
      - Provides a means to get the resource
      - http://www.ietf.org/rfc/rfc3986.txt
  - Uniform Resource Name (URN)
      - Names a resource independent of how to get it
      - urn:ietf:rfc:3986 is a standard URN for RFC 3986

URL Syntax

  protocol://hostname[:port]/directorypath/resource

  - protocol: http, ftp, https, smtp, rtsp, etc.
  - hostname: FQDN, IP address
  - port: Defaults to protocol's standard port
      - e.g., http: 80/tcp, https: 443/tcp
  - directory path: Hierarchical, often reflecting file system
  - resource: Identifies the desired resource
  - Can also extend to program executions:
      http://us.f413.mail.yahoo.com/ym/ShowLetter?box=%40B%40Bulk&MsgId=2604_1744106_29699_1123_1261_0_289_17_3552_1289957100&Search=&Nhead=f&YY=31454&order=down&sort=date&pos=0&view=a&head=b

HTTP

  - HyperText Transfer Protocol (HTTP)
      - Client-server protocol for transferring resources
  - Important properties
      - Request-response protocol
      - Reliance on a global URI namespace
      - Resource metadata
      - Stateless
      - ASCII format

      % telnet www.icir.org 80
      GET /vern/ HTTP/1.0
      <blank line, i.e., CRLF>

Client-to-Server Communication

  - HTTP Request Message
      - Request line: method, resource, and protocol version
      - Request headers: provide information or modify request
      - Body: optional data (e.g., to "POST" data to the server)
  - Request methods include:
      - GET: Return current value of resource, run program, ...
      - HEAD: Return the meta-data associated with a resource
      - POST: Update resource, provide input to a program, ...
  - Headers include:
      - Useful info for the server (e.g., desired language)

  Example:

      GET /somedir/page.html HTTP/1.1      <- request line
      Host: www.someschool.edu             <- header lines
      User-agent: Mozilla/4.0
      Connection: close
      Accept-language: fr
                                           <- blank line

  The blank line is not optional: a carriage return, line feed
  indicates the end of the message.

Server-to-Client Communication

  - HTTP Response Message
      - Status line: protocol version, status code, status phrase
      - Response headers: provide information
      - Body: optional data

  Example:

      HTTP/1.1 200 OK                      <- status line (protocol,
      Connection: close                       status code, status phrase)
      Date: Thu, 06 Aug 2006 12:00:15 GMT  <- header lines
      Server: Apache/1.3.0 (Unix)
      Last-Modified: Mon, 22 Jun 2006 ...
      Content-Length: 6821
      Content-Type: text/html
                                           <- blank line
      data data data data data ...         <- data, e.g., requested HTML file

  - Response code classes
      - Similar to other ASCII app protocols like SMTP

      Code   Class           Example
      1xx    Informational   100 Continue
      2xx    Success         200 OK
      3xx    Redirection     304 Not Modified
      4xx    Client error    404 Not Found
      5xx    Server error    503 Service Unavailable

Web Server: Generating a Response

  - Return a file
      - URL matches a file (e.g., /www/index.html)
      - Server returns file as the response
      - Server generates appropriate response header
  - Generate response dynamically
      - URL triggers a program on the server
      - Server runs program and sends output to client
  - Return meta-data with no body

HTTP Resource Meta-Data

  - Meta-data
      - Info about a resource
      - A separate entity
  - Examples:
      - Size of a resource
      - Last modification time
      - Type of the content
  - Data format classification (e.g., Content-Type: text/html)
      - Enables browser to automatically launch an appropriate viewer
      - Borrowed from e-mail's Multipurpose Internet Mail Extensions (MIME)
  - Usage example: Conditional GET Request
      - Client requests object "If-modified-since"
      - If object hasn't changed, server returns "HTTP/1.1 304 Not Modified"
      - No body in the server's response, only a header

HTTP is Stateless

  - Stateless protocol
      - Each request-response exchange treated independently
      - Servers not required to retain state
  - This is good
      - Improves scalability on the server side
          - Don't have to retain info across requests
          - Can handle higher rate of requests
          - Order of requests doesn't matter
  - This is bad
      - Some applications need persistent state
      - Need to uniquely identify user or store temporary info
      - e.g., shopping cart, user preferences and profiles, usage tracking, ...

State in a Stateless Protocol: Cookies

  - Client-side state maintenance
      - Client stores small state on behalf of server
      - Client sends state in future requests to the server
  - Can provide authentication

      Client                               Server
        Request                      ->
                                     <-    Response, Set-Cookie: XYZ
        Request, Cookie: XYZ         ->

State in a Stateless Protocol: HTTP Authentication

  - Tool to limit access to server documents
  - Basic HTTP Authentication
      - Client can add an Authorization header to GET request
          - Base64-encoded concatenation of username, a colon, password
      - If client doesn't provide header, server responds with a
        401 Unauthorized and a WWW-Authenticate header
      - Server does not honor request until valid authorization received
      - Stateless: must happen on each request
  - Is this secure? Is this security?
      - No! Authentication is not security, but provides a piece

Security Sneak Peek
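
Worked example (URL syntax). To make the protocol / hostname / port / directory path / resource breakdown from the URL Syntax slide concrete, here is a minimal Python sketch built on the standard library's urllib.parse. The helper name split_url and the DEFAULT_PORTS table are illustrative assumptions, not part of the lecture; only the http and https defaults quoted on the slide are filled in.

```python
from urllib.parse import urlsplit, parse_qs

# Default ports quoted on the slide (http: 80/tcp, https: 443/tcp).
# Hypothetical table for illustration; other schemes are left out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_url(url):
    """Break a URL into the components named on the URL Syntax slide."""
    parts = urlsplit(url)
    # Fall back to the protocol's standard port when none is given.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    # Everything up to the last "/" is the directory path; the rest is the resource.
    directory, _, resource = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname,
        "port": port,
        "directory_path": directory + "/",
        "resource": resource,
        # Query parameters are how a URL "extends to program executions".
        "query": parse_qs(parts.query, keep_blank_values=True),
    }


if __name__ == "__main__":
    print(split_url("http://www.ietf.org/rfc/rfc3986.txt"))
```

The query dictionary is where the arguments of a program-execution URL (such as the long webmail link on the slide) end up.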
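
Worked example (HTTP message format). The request and response layouts from the Client-to-Server and Server-to-Client slides can be sketched in a few lines of Python without touching the network. The function names build_request, parse_response, and status_class are hypothetical helpers for illustration. The parser assumes a well-formed response with a status phrase present, and it ignores details such as repeated headers.

```python
def build_request(method, resource, host, headers=None, body=b""):
    """Assemble request line, header lines, blank line, and optional body."""
    lines = [f"{method} {resource} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Each line ends in CRLF; the extra CRLF is the mandatory blank line.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_response(raw):
    """Split a raw response into (version, code, phrase, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, phrase = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body


def status_class(code):
    """Map a status code to its class from the response code table."""
    classes = {
        1: "Informational",
        2: "Success",
        3: "Redirection",
        4: "Client error",
        5: "Server error",
    }
    return classes.get(code // 100, "Unknown")


if __name__ == "__main__":
    print(build_request("GET", "/somedir/page.html", "www.someschool.edu",
                        {"Connection": "close", "Accept-language": "fr"}))
```

Header names are lower-cased on parsing because HTTP header names are case-insensitive.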
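
Worked example (conditional GET). The "If-modified-since" exchange from the HTTP Resource Meta-Data slide can be sketched from the server's point of view as follows. The function conditional_get is a hypothetical helper, not a real server API. It assumes the resource's modification time is a timezone-aware UTC datetime, and it treats a missing or unparseable If-Modified-Since header as "send the full object".

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(request_headers, last_modified, body):
    """Return (status_code, response_headers, body) for a GET request.

    last_modified must be an aware datetime in UTC (an assumption of this sketch).
    """
    response_headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            since_dt = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_dt = None  # unparseable date: fall through to a full response
        # HTTP dates have one-second resolution, so drop microseconds first.
        if (since_dt is not None and since_dt.tzinfo is not None
                and last_modified.replace(microsecond=0) <= since_dt):
            # Object unchanged: header only, no body.
            return 304, response_headers, b""
    response_headers["Content-Length"] = str(len(body))
    return 200, response_headers, body


if __name__ == "__main__":
    modified = datetime(2006, 6, 22, 12, 0, 0, tzinfo=timezone.utc)
    status, headers, _ = conditional_get({}, modified, b"<html></html>")
    print(status, headers)
    print(conditional_get({"If-Modified-Since": headers["Last-Modified"]},
                          modified, b"<html></html>")[0])
```

A client (or proxy cache) would echo the Last-Modified value it received earlier back in If-Modified-Since, which is what the second call in the usage lines imitates.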
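
Worked example (Basic HTTP Authentication). The Authorization header described on the HTTP Authentication slide is just base64("username:password"), which is also why the answer to "Is this secure?" is no: anyone who sees the header can decode it. The sketch below shows both sides. The names basic_authorization and check_basic, the realm string, and the in-memory user table are illustrative assumptions; a real server would store password hashes and use a constant-time comparison.

```python
import base64


def basic_authorization(username, password):
    """Client side: build the value of the Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def check_basic(headers, valid_users, realm="example"):
    """Server side: return (status_code, extra_response_headers).

    valid_users is a hypothetical {username: password} table.
    """
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() == "basic":
        try:
            decoded = base64.b64decode(token, validate=True).decode("utf-8")
        except ValueError:  # covers bad base64 and bad UTF-8
            decoded = ""
        username, sep, password = decoded.partition(":")
        if sep and valid_users.get(username) == password:
            return 200, {}
    # No header or bad credentials: challenge the client.
    return 401, {"WWW-Authenticate": f'Basic realm="{realm}"'}


if __name__ == "__main__":
    users = {"Aladdin": "open sesame"}
    header = basic_authorization("Aladdin", "open sesame")
    print(header)
    print(check_basic({"Authorization": header}, users))
    print(check_basic({}, users))
```

Because HTTP is stateless, the client has to resend this header on every request; the server keeps no memory of an earlier successful check.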