CS162 Operating Systems and Systems Programming
Lecture 23: HTTP and Peer-to-Peer Networks
April 20, 2011
Ion Stoica
http://inst.eecs.berkeley.edu/~cs162

Recap: RPC Server Crashes
- Three cases:
    - Crash after execution
    - Crash before execution
    - Crash during the execution
- Three possible semantics:
    - At-least-once semantics: client keeps trying until it gets a reply
    - At-most-once semantics: client gives up on failure
    - Exactly-once semantics: can this be correctly implemented?

4/20 Ion Stoica CS162 UCB Spring 2011 Lec 23.2

Why Not Use Logging?
- Assume:
    - Server can log either before starting or after executing the operation
    - Server restarts after crashing
- First case: server executes the operation first, then logs "done"
    - What semantics does this implement?
- Second case: server logs "start" and then executes the operation
    - What semantics does this implement?
- So, can you ensure exactly-once semantics?

Today's Lecture
- Web (Hypertext Transfer Protocol)
- Peer-to-Peer networks
- Distributed Hash Tables (DHTs)

The Web: Core Components
- Servers: store files and execute remote commands
- Browsers: retrieve and display "pages"
- Uniform Resource Locators (URLs): way to refer to pages
- A protocol to transfer information between clients and servers: HTTP

Uniform Resource Locator (URL)
- protocol://host-name[:port]/directory-path/resource
- E.g., http://www-inst.eecs.berkeley.edu/~cs162/sp11
- Extend to program executions as well:
    - http://www.google.com/#sclient=psy&hl=en&source=hp&q=cs162+berkeley&aq=0&aqi=g5&aql=&oq=&pbx=1&bav=on.2,or.r_gc.r_pw.&fp=1ef120049c3f5a29

Hyper Text Transfer Protocol (HTTP)
- Client-server architecture
- Synchronous request/reply protocol
    - Runs over TCP, port 80
- Stateless
    - Server does not keep state about the client across requests, i.e., after each request the web server forgets about the client
    - Why is this good?

Big Picture
[Figure: message timeline between Client and Server. Establish connection: TCP SYN, TCP SYN+ACK, TCP ACK. Client request and request/response: HTTP GET. Close connection.]

Hyper Text Transfer Protocol Commands
- GET: transfer resource from given URL
- HEAD: get resource metadata (headers) only
- PUT: store/modify resource under given URL
- DELETE: remove resource
- POST: provide input for a process identified by the given URL (usually used to post CGI parameters)

Client Request
- Steps to get the resource http://www-inst.eecs.berkeley.edu/~cs162/sp11
    1. Use DNS to obtain the IP address of www-inst.eecs.berkeley.edu
    2. Send an HTTP request to that IP address, port 80:
           GET /~cs162/sp11 HTTP/1.0

Server Response
    HTTP/1.0 200 OK
    Content-Type: text/html
    Content-Length: 1234
    Last-Modified: Mon, 19 Nov 2010 15:31:20 GMT
    <HTML>
    <HEAD>
    <TITLE>EECS Home Page</TITLE>
    </HEAD>
    ...
    </BODY>
    </HTML>

HTTP/1.0 Example
[Figure: message timeline between Client and Server. Request image 1, transfer image 1; request image 2, transfer image 2; request text, transfer text; finish display page.]

HTTP/1.0 Performance
- Create a new TCP connection for each resource
    - Large number of embedded objects in a web page
    - Many short-lived connections
- TCP transfer
    - Too slow for small objects
    - It takes time to establish a connection and ramp up (i.e., exit the slow-start phase)
- Connections may be set up in parallel (5 is the default in most browsers)

HTTP/1.0 Caching Support
- A modifier to the GET request:
    - If-modified-since: return a "not modified" response if the resource was not modified since the specified time
- A response header:
    - Expires: specify to the client for how long it is safe to cache the resource
- A request directive:
    - No-cache: ignore all caches and get the resource directly from the server
- These features can be best taken advantage of with HTTP proxies
    - Locality of reference increases if many clients
      share a proxy

HTTP/1.1 (1996)
- Performance:
    - Persistent connections
    - Pipelined requests/responses
- Efficient caching support
    - Network cache assumed more explicitly in the design
    - Gives more control to the server on how it wants data cached
- Support for virtual hosting
    - Allows running multiple web servers on the same machine

Persistent Connections
- Allow multiple transfers over one connection
- Avoid multiple TCP connection setups
- Avoid multiple TCP slow starts (i.e., TCP ramp-ups)

Pipelined Requests/Responses
- Buffer requests and responses to reduce the number of packets
- Multiple requests can be contained in one TCP segment
- Note: order of responses has to be maintained
[Figure: message timeline between Client and Server. Request 1, Request 2 and Request 3 are sent back to back; Transfer 1, Transfer 2 and Transfer 3 return in the same order.]

Achieving Scale and Availability
- Problem: you are a web content provider
    - How do you handle millions of web clients?
    - How do you ensure that all clients experience good performance?
    - How do you maintain availability in the presence of server and network failures?
- Solutions:
    - Add more servers at different locations (if you are CNN this might work!)
    - Caching
    - Content Distribution Networks (replication)

Baseline
- Many clients transfer the same information
    - Generates unnecessary server and network load
    - Clients experience unnecessary latency
[Figure: Server connected through a Backbone ISP to ISP-1 and ISP-2, which serve the Clients.]

Reverse Caches
- Cache documents close to the server to decrease server load
- Typically done by content providers
[Figure: Reverse caches sit next to the Server, in front of the Backbone ISP, ISP-1, ISP-2 and the Clients.]

Forward Proxies
- Cache documents close to clients to reduce network traffic and decrease latency
- Typically done by ISPs or corporate LANs
[Figure: Reverse caches next to the Server; Forward caches between the Backbone ISP and ISP-1/ISP-2, close to the Clients.]

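The benefit of sharing a forward proxy can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how any real proxy is built: `ForwardCache`, `CachedEntry` and `fake_origin` are hypothetical names, the origin server is replaced by a plain function, and freshness is modeled as a lifetime in seconds standing in for the Expires response header.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class CachedEntry:
    body: str
    expires_at: float  # time (in seconds) after which the copy is stale


class ForwardCache:
    """Toy forward proxy cache shared by many clients (hypothetical)."""

    def __init__(self, fetch_from_origin: Callable[[str], Tuple[str, float]]):
        # fetch_from_origin(url) -> (body, lifetime_seconds). It stands in
        # for a real HTTP GET plus the server's Expires header.
        self.fetch_from_origin = fetch_from_origin
        self.entries: Dict[str, CachedEntry] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str, now: float) -> str:
        entry = self.entries.get(url)
        if entry is not None and now < entry.expires_at:
            self.hits += 1  # served locally: no server or backbone traffic
            return entry.body
        self.misses += 1  # absent or expired: go back to the origin server
        body, lifetime = self.fetch_from_origin(url)
        self.entries[url] = CachedEntry(body, now + lifetime)
        return body


origin_requests: List[str] = []


def fake_origin(url: str) -> Tuple[str, float]:
    """Assumed origin server: records each request, allows 60 s of caching."""
    origin_requests.append(url)
    return f"<html>{url}</html>", 60.0


cache = ForwardCache(fake_origin)

# 100 clients behind the same proxy ask for the same page within 10 seconds.
for client_id in range(100):
    cache.get("/index.html", now=client_id * 0.1)

# A later request arrives after the cached copy has expired.
cache.get("/index.html", now=120.0)

print(cache.hits, cache.misses, len(origin_requests))
```

In this run the origin is contacted only twice for 101 client requests: once for the first client and once after the copy expires. A real proxy would also revalidate stale copies with If-modified-since rather than always refetching; that step is left out to keep the sketch short.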
Content Distribution Networks (CDNs)
- Integrate forward and reverse caching functionalities into one overlay network (usually) administrated by one entity
    - Example: Akamai
- Documents are cached both
    - As a result of clients' requests (pull)
    - Pushed in the expectation of a high access rate
- Besides caching, CDNs do processing, e.g.,
    - Handle dynamic web pages
    - Transcoding

Example: Akamai
- Akamai creates new domain names for each client content
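
The pull and push modes listed on the CDN slide can be contrasted with a toy edge node. This is a hedged sketch and not Akamai's actual design: `EdgeCache`, `origin_server` and the example paths are hypothetical, and the content provider's server is modeled as a plain function.

```python
from typing import Callable, Dict, List


class EdgeCache:
    """Toy CDN edge node supporting pull and push (hypothetical)."""

    def __init__(self, origin: Callable[[str], str]):
        self.origin = origin  # assumed stand-in for the provider's server
        self.store: Dict[str, str] = {}
        self.origin_fetches: List[str] = []

    def push(self, url: str, body: str) -> None:
        # Push: the provider places the document before any client asks,
        # in the expectation of a high access rate.
        self.store[url] = body

    def get(self, url: str) -> str:
        # Pull: the document is fetched from the origin on the first
        # client request and served from the edge afterwards.
        if url not in self.store:
            self.origin_fetches.append(url)
            self.store[url] = self.origin(url)
        return self.store[url]


def origin_server(url: str) -> str:
    return f"content of {url}"


edge = EdgeCache(origin_server)

# The provider pushes a page it expects to be popular.
edge.push("/breaking-news.html", origin_server("/breaking-news.html"))

# Clients then request both the pushed page and one that was never pushed.
for _ in range(3):
    edge.get("/breaking-news.html")
    edge.get("/archive/2010.html")

print(edge.origin_fetches)
```

Only the page that was not pushed triggers an origin fetch, and only once. Pushing trades storage and transfer up front for zero first-request latency, which pays off only when the expected access rate is high.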