Proxy Lab Recitation I
Monday Nov 20, 2006

Outline
• What is an HTTP proxy?
• HTTP Tutorial
  – HTTP Request
  – HTTP Response
• Sequential vs. concurrent proxies
• Caching

What is a proxy?
• Why a proxy?
  – Access control (allowed websites)
  – Filtering (viruses, for example)
  – Caching (multiple people request CNN)
[Diagram: Client (Browser) ↔ Proxy ↔ Server (www.google.com)]

Brief HTTP Tutorial
• Hyper-Text Transfer Protocol
  – Protocol spoken between a browser and a web-server
• From browser → web-server: REQUEST
  – GET http://www.google.com/ HTTP/1.0
• From web-server → browser: RESPONSE
  – HTTP/1.0 200 OK
  – Other stuff…

HTTP Request
    GET http://csapp.cs.cmu.edu/simple.html HTTP/1.1
    Host: csapp.cs.cmu.edu
    User-Agent: Mozilla/5.0 ...
    Accept: text/xml,application/xml ...
    Accept-Language: en-us,en;q=0.5 ...
    Accept-Encoding: gzip,deflate ...
• First line: Request Type (GET), Path (the URL), Version (HTTP/1.1)
• Second line: Host
• An empty line terminates an HTTP request
• The Host header is optional in HTTP/1.0, but we recommend that it always be included
• The User-Agent header identifies the browser type. Some websites use it to determine what to send
  – And reject you if you say you use MyWeirdBrowser
• Proxy must send this and all other headers through…

HTTP Response
    HTTP/1.1 200 OK
    Date: Mon, 20 Nov 2006 03:34:17 GMT
    Server: Apache/1.3.19 (Unix) …
    Last-Modified: Mon, 28 Nov 2005 23:31:35 GMT
    Content-Length: 129
    Connection: Keep-Alive
    Content-Type: text/html
• Status (first line) indicates whether the request was successful or not, if it is a “redirect”, etc.
• The complete response should be transparently sent back to the client by the proxy
• Content-Length identifies how many bytes there are in the response body
  – Not sent by all web-servers. DO NOT RELY ON IT!

Concurrent Proxy
• Need to handle multiple requests simultaneously
  – From different clients
  – From the same client
    • E.g., each individual image in an HTML document needs to be requested separately
• Serving requests sequentially decreases throughput
  – Server is waiting for I/O most of the time
  – This time can be used to start serving other clients
  – Multiple outstanding requests
• Use threads for making the proxy concurrent
  – Create one thread for each new client request
  – The thread finishes and exits after serving the client request
  – Use the pthread library
    • pthread_create(), pthread_detach(), etc.
• Can use select() as well for adding concurrency
  – Much more difficult to get right

Caching Proxy
• Most geeks visit http://slashdot.org/ every 2 minutes
  – Why fetch the same content again and again?
  – (If it doesn’t change frequently)
• The proxy can cache responses
  – Serve directly out of its cache
  – Reduces latency, network load

Caching: Implementation Issues
• Use the GET URL (host/path) to locate the appropriate cache entry
• THREAD SAFETY
  – A single cache is accessed by multiple threads
  – Easy to create bugs: thread 1 is reading an entry, while thread 2 is deleting the same entry

General advice
• Use RIO routines
  – rio_readnb, rio_readlineb
  – Be very careful when you are reading line-by-line (HTTP request), versus just a stream of bytes (HTTP response)
• When to use strcpy() vs. memcpy()
• gethostbyname(), inet_ntoa() are not thread-safe!
• Path: sequential + concurrency +
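
As an illustration of the request line shown on the HTTP Request slide, here is a minimal sketch of how a proxy might split "GET http://host[:port]/path HTTP/1.x" into the pieces it needs to contact the server. The function name parse_request_line, its parameters, and the default port of 80 are assumptions made for this sketch; they are not part of the lab handout.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split a proxy request line into host, port and path.
 * Returns 0 on success, -1 if the line is malformed or a buffer is too small. */
int parse_request_line(const char *line, char *host, size_t hostsz,
                       int *port, char *path, size_t pathsz)
{
    char method[16], url[2048], version[16];

    /* Request line = method, URL, version, separated by whitespace. */
    if (sscanf(line, "%15s %2047s %15s", method, url, version) != 3)
        return -1;
    if (strcmp(method, "GET") != 0)            /* this sketch handles GET only */
        return -1;
    if (strncmp(url, "http://", 7) != 0)       /* a proxy receives the full URL */
        return -1;

    /* Host runs up to the first ':' or '/' (or the end of the URL). */
    const char *p = url + 7;
    size_t hostlen = strcspn(p, ":/");
    if (hostlen == 0 || hostlen >= hostsz)
        return -1;
    memcpy(host, p, hostlen);
    host[hostlen] = '\0';
    p += hostlen;

    /* Optional ":port"; assume 80 when it is absent. */
    *port = 80;
    if (*p == ':') {
        char *end;
        long v = strtol(p + 1, &end, 10);
        if (end == p + 1 || v <= 0 || v > 65535)
            return -1;
        *port = (int)v;
        p = end;
    }

    /* Whatever remains is the path; an empty path means "/". */
    if (*p == '\0')
        p = "/";
    else if (*p != '/')
        return -1;
    if (strlen(p) >= pathsz)
        return -1;
    strcpy(path, p);                           /* text, so strcpy is fine here */
    return 0;
}
```

A proxy would typically read the line with rio_readlineb, call a helper like this one, and then open a connection to host:port and send a request for path.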
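
The thread-safety and strcpy() vs. memcpy() points above can be sketched together in one small cache. This is only one possible design, not the lab's required one: the names cache_put, cache_get, cache_delete, the fixed table size, and the use of a single mutex are all assumptions made for illustration. The URL is text, so strcpy()/strcmp() are safe for it; the response body is a stream of bytes that may contain '\0', so it is copied with memcpy() and an explicit length. A reader copies the entry out while holding the lock, so another thread cannot delete the entry halfway through the read.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ENTRIES 16    /* hypothetical sizes, for illustration only */
#define MAX_URL     256

typedef struct {
    int    used;
    char   url[MAX_URL];  /* NUL-terminated text */
    char  *body;          /* raw bytes, may contain '\0' */
    size_t len;
} cache_entry;

static cache_entry cache[MAX_ENTRIES];
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold cache_lock. Returns the slot index or -1. */
static int find_slot(const char *url)
{
    for (int i = 0; i < MAX_ENTRIES; i++)
        if (cache[i].used && strcmp(cache[i].url, url) == 0)
            return i;
    return -1;
}

/* Store a private copy of body under url. Returns 0, or -1 on failure. */
int cache_put(const char *url, const char *body, size_t len)
{
    if (strlen(url) >= MAX_URL)
        return -1;
    char *copy = malloc(len > 0 ? len : 1);
    if (copy == NULL)
        return -1;
    if (len > 0)
        memcpy(copy, body, len);          /* not strcpy: body is not a string */

    pthread_mutex_lock(&cache_lock);
    int slot = find_slot(url);            /* overwrite an existing entry... */
    for (int i = 0; slot < 0 && i < MAX_ENTRIES; i++)
        if (!cache[i].used)
            slot = i;                     /* ...or take the first free slot */
    if (slot >= 0) {
        if (cache[slot].used)
            free(cache[slot].body);
        strcpy(cache[slot].url, url);     /* length checked above */
        cache[slot].body = copy;
        cache[slot].len  = len;
        cache[slot].used = 1;
    }
    pthread_mutex_unlock(&cache_lock);

    if (slot < 0) {                       /* cache full: no eviction in this sketch */
        free(copy);
        return -1;
    }
    return 0;
}

/* Copy the cached body into buf. Returns its length, or -1 if the URL is
 * not cached or buf is too small. */
long cache_get(const char *url, char *buf, size_t bufsz)
{
    long n = -1;
    pthread_mutex_lock(&cache_lock);
    int slot = find_slot(url);
    if (slot >= 0 && cache[slot].len <= bufsz) {
        memcpy(buf, cache[slot].body, cache[slot].len);
        n = (long)cache[slot].len;
    }
    pthread_mutex_unlock(&cache_lock);
    return n;
}

/* Remove an entry. Returns 0 if it existed, -1 otherwise. */
int cache_delete(const char *url)
{
    int rc = -1;
    pthread_mutex_lock(&cache_lock);
    int slot = find_slot(url);
    if (slot >= 0) {
        free(cache[slot].body);
        cache[slot].body = NULL;
        cache[slot].used = 0;
        rc = 0;
    }
    pthread_mutex_unlock(&cache_lock);
    return rc;
}
```

Holding one lock for the whole copy is the simplest way to rule out the "reading while deleting" bug; it trades some concurrency for safety, which a readers-writer lock could later recover.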