CS 213, Fall 2010
Lab Assignment L7: Writing a Caching Web Proxy
Assigned: Tue., Nov. 09
Due: Tue., Nov. 23, 11:59 PM
Last Possible Time to Turn in: Fri., Nov. 26, 11:59 PM

Theodore Martin (tdmartin@andrew.cmu.edu) is the lead TA for this lab.

1 Introduction

A web proxy is a program that acts as a middleman between a web server and browser. Instead of contacting the server directly to get a web page, the browser contacts the proxy, which forwards the request on to the server. When the server replies to the proxy, the proxy sends the reply on to the browser.

Proxies are used for many purposes. Sometimes proxies are used in firewalls, such that the proxy is the only way for a browser inside the firewall to contact a server outside. The proxy may do translation on the page, for instance, to make it viewable on a web-enabled cell phone. Proxies are used as anonymizers: by stripping a request of all identifying information, a proxy can make the browser anonymous to the server. Proxies can even be used to cache web objects, by storing a copy of, say, an image when a request for it is first made, and then serving that image directly in response to future requests rather than going to the server.

In this lab, you will write a simple proxy that caches web objects. In the first part of the lab, you will set up the proxy to accept a request, forward the request to the server, and return the result back to the browser. In this part, you will learn how to write programs that interact with each other over a network (socket programming), as well as some basic HTTP.

In the second part, you will upgrade your proxy to deal with multiple open connections at once. Your proxy should spawn a separate thread to deal with each request. This will give you an introduction to dealing with concurrency, a crucial systems concept. Finally, you will turn your proxy into a proxy cache by adding a simple main memory cache of recently accessed web pages.

2 Logistics

Unlike previous labs, you can work individually or in a group of two on this assignment.
The lab is designed to be doable by a single person, so there is no penalty for working alone. You are, however, welcome to team up with another student if you wish.

We will not be releasing an autograder for this lab, nor will Autolab run the autograder for you. The autograded portion of your score will be determined by a tool called dbug, which we will explain in a later section. The majority of your grade will be determined by giving a demo of your proxy to a member of the course staff in the days following the due date for this lab. Every student is required to attend an interview with a TA; groups should attend an interview together as a group. You will not receive a grade on this assignment unless you sign up for and attend an interview with a member of the course staff. A link to demo sign-ups will be posted on the course web page soon.

All clarifications and revisions to the assignment will be posted to the Autolab message board.

Partner signups will be through Autolab. You will receive directions for signing up in recitation.

DBUG

Jiri Simsa has created a tool for checking concurrent code for race conditions. He has adapted this tool to create a grade for your proxy for eliminating race conditions. 30% of your grade on this lab will be based on this tool. He will be hosting an explanation of DBUG on November 20th, as well as making a virtual image available for your use in debugging your proxy. This is tentatively scheduled for 3 PM in McConomy Auditorium.

Grace Days

You may use this function to calculate the number of late days you may use on this lab:

    min(1, your late days remaining, your partner's late days remaining)

3 Hand Out Instructions

Start by downloading proxylab-handout.tar from Autolab to a protected directory in which you plan to do your work. Then give the command

    tar xvf proxylab-handout.tar

This will cause a number of files to be unpacked in the directory. The three files you will be modifying and turning in are proxy.c, csapp.c, and csapp.h. You may add any files you wish to this directory, as
you will be submitting the entire directory.

NOTE: Transfer the tarball to a shark machine before unpacking it. Some operating systems and file transfer clients wipe out Unix file permission bits.

The proxy.c file should eventually contain the bulk of the logic for your proxy. The csapp.c and csapp.h files are described in your textbook. The csapp.c file contains error handling wrappers and helper functions such as the RIO functions (Section 11.4), the open_clientfd function (Section 12.4.4), and the open_listenfd function (Section 12.4.7).

4 Part I: Implementing a Sequential Web Proxy

The first step is implementing a basic sequential proxy that handles requests one at a time. When started, your proxy should open a socket and listen for connection requests on the port number that is passed in on the command line. (See the section "Port Numbers" below.)

When the proxy receives a connection request from a client (typically a web browser), the proxy should accept the connection, read the request, verify that it is a valid HTTP request, and parse it to determine the server that the request was meant for. It should then open a connection to that server, send it the request, receive the reply, and forward the reply to the browser.

Notice that, since your proxy is a middleman between client and server, it will have elements of both. It will act as a server to the web browser, and as a client to the web server. Thus you will get experience with both client and server programming.

Processing HTTP Requests

When an end user enters a URL such as http://www.yahoo.com/news.html into the address bar of the browser, the browser sends an HTTP request to the proxy that begins with a line looking something like this:

    GET http://www.yahoo.com/news.html HTTP/1.0

In this case, the proxy will parse the request, open a connection to www.yahoo.com, and then send an HTTP request starting with a line of the form

    GET /news.html HTTP/1.0

to the server www.yahoo.com. Please note that all lines end with a carriage return ('\r') followed by a line
feed ('\n'), and that HTTP request headers are terminated with an empty line.

Since a port number was not specified in the browser's request in this example, the proxy connects to the default HTTP port (port 80) on the server. The web browser may specify a port that the web server is listening on, if it is different from the default of 80. This is encoded in a URL as follows: http://www.example …
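
The request-line rewriting described in this section can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the required design: the names parse_url, build_request_line, and struct url_parts are hypothetical (they are not part of csapp.c), the buffer sizes are arbitrary, and the sketch handles only http:// URLs with an optional ":port" written after the host name.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_HOST 256   /* arbitrary limits chosen for this sketch */
#define MAX_PATH 2048

/* Hypothetical container for the pieces of an absolute URL. */
struct url_parts {
    char host[MAX_HOST];
    int  port;
    char path[MAX_PATH];
};

/*
 * Split "http://host[:port][/path]" into host, port, and path.
 * The port defaults to 80 and the path defaults to "/".
 * Returns 0 on success and -1 if the URL is malformed or too long.
 */
int parse_url(const char *url, struct url_parts *out)
{
    static const char scheme[] = "http://";
    assert(url != NULL && out != NULL);

    if (strncmp(url, scheme, sizeof(scheme) - 1) != 0)
        return -1;
    const char *host = url + sizeof(scheme) - 1;

    /* The host name runs up to the first ':' or '/', or to the end. */
    size_t host_len = strcspn(host, ":/");
    if (host_len == 0 || host_len >= sizeof(out->host))
        return -1;
    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';

    const char *rest = host + host_len;
    out->port = 80;                      /* default HTTP port */
    if (*rest == ':') {
        if (!isdigit((unsigned char)rest[1]))
            return -1;
        char *end;
        long port = strtol(rest + 1, &end, 10);
        if (port < 1 || port > 65535)
            return -1;
        out->port = (int)port;
        rest = end;
    }

    if (*rest == '\0') {
        strcpy(out->path, "/");          /* no path given: request the root */
    } else if (*rest == '/') {
        if (strlen(rest) >= sizeof(out->path))
            return -1;
        strcpy(out->path, rest);
    } else {
        return -1;                       /* junk after the port number */
    }
    return 0;
}

/*
 * Write the request line the proxy would send to the server, for example
 * "GET /news.html HTTP/1.0\r\n". Returns the number of characters written,
 * or -1 if the buffer is too small.
 */
int build_request_line(const char *method, const struct url_parts *u,
                       char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "%s %s HTTP/1.0\r\n", method, u->path);
    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```

With these helpers, a proxy would call parse_url on the URL taken from the browser's request line, connect to the resulting host and port (for instance with the open_clientfd helper from csapp.c), and send the output of build_request_line followed by the request headers and the terminating empty line.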