WSU PSYCH 105 - IP-Based Applications and HTTP

ELEX 4550: Wide Area Networks
2015 Winter Session

IP-Based Applications and HTTP

This lecture explains how most IP-based applications work by using HTTP, a typical application-level protocol, as an example. HTTP, Hypertext Transfer Protocol, is the application-level protocol used to retrieve hypertext information (“web pages”) from a “web server”.

After this lecture you should be able to: parse a URL or URI into its components, URL-encode an arbitrary string, parse a media (MIME) content type into its components, generate the text for an HTTP 1.1 request given the URL and header values, generate the text for an HTTP response given the content and header values, and add A and B HTML tags to text to create a hypertext document.

Introduction

Hypertext Transfer Protocol, HTTP, is the protocol used to retrieve documents, usually in the HyperText Markup Language (HTML), from a server. We will study HTTP because it is similar to many other IP application protocols such as:

FTP (File Transfer Protocol) - used to transfer files
SMTP (Simple Mail Transfer Protocol) - used to send mail to a mail server
IMAP (Internet Message Access Protocol) and POP (Post Office Protocol) - used to retrieve e-mail from a server
SIP (Session Initiation Protocol) - a signalling protocol used to set up voice and video calls

In each of these protocols, a client application establishes a TCP connection to a server and sends a request. The server responds with a result code, typically numeric, and the requested data. The request/response sequence can be repeated (typically, but not always, using the same TCP connection) until the client or server terminates the session with a terminating request or response or by closing the TCP connection.

Messages exchanged between the client and server are lines of text, each terminated with a CR-LF pair. Each request or response consists of one line in a protocol-specific format followed by a sequence of header lines.
A blank line is used to separate the headers from the data that often follows a request or response. The header lines typically consist of a header name, a colon and header-specific data. The data is terminated in a protocol-specific manner (e.g. a terminator line or reaching a byte count specified in a header).

For example, here is the exchange required to send a short e-mail using the SMTP protocol:

    MAIL FROM: [email protected]
    250 2.1.0 Ok
    RCPT TO: [email protected]
    250 2.1.5 Ok
    DATA
    354 End data with <CR><LF>.<CR><LF>
    Subject: Test

    Hello!
    .
    250 2.0.0 Ok: queued as 1A4FE1AE002D
    QUIT
    221 2.0.0 Bye

Exercise 1: Which of these lines were sent by the client and which by the server? What are the header lines and how are they terminated? What is the data and how is it terminated? How many different result codes are there? What caused the connection to terminate?

Exercise 2: What are some advantages and disadvantages of using text-based protocols?

HTTP Protocol Overview

Web browsing is one of the most common Internet applications. A web browser is a program that displays hypertext. Hypertext is text that has been combined with “markup” instructions that can embed resources such as images and also allow the reader to follow “links” to other resources, often other hypertext documents.

We will look in more detail at HTTP as an example of an IP-based protocol. HTTP is defined in various RFCs.
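
The general message layout described in the Introduction (one protocol-specific line, header lines, a blank line, then optional data) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function name split_message, the sample message and its Type and Length headers are invented for illustration and do not belong to any real protocol.

```python
def split_message(raw: bytes):
    """Split one text-protocol message into (first_line, headers, rest)."""
    # The first blank line (CR-LF followed by CR-LF) ends the headers.
    head, _, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    first_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Each header line is: header name, colon, header-specific data.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return first_line, headers, rest


# Invented message, for illustration only ("Length" is a hypothetical header).
raw = (
    b"200 Ok\r\n"
    b"Type: text\r\n"
    b"Length: 6\r\n"
    b"\r\n"
    b"Hello!bytes that belong to the next message"
)

first_line, headers, rest = split_message(raw)
result_code = int(first_line.split()[0])

# In this made-up format the data ends when the byte count from the
# Length header is reached; other protocols use a terminator line instead.
data = rest[: int(headers["length"])]

print(result_code, headers, data)
```

Reading the byte count from a header is only one of the two termination styles mentioned above; SMTP, for instance, ends its data with a line containing a single period.
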
The most commonly used version, HTTP 1.1, is defined in RFC 2616 (since obsoleted by RFCs 7230 to 7235 and, later, RFCs 9110 to 9112) and follows the general Internet application protocol format described above.

For example, the request for a web page to the host www.example.net might involve the client setting up a TCP connection to the server and sending the lines shown in Listing 1, where the first line (GET) is the request, the lines following it are headers with additional information about the request, and the blank line terminates the request.

HTTP requests often include headers that indicate the client software, the languages and types of content the client can handle, and session context information in the form of “cookies.”

There are other requests in addition to GET. For example, PUT and POST requests can be used to send data to the server.

Exercise 3: What might some of the headers in Listing 1 mean?

The response consists of one line with the status code (200), additional headers, a blank line and the data associated with the response. Listing 2 shows a possible response to the previous request.

Exercise 4: What might some of the headers in Listing 2 mean?

URIs and URLs

A Uniform Resource Identifier (URI) and Uniform Resource Locator (URL) are the syntax used to identify resources which are typically, but not always, files that can be retrieved over the Internet. URIs and URLs are defined in RFC 3986. The syntax of a URI is:

    scheme “:” hier-part [ “?” query ] [ “#” fragment ]

A URL is a specific type of URI. The format of a URL is:

    scheme://domain:port/path?query_string#fragment_id

The fields are:

scheme - often called the protocol, this defines both the syntax of the rest of the URL and, in most cases, the IP protocol used to retrieve the information (e.g. http, ftp, etc.)
domain - the host name or IP address
port - the TCP port number (defaults to well-known values)
path - the (virtual) location of the resource on the server
query - additional data to be passed to the web server, typically something specific to this request such as text to be searched for
fragment - the portion of the requested document, typically a section in a document

Exercise 5: Parse the URL: https://bcit.ca:85/files/public/?bydate#first-end

The web client uses the domain and port information to set up a connection to the server but the server interprets the remainder of the URL (possibly including the domain) as it sees fit.

The server typically also has access to additional information supplied by the client in protocol headers (IP, TCP or HTTP) such as the IP address, “cookies” and the history of previous requests. This means that the server’s response to a particular query could depend on many factors, not just the URL itself.

URL Encoding

As shown in the above syntax, various characters (/, :, ?, #) are used to separate the URL into its parts and thus cannot be used literally in the content of the URL. Since spaces are used as terminators they also cannot be used.

Escape sequences beginning with the percent (%) character are used to include these special characters in URLs. The escape character is followed by two hexadecimal digits which define the value of the character; for example, a space is written as %20. The byte sequence can also be a

