DOC PREVIEW
Berkeley COMPSCI 61A - Client-Server Paradigm, MapReduce

This preview shows page 1-2 out of 6 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Threads, CallbacksThe 3-Way HandshakeMapReduceCLIENT-SERVER PARADIGM, MAPREDUCE 22GEORGE [email protected] of Electrical Engineering and Computer SciencesUniversity of California, BerkeleyJuly 29, 20101 Threads, Callbacks(define (im-server-start);;;Start the server.;;Set! server-socket variable;Set thunk for handling handshake with new client;(format logging "~%Server starting...~%")(set! server-socket (make-server-socket))(format #t "Server IP address: ~A, server port: ~A~%"(get-ip-address-as-string)(socket-port-number server-socket))(when-socket-ready server-socket(lambda ()(begin(format logging "New client connecting.~%")(handshake (socket-dup server-socket)))))(format logging "(im-server-start) done.~%~%")’okay)2 The 3-Way Handshake(define (handshake sock);;;Handle the three-way handshake with a client.;;Handshaking should go as follows:1;client->server:; request from CLIENT to server with request "hello" and data nil;server->client:; request from server to CLIENT with request "welcome" and data nil;client->server:; request from CLIENT to server with request "thanks" and data nil;;;Accept the socket connection(socket-accept-connection sock)(format logging "Connection accepted for ~A...~%" sock)(let*((port-from-client (socket-input sock))(port-to-client (socket-output sock))(req (get-request port-from-client)))(if (not req)(socket-shutdown sock #f)(begin(format logging "Request received: ~S~%" req);; Check message is "hello".(cond ((not (equal? ’hello (request-action req)))(format #t "Bad request from client: ~S"req)(socket-shutdown sock #f))((member (request-src req) (get-clients-list));; name already exists, send "sorry" to client(format logging "Sending ’sorry’ to client~%")(send-request (make-request ’server(request-src req)’sorrynil)port-to-client)(format #t "Name ~A already exists."(request-src req))(socket-shutdown sock #f))(else;;Send "welcome" message back.(format logging "Sending welcome message.~%")(if (not(send-request (make-request ’server(request-src req)’welcomenil)port-to-client))(socket-shutdown sock #f)(begin;; Check response is "thanks"(set! req (get-request port-from-client))(if (not req)(socket-shutdown sock #f)(begin2(format logging "Response received: ~S~%" req)(if (not (equal? ’thanks (request-action req)))(begin (format #t "Bad response from client: ~S" req)(socket-shutdown sock #f))(begin;; Finally, we can register the client(format logging "~A has logged on.~%"(request-src req))(register-client (request-src req)sock)(format logging "Finished handshake~%") )) ) ))))) )))’okay)3 MapReduceHere’s the diagram of mapreduce again:ACCUMULATEMAPkvkvkvkvkvkvkeyREDUCEkeyREDUCEREDUCEkeykeykeykey(mapreduce f1 f2 base dataname)f1f2()MAPMAPMAPMAPSORTvaluevaluevaluevaluevaluevalueresultresultresultf1f2(accumulate f2 base (map f1 data))The seemingly unpoetic names f1 and f2 serve to remind you of two things: f1 (the mapper) is usedbefore f2 (the reducer), and f1 takes one argument while f2 takes two arguments (just like the functionsused with ordinary map and accumulate respectively).mapper: kv-pair→list-of-kv-pairsreducer: value, partial-result → resultAll data are in the form of key-value pairs. Ordinary map doesn’t care what the elements of the datalist argument are, but mapreduce works only with data each of which is a key-value pair. In the Schemeinterface to the distributed filesystem, a file is a stream. Every line of the file is a key-value pair whose keyis the filename and whose value is a list of words, representing the text of the line.(Actually, “a file is a stream” is a slight handwave. The STk running on the parallel cluster has specialversions of stream-car, etc., that accept both ordinary streams and special file streams that are reallypointers to data in the distributed file system, but that simulate the behavior of streams. So if you print oneof these special streams directly, rather than using show-stream, you’ll see something that doesn’t lookfamiliar.)Each processor runs a separate stream-map. The overlapping squares at the left of the mapreduce picturerepresent an entire stream. (How is a large distributed file divided among map processes? It doesn’t reallymatter, as far as the mapreduce user is concerned; mapreduce tries to do it as efficiently as possible given3the number of processes and the location of the data in the filesystem.) The entire stream is the input to amap process; each element of the stream (a kv-pair) is the input to your mapper function f1.For each key-value pair in the input stream, the mapper returns a list of key-value pairs. In the simplestcase, each of these lists will have one element; the code will look something like(define (my-mapper input-kv-pair)(list (make-kv-pair ... ...)))The interface requires that you return a list to allow for the non-simplest cases:(1) Each input key-value pair may give rise to more than one output key-value pair. For example, you maywant an output key-value pair for each word of the input file, whereas the input key-value pair representsan entire line:(define (my-mapper input-kv-pair)(map (lambda (wd) (make-kv-pair ... ...))(kv-value input-kv-pair)))(2) There are three commonly used higher order functions for sequential data, map, accumulate/reduce,and filter. The way mapreduce handles the sort of problem for which filter would ordinarily beused is to allow a mapper to return an empty list if this particular key-value pair shouldn’t contribute tothe result:(define (my-mapper input-kv-pair)(if ...(list input-kv-pair)’()))Of course it’s possible to write mapper functions that combine these three patterns for more complicatedtasks.The keys in the kv-pairs returned by the mapper need not be the same as the key in the input kv-pair.Instead of one big accumulation, there’s a separate accumulation of values for each key. The non-parallelcomputation in the left half of the picture has two steps, a map and an accumulate. But the mapreducecomputation has three steps; the middle step sorts all the key-value pairs produced by all the mapper pro-cesses by their keys, and combines all the kv-pairs with the same key into a single aggregate structure,which is then used as the input to a reduce process.This is why the use of key-value pairs is important! If the data had no such structure imposed on them, therewould be no way for us to tell mapreduce which data should be combined in each reduction.Although it’s shown as one big box, the


View Full Document

Berkeley COMPSCI 61A - Client-Server Paradigm, MapReduce

Documents in this Course
Lecture 1

Lecture 1

68 pages

Midterm

Midterm

5 pages

Midterm

Midterm

6 pages

Lecture 35

Lecture 35

250 pages

Lecture 14

Lecture 14

125 pages

Lecture 2

Lecture 2

159 pages

Lecture 6

Lecture 6

113 pages

Lecture 3

Lecture 3

162 pages

Homework

Homework

25 pages

Lecture 13

Lecture 13

117 pages

Lecture 29

Lecture 29

104 pages

Lecture 11

Lecture 11

173 pages

Lecture 7

Lecture 7

104 pages

Midterm

Midterm

6 pages

Midterm

Midterm

6 pages

Lecture 8

Lecture 8

108 pages

Lab 4

Lab 4

4 pages

Lecture 7

Lecture 7

52 pages

Lecture 20

Lecture 20

129 pages

Lecture 15

Lecture 15

132 pages

Lecture 9

Lecture 9

95 pages

Lecture 30

Lecture 30

108 pages

Lecture 17

Lecture 17

106 pages

Load more
Download Client-Server Paradigm, MapReduce
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Client-Server Paradigm, MapReduce and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Client-Server Paradigm, MapReduce 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?