15-213 Introduction to Computer Systems
Final Exam
May 3, 2006

Name:
Andrew User ID:
Recitation Section:

• This is an open-book exam.
• Notes and calculators are permitted, but not computers.
• Write your answers legibly in the space provided.
• You have 180 minutes for this exam.

Problem  Topic                    Max  Score
1        Assembly Language         15
2        Out-of-Order Execution    20
3        Pointer Arithmetic        10
4        Caching                   20
5        Signals                   15
6        Semaphores                30
7        Servers                   20
8        System-Level I/O          20
Total                             150


1. Assembly Language (15 points)

Consider the following declaration of a binary search tree data structure.

    struct TREE {
        int data;
        struct TREE *left;
        struct TREE *right;
    };

Describe the layout of this structure on an x86-64 architecture.

1. (3 pts) If a is the address of the beginning of the structure, then
   • ________ is the address of the data field,
   • ________ is the address of the left field, and
   • ________ is the address of the right field.
   Give your address calculations in bytes in decimal.

2. (1 pt) a will be aligned to 0 modulo ________.

3. (1 pt) The total size of a TREE structure will be ________ bytes.

Next we consider the tree as a binary search tree, where elements to the left of a node are smaller and elements to the right of a node are larger than the data stored in the node. The following function checks whether a given integer x is stored in the global tree root.

    typedef struct TREE tree;
    tree *root;

    int member(int x) {
        tree *t = root;
        while (t != NULL) {
            if (x == t->data)
                return 1;
            if (x < t->data)
                t = t->left;
            else
                t = t->right;
        }
        return 0;
    }

This function might compile to the following piece of assembly code, omitting some code alignment directives and three lines for you to fill in.

    member:
            movq    root(%rip), %rax
            testq   %rax, %rax
            je      .L9
    .L14:
            ___________________________
            je      .L13
            jle     .L5
            movq    8(%rax), %rax
    .L11:
            testq   %rax, %rax
            jne     .L14
    .L9:
            ___________________________
            ret
    .L5:
            movq    16(%rax), %rax
            jmp     .L11
    .L13:
            ___________________________
            ret

4. (4 pts) Complete the following table, associating C variables with machine registers or assembly expressions.

    C variable      Assembly expression
    x
    root
    t
    return value

5. (6 pts) Fill in the missing three lines of assembly code.


2. Out-of-Order Execution (20 points)

We continue the code from the previous problem.

1. (15 pts) On a machine with pipelining and out-of-order execution, such as the machines used in this course, the efficiency of the inner loop can be improved by the use of conditional move instructions. Rewrite the code between .L14 and .L9 using only instructions from the original program, ordinary move instructions, and one or more of the following conditional move instructions.

    cmovl  S, D
    cmovle S, D
    cmove  S, D
    cmovge S, D
    cmovg  S, D

   As usual, S stands for the source, D for the destination, and the suffixes l, le, e, ge, g have the same meaning as for conditional branches.

    .L14:



            testq   %rax, %rax
            jne     .L14

2. (5 pts) Explain why the code with conditional moves can be more efficient than the original code produced by the compiler.


3. Pointer Arithmetic (10 points)

A desperate student decided to write a dynamic memory allocator for an x86-64 machine in which each block has the following form:

    Header | Id string | Payload | Footer

where
• Header is a 4-byte header
• Id string is an 8-byte string
• Payload is of arbitrary size, including padding
• Footer is a 4-byte footer

Assume the student wants to print the Id string with the following function

    void print_block(void *bp) {
        printf("Found block ID: %s\n", GET_ID(bp));
    }

where bp points to the beginning of the payload and is aligned to 0 modulo 8. Circle each of the following letters A–J for which the macro will correctly print the Id string.

    A   #define GET_ID(bp) ((char*)(((long)bp)-8))
    B   #define GET_ID(bp) ((char*)(((int)bp)-8))
    C   #define GET_ID(bp) ((char*)(((char)bp)-8))
    D   #define GET_ID(bp) ((char*)(((long*)bp)-1))
    E   #define GET_ID(bp) ((char*)(((int*)bp)-2))
    F   #define GET_ID(bp) ((char*)(((char*)bp)-8))
    G   #define GET_ID(bp) ((char*)(((char**)bp)-1))
    H   #define GET_ID(bp) ((char*)(((char**)bp)-2))
    I   #define GET_ID(bp) ((char*)(((char**)bp)-4))
    J   #define GET_ID(bp) ((char*)(((char**)bp)-8))


4. Caching (20 points)

Assume the following situation.
• All caches are fully associative, with LRU eviction policy.
• The cache is write-back, write-allocate.
• All caches are empty at the beginning of an execution.
• Variables i, j, and k are stored in registers.
• A float is 4 bytes.

The function mm_ijk multiplies two N × N arrays A and B and puts the result in R. For simplicity, we assume R is initialized to all zeros.

    void mm_ijk(float A[N][N], float B[N][N], float R[N][N]) {
        int i, j, k;
        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++)
                for (k = 0; k < N; k++)
                    R[i][j] += A[i][k] * B[k][j];
    }

1. (6 pts) Consider the executions of mm_ijk with N=2 and N=4 on a 64-byte fully associative LRU cache with 4-byte lines (the cache holds 16 lines). Fill in the table below with the number of cache misses caused by accesses to each of the arrays A, B, and R, assuming that the argument arrays are 16-byte aligned.

    N    A    B    R
    2
    4

2. (6 pts) Now suppose we consider the previous experiment on a 64-byte fully associative LRU cache with 16-byte lines (the cache holds 4 lines). Fill in the table below with the number of cache misses due to each array, assuming that the argument arrays are 16-byte aligned.

    N    A    B    R
    2
    4

3. (5 pts) Even if R is initialized to all zeros, and even if the program is single-threaded and no signals occur, after the execution of the mm_ijk function, R will not necessarily contain the product of A and B. Give a concrete counterexample.

4. (3 pts) Under which additional assumption is mm_ijk correct?


5. Signals (15 points)

For each code segment below, give the largest value that could be printed to stdout. Remember that when the system executes a signal handler, it blocks signals of the type currently being handled (and no others).

    /* Version A */
    int i = 0;

    void handler(int s) {
        if (!i) {
            kill(getpid(), SIGINT);
        }
        i++;
    }

    int main() {
        signal(SIGINT, handler);
        kill(getpid(), SIGINT);
        printf("%d\n", i);
        return 0;
    }

1. (5 pts) Largest value for version A: ________.

    /* Version B */
    int i = 0;

    void handler(int s) {
        if (!i) {
            kill(getpid(), SIGINT);
            kill(getpid(), SIGINT);
        }
        i++;
    }

    int main() {
        signal(SIGINT, handler);
        kill(getpid(), SIGINT);
        printf("%d\n", i);
        return 0;
    }

2. (5 pts) Largest value for version B: ________.

    /* Version C */
    int i = 0;

    void handler(int s) {
        if (!i) {
            kill(getpid(), SIGINT);
            kill(getpid(), SIGUSR1);
        }
        i++;
    }

    int main() {
        signal(SIGINT, handler);
        signal(SIGUSR1, handler);
        kill(getpid(), SIGUSR1);
        printf("%d\n", i);
        return 0;
    }

3. (5 pts) Largest value for version C: ________.


6. Semaphores (30 points)

We would now like to use binary se