15-213 Introduction to Computer Systems
Final Exam
May 3, 2006

Name: ______________________
Andrew User ID: ______________________
Recitation Section: ______________________

  * This is an open-book exam.
  * Notes and calculators are permitted, but not computers.
  * Write your answers legibly in the space provided.
  * You have 180 minutes for this exam.

  Problem   Topic                      Max   Score
  1         Assembly Language           15
  2         Out-of-Order Execution      20
  3         Pointer Arithmetic          10
  4         Caching                     20
  5         Signals                     15
  6         Semaphores                  30
  7         Servers                     20
  8         System-Level I/O            20
  Total                                150


1. Assembly Language (15 points)

Consider the following declaration of a binary search tree data structure.

    struct TREE {
        int data;
        struct TREE *left;
        struct TREE *right;
    };

Describe the layout of this structure on an x86-64 architecture.

1. (3 pts) If a is the address of the beginning of the structure, then ______ is the address of the data field, ______ is the address of the left field, and ______ is the address of the right field. Give your address calculations in bytes, in decimal.

2. (1 pts) a will be aligned to 0 modulo ______.

3. (1 pts) The total size of a TREE structure will be ______ bytes.

Next we consider the tree as a binary search tree, where elements to the left of a node are smaller and elements to the right of a node are larger than the data stored in the node. The following function checks whether a given integer x is stored in the global tree root.

    typedef struct TREE tree;
    tree *root;

    int member(int x) {
        tree *t = root;
        while (t != NULL) {
            if (x == t->data) return 1;
            if (x < t->data)
                t = t->left;
            else
                t = t->right;
        }
        return 0;
    }

This function might compile to the following piece of assembly code, omitting some code alignment directives and three lines for you to fill in.

    member:
            movq    root(%rip), %rax
            testq   %rax, %rax
            je      .L9
    .L14:
            ______________________
            je      .L13
            jle     .L5
            movq    8(%rax), %rax
    .L11:
            testq   %rax, %rax
            jne     .L14
    .L9:
            ______________________
            ret
    .L5:
            movq    16(%rax), %rax
            jmp     .L11
    .L13:
            ______________________
            ret

4. (4 pts) Complete the following table, associating C variables with machine registers or assembly expressions.

    C variable      Assembly expression
    x               ______
    root            ______
    t               ______
    return value    ______

5. (6 pts) Fill in the missing three lines of assembly code.


2. Out-of-Order Execution (20 points)

We continue the code from the previous problem.

1. (15 pts) On a machine with pipelining and out-of-order execution (such as the machines used in this course), the efficiency of the inner loop can be improved by the use of conditional move instructions. Rewrite the code between .L14 and .L9 using only instructions from the original program, ordinary move instructions, and one or more of the following conditional move instructions:

    cmovl   S, D
    cmovle  S, D
    cmove   S, D
    cmovge  S, D
    cmovg   S, D

As usual, S stands for the source and D for the destination, and the suffix (l, le, e, ge, g) has the same meaning as for conditional branches.

    .L14:
            ______________________
            ______________________
            ______________________
            testq   %rax, %rax
            jne     .L14

2. (5 pts) Explain why the code with conditional moves can be more efficient than the original code produced by the compiler.


3. Pointer Arithmetic (10 points)

A desperate student decided to write a dynamic memory allocator for an x86-64 machine in which each block has the following form:

    Header | Id string | Payload | Footer

where

    Header is a 4-byte header
    Id string is an 8-byte string
    Payload is arbitrary size, including padding
    Footer is a 4-byte footer

Assume the student wants to print the Id string with the following function:

    void print_block(void *bp) {
        printf("Found block ID %s\n", GET_ID(bp));
    }

where bp points to the beginning of the payload and is aligned to 0 modulo 8. Circle each of the following letters A-J for which the macro will correctly print the id string.

    A. #define GET_ID(bp) ((char *)(((long)bp) - 8))
    B. #define GET_ID(bp) ((char *)(((int)bp) - 8))
    C. #define GET_ID(bp) ((char *)(((char)bp) - 8))
    D. #define GET_ID(bp) ((char *)(((long *)bp) - 1))
    E. #define GET_ID(bp) ((char *)(((int *)bp) - 2))
    F. #define GET_ID(bp) ((char *)(((char *)bp) - 8))
    G. #define GET_ID(bp) ((char *)(((char **)bp) - 1))
    H. #define GET_ID(bp) ((char *)(((char **)bp) - 2))
    I. #define GET_ID(bp) ((char *)(((char **)bp) - 4))
    J. #define GET_ID(bp) ((char *)(((char **)bp) - 8))


4. Caching (20 points)

Assume the following situation:

  * All caches are fully associative with an LRU eviction policy.
  * The cache is write-back, write-allocate.
  * All caches are empty at the beginning of an execution.
  * Variables i, j, and k are stored in registers.
  * A float is 4 bytes.

The function mm_ijk multiplies two N×N arrays A and B and puts the result in R. For simplicity, we assume R is initialized to all zeros.

    void mm_ijk(float A[N][N], float B[N][N], float R[N][N]) {
        int i, j, k;
        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++)
                for (k = 0; k < N; k++)
                    R[i][j] += A[i][k] * B[k][j];
    }

1. (6 pts) Consider the executions of mm_ijk with N = 2 and N = 4 on a 64-byte fully associative LRU cache with 4-byte lines (the cache holds 16 lines). Fill in the table below with the number of cache misses caused by accesses to each of the arrays A, B, and R, assuming that the argument arrays are 16-byte aligned.

    N     A       B       R
    2     ____    ____    ____
    4     ____    ____    ____

2. (6 pts) Now suppose we consider the previous experiment on a 64-byte fully associative LRU cache with 16-byte lines (the cache holds 4 lines). Fill in the table below with the number of cache misses due to each array, assuming that the argument arrays are 16-byte aligned.

    N     A       B       R
    2     ____    ____    ____
    4     ____    ____    ____

3. (5 pts) Even if R is initialized to all zeros, and even if the program is single-threaded and no signals occur, after the execution of the mm_ijk function R will not necessarily contain the product of A and B. Give a concrete counterexample.

4. (3 pts) Under which additional assumption is mm_ijk correct?


5. Signals (15 points)

For each code segment below, give the largest value that could be printed to stdout. Remember that when the system executes a signal handler, it blocks signals of the type currently being handled (and no others).

Version A

    int i = 0;

    void handler(int s) {
        if (!i) {
            kill(getpid(), SIGINT);
        }
        i++;
    }

    int main() {
        signal(SIGINT, handler);
        kill(getpid(), SIGINT);
        printf("%d\n", i);
        return 0;
    }

1. (5 pts) Largest value for version A: ______

Version B

    int i = 0;

    void handler(int s) {
        if (!i) {
            kill(getpid(), SIGINT);
            kill(getpid(), SIGINT);
        }
        i++;
    }

    int main() {
        signal(SIGINT, handler);
        kill(getpid(), SIGINT);
        printf("%d\n", i);
        return 0;
    }

2. (5 pts) Largest value for version B: ______

Version C

    int i = 0;

    void handler(int s) {
        if (!i) {
            kill(getpid(), SIGINT);
            kill(getpid(), SIGUSR1);
        }
        i++;
    }

    int main() {
        signal(SIGINT, handler);
        signal(SIGUSR1, handler);
        kill(getpid(), SIGUSR1);
        printf("%d\n", i);
        return 0;
    }

3. …