## exam196

Previewing page
*1*
of
actual document.

**View the full content.**View Full Document

## exam196

0 0 69 views

- Pages:
- 3
- School:
- Washington University in St. Louis
- Course:
- Ese 523 - Information Theory

**Unformatted text preview:**

EE 553 Fall 1996 October 17 1996 EE 553 EXAM 1 This is a closed book closed notes exam scheduled to last 90 minutes Calculators and computers are not allowed NAME Score 1 2 3 4 5 Total 1 10 points Prove that 0 7 log 0 7 0 3 log 0 3 0 2 log 0 2 0 8 log 0 8 2 0 75 log 0 75 0 25 log 0 25 You must do this without using a calculator that is without computing any of the logarithms indicated using the theoretical tools developed in this class Try to give some details to convince me that you know what you are doing 2 50 points In many sporting events such as the National League Championship Baseball Series going on now involving our very own St Louis Cardinals the teams compete in several games with the first team winning K games declared the winner of the series Note that this implies a maximum of 2K 1 games Such series are often called a best K out of 2K 1 series For example the Cardinals and Braves will play the seventh game of their best 4 out of 7 series this evening EE 553 2 Exam 1 For each of the parts of this question assume that the outcomes of all games are described by a 1 or a 0 depending on whether team one wins or loses Assume that the outcomes of all games are independent and identically distributed with P outcome 1 p a For a best 2 out of 3 series find a good lower bound on the average number of bits needed to describe the outcomes of all games played Note that we are considering the set of outcomes of all games played in a series as the random variable b Suppose that p 0 4 Find a Huffman code to describe the outcomes of all games played in a best 2 out of 3 series Find the expected codeword length c We now change the random variable definition slightly Suppose that at the end of a series we only care how many games each team won in a best 2 out of 3 series Thus if team one wins the series then team one wins two games and the other team wins either zero games or one game The random variable describing the series is then the pair l m where l denotes the number of games that team one wins and m denotes the number of games the other team wins Again let P outcome 1 p Suppose n independent series are played Find the typical set d Now suppose that you place bets on which team wins the series Suppose that you find a bookie who will pay uniform fair odds For best 2 out of 3 series determine the optimal betting strategy and the resulting doubling rate 3 15 points Suppose that a rectangle in n dimensions has sides X 1 X 2 X n so that the volume measure of the interior of the rectangle is n V n Xi i 1 Suppose that the lengths X i are i i d distributed according to p X x Find the doubling rate for the volume of the rectangles as the dimension increases 4 15 points In many image processing problems objects are described in terms of their boundaries While this may not be the optimal strategy in this problem we examine the Kolmogorov complexity of such descriptions An object is a closed and connected region in an image An image is an m m array of numbers a location in the array is called a pixel the number associated with a pixel is called a pixel value A boundary is a list of pixels that surround an object The boundary has the property that each pixel in the list is a neighbor of the previous pixel and the last pixel is a neighbor of the first pixel A neighbor pixel is one of the four up down left right pixels An example of a boundary is given in the attached figure a Let n be the length of a boundary B the length of the list of pixels Find an upper bound on the Kolmogorov complexity of a boundary given n and given m the size of the array Denote this complexity by K B n m b Find an upper bound on the Kolmogorov complexity of a boundary given m K B m 5 10 points One of the goals in developing models is to more accurately describe the statistics of empirically observed data Often more complicated models are better in many senses including allowing greater data compression Suppose that a set of data is available and one wants to model these data A model will be called admissible if the typical set for that model includes the data set Model 2 is better than model 1 if the typical set for model 2 is smaller than the typical set for model 1 and model 2 is admissible Since the typical set is smaller a code developed using EE 553 3 Exam 1 model 2 will require fewer bits on average than the code developed using model 1 We now explore an example of this Suppose that the data set consists of n binary random vectors each of length m There are three possible models explored State whether each is admissible Determine an ordering including which is best Justify your answers Model 1 The bits are i i d with probability 1 2 that each bit is 1 Model 2 The bits are i i d with probability p that each bit is 1 where p is the empirical average number of 1 s in the entire data set count the number of 1 s out of all mn entries in the vectors and divide by mn Model 3 Each random vector comes from a Markov chain with stationary probability p that each bit is 1 where p is determined as in Model 2 and with transition probabilities determined according to the empirical transition probabilities Let the resulting transition matrix be r 1 r P q 1 q Bonus If by any chance you make it this far I don t think you will argue that the analysis in this problem is far from complete In fact for data compression the complexity of the model used must also be taken into account If the model becomes too complex it may represent the data in the data set well but not any data not already int he set An alternative argument would be based on Kolmogorov complexity There is a complexity of the data set given the model for the data and the complexity of the model

View Full Document