2/18/10

MS in Telecommunications
TCOM 500: Modern Telecommunications
Dr. Bernd-Peter Paris
George Mason University
Spring 2009

Outline
• Defining and Measuring Information.
• Representing Information efficiently:
  • Sidebar: Morse code
  • Huffman Coding
  • Lempel-Ziv coding
• Lossy compression:
  • JPEG and MPEG

Context
• Until now, we have discussed how signals can be converted into digital form:
  • Sampling and A/D conversion
  • Result is a sequence of bits.
• Today, we will look at means to reduce the number of bits required to represent information.
• This is referred to as source coding or data compression.
• Purpose: reduce the number of bits that need to be transmitted in order to increase information throughput.
[Figure: "Hi! How are you?" -> ADC -> 01011… (bits) -> Compression -> 1011… (fewer bits)]

MEASURING INFORMATION

Information Theory
• The word information is often used colloquially to refer to news, messages, etc.
• Information theorists have come up with means to measure information.
• We will relate the everyday notion of information to the theoretical concepts developed by information theorists.
• We will discover the central role played by Entropy.
• We will see that the amount of information in a message is closely related to the shortest binary representation of that message.
• To measure information one could think of a variety of methods, e.g., one could count the number of characters in a message.
• However, we will see that a much better measure for the amount of information is the smallest average number of yes-or-no questions to find the contents of the message.

Thought Experiment
Scenario:
• You are going to a train station.
• The station has 16 tracks.
• You have no idea which track your train is leaving from.
• With five minutes to spare, you turn to the “Information” booth.
• There is a monosyllabic attendant in the booth who only answers yes-or-no questions.
• Furthermore, he does not answer more than one question per minute.
Question: Will you make your train?
Problem: Think of the shortest (on average) sequence of yes-or-no questions that allows you to find your track.

Solution to Thought Experiment
• Solution:
  • Assume the train leaves from track 9.
  • The following sequence of questions yields the answer with 4 questions:
    – Is the track number greater than 8? Answer: yes
    – Is the track number greater than 12? Answer: no
    – Is the track number greater than 10? Answer: no
    – Is the track number equal to 9? Answer: yes
• Notice:
  • If we set yes=1 and no=0, then we could represent the sequence of answers as 1001.
  • This is equal to the binary representation of the number 9.
  • This strategy will always find the track with 4 questions.
  • This is because $2^4 = 16$, or $\log_2 16 = 4$.

Another Thought Experiment
• Scenario:
  • Another strange train station with a monosyllabic information booth attendant.
  • This one has 12 tracks.
  • From experience you know that the train is twice as likely to leave from each of tracks 1-4 as from each of tracks 5-12.
  • In other words, the probability that the train leaves from track 1 is 1/8 (same for tracks 2, 3, 4).
  • The probability that the train leaves from track 5 is 1/16 (same for tracks 6-12).
• Question: What is the best (shortest on average) sequence of questions now?

Solution to Thought Experiment
• Solution:
  • Assume your train leaves on track 3.
  • You should ask the following sequence of questions:
    • Is the track number greater than 4? Answer: no
    • Is the track number greater than 2? Answer: yes
    • Is the track number equal to 4? Answer: no
• In general, the first question should always be “Is the track number greater than 4?”.
• This question divides the possibilities into two equally likely halves:
  • tracks 1-4 have a combined probability of 0.5, and so do tracks 5-12.
• Depending on the answer, sub-divide either the range from 1-4 or the range from 5-12.
• If the train leaves from one of the first four tracks, three questions are required.
• If the train leaves from one of the other tracks, four questions are required.
• On average, 3.5 questions are needed.
• To compute the average, weight each number of questions by its probability: 0.5 × 3 + 0.5 × 4 = 3.5.

Information versus Data
• In the two thought experiments, the answer was obtained with the fewest possible yes-no questions.
• We will see that this minimal number of questions equals the amount of information gained.
• Other strategies are possible:
  • For example, one could ask “Does the train leave on track 1?”, “Does the train leave on track 2?”, and so on until a “yes” is received.
  • In this case, on average about 8.5 questions are needed in the first experiment.
  • Notice also that the number of questions that need to be asked is random!
• With this strategy, (on average) more data (answers) are collected but the information is the same!
• Information is NOT the same as data.
• Extracting information from a given set of data is at the core of data compression.

A few Concepts from Probability
• Observations (data) that provide information are inherently random (think unpredictable).
• Put differently: perfectly predictable data don’t provide information.
• We need the following ideas from probability:
  • Experiment: The process that produces the data we observe, e.g., flipping a coin, asking yes-no questions, or sampling and A-to-D conversion.
  • Outcome: The result of the experiment; denoted by X.
  • In our thought
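The question counts from the two thought experiments can be checked with a short Python sketch. This is only an illustration, not part of the lecture: the function names `count_halving_questions` and `average` are hypothetical, and every question is phrased as "Is the track number greater than m?", so the final answer bit can differ from the slide's "equal to" question while the number of questions stays the same.

```python
from fractions import Fraction


def count_halving_questions(track, low, high):
    """Count the yes/no questions "Is the track number greater than m?"
    needed to pin down `track` within low..high by repeated halving."""
    questions = 0
    while low < high:
        mid = (low + high) // 2
        questions += 1
        if track > mid:      # answer "yes": keep the upper half
            low = mid + 1
        else:                # answer "no": keep the lower half
            high = mid
    return questions


def average(counts, probabilities):
    """Probability-weighted average of per-track question counts."""
    return sum(p * n for n, p in zip(counts, probabilities))


# Experiment 1: 16 equally likely tracks.
uniform16 = [Fraction(1, 16)] * 16
halving16 = [count_halving_questions(t, 1, 16) for t in range(1, 17)]
# Asking "track 1?", "track 2?", ... until a "yes": track k costs k questions.
sequential16 = list(range(1, 17))

# Experiment 2: tracks 1-4 have probability 1/8 each, tracks 5-12 have 1/16 each.
probs12 = [Fraction(1, 8)] * 4 + [Fraction(1, 16)] * 8
# One question ("greater than 4?") splits the groups, then halve within the group.
split12 = ([1 + count_halving_questions(t, 1, 4) for t in range(1, 5)]
           + [1 + count_halving_questions(t, 5, 12) for t in range(5, 13)])

avg_halving16 = average(halving16, uniform16)
avg_sequential16 = average(sequential16, uniform16)
avg_split12 = average(split12, probs12)

print(avg_halving16)     # 4
print(avg_sequential16)  # 17/2
print(avg_split12)       # 7/2
```

Exact fractions are used so that the averages come out as 4, 17/2 and 7/2 without rounding. The sequential strategy collects more answers on average (8.5 versus 4) for the same information, which is the point of the "Information versus Data" slide.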