
LECTURE 1
Introduction

2 Handouts

Lecture outline
• Goals and mechanics of the class
• Notation
• Entropy: definitions and properties
• Mutual information: definitions and properties

Reading: Ch. 1, Sects. 2.1-2.5.

Goals

Our goals in this class are to establish an understanding of the intrinsic properties of the transmission of information and of the relation between coding and the fundamental limits of information transmission in the context of communications.

Our class is not a comprehensive introduction to the field of information theory and will not touch in a significant manner on such important topics as data compression and complexity, which belong in a source-coding class.

Notation
– random variable (r.v.): X
– sample value of a random variable: x
– set of possible sample values x of the r.v. X: \mathcal{X}
– probability mass function (PMF) of a discrete r.v. X: P_X(x)
– probability density function (pdf) of a continuous r.v. X: p_X(x)

Entropy
• Entropy is a measure of the average uncertainty associated with a random variable.
• The entropy of a discrete r.v. X is
  H(X) = -\sum_{x \in \mathcal{X}} P_X(x) \log_2 P_X(x)
• Entropy is always non-negative.
• Joint entropy: the entropy of two discrete r.v.s X, Y with joint PMF P_{X,Y}(x,y) is
  H(X,Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 P_{X,Y}(x,y)
• Conditional entropy: the expected value of entropies calculated according to conditional distributions,
  H(Y|X) = E_Z[H(Y|X = Z)]
  for a r.v. Z independent of X and identically distributed with X. Intuitively, this is the average of the entropy of Y given X over all possible values of X.

Conditional entropy: chain rule

H(Y|X) = E_Z[H(Y|X = Z)]
       = -\sum_{x \in \mathcal{X}} P_X(x) \sum_{y \in \mathcal{Y}} P_{Y|X}(y|x) \log_2 P_{Y|X}(y|x)
       = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 P_{Y|X}(y|x)

Compare with joint entropy:

H(X,Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 P_{X,Y}(x,y)
       = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 [P_{Y|X}(y|x) P_X(x)]
       = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 P_{Y|X}(y|x)
         - \sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 P_X(x)
       = H(Y|X) + H(X)

This is the chain rule for entropy:
  H(X_1, ..., X_n) = \sum_{i=1}^{n} H(X_i | X_1, ..., X_{i-1})

Question: is H(Y|X) = H(X|Y)?

Relative entropy

Relative entropy is a measure of the distance between two distributions; it is also known as the Kullback-Leibler distance between the PMFs P_X(x) and P_Y(y).

Definition:
  D(P_X || P_Y) = \sum_{x \in \mathcal{X}} P_X(x) \log \frac{P_X(x)}{P_Y(x)}

In effect we are considering the log-ratio to be a r.v. of which we take the mean (note that we use the conventions 0 \log 0 = 0 and p \log(p/0) = \infty for p > 0).

Mutual information

Mutual information: let X, Y be r.v.s with joint PMF P_{X,Y}(x,y) and marginal PMFs P_X(x) and P_Y(y).

Definition:
  I(X;Y) = \sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log \frac{P_{X,Y}(x,y)}{P_X(x) P_Y(y)}
         = D(P_{X,Y}(x,y) || P_X(x) P_Y(y))

Intuitively, mutual information is a measure of how dependent the r.v.s are.

Useful expressions for mutual information:
  I(X;Y) = H(X) + H(Y) - H(X,Y)
         = H(Y) - H(Y|X)
         = H(X) - H(X|Y)
         = I(Y;X)

Question: what is I(X;X)?

Mutual information: chain rule

Conditional mutual information:
  I(X;Y|Z) = H(X|Z) - H(X|Y,Z)

Chain rule:
  I(X_1, ..., X_n; Y) = H(X_1, ..., X_n) - H(X_1, ..., X_n | Y)
                      = \sum_{i=1}^{n} H(X_i | X_1, ..., X_{i-1}) - \sum_{i=1}^{n} H(X_i | X_1, ..., X_{i-1}, Y)
                      = \sum_{i=1}^{n} I(X_i; Y | X_1, ..., X_{i-1})

Look at 3 r.v.s:
  I(X_1, X_2; Y) = I(X_1; Y) + I(X_2; Y | X_1)
where I(X_2; Y | X_1) is the extra information about Y given by X_2, but not given by X_1.
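To make the entropy definitions above concrete, here is a minimal numerical sketch (not part of the original lecture) that computes H(X), H(Y), H(X,Y), and H(Y|X) for a small, arbitrarily chosen joint PMF and checks the chain rule H(X,Y) = H(Y|X) + H(X). The joint PMF p_xy and all names are illustrative assumptions, not anything specified in the notes.

from math import log2

# Arbitrary toy joint PMF P_{X,Y}(x,y) over x in {0,1}, y in {0,1} (an assumed example).
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}

def H(pmf):
    """Entropy in bits: H = -sum p log2 p, with the convention 0 log 0 = 0."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

# Marginals P_X and P_Y obtained by summing the joint PMF.
p_x, p_y = {}, {}
for (x, y), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

# Conditional entropy H(Y|X) = -sum_{x,y} P_{X,Y}(x,y) log2 P_{Y|X}(y|x).
H_Y_given_X = -sum(p * log2(p / p_x[x]) for (x, y), p in p_xy.items() if p > 0)

print("H(X)   =", H(p_x))        # about 0.8113 bits for this PMF
print("H(Y)   =", H(p_y))
print("H(X,Y) =", H(p_xy))       # joint entropy, 1.75 bits here
print("H(Y|X) =", H_Y_given_X)
# Chain rule check: H(X,Y) = H(Y|X) + H(X)
print("chain rule holds:", abs(H(p_xy) - (H_Y_given_X + H(p_x))) < 1e-12)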
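In the same spirit, a short sketch of relative entropy and mutual information: it computes D(P||Q) for two assumed PMFs on a common alphabet, then verifies on the same toy joint PMF that I(X;Y) = D(P_{X,Y} || P_X P_Y) = H(X) + H(Y) - H(X,Y). The distributions and names are assumptions made for illustration.

from math import log2, inf

def D(p, q):
    """Relative entropy D(p||q) in bits, with 0 log 0 = 0 and p log(p/0) = infinity."""
    total = 0.0
    for x, px in p.items():
        if px == 0:
            continue                 # convention: 0 log 0 = 0
        qx = q.get(x, 0.0)
        if qx == 0:
            return inf               # convention: p log(p/0) = infinity for p > 0
        total += px * log2(px / qx)
    return total

def H(pmf):
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

# Two assumed PMFs on the alphabet {0, 1}.
P = {0: 0.75, 1: 0.25}
Q = {0: 0.50, 1: 0.50}
print("D(P||Q) =", D(P, Q))          # nonnegative; zero iff P == Q
print("D(Q||P) =", D(Q, P))          # not equal to D(P||Q) in general

# Mutual information I(X;Y) = D(P_{X,Y} || P_X P_Y) on the toy joint PMF from the previous sketch.
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}
p_x = {0: 0.75, 1: 0.25}
p_y = {0: 0.625, 1: 0.375}
product_pmf = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}
I_xy = D(p_xy, product_pmf)

print("I(X;Y)             =", I_xy)
print("H(X)+H(Y)-H(X,Y)   =", H(p_x) + H(p_y) - H(p_xy))   # same value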
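Finally, a sketch of the three-variable case of the mutual information chain rule, I(X1, X2; Y) = I(X1; Y) + I(X2; Y | X1), using an arbitrarily assumed joint PMF over three binary r.v.s; every term is computed directly from entropies of marginals.

from math import log2
from itertools import product

def H(pmf):
    """Entropy in bits of a PMF given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def marginal(pmf, idxs):
    """Marginal PMF over the coordinates listed in idxs."""
    out = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i] for i in idxs)
        out[key] = out.get(key, 0.0) + p
    return out

# Arbitrary assumed joint PMF P_{X1,X2,Y} over {0,1}^3 (weights chosen so the r.v.s are dependent).
weights = {(x1, x2, y): 1 + x1 + 2 * x2 * y for x1, x2, y in product((0, 1), repeat=3)}
Z = sum(weights.values())
p = {k: w / Z for k, w in weights.items()}

H_x1    = H(marginal(p, [0]))
H_y     = H(marginal(p, [2]))
H_x1x2  = H(marginal(p, [0, 1]))
H_x1y   = H(marginal(p, [0, 2]))
H_x1x2y = H(p)

# I(X1,X2;Y) = H(X1,X2) + H(Y) - H(X1,X2,Y)
I_x1x2_y = H_x1x2 + H_y - H_x1x2y
# I(X1;Y) = H(X1) + H(Y) - H(X1,Y)
I_x1_y = H_x1 + H_y - H_x1y
# I(X2;Y|X1) = H(X2|X1) - H(X2|X1,Y) = H(X1,X2) + H(X1,Y) - H(X1) - H(X1,X2,Y)
I_x2_y_given_x1 = H_x1x2 + H_x1y - H_x1 - H_x1x2y

print("I(X1,X2;Y)           =", I_x1x2_y)
print("I(X1;Y) + I(X2;Y|X1) =", I_x1_y + I_x2_y_given_x1)   # equal, by the chain rule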
MIT OpenCourseWare
http://ocw.mit.edu

6.441 Information Theory
Spring 2010

For information about citing these materials or our Terms of Use, visit http://ocw.mit.edu.