
LECTURE 1
Introduction

2 Handouts

Lecture outline
• Goals and mechanics of the class
• Notation
• Entropy: definitions and properties
• Mutual information: definitions and properties

Reading: Ch. 1, Sects. 2.1-2.5.

Goals

Our goals in this class are to establish an understanding of the intrinsic properties of the transmission of information and of the relation between coding and the fundamental limits of information transmission in the context of communications.

Our class is not a comprehensive introduction to the field of information theory and will not touch in a significant manner on such important topics as data compression and complexity, which belong in a source-coding class.

Notation
– random variable (r.v.): X
– sample value of a random variable: x
– set of possible sample values x of the r.v. X: \mathcal{X}
– probability mass function (PMF) of a discrete r.v. X: P_X(x)
– probability density function (pdf) of a continuous r.v. X: p_X(x)

Entropy
• Entropy is a measure of the average uncertainty associated with a random variable.
• The entropy of a discrete r.v. X is
  H(X) = -\sum_{x \in \mathcal{X}} P_X(x) \log_2 P_X(x)
• Entropy is always non-negative.
• Joint entropy: the entropy of two discrete r.v.s X, Y with joint PMF P_{X,Y}(x,y) is
  H(X,Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 P_{X,Y}(x,y)
• Conditional entropy: the expected value of entropies calculated according to conditional distributions,
  H(Y|X) = E_Z[H(Y|X = Z)]
  for a r.v. Z independent of X and identically distributed with X. Intuitively, this is the average of the entropy of Y given X over all possible values of X.

Conditional entropy: chain rule

H(Y|X) = E_Z[H(Y|X = Z)]
       = -\sum_{x \in \mathcal{X}} P_X(x) \sum_{y \in \mathcal{Y}} P_{Y|X}(y|x) \log_2 P_{Y|X}(y|x)
       = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 P_{Y|X}(y|x)

Compare with joint entropy:

H(X,Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 P_{X,Y}(x,y)
       = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 [P_{Y|X}(y|x) P_X(x)]
       = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 P_{Y|X}(y|x)
         - \sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 P_X(x)
       = H(Y|X) + H(X)

This is the chain rule for entropy:
  H(X_1, ..., X_n) = \sum_{i=1}^{n} H(X_i | X_1, ..., X_{i-1})

Question: is H(Y|X) = H(X|Y)?

Relative entropy

Relative entropy is a measure of the distance between two distributions; it is also known as the Kullback-Leibler distance between the PMFs P_X(x) and P_Y(y).

Definition:
  D(P_X || P_Y) = \sum_{x \in \mathcal{X}} P_X(x) \log \frac{P_X(x)}{P_Y(x)}

In effect we are considering the log-ratio to be a r.v. of which we take the mean (note that we use the conventions 0 \log 0 = 0 and p \log(p/0) = \infty for p > 0).

Mutual information

Mutual information: let X, Y be r.v.s with joint PMF P_{X,Y}(x,y) and marginal PMFs P_X(x) and P_Y(y).

Definition:
  I(X;Y) = \sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log \frac{P_{X,Y}(x,y)}{P_X(x) P_Y(y)}
         = D(P_{X,Y}(x,y) || P_X(x) P_Y(y))

Intuitively, mutual information is a measure of how dependent the r.v.s are.

Useful expressions for mutual information:
  I(X;Y) = H(X) + H(Y) - H(X,Y)
         = H(Y) - H(Y|X)
         = H(X) - H(X|Y)
         = I(Y;X)

Question: what is I(X;X)?

Mutual information: chain rule

Conditional mutual information:
  I(X;Y|Z) = H(X|Z) - H(X|Y,Z)

Chain rule:
  I(X_1, ..., X_n; Y) = H(X_1, ..., X_n) - H(X_1, ..., X_n | Y)
                      = \sum_{i=1}^{n} H(X_i | X_1, ..., X_{i-1}) - \sum_{i=1}^{n} H(X_i | X_1, ..., X_{i-1}, Y)
                      = \sum_{i=1}^{n} I(X_i; Y | X_1, ..., X_{i-1})

Look at 3 r.v.s:
  I(X_1, X_2; Y) = I(X_1; Y) + I(X_2; Y | X_1)
where I(X_2; Y | X_1) is the extra information about Y given by X_2, but not given by X_1.
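To make the entropy definitions above concrete, here is a minimal numerical sketch (not part of the original lecture) that computes H(X), H(Y), H(X,Y), and H(Y|X) for a small, arbitrarily chosen joint PMF and checks the chain rule H(X,Y) = H(Y|X) + H(X). The joint PMF p_xy and all names are illustrative assumptions, not anything specified in the notes.

from math import log2

# Arbitrary toy joint PMF P_{X,Y}(x,y) over x in {0,1}, y in {0,1} (an assumed example).
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}

def H(pmf):
    """Entropy in bits: H = -sum p log2 p, with the convention 0 log 0 = 0."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

# Marginals P_X and P_Y obtained by summing the joint PMF.
p_x, p_y = {}, {}
for (x, y), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

# Conditional entropy H(Y|X) = -sum_{x,y} P_{X,Y}(x,y) log2 P_{Y|X}(y|x).
H_Y_given_X = -sum(p * log2(p / p_x[x]) for (x, y), p in p_xy.items() if p > 0)

print("H(X)   =", H(p_x))        # about 0.8113 bits for this PMF
print("H(Y)   =", H(p_y))
print("H(X,Y) =", H(p_xy))       # joint entropy, 1.75 bits here
print("H(Y|X) =", H_Y_given_X)
# Chain rule check: H(X,Y) = H(Y|X) + H(X)
print("chain rule holds:", abs(H(p_xy) - (H_Y_given_X + H(p_x))) < 1e-12)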
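In the same spirit, a short sketch of relative entropy and mutual information: it computes D(P||Q) for two assumed PMFs on a common alphabet, then verifies on the same toy joint PMF that I(X;Y) = D(P_{X,Y} || P_X P_Y) = H(X) + H(Y) - H(X,Y). The distributions and names are assumptions made for illustration.

from math import log2, inf

def D(p, q):
    """Relative entropy D(p||q) in bits, with 0 log 0 = 0 and p log(p/0) = infinity."""
    total = 0.0
    for x, px in p.items():
        if px == 0:
            continue                 # convention: 0 log 0 = 0
        qx = q.get(x, 0.0)
        if qx == 0:
            return inf               # convention: p log(p/0) = infinity for p > 0
        total += px * log2(px / qx)
    return total

def H(pmf):
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

# Two assumed PMFs on the alphabet {0, 1}.
P = {0: 0.75, 1: 0.25}
Q = {0: 0.50, 1: 0.50}
print("D(P||Q) =", D(P, Q))          # nonnegative; zero iff P == Q
print("D(Q||P) =", D(Q, P))          # not equal to D(P||Q) in general

# Mutual information I(X;Y) = D(P_{X,Y} || P_X P_Y) on the toy joint PMF from the previous sketch.
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}
p_x = {0: 0.75, 1: 0.25}
p_y = {0: 0.625, 1: 0.375}
product_pmf = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}
I_xy = D(p_xy, product_pmf)

print("I(X;Y)             =", I_xy)
print("H(X)+H(Y)-H(X,Y)   =", H(p_x) + H(p_y) - H(p_xy))   # same value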
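Finally, a sketch of the three-variable case of the mutual information chain rule, I(X1, X2; Y) = I(X1; Y) + I(X2; Y | X1), using an arbitrarily assumed joint PMF over three binary r.v.s; every term is computed directly from entropies of marginals.

from math import log2
from itertools import product

def H(pmf):
    """Entropy in bits of a PMF given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def marginal(pmf, idxs):
    """Marginal PMF over the coordinates listed in idxs."""
    out = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i] for i in idxs)
        out[key] = out.get(key, 0.0) + p
    return out

# Arbitrary assumed joint PMF P_{X1,X2,Y} over {0,1}^3 (weights chosen so the r.v.s are dependent).
weights = {(x1, x2, y): 1 + x1 + 2 * x2 * y for x1, x2, y in product((0, 1), repeat=3)}
Z = sum(weights.values())
p = {k: w / Z for k, w in weights.items()}

H_x1    = H(marginal(p, [0]))
H_y     = H(marginal(p, [2]))
H_x1x2  = H(marginal(p, [0, 1]))
H_x1y   = H(marginal(p, [0, 2]))
H_x1x2y = H(p)

# I(X1,X2;Y) = H(X1,X2) + H(Y) - H(X1,X2,Y)
I_x1x2_y = H_x1x2 + H_y - H_x1x2y
# I(X1;Y) = H(X1) + H(Y) - H(X1,Y)
I_x1_y = H_x1 + H_y - H_x1y
# I(X2;Y|X1) = H(X2|X1) - H(X2|X1,Y) = H(X1,X2) + H(X1,Y) - H(X1) - H(X1,X2,Y)
I_x2_y_given_x1 = H_x1x2 + H_x1y - H_x1 - H_x1x2y

print("I(X1,X2;Y)           =", I_x1x2_y)
print("I(X1;Y) + I(X2;Y|X1) =", I_x1_y + I_x2_y_given_x1)   # equal, by the chain rule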
MIT OpenCourseWare
http://ocw.mit.edu

6.441 Information Theory
Spring 2010

For information about citing these materials or our Terms of Use, visit http://ocw.mit.edu.