A Relationship Between Information Inequalities and Group Theory

Desmond S. Lun

23 October 2002

1 Outline

This report presents the result recently published in [1] that establishes a one-to-one correspondence between information inequalities and group inequalities. Our aim in this report is to present this result in as concise a manner as possible whilst not excluding any steps in the derivation, thus making it suitable for brief perusal. In Sections 2–4, we introduce the notions of entropy functions, group-characterizable functions, information inequalities, and group inequalities. We confine most of the technical details to Section 5, a section that may be skipped if one is interested only in the general idea of the method by which the result is demonstrated. We complete the demonstration in Section 6 and conclude in Section 7.

2 Entropy functions

Let $X_1, X_2, \ldots, X_n$ be $n$ discrete random variables whose sample spaces are $\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_n$ respectively. Let $N = \{1, \ldots, n\}$ and let $\Omega$ be the collection of all non-empty subsets of $N$, so $|\Omega| = 2^n - 1$. For $\alpha \in \Omega$, we define $X_\alpha$ to be the joint random variable $(X_i)_{i \in \alpha}$ with sample space $\mathcal{X}_\alpha = \prod_{i \in \alpha} \mathcal{X}_i$ (i.e. the Cartesian product of $\mathcal{X}_i$ for $i \in \alpha$). For example, suppose $\alpha = \{1, 2\}$; then $X_{\{1,2\}}$ is the random pair $(X_1, X_2)$.

For a given set of random variables $X_1, X_2, \ldots, X_n$, we can define a real function over $\Omega$. Since $|\Omega|$ is finite, such a function is completely described by the set of values it takes on $\Omega$, i.e. by a point in $\mathbb{R}^{|\Omega|}$. An entropy function, therefore, is defined as follows.

Definition 1 Let $g$ be a vector in $\mathbb{R}^{|\Omega|}$ with components $g_\alpha$ indexed by $\alpha \in \Omega$. Then $g$ is an entropy function if there exists a set of random variables $X_1, X_2, \ldots, X_n$ such that $g_\alpha = H(X_\alpha)$ for all $\alpha$.

Let $\Gamma^*_n$ be the set of all entropy functions associated with $n$ random variables; that is

$$\Gamma^*_n = \{ g \in \mathbb{R}^{|\Omega|} \mid g \text{ is an entropy function} \}. \qquad (1)$$

3 Group-characterizable functions

We now define functions over $\Omega$ that can be characterized by a finite group $G$ and subgroups $G_1, G_2, \ldots, G_n$ in a particular way. Our ultimate aim will be to relate such group-characterizable functions to entropy functions. For a given $\alpha \in \Omega$, we denote the intersection of the subgroups $G_i$ for $i \in \alpha$ by $G_\alpha$, which is also a subgroup of $G$.

Definition 2 Let $h$ be a vector in $\mathbb{R}^{|\Omega|}$ with components $h_\alpha$ indexed by $\alpha \in \Omega$. Then $h$ is called group-characterizable if there exist subgroups $G_1, G_2, \ldots, G_n$ of a group $G$ such that $h_\alpha = \log(|G|/|G_\alpha|)$ for all $\alpha$.

Let $\Upsilon_n$ be the set of all group-characterizable functions associated with $n$ groups; that is

$$\Upsilon_n = \{ h \in \mathbb{R}^{|\Omega|} \mid h \text{ is group-characterizable} \}. \qquad (2)$$
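Definition 1 is straightforward to evaluate numerically. The following Python sketch is an illustration of ours, not part of the development in [1]; the name entropy_vector is hypothetical, and we fix base-2 logarithms although the report leaves the base of $\log$ unspecified. It computes the point $g \in \mathbb{R}^{|\Omega|}$ determined by a joint probability mass function.

```python
import itertools
from math import log2

def entropy_vector(pmf, n):
    """Entropy function of n discrete random variables, as a map from each
    non-empty subset alpha of {1, ..., n} to H(X_alpha) in bits.

    pmf: dict mapping n-tuples of outcomes to their joint probabilities.
    """
    g = {}
    for r in range(1, n + 1):
        for alpha in itertools.combinations(range(1, n + 1), r):
            # Marginalize the joint pmf onto the coordinates in alpha.
            marginal = {}
            for outcome, p in pmf.items():
                key = tuple(outcome[i - 1] for i in alpha)
                marginal[key] = marginal.get(key, 0.0) + p
            g[alpha] = -sum(p * log2(p) for p in marginal.values() if p > 0)
    return g

# Fully dependent pair: X1 uniform on {0, 1} and X2 = X1.
g = entropy_vector({(0, 0): 0.5, (1, 1): 0.5}, 2)
print(g)  # {(1,): 1.0, (2,): 1.0, (1, 2): 1.0}
```

For the fully dependent pair shown, every component of $g$ equals 1 bit, so this $g$ is a point of $\Gamma^*_2$.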
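A group-characterizable function can be evaluated just as directly by enumerating subgroup intersections. The sketch below is again ours; the choice $G = \mathbb{Z}_6$ with subgroups $G_1 = \{0, 2, 4\}$ and $G_2 = \{0, 3\}$ is an arbitrary example.

```python
import itertools
from math import log2

def group_characterizable_vector(G, subgroups):
    """h_alpha = log2(|G| / |G_alpha|) for each non-empty alpha, where
    G_alpha is the intersection of the subgroups G_i for i in alpha.

    G: set of group elements; subgroups: dict mapping i to a subgroup of G.
    """
    h = {}
    indices = sorted(subgroups)
    for r in range(1, len(indices) + 1):
        for alpha in itertools.combinations(indices, r):
            # An intersection of subgroups is itself a subgroup of G.
            G_alpha = set.intersection(*(set(subgroups[i]) for i in alpha))
            h[alpha] = log2(len(G) / len(G_alpha))
    return h

# G = Z_6 under addition mod 6, with G1 = {0, 2, 4} and G2 = {0, 3}.
h = group_characterizable_vector(set(range(6)), {1: {0, 2, 4}, 2: {0, 3}})
print(h)  # {(1,): 1.0, (2,): log2(3), (1, 2): log2(6)}
```

Here $h_{\{1\}} = \log_2 2$, $h_{\{2\}} = \log_2 3$ and $h_{\{1,2\}} = \log_2 6$, each the logarithm of a subgroup index, as Definition 2 requires.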
4 Information inequalities and group inequalities

It was noted in [3] that an information inequality

$$\sum_{\alpha \in \Omega} b_\alpha H(X_\alpha) \ge 0 \qquad (3)$$

is valid if and only if

$$\Gamma^*_n \subset \{ h \in \mathbb{R}^{|\Omega|} \mid b^\top h \ge 0 \}, \qquad (4)$$

where $b$ is the column vector with components $b_\alpha$ indexed by $\alpha \in \Omega$. This observation, which follows immediately from the definition of $\Gamma^*_n$, highlights the importance of characterizing $\Gamma^*_n$, or equivalently its closure $\bar{\Gamma}^*_n$, to the study of information inequalities; indeed, because $\{ h \in \mathbb{R}^{|\Omega|} \mid b^\top h \ge 0 \}$ is closed, condition (4) is equivalent to

$$\bar{\Gamma}^*_n \subset \{ h \in \mathbb{R}^{|\Omega|} \mid b^\top h \ge 0 \}. \qquad (5)$$

The characterization of $\Gamma^*_n$ or its closure appears to be a highly non-trivial problem in general [6].

Likewise, a group inequality of the form

$$\sum_{\alpha \in \Omega} b_\alpha \log \frac{|G|}{|G_\alpha|} \ge 0 \qquad (6)$$

is valid if and only if

$$\Upsilon_n \subset \{ h \in \mathbb{R}^{|\Omega|} \mid b^\top h \ge 0 \}. \qquad (7)$$

Thus, we are interested in the relationship between $\Upsilon_n$ and $\Gamma^*_n$. We observe that every component of a group-characterizable function is the logarithm of a positive integer, namely the index $|G|/|G_\alpha|$, so there are at most countably many group-characterizable functions. There are, however, uncountably many entropy functions. Hence, there are far fewer group-characterizable functions than there are entropy functions. It turns out, nevertheless, that $\Upsilon_n$ is almost good enough to characterize $\Gamma^*_n$: in fact $\mathrm{conv}(\Upsilon_n)$, the convex closure of $\Upsilon_n$, is equal to $\bar{\Gamma}^*_n$, the closure of $\Gamma^*_n$, as we shall establish in the next section.

5 The equivalence of $\mathrm{conv}(\Upsilon_n)$ and $\bar{\Gamma}^*_n$

Theorem 1 $\mathrm{conv}(\Upsilon_n) = \bar{\Gamma}^*_n$ for all natural numbers $n$.

To establish this theorem, we require a number of lemmas.

Lemma 1 [1, Theorem 3.1] If $h$ is group-characterizable, then it is an entropy function, i.e. $h \in \Gamma^*_n$.

Proof. Let $\Lambda$ be a uniformly-distributed discrete random variable with sample space $G$. For any $i \in N$, let the random variable $X_i$ be a function of $\Lambda$ such that $X_i$ is the left coset $aG_i$ if $\Lambda = a$. Let $\alpha$ be an element of $\Omega$. Then

$$P\big((X_i = a_i G_i)_{i \in \alpha}\big) = P\Big(\Lambda \in \bigcap_{i \in \alpha} a_i G_i\Big) = \frac{\big|\bigcap_{i \in \alpha} a_i G_i\big|}{|G|}. \qquad (8)$$

Consider $\bigcap_{i \in \alpha} a_i G_i$. If it is non-empty, then let $b$ be an element of $\bigcap_{i \in \alpha} a_i G_i$, and we have

$$\bigcap_{i \in \alpha} a_i G_i = \bigcap_{i \in \alpha} b G_i = b \bigcap_{i \in \alpha} G_i = b G_\alpha. \qquad (9)$$

Therefore, the intersection $\bigcap_{i \in \alpha} a_i G_i$ is either empty or of size $|G_\alpha|$. Moreover, by Lagrange's theorem, there are $|G|/|G_\alpha|$ distinct such non-empty intersections. Hence, we have

$$P\big((X_i = a_i G_i)_{i \in \alpha}\big) = \begin{cases} |G_\alpha|/|G| & \text{if } \bigcap_{i \in \alpha} a_i G_i \ne \emptyset, \\ 0 & \text{otherwise}. \end{cases} \qquad (10)$$

That is, $(X_i)_{i \in \alpha}$ is uniformly distributed over its $|G|/|G_\alpha|$ possible values, so its entropy is $\log(|G|/|G_\alpha|) = h_\alpha$. Therefore, $h$ is an entropy function. □

Lemma 2 [5, Theorem 1] $\bar{\Gamma}^*_n$ is a convex cone.

Proof. We shall first demonstrate that $\bar{\Gamma}^*_n$ is convex. Let $u, v$ be the entropy functions of two sets of random variables $Y_1, Y_2, \ldots, Y_n$ and $Z_1, Z_2, \ldots, Z_n$ respectively. Thus $u, v \in \Gamma^*_n$. It suffices to show that for any $0 < b < 1$, $bu + (1-b)v \in \bar{\Gamma}^*_n$ (for it is then straightforward to extend the result to any convex combination of points in the closure of $\Gamma^*_n$).

Let $(Y_1^{(i)}, Y_2^{(i)}, \ldots, Y_n^{(i)})$ for $1 \le i \le k$ be $k$ independent vectors each distributed identically to $(Y_1, Y_2, \ldots, Y_n)$. Likewise, let $(Z_1^{(i)}, Z_2^{(i)}, \ldots, Z_n^{(i)})$ for $1 \le i \le k$ be $k$ independent vectors each distributed identically to $(Z_1, Z_2, \ldots, Z_n)$. Let $U$ be a random variable independent of all other random variables having the distribution

$$p_U(u) = \begin{cases} 1 - \delta - \mu & \text{if } u = 0, \\ \delta & \text{if } u = 1, \\ \mu & \text{if } u = 2. \end{cases} \qquad (11)$$

Observe that $H(U) \to 0$ as $\delta, \mu \to 0$. We now construct the random variables $X_1, X_2, \ldots, X_n$ by

$$X_i = \begin{cases} 0 & \text{if } U = 0, \\ (Y_i^{(1)}, Y_i^{(2)}, \ldots, Y_i^{(k)}) & \text{if } U = 1, \\ (Z_i^{(1)}, Z_i^{(2)}, \ldots, Z_i^{(k)}) & \text{if } U = 2. \end{cases} \qquad (12)$$

So for any $\alpha \in \Omega$, we have

$$H(X_\alpha) \le H(X_\alpha, U) = H(U) + H(X_\alpha \mid U) = H(U) + \delta k H(Y_\alpha) + \mu k H(Z_\alpha) \qquad (13)$$

and

$$H(X_\alpha) \ge H(X_\alpha \mid U) = \delta k H(Y_\alpha) + \mu k H(Z_\alpha). \qquad (14)$$

Combining the two previous inequalities, we have

$$0 \le H(X_\alpha) - \big(\delta k H(Y_\alpha) + \mu k H(Z_\alpha)\big) \le H(U). \qquad (15)$$

And by taking $\delta = b/k$ and $\mu = (1-b)/k$, we obtain

$$0 \le H(X_\alpha) - \big(b H(Y_\alpha) + (1-b) H(Z_\alpha)\big) \le H(U). \qquad (16)$$

By taking $k$ sufficiently large, $H(U)$ can be made arbitrarily small, so $H(X_\alpha)$ can be brought arbitrarily close to $b H(Y_\alpha) + (1-b) H(Z_\alpha)$ for every $\alpha$; hence $bu + (1-b)v \in \bar{\Gamma}^*_n$.
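As a concrete check of Lemma 1, the following sketch (ours, reusing the arbitrary example $G = \mathbb{Z}_6$, $G_1 = \{0, 2, 4\}$, $G_2 = \{0, 3\}$ from Section 3) builds the coset-valued random variables $X_i$ from a uniform $\Lambda$ and verifies that $H(X_\alpha) = \log(|G|/|G_\alpha|)$ exactly, as the uniformity in (10) predicts.

```python
from math import isclose, log2

# Lemma 1's construction: Lambda is uniform on G, and X_i is the coset
# of G_i containing Lambda.
G = range(6)
subgroups = {1: {0, 2, 4}, 2: {0, 3}}

def coset(a, H):
    """The coset a + H in the additive group Z_6 (frozen for use as a dict key)."""
    return frozenset((a + x) % 6 for x in H)

for alpha in [(1,), (2,), (1, 2)]:
    # Joint distribution of (X_i)_{i in alpha}: each Lambda = a has mass 1/6.
    dist = {}
    for a in G:
        key = tuple(coset(a, subgroups[i]) for i in alpha)
        dist[key] = dist.get(key, 0.0) + 1.0 / 6
    H_alpha = -sum(p * log2(p) for p in dist.values())
    G_alpha = set.intersection(*(subgroups[i] for i in alpha))
    # Per (10), (X_i)_{i in alpha} is uniform over |G|/|G_alpha| values.
    assert isclose(H_alpha, log2(6 / len(G_alpha)))
    print(alpha, H_alpha)  # 1.0, log2(3), log2(6)
```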
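The closing step of Lemma 2's proof is also easy to check numerically: with $\delta = b/k$ and $\mu = (1-b)/k$, the bound $H(U)$ on the gap in (16) vanishes as $k$ grows. A minimal sketch of ours, with $b = 0.5$ an arbitrary illustrative choice:

```python
from math import log2

def H_U(b, k):
    """Entropy (bits) of U in (11) with delta = b/k and mu = (1 - b)/k."""
    probs = [1 - 1.0 / k, b / k, (1 - b) / k]
    return -sum(p * log2(p) for p in probs if p > 0)

for k in [1, 10, 100, 1000, 10000]:
    print(k, H_U(0.5, k))  # decreases toward 0 as k grows
```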

