A Relationship Between Information Inequalities and Group Theory

Desmond S. Lun

23 October 2002

1 Outline

This report presents the result recently published in [1] that establishes a one-to-one correspondence between information inequalities and group inequalities. Our aim in this report is to present this result in as concise a manner as possible whilst not excluding any steps in the derivation, thus making it suitable for brief perusal. In Sections 2–4, we introduce the notions of entropy functions, group-characterizable functions, information inequalities, and group inequalities. We confine most of the technical details to Section 5, a section that may be skipped if one is interested only in the general idea of the method by which the result is demonstrated. We complete the demonstration in Section 6 and conclude in Section 7.

2 Entropy functions

Let $X_1, X_2, \ldots, X_n$ be $n$ discrete random variables whose sample spaces are $\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_n$ respectively. Let $N = \{1, \ldots, n\}$ and let $\Omega$ be the collection of all non-empty subsets of $N$, so $|\Omega| = 2^n - 1$. For $\alpha \in \Omega$, we define $X_\alpha$ to be the joint random variable $(X_i)_{i \in \alpha}$ with sample space $\mathcal{X}_\alpha = \prod_{i \in \alpha} \mathcal{X}_i$ (i.e. the Cartesian product of $\mathcal{X}_i$ for $i \in \alpha$). For example, suppose $\alpha = \{1, 2\}$; then $X_{\{1,2\}}$ is the random pair $(X_1, X_2)$.

For a given set of random variables $X_1, X_2, \ldots, X_n$, we can define a real function over $\Omega$. Since $|\Omega|$ is finite, such a function is completely described by the set of values it takes on $\Omega$, i.e. by a point in $\mathbb{R}^{|\Omega|}$. An entropy function, therefore, is defined as follows.

Definition 1  Let $g$ be a vector in $\mathbb{R}^{|\Omega|}$ with components $g_\alpha$ indexed by $\alpha \in \Omega$. Then $g$ is an entropy function if there exists a set of random variables $X_1, X_2, \ldots, X_n$ such that $g_\alpha = H(X_\alpha)$ for all $\alpha$.

Let $\Gamma^*_n$ be the set of all entropy functions associated with $n$ random variables; that is,

$$\Gamma^*_n = \{ g \in \mathbb{R}^{|\Omega|} \mid g \text{ is an entropy function} \}. \tag{1}$$

3 Group-characterizable functions

We now define functions over $\Omega$ that can be characterized by a finite group $G$ and subgroups $G_1, G_2, \ldots, G_n$ in a particular way.
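To make Definition 1 concrete, here is a small sketch in Python (the helper names are ours, and entropies are computed in bits, i.e. with base-2 logarithms) that builds the point $g \in \mathbb{R}^{|\Omega|}$ from a given joint distribution:

```python
import itertools
from math import log2

def entropy(dist):
    """Shannon entropy (in bits) of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def entropy_function(joint, n):
    """Return {alpha: H(X_alpha)} for every non-empty alpha, i.e. the
    entropy function g as a point in R^(2^n - 1)."""
    g = {}
    for r in range(1, n + 1):
        for alpha in itertools.combinations(range(n), r):
            # Marginalize the joint pmf onto the coordinates in alpha.
            marginal = {}
            for outcome, p in joint.items():
                key = tuple(outcome[i] for i in alpha)
                marginal[key] = marginal.get(key, 0.0) + p
            g[alpha] = entropy(marginal)
    return g

# Two independent fair bits: H(X1) = H(X2) = 1 and H(X1, X2) = 2.
joint = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
g = entropy_function(joint, 2)
```

Here $n = 2$, so $g$ has $2^2 - 1 = 3$ components, indexed by $\{1\}$, $\{2\}$, and $\{1, 2\}$ (represented above as the tuples `(0,)`, `(1,)`, and `(0, 1)`).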
Our ultimate aim will be to relate such group-characterizable functions to entropy functions. For a given $\alpha \in \Omega$, we denote the intersection of the subgroups $G_i$ for $i \in \alpha$ by $G_\alpha$, which is also a subgroup of $G$.

Definition 2  Let $h$ be a vector in $\mathbb{R}^{|\Omega|}$ with components $h_\alpha$ indexed by $\alpha \in \Omega$. Then $h$ is called group-characterizable if there exist subgroups $G_1, G_2, \ldots, G_n$ of a group $G$ such that $h_\alpha = \log(|G|/|G_\alpha|)$ for all $\alpha$.

Let $\Upsilon_n$ be the set of all group-characterizable functions associated with $n$ groups; that is,

$$\Upsilon_n = \{ h \in \mathbb{R}^{|\Omega|} \mid h \text{ is group-characterizable} \}. \tag{2}$$

4 Information inequalities and group inequalities

It was noted in [3] that an information inequality

$$\sum_{\alpha \in \Omega} b_\alpha H(X_\alpha) \ge 0 \tag{3}$$

is valid if and only if

$$\Gamma^*_n \subset \{ h \in \mathbb{R}^{|\Omega|} \mid b^{\mathrm{T}} h \ge 0 \}, \tag{4}$$

where $b$ is the column vector with components $b_\alpha$ indexed by $\alpha \in \Omega$. This observation, which follows immediately from the definition of $\Gamma^*_n$, highlights the importance to the study of information inequalities of characterizing $\Gamma^*_n$ or, equivalently, its closure $\overline{\Gamma^*_n}$: because $\{ h \in \mathbb{R}^{|\Omega|} \mid b^{\mathrm{T}} h \ge 0 \}$ is closed, condition (4) is equivalent to

$$\overline{\Gamma^*_n} \subset \{ h \in \mathbb{R}^{|\Omega|} \mid b^{\mathrm{T}} h \ge 0 \}. \tag{5}$$

The characterization of $\Gamma^*_n$ or its closure appears to be a highly non-trivial problem in general [6].

Likewise, a group inequality of the form

$$\sum_{\alpha \in \Omega} b_\alpha \log \frac{|G|}{|G_\alpha|} \ge 0 \tag{6}$$

is valid if and only if

$$\Upsilon_n \subset \{ h \in \mathbb{R}^{|\Omega|} \mid b^{\mathrm{T}} h \ge 0 \}. \tag{7}$$

Thus, we are interested in the relationship between $\Upsilon_n$ and $\Gamma^*_n$. We observe that the components of a group-characterizable function are logarithms of rational numbers, so there are at most countably many group-characterizable functions. There are, however, uncountably many entropy functions. Hence, there are far fewer group-characterizable functions than there are entropy functions.
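As a small illustration of Definition 2, the following sketch (helper names are ours; logarithms base 2; $G = \mathbb{Z}_6$ under addition mod 6, with subgroups $G_1 = \{0, 2, 4\}$ and $G_2 = \{0, 3\}$) computes $h_\alpha = \log(|G|/|G_\alpha|)$ directly, and also checks that the same values arise as entropies of coset-valued random variables $X_i = \Lambda + G_i$ with $\Lambda$ uniform on $G$, a construction that reappears in Section 5:

```python
import itertools
from math import log2

# G = Z6 under addition mod 6, with two of its subgroups.
G = list(range(6))
subgroups = [{0, 2, 4}, {0, 3}]   # G1 (order 3), G2 (order 2)

def intersection(alpha):
    """G_alpha: the intersection of G_i for i in alpha."""
    G_alpha = set(G)
    for i in alpha:
        G_alpha &= subgroups[i]
    return G_alpha

def coset(a, H):
    """The coset a + H in Z6 (written additively, so left = right)."""
    return frozenset((a + x) % 6 for x in H)

def entropy(dist):
    """Shannon entropy (in bits) of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

h = {}
for r in (1, 2):
    for alpha in itertools.combinations(range(2), r):
        # Direct definition: h_alpha = log(|G| / |G_alpha|).
        h[alpha] = log2(len(G) / len(intersection(alpha)))

        # Coset construction: Lambda uniform on G, X_i = Lambda + G_i.
        dist = {}
        for a in G:
            key = tuple(coset(a, subgroups[i]) for i in alpha)
            dist[key] = dist.get(key, 0.0) + 1 / len(G)
        assert abs(entropy(dist) - h[alpha]) < 1e-9
```

Each component of $h$ is the logarithm of an integer (by Lagrange's theorem, $|G_\alpha|$ divides $|G|$), consistent with the countability observation above.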
It turns out, however, that $\Upsilon_n$ is almost good enough to characterize $\Gamma^*_n$: in fact $\mathrm{conv}(\Upsilon_n)$, the convex closure of $\Upsilon_n$, is equal to $\overline{\Gamma^*_n}$, the closure of $\Gamma^*_n$, as we shall establish in the next section.

5 The equivalence of $\mathrm{conv}(\Upsilon_n)$ and $\overline{\Gamma^*_n}$

Theorem 1  $\mathrm{conv}(\Upsilon_n) = \overline{\Gamma^*_n}$ for all natural numbers $n$.

To establish this theorem, we require a number of lemmas.

Lemma 1 [1, Theorem 3.1]  If $h$ is group-characterizable, then it is an entropy function, i.e. $h \in \Gamma^*_n$.

Proof. Let $\Lambda$ be a uniformly-distributed discrete random variable with sample space $G$. For any $i \in N$, let the random variable $X_i$ be a function of $\Lambda$ such that $X_i$ is the left coset $aG_i$ if $\Lambda = a$. Let $\alpha$ be an element of $\Omega$. Then

$$P\left((X_i = a_i G_i)_{i \in \alpha}\right) = P\left(\Lambda \in \cap_{i \in \alpha}\, a_i G_i\right) = \frac{|\cap_{i \in \alpha}\, a_i G_i|}{|G|}. \tag{8}$$

Consider $\cap_{i \in \alpha}\, a_i G_i$. If it is non-empty, then let $b$ be an element of $\cap_{i \in \alpha}\, a_i G_i$, and we have

$$\cap_{i \in \alpha}\, a_i G_i = \cap_{i \in \alpha}\, b G_i = b \cap_{i \in \alpha} G_i = b G_\alpha. \tag{9}$$

Therefore, the intersection $\cap_{i \in \alpha}\, a_i G_i$ is either empty or of size $|G_\alpha|$. Moreover, by Lagrange's theorem, there are $|G|/|G_\alpha|$ distinct such non-empty intersections. Hence, we have

$$P\left((X_i = a_i G_i)_{i \in \alpha}\right) = \begin{cases} \dfrac{|G_\alpha|}{|G|} & \text{if } \cap_{i \in \alpha}\, a_i G_i \ne \emptyset, \\ 0 & \text{otherwise.} \end{cases} \tag{10}$$

It is then evident that the entropy of $(X_i)_{i \in \alpha}$ is $\log(|G|/|G_\alpha|) = h_\alpha$. Therefore, $h$ is an entropy function. □

Lemma 2 [5, Theorem 1]  $\overline{\Gamma^*_n}$ is a convex cone.

Proof. We shall first demonstrate that $\overline{\Gamma^*_n}$ is convex. Let $u, v$ be the entropy functions of two sets of random variables $Y_1, Y_2, \ldots, Y_n$ and $Z_1, Z_2, \ldots, Z_n$ respectively. Thus $u, v \in \Gamma^*_n$. It suffices to show that for any $0 < b < 1$, $bu + (1-b)v \in \overline{\Gamma^*_n}$ (for it is then straightforward to extend the result to any convex combination of points in the closure of $\Gamma^*_n$).

Let $(Y_1^{(i)}, Y_2^{(i)}, \ldots, Y_n^{(i)})$ for $1 \le i \le k$ be $k$ independent vectors each distributed identically to $(Y_1, Y_2, \ldots, Y_n)$. Likewise, let $(Z_1^{(i)}, Z_2^{(i)}, \ldots, Z_n^{(i)})$ for $1 \le i \le k$ be $k$ independent vectors each distributed identically to
$(Z_1, Z_2, \ldots, Z_n)$. Let $U$ be a random variable, independent of all other random variables, having the distribution

$$p_U(u) = \begin{cases} 1 - \delta - \mu & \text{if } u = 0, \\ \delta & \text{if } u = 1, \\ \mu & \text{if } u = 2. \end{cases} \tag{11}$$

Observe that $H(U) \to 0$ as $\delta, \mu \to 0$.

We now construct the random variables $X_1, X_2, \ldots, X_n$ by

$$X_i = \begin{cases} 0 & \text{if } U = 0, \\ (Y_i^{(1)}, Y_i^{(2)}, \ldots, Y_i^{(k)}) & \text{if } U = 1, \\ (Z_i^{(1)}, Z_i^{(2)}, \ldots, Z_i^{(k)}) & \text{if } U = 2. \end{cases} \tag{12}$$

So for any $\alpha \in \Omega$, we have

$$H(X_\alpha) \le H(X_\alpha, U) = H(U) + H(X_\alpha \mid U) = H(U) + \delta k H(Y_\alpha) + \mu k H(Z_\alpha) \tag{13}$$

and

$$H(X_\alpha) \ge H(X_\alpha \mid U) = \delta k H(Y_\alpha) + \mu k H(Z_\alpha). \tag{14}$$

Combining the two previous inequalities, we have

$$0 \le H(X_\alpha) - (\delta k H(Y_\alpha) + \mu k H(Z_\alpha)) \le H(U). \tag{15}$$

And by taking $\delta = b/k$ and $\mu = (1-b)/k$, we obtain

$$0 \le H(X_\alpha) - (b H(Y_\alpha) + (1-b) H(Z_\alpha)) \le H(U). \tag{16}$$

By taking $k$ sufficiently large, $H(U)$ can be made arbitrarily small, so the entropy function of $X_1, X_2, \ldots, X_n$ can be made arbitrarily close to $bu + (1-b)v$. Hence $bu + (1-b)v \in \overline{\Gamma^*_n}$.
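The final step of the proof rests on $H(U)$ vanishing as $k$ grows. A quick numerical check of this (a sketch with a name of our choosing; entropies in bits), for $b = 1/2$ and the choices $\delta = b/k$, $\mu = (1-b)/k$ used in (16):

```python
from math import log2

def switch_entropy(b, k):
    """H(U) for the switch variable U with P(U=0) = 1 - 1/k,
    P(U=1) = b/k, and P(U=2) = (1-b)/k, i.e. delta = b/k, mu = (1-b)/k."""
    probs = [1 - 1 / k, b / k, (1 - b) / k]
    return -sum(p * log2(p) for p in probs if p > 0)

# The gap between H(X_alpha) and b*H(Y_alpha) + (1-b)*H(Z_alpha) is at
# most H(U), which shrinks toward zero as k increases.
gaps = [switch_entropy(0.5, k) for k in (10, 100, 1000, 10000)]
```

The computed gaps decrease monotonically toward zero, so the entropy function of $X_1, \ldots, X_n$ approximates $bu + (1-b)v$ to any desired accuracy.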