On the role of MMSE estimation in approaching theinformation-theoretic limits of linear Gaussian channels:Shannon meets WienerG. David Forney, Jr.1Abstract. This paper explains why MMSE estimation arises in lattice-based strategies for approachingthe capacity of linear Gaussian channels, and comments on its properties.1 IntroductionRecently, Erez and Zamir [11, 13, 14, 32] have cracked the long-standing problem of achievingthe capacity of additive white Gaussian noise (AWGN) channels using lattice codes and latticedecoding. Their method uses Voronoi codes (nested lattice codes), dither, and an MMSE estima-tion factor α that had previously been introduced in more complex multiterminal scenarios, suchas Costa’s “dirty-paper channel” [8]. However, they give no fundamental explanation for whyan MMSE estimator, which is seemingly an artifact from the world of analog communications,plays such a key role in the digital communications problem of achieving channel capacity.The principal purpose of this paper is to provide such an explanation, in the lattice-basedcontext of a mod-Λ AWGN channel model. We discuss various properties of MMSE-basedschemes in this application, some of which are unexpected.MMSE estimators also appear as part of capacity-achieving solutions for more complicated dig-ital communication scenarios involving linear Gaussian channels. While some of the explanationfor the role of MMSE estimation in these more complex situations is no doubt information-theoretic (see, e.g., [20]), the observations of this paper may provide some clues as to whyMMSE estimation and lattice-type coding also work well in these more general applications.2 Lattice-based coding for the AWGN channelConsider the real discrete-time AWGN channel Y = X + N, where E[X2] ≤ Sxand N isindependent2zero-mean Gaussian noise with variance Sn. The capacity is C =12log2(1 + SNR)bits per dimension (b/d), where SNR = Sx/Sn. Following Erez and Zamir [11, 13, 14, 32], wewill show that lattice-based transmission systems can approach the capacity of this channel.2.1 Lattices and spheresGeometrically, an N-dimensional lattice Λ is a regular array of points in RN. Algebraically, Λis a discrete subgroup of RNwhich spans RN. A Voronoi region RV(Λ) of Λ represents thequotient group RN/Λ by a set of minimum-energy coset representatives for the cosets of Λ inRN. For any x ∈ RN, “x mod Λ” denotes the unique element of RV(Λ) in the coset Λ + x.Geometrically, RNis the disjoint union of the translated Voronoi regions {RV(Λ) + λ, λ ∈ Λ}.The volume V (Λ) of RV(Λ) is therefore the volume of RNassociated with each point of Λ.1MIT, Cambridge, MA 02139 USA. E-mail: [email protected] that without the independence of N, the “additive” property is vacuous, since for any real-input, real-output channel we may define N = Y − X, and then express Y as Y = X + N. We exploit this idea later.1 As N → ∞, the Voronoi regions of some N-dimensional lattices can become more or less spher-ical, in various senses. As N → ∞, an N-sphere (ball) of squared radius Nρ2has normalizedvolume (per two dimensions)V⊗(Nρ2)2/NN→∞−→ 2πeρ2.The average energy per dimension of a uniform probability distribution over such an N -spheregoes to P⊗(Nρ2) = ρ2. The probability that an iid Gaussian random N-tuple with zero meanand symbol variance Snfalls outside the N -sphere becomes arbitrarily small for any Sn< ρ2.It is known that there exist high-dimensional lattices whose Voronoi regions are quasi-sphericalin the following second moment sense. The normalized second moment of a compact regionR ⊂ RNof volume V (R) is defined asG(R) =P (R)V (R)2/N,where P (R) is the average energy per dimension of a uniform probability distribution over R.The normalized second moment of R exceeds that of an N-sphere. The normalized secondmoment of an N-sphere decreases monotonically with N and approaches12πeas N → ∞. Aresult of Poltyrev, presented by Feder and Zamir [30], is that there exist lattices Λ such thatlog 2πeG(Λ) is arbitrarily small, where G(Λ) denotes the normalized second moment of RV(Λ).Such lattices are said to be “good for quantization,” or “good for shaping.”Poltyrev [26] has also shown that there exist high-dimensional lattices whose Voronoi regionsare quasi-spherical in the sense that the probability that an iid Gaussian noise N-tuple withsymbol variance Snfalls outside the Voronoi region RV(Λ) is arbitrarily small as long asSn<V (Λ)2/N2πe.Such lattices are said to be “good for AWGN channel coding,” or “sphere-bound-achieving” [19].2.2 Mod-lattice transmission and capacityWe now show that the mod-Λ transmission system shown in Figure 1 can approach the channelcapacity C =12log2(1 + Sx/Sn) b/d arbitrarily closely, provided that G(Λ) ≈ 1/(2πe) and f(Y)is a MMSE estimator of X.-X ∈ RV(Λ)Sx+?SnN-Yf-f(Y)mod Λ-Y0∈ RV(Λ)Figure 1. Mod-Λ transmission system over an AWGN channel.This system is based on an N-dimensional lattice Λ whose Voronoi region RV(Λ) has volumeV (Λ), average energy per dimension P (Λ) = Sxunder a uniform probability distribution overRV(Λ), and thus normalized second moment G(Λ) = P (Λ)/V (Λ)2/N.The N-dimensional input vector X is restricted to the Voronoi region RV(Λ). The outputvector Y is mapped by some function f to another vector f(Y) ∈ RN, which is then mappedmodulo Λ to Y0= f(Y) mod Λ, also in the Voronoi region RV(Λ).2 2.3 MMSE estimation and capacityOur main result is that capacity can be approached in the system of Figure 1 if and only if thelattice Λ is “good for shaping” and the function f(Y) is an MMSE estimator. (The sufficiencyof these conditions was shown in [13].)As a first step, we derive a lower bound:Theorem 1 (Mod-Λ channel capacity and MMSE estimation) The capacity C(Λ, f) ofthe mod-Λ transmission system of Figure 1 is lowerbounded byC(Λ, f) ≥ C −12log22πeG(Λ) −12log2Se,fSeb/d,where C =12log2(1 + SNR) b/d is the capacity of the underlying AWGN channel, G(Λ) is thenormalized second moment of RV(Λ), and Se,fand Seare the average energies per dimension ofEf= f(Y) −X and of E =ˆX(Y) −X, respectively, whereˆX(Y) is the linear MMSE estimatorof X given Y.The key to the proof of this theorem is the introduction of a dither variable U that is knownto both transmitter and receiver, and whose
View Full Document