Unformatted text preview:

1354 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 42, NO. 5, SEPTEMBER 1996 Hierarchical Universal Coding Meir Feder, Senior Member, IEEE, and Neri Merhav, Senior Member, IEEE Abstract-In an earlier paper, we proved a strong version of the redundancy-capacity converse theorem of universal coding, stating that for “most” sources in a given class, the universal coding redundancy is essentially lower-bounded by the capacity of the channel induced by this class. Since this result holds for general classes of sources, it extends Rissanen’s strong converse theorem for parametric families. While our earlier result has established strong opthnality only for mixture codes weighted by the capacity-achieving prior, our first result herein extends this finding to a general prior. For some cases our technique also leads to a simplified proof of the above mentioned strong converse theorem. The major interest in this paper, however, is in extending the theory of universal coding to hierarchical structures of classes, where each class may have a different capacity. In this setting, one wishes to incur redundancy essentially as small as that corresponding to the active class, and not the union of classes. Our main result is that the redundancy of a code based on a two-stage mixture (first, within each class, and then over the classes), is no worse than that of any other code for “most” sources of “most” classes. If, in addition, the classes can be efficiently distinguished by a certain decision rule, then the best attainable redundancy is given explicitly by the capacity of the active class plus the normalized negative logarithm of the prior probability assigned to this class. These results suggest some interesting guidelines as for the choice of the prior. We also discuss some examples with a natural hierarchical partition into classes. Index Terms- Universal coding, minimax redundancy, max- imin redundancy, capacity, redundancy-capacity theorem, mix- tures, arbitrarily varying sources. I. INTR~DuC~~N I N THE basic classical setting of the problem of universal coding it is assumed that, although the exact information source is unknown, it is still known to belong to a given class {P(.l@, 0 E A}, e.g., memoryless sources, first-order Markov sources, and so on. The performance of a universal code is measured in terms of the excess compression ratio beyond the entropy, namely, the redundancy rate R, (L, 0)) which depends on the code length function L(.), the source indexed by 0, and the data record length n. The minimax redundancy defined by Davisson [9], is the minimum uniform redundancy rate that can be attained for all sources in the class. Gallager [13] was the first to show (see also, e.g., [ll], [22]) that Manuscript received April 27, 1994; revised January 20, 1996. This work was supported in part by the S. Neaman Institute and the Wolfson Research Awards administered by the Israel Academy of Science and Humanities. M. Feder is with the Department of Electrical Engineering-Systems, Tel Aviv University, Tel Aviv 69978, Israel. N. Merhav is with the Department of Electrical Engineering, Tech- nion-Israel Institute of Technology, Haifa 32000, Israel. Publisher Item Identifier S 0018-9448(96)05451-X. R$ = C,, where C, is the capacity (per symbol) of the “channel” from 6’ to the source string z = (~1,. . . ,x,), i.e., the channel defined by the set of conditional probabilities {P(zlO),B E A}. Th is redundancy rate can be achieved by an encoder whose length function corresponds to a mixture of the sources in the class, where the weighting of each source 0 is given by the capacity-achieving distribution. Thus the capacity C, = Rz actually measures the richness of class from the viewpoint of universal coding. One may argue that the minimax redundancy is a pes- simistic measure for universal coding redundancy since it serves as a lower bound to the redundancy for the worst source only. Nevertheless, for smooth parametric classes of sources, Rissanen [18] has shown that this (achievable) lower bound essentially applies to most sources in the class, namely, for all 6’ except for a subset B whose Lebesgue measure vanishes with n. In a recent paper [16], we have extended this result to general classes of information sources, stating that for any given L, R, (L, 0) is essentially never smaller than C,, simultaneously for every 0 except for a “small” subset B. The subset B is small in the sense of having a vanishing measure w.r.t. the prior w* that achieves (or nearly achieves) capacity.’ The results in [16] strengthen the notion of Shannon capacity in characterizing the richness of a class of sources. In this context, aur first contribution here is in developing a technique that both isimplifies the proof and extends the result of [16] to a general prior, not only the capacity-achieving prior. In light of all these findings, this basic setting of universal coding for classes with uniform redundancy rates is now well understood. Another category of results in universal lossless source coding corresponds to situations where the class of sources is so large and rich, that there are no uniform redundancy rates at all; for example, the class of all stationary and ergodic sources. In these situations, the goal is normally to devise data compression schemes that are universal in the weak sense only, namely, schemeb that asymptotically attain the entropy of every source, but there is no characterization of the redundancy, which might decay arbitrarily slowly for some sources. In fact, this example of the class of all stationary and ergodic sources is particularly interesting because it can be thought of as a “closure” of the union of all classes Ri of i&order Markov sources: every stationary and ergodic source can be


View Full Document

MIT 6 441 - Study Notes

Download Study Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Study Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Study Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?