Soft decoding, dual BCH codes, and better list-decodable ε-biased codes


Contents

- Introduction
  - Our Bounds and Comparison With Previously Best Known Bounds
  - Organization of the Paper
- Coding Basics
- Coding Ingredients
  - Parvaresh-Vardy Codes and Their Soft Decoding
  - Johnson Bound
  - Codes on the Gilbert-Varshamov Bound
- Dual BCH Codes
  - Weight Distribution and Duals of Linear Codes
  - Moments of Coset Weight Distribution of Dual BCH Codes
- Binary Codes List Decodable in Presence of High Noise
  - Construction With Near-Cubic Dependence on ε
  - Construction With Linear Dependence on Dimension
- A Negative Result
- Open Questions
- Proof of Theorem 1
- Proof of Lemma 8
- Proof of Lemma 16
- Complexity-Theoretic Motivation: Approximating NP-Witnesses

Venkatesan Guruswami†    Atri Rudra‡

Abstract

Explicit constructions of binary linear codes that are efficiently list-decodable up to a fraction (1/2 − ε) of errors are given. The codes encode k bits into n = poly(k/ε) bits and are constructible and list-decodable in time polynomial in k and 1/ε (in particular, in our results ε need not be constant and can even be polynomially small in n). These results give the best known polynomial dependence of n on k and 1/ε for such codes. Specifically, they are able to achieve n ≤ O(k^3/ε^{3+γ}) or, if a linear dependence on k is required, n ≤ Õ(k/ε^{5+γ}), where γ > 0 is an arbitrary constant. The best previously known constructive bounds in this setting were n ≤ O(k^2/ε^4) and n ≤ O(k/ε^6). Non-constructively, a random linear encoding of length n = O(k/ε^2) suffices, but no sub-exponential algorithm is known for list decoding random codes. The construction with a cubic dependence on ε is obtained by concatenating the recent Parvaresh-Vardy (PV) codes with dual BCH codes, and crucially exploits the soft decoding algorithm for PV codes.
The result with the linear dependence on k is based on concatenation of the PV code with an inner code of good minimum distance.

In addition to being a basic question in coding theory, codes that are list-decodable from a fraction (1/2 − ε) of errors for ε → 0 are important in several complexity theory applications. For example, the construction with near-cubic dependence on ε yields better hardness results for the problem of approximating NP witnesses. In addition, the codes constructed have the property that all nonzero codewords have relative Hamming weights in the range (1/2 − ε, 1/2 + ε); this ε-biased property is a fundamental notion in pseudorandomness.

∗ A preliminary conference version of this paper was presented at the 23rd Annual IEEE Conference on Computational Complexity (CCC 2008).
† Computer Science Department, Carnegie Mellon University, Pittsburgh, PA. Part of this work was done while visiting the School of Mathematics, Institute for Advanced Study, Princeton, NJ. Research supported in part by a Packard fellowship, and NSF grants CCF-0343672, CCF-0953155. [email protected]
‡ Department of Computer Science and Engineering, University at Buffalo, State University of New York, Buffalo, NY, 14620. Research supported in part by startup funds from University at Buffalo and by NSF CAREER award CCF-0844796. [email protected]

1 Introduction

Consider the task of communicating k message bits of information over a channel that is going to adversarially corrupt a fraction 1/2 − ε of the transmitted bits. What is the fewest number n = n(k, ε) of codeword bits we need to communicate so that no matter which of the (1/2 − ε)n bits are corrupted, we can recover the k message bits efficiently, i.e., in time polynomial in k and 1/ε? The main goal of this paper is to understand the asymptotic dependence of n on k and ε.
This natural coding-theoretic problem, where we want to correct close to the information-theoretically maximum possible fraction of errors, also arises in several applications of error-correcting codes in complexity theory, including: construction of hardcore predicates from one-way functions [10, 31], approximating the VC dimension [23], constructions of extractors [28], membership comparability of NP-complete sets [30], hardness amplification of Boolean functions [32], the worst-case to average-case connection for the permanent [6], and approximating NP-witnesses [20, 9].

The process of recovering the transmitted message from the received corrupted codeword is called decoding. If we want the decoder to uniquely recover the transmitted codeword, then it is well known that one cannot recover from more than a 1/4 fraction of errors. Unfortunately, all the applications mentioned above need to work with a fraction 1/2 − ε > 1/4 of errors. In fact, in some cases ε could be polynomially small in n. In such high-noise regimes, one has to settle for a relaxation of unique decoding called list decoding. Under list decoding, the decoder is allowed to output a (small) list of codewords with the guarantee that the transmitted codeword is present in the list. The fact that a list decoder can output a small list of codewords could be problematic, since we need to recover the actual transmitted codeword. Fortunately, in all the applications mentioned above, the decoder has access to some side information, which allows it to go through the list and pick the transmitted codeword. The only constraint that this introduces is that the size of the list needs to be bounded by a polynomial in n. Note that this is also an a priori requirement for the list decoding algorithm to run in polynomial time.

Thus, the natural question to ask is: what is the smallest n one can achieve with a code that can be list decoded from a 1/2 − ε fraction of errors?
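As a concrete (if exponential-time) illustration of the list-decoding guarantee described above, the sketch below enumerates every codeword of a tiny binary linear code and returns those within relative distance 1/2 − ε of a received word. The generator matrix is an arbitrary toy choice for illustration, not one of the constructions from this paper, and the brute-force search is of course nothing like the polynomial-time decoders the paper is after.

```python
import itertools

def hamming(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))

def codewords(G):
    """Yield all 2^k codewords of the binary linear code generated by the k x n matrix G."""
    k = len(G)
    for msg in itertools.product([0, 1], repeat=k):
        yield tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))

def list_decode(G, received, eps):
    """Brute-force list decoder: every codeword within radius (1/2 - eps) * n."""
    n = len(received)
    return [c for c in codewords(G) if hamming(c, received) <= (0.5 - eps) * n]

# Toy 3 x 8 generator matrix -- a hypothetical example, not from the paper.
G = [
    [1, 0, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 1, 1, 1],
]
received = (1, 0, 0, 1, 1, 0, 1, 0)  # codeword of message (1,0,0) with its last bit flipped
print(list_decode(G, received, 0.2))  # -> [(1, 0, 0, 1, 1, 0, 1, 1)]
```

Here the list happens to contain a single codeword; at radius close to (1/2)n with larger codes, the returned list can genuinely contain several candidates, which is exactly where the side information discussed above comes in.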
It is well known that there exist codes with n = O(k/ε^2) such that for every error pattern only O(1/ε^2) many codewords need to be output; in fact, a random code with 2^k codewords of such a block length has this property with high probability [33, 7]. Further, this bound is tight: for n = o(k/ε^2), for any code, there exists an error pattern for which a super-polynomial number of codewords need to be output. Thus, n = Θ(k/ε^2) is information-theoretically best possible. The upper bound of n = O(k/ε^2) above is non-constructive: a code that is guaranteed to have these properties is not known to be efficiently computable, and even if we are content with a randomized Monte Carlo construction of such a code, there is no polynomial-time list decoding algorithm known for such codes. This is a serious drawback, since the applications need efficient constructions and decoding algorithms.

In summary, the applications of these
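The random-code bound above can be probed empirically at toy parameters. The sketch below samples a random binary k × n generator matrix and records the largest list that arises at radius (1/2 − ε)n over a number of randomly sampled received words; this only samples random centers rather than worst-case error patterns, and the parameter choices are illustrative, so it is a rough experiment rather than a verification of the O(1/ε^2) list-size claim.

```python
import random

def random_linear_code(k, n, rng):
    """Enumerate all 2^k codewords of a random binary k x n linear code."""
    G = [[rng.randint(0, 1) for _ in range(n)] for _ in range(k)]
    words = []
    for msg in range(2 ** k):
        bits = [(msg >> i) & 1 for i in range(k)]
        words.append(tuple(
            sum(bits[i] * G[i][j] for i in range(k)) % 2 for j in range(n)
        ))
    return words

def worst_observed_list_size(words, n, eps, trials, rng):
    """Largest number of codewords within radius (1/2 - eps) * n of a sampled center."""
    radius = (0.5 - eps) * n
    worst = 0
    for _ in range(trials):
        r = [rng.randint(0, 1) for _ in range(n)]
        size = sum(1 for c in words
                   if sum(x != y for x, y in zip(c, r)) <= radius)
        worst = max(worst, size)
    return worst

rng = random.Random(0)
words = random_linear_code(k=4, n=32, rng=rng)
print(worst_observed_list_size(words, n=32, eps=0.1, trials=200, rng=rng))
```

With k = 4 and n = 32 the enumeration is trivial; the exponential 2^k cost of enumerating codewords is precisely why random codes, despite their optimal length, offer no known efficient list decoder.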