MIT 6 863J - Recognition Complexity

Recognition Complexity
Eric Sven Ristad
Final version

In mathematical linguistics, grammars model linguistic theories, and the sets of strings that grammars generate model natural languages. More precisely, a language L(G) is a set of strings generated by a grammar G. The power of a grammar is the difficulty of characterizing its output. The corresponding formal problem is the recognition problem (RP): is a given string in a given language or not? Alternately, defining the output of a grammar to be a set of structural descriptions results in the parsing problem: what structural descriptions are assigned by a given grammar to a given string?

A language may be characterized in extension, by all the grammars that generate it, or constructively, by a particular grammar that generates it. In the first case, the fixed language RP (FLRP) is posed: given an input string x, is x in L for some fixed language L? It does not matter which grammar generates L: both grammar and language are fixed (ignored) in the problem statement. In the second case, the grammar is of interest, and the universal RP (URP) is posed: given a grammar G and an input string x, is x in L(G)? Because the URP determines membership with respect to a particular grammar, it more closely models the parsing problem, which uses a grammar to assign structural descriptions.

Problems are solved by algorithms; algorithms run on machines; machines consume computational resources, such as time or space. Therefore, the complexity of a problem is given indirectly by the algorithms that solve it. Some problems cannot be solved by any algorithm: they are UNDECIDABLE (not recursive) → Automata Theory.

The complexity of an algorithm is the rate at which it consumes the computational resources of time and space, expressed as the order of growth of a function in the size of the problem input. Orders of growth are an upper bound on the resource requirements of an algorithm.
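As an illustrative sketch (not part of the original text), the practical difference between orders of growth can be seen by tabulating a few representative functions of the input size n; the particular functions and sizes chosen here are hypothetical:

```python
# Illustrative sketch: how common orders of growth scale with input size n.
# The functions n*log n, n^2, and 2^n are chosen for illustration only.
import math

def growth_table(sizes):
    """Return one (n, n*log2(n), n**2, 2**n) row per n in sizes."""
    return [(n, round(n * math.log2(n)), n ** 2, 2 ** n) for n in sizes]

for n, nlogn, quad, expo in growth_table([8, 16, 32]):
    print(f"n={n:3d}  n*log n={nlogn:4d}  n^2={quad:5d}  2^n={expo:12d}")
```

Doubling n roughly doubles the n·log n column, quadruples the n² column, and squares the 2ⁿ column, which is why polynomial-time algorithms are regarded as efficient and exponential-time ones as intractable.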
They are useful because they abstract from many irrelevant details of the machine.

Algorithm complexity provides an upper bound on problem complexity: the most efficient known algorithm for a problem gives the tightest upper bound. Problem complexity can also be bounded from below by a reduction, according to the theory of computational complexity.

Because the complexity of a problem is a function of its input parameters, and because the URP includes the grammar in its input while the FLRP does not, the complexities of the URP and FLRP can differ. For example, the URP for generalized phrase structure grammars can require more than exponential time, while the FLRP requires less than cubic time. The parsing problem must be at least as hard as the URP, and the URP at least as hard as the FLRP, by definition.

Computational complexity theory is a mathematical theory of the intrinsic lower-bound difficulty of obtaining the solution to a problem, no matter how the solution is obtained. It classifies problems according to the amount of computational resources (in our case, time or space) needed to solve them on a given abstract machine. Four important complexity classes are P, NP, PSPACE, and EXPPOLY.

P is the natural and important class of problems solvable in deterministic Polynomial time, that is, on a deterministic Turing machine in time n^j for some integer j, where n denotes the size of the problem to be solved. P is considered to be the class of problems that can be solved efficiently. For example, sorting takes n · log n time in the worst case using a variety of algorithms, and therefore is efficiently solvable. The URP for both context-free and regular grammars is in P.

NP is the class of all problems solvable in Nondeterministic Polynomial time. Informally, a problem is in NP if one can guess an answer to the problem and then verify its correctness in polynomial time.
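The earlier claim that the URP for context-free grammars is in P can be made concrete with a CYK-style recognizer, which, for a grammar in Chomsky normal form, decides membership in time cubic in the string length. The sketch below, including its toy grammar, is a hypothetical illustration, not material from the original text:

```python
# Sketch of a CYK recognizer: decides the universal recognition problem
# (URP) for a context-free grammar in Chomsky normal form in O(n^3 * |G|)
# time. The toy grammar below is a hypothetical example.

def cyk_recognize(grammar, start, string):
    """grammar: list of (lhs, rhs) rules, rhs a 1-tuple (a terminal)
    or a 2-tuple (two nonterminals). True iff string is in L(grammar)."""
    n = len(string)
    if n == 0:
        return False  # this CNF variant does not derive the empty string
    # table[i][j] = set of nonterminals deriving string[i : i+j+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(string):
        for lhs, rhs in grammar:
            if rhs == (ch,):
                table[i][0].add(lhs)
    for length in range(2, n + 1):          # span length
        for i in range(n - length + 1):     # span start
            for split in range(1, length):  # split point within the span
                for lhs, rhs in grammar:
                    if len(rhs) == 2 and \
                       rhs[0] in table[i][split - 1] and \
                       rhs[1] in table[i + split][length - split - 1]:
                        table[i][length - 1].add(lhs)
    return start in table[0][n - 1]

# Toy CNF grammar for {a^n b^n : n >= 1}:
#   S -> A T | A B ;  T -> S B ;  A -> a ;  B -> b
G = [("S", ("A", "T")), ("S", ("A", "B")), ("T", ("S", "B")),
     ("A", ("a",)), ("B", ("b",))]
```

The three nested loops over span length, span start, and split point give the cubic bound; the grammar itself appears as an input parameter, which is precisely what distinguishes the URP from the FLRP.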
For example, the problem of deciding whether a whole number i is composite is in NP because it can be solved by guessing a pair of potential divisors, each less than ⌈√i⌉, and then quickly checking if their product equals i.

PSPACE is the class of problems solvable in deterministic polynomial space. PSPACE contains NP because polynomial space allows us to simulate an entire NP computation, but it is not known if the inclusion is proper. Intuitively, PSPACE is the class of combinatorial two-person games: it includes the problems of winning generalized versions of Checkers, Go, and Parker Brothers' Instant Insanity(TM). Many problems in formal language theory are known to be PSPACE-complete, such as finite state automaton inequivalence and intersection and the FLRP for context-sensitive languages.

NSPACE[s(n)] is the class of problems solvable in nondeterministic space that grows with s(n). In particular, the URP for context-sensitive grammars is in NSPACE[n] (it requires linear nondeterministic space; see Kuroda 1964). Therefore the closure of CSLs under complementation follows from the theorem that NSPACE[s(n)] = co-NSPACE[s(n)] for s(n) ≥ log(n); see Immerman (1988) and Szelepcsényi (1987).

Finally, EXPPOLY is the class of problems solvable in deterministic time c^f(n) for any constant c and polynomial f(n) in n. This class includes PSPACE and EXP (that is, all exponential time problems), and so includes problems that are provably intractable.

We say a problem T is C-hard if T is at least as hard computationally as any problem in the complexity class C. Note that T need not be in C to be C-hard. A problem is C-complete if it is both C-hard and included in C.

NP-complete problems can be solved only by methods too slow for even the fastest computers. Since it is widely believed, though not proved, that no faster methods of solution can ever be found for these problems, NP-complete problems are considered the easiest computationally intractable problems.
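The compositeness example above fits the guess-and-verify picture of NP: a certificate (a claimed pair of divisors) can be checked in polynomial time, even though finding one may require search. A minimal sketch, with the nondeterministic "guess" simulated by brute-force search (function names are hypothetical):

```python
# Sketch of the NP guess-and-verify pattern for compositeness.
# verify_composite() is the polynomial-time check of a certificate;
# is_composite() simulates the nondeterministic "guess" by searching.

def verify_composite(i, cert):
    """Polynomial-time verifier: cert = (a, b) is a claimed nontrivial
    factorization of i. Bounds checks and one multiplication suffice."""
    a, b = cert
    return 1 < a < i and 1 < b < i and a * b == i

def is_composite(i):
    """Simulate the guess by trying every candidate divisor up to sqrt(i).
    Returns (True, certificate) for composites, (False, None) otherwise."""
    a = 2
    while a * a <= i:
        if i % a == 0:
            return True, (a, i // a)  # a valid certificate for verify_composite
        a += 1
    return False, None
```

The asymmetry is the point: verification is cheap for any proposed certificate, while the search for a certificate is what a nondeterministic machine gets "for free".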
The URP for many formal linguistic theories is intractable (see chart).

Complexity classifications are established with the proof technique of reduction. A reduction converts instances of a problem T of known complexity into instances of a problem S whose complexity we wish to determine. The reduction operates in polynomial time. Therefore, if we had a polynomial time algorithm for solving S, then we could also solve T in polynomial time, simply by converting instances of T into S. (This follows because the composition of two polynomial time functions is also polynomial time.) Formally, if we choose T to be NP-complete, then a polynomial time reduction from T to S shows that S is at least as hard as T, or NP-hard. If we were

