UF COP 4620/5625 - Parallel Computation of Top Down Context Free Parsers - D2863122

Home> Schools> University of Florida-Gainesville> (COP) > COP 4620/5625> Parallel Computation of Top Down Context Free Parsers

UF COP 4620/5625 - Parallel Computation of Top Down Context Free Parsers

School name University of Florida-Gainesville

Course Cop 4620/5625- PROGRAMMING LANGUAGE TRANSLATORS

Pages 6

Download Save

Unformatted text preview:

Parallel Computation of Top-Down Context-Free ParsersManuel E. BermudezRichardENewman-WolfeUniversity of Florida, Gainesville, FL 32611, U.S.A.George LogothetisAT&T Bell Laboratories, Summit, NJ 07901, U.S.A.ABSTRACTWe present a newtechnique for carrying out the automatic generation of a top-down, recursivedescent parser in a totally parallel fashion. The generation of such aparser consists of constructing aparse table, with one rowper non-terminal symbol, and one column per terminal symbol. Tradition-ally,this is carried out rowbyrow,inwhich case the computation of one rowdepends on (potentially)all others. In contrast, our technique performs the computation by column, and one parallel processoris assigned to each column. Weshowthat the computation is totally independent for each column,making it ideal for parallelization. The computation is sped up by a factor of the number of terminalsymbols, or the number of available parallel processors, whicheverissmaller.1. IntroductionTop-down parsing is a very common technique when attempting to obtain, by hand, a parser for alanguage whose phrase structure is described by a context-free grammar.The LL(1) technique, as pre-sented in manypresent-day textbooks [1,2,6], can be used to generate a hand-coded recursive descentparser,oraparse table that can be used to drive a parsing algorithm that will behave the same way as therecursive descent code.Generally,the strategy for obtaining such a parser from a context-free grammar consists of the fol-lowing steps:1) Rid the grammar of anyinitial problems, such as left-recursive nonterminals (for which thetop-down parser will not be useful), useless nonterminals (which cannot be used to generatesentences in the language), and other problems.2) Compute First and Followsets for each nonterminal in the grammar,and then use them tocompute Select sets, at the rate of one Select set per production rule in the grammar.3) Determine whether the grammar is LL(1): the Select sets for anytwo productions for the samenonterminal should be disjoint.The problems of computing First, Followand Select sets are well known to be graph problems; regardlessof the specific representation used, the ultimate objective ofthe parser-construction algorithm is to build aparse table such as the one depicted in Figure 1.t1t2... tnA1...AmFigure 1. Askeleton parse table.The table in Figure 1 contains one rowper nonterminal symbol in the grammar,and one column perterminal symbol. Each entry TABLE[A,t] is filled in with (we hope) at most one entry,which describesthe proper production rule for the parser to use, wheneversymbol A is on the top of the parser’sstack, andsymbol t is the up-coming symbol on the input.The traditional way of filling the table proceeds by row, i.e. the Select sets are computed for eachproduction A→ω,and the entries in rowAare entered. The Select sets are computed by unioning cer-tain First sets, and by (further) unioning certain Followsets. The computation for one nonterminal (i.e.one row) depends, in general, on all other rows.The approach presented in this paper consists of filling in the parse table by column.For each ter-minal symbol t (i.e. for each column), we use relations that describe how(1) t becomes a member of aFirst set, (2) howamember of a First set becomes a member of a Followset, (3) howamember of oneFollowset becomes a member of another Followset, (4) howamember of a First set becomes a memberof a Select set, and (5) howamember of a Followset becomes a member of a Select set. The computa-tion of each column is completely independent from all other columns; thus one parallel processor can beassigned to the task of filling in each column.Previous studies on parallel parsing have focused on parallelizing the activity of parsing,[3,4,5,9,10] rather than parallelizing the activity of parser generation.Thus there exist languages andcompilers that allowthe development of parallel application programs, but until nowthere have been noenvironments suitable for carrying out in parallel the generation of a compiler component such as theparser.Our approach addresses the latter problem, by providing a technique whereby the task of comput-ing the parse table for an LL(1) grammar can be completely and independently decomposed into as manyidentical subtasks as there are terminal symbols in the user’sgrammar.Onacomputer system with N par-allel processors, and for a grammar with T terminal symbols, the time required to generate the parser issped up by a factor of min(N,T). LL parser generation for ‘‘real-life’’languages is already a reasonablyfast activity.Thus, these results are more significant from the standpoint of having a fundamental under-standing of computing LL parsers in parallel, than from the standpoint of efficiency.The remainder of this paper is organized as follows. In section 2 we present an overviewofLL(1)parsing. In section 3 we present a discussion of the computation of First, Follow, and Select sets. In sec-tion 4 we present the parallel algorithm for generating LL(1) parsers. Finally,insection 5 we present con-clusions.2. Overview of LL(1) ParsingHere we briefly reviewcontext-free grammars and LL(1) parsers. Detailed presentations are givenin[2,6,7].A context-free grammar (CFG) is a quadruple G=(N,T,P,S), where Nand T are finite disjoint sets ofnonterminals and terminals respectively,S∈Nisthe start symbol, and P is a finite set of productions ofthe form A→ω,where A∈Nandω∈(N∪T)∗.Weadhere to the following (usual) conventions:A,B,C,...∈N;t,a,b,c,...∈T;∈T, the ‘‘end-of-file’’marker;...,X,Y,Z∈(N∪T);...,x,y,z∈T∗;α,β,γ,...∈(N∪T)∗;εis the empty string.Relations ==> a nd ==>∗,pronounced ‘‘immediately left derives’’and ‘‘left derives’’respectively,aredefined in the usual way.The language generated by G, denoted L(G), is the set L(G)={z∈T∗|S== >∗z}.An element of L(G) is called a sentence. We assume all CFG’stobe reduced,i.e. every production isused in the derivation of some sentence.The ‘‘First’’set of a nonterminal is defined as First (A)={t | A==>∗tβ,for someβ}.The Followset of a nonterminal is defined as Follow(A)={t |S==>∗αAtβ,for someα,β}.The Select set of a production is defined by the following: First(ω)⊆Select (A →ω).Follow(A)⊆Select (A →ω), ifω== >∗ε.Finally,agrammar is defined to be LL(1) ifffor anytwo productions A →α,A→β,withα≠β,Select (A→α)and Select

View Full Document


School:
Email:
New Password:
Confirm Password:

UF COP 4620/5625 - Parallel Computation of Top Down Context Free Parsers

Sign up for free to view:

Please select your school