DOC PREVIEW
Penn CIT 594 - Recognizers

This preview shows page 1-2-22-23 out of 23 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 23 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 23 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 23 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 23 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 23 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Recognizers Jan 14 2019 Parsers and recognizers Given a grammar say in BNF and a string A recognizer will tell whether the string belongs to the language defined by the grammar A parser will try to build a binary tree corresponding to the string according to the rules of the grammar Input string Recognizer result 2 3 4 true 2 3 false Parser result Error 2 Building a recognizer One way of building a recognizer from a grammar is called recursive descent Recursive descent is pretty easy to implement once you figure out the basic ideas Recursive descent is a great way to build a quick and dirty recognizer or parser Production quality parsers use much more sophisticated and efficient techniques In the following slides I ll talk about how to do recursive descent and give some examples in Java 3 Recognizing simple alternatives I Consider the following BNF rule add operator That is an add operator is a plus sign or a minus sign To recognize an add operator we need to get the next token and test whether it is one of these characters If it is a plus or a minus we simply return true But what if it isn t We not only need to return false but we also need to put the token back because it doesn t belong to us and some other grammar rule probably wants it Our tokenizer needs to be extended to take back characters We will make do with putting back only one token at a time 4 Recognizing simple alternatives II Our rule is add operator Our method for recognizing an add operator which we will simply call addOperator looks like this public boolean addOperator Get the next token call it t If t is a return true If t is a return true If t is anything else put the token back return false 5 Helper methods We could turn the preceding pseudocode directly into Java code But we will be writing a lot of very similar code and it won t be very readable code We should write some auxiliary or helper methods to hide some of the details for us First helper method private boolean symbol String expectedSymbol Gets the next token and tests whether it matches the expectedSymbol If it matches return true If it doesn t match put the symbol back and return false We ll look more closely at this method in a moment 6 Recognizing simple alternatives III Our rule is add operator Our pseudocode is public boolean addOperator Get the next token call it t If t is a return true If t is a return true If t is anything else put the token back return false Thanks to our helper method our actual Java code is public boolean addOperator return symbol symbol 7 Implementing symbol symbol gets a token makes sure it s a symbol compares it to the desired value possibly puts the token back and returns true or false We will want to do something similar for numbers names end of lines and maybe end of input It would be foolish to write and debug all of these separately Again we should use an auxiliary method private boolean symbol String expectedSymbol return nextTokenMatches Token SYMBOL expectedSymbol 8 nextTokenMatches 1 The nextTokenMatches method should Get a token Compare types and values Return true if the token is as expected Put the token back and return false if it doesn t match private boolean nextTokenMatches int type String value Token t tokenizer next if type t getType value equals t getValue return true else tokenizer pushBack t return false Wait a minute pushBack This is a method that you will have to add to your Tokenizer class Real code changes 9 nextTokenMatches 2 The previous method is fine for symbols but what if we only care about the type For example we want to get a number any number We need to compare only type not value private boolean nextTokenMatches int type String value Token t tokenizer next omit this parameter if type t getType value equals t getValue return true else tokenizer pushBack t omit this test return false The two versions of nextTokenMatches are difficult to combine and fairly small so we won t worry about the code duplication too much 10 addOperator reprise public boolean addOperator return symbol symbol private boolean symbol String expectedSymbol return nextTokenMatches Token SYMBOL expectedSymbol private boolean nextTokenMatches int type String value Token t tokenizer next if type t getType value equals t getValue return true else tokenizer pushBack t return false 11 Sequences I Suppose we want to recognize a grammar rule in which one thing follows another for example The code for this would be fairly simple empty list public boolean emptyList return symbol symbol except for one thing What happens if we get a and don t get a The above method won t work why not Only the second call to symbol failed and only one token gets pushed back 12 Sequences II The grammar rule is empty list And the token string contains 5 Solution 1 Write a pushBack method that can keep track of more than one token at a time say in a Vector Solution 2 Call it an error You might be able to get away with this depending on the grammar For example for any reasonable grammar 2 3 is clearly an error Solution 3 Change the grammar This will allow you to put the back both the and the 5 The code gets pretty messy You have to be very careful of the order in which you return tokens Tricky and may not be possible Solution 4 Combine rules See the next slide 13 Sequences III Suppose the grammar really says list number Now your pseudocode should look something like this public boolean list if first token is if second token is return true else if second token is a number if third token is return true else error else put back first token Revised grammar list rest of list rest of list number 14 Simple sequences in Java Suppose you have this rule factor expression A good way to do this is often to test whether the grammar rule is not met public boolean factor if symbol if expression error Error in parenthesized expression if symbol error Unclosed parenthetical expression return true return false 15 Sequences and alternatives Here s the real grammar rule for factor factor name number expression And here s the actual code public boolean factor if name return true if number return true if symbol if expression error Error in parenthesized expression if symbol error Unclosed parenthetical expression return true return false 16 Recursion I Here s an unfortunate but legal grammar rule Here s some code for it expression expression term public boolean expression if expression return false if addOperator return true if term error Error in expression after or


View Full Document

Penn CIT 594 - Recognizers

Documents in this Course
Trees

Trees

17 pages

Searching

Searching

24 pages

Pruning

Pruning

11 pages

Arrays

Arrays

17 pages

Stacks

Stacks

30 pages

Recursion

Recursion

25 pages

Hashing

Hashing

24 pages

Recursion

Recursion

24 pages

Graphs

Graphs

25 pages

Storage

Storage

37 pages

Trees

Trees

21 pages

Arrays

Arrays

24 pages

Hashing

Hashing

24 pages

Recursion

Recursion

25 pages

Graphs

Graphs

23 pages

Graphs

Graphs

25 pages

Stacks

Stacks

25 pages

Recursion

Recursion

25 pages

Quicksort

Quicksort

21 pages

Quicksort

Quicksort

21 pages

Graphs

Graphs

25 pages

Recursion

Recursion

25 pages

Searching

Searching

24 pages

Counting

Counting

20 pages

HTML

HTML

18 pages

Recursion

Recursion

24 pages

Pruning

Pruning

11 pages

Graphs

Graphs

25 pages

Load more
Download Recognizers
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Recognizers and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Recognizers 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?