DOC PREVIEW
Penn CIT 594 - Recognizers

This preview shows page 1-2-3-26-27-28 out of 28 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

RecognizersParsers and recognizersBuilding a recognizerRecognizing simple alternatives, ICreating a decorator classPushbackTokenizer IPushbackTokenizer IISample use of PushbackTokenizerRecognizing simple alternatives, IIHelper methodsRecognizing simple alternatives, IIICategories of tokensImplementing symbolnextTokenMatches #1nextTokenMatches #2addOperator repriseSequences, ISequences, IISequences, IIISimple sequences in JavaSequences and alternativesRecursion, IRecursion, IIExtended BNF—optional partsExtended BNF—zero or moreBack to parsersConclusionsThe EndJan 14, 2019Recognizers2Parsers and recognizersGiven a grammar (say, in BNF) and a string,A recognizer will tell whether the string belongs to the language defined by the grammarA parser will try to build a tree corresponding to the string, according to the rules of the grammarInput string Recognizer result Parser result2 + 3 * 4 true2 + 3 * false Error3Building a recognizerOne way of building a recognizer from a grammar is called recursive descentRecursive descent is pretty easy to implement, once you figure out the basic ideasRecursive descent is a great way to build a “quick and dirty” recognizer or parserProduction-quality parsers use much more sophisticated and efficient techniquesIn the following slides, I’ll talk about how to do recursive descent, and give some examples in Java4Recognizing simple alternatives, IConsider the following BNF rule:<add_operator> ::= “+” | “-”That is, an add operator is a plus sign or a minus signTo recognize an add operator, we need to get the next token, and test whether it is one of these charactersIf it is a plus or a minus, we simply return trueBut what if it isn’t?We not only need to return false, but we also need to put the token back because it doesn’t belong to us, and some other grammar rule probably wants itWe need a tokenizer that can take back charactersWe will make do with putting back only one token at a time5Creating a decorator classA Decorator class is a class that extends another class and add functionalityIn this case, StringTokenizer may do what we need, except be able to take back tokensWe can decorate StringTokenizer and add the ability to take back tokensFor simplicity, we’ll allow only a single token to be returnedOur decorator class will need a little extra storage, and will need to override and extend some methods6PushbackTokenizer Ipublic class PushbackTokenizer extends StringTokenizer { String pushedToken = null; // to hold returned token  public PushbackTokenizer(String s) { super(s); // superclass has no default constructor } public void pushBack(String token) { // added method pushedToken = token; }7PushbackTokenizer II public boolean hasMoreTokens() { if (pushedToken != null) return true; else return super.hasMoreTokens(); }Notice how we not only overrode this method, but if we didn’t have anything special to do, we just let the superclass’s method handle it public String nextToken() { if (pushedToken == null) return super.nextToken(); // We only return a pushedToken once String result = pushedToken; pushedToken = null; return result; }Again, we just use the superclass’s method, we don’t reinvent it8Sample use of PushbackTokenizerpublic static void main(String[ ] args) { PushbackTokenizer pb = new PushbackTokenizer("This is too cool"); String token; System.out.print(pb.nextToken( ) + " "); // “This” System.out.print(pb.nextToken( ) + " "); // “is” System.out.print (token = pb.nextToken() + " "); // “too” pb.pushBack(token); // return “too” System.out.print(pb.nextToken( ) + " "); // get “too” again System.out.print(pb.nextToken( ) + " "); // “cool”}Output: This is too too coolQuestion: Why the extra space?9Recognizing simple alternatives, IIOur rule is <add_operator> ::= “+” | “-”Our method for recognizing an <add_operator> (which we will simply call addOperator) looks like this:public boolean addOperator() { Get the next token, call it t If t is a “+”, return true If t is a “-”, return true If t is anything else, put the token back return false}10Helper methodsWe could turn the preceding pseudocode directly into Java codeBut we will be writing a lot of very similar code......and it won’t be very readable codeWe should write some auxiliary or “helper” methods to hide some of the details for usFirst helper method:private boolean symbol(String expectedSymbol)Gets the next token and tests whether it matches the expectedSymbolIf it matches, return trueIf it doesn’t match, put the symbol back and return falseWe’ll look more closely at this method in a moment11Recognizing simple alternatives, IIIOur rule is <add_operator> ::= “+” | “-”Our pseudocode is:public boolean addOperator() { Get the next token, call it t If t is a “+”, return true If t is a “-”, return true If t is anything else, put the token back return false}Thanks to our helper method, our actual Java code is:public boolean addOperator() { return symbol("+") || symbol("-");}12Categories of tokensTokens are always strings, but they come in a variety of kinds:Names: "limit", "y", "maxValue"Keywords: "if", "while", "instanceof"Numbers: "25", "3"Symbols: "+", "=", ";"Special: "\n", end_of_inputInstead of treating tokens as simple strings, it’s convenient to create a Token class that holds both its string value and a constant telling what kind of token it isclass Token { String value; int type;...and this class should define some constants to represent the various types:public static final int NAME = 1;public static final int SYMBOL = 2; etc.If you are using Java 5.0, this is what enums were invented for!13Implementing symbolsymbol gets a token, makes sure it’s a symbol, compares it to the desired value, possibly puts the token back, and returns true or falseWe will want to do something similar for numbers, names, end of lines, and maybe end of inputIt would be foolish to write and debug all of these separatelyAgain, we should use an auxiliary methodprivate boolean symbol(String expectedSymbol) { return nextTokenMatches(Token.SYMBOL, expectedSymbol);}14nextTokenMatches #1The


View Full Document

Penn CIT 594 - Recognizers

Documents in this Course
Trees

Trees

17 pages

Searching

Searching

24 pages

Pruning

Pruning

11 pages

Arrays

Arrays

17 pages

Stacks

Stacks

30 pages

Recursion

Recursion

25 pages

Hashing

Hashing

24 pages

Recursion

Recursion

24 pages

Graphs

Graphs

25 pages

Storage

Storage

37 pages

Trees

Trees

21 pages

Arrays

Arrays

24 pages

Hashing

Hashing

24 pages

Recursion

Recursion

25 pages

Graphs

Graphs

23 pages

Graphs

Graphs

25 pages

Stacks

Stacks

25 pages

Recursion

Recursion

25 pages

Quicksort

Quicksort

21 pages

Quicksort

Quicksort

21 pages

Graphs

Graphs

25 pages

Recursion

Recursion

25 pages

Searching

Searching

24 pages

Counting

Counting

20 pages

HTML

HTML

18 pages

Recursion

Recursion

24 pages

Pruning

Pruning

11 pages

Graphs

Graphs

25 pages

Load more
Download Recognizers
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Recognizers and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Recognizers 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?