FSU COP 4342 - Flex and lexical analysis


Flex and lexical analysis

From the area of compilers, we get a host of tools to convert text files into programs. The first part of that process is often called lexical analysis, particularly for such languages as C.

A good tool for creating lexical analyzers is flex. It takes a specification file and creates an analyzer, usually called lex.yy.c.

Unix Tools: Program Development 4

Lexical analysis terms

A token is a group of characters having collective meaning.

A lexeme is an actual character sequence forming a specific instance of a token, such as num.

A pattern is a rule expressed as a regular expression and describing how a particular token can be formed. For example,

[A-Za-z][A-Za-z_0-9]*

is a rule.

Characters between tokens are called whitespace; these include spaces, tabs, newlines, and formfeeds. Many people also count comments as whitespace, though since some tools such as lint/splint look at comments, this conflation is not perfect.

Attributes for tokens

Tokens can have attributes that can be passed back to the calling function.

Constants could have the value of the constant, for instance.

Identifiers might have a pointer to a location where information is kept about the identifier.

Some general approaches to lexical analysis

Use a lexical analyzer generator tool, such as flex.

Write a one-off lexical analyzer in a traditional programming language.

Write a one-off lexical analyzer in assembly language.

Flex - our lexical analyzer generator

Is linked with its library (libfl.a) using -lfl as a compile-time option (or is now sometimes/often found in libc).

Can be called as yylex().

It is easy to interface with bison/yacc.

*.l file → lex → lex.yy.c
lex.yy.c and other files → gcc → lexical analyzer
input stream → lexical analyzer → actions taken when rules applied

Flex specifications

Lex source:

{ definitions }
%%
{ rules }
%%
{ user subroutines }
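
The second approach listed above, a one-off analyzer written in a traditional language, can be sketched in C to make the terms token, lexeme, pattern, and whitespace concrete. This is only a minimal illustration, not what flex generates: the names token_kind, struct token, and next_token are hypothetical, the number pattern [0-9]+ is an added assumption, and the code assumes the default "C" locale so that isalpha() and isalnum() match only ASCII letters and digits.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; illustrative names, not part of flex. */
enum token_kind { TOK_EOF, TOK_IDENT, TOK_NUM, TOK_OTHER };

struct token {
    enum token_kind kind;  /* which token was recognized */
    char lexeme[64];       /* the actual characters matched (truncated if longer) */
};

/* Scan one token starting at *p and advance *p past it. */
struct token next_token(const char **p)
{
    struct token t = { TOK_EOF, "" };
    const char *s = *p;

    /* Whitespace between tokens: spaces, tabs, newlines, formfeeds. */
    while (*s == ' ' || *s == '\t' || *s == '\n' || *s == '\f')
        s++;

    if (*s == '\0') {          /* end of input: return TOK_EOF */
        *p = s;
        return t;
    }

    const char *start = s;
    if (isalpha((unsigned char)*s)) {
        /* Pattern [A-Za-z][A-Za-z_0-9]* */
        t.kind = TOK_IDENT;
        while (isalnum((unsigned char)*s) || *s == '_')
            s++;
    } else if (isdigit((unsigned char)*s)) {
        /* Pattern [0-9]+ (assumed here for illustration) */
        t.kind = TOK_NUM;
        while (isdigit((unsigned char)*s))
            s++;
    } else {
        /* Any other single character */
        t.kind = TOK_OTHER;
        s++;
    }

    /* Copy the lexeme, truncating to fit the buffer. */
    size_t n = (size_t)(s - start);
    if (n >= sizeof t.lexeme)
        n = sizeof t.lexeme - 1;
    memcpy(t.lexeme, start, n);
    t.lexeme[n] = '\0';

    *p = s;
    return t;
}
```

Calling next_token repeatedly on the text "num1 = 42" would yield an identifier with lexeme num1, a single-character token =, a number with lexeme 42, and then TOK_EOF. Every pattern added this way needs its own hand-written loop, which is the work a generator such as flex takes over.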

Definitions

Declarations of ordinary C variables and constants.

flex definitions

Rules

The form of a rule is:

regularexpression action

The actions are C/C++ code.

Flex regular expressions

s        string s literally
\c       character c literally, where c would normally be a lex operator
[s]      character class
^        indicates beginning of line
[^s]     characters not in character class
[s-t]    range of characters
s?       s occurs zero or one time

Flex regular expressions, continued

.        any character except newline
s*       zero or more occurrences of s
s+       one or more occurrences of s
r|s      r or s
(s)      grouping
$        end of line
s/r      s iff followed by r (not recommended) (r is *NOT* consumed)
s{m,n}   m through n occurrences of s

Examples of regular expressions in flex

a*          zero or more a's
.*          zero or more of any character except newline
.+          one or more of any character except newline
[a-z]       a lowercase letter
[a-zA-Z]    any alphabetic letter
[^a-zA-Z]   any non-alphabetic character
a.b         a followed by any character followed by b
rs|tu       rs or tu
a(b|c)d     abd or acd
^start      beginning of line, then the literal characters start
END$        the characters END followed by an end-of-line

Flex actions

Actions are C source fragments. If an action is compound, or takes more than one line, enclose it in braces ('{' '}').

Example rules:

[a-z]+         printf("found word\n");
[A-Z][a-z]*    { printf("found capitalized word:\n");
                 printf(" '%s'\n", yytext);
               }

Flex definitions

The form is simply

name definition

The name is just a word beginning with a letter (or an underscore, but I don't recommend those for general use) followed by zero or more letters, digits, underscores, or dashes. The definition actually goes from the first non-whitespace character after the name to the end of the line. You can refer to it via {name}, which will expand to (definition).
(cite: this is largely from “man flex”.)

For example:

DIGIT [0-9]

Now if you have a rule that looks like

{DIGIT}*\.{DIGIT}+

that is the same as writing

([0-9])*\.([0-9])+

An example Flex program

/* either indent or use %{ %} */
%{
int num_lines = 0;
int num_chars = 0;
%}
%%
\n    ++num_lines; ++num_chars;
.     ++num_chars;
%%
int main(int argc, char **argv)
{
    yylex();
    printf("# of lines = %d, # of chars = %d\n",
           num_lines, num_chars);
}

Another example program

digits    [0-9]
ltr       [a-zA-Z]
alphanum  [a-zA-Z0-9]
%%
(-|\+)*{digits}+        printf("found number: '%s'\n", yytext);
{ltr}(_|{alphanum})*    printf("found identifier: '%s'\n", yytext);
'.'                     printf("found character: {%s}\n", yytext);
.                       { /* absorb others */ }
%%
int main(int argc, char **argv)
{
    yylex();
}
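
For comparison, the two counting rules of the first example program (\n and .) can be written by hand in plain C. This is a rough sketch of the effect of those rules, not of the code flex actually emits: struct counts and count_text are hypothetical names, and the function works on a NUL-terminated string instead of reading standard input the way yylex() does.

```c
/* Hypothetical result type; mirrors num_lines and num_chars in the example. */
struct counts {
    int lines;
    int chars;
};

/* Hand-written equivalent of the rules
 *     \n    ++num_lines; ++num_chars;
 *     .     ++num_chars;
 * applied to every character of a NUL-terminated string. */
struct counts count_text(const char *text)
{
    struct counts c = { 0, 0 };
    for (const char *p = text; *p != '\0'; p++) {
        if (*p == '\n')
            ++c.lines;  /* only the \n rule counts a line */
        ++c.chars;      /* both rules count the character */
    }
    return c;
}
```

Because . matches any character except newline and \n matches the newline itself, every input character triggers exactly one of the two rules, which is why the sketch increments chars unconditionally.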

