Johns Hopkins EN 600 465 - Building Finite-State Machines - D2316407

Home> Schools> Johns Hopkins University> EN Computer Science (EN 600) > EN 600 465> Building Finite-State Machines

DOC PREVIEW

Johns Hopkins EN 600 465 - Building Finite-State Machines

School name Johns Hopkins University

Course En 600 465- Natural Language Processing

Pages 57

This preview shows page 1-2-3-4-26-27-28-54-55-56-57 out of 57 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 57 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 57 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 57 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 57 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 57 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 57 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 57 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 57 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 57 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 57 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 57 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 57 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

600.465 - Intro to NLP - J. Eisner 1 Building Finite-State Machines600.465 - Intro to NLP - J. Eisner 3 Xerox Finite-State Tool  You’ll use it for homework …  Commercial (we have license; open-source clone is Foma)  One of several finite-state toolkits available  This one is easiest to use but doesn’t have probabilities  Usage:  Enter a regular expression; it builds FSA or FST  Now type in input string  FSA: It tells you whether it’s accepted  FST: It tells you all the output strings (if any)  Can also invert FST to let you map outputs to inputs  Could hook it up to other NLP tools that need finite-state processing of their input or output600.465 - Intro to NLP - J. Eisner 4 Common Regular Expression Operators (in XFST notation) concatenation EF * + iteration E*, E+ | union E | F & intersection E & F ~ \ - complementation, minus ~E, \x, F-E .x. crossproduct E .x. F .o. composition E .o. F .u upper (input) language E.u “domain” .l lower (output) language E.l “range”600.465 - Intro to NLP - J. Eisner 5 Common Regular Expression Operators (in XFST notation) concatenation EF EF = {ef: e  E, f  F} ef denotes the concatenation of 2 strings. EF denotes the concatenation of 2 languages.  To pick a string in EF, pick e  E and f  F and concatenate them.  To find out whether w  EF, look for at least one way to split w into two “halves,” w = ef, such that e  E and f  F. A language is a set of strings. It is a regular language if there exists an FSA that accepts all the strings in the language, and no other strings. If E and F denote regular languages, than so does EF. (We will have to prove this by finding the FSA for EF!)600.465 - Intro to NLP - J. Eisner 6 Common Regular Expression Operators (in XFST notation) concatenation EF * + iteration E*, E+ E* = {e1e2 … en: n0, e1 E, … en E}  To pick a string in E*, pick any number of strings in E and concatenate them.  To find out whether w  E*, look for at least one way to split w into 0 or more sections, e1e2 … en, all of which are in E. E+ = {e1e2 … en: n>0, e1 E, … en E} =EE*600.465 - Intro to NLP - J. Eisner 7 Common Regular Expression Operators (in XFST notation) concatenation EF * + iteration E*, E+ | union E | F E | F = {w: w E or w F} = E  F  To pick a string in E | F, pick a string from either E or F.  To find out whether w  E | F, check whether w  E or w  F.600.465 - Intro to NLP - J. Eisner 8 Common Regular Expression Operators (in XFST notation) concatenation EF * + iteration E*, E+ | union E | F & intersection E & F E & F = {w: w E and w F} = E F  To pick a string in E & F, pick a string from E that is also in F.  To find out whether w  E & F, check whether w  E and w  F.600.465 - Intro to NLP - J. Eisner 9 Common Regular Expression Operators (in XFST notation) concatenation EF * + iteration E*, E+ | union E | F & intersection E & F ~ \ - complementation, minus ~E, \x, F-E ~E = {e: e E} = * - E E – F = {e: e E and e F} = E & ~F \E =  - E (any single character not in E)  is set of all letters; so * is set of all strings600.465 - Intro to NLP - J. Eisner 10 Regular Expressions A language is a set of strings. It is a regular language if there exists an FSA that accepts all the strings in the language, and no other strings. If E and F denote regular languages, than so do EF, etc. Regular expression: EF*|(F & G)+ Syntax: E F * F G concat & + | Semantics: Denotes a regular language. As usual, can build semantics compositionally bottom-up. E, F, G must be regular languages. As a base case, e denotes {e} (a language containing a single string), so ef*|(f&g)+ is regular.600.465 - Intro to NLP - J. Eisner 11 Regular Expressions for Regular Relations A language is a set of strings. It is a regular language if there exists an FSA that accepts all the strings in the language, and no other strings. If E and F denote regular languages, than so do EF, etc. A relation is a set of pairs – here, pairs of strings. It is a regular relation if there exists an FST that accepts all the pairs in the language, and no other pairs. If E and F denote regular relations, then so do EF, etc. EF = {(ef,e’f’): (e,e’)  E, (f,f’)  F} Can you guess the definitions for E*, E+, E | F, E & F when E and F are regular relations? Surprise: E & F isn’t necessarily regular in the case of relations; so not supported.600.465 - Intro to NLP - J. Eisner 12 Common Regular Expression Operators (in XFST notation) concatenation EF * + iteration E*, E+ | union E | F & intersection E & F ~ \ - complementation, minus ~E, \x, F-E .x. crossproduct E .x. F E .x. F = {(e,f): e  E, f  F}  Combines two regular languages into a regular relation.600.465 - Intro to NLP - J. Eisner 13 Common Regular Expression Operators (in XFST notation) concatenation EF * + iteration E*, E+ | union E | F & intersection E & F ~ \ - complementation, minus ~E, \x, F-E .x. crossproduct E .x. F .o. composition E .o. F E .o. F = {(e,f): m. (e,m)  E, (m,f)  F}  Composes two regular relations into a regular relation.  As we’ve seen, this generalizes ordinary function composition.600.465 - Intro to NLP - J. Eisner 14 Common Regular Expression Operators (in XFST notation) concatenation EF * + iteration E*, E+ | union E | F & intersection E & F ~ \ - complementation, minus ~E, \x, F-E .x. crossproduct E .x. F .o. composition E .o. F .u upper (input) language E.u “domain” E.u = {e: m. (e,m)  E}600.465 - Intro to NLP - J. Eisner 15 Common Regular Expression Operators (in XFST notation) concatenation EF * + iteration E*, E+ | union E | F & intersection E & F ~ \ - complementation, minus ~E, \x, F-E .x. crossproduct E .x. F .o. composition E .o. F .u upper (input) language E.u “domain” .l lower (output) language E.l “range”Function from strings to ... a:x/.5 c:z/.7 e:y/.5 .3 Acceptors (FSAs)

View Full Document