DOC PREVIEW
UMD CMSC 723 - Finite-State Morphology

This preview shows page 1-2-3-4-31-32-33-34-35-64-65-66-67 out of 67 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 67 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Finite-State MorphologyCMSC 723: Computational Linguistics I ― Session #3Jimmy LinJimmy LinThe iSchoolUniversity of MarylandWednesday, September 16, 2009Today’s Agenda| Computational toolsz Regular expressionsz Finite-state automata (deterministic vs. non-deterministic)z Finite-state transducers|Overview of morphological processes|Overview of morphological processes| Computational morphology with finite-state methodsRegular Expressions| A metalanguage for specifying simple classes of stringsz Very useful in searching and matching text strings| Everyone does it!z Implementations in the shell, Perl, Java, Python, …Regular Expressions| Basic regular expressions/happy/ → happy/[abcd]/ → a, b, c, d/[a-d]/ → a, b, c, d/[^a-d]/→e, f, g, … z/[ ad]/ e, f, g, … z/[Tt]he/ → the, The/(dog|cat)/ → dog, cat| Special metacharacters/colou?r/ → color, colour/oo*h!/ → oh! ooh!oooh!/ooh!/ → oh!, ooh!, oooh!, …/oo+h!/ → ooh!, oooh!, ooooh!, …/beg.n/ → began, begin, begun, begbn, …NLP*with Regular ExpressionsTranscript with Eliza, simulation of a Rogerian psychotherapist (Weizenbaum, 1966)User: Men are all alikeELIZA: IN WHAT WAYUser: They’re always bugging us about something or otherELIZA: CAN YOU THINK OF A SPECIFIC EXAMPLE?User: Well, my boyfriend made me come hereELIZA: YOUR BOYFRIEND MADE YOU COME HEREUser: He says I’m depressed much of the timeUser: He says Im depressed much of the timeELIZA: I AM SORRY TO HEAR THAT YOU ARE DEPRESSEDHow did it work?| .* all .* → IN WHAT WAY| .* always .* → CAN YOU THINK OF A SPECIFIC EXAMPLE| .* I’m (depressed|sad) .* → I AM SORRY TO HEAR YOU ARE \1 | .* I’m (depressed|sad) .* → WHY DO YOU THINK YOU ARE \1?Aside…| What is intelligence?| What does Eliza tell us about intelligence?at does a te us about te ge ceEquivalence Relations| We can say the followingz Regular expressions describe a regular languagez Regular expressions can be implemented by finite-state automataz Regular languages can be generated by regular grammars|So what?|So what?RegularLanguagesRegular ExpressionsFinite-State AutomataLanguagesRegular GrammarsSheeptalk!baa!b!Language:RlE ibaaa!baaaa!baaaaa!.../baa+!/Regular Expression:FiniteState Automaton:baa!Finite-State Automaton:q0q1q2q3q4aFinite-State Automata| What are they?| What do they do?at do t ey do| How do they work?FSA: What are they?| Q: a finite set of N states z Q = {q0, q1, q2, q3, q4}z The start state: q0z The set of final states: F = {q4}|Σ: a finite input alphabet of symbols|Σ: a finite input alphabet of symbolsz Σ = {a, b, !}|δ(qi): transition function|δ(q,i): transition function z Given state q and input symbol i, return new state q'z δ(q3,!) → q4q0q1q2q3q4baa!q0q1q2q3q4aFSA: State Transition TableInputStateba!Stateba!0 1∅∅1∅2∅1∅2∅2∅3∅3∅343∅344∅∅ ∅q0q1q2q3q4baa!q0q1q2q3q4aFSA: What do they do?| Given a string, a FSA either rejects or accepts itz ba! → rejectz baa! → acceptz baaaz! → rejectzbaaaa!→acceptbaaaa! acceptz baaaaaa! → acceptz baa → rejectmoooorejectzmoooo→ reject| What does this have to do with NLP?zThink grammaticality!zThink grammaticality!FSA: How do they work?q0q1q2q3q3q4baaa!ACCEPTbaa!q0q1q2q3q4aFSA: How do they work?q0q1q2ba!!!REJECTbaa!q0q1q2q3q4aD-RECOGNIZEAccept or Generate?| Formal languages are sets of stringsz Strings composed of symbols drawn from a finite alphabet| Finite-state automata define formal languages z Without having to enumerate all the strings in the language| Two views of FSAs:z Acceptors that can tell you if a string is in the languageGenerators to produce all and only the strings in the languagezGenerators to produce all and only the strings in the languageSimple NLP with FSAsIntroducing Non-Determinism| Deterministic vs. Non-deterministic FSAs| Epsilon (ε) transitionsUsing NFSAs to Accept Strings| What does it mean?z Accept: there exist at least one path (need not be all paths)z Reject: no paths exist| General approaches:z Backup: add markers at choice points, then possibly revisit unexplored arcs at marked choice pointz Look-ahead: look ahead in input to provide cluesz Parallelism: look at alternatives in parallel| Recognition with NFSAs as search through state space()z Agenda holds (state, tape position) pairsND-RECOGNIZEND-RECOGNIZEState Orderings| Stack (LIFO): depth-first|Queue (FIFO): breadth-firstQueue ( O) b eadtstND-RECOGNIZE: ExampleACCEPTWhat’s the point?| NFSAs and DFSAs are equivalentz For every NFSA, there is a equivalent DFSA (and vice versa)| Equivalence between regular expressions and FSAz Easy to show with NFSAs| Why use NFSAs?Regular Language: Definition| ∅ is a regular language| ׊a א Σ ׫ε, {a} is a regular language a׫ε,{a} s a egu a a guage| If L1and L2are regular languages, then so are:zL1·L2={xy|xאL1,yאL2}, theconcatenationof L1and L2L1 L2 {xy| xאL1, yאL2}, the concatenationof L1and L2z L1׫ L2, the union or disjunction of L1and L2z L1כ, the Kleene closure of L1Regular Languages: Starting PointsRegular Languages: ConcatenationRegular Languages: DisjunctionRegular Languages: Kleene ClosureFinite-State Transducers (FSTs)| A two-tape automaton that recognizes or generates pairs of strings| Think of an FST as an FSA with two symbol strings on each arcz One symbol string from each tapeFour-fold view of FSTs| As a recognizer|As a generatorsage eato| As a translator|As a set relater|As a set relaterSummary: Computational Tools| Regular expressions| Finite-state automata (deterministic vs. non-deterministic)testate auto ata (dete st c s odete st c)| Finite-state transducersComputational Morphology| Definitions and problemsz What is morphology?z Topology of morphologies| Computational morphologyz Finite-state methodsMorphology| Study of how words are constructed from smaller units of meaning| Smallest unit of meaning = morphemez fox has morpheme foxz cats has two morphemes cat and –sz Note: it is useful to distinguish morphemes from orthographic rules|Two classes of morphemes:|Two classes of morphemes:z Stems: supply the “main” meaningz Affixes: add “additional” meaningTopology of Morphologies| Concatenative vs. non-concatenative| Derivational vs. inflectionale ato a s ecto a| Regular vs. irregularConcatenative Morphology| Morpheme+Morpheme+Morpheme+…|Stems (also called lemma, base form, root, lexeme):Ste s (a so ca ed e a, base o , oot, e e e)z hope+ing → hopingz hop+ing → hopping| Affixes:z Prefixes:


View Full Document

UMD CMSC 723 - Finite-State Morphology

Documents in this Course
Lecture 9

Lecture 9

12 pages

Smoothing

Smoothing

15 pages

Load more
Download Finite-State Morphology
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Finite-State Morphology and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Finite-State Morphology 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?