600.465 - Intro to NLP - J. Eisner

Slide 1: Finite-State and the Noisy Channel

Slide 2: Word Segmentation
- theprophetsaidtothecity
- What does this say? And what other words are substrings?
- Could segment with parsing (how?), but slow.
- Given L = a "lexicon" FSA that matches all English words. How to apply it to this problem?
- What if the lexicon is weighted? From unigrams to bigrams?
- Smooth L to include unseen words?

Slide 3: Spelling Correction
- Spelling correction also needs a lexicon L.
- But there is distortion ...
- Let T be a transducer that models common typos and other spelling errors:
  - ance → ence (deliverance, ...)
  - e → ε (deliverance, ...)
  - ε → e // Cons _ Cons (athlete, ...)
  - rr → r (embarrass, occurrence, ...)
  - ge → dge (privilege, ...)
  - etc.
- Now what can you do with L .o. T ?
- Should T and L have probabilities?
- Want T to include "all possible" errors ...

Slide 4: Noisy Channel Model
- noisy channel: X → Y
- real language X; yucky language Y
- want to recover X from Y

Slide 5: Noisy Channel Model
- X = correct spelling; channel = typos; Y = misspelling

Slide 6: Noisy Channel Model
- X = (lexicon space)*; channel = delete spaces; Y = text w/o spaces

Slide 7: Noisy Channel Model
- X = (lexicon space)* (language model); channel = pronunciation (acoustic model); Y = speech

Slide 8: Noisy Channel Model
- X = tree (probabilistic CFG); channel = delete everything but terminals; Y = text

Slide 9: Noisy Channel Model
- p(X) * p(Y | X) = p(X, Y)
- want to recover x ∈ X from y ∈ Y
- choose x that maximizes p(x | y), or equivalently p(x, y)
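Slide 2's word-segmentation problem can be sketched directly as a dynamic program. This is a minimal illustration, not the lecture's FSA construction: the toy lexicon and its probabilities are invented for the example, and the DP over split points plays the role of finding the best path through the input composed with the weighted lexicon's closure L*.

```python
import math

# Toy weighted lexicon (hypothetical probabilities, for illustration only;
# slide 2's L would be a large FSA covering all English words).
LEXICON = {
    "the": 0.05, "prophet": 1e-4, "said": 2e-3, "to": 0.03,
    "city": 5e-4, "he": 0.01, "prop": 1e-5, "at": 0.01,
}

def segment(text, lexicon=LEXICON, max_word_len=12):
    """Best segmentation under a unigram model: maximize the sum of
    log p(word). Equivalent to the best path through text .o. L*,
    but done here with a simple dynamic program over split points."""
    n = len(text)
    best = [None] * (n + 1)          # best[i] = (score, backpointer j, word)
    best[0] = (0.0, 0, "")
    for i in range(1, n + 1):
        for j in range(max(0, i - max_word_len), i):
            w = text[j:i]
            if w in lexicon and best[j] is not None:
                score = best[j][0] + math.log(lexicon[w])
                if best[i] is None or score > best[i][0]:
                    best[i] = (score, j, w)
    if best[n] is None:
        return None                  # no segmentation covers the whole string
    words, i = [], n
    while i > 0:
        _, j, w = best[i]
        words.append(w)
        i = j
    return words[::-1]
```

On the slide's example, `segment("theprophetsaidtothecity")` returns `["the", "prophet", "said", "to", "the", "city"]`; replacing the unigram scores with bigram probabilities would answer the slide's "from unigrams to bigrams?" question.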
Slide 10: Noisy Channel Model
- p(X) .o. p(Y | X) = p(X, Y)
- Note p(x, y) sums to 1.
- Suppose y = "C"; what is best "x"?

Slide 11: Noisy Channel Model
- p(X) .o. p(Y | X) = p(X, Y)
- Suppose y = "C"; what is best "x"?

Slide 12: Noisy Channel Model
- p(X) .o. p(Y | X) .o. (Y = y)? = p(X, y)
- restrict just to paths compatible with output "C"

Slide 13: Morpheme Segmentation
- Let Lexicon be a machine that matches all Turkish words.
- Same problem as word segmentation, just at a lower level: morpheme segmentation.
- Turkish word: uygarlaştıramadıklarımızdanmışsınızcasına = uygar+laş+tır+ma+dık+ları+mız+dan+mış+sınız+ca+sı+na
  "(behaving) as if you are among those whom we could not cause to become civilized"
- Some constraints on morpheme sequence: bigram probs.
- Generative model: concatenate, then fix up the joints
  (stop + -ing = stopping, fly + -s = flies, vowel harmony).
- Use a cascade of transducers to handle all the fixups.
- But this is just morphology! Can use probabilities here too (but people often don't).

Slide 14: Edit Distance Transducer
- O(k) deletion arcs
- O(k) insertion arcs
- O(k²) substitution arcs
- O(k) no-change arcs

Slide 15: Edit Distance Transducer (Stochastic)
- O(k) deletion arcs, O(k) insertion arcs, O(k²) substitution arcs, O(k) identity arcs
- Likely edits = high-probability arcs

Slide 16: Edit Distance Transducer (Stochastic)
- clara .o. (stochastic edit distance transducer) .o. caca
- Best path (by Dijkstra's algorithm)

Slide 17: Speech Recognition by FST Composition (Pereira & Riley 1996)
- trigram language model .o. pronunciation model .o. acoustic model .o. observed acoustics
- p(word seq) * p(phone seq | word seq) * p(acoustics | phone seq)

Slide 18: Speech Recognition by FST Composition (Pereira & Riley 1996)
- p(word seq) * p(phone seq | word seq) * p(acoustics | phone seq)
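The composition on slide 16 — clara .o. T .o. caca — produces an acyclic lattice whose best path is the minimum-cost edit sequence. A minimal sketch of that best-path computation, with assumed unit costs standing in for the transducer's negative log-probabilities:

```python
def edit_distance(x, y, del_cost=1.0, ins_cost=1.0, sub_cost=1.0):
    """Min-cost path through x .o. EditDistanceTransducer .o. y.

    With arc costs read as negative log-probabilities, the minimum-cost
    alignment is the most probable edit sequence (slide 15's stochastic
    version); identity arcs cost 0 here.
    """
    m, n = len(x), len(y)
    D = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = D[i - 1][0] + del_cost          # delete all of x
    for j in range(1, n + 1):
        D[0][j] = D[0][j - 1] + ins_cost          # insert all of y
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match = 0.0 if x[i - 1] == y[j - 1] else sub_cost
            D[i][j] = min(D[i - 1][j] + del_cost,       # delete x[i-1]
                          D[i][j - 1] + ins_cost,       # insert y[j-1]
                          D[i - 1][j - 1] + match)      # substitute / copy
    return D[m][n]
```

On slide 16's pair, `edit_distance("clara", "caca")` gives 2.0 (delete the l, substitute r → c). Since the composed lattice is acyclic, this left-to-right dynamic program finds the same best path that Dijkstra's algorithm would.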
[Figure: fragments of the composed machine, e.g. a pronunciation-model arc CAT : k æ t and phone-in-context arcs for æ, under the trigram language model.]
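Slide 3's typo transducer and slide 9's decision rule combine into a noisy-channel spelling corrector: choose the lexicon word x maximizing p(x) * p(y | x). A minimal sketch, with hypothetical priors and an assumed per-edit penalty in place of a trained channel model T:

```python
import math

# Hypothetical priors p(x); a real system would use smoothed corpus counts.
PRIORS = {"deliverance": 1e-5, "difference": 1e-4, "privilege": 2e-5}

EDIT_PENALTY = 7.0   # assumed -log p(one edit); would be tuned on data

def edits(x, y):
    """Plain Levenshtein distance, standing in for the best path
    through the typo transducer T (slide 3)."""
    m, n = len(x), len(y)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cur[j] = min(prev[j] + 1,                        # delete
                         cur[j - 1] + 1,                     # insert
                         prev[j - 1] + (x[i - 1] != y[j - 1]))  # sub / copy
        prev = cur
    return prev[n]

def correct(y, priors=PRIORS):
    """argmax_x p(x) * p(y | x), i.e. minimize
    -log p(x) + EDIT_PENALTY * edits(x, y)  (slide 9's decision rule)."""
    return min(priors,
               key=lambda x: EDIT_PENALTY * edits(x, y) - math.log(priors[x]))
```

For example, `correct("deliverence")` picks "deliverance" (one edit beats the higher-prior but more distant "difference"), matching slide 3's ance → ence error.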