Finite-State and the Noisy Channel
600.465 - Intro to NLP - J. Eisner (Johns Hopkins)

Slide 2: Word Segmentation
- What does this say: theprophetsaidtothecity
- And what other words are substrings?
- Could segment with parsing (how?), but slow.
- Given L = a "lexicon" FSA that matches all English words, how do we apply it to this problem?
- What if the lexicon is weighted?
- From unigrams to bigrams?
- Smooth L to include unseen words?

Slide 3: Spelling Correction
- Spelling correction also needs a lexicon L, but there is distortion ...
- Let T be a transducer that models common typos and other spelling errors:
  - ance ↔ ence (deliverance, ...)
  - e → ε (deliverance, ...)
  - ε → e // Cons _ Cons (athlete, ...)
  - rr → r (embarrass, occurrence, ...)
  - ge → dge (privilege, ...)
  - etc.
- Now what can you do with L .o. T?
- Should T and L have probabilities?
- Want T to include "all possible" errors ...

Slides 4-8: Noisy Channel Model
- General picture: a real language X passes through a noisy channel to come out as a yucky language Y, and we want to recover X from Y.
- Instances of the pattern:
  - X = correct spelling; channel = typos; Y = misspelling.
  - X = (lexicon space)*; channel = delete spaces; Y = text without spaces.
  - X = (lexicon space)* (the language model); channel = pronunciation (the acoustic model); Y = speech.
  - X = tree (probabilistic CFG); channel = delete everything but terminals; Y = text.

Slide 9: Noisy Channel Model
- p(X) * p(Y | X) = p(X, Y).
- Want to recover x ∈ X from y ∈ Y: choose the x that maximizes p(x | y), or equivalently p(x, y).

Slides 10-11: Noisy Channel Model
- As weighted machines: p(X) .o. p(Y | X) = p(X, Y). Note that p(x, y) sums to 1.
- Suppose y = "C"; what is the best x?

Slide 12: Noisy Channel Model
- p(X) .o. p(Y | X) .o. (Y = y) = p(X, y): composing with an acceptor for the observed y restricts the machine to just the paths compatible with output "C".
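
Slide 6's "delete spaces" channel is the simplest worked instance of the recipe on slides 9-12: that channel is deterministic, so p(y | x) = 1 for every x that spells out y, and maximizing p(x, y) reduces to maximizing p(x) over the compatible segmentations. The sketch below stands in for the weighted-FSA composition with an equivalent Viterbi-style dynamic program; the lexicon and its probabilities are invented for illustration.

    import math

    # Toy weighted lexicon (word -> unigram probability). A real L would be
    # a weighted FSA over all English words; these entries are made up.
    lexicon = {
        "the": 0.30, "prophet": 0.05, "said": 0.10, "to": 0.20,
        "city": 0.05, "prop": 0.01, "he": 0.05, "aid": 0.02, "it": 0.10,
    }
    MAXW = max(map(len, lexicon))          # longest lexicon word

    def segment(y):
        """Recover argmax_x p(x) over word sequences that concatenate to y.

        best[i] = best log-probability of any segmentation of y[:i];
        back[i] = where the last word of that segmentation starts.
        """
        n = len(y)
        best = [-math.inf] * (n + 1)
        back = [None] * (n + 1)
        best[0] = 0.0
        for i in range(1, n + 1):
            for j in range(max(0, i - MAXW), i):
                w = y[j:i]
                if w in lexicon and best[j] + math.log(lexicon[w]) > best[i]:
                    best[i] = best[j] + math.log(lexicon[w])
                    back[i] = j
        if back[n] is None:
            return None                    # no segmentation exists
        words, i = [], n
        while i > 0:
            words.append(y[back[i]:i])
            i = back[i]
        return words[::-1]

    print(segment("theprophetsaidtothecity"))
    # -> ['the', 'prophet', 'said', 'to', 'the', 'city']

Moving from unigrams to bigrams, as slide 2 asks, would index the DP state by the previous word as well as the string position.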

Slide 13: Morpheme Segmentation
- Let the lexicon be a machine that matches all Turkish words.
- Same problem as word segmentation, just at a lower level: morpheme segmentation.
- Turkish word: uygarlaştıramadıklarımızdanmışsınızcasına = uygar+laş+tır+ma+dık+ları+mız+dan+mış+sınız+ca+sı+na, "(behaving) as if you are among those whom we could not cause to become civilized".
- Some constraints on the morpheme sequence: bigram probabilities.
- Generative model: concatenate, then fix up the joints (stop + -ing = stopping, fly + -s = flies, vowel harmony); a sketch of this appears after slide 18 below.
- Use a cascade of transducers to handle all the fixups.
- But this is just morphology! Can use probabilities here too (but people often don't).

Slide 14: Edit Distance Transducer
- For an alphabet of size k: O(k) deletion arcs, O(k) insertion arcs, O(k^2) substitution arcs, O(k) no-change arcs.

Slide 15: Stochastic Edit Distance Transducer
- Same arc inventory (deletion, insertion, substitution, identity arcs), now weighted: likely edits = high-probability arcs.

Slide 16: Stochastic Edit Distance Transducer
- clara .o. T .o. caca: composing the two strings around the edit transducer yields a machine whose paths are the possible edit sequences between them.
- Find the best path by Dijkstra's algorithm.

Slide 17: Speech Recognition by FST Composition (Pereira & Riley 1996)
- trigram language model: p(word seq)
- .o. pronunciation model: p(phone seq | word seq)
- .o. acoustic model: p(acoustics | phone seq)
- .o. observed acoustics

Slide 18: Speech Recognition by FST Composition (Pereira & Riley 1996)
- The same cascade, zooming in: the pronunciation model maps each word to its phones (e.g., CAT : k æ t), each phone (e.g., æ) is modeled in its phone context, and the top level is still the trigram language model.
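
Slide 16's best path can be checked without any FST machinery: in the composition clara .o. T .o. caca, the states are pairs of string positions, and the cheapest path is exactly what the classic edit-distance dynamic program computes. A minimal sketch with unit costs (the stochastic transducer of slide 15 would put -log-probability weights on the arcs instead; the code is illustrative, not from the slides):

    def edit_distance(x, y):
        """Cost of the cheapest path through x .o. (edit transducer) .o. y.

        d[i][j] = cheapest way to rewrite x[:i] as y[:j]. With unit costs
        this is plain Levenshtein distance; with -log p arc weights the
        same recurrence finds the most probable edit sequence.
        """
        m, n = len(x), len(y)
        d = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            d[i][0] = i                                   # i deletions
        for j in range(1, n + 1):
            d[0][j] = j                                   # j insertions
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                d[i][j] = min(
                    d[i - 1][j] + 1,                      # delete x[i-1]
                    d[i][j - 1] + 1,                      # insert y[j-1]
                    d[i - 1][j - 1] + (x[i - 1] != y[j - 1]),  # sub or match
                )
        return d[m][n]

    print(edit_distance("clara", "caca"))  # -> 2 (delete 'l', substitute r -> c)

Every arc of the composed machine advances at least one string position, so the machine is acyclic and this sweep in topological order suffices; Dijkstra's algorithm, as on slide 16, handles weighted graphs in general.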


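The .o. operator doing all the work above (slides 10-12, and the Pereira & Riley cascade on slides 17-18) is itself just a product construction on states. Below is a minimal sketch of epsilon-free weighted composition, applied to an invented miniature of the "suppose y = 'C'; what is the best x?" question from slides 10-11; the machines, symbols, and probabilities are all made up for illustration.

    from collections import defaultdict

    # An FST here is (start, finals, arcs); each arc is
    # (src, insym, outsym, prob, dst). No epsilon arcs, for simplicity.
    def compose(f, g):
        """Epsilon-free weighted composition f .o. g.

        Result states are pairs (f state, g state); an arc fires when f's
        output symbol matches g's input symbol, and probabilities multiply.
        """
        fs, ff, fa = f
        gs, gf, ga = g
        f_out = defaultdict(list)        # f's arcs indexed by source state
        for (p, a, b, w, r) in fa:
            f_out[p].append((a, b, w, r))
        g_in = defaultdict(list)         # g's arcs indexed by (source, insym)
        for (q, b, c, w, r) in ga:
            g_in[(q, b)].append((c, w, r))
        start = (fs, gs)
        arcs, finals = [], set()
        stack, seen = [start], {start}
        while stack:
            q1, q2 = stack.pop()
            if q1 in ff and q2 in gf:
                finals.add((q1, q2))
            for (a, b, w1, r1) in f_out[q1]:
                for (c, w2, r2) in g_in[(q2, b)]:
                    dst = (r1, r2)
                    arcs.append(((q1, q2), a, c, w1 * w2, dst))
                    if dst not in seen:
                        seen.add(dst)
                        stack.append(dst)
        return (start, finals, arcs)

    # Toy noisy channel: a prior over one-letter inputs, a channel that
    # sometimes garbles either letter to "C", and the observation Y = "C".
    prior   = (0, {1}, [(0, "a", "a", 0.7, 1), (0, "b", "b", 0.3, 1)])
    channel = (0, {1}, [(0, "a", "A", 0.9, 1), (0, "a", "C", 0.1, 1),
                        (0, "b", "B", 0.2, 1), (0, "b", "C", 0.8, 1)])
    observe = (0, {1}, [(0, "C", "C", 1.0, 1)])

    _, _, arcs = compose(compose(prior, channel), observe)
    best = max(arcs, key=lambda arc: arc[3])   # every path here is one arc
    print(best[1], best[3])                    # -> b 0.24

The prior prefers x = "a" (0.7 vs. 0.3), but "a" rarely comes out as "C", so after restricting to the observed output as on slide 12, the best path carries x = "b": p(b) p(C | b) = 0.24 beats p(a) p(C | a) = 0.07.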

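As promised under slide 13, here is the "concatenate, then fix up the joints" idea sketched directly. Real systems compile each fixup into a transducer and compose the cascade (that is how vowel harmony and similar processes are handled); the '+' joint marker and the two rewrite rules below are invented simplifications, e.g. real English consonant doubling is phonologically conditioned, not keyed to the letter 'p'.

    import re

    # Tiny cascade of joint-fixup rules, applied in order to a string of
    # morphemes joined by '+'. Each rule is a stand-in for one transducer.
    RULES = [
        (re.compile(r"p\+ing\b"), "pping"),   # stop + -ing -> stopping
        (re.compile(r"y\+s\b"), "ies"),       # fly + -s -> flies
    ]

    def fix_joints(form):
        for pattern, replacement in RULES:
            form = pattern.sub(replacement, form)
        return form.replace("+", "")          # joints with no rule just concatenate

    print(fix_joints("stop+ing"), fix_joints("fly+s"))  # -> stopping flies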