
CMSC723/LING645: MIDTERM SAMPLE QUESTIONS (10/12/04)

READINGS: Chapters 1, 2, 3, 5, 6.1-6.3, 7.1-7.3, 21; Rabiner's HMM tutorial, pp. 257-266 (not including section IV).

0. When and where is the midterm exam? Answer: Wed 10/19/04, CSIC 1122.

1. Give examples of 3 types of ambiguity (syntactic, lexical, semantic, pragmatic).

2. Why did early MT systems fail? What makes systems more successful nowadays?

3. Describe the tradeoffs among Direct, Transfer, and Interlingual systems. Consider the following characteristics: speed, specificity of rules, size of rule set, depth of coverage, breadth of coverage.

4. MT Evaluation: Is there a relation between human metrics and automatic metrics? Consider the notions of accuracy and fluency in your answer.

5. Bleu: Compute Unigram and Bigram scores for the following:

   MT:    Now the time attempts to boggle the mind for the man
   Ref 1: Now is the time that tries mens souls
   Ref 2: The the the the the
   Ref 3: War makes life difficult for the mind

   Note: Ignore capitalization. Also, "mens" counts as one word. How effective is Bleu in measuring the correctness of the translation, given these references? In your answer, you may allude to the notions of ambiguity, synonymy, differences between long references and short references, duplication of words in the reference, etc.

6. Finite State Machinery: What are the settings of the 5 parameters (Q, q0, Σ, F, δ) for a FSA that accepts cow language, i.e., strings defined by the regular expression mo? You must draw out your automaton. State what changes would be needed to accept the regular expression mo.

7. What is an example of how ELIZA uses regular expressions?

8. What is the difference between Inflectional and Derivational morphology? What is the difference between templatic and concatenative morphology? If reasonable, illustrate the difference with examples that have not been presented in class. What are we modeling in the morphology lab: inflectional or derivational morphology?

9. What are the pros and cons of compiling out the effects of rules into the lexicon, i.e., a lexicon-only system? What would be lost by doing this? What would be gained? In your answer, consider different languages, e.g., morphologically rich languages vs. morphologically poor languages. Also think about the downstream processes that take morphologically analyzed tokens as input: how does the compilation of morphological rules into the lexical entries impact these later processes?

10. How are FSTs different from FSAs? What is a feasible pair? What is a state transition table? What is an example of a FST rule interaction? Think back to your lab: what sort of rules interacted with each other? What happens when multiple rules are applicable to a particular string? Can you think of a case where rule ordering would have made it easier to build the automata for two interacting rules?

11. Can you express a two-level (Kimmo-style) rule in C&H notation (on pages 77-78)? E.g., slides 41-42, lecture 3 (Intermediate-to-Surface transducer).

12. How does the Porter Stemmer work? What applications is it useful for? Why?

13. Bayes Rule: How do we derive this? That is, given:

    (1) P(A|B) = P(A,B) / P(B)
    (2) P(A,B) = P(B,A)

    Prove: P(B|A) = P(A|B) P(B) / P(A)

    How does this relate to noisy channel?

14. Describe how the Noisy Channel model can be used for a particular application. Some of the ones we've talked about are: MT, POS tagging, speech recognition, OCR. Be sure to specify what the different components of the noisy channel refer to in this application, i.e., w, O, P(w), P(O), P(w|O), P(O|w). For this application, how would each of these be derived?

15. Consider the problem of speech recognition. Assume you have a corpus of size 3,715,820 where the word "about" occurs 3,725 times and "a bow" occurs 38 times. Describe the speech recognition problem for a given pronunciation, e.g., [ax b aw]. To simplify this problem, assume the only two words pronounced this way are "about" and "a bow". Compute the prior and likelihood for each of these two words and then determine which word has the highest likelihood for [ax b aw]. You may assume you have access to only one pronunciation rule, t d 0 V, which has a probability of 0.48. Recall that if no pronunciation rules are applicable to a given word, one pronunciation is available, so the probability of that pronunciation given the word is 1.

16. Consider N-grams combined with the noisy channel model. Above we looked at one word at a time (unigram). But even though the word "about" has the highest probability for the pronunciation [ax b aw] independent of context, the word "a bow" has a higher probability next to words like "and". N-grams allow us to think about context. Let's imagine there are only 4 words in the vocabulary: "about", "a bow", "and", and "the". Describe how to compute the probability that [ax b aw] is "about" when followed by "and" vs. "the". Compare this to the probability that [ax b aw] is "a bow" when followed by "and" vs. "the". Assume that the word "about" is followed by the word "the" with 0.99 probability and the word "and" with 0.01 probability. Conversely, assume the word "a bow" is followed by the word "the" with 0.01 probability and the word "and" with 0.99 probability.

17. What is the Markov Assumption?

18. How do we use a bigram grammar to compute P(S), the probability of a given sentence S? You may also hear "compute P(S) using a Maximum Likelihood Estimate". For example, what is P(like Chinese food) using a MLE, where "like" occurs 15000 times in a corpus of 1000000 words, "like Chinese" occurs 10000 times in the same corpus, "Chinese" occurs 80000 times in the same corpus, and "Chinese food" occurs 56000 times in the same corpus?

19. Sparse data problem: what is it? What can we do about it? What is the problem of add-one smoothing? Given a vocabulary of 1000 words (types) and a corpus of 100,000 words (tokens), we want to estimate the bi-gram probabilities. We know that half of the bi-grams do not occur in the corpus. What is the unsmoothed probability of a bi-gram that occurs 100 times in the corpus? What is its probability if we apply add-one smoothing? (You are allowed to round off probabilities.)

20. What are the characteristics of a hidden variable, i.e., what is the reason that some information is hidden?

21. HMMs can also be used for part-of-speech tagging. Consider the following string: "Time flies like an arrow", where "time" can be a noun or a verb and "like" can be a preposition or a verb. Below are the emission probabilities. For example, the probability that a noun emits the observation "arrow" is 0.2.

            det   noun  prep  verb
    an      0.2   0.0   0.0   0.0
    arrow   0.0   0.2   0 …
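As an illustration of the arithmetic behind question 18, the following is a minimal sketch of a bigram Maximum Likelihood Estimate, not an official solution. It assumes the sentence probability is the unigram probability of the first word times the conditional bigram probabilities of each following word, with no sentence-start or sentence-end markers; the function name `bigram_mle_probability` and the dictionary layout are hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple


def bigram_mle_probability(
    words: List[str],
    unigram_counts: Dict[str, int],
    bigram_counts: Dict[Tuple[str, str], int],
    corpus_size: int,
) -> float:
    """Estimate P(words) with unsmoothed bigram MLE.

    Assumption (not stated in the question): no <s> or </s> markers,
    so the first word is scored by its unigram relative frequency.
    """
    # P(w1) = count(w1) / N
    prob = unigram_counts[words[0]] / corpus_size
    # P(w_i | w_{i-1}) = count(w_{i-1} w_i) / count(w_{i-1})
    for prev, curr in zip(words, words[1:]):
        prob *= bigram_counts[(prev, curr)] / unigram_counts[prev]
    return prob


# Counts taken from question 18.
unigram_counts = {"like": 15000, "Chinese": 80000}
bigram_counts = {("like", "Chinese"): 10000, ("Chinese", "food"): 56000}

p = bigram_mle_probability(
    ["like", "Chinese", "food"], unigram_counts, bigram_counts, 1000000
)
print(p)  # approximately 0.007 under the assumptions above
```

Under these assumptions the product is (15000/1000000) x (10000/15000) x (56000/80000), which is about 0.007. A course solution that conditions the first word on a sentence-start symbol would need an additional count that the question does not give.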



UMD CMSC 723 - MIDTERM SAMPLE QUESTIONS
