CMSC 723 / LING 645: Intro to Computational Linguistics
September 29, 2004: Dorr
Toward a Kimmo FAQ
Prof. Bonnie J. Dorr, Dr. Christof Monz, TA: Adam Lee

FAQ for Kimmo

1. What order does PC-Kimmo go through the FSAs? Is it in the order listed in the .aut file?

NO. They are traversed simultaneously, as described in class last week. I'll emphasize this again this week.

2. In the .aut file, there is an underscore in all of the FSAs. What does this mean?

You should only use the characters that I have specified for German in the lab description. The underscore appears in the English automata only because they were used in a working system after tokenization had taken place, so it was there to deal with tokens like "because_of" or "so_that". You don't need to do anything like this, so ignore it for your lab.

3. Can you have two letters after the colon (replacement) operator?

NO. As described last week in class, only single characters are allowed.

4. In an FSA where specificity matters, it is only the feasible pair that matters, not the individual letter, correct? For example, if I have a pair e:i, I do not need to mention e or i specifically if I do not want this feasible pair to happen again?

If you have the feasible pair e:i in one automaton, you do not need to mention that pair explicitly again in some other automaton. Unfortunately, however, once you have the feasible pair e:i, it will affect what you put in the other automata. In particular, all of your other automata need a transition that covers that case, even if it is only a catch-all transition. That is, you need to make sure you don't fail in some other automaton while you are converting an e to an i in the e-to-i automaton.

5. Nouns in German are capitalized, and this is how they are listed, but the root forms are not capitalized. I assumed we should make the root forms capitals. Is this correct?

No, the root forms need not be capital letters. We stated this assumption explicitly in the lab: for the purpose of this project, use lower-case characters only, even for German nouns, which are otherwise usually capitalized.

6. In PC-Kimmo, how do we say which states are end states for the FSAs?

I COMPLETELY FORGOT TO MENTION THIS LAST WEEK. However, it is very clearly stated in the PC-Kimmo manual: use a colon to indicate a final state and a dot to indicate a non-final state. Although I didn't mention this, it is evident in the matrix form of the automaton I showed last week.
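
The answers about simultaneous traversal, feasible-pair coverage, and final states can be tried out in a small simulation. What follows is a minimal Python sketch, not PC-Kimmo itself. The names Automaton, run_parallel, OTHER, and FAIL, the two toy rules, and the example word are all hypothetical and chosen only for illustration. The sketch assumes that a column labelled @:@ acts as a catch-all for pairs the automaton does not list, and that state 0 means failure.

```python
from typing import Dict, List, Sequence, Set, Tuple

Pair = Tuple[str, str]        # (lexical character, surface character)
OTHER: Pair = ("@", "@")      # assumed catch-all column
FAIL = 0                      # assumed failure state


class Automaton:
    """One two-level rule stored in matrix form: one column per pair."""

    def __init__(self, name: str, columns: List[Pair],
                 rows: Dict[int, List[int]], finals: Set[int]) -> None:
        self.name = name
        self.columns = columns
        self.rows = rows      # state -> next state for each column
        self.finals = finals

    def step(self, state: int, pair: Pair) -> int:
        """Follow one transition; fail if no column covers the pair."""
        if pair in self.columns:
            col = self.columns.index(pair)
        elif OTHER in self.columns:
            col = self.columns.index(OTHER)
        else:
            return FAIL
        return self.rows[state][col]

    def render(self) -> str:
        """Print rows with ':' after final states and '.' after non-final ones."""
        lines = []
        for state in sorted(self.rows):
            mark = ":" if state in self.finals else "."
            cells = " ".join(str(n) for n in self.rows[state])
            lines.append(f"{state}{mark} {cells}")
        return "\n".join(lines)


def run_parallel(automata: Sequence[Automaton], pairs: Sequence[Pair],
                 start: int = 1) -> bool:
    """Advance ALL automata on each pair; file order plays no role."""
    states = [start] * len(automata)
    for pair in pairs:
        states = [a.step(s, pair) for a, s in zip(automata, states)]
        if FAIL in states:
            return False      # one failing automaton rejects the whole pairing
    return all(s in a.finals for a, s in zip(automata, states))


# Hypothetical rule: e:i is allowed only when n:n follows immediately.
e_to_i = Automaton(
    "e-to-i",
    columns=[("e", "i"), ("n", "n"), OTHER],
    rows={1: [2, 1, 1], 2: [0, 1, 0]},
    finals={1},
)

# A second, unrelated rule that covers e:i through its catch-all column.
covers_all = Automaton("covers-all", [("s", "s"), OTHER], {1: [1, 1]}, {1})

# The same rule without a catch-all column: it has no transition for e:i.
no_default = Automaton("no-default", [("s", "s"), ("n", "n")], {1: [1, 1]}, {1})

word = [("s", "s"), ("e", "i"), ("n", "n")]   # lexical "sen", surface "sin"

accepted_with_default = run_parallel([e_to_i, covers_all], word)
accepted_without_default = run_parallel([e_to_i, no_default], word)
accepted_when_cut_short = run_parallel([e_to_i, covers_all], word[:2])
```

With these toy rules, the word is accepted when the second automaton has a catch-all column and rejected when it does not, even though the e-to-i automaton itself is satisfied in both runs. Cutting the input off after e:i leaves the e-to-i automaton in state 2, which is non-final, so that run is rejected as well. Calling e_to_i.render() shows the colon and dot markers next to the state numbers.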