
CMSC 723 / LING 645: Intro to Computational Linguistics
September 29, 2004: Dorr
Toward a Kimmo FAQ
Prof. Bonnie J. Dorr
Dr. Christof Monz
TA: Adam Lee

FAQ for Kimmo

1) What order does PC Kimmo go through the FSAs? Is it in the order listed in the .aut file?
NO. They are traversed simultaneously (as described in class last week); I'll emphasize this again this week.

2) In the .aut file, there is an underscore in all of the FSAs. What does this mean?
You should only use the characters that I have specified for German in the lab description. The underscore appears in the English file only because that automaton was used in a working system, after tokenization had taken place. It was there to deal with tokens like "because_of" or "so_that". You don't need to do anything like this, so ignore it for your lab.

3) Can you have two letters after the colon (replacement operator)?
NO. As described last week in class, only single characters are allowed.

4) In an FSA, where specificity matters, it is only the feasible pair that matters, not the individual letter, correct? (For example, if I have a pair of e:i, I do not need to mention e or i specifically if I do not want this feasible pair to happen again.)
If you have the feasible pair e:i in one automaton, you do not need to mention that pair explicitly again in some other automaton. Unfortunately, once you have the feasible pair e:i, it will affect what you put in other automata. In particular, all your other automata need to have a transition that covers that case, even if it is simply the transition =:=; that is, you need to make sure you don't fail in some *other* automaton while you are converting an "e" to an "i" in the e-to-i automaton.

5) Nouns in German are capitalized, and this is how they are listed, but the root forms are not capitalized? I assumed we should make the root forms capitals. Is this correct?
No, the root forms need not be capital letters. We stated this assumption explicitly in the lab: "For the purpose of this project, use lower case characters only, even for German nouns (which are usually otherwise capitalized)."

6) In PC Kimmo, how do we say which states are end states for the FSAs?
I COMPLETELY FORGOT TO MENTION THIS LAST WEEK! However, it is very clearly stated in the PC Kimmo manual. Use colon (:) to indicate a final state and dot (.) to indicate a non-final state. (Although I didn't mention this, it is evident in the matrix form of the automaton I showed last
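
The answers to questions 1, 4, and 6 can be illustrated together with a small simulation. The following is a minimal Python sketch, not PC Kimmo itself: the `Automaton` class, the `accepts` function, the use of state 0 for failure, state 1 as the start state, and the three example automata (including the rule that e:i must be followed by n:n) are all assumptions made up for illustration. It also simplifies specificity to a per-row lookup, where =:= is used only when the row has no explicit entry for the pair.

```python
from typing import Dict, Sequence, Set, Tuple

Pair = Tuple[str, str]   # (lexical character, surface character)
ANY: Pair = ("=", "=")   # the =:= catch-all transition from question 4
FAIL = 0                 # assumed convention: state 0 means "no transition"


class Automaton:
    """One two-level rule automaton, stored as {state: {pair: next_state}}."""

    def __init__(self, name: str, rows: Dict[int, Dict[Pair, int]], finals: Set[int]) -> None:
        self.name = name
        self.rows = rows
        self.finals = finals

    def step(self, state: int, pair: Pair) -> int:
        """Follow the explicit pair if the row lists it, otherwise fall back to =:=."""
        row = self.rows.get(state, {})
        if pair in row:
            return row[pair]
        return row.get(ANY, FAIL)

    def label(self, state: int) -> str:
        """Render a state as in question 6: colon for final, dot for non-final."""
        return f"{state}:" if state in self.finals else f"{state}."


def accepts(automata: Sequence[Automaton], pairs: Sequence[Pair]) -> bool:
    """Advance ALL automata on every pair (question 1); any failure rejects."""
    states = [1 for _ in automata]  # assumed: every automaton starts in state 1
    for pair in pairs:
        states = [a.step(s, pair) for a, s in zip(automata, states)]
        if FAIL in states:
            return False
    # Every automaton must end in one of its final states.
    return all(s in a.finals for a, s in zip(automata, states))


# Hypothetical e-to-i automaton: e:i is allowed only when n:n follows.
e_to_i = Automaton(
    "e-to-i",
    {1: {("e", "i"): 2, ANY: 1},   # state 1 is final
     2: {("n", "n"): 1}},          # state 2 is non-final
    finals={1},
)

# Hypothetical second automaton with no e:i column and no =:= column.
strict = Automaton(
    "other (no =:=)",
    {1: {("b", "b"): 1, ("n", "n"): 1, ("s", "s"): 1}},
    finals={1},
)

# Same automaton with the =:= transition added, as question 4 recommends.
lenient = Automaton(
    "other (with =:=)",
    {1: {("b", "b"): 1, ("n", "n"): 1, ("s", "s"): 1, ANY: 1}},
    finals={1},
)

word = [("b", "b"), ("e", "i"), ("n", "n")]

if __name__ == "__main__":
    print(accepts([e_to_i, lenient], word))      # both automata survive every pair
    print(accepts([e_to_i, strict], word))       # the second automaton blocks on e:i
    print(accepts([e_to_i, lenient], word[:2]))  # input ends in a non-final state
    print(e_to_i.label(1), e_to_i.label(2))
```

In this sketch the pair sequence is rejected as soon as any one automaton has no transition for the current pair, which is why the automaton without =:= blocks the e:i pair even though it has nothing to say about that rule. Cutting the input off right after e:i is rejected for a different reason: the e-to-i automaton is left in a non-final state.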

