SWARTHMORE CS 97 - One sense per collocation for prepositions

Appeared in: Proceedings of the Class of 2003 Senior Conference, pages 8–14
Computer Science Department, Swarthmore College

One sense per collocation for prepositions
Hollis Easter & Benjamin Schak
May 7th, 2003

Abstract

This paper presents an application of the one-sense-per-collocation hypothesis to the problem of word sense disambiguation for prepositions. The hypothesis is tested through translation using a bilingual French-English corpus. The paper shows that one-sense-per-collocation does hold for prepositions.

1 Introduction

The one-sense-per-collocation hypothesis (Yarowsky 1993) states that words¹ tend to occur with only one sense within different instances of the same collocation. Yarowsky (1993) tested this hypothesis with strong results on coarse-grained senses of ambiguous nouns, verbs, and adjectives. Although Martinez and Agirre (2000) achieved weaker results for fine-grained sense distinctions, the hypothesis can help a wide range of natural language processing tasks. Since the one-sense-per-collocation hypothesis is implicit in much of the previous work, such as the work on translating prepositions in (Japkowicz, 1991), an evaluation of the Hypothesis could yield improvement in translation systems. This paper discusses compelling reasons why the Hypothesis should hold, and tests the Hypothesis on a bilingual English-French corpus.

Our first problem is how to define senses for prepositions. Yarowsky (1993) gives several ways to approach this. One way is the “hand-tagged homograph method,” in which one uses a corpus tagged with the correct senses of each word. This won’t work for us because no corpus known to us has reliable sense distinctions for prepositions. We also want to avoid methods based on homophones, ambiguities in online character recognition, and pseudo-words because the closed class of prepositions is too small. So, we equate the notion of a sense with that of a French translation.

¹ Words with more than one sense are polysemes.

1.1 Subcategorization

As noted above, there are two linguistic observations that recommend the one-sense-per-collocation hypothesis. The first of these is subcategorization, the notion that every noun, verb, and adjective selects (or “takes”) certain types of phrases for complements, and can determine the heads of those complements. For example, consider the English adjective interested, translated into French as intéressé. Sentences (1) and (2) show that interested must take a prepositional phrase headed by the preposition in as its complement, while intéressé must take a prepositional phrase headed by par.

John is interested *math / in math / *for math / *to math / *mathematic / *to do math. (1)

Jacques est intéressé *les maths / par les maths / *pour les maths / *aux maths / *mathématique / *faire les maths. (2)

It should be clear that there is nothing about mathematics per se that requires one preposition or another; while one can be interested in math, one can also rely on math or be afraid of math or look to math.

1.2 Noun-complement specificity

The second encouraging observation, used by Japkowicz and Wiebe (1991), is that many nouns may only be complements of certain prepositions. They assert that most nouns may only be used with particular prepositions, and that analogous nouns in different languages (English and French, for example) admit different prepositions because the languages conceptualize those nouns differently.
For example, in saying on the bus but dans l’autobus (literally “in the bus”), “English conceptualizes the bus as a surface that can support entities, by highlighting only its bottom platform, while French conceptualizes the bus as a volume that can contain entities, by highlighting its bottom surface, its sides, and its roof altogether.” (Japkowicz, 1991)²

² Readers may wonder when prepositions are determined by a preceding word and when they are determined by a complement. We suspect that adverbial prepositional phrases, such as Japkowicz and Wiebe’s locatives, are determined by their complements, while prepositional phrases governed by a preceding noun, verb, or adjective are determined by their governor.

1.3 Local collocations

In testing one-sense-per-collocation for nouns, verbs, and adjectives, Yarowsky (1993) tested only local collocations. That is, he ignored the possibility that distant content words could give reliable information for sense disambiguation. We do the same here, and with better cause. While it is somewhat plausible that senses of nouns, verbs, and adjectives (categories whose words are replete with meaning) could be inferred from distant context, such a situation seems unlikely for prepositions.

1.4 Potential problems

Given these sensible arguments for the Hypothesis, why bother testing it? Trujillo (1992) provides examples where the one-sense-per-collocation hypothesis fails. He presents an English sentence (3) with three plausible Spanish translations (4).

She ran under the bridge. (3)

Corrió debajo / por debajo / hasta debajo del puente. (4)

The first translation implies that she was running around under the bridge, the second that she ran on a path that went under the bridge and kept going, and the third that she ran up to a position under the bridge and stopped. We hope, however, that this example is an infrequent special case, and can be overcome. Sentence (3) usually translates best with por debajo, and the same sentence with the verb rested translates best with debajo de.

Another possible problem is that individual speakers may use different prepositional phrases for essentially the same concept. While one speaker may use on top of, another may use atop, another on, and so on. Given these issues, additional testing is warranted.

2 Methods

To test the Hypothesis, we used the sentence-aligned Hansards of the 36th Parliament of Canada, a French-English bilingual corpus (Hansards, 2001). Our analysis takes four steps:

1. We preprocess the French sentences, changing au to à le, aux to à les, du to de le, des to de les, and d’ to de.
2. We create a database, for each preposition in our list³, with one record for each
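
The preprocessing in step 1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' code: the function name expand_contractions and the regular expression are hypothetical, and only the five rewrites (au, aux, du, des, d’) come from the text.

```python
import re

# Rewrite table copied from step 1 of the text.
CONTRACTIONS = {
    "au": "à le",
    "aux": "à les",
    "du": "de le",
    "des": "de les",
}

# Assumption: a contraction counts only as a whole word; d' (straight or
# curly apostrophe) counts only at the start of a word, so the d' inside
# "aujourd'hui" is left alone.
_PATTERN = re.compile(r"\b(aux|au|des|du)\b|\bd['’](?=\w)", re.IGNORECASE)


def expand_contractions(sentence: str) -> str:
    """Expand French contractions so each preposition is a separate token."""

    def _replace(match):
        word = match.group(1)
        if word is None:
            # The d' branch matched: emit "de" plus a separating space.
            return "de "
        # Replacements are emitted in lowercase, so sentence-initial
        # capitalization is not preserved.
        return CONTRACTIONS[word.lower()]

    return _PATTERN.sub(_replace, sentence)


if __name__ == "__main__":
    print(expand_contractions("il parle du projet aux membres"))
    print(expand_contractions("beaucoup d'argent"))
```

This naive version applies the rewrites everywhere, so it also expands des used as an article and au inside hyphenated forms such as au-dessus. The text does not say how such cases were handled.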

