Stanford CS 224 - Lecture Notes

A Comparison of Google N-grams and Gigaword Dependencies as Automatically Mined Features in Temporal Relation Extraction

Christopher Lin, Jessica Long, Arun Miduthuri
Computer Science Department / Electrical Engineering Department
Stanford University
[email protected], [email protected], [email protected]

Abstract

This paper describes a system for learning before-after temporal relations by training on text and dependency relation patterns mined from large-scale corpora. We mined the Google N-gram corpus to extract text patterns to use as features in a temporal ordering classification system. Though we identified these as qualitatively good features, classification results were only just above baseline. We compare this set of features to another set generated from a corpus of dependency relations in newswire text. The latter set of features improved classification beyond baseline by about 2%, but when combined with existing state-of-the-art features, worsened performance by a small amount.

1 Introduction

The extraction and classification of temporal relations in text is an important component of natural language understanding and has wide applications in information extraction, question answering, summarization, and inference. For example, comprehending the sentence "Sebastian Pinera is sworn in as president of quake-hit Chile as a 6.9-magnitude aftershock strikes the centre of the country" requires understanding that the event indicated by the phrase "sworn in" occurs during the event indicated by the phrase "strikes." The task of temporal ordering encompasses both classification of pairs of events and classification of pairs of events and times.
Recent work in this field has improved performance on this task by focusing on supervised machine learning techniques that extract, from event strings and their surrounding context, textual features indicative of particular types of temporal relations.

Unfortunately, the small size and sparse labelling of the TimeBank corpus limit the amount of data available for supervised machine learning methods to learn features for temporal relation classification. While TimeBank provides a number of useful textual features, such as part of speech, tense, modality, and polarity, this set of features is limited to what can be derived solely from the text of documents contained in the corpus. One potential way to address this problem is to harness the large-scale data available from other existing corpora and the World Wide Web to supplement the information in TimeBank. Using labelled examples from TimeBank, we can mine these other corpora for features characteristic of these event pairs in documents beyond the TimeBank corpus. Such features could be used to improve current temporal relation classifiers while also labelling additional event pairs to expand the corpora available for supervised learning in the field of temporal relations.

In this paper, we describe a system that compares contextual features drawn from TimeBank, text pattern features drawn from a Web corpus of N-grams to supplement TimeBank, and dependency features drawn from the Gigaword corpus of newswire text. We discuss previous work on temporal relation extraction in Section 2. Section 3 describes the design of our experiments, and Sections 4 and 5 describe our data and implementation. Section 6 gives the results of our experiments, which we analyze in Section 7 and suggest improvements upon in Section 8.

2 Previous Work

The TimeBank corpus for temporal relation classification was introduced by Pustejovsky, et al. (2003). Mani, et al.
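The mining step described above can be sketched concretely. The snippet below is an illustrative reconstruction, not the authors' implementation: the `ngram_counts` table is a toy stand-in for the Google N-gram corpus, and the connective templates are hypothetical examples of the kind of text pattern such a system might score.

```python
import math

# Sketch: turn N-gram counts into order-sensitive features for an event pair.
# ngram_counts stands in for lookups against the real Google N-gram corpus.

def pattern_features(e1, e2, ngram_counts):
    """Compare how often e1 precedes e2 in simple connective patterns
    versus the reverse order; return one smoothed log-odds per template."""
    templates = ["{a} before {b}", "{a} and then {b}", "{a} after {b}"]
    feats = {}
    for t in templates:
        fwd = ngram_counts.get(t.format(a=e1, b=e2), 0)
        rev = ngram_counts.get(t.format(a=e2, b=e1), 0)
        # Add-one smoothing; positive values favor the e1-before-e2 reading.
        feats[t] = math.log((fwd + 1) / (rev + 1))
    return feats

# Toy counts for a pair of events with a clear typical order.
ngram_counts = {
    "elected before resigned": 40,
    "resigned before elected": 2,
    "elected and then resigned": 25,
    "resigned and then elected": 5,
    "resigned after elected": 30,
    "elected after resigned": 1,
}
feats = pattern_features("elected", "resigned", ngram_counts)
```

A downstream classifier would consume these log-odds values as real-valued features; note that the "after" template deliberately carries the opposite sign, which the learner can absorb into its weights.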
(2007), Lapata and Lascarides (2006), and Chambers, Wang, and Jurafsky (2007) built classifiers based on this corpus with varying features. Mani, et al. used a MaxEnt classifier with individual word features like tense, aspect, modality, and polarity, plus pairwise features involving these constructs, while Lapata and Lascarides focused on learning features to detect inter-sentential temporal relations.

Chambers, Wang, and Jurafsky (2007) implemented a fully automatic machine learning system for learning and classifying temporal relations between pairs of events. This system used many of the features described in earlier work while adding many more. They draw a large number of features from the text, including the event string, lemmas of event words, part-of-speech tags for words in and surrounding the event string, and bigram features of tense, aspect, and class. Also used are syntactic features such as parse tree characteristics, WordNet synsets, presence in a list of select prepositional phrases, and a split approach that learns separate models for event pairs that appear in the same sentence and those that appear in different sentences. An SVM trained on these features performed very well, achieving a peak accuracy of 59.43% on TimeBank.

Mani, et al. (2007) also focus on the effects of certain variations in training methods on the accuracy of the classifier. They found that partitioning the training set by pairs rather than by documents provided better performance on the test set, and that transitive closure can amplify these effects by increasing or decreasing the amount of shared context between the training and test sets.

Our work is similar to that of both sets of authors in that it focuses on learning types of relations for event-event pairs. We depart from these studies by looking beyond the TimeBank corpus for feature learning.
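The feature-based classification setup used in these prior systems can be illustrated with a minimal sketch. A tiny perceptron stands in here for the MaxEnt and SVM models named above, and the tense/aspect feature templates and toy examples are invented for illustration only.

```python
# Sketch: learning BEFORE vs. AFTER from per-event and pairwise features
# (tense, aspect), in the spirit of the feature sets described above.

def featurize(e1, e2):
    """One-hot features over each event's tense and aspect, plus a
    pairwise tense feature of the kind used alongside individual ones."""
    feats = {}
    for name, ev in (("e1", e1), ("e2", e2)):
        feats[f"{name}_tense={ev['tense']}"] = 1.0
        feats[f"{name}_aspect={ev['aspect']}"] = 1.0
    feats[f"tense_pair={e1['tense']}>{e2['tense']}"] = 1.0
    return feats

def train(examples, epochs=20):
    """Plain perceptron: examples are (features, label) with
    label +1 = BEFORE and -1 = AFTER; returns a weight dict."""
    w = {}
    for _ in range(epochs):
        for feats, label in examples:
            score = sum(w.get(f, 0.0) * v for f, v in feats.items())
            if label * score <= 0:  # misclassified: nudge the weights
                for f, v in feats.items():
                    w[f] = w.get(f, 0.0) + label * v
    return w

def predict(w, feats):
    score = sum(w.get(f, 0.0) * v for f, v in feats.items())
    return "BEFORE" if score > 0 else "AFTER"

# Toy data: a past-tense event paired with a present-tense one is BEFORE.
past = {"tense": "past", "aspect": "perfective"}
pres = {"tense": "present", "aspect": "progressive"}
data = [(featurize(past, pres), +1), (featurize(pres, past), -1)]
w = train(data)
```

The same featurize/train/predict shape carries over when the toy model is swapped for a MaxEnt or SVM learner and the feature set is widened to the lemmas, POS tags, and syntactic features listed above.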
We supplement the set of features drawn from TimeBank with features drawn from large-scale data in the hope that the scale of the data and the additional information it brings will result in more sophisticated features.

Finally, one interesting aspect of current work on temporal relation extraction and classification tasks is enforcement of global consistency of all pairwise relations. Mani, et al. (2007) do some preliminary investigation in this area using a greedy algorithm based on confidence intervals that adds progressively
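The greedy consistency idea can be sketched as follows. This is an illustrative reconstruction of the general approach, not Mani et al.'s exact algorithm: pairwise BEFORE predictions are visited in order of confidence, and each is kept only if it does not contradict the transitive closure of the links already accepted.

```python
# Sketch: greedily enforce global consistency over pairwise BEFORE links.

def greedy_consistent(predictions):
    """predictions: (confidence, earlier, later) triples for BEFORE links.
    Returns the accepted (earlier, later) links, skipping any prediction
    that would contradict the transitive closure of those kept so far."""
    before = {}  # event -> set of events known to come after it
    accepted = []
    for conf, a, b in sorted(predictions, reverse=True):
        if a in before.get(b, set()):  # b already precedes a: contradiction
            continue
        accepted.append((a, b))
        # Update the closure: b and everything after b now follow a,
        # and also follow everything already known to precede a.
        succ = {b} | before.get(b, set())
        before.setdefault(a, set()).update(succ)
        for afters in before.values():
            if a in afters:
                afters.update(succ)
    return accepted

# Toy predictions containing one low-confidence cycle-inducing link.
preds = [(0.9, "wake", "eat"), (0.8, "eat", "leave"), (0.3, "leave", "wake")]
links = greedy_consistent(preds)
```

Here the low-confidence "leave before wake" link is rejected because the two higher-confidence links already imply, by transitivity, that "wake" precedes "leave".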

