Columbia COMS W4705 - Discourse - Structure and Coherence



CS4705: Natural Language Processing
Discourse: Structure and Coherence
Kathy McKeown
Thanks to Dan Jurafsky, Diane Litman, Andy Kehler, Jim Martin

Contents
- Class wrap-up; homework questions
- What is a coherent/cohesive discourse?
- Applications: summarization, question answering, information extraction, generation
- Part I: Discourse structure
  - Discourse segmentation; unsupervised discourse segmentation; applications
  - Key intuition: cohesion; cohesion-based segmentation
  - TextTiling (Hearst 1997): cosine, vector space model, term weighting (TFxIDF), similarity
  - Lexical score: introduction of new terms; lexical chains
  - Supervised discourse segmentation
- Part II: Text coherence
  - What makes a text coherent?
  - Hobbs (1979) coherence relations: "Explanation", "Parallel", "Elaboration"
  - Coherence relations impose a discourse structure
  - Rhetorical Structure Theory: automatic rhetorical structure labeling; cue-phrase features; some problems with RST

HW4
- For HW3 you experiment with different features (at least 3) and different learning algorithms (at least 2), but you turn in only your best model.
- For HW4 you are asked to write up your findings from your HW3 experiments:
  - What features did you experiment with, and why?
  - How did each individual feature contribute to success vs. the combination?
    (Show the evaluation results.)
  - Why do you think the features worked this way?
  - How do the different machine learning algorithms compare?
  - What features did you try but throw out?
  - You should provide charts with numbers, comparing both feature impact and learning-algorithm impact.
- Evaluation: How would your system fare if you used the pyramid method rather than precision and recall? Show how this would work on one of the test document sets. That is, for the first 3 summary sentences in the four human models, show the SCUs, the weights for each SCU, and which of the SCUs your system got.
- If you could do just one thing to improve your system, what would that be? Show an example of where things went wrong, and say whether you think there is any NL technology that could help you address this.
- Your paper should be between 5 and 7 pages.
- Professor McKeown will grade the paper.

Class logistics
- Final exam: December 17th, 1:10-4:00, here.
- Pick up your work (midterms, past assignments) from me in my office hours or after class.
- HW2 grades will be returned the Thursday after Thanksgiving.
- Interim class participation grades will be posted on Courseworks the week after Thanksgiving.

Which are more useful where?
- Discourse structure: subtopics
- Discourse coherence: relations between sentences
- Discourse structure: rhetorical relations

Outline
- Discourse structure
  - TextTiling
- Coherence
  - Hobbs coherence relations
  - Rhetorical Structure Theory

Part I: Discourse Structure
- Conventional structures for different genres:
  - Academic articles: Abstract, Introduction, Methodology, Results, Conclusion
  - Newspaper story: inverted-pyramid structure (lead followed by expansion)
- A simpler task: discourse segmentation
  - Separating a document into a linear sequence of subtopics
- Hearst (1997): a 21-paragraph science news article called "Stargazers"
- Goal: produce the subtopic segments

Applications of discourse segmentation
- Information retrieval: automatically segmenting a TV news broadcast or a long news story into a sequence of stories
- Text summarization: ?
- Information extraction:
  - Extract information from inside a single discourse segment
- Question answering: ?

Key intuition: cohesion
- Halliday and Hasan (1976): "The use of certain linguistic devices to link or tie together textual units"
- Lexical cohesion: indicated by relations between words in the two units (identical word, synonym, hypernym)
  - Before winter I built a chimney, and shingled the sides of my house. I thus have a tight shingled and plastered house.
  - Peel, core and slice the pears and the apples. Add the fruit to the skillet.
- Non-lexical cohesion: anaphora
  - The Woodhouses were first in consequence there. All looked up to them.
- Cohesion chain:
  - Peel, core and slice the pears and the apples. Add the fruit to the skillet. When they are soft…

Intuition of cohesion-based segmentation
- Sentences or paragraphs within a subtopic are cohesive with each other,
- but not with paragraphs in a neighboring subtopic.
- Thus, if we measured the cohesion between every pair of neighboring sentences, we might expect a "dip" in cohesion at subtopic boundaries.

TextTiling algorithm (Hearst 1997)
1. Tokenization
   - Each space-delimited word
   - Converted to lower case
   - Throw out stop-list words
   - Stem the rest
   - Group into pseudo-sentences of length w = 20
2. Lexical score determination: a cohesion score with three parts
   - Average similarity (cosine measure) between gaps
   - Introduction of new terms
   - Lexical chains
3. Boundary identification

Vector space model
- In the vector space model, both documents and queries are represented as vectors of numbers.
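Before filling in the vector-space details, the tokenization and cosine gap-scoring steps of the TextTiling pipeline above can be sketched in Python. This is a minimal illustration under assumptions the slides leave open: a toy stop list, no stemming, and a simple threshold in place of Hearst's smoothed depth scores; all function names here are mine, not from the lecture.

```python
import re
from collections import Counter
from math import sqrt

# Toy stop list for illustration; a real system uses a much larger one.
STOP = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}

def tokenize(text, w=20):
    """Lowercase, drop stop words, and group tokens into pseudo-sentences of length w."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP]
    return [tokens[i:i + w] for i in range(0, len(tokens), w)]

def cosine(a, b):
    """Cosine similarity between two pseudo-sentences treated as bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = sqrt(sum(v * v for v in ca.values()))
    nb = sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def gap_scores(blocks):
    """Cohesion score at each gap between neighboring pseudo-sentences."""
    return [cosine(blocks[i], blocks[i + 1]) for i in range(len(blocks) - 1)]

def boundaries(scores, threshold=0.1):
    """Hypothesize a subtopic boundary wherever cohesion dips below the threshold."""
    return [i for i, s in enumerate(scores) if s < threshold]
```

On a text whose vocabulary shifts, say ten sentences about pets followed by ten about traffic, the gap score drops sharply at the topic change: exactly the "dip" in cohesion described above.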
  - For TextTiling, the segments being compared are represented as vectors.
  - For categorization, the documents are represented as vectors.
- The numbers are derived from the words that occur in the collection.
- Start with bit vectors: there are N word types in the collection, and the representation of a document consists of a 1 for each word type that occurs in the document.
- We can compare two documents, or a query and a document, by summing the bits they have in common:

    d_j = (t_1, t_2, t_3, ..., t_N)
    sim(q_k, d_j) = Σ_{i=1..N} t_{i,k} × t_{i,j}

Term weighting
- The bit-vector idea treats all terms that occur in the query and the document equally.
- It is better to give the more important terms greater weight.
  - Why? How would we decide what is more important?
- Two measures are used:
  - Local weight: how important is this term to the meaning of this document? Usually based on the frequency of the term in the document.
  - Global weight: how well does this term discriminate among the documents in the collection? The more documents a term occurs in, the less important it is; the fewer, the better.
- Local weights: generally, some function of the frequency of terms in documents is used.
- Global weights: The
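These local and global weights are commonly combined as TFxIDF. The sketch below substitutes tf-idf weights into the summed-products similarity above; the log(N/df) form of idf and the function names are assumptions for illustration, not taken from the slides.

```python
from collections import Counter
from math import log

def tfidf_vectors(docs):
    """Weight each term by tf (local: its count in the document) times
    idf (global: log(N / document frequency))."""
    N = len(docs)
    df = Counter(t for doc in docs for t in set(doc))  # docs each term appears in
    return [{t: tf * log(N / df[t]) for t, tf in Counter(doc).items()}
            for doc in docs]

def sim(q, d):
    """sim(q, d) = sum_i w_{i,q} * w_{i,d}: the bit-vector similarity above,
    with tf-idf weights in place of 0/1 bits."""
    return sum(w * d.get(t, 0.0) for t, w in q.items())
```

A term that occurs in every document gets idf = log(N/N) = 0 and so contributes nothing to similarity, while a rare term is weighted up, matching "the fewer, the better" above.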

