Discourse Structure and Discourse Coherence
Julia Hirschberg
CS 4705
Thanks to Dan Jurafsky, Diane Litman, Andy Kehler, Jim Martin

What makes a text or dialogue coherent?

“Consider, for example, the difference between passages (18.71) and (18.72). Almost certainly not. The reason is that these utterances, when juxtaposed, will not exhibit coherence. Do you have a discourse? Assume that you have collected an arbitrary set of well-formed and independently interpretable utterances, for instance, by randomly selecting one sentence from each of the previous chapters of this book.”

Or, this?

“Assume that you have collected an arbitrary set of well-formed and independently interpretable utterances, for instance, by randomly selecting one sentence from each of the previous chapters of this book. Do you have a discourse? Almost certainly not. The reason is that these utterances, when juxtaposed, will not exhibit coherence. Consider, for example, the difference between passages (18.71) and (18.72).”

What makes a text coherent?
• Appropriate use of coherence relations between subparts of the discourse -- rhetorical structure
• Appropriate sequencing of subparts of the discourse -- discourse/topic structure
• Appropriate use of referring expressions

Outline
• Discourse Structure
  – TextTiling
• Coherence
  – Hobbs coherence relations
  – Rhetorical Structure Theory

Conventions of Discourse Structure
• Differ for different genres
  – Academic articles: Abstract, Introduction, Methodology, Results, Conclusion
  – Newspaper stories: Inverted Pyramid structure (lead followed by expansion, least important last)
  – Textbook chapters
  – News broadcasts
• NB: We can take advantage of this to ‘parse’ discourse structures

Discourse Segmentation
• Simpler task: separating a document into a linear sequence of subtopics
• Applications
  – Information retrieval: automatically segmenting a TV news broadcast or a long news story into a sequence of stories
  – Text summarization
  – Information extraction: extract information from a coherent segment or topic
  – Question answering

Unsupervised Segmentation
• Hearst (1997): 21-paragraph science news article on “Stargazers”
• Goal: produce the subtopic segments

Intuition: Cohesion
• Halliday and Hasan (1976): “The use of certain linguistic devices to link or tie together textual units”
• Lexical cohesion: indicated by relations between words in the two units (identical word, synonym, hypernym)
  – Before winter I built a chimney, and shingled the sides of my house. I thus have a tight shingled and plastered house.
  – Peel, core and slice the pears and the apples. Add the fruit to the skillet.

Intuition: Cohesion
• Non-lexical: anaphora
  – The Woodhouses were first in consequence there. All looked up to them.
• Cohesion chain:
  – Peel, core and slice the pears and the apples. Add the fruit to the skillet.
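The lexical-cohesion intuition above can be sketched with a toy overlap measure between two text units: count the content words they share. This is a minimal sketch -- the tiny stop list is an illustrative placeholder, and it captures only identical-word ties (synonym and hypernym ties would need a lexical resource such as WordNet).

```python
# Toy lexical-cohesion score: how many content words two units share.
# STOP is a tiny illustrative stop list, not a real one.
STOP = {"the", "and", "a", "i", "to", "of", "my"}

def content_words(text):
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    return {t for t in tokens if t not in STOP}

def overlap(unit1, unit2):
    # Identical-word cohesion ties only.
    return len(content_words(unit1) & content_words(unit2))

a = "Before winter I built a chimney, and shingled the sides of my house."
b = "I thus have a tight shingled and plastered house."
print(overlap(a, b))  # 2 -- the units share 'shingled' and 'house'
```

Two sentences drawn from unrelated documents would typically score 0, which is the dip the segmentation methods below look for.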
When they are soft…

Cohesion-Based Segmentation
• Sentences or paragraphs in a subtopic are cohesive with each other
• But not with paragraphs in a neighboring subtopic
• So, if we measured the cohesion between every pair of neighboring sentences
  – We might expect a ‘dip’ in cohesion at subtopic boundaries

TextTiling (Hearst ’97)
1. Tokenization
  – Each space-delimited word
  – Converted to lower case
  – Throw out stop-list words
  – Stem the rest
  – Group into pseudo-sentences (windows) of length w = 20
2. Lexical score determination: a three-part cohesion score
  • Average similarity (cosine measure) between gaps
  • Introduction of new terms
  • Lexical chains
3. Boundary identification

TextTiling Method

Cosine Similarity

Vector Space Model
• In the vector space model, both documents and queries are represented as vectors of numbers
• For TextTiling: both segments are represented as vectors
• For document categorization: both documents are represented as vectors
• Numbers are derived from the words that occur in the collection

Representations
• Start with bit vectors:
  d_j = (t_{1,j}, t_{2,j}, t_{3,j}, …, t_{N,j})
• This says that there are N word types in the collection and that the representation of a document consists of a 1 for each corresponding word type that occurs in the document
• We can compare two docs, or a query and a doc, by summing the bits they have in common:
  sim(q_k, d_j) = Σ_{i=1..N} t_{i,k} × t_{i,j}

Term Weighting
• The bit-vector idea treats all terms that occur in the query and the document equally
• Better to give more important terms greater weight
• Why?
• How would we decide what is more important?

Term Weighting
• Two measures used
  – Local weight
    • How important is this term to the meaning of this document?
    • Usually based on the frequency of the term in the document
  – Global weight
    • How well does this term discriminate among the documents in the collection?
    • The more documents a term occurs in, the less important it is -- the fewer the better

Term Weighting
• Local weights
  – Generally, some function of the frequency of terms in documents is used
• Global weights
  – The standard technique is known as inverse document frequency:
    idf_i = log(N / n_i)
    where N = number of documents and n_i = number of documents containing term i

Tf-IDF Weighting
• To get the weight for a term in a document, multiply the term’s frequency-derived weight by its inverse document frequency

Back to Similarity
• We were counting bits to get similarity:
  sim(q_k, d_j) = Σ_{i=1..N} t_{i,k} × t_{i,j}
• Now we have weights:
  sim(q_k, d_j) = Σ_{i=1..N} w_{i,k} × w_{i,j}
• But that favors long documents over shorter ones
• We need to normalize by length

Similarity in Space (Vector Space Model)
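The tokenization step of TextTiling described above (lowercase, drop stop words, stem, group into pseudo-sentences of length w) can be sketched as follows. The stop list and the crude suffix-stripping "stemmer" are illustrative placeholders -- a real system would use a full stop list and a Porter-style stemmer.

```python
# Sketch of TextTiling step 1: lowercase, drop stop-list words, stem,
# then group tokens into pseudo-sentences (windows) of length w.
STOP_LIST = {"the", "a", "an", "and", "of", "to", "in"}  # tiny illustrative stop list

def crude_stem(token):
    """Placeholder stemmer: strips a few common suffixes.
    (A real system would use e.g. the Porter stemmer.)"""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def pseudo_sentences(text, w=20):
    # Non-alphabetic tokens (punctuation, numbers) are simply dropped here.
    tokens = [crude_stem(t) for t in text.lower().split()
              if t.isalpha() and t not in STOP_LIST]
    return [tokens[i:i + w] for i in range(0, len(tokens), w)]
```

Hearst's w = 20 is the default window length; smaller windows are used below only to keep the examples short.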
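The vector-space machinery above -- term vectors, idf_i = log(N / n_i), tf × idf weights, and length-normalized (cosine) similarity -- can be sketched end to end. This is a minimal sketch over sparse dict vectors; the function names are illustrative.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse tf-idf weight vectors for a small collection of tokenized docs."""
    N = len(docs)
    # n_i = number of documents containing term i
    df = Counter(t for doc in docs for t in set(doc))
    # idf_i = log(N / n_i), as on the Term Weighting slide
    idf = {t: math.log(N / n) for t, n in df.items()}
    # Local weight here is raw term frequency; tf-idf = tf * idf.
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]

def cosine(u, v):
    """Length-normalized similarity: dot(u, v) / (|u| * |v|)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

docs = [["peel", "core", "slice", "pears", "apples"],
        ["add", "fruit", "skillet"],
        ["stars", "telescope", "stars"]]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # no shared terms -> 0.0
print(cosine(vecs[0], vecs[0]))  # identical vectors -> 1.0
```

The normalization by vector length is exactly what removes the bias toward long documents noted on the "Back to Similarity" slide: doubling every count in a document leaves its cosine to any other vector unchanged.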
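Finally, the dip-finding idea behind boundary identification (step 3 of TextTiling) can be sketched: score the gap between each pair of adjacent windows and flag gaps whose similarity is a local minimum. Hearst's actual method uses smoothed depth scores and a cutoff; this sketch simplifies that to plain local minima, and the `word_overlap` stand-in is an illustrative cohesion score, not the cosine measure itself.

```python
def gap_scores(windows, sim):
    """Similarity across each gap between adjacent pseudo-sentence windows."""
    return [sim(windows[i], windows[i + 1]) for i in range(len(windows) - 1)]

def dip_boundaries(scores):
    """Flag gaps whose score is a strict local minimum -- the 'dips' where
    subtopic boundaries are placed. (Hearst uses smoothed depth scores
    and a threshold; this is a simplification.)"""
    return [i for i in range(1, len(scores) - 1)
            if scores[i] < scores[i - 1] and scores[i] < scores[i + 1]]

def word_overlap(w1, w2):
    """Stand-in cohesion score: fraction of shared word types (Jaccard)."""
    s1, s2 = set(w1), set(w2)
    return len(s1 & s2) / max(len(s1 | s2), 1)

# Two cohesive 'star' windows, then a shift to 'planet' windows.
windows = [["stars", "galaxy", "light"], ["galaxy", "light", "stars"],
           ["planet", "orbit", "galaxy"], ["planet", "orbit", "moon"]]
scores = gap_scores(windows, word_overlap)
print(dip_boundaries(scores))  # [1] -- the dip between the star and planet windows
```

Swapping `word_overlap` for the tf-idf cosine sketched earlier gives the gap-similarity component of the full lexical score.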