Penn CIS 700 - Supplementing Entity Coherence with Local Rhetorical Relations for Information Ordering


Supplementing Entity Coherence with Local Rhetorical Relations for Information Ordering

Nikiforos Karamanis
Natural Language and Information Processing Group
Computer Laboratory
University of [email protected]

This paper investigates whether the model of local rhetorical coherence suggested in Knott et al. (2001) can boost the performance of the Centering-based metrics of entity coherence employed by Karamanis et al. (2004) for the task of information ordering. Rhetorical coherence is integrated into the way Centering’s basic data structures are derived from the annotated features of the GNOME corpus. The results indicate that (a) the simplest metric continues to perform better than its competitors even when local rhetorical coherence is taken into account, and (b) this extra coherence constraint decreases its performance.

Keywords: Information Ordering, Centering Theory, Rhetorical Coherence.

1. Introduction

Text generation is the field in Computational Linguistics which deals with the automated production of text from information derived from either an underlying non-linguistic representation (concept-to-text generation: Reiter and Dale, 2000) or other documents (text-to-text generation), e.g. to summarise them (Mani, 2001). Information Ordering (Barzilay and Lee, 2004), i.e. deciding in which sequence to present a set of preselected information-bearing items (typically corresponding to clauses or sentences), has received much attention in recent work in text generation. Text generation systems need to organise the content in a way that makes the output text coherent, i.e. easy to read and comprehend.
The easiest way to exemplify coherence is by arbitrarily reordering the sentences of an understandable text. This process very often gives rise to documents that do not make sense although the information content is the same before and after the reordering (Marcu, 1997; Reiter and Dale, 2000).

Entity coherence, which is based on the way NP referents relate subsequent clauses in the text, is an important aspect of textual felicity. Since the early ’80s, when it was first introduced, Centering Theory has been an influential framework for modelling entity coherence, especially for text interpretation (see the collection of papers in Walker et al. 1998b for an overview). However, as Kibble (2001) observes, Centering began being applied to text generation only relatively recently.

© 2007 Kluwer Academic Publishers. Printed in the Netherlands.

Karamanis et al. (2004) presented the first attempt to evaluate Centering-based metrics of coherence for the purposes of information ordering in text generation. A subset of the GNOME corpus (Poesio et al., 2004) was used as test data because it consisted of texts which were representative of the domain Karamanis et al. were mainly interested in, namely descriptions of museum artefacts, and were reliably annotated with features related to Centering. Although Centering was expected to be particularly appropriate for information ordering in this genre, their main finding was that the simplest metric sets a baseline that cannot be outperformed by other metrics which utilise additional Centering-specific notions. However, the baseline did not perform well enough to be used in practice for information ordering on its own.

Karamanis et al. tested metrics suitable for the information ordering approaches in text generation presented by Karamanis and Manurung (2002) and Althaus et al. (2004).
These approaches, which are inspired by related work on text-to-text generation (Lapata, 2003; Barzilay and Lee, 2004; Barzilay and Lapata, 2005), receive an unordered set of clauses as their input and use a metric to output the highest scoring ordering of these clauses.[1] The metrics were evaluated empirically using the experimental methodology of Karamanis (2003). The main assumption behind this method is that the observed ordering of clauses in a text represents a gold standard solution. The gold standard is scored by each metric, which is penalised proportionally to the number of alternative orderings of the same material that score equally to or better than the gold standard. This methodology extends the way Barzilay and her colleagues automatically evaluate their approaches to information ordering.

[1] Typically, information ordering in concept-to-text generation is a side effect of building a tree of Rhetorical Relations. However, this is not the most appropriate way to account for the coherence of descriptive texts as I discuss in more detail below.

Similarly to most work on Centering for text interpretation, Karamanis et al. investigated the impact of Centering only and did not take other coherence-inducing factors into account in their study. However, Kibble (2001) argued that Centering needs to be supplemented with other models of coherence, while Poesio et al. (2004) suggested that the model of local rhetorical coherence introduced by Knott et al. (2001) may be a good candidate to supplement Centering in my domain of interest (i.e. object descriptions).

Knott et al. (2001) object to the traditional view of textual structure as a tree of Rhetorical Relations (RR-tree) motivated by Rhetorical Structure Theory (Mann and Thompson, 1987).
Organising the entire text structure hierarchically in terms of an RR-tree can be traced back to at least Hovy (1988) and has inspired a lot of work in text generation, with the approaches of Scott and de Souza (1990) and Marcu (1997) being among the most influential. Nevertheless, Knott et al. argue that descriptive texts do not feature an entirely tree-like structure. In their model of local rhetorical coherence, RR-trees are made of a small number of Rhetorical Relations applied locally. The local RR-trees are related to each other via links induced by constraints on entity coherence.

Knott et al. do not commit to a specific framework of entity coherence to supplement local rhetorical coherence in their model although Centering is mentioned as a potentially compatible theory. One way of integrating Centering with local rhetorical structure has been suggested by Kibble and Power (2000, 2004). In Kibble and Power’s system, which generates pharmaceutical leaflets, notions derived from Centering are applied together with constraints on rhetorical coherence to decide on the best local RR-tree. Crucially, in Kibble and Power’s approach
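
The evaluation idea described in the introduction (score the observed, gold standard ordering with a metric and penalise the metric according to how many alternative orderings of the same clauses score equally well or better) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: clauses are reduced to sets of entity names, `no_shared_entity_transitions` is a toy stand-in for an entity coherence metric rather than any metric defined in the cited work, the penalty is reported as a plain fraction, and the search is exhaustive, which is only feasible for very short texts.

```python
from itertools import permutations


def no_shared_entity_transitions(ordering):
    """Toy metric (hypothetical): count adjacent clause pairs that share
    no entity. Lower scores mean a more entity-coherent ordering."""
    return sum(1 for prev, nxt in zip(ordering, ordering[1:]) if not (prev & nxt))


def penalty(gold, metric):
    """Fraction of alternative orderings that score equally to or better
    than the gold standard (assumes all clauses are distinct)."""
    gold_score = metric(gold)
    alternatives = [p for p in permutations(gold) if p != tuple(gold)]
    at_least_as_good = sum(1 for p in alternatives if metric(p) <= gold_score)
    return at_least_as_good / len(alternatives)


def best_ordering(clauses, metric):
    """Exhaustively pick an ordering with the best (lowest) score."""
    return min(permutations(clauses), key=metric)


# Each clause is represented only by the entities it mentions (made-up data).
gold = [
    frozenset({"vase", "museum"}),
    frozenset({"vase", "handle"}),
    frozenset({"handle", "gold"}),
    frozenset({"gold"}),
]

gold_penalty = penalty(gold, no_shared_entity_transitions)
reordered = best_ordering(list(reversed(gold)), no_shared_entity_transitions)
```

In this toy example only the exact reverse of the gold ordering scores as well as the gold standard, so `gold_penalty` is 1/23. A metric with a lower penalty separates the observed ordering from its alternatives more sharply under this simplified scheme.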

