GT CS 7450 - Text and Document Visualization 2


Text and Document Visualization 2
CS 7450 - Information Visualization
November 13, 2013
John Stasko
Topic Notes
Fall 2013 CS 7450

Example Tasks & Goals
• Which documents contain text on topic XYZ?
• Which documents are of interest to me?
• Are there other documents that are similar to this one (so they are worthwhile)?
• How are different words used in a document or a document collection?
• What are the main themes and ideas in a document or a collection?
• Which documents have an angry tone?
• How are certain words or themes distributed through a document?
• Identify “hidden” messages or stories in this document collection.
• How does one set of documents differ from another set?
• Quickly gain an understanding of a document or collection in order to subsequently do XYZ.
• Find connections between documents.

This Week’s Agenda
• Visualization for IR
  - Helping search
• Visualizing text
  - Showing words, phrases, and sentences
• Visualizing document sets
  - Words & sentences
  - Analysis metrics
  - Concepts & themes
[Slide label: Last Time]

Related Topic - Sensemaking (Recall)
• Sensemaking
  - Gaining a better understanding of the facts at hand in order to take some next steps
  - (Better definitions in VA lecture)
• InfoVis can help make a large document collection more understandable more rapidly

Today’s Agenda
• Move to collections of documents
  - Still do words, phrases, sentences
  - Add:
    - More context of documents
    - Document analysis metrics
    - Document meta-data
    - Document entities
    - Connections between documents
    - Document concepts and themes

Various Document Metrics
• Goals?
• Different variables for literary analysis
  - Average word length
  - Syllables per word
  - Average sentence length
  - Percentage of nouns, verbs, adjectives
  - Frequencies of specific words
  - Hapax Legomena – number of words that occur once
Keim & Oelke VAST ‘07

Vis
Each block represents a contiguous set of words, e.g., 10,000 words
Do partial overlap in blocks for a smoother appearance

The Bible

Follow-On Work
• Focus on readability metrics of documents
• Multiple measures of readability
  - Provide quantitative measures
• Features used:
  - Word length
  - Vocabulary complexity
  - Nominal forms
  - Sentence length
  - Sentence structure complexity
Oelke & Keim VAST ‘10

Visualization & Metrics
Uses heatmap style vis (blue-readable, red-unreadable)

Interface

Their Paper (Before & After)

Comment from the Talk
• In academic papers, you want your abstract to be really readable
• Would be cool to compare rejected papers to accepted papers

Overviews of Documents
• Can we provide a quick browsing, overview UI, maybe especially useful for small screens?
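
As a rough illustration of the block-based document metrics listed earlier (Keim & Oelke), the sketch below computes two of them, average word length and hapax legomena, over partially overlapping blocks of words. The function names, the regex tokenizer, and the default block and step sizes are assumptions made for this example; they are not taken from the paper or the slides.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately simple stand-in for a real tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def block_metrics(words):
    """Compute two of the listed metrics for one non-empty block of words."""
    counts = Counter(words)
    return {
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Hapax legomena: words that occur exactly once in the block.
        "hapax_legomena": sum(1 for c in counts.values() if c == 1),
    }


def sliding_blocks(words, block_size, step):
    """Yield blocks of block_size words; step < block_size gives partial overlap."""
    last_start = max(len(words) - block_size, 0)
    for start in range(0, last_start + 1, step):
        yield words[start:start + block_size]


def metric_series(text, block_size=10_000, step=5_000):
    """Return one metrics dict per block, in document order."""
    words = tokenize(text)
    if not words:
        return []
    return [block_metrics(b) for b in sliding_blocks(words, block_size, step)]
```

Each entry of the returned list could then be mapped to one colored cell in a block-style display. A step of half the block size is only one possible choice of overlap.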
Document Cards
• Compact visual representation of a document
• Show key terms and important images
Strobelt et al TVCG (InfoVis) ‘09

Representation
Layout algorithm searches for empty space rectangles to put things

Interaction
• Hover over non-image space shows abstract in tooltip
• Hover over image and see caption as tooltip
• Click on page number to get full page
• Click on image goes to page containing it
• Clicking on a term highlights it in overview and all tooltips

InfoVis ’08 Proceedings

Zooming In
Video

Bohemian Bookshelf
Serendipitous browsing
Thudt et al CHI ‘12
Video

Themail
• Visualize one’s email history
  - With whom and when has a person corresponded
  - What words were used
• Answer questions like:
  - What sorts of things do I (the owner of the archive) talk about with each of my email contacts?
  - How do my email conversations with one person differ from those with other people?
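
One way to approach the Themail question of how conversations with one contact differ from those with others is to score words by how specific they are to each contact. The sketch below uses a simple TF-IDF-style weight (word count times the log of the inverse fraction of contacts who use the word). This scoring, the `distinctive_words` name, and the `(contact, text)` input format are assumptions for illustration only; the slides do not state how Themail itself selects words.

```python
import math
import re
from collections import Counter, defaultdict


def distinctive_words(messages, top_n=3):
    """messages: iterable of (contact, text) pairs. Returns {contact: [words]}."""
    # Word counts aggregated over all mail exchanged with each contact.
    per_contact = defaultdict(Counter)
    for contact, text in messages:
        per_contact[contact].update(re.findall(r"[a-z']+", text.lower()))

    # Number of contacts whose mail uses each word at least once.
    contact_freq = Counter()
    for counts in per_contact.values():
        contact_freq.update(counts.keys())

    n_contacts = len(per_contact)
    result = {}
    for contact, counts in per_contact.items():
        # Words used with every contact get a score of zero.
        scores = {
            w: c * math.log(n_contacts / contact_freq[w])
            for w, c in counts.items()
        }
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[contact] = [w for w in ranked[:top_n] if scores[w] > 0]
    return result
```

Grouping messages by month or year instead of by contact would give the same kind of ranking over time periods.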
Viégas, Golder & Donath CHI ‘06 (Themail)

Interface

Characteristics
• Text analysis to seed visualization
• Monthly & yearly words

Query UI

PaperLens
• Focus on academic papers
• Visualize doc metadata such as author, keywords, date, …
• Multiple tightly-coupled views
• Analytics questions
• Effective in answering questions regarding:
  - Patterns such as frequency of authors and papers cited
  - Themes
  - Trends such as number of papers published in a topic area over time
  - Correlations between authors, topics and citations
Lee et al CHI ‘05 Short

PaperLens
a) Popularity of topic
b) Selected authors
c) Author list
d) Degrees of separation of links
e) Paper list
f) Year-by-year top ten cited papers/authors – can be sorted by topic
Video

NetLens
Kang et al Information Visualization ‘07

More Document Info
• Highlight entities within documents
  - People, places, organizations
• Document summaries
• Document similarity and clustering
• Document sentiment

Jigsaw
• Targeting sense-making scenarios
• Variety of visualizations ranging from word-specific, to entity connections, to document clusters
• Primary focus is on entity-document and entity-entity connection
• Search capability coupled with interactive exploration
Stasko, Görg, & Liu Information Visualization ‘08

Document View
• Document summary
• Wordcloud overview
• Doc List
• Selected document’s text with entities identified

List View
Entities listed by type

Document Cluster View

Document Grid View
Here showing sentiment analysis of docs

Calendar View
Temporal context of entities & docs
Video

Jigsaw
• Much more to come on Visual Analytics day…

FacetAtlas
• Show entities and concepts and how they connect in a document collection
• Visualizes both local and global patterns
• Shows
  - Entities
  - Facets – classes of entities
  - Relations – connections between entities
  - Clusters – groups of similar entities in a facet
Cao et al TVCG (InfoVis) ‘10

Visualization
Video

Up to Higher Level
• How do we present
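
The entity-document and entity-entity connections that the Jigsaw slides describe as the system's primary focus can be pictured with a small co-occurrence index. The sketch below is a minimal version that assumes entities have already been extracted for each document; `build_connections` and its input format are hypothetical names chosen for this example, not Jigsaw's actual data model.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_connections(doc_entities):
    """doc_entities: {doc_id: set of entity strings}, assumed already extracted.

    Returns (entity_docs, pair_counts):
      entity_docs  maps each entity to the set of documents mentioning it;
      pair_counts  maps each sorted entity pair to the number of documents
                   in which both entities appear.
    """
    entity_docs = defaultdict(set)
    pair_counts = Counter()
    for doc_id, entities in doc_entities.items():
        for entity in entities:
            entity_docs[entity].add(doc_id)
        # Sorting makes (a, b) and (b, a) the same key.
        for a, b in combinations(sorted(entities), 2):
            pair_counts[(a, b)] += 1
    return entity_docs, pair_counts
```

Here two entities count as connected when they appear in at least one document together, and the pair count gives a simple connection strength.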

