Document Visualization
“Visualizing the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents”
J. A. Wise, J. J. Thomas, K. Pennock, D. Lantrip, M. Pottier, A. Schur, and V. Crow
Proceedings of Infoviz’95
Reviewed by Nada Golmie for CMSC 838S, Fall 1999
01/14/19 Document Visualization

Slide titles: Outline; Document Visualization; Vector Space Analysis (Salton et al.); Reduced Text + Interaction: SeeSoft (Eick, 1992); SeeSoft (Eick, 1992); 2D Maps (Lin, 1992); Visual Text Analysis: SPIRE; Applications; 2D Scatterplot: Galaxies; Galaxies; 3D Landscapes: Themescapes; Themescapes; Visualization Transformations; Processing Text Requirements; Visual Output of Text Processing; Interface Design; Screenshot; Favorite Sentences; Contributions; Critique; Other Comments; Galaxy of News: Interactive Landscapes; Galaxy of News: Summary

Outline
• Document visualization:
  – What? Why? How?
• Examples for 1D, 2D visualizations:
  – vector space analysis (Salton, 1995)
  – reduced text + interaction (Eick, 1992)
  – 2D maps of document collections (Lin, 1992)
• 3D Visualization: SPIRE (Wise et al., 1995)
• 3D + Time: Interactive Landscapes (Rennison, 1994)

Document Visualization
• Document visualization is an important IV application due to emerging technology trends:
  – World Wide Web
  – Digital Libraries
  – Communication Advances
• Mapping a text document:
  – Understand the content of a document.
• Mapping a collection of documents:
  – Discover relationships among documents.

Vector Space Analysis (Salton et al.)
• Support of free-form text queries in IR.
• Text passages are mapped into a vector of terms in high dimensional space: $D_i = (d_{i1}, d_{i2}, \ldots, d_{ik})$, where $d_{ik}$ is the weight assigned to term $k$ in document $D_i$.
• Given document $D_i$ and query $Q_j$, a similarity is computed as: $\mathrm{sim}(D_i, Q_j) = \sum_{k=1}^{t} d_{ik}\, d_{jk}$

Reduced Text + Interaction: SeeSoft (Eick, 1992)
• Reduced representation
  – display of lines as rows, files as columns (max 900 rows per column)
• Colors are used to display statistics
  – statistics include: age, programmer, feature, type of line, number of times the line was executed
• Direct manipulation techniques
  – find interesting patterns
  – capability to read actual code using magnification

SeeSoft (Eick, 1992)
[Figure: SeeSoft (Eick, 1992)]

2D Maps (Lin, 1992)
• Framework for information retrieval:
  – mapping of high dimensional document space into 2D map.
  – document relationships are explored using visual cues such as: dots, links, clusters, and areas.
• Neural network self-organizing learning algorithm based on Kohonen’s feature map:
  – preserves distance relationships between input data.
  – allocates different numbers of nodes to inputs based on their occurrence frequencies.
• Sitemap

Visual Text Analysis: SPIRE
SPIRE (Spatial Paradigm for Information Retrieval and Exploration) is software that allows users:
  – to explore complex relationships between text documents.
  – to rapidly discover known and hidden information relationships by reading only the pertinent documents rather than wading through large volumes of text.

Applications
• SPIRE was originally developed for the U.S. intelligence community.
• Other potential applications include:
  – environmental assessment
  – market analysis
  – corporations researching competitive products
  – health care providers searching patient records
  – attorneys reading through previous cases

2D Scatterplot: Galaxies
• Galaxies computes word similarities and patterns in documents and then displays the documents on a computer screen to look like a universe of “docustars”:
  – closely related documents will cluster together in a tight group.
  – unrelated documents will be separated by large spaces.

Galaxies
[Figure: Galaxies]

3D Landscapes: Themescapes
• Themes within the document spaces appear on the computer screen as a relief map of natural terrain:
  – mountains in Themescapes indicate where themes are dominant;
  – valleys indicate weak themes.
  – shapes reflect how the thematic information is distributed and related across documents.
• Themes close in content will be close visually based on the many relationships within the text spaces.

Themescapes
[Figure: Themescapes]

Visualization Transformations
• Definition of text: written form of natural language.
• Text conversion to spatial form: algorithms & processes.
• Meaningful visualizations: mathematical procedures and analytical measures.
• Database management: store and manage text and its derivative forms.

Processing Text Requirements
• Identification and extraction of text features:
  – frequency-based measures of words
  – higher order statistics taken on words: occurrence, frequency, and context of individual words are used to characterize defined word classes.
  – semantic approaches using natural language understanding.
• Efficient and flexible representation of documents in terms of text features.
• Support of information retrieval and visualization.

Visual Output of Text Processing
• Vector representation of document in high dimensional feature space.
  – Comparisons, filters, transformations can be applied
• Projection onto a 2D or 3D visualization
  – dimensionality reduction
  – scaling
  – clustering in high dimensional feature space; centroids of clusters are fed into layout algorithms (principal component analysis or multidimensional scaling)

Interface Design
• Three display types:
  – Backdrop: central display resource.
  – Workshop: grid with resizable windows to hold multiple views.
  – Chronicle: space where views are placed and linked to form a visual story.
• Tools provided to allow more in-depth analysis: point and click, grouping, annotation, query, subset, temporal slicing.

Screenshot
[Figure: Screenshot]

Favorite Sentences
“The bottleneck in the human processing and understanding of information in large amounts of text can be overcome if the text is spatialized in a manner that takes advantage of common powers of perception.”
“So
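
The inner-product similarity from the Vector Space Analysis slide can be sketched in a few lines. This is a minimal illustration, not Salton's implementation: the helper names `term_vector` and `inner_product_sim`, the toy vocabulary, and the use of raw term counts as weights are all assumptions made for the example.

```python
from collections import Counter


def term_vector(text, vocabulary):
    """Map a text passage to a vector of term weights over `vocabulary`.

    Assumption: the weight is the raw term count; the slides do not say
    which weighting scheme is used.
    """
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def inner_product_sim(doc_vec, query_vec):
    """Sum over all terms k of (document weight k) * (query weight k)."""
    return sum(d * q for d, q in zip(doc_vec, query_vec))


# Hypothetical vocabulary and texts, chosen only for illustration.
vocabulary = ["text", "visualization", "document", "query"]
doc = term_vector("text document visualization of a text document", vocabulary)
query = term_vector("document query", vocabulary)
score = inner_product_sim(doc, query)
```

Raw counts stand in for the term weights here; a different weighting scheme would slot into `term_vector` without changing the similarity function.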
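
The projection step on the Visual Output of Text Processing slide (high dimensional feature vectors reduced to 2D for display) might look roughly like the sketch below. It uses principal component analysis via NumPy's SVD and, as a simplifying assumption, projects the document vectors directly rather than clustering first and laying out cluster centroids. The function name `pca_project` and the sample matrix are hypothetical.

```python
import numpy as np


def pca_project(vectors, n_components=2):
    """Project row vectors onto their top principal components."""
    X = np.asarray(vectors, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Four toy documents over four terms: the first two share terms 0 and 2,
# the last two share terms 1 and 3 (made-up data for illustration).
docs = np.array([
    [3, 0, 1, 0],
    [2, 0, 2, 0],
    [0, 3, 0, 1],
    [0, 2, 0, 2],
], dtype=float)

coords = pca_project(docs)  # one (x, y) position per document
near = np.linalg.norm(coords[0] - coords[1])
far = np.linalg.norm(coords[0] - coords[2])
```

In this toy case the two documents that share terms land closer together in the 2D layout than documents from different groups, which is the behavior the Galaxies description relies on.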

