Introduction
CS 4460/7450 - Information Visualization
Jan. 8, 2009
John Stasko
Spring 2009 CS 4460/7450

Exercise
• Get out pencil and paper

Data Explosion
• Society is more complex
  − There simply is more “stuff”
• Computers, internet and web give people access to an incredible amount of data
  − news, sports, financial, purchases, etc...

How Much Data? (1)
• Estimated info added to digital universe each year will soon approach 1 ZB (zettabyte)*
  − 1000000000000000000000 (10^21) bytes
  − From: http://www.emc.com/digital_universe viewed December 8, 2008
*But only half that goes to my email inbox

How Much Data? (2)
• 6 million FedEx transactions per day
  http://www.fedex.com/us/about/today/companies/corporation/facts.html
• Average of 98 million Visa credit-card transactions per day in 2005
  http://www.corporate.visa.com/md/nr/press278.jsp
• Average of 5.4 petabytes of data crosses AT&T’s network per day
  http://att.sbc.com/gen/investor-relations?pid=5711
• Average of 610 to 1110 billion e-mails worldwide per year (based on estimates in 2000)
  http://www2.sims.berkeley.edu/research/projects/how-much-info/internet.html
Slide courtesy Jim Thomas

Data Overload
• Confound: How to make use of the data
  − How do we make sense of the data?
  − How do we harness this data in decision-making processes?
  − How do we avoid being overwhelmed?

The Challenge
• Transform the data into information (understanding, insight), thus making it useful to people

The Problem
Data → How? (Data Transfer) → People
• Data: Web, Books, Papers, Game scores, Scientific data, Biotech, Shopping, Stock/finance, News
• Data Transfer:
  − Vision: 100 MB/s
  − Ears: <100 b/s
  − Telepathy
  − Haptic/tactile
  − Smell
  − Taste
Two slides courtesy of Chris North

Human Vision
• Highest bandwidth sense
• Fast, parallel
• Pattern recognition
• Pre-attentive
• Extends memory and cognitive capacity (Multiplication test)
• People think visually
Impressive.
Let’s use it!

Some Examples
• Why visualization helps…

Questions:
  Which cereal has the most/least potassium?
  Is there a relationship between potassium and fiber?
  If so, are there any outliers?
  Which manufacturer makes the healthiest cereals?

[Figures: Potassium; Potassium vs. Fiber]

Even Tougher?
• What if you could only see one cereal’s data at a time? (e.g. some websites)
• What if I read the data to you?

Another Illustrative Example

Four Data Sets
• Mean of the x values = 9.0
• Mean of the y values = 7.5
• Equation of the least-squares regression line is: y = 3 + 0.5x
• Sums of squared errors (about the mean) = 110.0
• Regression sums of squared errors (variance accounted for by x) = 27.5
• Residual sums of squared errors (about the regression line) = 13.75
• Correlation coefficient = 0.82
• Coefficient of determination = 0.67
http://astro.swarthmore.edu/astro121/anscombe.html

The Data Sets

The Values
       1               2               3               4
   x      y        x      y        x      y        x      y
 10.0   8.04     10.0   9.14     10.0   7.46      8.0   6.58
  8.0   6.95      8.0   8.14      8.0   6.77      8.0   5.76
 13.0   7.58     13.0   8.74     13.0  12.74      8.0   7.71
  9.0   8.81      9.0   8.77      9.0   7.11      8.0   8.84
 11.0   8.33     11.0   9.26     11.0   7.81      8.0   8.47
 14.0   9.96     14.0   8.10     14.0   8.84      8.0   7.04
  6.0   7.24      6.0   6.13      6.0   6.08      8.0   5.25
  4.0   4.26      4.0   3.10      4.0   5.39     19.0  12.50
 12.0  10.84     12.0   9.13     12.0   8.15      8.0   5.56
  7.0   4.82      7.0   7.26      7.0   6.42      8.0   7.91
  5.0   5.68      5.0   4.74      5.0   5.73      8.0   6.89

Exercise Redux
• Let’s check what you did…
• People work differently

Visualization
• Definition
  − “The use of computer-supported, interactive visual representations of data to amplify cognition.”
From [Card, Mackinlay, Shneiderman ‘98]

Visualization
• Often thought of as the process of making a graphic or an image
• Really is a cognitive process
  − Form a mental image of something
  − Internalize an understanding
• “The purpose of visualization is insight, not pictures”
  − Insight: discovery, decision making, explanation

Main Idea
• Visuals help us think
  − Provide a frame of reference, a temporary storage area
• Cognition → Perception
• Pattern matching
• External cognition aid
  − Role of external world in thinking and reasoning
Larkin & Simon ‘87
Card, Mackinlay, Shneiderman ‘98

When to Apply?
• Many other techniques for data analysis
  − Data mining, DB queries, machine learning…
• Visualization most useful in exploratory data analysis
  − Don’t know what you’re looking for
  − Don’t have a priori questions
  − Want to know what questions to ask

Part of our Culture
• “I see what you’re saying”
• “Seeing is believing”
• “A picture is worth a thousand words”

Overview
Visualization
“Data visualization”
Scientific visualization
Information visualization

Scientific Visualization
• Primarily relates to and represents something physical or geometric
  − Often 3-D
  − Examples
      Air flow over a wing
      Stresses on a girder
      Torrents inside a tornado
      Organs in the human body
      Molecular bonding
Not the focus of this class

Information Visualization
• What is “information”?
  − Items, entities, things which do not have a direct physical correspondence
  − Notion of abstractness of the entities is important too
  − Examples: baseball statistics, stock trends, connections between criminals, car attributes...

Information Visualization
• What is “visualization”?
  − The use of computer-supported, interactive visual representations of data to amplify cognition.
From [Card, Mackinlay, Shneiderman ‘98]

Information Visualization
• Components:
  − Taking items without a direct physical correspondence and mapping them to a 2-D or 3-D physical space.
  − Giving information a visual representation that is useful for analysis and decision-making

Two Key Attributes
•
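
The summary statistics quoted earlier on the "Four Data Sets" slide can be checked directly from the numbers on "The Values" slide. The following is a minimal sketch, not part of the original lecture: the names `X123`, `X4`, `Y`, `summarize` and `SUMMARIES` are hypothetical, and the formulas are the ordinary least-squares ones. The quantity `sxx` (sum of squared deviations of x about its mean) appears to be what the slide reports as 110.0.

```python
from math import sqrt
from statistics import fmean

# x values shared by data sets 1-3, and the x values of data set 4,
# copied from "The Values" slide (names are illustrative only).
X123 = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]
X4 = [8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 19.0, 8.0, 8.0, 8.0]

# y values for each of the four data sets, in the same row order.
Y = {
    1: [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68],
    2: [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74],
    3: [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73],
    4: [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
}


def summarize(xs, ys):
    """Return means, least-squares line, and correlation for one data set."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)                     # spread of x
    syy = sum((y - my) ** 2 for y in ys)                     # spread of y
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))   # co-variation
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return {
        "mean_x": mx,
        "mean_y": my,
        "sxx": sxx,
        "slope": slope,
        "intercept": intercept,
        "r": r,
        "r2": r * r,
    }


SUMMARIES = {k: summarize(X4 if k == 4 else X123, ys) for k, ys in Y.items()}

for k, s in SUMMARIES.items():
    print(k, {name: round(value, 2) for name, value in s.items()})
```

All four data sets print nearly identical summaries, which is the point of the example: the numbers alone do not reveal how different the data sets are, while a plot of each one does.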