Stanford HRP 223 - Graphics in EG and R (104 pages)

Pages: 104
School: Stanford University
Course: HRP 223 - Introduction to Data Management and Analysis in SAS

Graphics in EG and R
HRP223 2009
November 16th, 2009
Copyright 1999-2009 Leland Stanford Junior University. All rights reserved.
Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to the maximum extent possible under the law.

Robbins
Creating More Effective Graphs by Naomi Robbins is a wonderful book showing the right and wrong ways to visualize scientific data. Read it when you have an afternoon off. It is an ideal read on a transcontinental flight.

Why Do Data Visualization?
Well-designed pictures will show you the details and the whole pattern in your data.
Numeric descriptions can easily hide important patterns.
Some patterns are hard to detect in tables.
Whenever data is reported over time or locations, you need art.
"YOU CAN LEARN A LOT BY JUST LOOKING" (Yogi Berra)

Fisher's Plot
[Figure: data reported in Cleveland; legend: Year 1, Year 2.]
Based on code written by Robert Allison at SAS Institute.

Scatter Plot for Correlations
[Figure: four scatter plots, each with both axes running from 0 to 20. Source: Anscombe (1973), Graphs in Statistical Analysis.]
All have r² = .67.

Bad Things
First I want to talk about bad graphics that I frequently see: 3D, pie, donuts, and stacked graphics.

General 3D graphics
Don't. Don't. Don't.
While the SAS implementation of 3D graphics is relatively good, don't use 3D effects unless you are measuring something in 3D. Even then, don't.

Tufte is a God to many
The empiricist in me is very nervous about the amount of pontificating in his books. I want to have evidence-based advice.
His best advice is to put no extra ink on the page. Think about the ink-to-information ratio. Remove all chart junk.
Note the irony of the chart junk on this slide.

Example Bar Chart
[Figure: bar chart titled "Serum Samples in Each Trimester".]
You can remove ink rather than adding it.

Ink to Information Ratio
How much ink for seven numbers?
Based on Soukup and Davidson (2002), Visual Data Mining.

Cleveland
If you want to know how to do scientific visualization, you must read William Cleveland's work. He attempted to quantify what makes a good graphic good. His early work on graphics is one of the reasons why R/S-Plus is taking over the statistical world.

Pie is bad
Work by Cleveland and experimental psychologists suggests that:
people are bad at judging the relative magnitude of angles;
if you twist the rotation of the pie, you can cause people to systematically misjudge the size of the angles;
a 3rd dimension makes judgment worse.
If you get a glossy handout with a 3D pie, assume someone is lying to you. Don't use them.

Don't Explode
This exploded 3D pie, brought to you by Excel, is nearly useless for judging amounts.
[Figure: exploded 3D pie chart; labels: Total, tweaked, twisted, wrecked.]

Forbidden Donut
Donut plots have the same problems as pies, if not worse.

Stacking is Bad
Cleveland also quantified the fact that people are bad at judging the relative height of stacked data.

Wow, a cinnamon roll plot
Good luck making rapid judgments using this stacked 3D pie.

What is a good graphic?
Don't make your audience think unnecessarily.
Minimize the amount of ink on the page. (This needs to be studied.)
Show the central tendency and the variability.
Plot the quantity/inference that you want people to notice.
Be sure colorblind people can understand it. Use a black-and-white photocopier and make sure you can distinguish all groups.

Avoid Thinking
Put labels on the graphic directly instead of using a key.
If you want people to compare the difference between two lines, plot the difference, not the two lines.

Bivariate Comparisons with Lines
People are extremely bad at judging the distance between two curves. Never ask people to judge up-and-down (vertical) distances between curves.
[Figure: two curves.] The distance between the two curves is the same at all points.
Based on Robbins (2005), Creating More Effective Graphs.

Plot Types
Univariate (one variable):
Categorical variables: bar charts, dot plots, waffle plots.
Continuous variables: histogram, box plot, violin plots.

Bar Charts
The ink-to-information ratio is lousy. A one-dimensional quantity is being expanded into two dimensions.
Doubling of the amount corresponds to how much of an increase in area?

SAS Bar Charts
SAS makes the reader do extra work by rotating the axis labels in ActiveX images.
They pointlessly include variable labels by default.

How to do it
Notice you can edit the data and apply filters.
You can right-click on variables and apply user-defined formats from the Properties dialog.

First create the format
In the Data windowpane of the Bar Chart GUI, right-click on the variable and change the format to the user-defined format you created.

The GUI is Solid
My only complaints are that the "rotate grouping values text" option does not work (position in this example) and that the summary statistics do not show up when you request ActiveX images.

Saving the Graphic for Publication
The easiest way to get publication-quality graphics is to set the output type to RTF.

[Figure panels: PNG format; ActiveX image format.]

Default Output and Graphics
The default graphic format in EG is ActiveX. These images can be edited, even on the web, but they only display with Internet Explorer.
I have set my graphics to display as ActiveX images.
Tweak this with Tools > Options > Graph.

Types of Images
The default formats of the images are determined by the ODS destinations you are using.
LISTING: png, visible in the Windows Image and Fax Viewer.
HTML: png, gif, jpg, contained in web pages and visible in Internet Explorer, Firefox, or Opera.
LATEX: PostScript, epsi, gif, jpeg, png, visible in GhostView.
PCL or PS: contained in a PostScript file, visible in GhostView.
PDF: contained in a pdf, which is visible with Adobe Reader.
RTF: visible in MS Word.

I Typically Use HTML
Include image_dpi=200 to set the resolution higher than the default 100 dots per inch. Try 200 for final images you are pasting into MS Office.
This is the appearance template. For optimal results use:
Analysis: color.
Default: overdistinguishes symbols, for color or B&W.
Journal or journal2, etc.: black and white.
Statistical or statistical2, etc.: color.
This says the images should show tooltips with extra statistical details when you hover the mouse over parts of the graphic. I can't image these

Useful ods graphics Options
After the ods graphics on statement, type a / and then:
imagename="fileName"
reset (resets the counter of images back to 0)
imagefmt=jpg
width=4.5in height=4.5in
If you set only width or height it will use a 4
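
The ODS settings described on the last two slides can be combined roughly as follows. This is only a sketch, not the instructor's code: the output path, the HTML file name, the use of sashelp.class, and the PROC SGPLOT step are all assumptions added for illustration, and reset=index is used here as the specific form of reset that restarts only the image counter. Enterprise Guide normally manages ODS destinations itself, so a program like this is more typical of code you submit by hand.

```sas
/* Hypothetical output location and file name; change for your system. */
ods html path="C:\temp" file="graphs.html"
         image_dpi=200        /* higher than the default 100 dpi        */
         style=journal;       /* black-and-white appearance template    */

ods graphics on /
    reset=index               /* restart the image counter              */
    imagename="fileName"      /* base name for the image files          */
    imagefmt=jpg              /* image file format                      */
    width=4.5in
    height=4.5in
    imagemap=on;              /* tooltips in HTML output                */

/* Any statistical graphics procedure would do; sashelp.class ships with SAS. */
proc sgplot data=sashelp.class;
    scatter x=height y=weight;
run;

ods graphics off;
ods html close;
```

Swapping style=journal for analysis, default, or statistical gives the other appearance templates listed on the slide.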

