MASON PSYC 612 - Statistical Graphics and Data Presentation - D278844

PSYC 612, SPRING 2007
Lecture 7: Statistical Graphics and Data Presentation
March 5, 2007

Contents

1 Preamble to today's lecture
2 Part 1: Cursory review of the assigned readings
  2.1 Ehrenberg (1977)
3 Part 2: Important aspects not covered in the readings
  3.1 Distinguishing between data and information
  3.2 Good points are timeless, good articles are not
  3.3 Research rarely guides practice and vice versa
4 Part 3: Some additional thoughts on data presentation

1 Preamble to today's lecture

- Exams
- Tufte readings
- Statistical Rules of Thumb reading

2 Part 1: Cursory review of the assigned readings

2.1 Ehrenberg (1977)

Data presentation frequently gets the least attention and creates the greatest confusion when presenting empirical research. The problems often occur during presentations where the audience members have limited time and, perhaps, limited interest in grasping the point. Presenters display large pages of numbers. Those numbers may be expressed as integers or floating point values. Regardless of the presentation, the values almost always seem jumbled, uninterpretable, sparse, or cramped. What makes a good presentation? I am confident everyone has an opinion about that, but few of us are bold enough to have a definitive answer. Ehrenberg attempts to provide us with some guidelines, which he details in his "Six Basic Rules."

1. Rounding helps reduce the clutter

               v1         v2         v3
    v1  1.0000000  0.6039743 -0.1201797
    v2  0.6039743  1.0000000  0.2601364
    v3 -0.1201797  0.2601364  1.0000000

That is a mess. Ehrenberg suggests that the matrix would look better if we presented it in the following manner:

          v1    v2    v3
    v1  1.00  0.60 -0.12
    v2  0.60  1.00  0.26
    v3 -0.12  0.26  1.00

2. Averages help provide perspective

We start with a simple 3-variable dataset with 10 observations.

                v1         v2         v3
     1   1.5077704 -1.9534591  0.1589027
     2   1.2249646 -0.0959000  0.9732623
     3   0.6570191  1.2268209  1.1323007
     4  -2.3627428 -3.2672082  1.0013904
     5   0.1720719  0.1706215  1.0145660
     6   0.5883822  0.5497726  0.7229312
     7   0.2953197 -1.2382645 -0.7510630
     8   0.3979605 -0.6370861 -0.4996254
     9   1.7386045  0.2755249  0.7520089
    10   1.0262608  1.2612794  0.4983007

If we round those observations for presentation we get:

           v1    v2    v3
     1   1.51 -1.95  0.16
     2   1.22 -0.10  0.97
     3   0.66  1.23  1.13
     4  -2.36 -3.27  1.00
     5   0.17  0.17  1.01
     6   0.59  0.55  0.72
     7   0.30 -1.24 -0.75
     8   0.40 -0.64 -0.50
     9   1.74  0.28  0.75
    10   1.03  1.26  0.50

Now if we provide row and column means we get a somewhat better appreciation of the data.

               v1    v2    v3  RowAvg
         1   1.51 -1.95  0.16   -0.10
         2   1.22 -0.10  0.97    0.70
         3   0.66  1.23  1.13    1.01
         4  -2.36 -3.27  1.00   -1.54
         5   0.17  0.17  1.01    0.45
         6   0.59  0.55  0.72    0.62
         7   0.30 -1.24 -0.75   -0.56
         8   0.40 -0.64 -0.50   -0.25
         9   1.74  0.28  0.75    0.92
        10   1.03  1.26  0.50    0.93
    ColAvg   0.52 -0.37  0.50    0.22

3. Column comparisons tend to be less cognitively demanding

Which of the following is easier to process?

           v1    v2
     1   1.51 -1.95
     2   1.22 -0.10
     3   0.66  1.23
     4  -2.36 -3.27
     5   0.17  0.17
     6   0.59  0.55
     7   0.30 -1.24
     8   0.40 -0.64
     9   1.74  0.28
    10   1.03  1.26

or

            1     2     3     4     5     6     7     8     9    10
    v1   1.51  1.22  0.66 -2.36  0.17  0.59  0.30  0.40  1.74  1.03
    v2  -1.95 -0.10  1.23 -3.27  0.17  0.55 -1.24 -0.64  0.28  1.26

You decide.

4. Sorting creates clarity from chaos

Sorting does something magical. Here are the same rows ordered by their row average:

               v1    v2    v3  RowAvg
         1  -2.36 -3.27  1.00   -1.54
         2   0.30 -1.24 -0.75   -0.56
         3   0.40 -0.64 -0.50   -0.25
         4   1.51 -1.95  0.16   -0.10
         5   0.17  0.17  1.01    0.45
         6   0.59  0.55  0.72    0.62
         7   1.22 -0.10  0.97    0.70
         8   1.74  0.28  0.75    0.92
         9   1.03  1.26  0.50    0.93
        10   0.66  1.23  1.13    1.01
    ColAvg   0.52 -0.37  0.50    0.22

5. Reasonable spacing reduces the required visual field

6. Graphs offer more than tables but at a cost that may be dear

Consider the different depictions of the relationships between v1, v2, and v3.

Method 1:

          v1    v2    v3
    v1  1.00  0.60 -0.12
    v2  0.60  1.00  0.26
    v3 -0.12  0.26  1.00

Method 2:

[Figure: scatterplot with axes labeled Observed and Predicted]

Method 3:

[Figure: four regression diagnostic panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage (with Cook's distance contours)]

3 Part 2: Important aspects not covered in the readings

3.1 Distinguishing between data and information

Data are what we observe. Information comes only when data reduce our uncertainty. This is a strict definition of the two, and I find it helps disentangle the sloppy use of the two terms. I suggest you consider adopting these definitions in the future. Why? Because too often we use these terms without much regard for what either means. Data are simply that: data. Data may be numbers, letters, sentences, pictures, artifacts, rocks, voice recordings, or whatever else we choose to collect. The mere fact that we collect the data does not mean that they will be informative. Data are only informative when they convey information. Contrary to the cliches and popular notions, data do not speak for themselves. They never have, and they never will. We must process data to extract the information. That processing may come in the form of statistics or in the form of synthesis via graphical or tabular means. Information comes only when we gain insights and thus lose our uncertainty about a phenomenon. Thus, data become information only when we change.

3.2 Good points are timeless, good articles are not

Before you lay into this article as absolute rubbish, consider the publication date. Ehrenberg published this article approximately 30 years ago, when computers were behemoths housed in buildings. No computer was able to readily generate graphs or tables. All of these data display methods were done by hand. Now that we have computers, the problems are worse. People then had the opportunity to take some care with their work. Today there seems to be little need to attend much to our data presentation because displays can be redone with almost no effort. The change does not make our new situation worse or better; it is merely different. What we need now is to go back to the mindset of 30 years ago and apply some of these ideas, which can now be implemented in seconds.

3.3 Research rarely guides practice and vice versa

Some of the most brilliant scientists who
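
Three of the rules reviewed in Part 1 (round the values, add row and column averages, sort the rows) can be sketched in a few lines of code. The following is a minimal illustration in Python, not the code used to produce the tables in these notes. The function names (summarize, sort_by_row_average, render) are hypothetical, and the sketch assumes that averages are computed from the unrounded observations and rounded only for display, which is consistent with the averages shown in the tables.

```python
# Raw observations copied from the 10 x 3 dataset in Part 1 (columns v1, v2, v3).
DATA = [
    (1.5077704, -1.9534591, 0.1589027),
    (1.2249646, -0.0959000, 0.9732623),
    (0.6570191, 1.2268209, 1.1323007),
    (-2.3627428, -3.2672082, 1.0013904),
    (0.1720719, 0.1706215, 1.0145660),
    (0.5883822, 0.5497726, 0.7229312),
    (0.2953197, -1.2382645, -0.7510630),
    (0.3979605, -0.6370861, -0.4996254),
    (1.7386045, 0.2755249, 0.7520089),
    (1.0262608, 1.2612794, 0.4983007),
]


def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def summarize(rows):
    """Return (row averages, column averages, grand average) of unrounded rows."""
    row_avgs = [mean(row) for row in rows]
    col_avgs = [mean(col) for col in zip(*rows)]
    grand = mean(v for row in rows for v in row)
    return row_avgs, col_avgs, grand


def sort_by_row_average(rows, row_avgs):
    """Pair each row with its average and order the pairs from low to high."""
    order = sorted(range(len(rows)), key=lambda i: row_avgs[i])
    return [(rows[i], row_avgs[i]) for i in order]


def fmt(x):
    """Round to two decimals for display only (assumed display precision)."""
    return f"{x:.2f}"


def render(rows):
    """Build a plain-text table: rounded, sorted, with row and column averages."""
    row_avgs, col_avgs, grand = summarize(rows)
    lines = [f"{'':>6} {'v1':>6} {'v2':>6} {'v3':>6} {'RowAvg':>6}"]
    ordered = sort_by_row_average(rows, row_avgs)
    for label, (row, avg) in enumerate(ordered, start=1):
        cells = " ".join(f"{fmt(v):>6}" for v in (*row, avg))
        lines.append(f"{label:>6} {cells}")
    cells = " ".join(f"{fmt(v):>6}" for v in (*col_avgs, grand))
    lines.append(f"{'ColAvg':>6} {cells}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(DATA))
```

Rounding happens only in fmt, at the last step, so the averages and the sort order are not affected by rounding error in the displayed cells.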
