Cal Poly STAT 217 - Describing Data

Winter, 2012 Thursday, Jan. 5

Stat 217 – Day 3
Describing Data

Recap:
- Variable = any characteristic of an observational unit that can be assigned a number or category
  o Categorical variable: places observational units into groups or categories
  o Quantitative variable: assigns meaningful numerical values to the observational units
- Variability = variables take on different values from observational unit to observational unit

Step 4 of the Statistical Investigation Method is to Explore the data. In this investigation you will learn how to use technology to explore data (based on whether the data is quantitative or categorical) and also some statistical terminology to use. In particular, we typically want a graphical and a numerical summary of the distribution of the variable (the values the variable can take and the frequency or relative frequency of those values). You have already seen a bar graph for displaying the distribution of a categorical variable.

Task 1: The following graphs display the results from the Initial Course Survey on the following variables.
A. your Coke vs Pepsi preferences
B. your heights
C. your ratings of the value of statistics
D. your year in school
E. cost of your last haircut
F. number of siblings
G. gender
H. number of states visited

The variable is displayed along the horizontal axis (lower numbers on the left) and the heights of the bars indicate the frequency of observational units at those values (which may be groups or numbers). Your task is to identify which graph belongs to which variable. You will be graded on your justification more than the correctness of your matches.

Graphs: (1) (2) (3) (4) (5) (6) (7) (8)

Task 2: Now let’s use technology to explore a couple of these variables. From the Lecture Notes/Calendar webpage, click the link for variables.jmp. This should launch the JMP software package. (If a “tip” window appears, you can close it.) When the file opens you should see a “spreadsheet” of the data. (There is some chance you will need to open JMP first, from Start > Math and Stat Programs > JMP 9 Pro, and press Continue to ignore the “old” license.)

(a) Is the variable height quantitative or categorical?

Note: JMP will recognize this and place a blue icon next to the variable in the data Columns window. (You can click on this icon to change a variable type if ever necessary.)

Use JMP to create a histogram of this variable and calculate some summary statistics.
o From the menu bar select Analyze > Distribution.
o In the Select Columns window, double click on heights so that it appears in the Y, Columns box.
o Press OK.
o When the graph appears, click on the red triangle next to the variable name to pull down an options menu and select Display Options > Horizontal Layout.
o Use ctrl-C to copy the graph and then ctrl-V to paste below. (If you get strange characters, go back and click on the bar that says Q16_height and try again.)

Paste output here

(b) Describe the graph: how does this variable behave? In particular, is it roughly symmetric (the two sides are roughly mirror images)? Is this what you would expect for this variable? Why?

In the output, you will see the mean, which is the average of everyone’s height in the class, and the standard deviation (Std Dev), which is a measure of the “spread” or horizontal width of the distribution. Distributions with larger standard deviations are more variable.

(c) If you were to look at the heights of the customers coming into McDonald’s today at lunch, would you expect the distribution of these heights to have more or less variability than the heights of Stat 217 students? Explain.

(d) Repeat the above JMP steps for the number of states visited variable and again answer question (b). (Make sure you discuss why you are not surprised that this variable is not symmetric and what that implies about the behavior of the variable.) Also, which distribution is more “spread out” (look at the “width” of the distribution, the range of the values, and the values of the standard deviations)?

Paste output here

(e) Repeat the above JMP steps for the gender variable. Note that JMP recognizes this as a categorical variable and so constructs a bar graph instead of a histogram, and reports counts and proportions as the summary statistics. How do you think JMP decides which category to put on the left and which on the right?

Now create separate histograms of the haircut prices by males and females.
o Choose Analyze > Distribution.
o Double click on the haircut variable so it appears in the Y, Columns box.
o Click on the gender variable and press the By button.
o Press OK.
o Turn the histograms horizontal.

Make a screen capture and crop to see just this window.

(f) Compare the two distributions. In particular, does one gender tend to pay more than the other? And is one distribution more spread out than the other? (Cite evidence from the graphs and from the numbers.)

To turn in: Submit your answers (typed or handwritten but with output integrated) to the above questions by Tuesday. Submit one report with both students’ names.

Alternative technology (Optional)

You can also create dotplots and histograms using some applets from the course webpage. Both the Dotplot Summaries applet and the Histogram Bin Width applet allow you to paste in your own data (see notes at bottom of page). The Dotplot Summaries applet will also output descriptive statistics.

Use the Dotplot Summaries applet to create a dotplot of the height variable and to calculate some summary statistics.
o Open the variables.xls file. Highlight the height column and copy to the clipboard (e.g., ctrl-C).
o In the applet, press the Edit/Paste Data button.
o A text entry data window should appear (it may be behind other windows). Paste (e.g., ctrl-V) into this window and press OK.
o To display summary statistics, check the boxes in the left panel.
o Make a screen capture of the applet output as before.

To look at height by gender, put the two columns side by side (height and then gender) and then copy and paste those two columns together into the paste data window. Your categorical variable can have up to 4 groups.

The Histogram Bin Width applet works similarly (for just one group): copy the data from the Excel column and paste into the data entry window. Change the slider or use the Bin width box to adjust the number and size of bins. How does the picture change as
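
For readers without access to JMP or the applets, the following is a minimal Python sketch of the two ideas used above: computing the mean and standard deviation of a quantitative variable, and seeing how a histogram changes with the bin width. It is not part of the course materials. The heights list is made-up illustrative data (not the class survey results), and the function names summarize and bin_counts are hypothetical helpers introduced only for this example.

```python
import math
import statistics
from collections import Counter

# Hypothetical heights in inches, invented for illustration only.
# These are NOT the Initial Course Survey data.
heights = [61, 63, 64, 64, 65, 66, 66, 67, 67, 68, 69, 70, 71, 72, 74]


def summarize(values):
    """Return (mean, sample standard deviation) for a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)


def bin_counts(values, bin_width):
    """Count the values in each bin [start, start + bin_width).

    Bins start at multiples of bin_width; this is an assumption of the
    sketch and need not match how JMP or the applet places its bins.
    """
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


mean, sd = summarize(heights)
print(f"mean = {mean:.2f}, std dev = {sd:.2f}")

# Draw a rough text histogram at two different bin widths.
for width in (2, 5):
    print(f"bin width {width}:")
    for start, count in bin_counts(heights, width).items():
        print(f"  [{start}, {start + width}): {'*' * count}")
```

Running the sketch with a narrow and a wide bin width gives a feel for the trade-off the applet's slider controls: narrower bins show more detail but a bumpier picture, while wider bins smooth the picture and can hide features of the distribution.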

