Duke STA 101 - Lab Assignment 3



STA 101.01 May 22, 2002
2002: Summer Session I

Lab Assignment 3: Descriptive Statistics

Creating and Analyzing New Variables.

Let's continue using the Cereal.JMP data set from last time. We are going to create a new variable by standardizing "Complex Carbos".

1. First we need to create a new column, Cols → New Column, and call it "Standardized Complex Carbos". Click on New Property and select Formula. Select "Complex Carbos" under Table Columns, then use the keypad to subtract the mean of "Complex Carbos". We do this by first selecting "Complex Carbos" again and then selecting Statistical → Col Mean under Functions. Click on the entire formula, so there is a red box around the entire quantity. Then use the keypad to divide by the standard deviation of "Complex Carbos". It does not matter which you select first, Statistical → Col Std Dev or "Complex Carbos". After the formula is complete, click on Apply. Directly after selecting Apply you should see "Evaluations done" at the bottom left of the screen; then click OK. Once you get back to the New Column window, click on Apply → OK again. Look at the left-hand side of the screen where the variables are listed. The "Standardized Complex Carbos" variable should have a yellow box with a black plus sign in it; this indicates the variable was created using a formula.

2. Next, make a histogram of the new variable and make the graph orientation horizontal. Click on the red arrow next to the variable name and select Fit Distribution → Normal. This will overlay a normal curve on the histogram. How well does the normal curve approximate the data?

   The normal curve fits the data well on the right half of the histogram, but not as well on the left side. A Normal Quantile plot also illustrates this fit.

3. Now let's focus on the elements of the box plot. The rectangle represents the IQR and the vertical line the median. The diamond within the IQR gives a 95% confidence interval for the mean, which is indicated by the vertical points of the diamond. (We will discuss 95% confidence intervals later in the course.)

4. To construct a Normal Quantile Plot, click on the red arrow next to the variable name and select Normal Quantile Plot. If the normal curve is a good approximation of the data, then the points should fall within the two red confidence bands. Based on this plot, would you say the normal curve is a good approximation for this variable?

   As in the previous question, the fit is good for half of the data, but not good for the other half.

5. Create four new columns: "Calories" + 25, "Calories" - 25, "Calories"*2, "Calories"*0.5. Draw a histogram of "calories" and the four transformations of "calories". Specify all 5 histograms with a single use of Analyze → Distribution of Y. You can do this by highlighting the desired variables, then selecting Y, Columns. Repeat until all five have been selected. Now select the red arrow to the left of Distribution and click on Stack.

   • How does adding 25 to calories affect the histogram?
     Shifts the histogram to the right.
   • How does subtracting 25 from calories affect the histogram?
     Shifts the histogram to the left.
   • How does multiplying calories by 2 affect the histogram?
     Expands the histogram by a factor of 2.
   • How does multiplying calories by 0.5 affect the histogram?
     Compresses the histogram by a factor of 2.

Identifying Observations within a Variable.

Draw histograms for "calories" and "manufacturer" in a single plot. Click on the bar representing Kellogg. Notice what happens in the histogram of calories. Alternate clicking on the bars for different manufacturers, and watch what happens in the histogram of calories. This gives a preliminary look at the calorie distribution within each group.

Importance of Plotting Histograms.

It is always important to plot the histogram of a variable to check for irregularities that may occur. Plot the histogram of "Total Carbs", and explain what you see. What happens when you overlay a normal curve on the histogram? What about the Normal Quantile plot?

   The histogram for Total Carbs is bimodal, meaning that it has two distinct peaks. When a normal curve is drawn over the histogram, it does not approximate the data well because it is very flat (compensating for both peaks) and not a good fit. The normal quantile plot tells the same story.

Other Options.

Select the brush (next to the hand) and investigate the effects the tool has on a
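
The standardization in step 1 and the four transformations in step 5 can also be checked numerically outside JMP. The following is a minimal Python sketch and is not part of the original lab. The data values are made up rather than taken from Cereal.JMP, the helper name standardize is hypothetical, and the sketch assumes the standard deviation in the formula is the sample standard deviation (n - 1 denominator).

```python
from statistics import mean, stdev


def standardize(values):
    """Return (x - mean) / sd for each x, using the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD with an n - 1 denominator
    return [(x - m) / s for x in values]


# Hypothetical calorie values for illustration only (not from Cereal.JMP).
calories = [70.0, 90.0, 100.0, 110.0, 110.0, 120.0, 150.0]

# Step 1 analogue: the standardized variable.
z = standardize(calories)

# Step 5 analogue: the four transformed columns.
transforms = {
    "calories": calories,
    "calories + 25": [x + 25 for x in calories],
    "calories - 25": [x - 25 for x in calories],
    "calories * 2": [x * 2 for x in calories],
    "calories * 0.5": [x * 0.5 for x in calories],
}

# Compare center and spread of each column.
for name, vals in transforms.items():
    print(f"{name:15s} mean={mean(vals):8.2f}  sd={stdev(vals):6.2f}")

print(f"standardized    mean={mean(z):8.2f}  sd={stdev(z):6.2f}")
```

Adding or subtracting a constant changes the mean but leaves the standard deviation alone, which is why those histograms only shift. Multiplying by a constant scales both the mean and the standard deviation, which is why those histograms stretch or compress. The standardized column has mean 0 and standard deviation 1 by construction.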

