PSU STAT 501 - Transforming the data


Slide outline (numbers are slide numbers; untitled slides omitted)

1. Fixing problems with the model
2. Options for fixing problems with the model
3. Abandoning the model
4. Choices for transforming the data
5. Transforming the X values only
7. Memory retention
8. Fitted line plot
9. Residual vs. fits plot
10. Normal probability plot
11. Transform the X values
12. Fitted line plot using transformed X values
13. Residuals vs. fits plot using transformed X values
14. Normal probability plot using transformed X values
15. Predicting new proportion
16. Predicting new proportion
17. Transforming the Y values only
19. Gestation time and birth weight for mammals
23. Transform the Y values
24. Fitted line plot using transformed Y values
25. Residual vs. fits plot using transformed Y values
26. Normal probability plot using transformed Y values
27. Predicting new gestation
29. Transforming both the X and Y values
31. Diameter (inches) and volume (cu. ft.) of 70 shortleaf pines
32. Residuals vs. fits plot
34. Transform the Y values only
35. Fitted line plot using transformed Y values
36. Residuals vs. fits plot using transformed Y values
38. Transform both the X and Y values
39. Fitted line plot using transformed X and Y values
40. Residual plot using transformed X and Y values
41. Normal probability plot using transformed X and Y values
42. Transformation strategies
43. Effects of transformations
46. Knowing functional relationship is of the power form
47. Knowing functional relationship is of the exponential form
49. Family of power transformations
50. Effect of loge transformation
52. Some guidelines for specifying λ
53. Possible transformations
58. Common variance stabilizing transformations
60. Transforming data in Minitab

Fixing problems with the model

Transforming the data so that the simple linear regression model is okay for the transformed data.

Options for fixing problems with the model

• Abandon simple linear regression model and find a more appropriate, but typically more complex, model.
• Transform the data so that the simple linear regression model works for the transformed data.

Abandoning the model

• If not linear: try a different function, like a quadratic (Ch. 7) or an exponential function (Ch. 13).
• If unequal error variances: use weighted least squares (Ch. 10).
• If error terms are not independent: try fitting a time series model (Ch. 12).
• If important predictor variables omitted: try fitting a multiple regression model (Ch. 6).
• If outlier: use robust estimation procedure (Ch. 10).

Choices for transforming the data

• Transform X values only.
• Transform Y values only.
• Transform both the X and the Y values.

Transforming the X values only

• Appropriate when non-linearity is the only problem with the model (normality and equal variance okay).
• Transforming the Y values would likely change the well-behaved error terms into badly-behaved error terms.

Memory retention (Example 1)

time     prop
1        0.84
5        0.71
15       0.61
30       0.56
60       0.54
120      0.47
240      0.45
480      0.38
720      0.36
1440     0.26
2880     0.20
5760     0.16
10080    0.08

• Subjects asked to memorize a list of disconnected items. Asked to recall them at various times up to a week later.
• Predictor time = time, in minutes, since initially memorized the list.
• Response prop = proportion of items recalled correctly.

Fitted line plot (Example 1)

[Figure: regression plot of prop versus time.]
prop = 0.525870 - 0.0000557 time
S = 0.152284   R-Sq = 57.1%   R-Sq(adj) = 53.2%

Residual vs. fits plot (Example 1)

[Figure: residuals versus the fitted values (response is prop).]

Normal probability plot (Example 1)

[Figure: normal probability plot of the residuals (RESI1).]
W-test for Normality: N = 13, Average = -0.0000000, StDev = 0.145801, R = 0.9751, P-Value (approx) > 0.1000

Transform the X values (Example 1)

Change (“transform”) the predictor time to log10(time).

time     prop    log10_time
1        0.84    0.00000
5        0.71    0.69897
15       0.61    1.17609
30       0.56    1.47712
60       0.54    1.77815
120      0.47    2.07918
240      0.45    2.38021
480      0.38    2.68124
720      0.36    2.85733
1440     0.26    3.15836
2880     0.20    3.45939
5760     0.16    3.76042
10080    0.08    4.00346

Fitted line plot using transformed X values (Example 1)

[Figure: regression plot of prop versus log10time.]
prop = 0.846415 - 0.182427 log10time
S = 0.0233881   R-Sq = 99.0%   R-Sq(adj) = 98.9%

Residuals vs. fits plot using transformed X values (Example 1)

[Figure: residuals versus the fitted values (response is prop).]

Normal probability plot using transformed X values (Example 1)

[Figure: normal probability plot of the residuals (RESI1).]
W-test for Normality: N = 13, Average = -0.0000000, StDev = 0.0223924, R = 0.9786, P-Value (approx) > 0.1000

Predicting new proportion (Example 1)

Estimated regression function:

$$\hat{Y} = 0.846 - 0.182 \log_{10}(\text{time})$$

Therefore, we predict the proportion of words recalled after 1000 minutes is:

$$\hat{Y} = 0.846 - 0.182 \log_{10}(1000) = 0.846 - 0.182(3) = 0.30$$

Predicting new proportion (Example 1)

Predicted Values for New Observations
New Obs   Fit      SE Fit     95.0% CI          95.0% PI
1         0.299    0.00765    (0.282, 0.316)    (0.245, 0.353)

Values of Predictors for New Observations
New Obs   log10time
1         3.00

We can be 95% confident that a person will recall between 24.5% and 35.3% of the words after 1000 minutes.

Transforming the Y values only

• Appropriate when non-normality and/or unequal variances are the problems.
• The transformation on Y may also help to “straighten out” a curved relationship.

Gestation time and birth weight for mammals (Example 2)

Mammal       Birthwgt    Gestation
Goat         2.75        155
Sheep        4.00        175
Deer         0.48        190
Porcupine    1.50        210
Bear         0.37        213
Hippo        50.00       243
Horse        30.00       340
Camel        40.00       380
Zebra        40.00       390
Giraffe      98.00       457
Elephant     113.00      670

• Predictor Birthwgt = birth weight, in kg, of mammal.
• Response Gestation = number of days until birth.

Fitted line plot (Example 2)

[Figure: regression plot of Gestation versus Birthwgt.]
Gestation = 187.084 + 3.59137 Birthwgt
S = 66.0943   R-Sq = 83.9%   R-Sq(adj) = 82.1%

Residual vs. fits plot (Example 2)

[Figure: residuals versus the fitted values (response is Gestation).]

Normal probability plot (Example 2)

W-test for Normality: N = 11, StDev = 62.7025, R = 0.9703, P-Value (approx) > 0.1000, Average:
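
The Example 1 results (the straight-line fit on time, the fit on log10(time), and the prediction at 1000 minutes) can be checked by hand with a short script. What follows is a minimal sketch rather than the course's Minitab procedure: the helper names `fit_line` and `predict` are made up for illustration, and the t critical value for 11 degrees of freedom is hard-coded as an assumption so that only the Python standard library is needed.

```python
import math

# Memory-retention data from Example 1 (time in minutes, proportion recalled).
TIME = [1, 5, 15, 30, 60, 120, 240, 480, 720, 1440, 2880, 5760, 10080]
PROP = [0.84, 0.71, 0.61, 0.56, 0.54, 0.47, 0.45, 0.38, 0.36, 0.26, 0.20, 0.16, 0.08]

# Assumption: two-sided 95% t critical value for n - 2 = 11 degrees of freedom,
# hard-coded here instead of being computed with a statistics library.
T_CRIT_11 = 2.201


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "n": n, "b0": b0, "b1": b1, "x_bar": x_bar, "sxx": sxx,
        "s": math.sqrt(sse / (n - 2)),  # residual standard error (Minitab's S)
        "r_sq": 1 - sse / sst,          # coefficient of determination
    }


def predict(fit, x0, t_crit):
    """Fitted value at x0 with confidence and prediction intervals."""
    y_hat = fit["b0"] + fit["b1"] * x0
    se_fit = fit["s"] * math.sqrt(1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"])
    se_pred = math.sqrt(fit["s"] ** 2 + se_fit ** 2)
    ci = (y_hat - t_crit * se_fit, y_hat + t_crit * se_fit)
    pi = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)
    return y_hat, se_fit, ci, pi


# Fit on the original time scale, then on the log10(time) scale.
raw_fit = fit_line(TIME, PROP)
log_fit = fit_line([math.log10(t) for t in TIME], PROP)

# Predict the proportion recalled 1000 minutes after memorizing the list.
y_hat, se_fit, ci, pi = predict(log_fit, math.log10(1000), T_CRIT_11)

print(f"time:      b0 = {raw_fit['b0']:.6f}, b1 = {raw_fit['b1']:.7f}, R-Sq = {raw_fit['r_sq']:.1%}")
print(f"log10time: b0 = {log_fit['b0']:.6f}, b1 = {log_fit['b1']:.6f}, R-Sq = {log_fit['r_sq']:.1%}")
print(f"fit = {y_hat:.3f}, SE fit = {se_fit:.5f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), 95% PI = ({pi[0]:.3f}, {pi[1]:.3f})")
```

Up to rounding, the printed coefficients, R-Sq values, and intervals should agree with the Minitab output quoted in the Example 1 slides. The prediction interval is wider than the confidence interval because it adds the variance of a single new response to the uncertainty in the estimated mean.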

