UNL STAT 870 - Inferences in Regression Analysis - D1674706


School name University of Nebraska-Lincoln

Course Stat 870- Multiple Regression Analysis

Pages 33

This preview shows page 1-2-15-16-17-32-33 out of 33 pages.


Chapter 2: Inferences in Regression Analysis

© 2012 Christopher R. Bilder

Population model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ where
  $\beta_0$ and $\beta_1$ are parameters
  $X_i$ are known constants
  $\varepsilon_i$ ~ independent $N(0, \sigma^2)$

2.1 Inferences concerning $\beta_1$

Is X linearly related to Y? Perform a hypothesis test to answer this question.

Suppose $\beta_1 = 0$.
  Population model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
  Then $Y_i = \beta_0 + 0 \cdot X_i + \varepsilon_i = \beta_0 + \varepsilon_i$

Example plot: Suppose $\beta_0 = 3$.

[Plot: Y versus X]

As X changes, E(Y) does not change. X is not linearly related to Y.

Use $b_1$ to determine if $\beta_1 = 0$ using a hypothesis test.

How? Use $b_1$'s sampling distribution.

If we repeated the process of taking a sample from the population an infinite number of times (calculating $b_1$ each time), the average of the $b_1$'s would be $\beta_1$, the variance of the $b_1$'s would be $\sigma^2 / \sum_{i=1}^{n}(X_i - \bar{X})^2$, and

$$b_1 \sim N\left(\beta_1, \frac{\sigma^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right)$$

i.e., $E(b_1) = \beta_1$ and $Var(b_1) = \sigma^2(b_1) = \sigma^2 / \sum_{i=1}^{n}(X_i - \bar{X})^2$ (NOTE: Different notation for variance!)

Proof: We already saw that $E(b_1) = \beta_1$ in Chapter 1. Let's examine the variance now. Remember that

$$b_1 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(X_i - \bar{X})^2}$$

Also, note that the numerator can be simplified to be

$$\begin{aligned}
\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y}) &= \sum_{i=1}^{n}\left[(X_i - \bar{X})Y_i - (X_i - \bar{X})\bar{Y}\right] \\
&= \sum_{i=1}^{n}(X_i - \bar{X})Y_i - \sum_{i=1}^{n}(X_i - \bar{X})\bar{Y} \\
&= \sum_{i=1}^{n}(X_i - \bar{X})Y_i - \bar{Y}\sum_{i=1}^{n}(X_i - \bar{X}) \\
&= \sum_{i=1}^{n}(X_i - \bar{X})Y_i - \bar{Y}\left(\sum_{i=1}^{n}X_i - n\bar{X}\right) \\
&= \sum_{i=1}^{n}(X_i - \bar{X})Y_i - \bar{Y} \cdot 0 \\
&= \sum_{i=1}^{n}(X_i - \bar{X})Y_i
\end{aligned}$$

Then

$$\begin{aligned}
Var(b_1) &= Var\left(\frac{\sum_{i=1}^{n}(X_i - \bar{X})Y_i}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right) \\
&= \left[\frac{1}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right]^2 Var\left(\sum_{i=1}^{n}(X_i - \bar{X})Y_i\right) \\
&= \left[\frac{1}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right]^2 \sum_{i=1}^{n}(X_i - \bar{X})^2 Var(Y_i) \\
&= \left[\frac{1}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right]^2 \sum_{i=1}^{n}(X_i - \bar{X})^2 \sigma^2 \\
&= \frac{\sigma^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}
\end{aligned}$$

Remember that $Var\left(\sum a_i Y_i\right) = \sum a_i^2 Var(Y_i)$ for independent $Y_i$ (see p. 646 equation A.3 or Chapter 4 of my STAT 380 notes).

Notice that the variance of $b_1$ has a parameter in it: $\sigma^2$.
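The repeated-sampling idea behind the sampling distribution of $b_1$ can be approximated by simulation. The sketch below is only an illustration: the X values, $\sigma = 1$, the number of simulated samples, the seed, and the names `sim.b1`, `b1.all`, and `theory.var` are all assumptions made here and do not come from the notes. It uses $\beta_0 = 3$ and $\beta_1 = 0$ to match the example plot setting.

```r
# Sketch: approximate the sampling distribution of b1 by simulation.
# Every numeric setting below is an illustrative assumption.
set.seed(8710)
beta0 <- 3
beta1 <- 0
sigma <- 1
X <- c(1, 2, 3, 4, 5)   # known constants, held fixed across samples
n.sim <- 10000          # large, but finite, number of repeated samples

# Draw one sample of Y from the population model and return b1
sim.b1 <- function() {
  Y <- beta0 + beta1 * X + rnorm(n = length(X), mean = 0, sd = sigma)
  sum((X - mean(X)) * Y) / sum((X - mean(X))^2)
}

b1.all <- replicate(n = n.sim, expr = sim.b1())

# Theoretical variance: sigma^2 / sum((X_i - Xbar)^2)
theory.var <- sigma^2 / sum((X - mean(X))^2)

mean(b1.all)   # should be close to beta1
var(b1.all)    # should be close to theory.var
theory.var
```

With a finite number of simulated samples, the simulated mean and variance will only be close to $\beta_1$ and $\sigma^2 / \sum(X_i - \bar{X})^2$, not exactly equal. A histogram from `hist(b1.all)` should look approximately normal.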
To find the estimated variance of $b_1$, $\sigma^2$ is replaced by its estimate, MSE:

$$\widehat{Var}(b_1) = s^2(b_1) = \frac{MSE}{\sum_{i=1}^{n}(X_i - \bar{X})^2}$$

Sampling distribution of $t^* = (b_1 - \beta_1) / \sqrt{\widehat{Var}(b_1)}$

Purpose: Find a test statistic for a test of $\beta_1 = 0$.

Note: The standardized quantity, $(b_1 - \beta_1) / \sqrt{Var(b_1)}$, is distributed as a N(0,1) random variable.

Quick review: A standardized quantity is

$$\frac{\text{statistic} - E(\text{statistic})}{\sqrt{Var(\text{statistic})}}$$

Since $\sigma^2$ is unknown, we generally cannot use this quantity for hypothesis testing. Use the "studentized" version of the above quantity:

$$t^* = \frac{b_1 - \beta_1}{\sqrt{\widehat{Var}(b_1)}}$$

since this contains no unknown parameters ($\beta_1$ will be specified in the hypothesis test).

$t^* \sim t(n-2)$ where $t(n-2)$ represents a random variable with a Student t-distribution (or just t-distribution) with n-2 degrees of freedom.

For a proof, see p. 45 (or 11.44 of my STAT 380 Chapter 11 notes at www.chrisbilder.com/stat380/schedule.htm).

Hypothesis test for $\beta_1 = 0$:

1) State H0 and Ha
   H0: $\beta_1 = 0$ (no linear relationship)
   Ha: $\beta_1 \ne 0$ (linear relationship)
2) Test statistic: $t^* = (b_1 - \beta_1) / \sqrt{\widehat{Var}(b_1)}$
3) Critical value: $\pm t(1-\alpha/2; n-2)$ where $\alpha$ is the type I error level; note that this is the $1-\alpha/2$ quantile from a t-distribution with n-2 degrees of freedom.
4) Reject or don't reject H0

   [Figure: t-distribution (probability versus t) with "Reject Ho" regions beyond the two critical values and a "Don't Reject Ho" region between them, centered at 0]

5) Conclusion
   Reject H0: X is linearly related to Y
   Don't reject H0: There is not sufficient evidence to show that X is linearly related to Y
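The five steps of the critical value approach can be sketched as a small R function. This is only an illustration under stated assumptions: the function name `slope.t.test` and the example data are hypothetical, and $b_0$ and MSE are computed with the usual least squares formulas ($b_0 = \bar{Y} - b_1\bar{X}$, $MSE = SSE/(n-2)$), which are not restated in this section.

```r
# Hypothetical helper (not from the notes): two-sided t-test of H0: beta1 = 0
slope.t.test <- function(X, Y, alpha = 0.05) {
  n <- length(X)
  Sxx <- sum((X - mean(X))^2)
  b1 <- sum((X - mean(X)) * Y) / Sxx          # least squares slope
  b0 <- mean(Y) - b1 * mean(X)                # least squares intercept
  MSE <- sum((Y - b0 - b1 * X)^2) / (n - 2)   # estimate of sigma^2
  se.b1 <- sqrt(MSE / Sxx)                    # sqrt of estimated Var(b1)
  t.star <- (b1 - 0) / se.b1                  # test statistic under H0
  crit <- qt(p = 1 - alpha / 2, df = n - 2)   # t(1 - alpha/2; n - 2)
  list(b1 = b1, se.b1 = se.b1, t.star = t.star, crit = crit,
       reject = abs(t.star) > crit)
}

# Made-up data, used only to exercise the function
X <- c(2, 4, 6, 8, 10, 12)
Y <- c(3.1, 4.9, 7.2, 8.8, 11.1, 12.9)
res <- slope.t.test(X = X, Y = Y, alpha = 0.05)
res$t.star
res$crit
res$reject
```

The same test statistic appears in the "t value" column of `summary(lm(Y ~ X))$coefficients`, which gives a way to check the hand calculation.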
When writing the conclusion of a test, state what X and Y are in the problem.

Using a p-value:

1) Same
2) State p-value: p-value = $2 \cdot P(t(n-2) > |t^*|)$ where $t(n-2)$ denotes a t random variable with n-2 degrees of freedom

   [Figure: t-distribution (probability versus t) with 0 and |t*| marked on the horizontal axis]

   Note: The p-value gives the probability of finding a value of |t*| at least this great assuming the null hypothesis is true.
3) State $\alpha$
4) Reject H0 if p-value $\le \alpha$
   Don't reject H0 if p-value $> \alpha$
5) Same

How can you find p-values and critical values in R for the t-distribution?

In the summary() output for an object resulting from using lm(), the p-value for a test of H0: $\beta_1 = 0$ vs. Ha: $\beta_1 \ne 0$ will be given. The more general way to find it is using the pt() function, which finds the probability that a $t(n-2)$ random variable is less than a particular value. For example, if the test statistic was 1.96 with 10 degrees of freedom, we obtain $P(t(n-2) < 1.96)$ with n-2 = 10,

    > pt(q = 1.96, df = 10)
    [1] 0.9607819

Since the p-value is $2P(t(n-2) > |t^*|)$, we can change this to

    > 2*(1-pt(q = abs(1.96), df = 10))
    [1] 0.07843624

To find a critical value, use the qt() function,

    > qt(p = 0.95, df = 10)
    [1] 1.812461

Thus, $P(t(n-2) < 1.81) = 0.95$; i.e., 1.81 is the 0.95 quantile from a t-distribution with 10 degrees of freedom.

$(1-\alpha)100\%$ confidence interval (C.I.) for $\beta_1$:

The "usual" type of t-distribution based confidence interval is:

    Estimator ± t(1-α/2, df)*(S.E. of estimator)

where df = degrees of freedom and S.E. is standard error.

For $\beta_1$:

$$b_1 \pm t(1-\alpha/2; n-2) \sqrt{\widehat{Var}(b_1)}$$

This can be rewritten as:

$$b_1 - t(1-\alpha/2; n-2)\sqrt{\widehat{Var}(b_1)} < \beta_1 < b_1 + t(1-\alpha/2; n-2)\sqrt{\widehat{Var}(b_1)}$$

See p. 2.20 for a review of how knowing the distribution of t* can be used to find a C.I.

Example: Sales and advertising

Is advertising linearly related to sales? Use $\alpha = 0.05$.

1) H0: $\beta_1 = 0$
   Ha: $\beta_1 \ne 0$
2) $t^* = (b_1 - \beta_1) / \sqrt{\widehat{Var}(b_1)} = 0.70/0.1915 = 3.6556$

    X     Y     (X - Xbar)^2
    1     1     4
    2     1     1
    3     2     0
    4     2     1
    5     4     4
    Sum         10
   Note: $\sqrt{\widehat{Var}(b_1)} = \sqrt{MSE / \sum_{i=1}^{n}(X_i - \bar{X})^2} = \sqrt{0.3667/10} = 0.1915$

3) t(1-0.05/2; 5-2) = t(0.975; 3) = 3.182

    > qt(p = 1-0.05/2, df = 3)
    [1] 3.182446

4) Since 3.6556 > 3.182, reject H0.

   [Figure: t-distribution (probability versus t) with "Reject Ho" regions beyond -3.182 and 3.182, a "Don't Reject Ho" region between them, and 3.66 marked]

5) Advertising is linearly related to sales.

See t-dist_program.R for an R program that shows how to draw a plot similar to the one above.

Example: HS and College GPA (HS_college_GPA_ch2.R)

Suppose $\alpha = 0.01$ for the hypothesis test.

    > #Fit the simple linear regression model and save the results in mod.fit
    > mod.fit<-lm(formula = College.GPA ~
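Returning to the sales and advertising example, the hand calculations can be checked with lm(). The sketch below assumes the X column of the table is advertising and the Y column is sales; the object names `mod.sales`, `coef.tab`, `p.value`, and `ci` are chosen here for illustration and are not from the notes.

```r
# Data from the sales and advertising table (assumed: X = advertising, Y = sales)
X <- c(1, 2, 3, 4, 5)
Y <- c(1, 1, 2, 2, 4)

mod.sales <- lm(formula = Y ~ X)
coef.tab <- summary(mod.sales)$coefficients

b1 <- coef.tab["X", "Estimate"]        # notes report 0.70
se.b1 <- coef.tab["X", "Std. Error"]   # notes report 0.1915
t.star <- coef.tab["X", "t value"]     # notes report 3.6556

# Two-sided p-value, computed the same way as in the pt() example
p.value <- 2 * (1 - pt(q = abs(t.star), df = length(X) - 2))

# 95% confidence interval for beta1: b1 +/- t(0.975; 3) * se.b1
ci <- confint(mod.sales, parm = "X", level = 0.95)

b1
se.b1
t.star
p.value
ci
```

The p-value computed this way should agree with the `Pr(>|t|)` column of the summary() output, and rejecting H0 at $\alpha = 0.05$ corresponds to the 95% interval for $\beta_1$ not containing 0.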
