Stat 401 Lab Activity 4
Wednesday, November 2, 2005

Part I: Demonstration of the correlation coefficient

In this activity we will generate and plot bivariate data having different correlation coefficients. We will use a regression model to generate the (X, Y) values. In particular, we will assume that the regression function of Y on X is linear: $E(Y \mid X = x) = 2x$. For each value x of X, we will generate a value of Y by adding error (noise) to the regression function. Thus, for a value x of X, a value of Y is generated as $Y = 2x + E$, where E is the noise.

We will always generate X values from the normal distribution with mean 9 and standard deviation 2. The noise E will also be generated from the normal distribution, with zero mean. Different values of the variance of E will produce data sets with different correlation coefficients. Since the regression function will always be the same, this activity also demonstrates that the regression function, though it describes one aspect of the relation between X and Y, is not designed to quantify the degree of dependence between X and Y.

We will generate, plot, and compute the correlation coefficient of three data sets, generated with E having standard deviation 1, 2, and 3, respectively. The Minitab command sequences (described below only for the case of E having standard deviation 1) are:

1. Generate 50 X values; store in C1.
   Calc > Random Data > Normal > Generate 50 rows of data > Store in column C1, with Mean 9.0 and Standard deviation 2.0 > OK

2. Generate 50 E values; store in C2.
   Calc > Random Data > Normal > Generate 50 rows of data > Store in column C2, with Mean 0.0 and Standard deviation 1.0 > OK

3. Calculate the 50 Y values; store in C3.
   Calc > Calculator > Store result in variable: C3 > Expression: 2 * C1 + C2 > OK
   Do a scatter plot of the 50 (X, Y) values and compute the correlation coefficient of the (X, Y) values, using command sequences already described.

4. For Homework 7, generate three sets of (X, Y) values, working as above except for generating X values from a normal with zero mean and standard deviation 2, and using the regression function $E(Y \mid X = x) = 2x^2$. Use the same three standard deviations for the noise E as before. Compare the resulting correlation coefficients with the corresponding (i.e., same standard deviation of the noise) correlation coefficients obtained by using the linear regression function.

Part II: The Sampling Distribution of the Sample Mean and Sample Variance

In this activity we generate 100 samples of size 10 from the normal distribution with mean 0 and variance 1, and find the sample mean and sample variance for each sample. This amounts to generating random numbers from the sampling distributions of the sample mean and the sample variance. Histograms and probability plots can then be used to check the known facts about the distributions of the sample mean and sample variance. We begin by generating random numbers from the sampling distribution of $\bar{X}$.

1. Generate 100 samples of size 10.
   Calc > Random Data > Normal > Generate 10 rows of data > Store in columns C1-C100, with Mean 0 and Standard deviation 1 > OK

2. Find the sample means for the 100 samples and store them in a column. This is done in two steps. The first is:
   Data > Stack > Columns > Stack the following columns: C1-C100 > under "Store stacked data in", select "Column of current worksheet" and enter Data > in "Store subscripts in", enter Sample
   Thus all 100 samples, each of size 10 (a total of 1000 observations), are stacked in column C101, and C102-T records which sample the observation in the corresponding row of C101 came from.
   The second step actually finds the 100 sample means and stores them:
   Stat > Basic Statistics > Store Descriptive Statistics > select Data and Sample for "Variables" and "By variables (optional)" > click Statistics > select Mean > OK > OK
   Columns C103-T, C104, and C105 then show the sample number, the sample mean $\bar{X}$ of that sample, and the sample size.

3. Check the normality of the sample means.
   a. Histogram
   b. Probability plot:
      Graph > Probability Plots > select Single > select C104 (Mean1) for "Graph variable" > click Distribution > select Normal under "Distribution" > enter 0 and 0.31623 for Mean and StDev > OK > OK
      The value 0.31623 is $1/\sqrt{10}$, the standard deviation of the mean of 10 observations from a population with standard deviation 1.

The Sampling Distribution of the Sample Variance

We next generate random numbers from the sampling distribution of the sample variance. We do this by calculating the sample variance of each of the 100 samples of size 10 and storing them in a column. A histogram and a probability plot can be used to check the facts about the sampling distribution of the sample variance. The basic fact is that, for a sample of size n from a normal distribution,

   $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$.

For samples of size n = 10, which is what we have,

   $(n-1)S^2/\sigma^2 = (10-1)S^2/\sigma^2 \sim \chi^2_9$.

Moreover, in our case the population variance equals 1, so $9S^2 \sim \chi^2_9$. Let's find 100 sample variances and compare with the chi-square distribution. Repeat the procedure above, starting from step 2, with two changes:

   a. Store the variances instead of the means:
      Stat > Basic Statistics > Store Descriptive Statistics > select Data and Sample for "Variables" and "By variables (optional)" > click Statistics > select Variance > OK > OK
   b. Probability plot against a gamma distribution instead of the normal:
      Graph > Probability Plots > select Single > select Variance2 for "Graph variable" > click Distribution > select Gamma under "Distribution" > enter 4.5 and 2 for Shape and Scale > OK > OK

The gamma distribution with shape 4.5 and scale 2 is the $\chi^2_9$ distribution, which is the distribution of $9S^2$. The sample variance $S^2$ itself follows a gamma distribution with shape 4.5 and scale 2/9 ≈ 0.2222, so the column plotted against Shape 4.5 and Scale 2 should hold 9 times the stored variances.
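
For readers without Minitab, the Part I procedure can be sketched in Python using only the standard library. This is a rough illustration rather than the lab's own method. It assumes the regression function reads E(Y | X = x) = 2x, and the helper names `pearson_r`, `simulate_xy`, and `theoretical_r` are hypothetical, introduced only for this sketch. Under that assumed model, Cov(X, Y) = 2 Var(X) and Var(Y) = 4 Var(X) + Var(E), so the population correlation is 2 sd_x / sqrt(4 sd_x^2 + sd_e^2), which the sample values should approach.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_xy(noise_sd, n=50, mean_x=9.0, sd_x=2.0, seed=None):
    """Generate n (x, y) pairs with y = 2*x + e, e ~ Normal(0, noise_sd).

    The form y = 2*x + e is an assumption about the lab's regression model.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(mean_x, sd_x) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def theoretical_r(noise_sd, sd_x=2.0):
    """Population correlation of (X, Y) under the assumed model Y = 2X + E."""
    return 2.0 * sd_x / math.sqrt(4.0 * sd_x ** 2 + noise_sd ** 2)


if __name__ == "__main__":
    # Three data sets of 50 points, one per noise standard deviation.
    for noise_sd in (1.0, 2.0, 3.0):
        xs, ys = simulate_xy(noise_sd, seed=401)
        print(f"noise sd = {noise_sd}: sample r = {pearson_r(xs, ys):.3f}, "
              f"population r = {theoretical_r(noise_sd):.3f}")
```

With sd_x = 2, the population values under this assumed model are 4/sqrt(17) ≈ 0.970, 4/sqrt(20) ≈ 0.894, and 4/5 = 0.8. The regression function is identical in all three cases; only the noise changes the correlation.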
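
The Part II simulation can be sketched the same way in Python with the standard library. Again this is only an illustration under stated assumptions: the function name `sample_means_and_variances` and the seed are hypothetical choices, and numerical summaries stand in for the histograms and probability plots. The checks rely on standard facts: the mean of n standard normal observations has mean 0 and standard deviation 1/sqrt(n), and (n - 1)S^2 is chi-square with n - 1 degrees of freedom, so it has mean n - 1 and variance 2(n - 1).

```python
import math
import random
import statistics


def sample_means_and_variances(n_samples=100, n=10, mu=0.0, sigma=1.0, seed=None):
    """Draw n_samples normal samples of size n; return their means and variances."""
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(sample))
        variances.append(statistics.variance(sample))  # divisor n - 1
    return means, variances


if __name__ == "__main__":
    n = 10
    means, variances = sample_means_and_variances(n_samples=100, n=n, seed=401)

    # Sample mean: normal with mean 0 and standard deviation 1/sqrt(n).
    print(f"mean of sample means = {statistics.mean(means):.3f} (theory 0)")
    print(f"sd of sample means   = {statistics.stdev(means):.3f} "
          f"(theory {1 / math.sqrt(n):.5f})")

    # (n - 1) * S^2 / sigma^2 is chi-square with n - 1 degrees of freedom,
    # so with sigma = 1 it has mean n - 1 and variance 2 * (n - 1).
    scaled = [(n - 1) * v for v in variances]
    print(f"mean of 9*S^2     = {statistics.mean(scaled):.2f} (theory {n - 1})")
    print(f"variance of 9*S^2 = {statistics.variance(scaled):.2f} "
          f"(theory {2 * (n - 1)})")
```

With only 100 samples the summaries will wander noticeably around the theoretical values; raising `n_samples` tightens the agreement.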

