
Comparing Systems Using Sample Data
Raj Jain
Washington University in Saint Louis
Saint Louis, MO
[email protected]
Slides are available on-line at: http://www.cse.wustl.edu/~jain/cse567-06/
©2006 Raj Jain, CSE567M, Washington University in St. Louis

Overview
- Sample Versus Population
- Confidence Interval for The Mean
- Approximate Visual Test
- One Sided Confidence Intervals
- Confidence Intervals for Proportions
- Sample Size for Determining Mean and Proportions

Sample
- Old French word `essample' ⇒ `sample' and `example'
- One example ≠ theory
- One sample ≠ Definite statement

Sample Versus Population
- Generate several million random numbers with mean μ and standard deviation σ. Draw a sample of n observations: $\bar{x} \ne \mu$
- Sample mean ≠ population mean
- Parameters: population characteristics = Unknown = Greek
- Statistics: Sample estimates = Random = English

Confidence Interval for The Mean
- k samples ⇒ k sample means ⇒ Can't get a single estimate of μ ⇒ Use bounds c1 and c2: Probability{c1 ≤ μ ≤ c2} = 1-α
- Confidence interval: (c1, c2)
- Significance level: α
- Confidence level: 100(1-α)
- Confidence coefficient: 1-α
[Figure: μ lying between the bounds c1 and c2]

Determining Confidence Interval
- Use 5-percentile and 95-percentile of the sample means to get 90% confidence interval ⇒ Need many samples.
- Central limit theorem: Sample mean of independent and identically distributed observations: $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$, where μ = population mean, σ = population standard deviation
- Standard Error: Standard deviation of the sample mean = $\sigma/\sqrt{n}$
- 100(1-α)% confidence interval for μ: $(\bar{x} - z_{1-\alpha/2}\, s/\sqrt{n},\ \bar{x} + z_{1-\alpha/2}\, s/\sqrt{n})$
  $z_{1-\alpha/2}$ = (1-α/2)-quantile of N(0,1)
[Figure: N(0,1) density with $-z_{1-\alpha/2}$, 0, and $z_{1-\alpha/2}$ marked]

Example 13.1
- $\bar{x}$ = 3.90, s = 0.95 and n = 32
- A 90% confidence interval for the mean = $3.90 \mp (1.645)(0.95)/\sqrt{32}$ = (3.62, 4.17)
- We can state with 90% confidence that the population mean is between 3.62 and 4.17. The chance of error in this statement is 10%.

Confidence Interval: Meaning
- If we take 100 samples and construct a 90% confidence interval for each sample, the interval would include the population mean in about 90 cases.
[Figure: intervals (c1, c2) from repeated samples compared against μ; total yes > 100(1-α)]

Confidence Interval for Small Samples
- 100(1-α)% confidence interval for μ for n < 30: $(\bar{x} - t_{[1-\alpha/2;\,n-1]}\, s/\sqrt{n},\ \bar{x} + t_{[1-\alpha/2;\,n-1]}\, s/\sqrt{n})$
- $t_{[1-\alpha/2;\,n-1]}$ = (1-α/2)-quantile of a t-variate with n-1 degrees of freedom

Example 13.2
- Sample: -0.04, -0.19, 0.14, -0.09, -0.14, 0.19, 0.04, and 0.09.
- Mean = 0, Sample standard deviation = 0.138.
- For 90% interval: t[0.95;7] = 1.895
- Confidence interval for the mean = $0 \mp 1.895 \times 0.138/\sqrt{8} \approx (-0.092, 0.092)$

Testing For A Zero Mean
[Figure]

Example 13.3
- Difference in processor times: {1.5, 2.6, -1.8, 1.3, -0.5, 1.7, 2.4}.
- Question: Can we say with 99% confidence that one is superior to the other?
  Sample size = n = 7
  Mean = 7.20/7 = 1.03
  Sample variance = (22.84 - 7.20*7.20/7)/6 = 2.57
  Sample standard deviation = $\sqrt{2.57}$ = 1.60
  t[0.995; 6] = 3.707
- 99% confidence interval = $1.03 \mp 3.707 \times 1.60/\sqrt{7}$ = (-1.21, 3.27)

Example 13.3 (Cont)
- The endpoints of the interval have opposite signs, so it includes zero ⇒ we cannot say with 99% confidence that the mean difference is significantly different from zero.
- Answer: They are the same.
- Answer: The difference is zero.

Example 13.4
- Difference in processor times: {1.5, 2.6, -1.8, 1.3, -0.5, 1.7, 2.4}.
- Question: Is the difference 1?
- 99% Confidence interval = (-1.21, 3.27)
- The interval includes 1 ⇒ Yes: The difference is 1

Paired vs. Unpaired Comparisons
- Paired: one-to-one correspondence between the ith test on system A and the ith test on system B
- Example: Performance on ith workload
- Use confidence interval of the difference
- Unpaired: No correspondence
- Example: n people on System A, n on System B ⇒ Need more sophisticated method

Example 13.5
- Performance: {(5.4, 19.1), (16.6, 3.5), (0.6, 3.4), (1.4, 2.5), (0.6, 3.6), (7.3, 1.7)}. Is one system better?
- Differences: {-13.7, 13.1, -2.8, -1.1, -3.0, 5.6}.
- Sample mean = -0.32, sample standard deviation = 9.03. With t[0.95; 5] = 2.015, the 90% confidence interval for the mean difference = $-0.32 \mp (2.015)(9.03)/\sqrt{6}$ = (-7.75, 7.11), which includes zero.
- Answer: No. They are not different.

Unpaired Observations
- Compute the sample means: $\bar{x}_a = \frac{1}{n_a}\sum_{i=1}^{n_a} x_{ia}$, $\bar{x}_b = \frac{1}{n_b}\sum_{i=1}^{n_b} x_{ib}$
- Compute the sample standard deviations: $s_a = \sqrt{\frac{\sum_{i=1}^{n_a} x_{ia}^2 - n_a \bar{x}_a^2}{n_a - 1}}$, $s_b = \sqrt{\frac{\sum_{i=1}^{n_b} x_{ib}^2 - n_b \bar{x}_b^2}{n_b - 1}}$

Unpaired Observations (Cont)
- Compute the mean difference: $\bar{x}_a - \bar{x}_b$
- Compute the standard deviation of the mean difference: $s = \sqrt{\frac{s_a^2}{n_a} + \frac{s_b^2}{n_b}}$
- Compute the effective number of degrees of freedom: $\nu = \frac{\left(s_a^2/n_a + s_b^2/n_b\right)^2}{\frac{1}{n_a+1}\left(s_a^2/n_a\right)^2 + \frac{1}{n_b+1}\left(s_b^2/n_b\right)^2} - 2$
- Compute the confidence interval for the mean difference: $(\bar{x}_a - \bar{x}_b) \mp t_{[1-\alpha/2;\,\nu]}\, s$

Example 13.6
- Times on System A: {5.36, 16.57, 0.62, 1.41, 0.64, 7.26}
  Times on System B: {19.12, 3.52, 3.38, 2.50, 3.60, 1.74}
- Question: Are the two systems significantly different?
- For System A: $\bar{x}_a$ = 5.31, $s_a^2$ = 37.92, $n_a$ = 6
- For System B: $\bar{x}_b$ = 5.64, $s_b^2$ = 44.11, $n_b$ = 6

Example 13.6 (Cont)
- Mean difference = -0.33, standard deviation of the mean difference = 3.698, effective number of degrees of freedom ν = 11.92
- With t[0.95; 12] = 1.782, the 90% confidence interval for the mean difference = $-0.33 \mp (1.782)(3.698)$ = (-6.92, 6.26)
- The confidence interval includes zero ⇒ the two systems are not different.
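The unpaired procedure can be sketched in a few lines of Python using the data of Example 13.6. This is a minimal illustration, not the author's code: the function names `effective_dof` and `unpaired_ci` are hypothetical, the degrees-of-freedom expression assumes the variant with n+1 denominators and a -2 correction (the more common Welch-Satterthwaite form uses n-1 and no correction, and gives a somewhat smaller ν), and the t quantile is assumed to come from a table lookup rather than being computed.

```python
import math
from statistics import mean, variance

def effective_dof(a, b):
    """Effective degrees of freedom (assumed variant: n+1 denominators, minus 2)."""
    va = variance(a) / len(a)   # s_a^2 / n_a
    vb = variance(b) / len(b)   # s_b^2 / n_b
    return (va + vb) ** 2 / (va ** 2 / (len(a) + 1) + vb ** 2 / (len(b) + 1)) - 2

def unpaired_ci(a, b, t_quantile):
    """Return (mean difference, its standard deviation, confidence interval)."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return diff, se, (diff - t_quantile * se, diff + t_quantile * se)

times_a = [5.36, 16.57, 0.62, 1.41, 0.64, 7.26]
times_b = [19.12, 3.52, 3.38, 2.50, 3.60, 1.74]

nu = effective_dof(times_a, times_b)     # about 11.92, rounded to 12 for the lookup
T_095_12 = 1.782                         # assumption: t[0.95; 12] read from a t table
diff, se, (lo, hi) = unpaired_ci(times_a, times_b, T_095_12)

print(f"nu = {nu:.2f}, difference = {diff:.2f}, std dev = {se:.3f}")
print(f"90% CI = ({lo:.2f}, {hi:.2f}); includes zero: {lo <= 0 <= hi}")
```

Passing the quantile in as an argument keeps the sketch free of third-party dependencies; a statistics library could supply it instead.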
Approximate Visual Test
[Figure]

Example 13.7
- Times on System A: {5.36, 16.57, 0.62, 1.41, 0.64, 7.26}
  Times on System B: {19.12, 3.52, 3.38, 2.50, 3.60, 1.74}
  t[0.95; 5] = 2.015
- The 90% confidence interval for the mean of A = $5.31 \mp (2.015)\sqrt{37.92/6}$ = (0.24, 10.38)
- The 90% confidence interval for the mean of B = $5.64 \mp (2.015)\sqrt{44.11/6}$ = (0.18, 11.10)
- Confidence intervals overlap and the mean of one falls in the confidence interval for the other ⇒ Two systems are not different at this level of confidence.
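The check in Example 13.7 can be sketched as follows. The helper names `mean_ci` and `visual_test` are hypothetical. The example itself only demonstrates the "overlap, and one mean inside the other interval" case; the other two outcomes in the code (non-overlapping intervals read as "different", overlapping intervals with neither mean inside the other read as "inconclusive") are assumptions about how the visual test is usually stated. A single t quantile is used for both samples, which presumes equal sample sizes.

```python
import math
from statistics import mean, stdev

def mean_ci(sample, t_quantile):
    """Two-sided confidence interval for the mean: x_bar -/+ t * s / sqrt(n)."""
    m = mean(sample)
    half = t_quantile * stdev(sample) / math.sqrt(len(sample))
    return m - half, m + half

def visual_test(a, b, t_quantile):
    """Compare two systems by looking only at their individual intervals."""
    lo_a, hi_a = mean_ci(a, t_quantile)
    lo_b, hi_b = mean_ci(b, t_quantile)
    if hi_a < lo_b or hi_b < lo_a:
        return "different"            # assumed rule: intervals do not overlap
    if lo_b <= mean(a) <= hi_b or lo_a <= mean(b) <= hi_a:
        return "not different"        # the case shown in Example 13.7
    return "inconclusive"             # assumed rule: a t-test would be needed

times_a = [5.36, 16.57, 0.62, 1.41, 0.64, 7.26]
times_b = [19.12, 3.52, 3.38, 2.50, 3.60, 1.74]
T_095_5 = 2.015                       # t[0.95; 5], as given in the example

ci_a = mean_ci(times_a, T_095_5)
ci_b = mean_ci(times_b, T_095_5)
verdict = visual_test(times_a, times_b, T_095_5)
print(ci_a, ci_b, verdict)
```

Small differences in the last digit relative to the slide values can arise because the code does not round the mean before forming the interval.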

