WUSTL CSE 567M - Comparing Systems Using Sample Data - D2910546


School name Washington University in St. Louis

Course CSE 567M - Computer Systems Analysis

Pages 43


Comparing Systems Using Sample Data
Raj Jain
Washington University in Saint Louis
Saint Louis, MO
These slides are available on-line at: http://www.cse.wustl.edu/~jain/cse567-06/
©2006 Raj Jain, CSE567M, Washington University in St. Louis

Overview
- Sample Versus Population
- Confidence Interval for The Mean
- Approximate Visual Test
- One Sided Confidence Intervals
- Confidence Intervals for Proportions
- Sample Size for Determining Mean and Proportions

Sample
- Old French word `essample' ⇒ `sample' and `example'
- One example ≠ theory
- One sample ≠ definite statement

Sample Versus Population
- Generate several million random numbers with mean μ and standard deviation σ. Draw a sample of n observations with sample mean x̄: x̄ ≠ μ
- Sample mean ≠ population mean
- Parameters: population characteristics = unknown = Greek letters
- Statistics: sample estimates = random = English letters

Confidence Interval for The Mean
- k samples ⇒ k sample means ⇒ can't get a single estimate of μ ⇒ use bounds c1 and c2: Probability{c1 ≤ μ ≤ c2} = 1 − α
- Confidence interval: (c1, c2)
- Significance level: α
- Confidence level: 100(1 − α)
- Confidence coefficient: 1 − α
[Figure: μ lying between the bounds c1 and c2]

Determining Confidence Interval
- Use the 5-percentile and 95-percentile of the sample means to get a 90% confidence interval ⇒ need many samples.
- Central limit theorem: the sample mean of independent and identically distributed observations is approximately normally distributed: x̄ ~ N(μ, σ/√n), where μ = population mean, σ = population standard deviation
- Standard error: standard deviation of the sample mean = σ/√n
- 100(1 − α)% confidence interval for μ: x̄ ∓ z[1−α/2] s/√n, where z[1−α/2] = (1 − α/2)-quantile of N(0,1)
[Figure: standard normal density with the quantiles −z[1−α/2] and z[1−α/2] marked around 0]

Example 13.1
- x̄ = 3.90, s = 0.95 and n = 32
- A 90% confidence interval for the mean = 3.90 ∓ (1.645)(0.95)/√32 = (3.62, 4.17)
- We can state with 90% confidence that the population mean is between 3.62 and 4.17. The chance of error in this statement is 10%.

Confidence Interval: Meaning
- If we take 100 samples and construct a 90% confidence interval for each sample, the interval would include the population mean in about 90 cases.
[Figure: intervals (c1, c2) from repeated samples, each marked by whether it includes μ; total yes ≥ 100(1 − α)]

Confidence Interval for Small Samples
- 100(1 − α)% confidence interval for μ for n < 30: x̄ ∓ t[1−α/2; n−1] s/√n
- t[1−α/2; n−1] = (1 − α/2)-quantile of a t-variate with n − 1 degrees of freedom

Example 13.2
- Sample: -0.04, -0.19, 0.14, -0.09, -0.14, 0.19, 0.04, and 0.09.
- Mean = 0, sample standard deviation = 0.138.
- For 90% interval: t[0.95; 7] = 1.895
- Confidence interval for the mean = 0 ∓ 1.895 × 0.138/√8 ≈ (−0.092, 0.092)

Testing For A Zero Mean
[Figure: testing whether a confidence interval includes zero]

Example 13.3
- Difference in processor times: {1.5, 2.6, -1.8, 1.3, -0.5, 1.7, 2.4}.
- Question: Can we say with 99% confidence that one is superior to the other?
- Sample size = n = 7
- Mean = 7.20/7 = 1.03
- Sample variance = (22.84 − 7.20 × 7.20/7)/6 = 2.57
- Sample standard deviation = √2.57 = 1.60
- t[0.995; 6] = 3.707
- 99% confidence interval = 1.03 ∓ 3.707 × 1.60/√7 = (−1.21, 3.27)

Example 13.3 (Cont)
- Opposite signs ⇒ we cannot say with 99% confidence that the mean difference is significantly different from zero.
- Answer: They are the same.
- Answer: The difference is zero.

Example 13.4
- Difference in processor times: {1.5, 2.6, -1.8, 1.3, -0.5, 1.7, 2.4}.
- Question: Is the difference 1?
- 99% confidence interval = (−1.21, 3.27)
- Yes: 1 lies inside the interval, so a difference of 1 is consistent with the data.

Paired vs. Unpaired Comparisons
- Paired: one-to-one correspondence between the ith test on system A and the ith test on system B
- Example: performance on the ith workload
- Use the confidence interval of the difference
- Unpaired: no correspondence
- Example: n people on System A, n on System B ⇒ need a more sophisticated method

Example 13.5
- Performance: {(5.4, 19.1), (16.6, 3.5), (0.6, 3.4), (1.4, 2.5), (0.6, 3.6), (7.3, 1.7)}. Is one system better?
- Differences: {-13.7, 13.1, -2.8, -1.1, -3.0, 5.6}.
- Answer: No. They are not different.

Unpaired Observations
- Compute the sample means: x̄_a = (1/n_a) Σ x_ia, x̄_b = (1/n_b) Σ x_ib
- Compute the sample standard deviations: s_a = √[Σ (x_ia − x̄_a)² / (n_a − 1)], s_b = √[Σ (x_ib − x̄_b)² / (n_b − 1)]

Unpaired Observations (Cont)
- Compute the mean difference: x̄_a − x̄_b
- Compute the standard deviation of the mean difference: s = √(s_a²/n_a + s_b²/n_b)
- Compute the effective number of degrees of freedom: ν
- Compute the confidence interval for the mean difference: (x̄_a − x̄_b) ∓ t[1−α/2; ν] s

Example 13.6
- Times on System A: {5.36, 16.57, 0.62, 1.41, 0.64, 7.26}
- Times on System B: {19.12, 3.52, 3.38, 2.50, 3.60, 1.74}
- Question: Are the two systems significantly different?
- For System A: x̄_a = 5.31, s_a² = 37.92, n_a = 6
- For System B: x̄_b = 5.64, s_b² = 44.11, n_b = 6

Example 13.6 (Cont)
- The confidence interval includes zero ⇒ the two systems are not different.
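The unpaired procedure applied in Example 13.6 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the slides' own computation: the degrees-of-freedom formula is not preserved in this preview, so the hypothetical helper `welch_dof` swaps in the Welch-Satterthwaite approximation, and the function names and the table value 1.812 (0.95-quantile of a t-variate with 10 degrees of freedom) are supplied here for illustration only.

```python
import math
from statistics import mean, variance  # variance() divides by n - 1


def welch_dof(a, b):
    """Effective degrees of freedom (Welch-Satterthwaite; an assumption here)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)  # s^2 / n per system
    return (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))


def unpaired_interval(a, b, t_quantile):
    """Return (mean difference, its standard deviation, confidence interval)."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    half_width = t_quantile * se
    return diff, se, (diff - half_width, diff + half_width)


times_a = [5.36, 16.57, 0.62, 1.41, 0.64, 7.26]
times_b = [19.12, 3.52, 3.38, 2.50, 3.60, 1.74]

nu = welch_dof(times_a, times_b)  # a little under 10 for these data
T_095_10 = 1.812  # assumed table lookup: t[0.95; 10], for a 90% interval
diff, se, (low, high) = unpaired_interval(times_a, times_b, T_095_10)

includes_zero = low <= 0.0 <= high
print(f"difference = {diff:.2f}, interval = ({low:.2f}, {high:.2f})")
print("not different" if includes_zero else "different")
```

The t-quantile is passed in rather than computed so the sketch needs only the standard library; with a statistics package available, the quantile for the exact value of `nu` could be looked up instead of rounding to 10 degrees of freedom.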
Approximate Visual Test
[Figure: confidence intervals of the two systems compared visually]

Example 13.7
- Times on System A: {5.36, 16.57, 0.62, 1.41, 0.64, 7.26}
- Times on System B: {19.12, 3.52, 3.38, 2.50, 3.60, 1.74}
- t[0.95; 5] = 2.015
- The 90% confidence interval for the mean of A = 5.31 ∓ (2.015) s_a/√6 = (0.24, 10.38)
- The 90% confidence interval for the mean of B = 5.64 ∓ (2.015) s_b/√6 = (0.18, 11.10)
- Confidence intervals overlap and the mean of one falls in the confidence interval for the other ⇒ the two systems are not different at this level of confidence.
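The visual test in Example 13.7 can be sketched in Python as follows. This is a rough sketch, assuming the small-sample interval x̄ ∓ t s/√n for each system; the function names are hypothetical, and only the "not different" branch is stated in the text above. The "different" and "inconclusive" labels for the remaining cases are assumptions added for completeness.

```python
import math
from statistics import mean, stdev  # stdev() is the sample standard deviation


def mean_interval(xs, t_quantile):
    """Return the sample mean and its t-based confidence interval."""
    m = mean(xs)
    half_width = t_quantile * stdev(xs) / math.sqrt(len(xs))
    return m, (m - half_width, m + half_width)


def visual_test(a, b, t_quantile):
    """Compare two systems by looking at their confidence intervals."""
    mean_a, (low_a, high_a) = mean_interval(a, t_quantile)
    mean_b, (low_b, high_b) = mean_interval(b, t_quantile)
    if high_a < low_b or high_b < low_a:
        return "different"  # assumed label: intervals do not overlap
    if low_b <= mean_a <= high_b or low_a <= mean_b <= high_a:
        return "not different"  # the case stated in Example 13.7
    return "inconclusive"  # assumed label: overlap, but neither mean is inside


times_a = [5.36, 16.57, 0.62, 1.41, 0.64, 7.26]
times_b = [19.12, 3.52, 3.38, 2.50, 3.60, 1.74]
T_095_5 = 2.015  # t[0.95; 5], as given in Example 13.7

mean_a, ci_a = mean_interval(times_a, T_095_5)
mean_b, ci_b = mean_interval(times_b, T_095_5)
verdict = visual_test(times_a, times_b, T_095_5)
print(ci_a, ci_b, verdict)
```

The computed intervals agree with the example's values to within rounding, since the example rounds the means to two decimals before forming the intervals.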
