UT Dallas CS 6313 - Chapter_8_2b

This preview shows page 1-2-3 out of 9 pages.

PROBABILITY AND STATISTICS IN COMPUTER SCIENCE AND SOFTWARE ENGINEERING
Chapter 8: Introduction to Statistics

OVERVIEW

• We’ve seen how to compute the mean, median, and quartiles/percentiles/quantiles for populations and samples.
• We now explore other statistics that are often used: variance and standard deviation.
• Recall that these values provide a measure of how much the distribution can “vary”.
• We can then see how these concepts apply to populations and samples.
• We’ll define interquartile ranges and see how they help us detect outliers.

RECALL …

• For a random variable $X$, we had an expectation (or mean) $\mu = E(X)$ and variance $\sigma^2 = \mathrm{Var}(X)$, which was defined by $\sigma^2 = E[(X - \mu)^2]$.
• If we took a sample of observations $X_1, \ldots, X_n$, we could estimate the population mean $\mu$ with the sample mean, $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$.
• This estimator was unbiased ($E(\bar{X}) = \mu$), consistent, and asymptotically normal.
• We also saw (page 213) that [equation not recoverable from the preview]

VARIANCE

• Suppose we have a sample $X_1, \ldots, X_n$. The sample variance is defined by the formula
  $$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$
• This measures the variability among the observations and estimates the population variance $\sigma^2 = \mathrm{Var}(X)$.
• The sample standard deviation is the square root of the sample variance, i.e. $s = \sqrt{s^2}$. It measures variability in the same units as $X$ and estimates the population standard deviation $\sigma = \mathrm{Std}(X)$.
• Population and sample variance are in squared units.

VARIANCE

• A simpler formula (computationally) is given by
  $$s^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n-1}$$
• The book (on page 220) shows how this is equivalent to the other formulation.
• See Example 8.16 (same page) for a demonstration of how to compute this statistic.
• The book shows why the $n-1$ term in both formulations is useful: it ensures that the sample variance is unbiased.
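
The two formulations of the sample variance can be compared numerically. The following is a minimal sketch, not taken from the slides or the book: the function names and the `data` values are hypothetical, and Python's standard `statistics.variance` (which also divides by $n-1$) is used only as a cross-check.

```python
import math
import statistics


def sample_variance_definition(xs):
    """Sample variance from the definition: sum of squared deviations / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    """Computational shortcut: (sum of squares - n * mean^2) / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return (sum(x * x for x in xs) - n * mean ** 2) / (n - 1)


# Hypothetical observations, chosen only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

s2_definition = sample_variance_definition(data)
s2_shortcut = sample_variance_shortcut(data)
s = math.sqrt(s2_definition)  # sample standard deviation, same units as the data

print("s^2 (definition):", s2_definition)
print("s^2 (shortcut):  ", s2_shortcut)
print("s:               ", s)
print("library check:   ", statistics.variance(data))
```

Both functions agree up to floating-point rounding. The shortcut needs only running sums of $X_i$ and $X_i^2$, but it subtracts two potentially large numbers, so on data with a large mean and small spread it can lose precision relative to the definition.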

For the derivation of unbiasedness, see pages 220-221. It can be shown that, under certain assumptions, the sample variance and standard deviation are also consistent and asymptotically normal.

STANDARD ERRORS OF ESTIMATES

• We can actually use the concepts of variance and standard deviation in another way: to measure the precision and reliability of estimators.
• Basically, we can approximate the variance of the estimator. This helps us judge how good the estimator is, alongside properties such as whether it is unbiased.
• Suppose we have an estimator $\hat{\theta}$ for some population parameter $\theta$. We define the standard error of the estimator to be its standard deviation, i.e. $\sigma(\hat{\theta}) = \mathrm{Std}(\hat{\theta})$.
• Given a set of estimates, we can of course compute the standard deviation of this sample, $s(\hat{\theta})$.
• These standard errors show how much estimators of the same parameter may vary if computed from different samples.

STANDARD ERRORS OF ESTIMATES

• Consider the diagrams at the top of page 222:
  • We can think of these as a series of estimates $\hat{\theta}$ for some population parameter $\theta$, perhaps computed from multiple samples.
  • The diagrams show estimates that are biased and unbiased, with low standard error and with high standard error.
• Note that “biased” means the estimator systematically falls to one side or the other of the true value.
• The standard error measures how “spread out” the distribution of the estimator is. For an unbiased estimator, a low standard error means the estimate is likely to be close to the actual parameter value.
• Obviously we would like our estimator to be unbiased and have a low standard error (see Example 8.17 on page 221).

INTERQUARTILE RANGES

• Generally, we would like some mechanism by which we can identify outliers: sample observations that fall outside the “normal range” and thus may dramatically affect our computations of the sample mean and variance.
• One approach: consider interquartile ranges.
• We can define the interquartile range as the distance between the first and third quartiles, i.e. $IQR = Q_3 - Q_1$.
• The IQR is another measure of the variability of the data, and can be estimated from the sample quartiles.

OUTLIERS

• How to identify outliers?
• One general rule is the rule of 1.5(IQR).
• For normal distributions, 99.3% of the distribution lies above $Q_1 - 1.5(IQR)$ and below $Q_3 + 1.5(IQR)$. Check this for a standard normal distribution by looking at Table A4.
• The idea is to identify any observation that lies outside this range as an outlier.
• Check the example on page
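
The rule of 1.5(IQR) and the 99.3% figure for normal distributions can both be checked with a short sketch. This is an illustration under stated assumptions, not the book's example: the helper `iqr_outliers` and the `sample` values are hypothetical, and the sample quartiles come from Python's `statistics.quantiles` with its default method, which may differ slightly from the sample-quantile convention used in the book.

```python
from statistics import NormalDist, quantiles


def iqr_outliers(xs, k=1.5):
    """Return (outliers, (low, high)) using fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = quantiles(xs, n=4)  # sample quartiles (library default method)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in xs if x < low or x > high], (low, high)


# Hypothetical sample with one suspiciously large observation.
sample = [12, 14, 15, 15, 16, 17, 18, 19, 21, 58]
outliers, (low, high) = iqr_outliers(sample)
print("fences:", low, high)
print("outliers:", outliers)

# Check the normal-distribution claim using exact standard normal quartiles
# instead of a printed table.
z = NormalDist()  # standard normal: mean 0, standard deviation 1
q1, q3 = z.inv_cdf(0.25), z.inv_cdf(0.75)
iqr = q3 - q1
coverage = z.cdf(q3 + 1.5 * iqr) - z.cdf(q1 - 1.5 * iqr)
print("normal coverage inside the fences:", round(coverage, 4))
```

The multiplier `k` is left as a parameter because 1.5 is a convention rather than a derived constant. For normally distributed data it flags roughly 0.7% of observations even when nothing is wrong, so a flagged value is a candidate for inspection, not automatically an error.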

