UVA STAT 2120 - Topic_03

This preview shows page 1-2-14-15-30-31 out of 31 pages.



Producing data
Toward statistical inference (Section 3.3)

Toward statistical inference
Idea: Use “sampling” to understand statistical inference.
  - Statistical inference is when a conclusion about a population is inferred from the characteristics of a sample drawn from it.
[Figure: a population with samples drawn from it]

Terminology
  - A parameter is a number that describes a characteristic of a population.
      Ex: p is the proportion with some trait in the population.
  - A statistic is a number that describes a characteristic of a sample.
      Ex: p̂ is the proportion with the trait in the sample.
  - The observed value of a statistic is used to estimate the unobserved value of a parameter.
      Ex: p̂ estimates p.

Sampling variability
  - Sampling variability is the phenomenon by which repeated implementation of the sampling mechanism produces distinct samples.
  - Suppose a statistic is recalculated for each sample under repeated sampling. The distribution of its values is its sampling distribution.

Bias of a statistic
  - The bias of a statistic is described by the center of its sampling distribution.
  - A statistic is unbiased if the mean of its sampling distribution is the same as the parameter it is intended to estimate.
  - Use random sampling to produce unbiased estimates.

Variability of a statistic
  - The variability of a statistic is described by the spread of its sampling distribution.
  - A “margin of error” is determined by the variability of a statistic.
  - The variability of a statistic will be smaller if it is calculated from a larger sample.
  - Variability can be made arbitrarily small with a large enough sample (… but sampling costs money, time, effort, etc.).

Producing data
Data ethics (Section 3.4)

Risks of data production
Ethical issues may arise in the production of data, especially when people are involved as subjects.
Examples of risks to participating subjects:
  - Direct risk to physical health
  - Violations of personal space and privacy
  - Target of deception

Standards of data ethics
  - Oversight by an institutional review board
      Charged to protect the interest of subjects
  - Participation only after informed consent
      Inform of the nature of the experiment and risks
      Consent in writing, if possible
  - Confidentiality of raw data
      Only release statistical summaries publicly

Probability and Sampling Distributions
Randomness (Chapter 4.1)

Randomness and probability
Observations of random phenomena:
  - Patterns emerge “in the long-run” after many repetitions of a chance-happening
  - Short-term patterns are unpredictable
Probability attempts to describe the long-term patterns of random phenomena.

“Long-run” probabilities
A probability is the proportion of times that some interesting outcome is observed “in the long run.”
[Figure: two panels, “First series of tosses” and “Second series”]
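The ideas of a sampling distribution (Section 3.3) and of long-run proportions (Chapter 4.1) can be illustrated with a small simulation. The following is only a sketch and is not taken from the slides: the population proportion p = 0.5, the sample sizes, the number of repetitions, the seed, and all function names are assumptions chosen for illustration.

```python
import random
import statistics


def sample_proportion(p, n, rng):
    """Draw one random sample of size n and return the sample proportion (p-hat)."""
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n


def sampling_distribution(p, n, reps, rng):
    """Recalculate p-hat for `reps` independent samples (repeated sampling)."""
    return [sample_proportion(p, n, rng) for _ in range(reps)]


def running_proportion(p, tosses, rng):
    """Proportion of successes observed after each toss in one series."""
    count = 0
    proportions = []
    for i in range(1, tosses + 1):
        if rng.random() < p:
            count += 1
        proportions.append(count / i)
    return proportions


rng = random.Random(2120)  # fixed seed (assumption) so the sketch is repeatable
p = 0.5                    # hypothetical population proportion

centers = {}  # mean of the simulated sampling distribution, by sample size
spreads = {}  # standard deviation of the simulated sampling distribution
for n in (25, 100, 400):
    phats = sampling_distribution(p, n, reps=2000, rng=rng)
    centers[n] = statistics.mean(phats)
    spreads[n] = statistics.pstdev(phats)
    print(f"n={n}: center={centers[n]:.3f}, spread={spreads[n]:.3f}")

# One long series of tosses: early proportions wander, later ones settle near p.
series = running_proportion(p, 5000, rng)
print("after 10, 100, 5000 tosses:", series[9], series[99], series[4999])
```

Under these assumptions the center of each simulated distribution should sit close to p (unbiasedness under random sampling), while the spread should shrink as the sample size grows, which is what drives a smaller margin of error.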
Probability and Sampling Distributions
Probability models (Chapter 4.2)

Probability models
  - A probability model is an assignment of probabilities to events defined from a set of outcomes.
      It is a mathematical framework for describing random phenomena.
  - An outcome is a possible value generated by the chance-happening of interest.
  - An event is a collection of interesting possible values (i.e., outcomes).
  - Probability rules are the mathematical laws required for a probability model to make sense.

Basic setup of a probability model
  - The sample space, S, is the set of all possible outcomes.
      Represents a single repetition of a chance-happening*
  - An event, A, B, C, etc., is a subset of the sample space.
      Represents the occurrence of a certain interesting thing
* Including a chance-happening that is itself a repetition of some more elementary chance-happening. (We need this to understand sampling distributions.)

Relationships between events
  - The complement, A^c, of an event, A, is the set of outcomes that are not in A.
      Represents the nonoccurrence of a certain interesting thing
  - Events A and B are disjoint if they share no outcomes.
      Represent things that cannot occur simultaneously
[Figure: two panels, “Disjoint” and “Not disjoint”]

Probability rules
  - 0 ≤ P(A) ≤ 1
  - P(S) = 1
  - Complement rule: P(A^c) = 1 – P(A)
  - Addition rule for disjoint events: If A and B are disjoint then P(A or B) = P(A) + P(B)

Finite probabilities
Probability rules simplify when there is a finite number of possible outcomes.
  - Each probability is a number between zero and one
  - The sum of all probabilities is one
  - The probability of an event is the sum of the probabilities of outcomes comprising that event.

Example: equally likely outcomes
A couple wants to have three children. Observe the possible sequences of boys (B) and girls (G).
  S = { BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG }
Assign equal probability of 1/8 to each outcome.
  A = “exactly two girls” = { BGG, GBG, GGB }
  P(A) = P(BGG) + P(GBG) + P(GGB) = 1/8 + 1/8 + 1/8 = 3/8
[Figure: tree diagram of the eight birth sequences]

Example: Benford’s Law
Empirical probabilities of “first digits” in financial docs

  1st digit     1      2      3      4      5      6      7      8      9
  Probability   0.301  0.176  0.125  0.097  0.079  0.067  0.058  0.051  0.046

[Figure: probability histogram; x-axis “Outcomes” (1 to 9), y-axis “Probability”]
  P(1st digit ≥ 6) = 0.067 + 0.058 + 0.051 + 0.046 = 0.222

Example: Two die rolls
Thirty-six possible die rolls, equal probabilities:
  P(sum is 5) = 4/36 = 0.111
  P(doubles) = 6/36 = 0.167, etc.
Note: X = “sum” is an example of a “random variable” (more later)

Probabilities of intervals
If S is a continuum of values then probabilities are assigned using a density curve.
  - No part of a density curve can be negative
  - The total “area under the curve” must be one
  - The probability P(A) of an event A = { a ≤ X ≤ b } is the “area under the curve” between a and b.
[Figure annotation: “Random variable”, labeling X]

Example: Uniform density curve
Probabilities of a random number generator, S = { numbers between 0 and 1 }
  P(0.3 ≤ X ≤ 0.7) = 0.7 – 0.3 = 0.4

Example: General uniform density
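The finite examples in this chapter (three children, Benford’s Law, two die rolls) all apply the same rule: the probability of an event is the sum of the probabilities of the outcomes comprising it. The sketch below checks those calculations with exact fractions. The helper names `equally_likely` and `prob` are hypothetical, introduced only for illustration. The Benford probabilities are the ones listed in the example.

```python
from fractions import Fraction
from itertools import product


def equally_likely(outcomes):
    """Assign probability 1/|S| to every outcome of a finite sample space S."""
    outcomes = list(outcomes)
    return {o: Fraction(1, len(outcomes)) for o in outcomes}


def prob(model, event):
    """P(event): sum the probabilities of the outcomes for which event(o) is true."""
    return sum((pr for o, pr in model.items() if event(o)), Fraction(0))


# Three children: eight sequences of B and G, each with probability 1/8.
children = equally_likely("".join(s) for s in product("BG", repeat=3))
p_two_girls = prob(children, lambda s: s.count("G") == 2)

# Two die rolls: thirty-six ordered pairs with equal probabilities.
dice = equally_likely(product(range(1, 7), repeat=2))
p_sum5 = prob(dice, lambda roll: roll[0] + roll[1] == 5)
p_doubles = prob(dice, lambda roll: roll[0] == roll[1])
p_not_doubles = prob(dice, lambda roll: roll[0] != roll[1])

# Benford's Law: unequal probabilities for first digits 1..9 (in thousandths).
thousandths = (301, 176, 125, 97, 79, 67, 58, 51, 46)
benford = {d: Fraction(t, 1000) for d, t in zip(range(1, 10), thousandths)}
p_digit_ge6 = prob(benford, lambda d: d >= 6)

print("P(exactly two girls) =", p_two_girls)
print("P(sum is 5) =", p_sum5, "~", round(float(p_sum5), 3))
print("P(doubles) =", p_doubles, "~", round(float(p_doubles), 3))
print("P(1st digit >= 6) =", float(p_digit_ge6))

# Probability rules: total probability is one, and the complement rule holds.
print(sum(benford.values()) == 1, p_not_doubles == 1 - p_doubles)
```

Exact fractions are used here rather than floating-point numbers so that checks such as “the sum of all probabilities is one” hold exactly instead of only up to rounding error.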

