Front Back
data can be ____ or ____
quantitative or qualitative
two types of qualitative data
Nominal - no natural order; Ordinal - natural rank or order
Two types of quantitative data
Discrete - counts, integer values only; Continuous - measurements, any value within a range
The closeness of a measured value to the known or true value
accuracy
the closeness of two or more measurements to each other
precision
Low accuracy and low precision
scattered all over
low accuracy and high precision
close together but not near the center
high accuracy and low precision
near the center but not together
high accuracy and high precision
all close together and in the center
incorrect measurements due to carelessness
mistakes or gross errors
errors of the same size or magnitude with each subsequent measurement
systematic error
unpredictable error that is always present in a measurement and varies from one measurement to the next
random error
Number of observations in each variable class
frequency
fraction of the total observations in each variable class
relative frequency
frequency of a variable class plus the frequencies of the classes below it
cumulative frequency
relative frequency of a variable class plus the relative frequencies of the classes below it
cumulative relative frequency
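The four frequency cards above can be worked through in a few lines. This is a minimal sketch using made-up ordinal observations; the data and the class names are assumptions for illustration only.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical ordinal observations (not from the deck).
observations = ["low", "low", "medium", "high", "medium",
                "low", "high", "medium", "medium", "low"]
classes = ["low", "medium", "high"]  # variable classes in their natural order

counts = Counter(observations)
n = len(observations)

# Frequency: number of observations in each class.
frequency = [counts[c] for c in classes]
# Relative frequency: fraction of all observations in each class.
relative_frequency = [f / n for f in frequency]
# Cumulative versions: each class plus all classes below it.
cumulative_frequency = list(accumulate(frequency))
cumulative_relative_frequency = [cf / n for cf in cumulative_frequency]

for row in zip(classes, frequency, relative_frequency,
               cumulative_frequency, cumulative_relative_frequency):
    print(row)
```

The last cumulative frequency always equals the total number of observations, and the last cumulative relative frequency always equals 1.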
what do we do to reduce bias in data?
shuffle it
Formula for Mean
μ=Σx/N
Variance Formula
s² = Σ(xi - x̄)² / (n - 1)
Formula for standard deviation
square root of variance
Formula for Standard Error
SD / √N
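The mean, variance, standard deviation, and standard error cards build on each other, so it can help to compute them in order. A small sketch, assuming a hypothetical sample of five measurements and the sample (n - 1) form of the variance:

```python
import math
import statistics

sample = [4.0, 7.0, 6.0, 5.0, 8.0]  # hypothetical measurements

n = len(sample)
mean = sum(sample) / n                                      # sum of x over n
variance = sum((x - mean) ** 2 for x in sample) / (n - 1)   # sample variance
sd = math.sqrt(variance)                                    # square root of variance
se = sd / math.sqrt(n)                                      # SD divided by root n

# Cross-check the hand-written formulas against the standard library.
assert math.isclose(variance, statistics.variance(sample))
assert math.isclose(sd, statistics.stdev(sample))

print(mean, variance, sd, se)
```

For this sample the mean is 6 and the variance is 2.5, so the standard error is the square root of 2.5 / 5.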
formula for median
the value at position (n+1)/2 in the sorted data
how to determine which observation to use for the 3 quartiles
Q1 = (n+1)/4; Q2 = median; Q3 = 3(n+1)/4
how to determine if a data point is or isn't an outlier?
it is an outlier if it lies above Q3 + 1.5(IQR) or below Q1 - 1.5(IQR)
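The quartile positions and the 1.5 × IQR outlier fences from the cards above can be combined into one short calculation. This sketch uses a hypothetical data set, and the helper `value_at_position` is an illustrative name; interpolating between neighbours when the position is not a whole number is an assumption, since the cards do not say how to handle that case.

```python
def value_at_position(sorted_data, position):
    """Return the value at a 1-based position, interpolating between neighbours."""
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part, 0 when position is whole
    if lower < 1:
        return sorted_data[0]
    if lower >= len(sorted_data):
        return sorted_data[-1]
    below = sorted_data[lower - 1]
    above = sorted_data[lower]
    return below + fraction * (above - below)

data = sorted([2, 4, 5, 7, 8, 9, 11, 12, 13, 14, 40])  # hypothetical data
n = len(data)

q1 = value_at_position(data, (n + 1) / 4)
q2 = value_at_position(data, (n + 1) / 2)        # the median
q3 = value_at_position(data, 3 * (n + 1) / 4)
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3, iqr, outliers)
```

With 11 observations the positions are 3, 6, and 9, so no interpolation is needed here, and only the value 40 falls outside the fences.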
The Empirical Rule
if the data set is a bell-shaped curve, then: 68% within mean +/- 1 Std Dev; 95% within mean +/- 2 Std Dev; 99.7% within mean +/- 3 Std Dev
what is the formula for the Z score
Z = (x - x̄) / SD
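As a quick check on the z score formula and the empirical rule above, the standard library's `statistics.NormalDist` can stand in for a printed z table. The numbers passed to `z_score` are hypothetical, and `share_within` is an illustrative helper name.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

z = z_score(130, 100, 15)          # hypothetical value, mean, and SD
standard_normal = NormalDist()     # mean 0, standard deviation 1

# Area to the left of z, which is what a z table usually reports.
area_below = standard_normal.cdf(z)

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

print(z, round(area_below, 4))
for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The three shares come out close to 68%, 95%, and 99.7%, which is where the empirical rule's rounded figures come from.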
when would you use the z score and z table?
if the population standard deviation is known
when would you use the t scores and t table?
if the population standard deviation is unknown
how many standard deviations away from the mean our value is
z score
what is the central limit theorem
regardless of the shape of the original distribution of data, the sampling distribution of the mean will be approximately normally distributed when the sample size n is large enough. As n increases, the sample means cluster more tightly around the true population mean
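A small simulation can make the central limit theorem card more concrete. This is only a sketch: the exponential population (strongly skewed, true mean 1), the sample sizes, the number of repetitions, and the seed are all assumptions chosen for illustration.

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Draw `reps` samples of size n from a skewed population; return their means."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable

small_n_means = sample_means(n=5, reps=2000, rng=rng)
large_n_means = sample_means(n=50, reps=2000, rng=rng)

# The spread of the sample means shrinks as n grows,
# while their center stays near the population mean of 1.
print(statistics.fmean(small_n_means), statistics.stdev(small_n_means))
print(statistics.fmean(large_n_means), statistics.stdev(large_n_means))
```

Plotting a histogram of each list would also show the n = 50 means looking much more bell-shaped than the original exponential data.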
Three things that impact the width of the confidence interval?
1) confidence level - as the level increases, the interval gets wider 2) variability - populations with more variability generate wider CIs 3) sample size - smaller sample sizes generate wider intervals
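The three effects on interval width can be checked numerically. This sketch assumes the population standard deviation is known, so a z critical value applies; `ci_width` is an illustrative name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def ci_width(sd, n, confidence):
    """Full width of a z-based confidence interval for a mean."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z_crit * sd / math.sqrt(n)

baseline = ci_width(sd=10, n=25, confidence=0.95)

print(baseline)
print(ci_width(sd=10, n=25, confidence=0.99))   # higher confidence level: wider
print(ci_width(sd=20, n=25, confidence=0.95))   # more variability: wider
print(ci_width(sd=10, n=100, confidence=0.95))  # larger sample: narrower
```

When the population standard deviation is unknown, the same comparison holds with a t critical value in place of the z value.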
statistic
value which summarizes a property of the sample. (sample mean, variance, or SD)
Parameter
A measurement or characteristic of the population
Coefficient of Variation formula (CV)
(SD / x̄) * 100
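Because the coefficient of variation is a percentage, it allows a comparison of spread between samples measured on different scales. A short sketch with two hypothetical samples; the function name is illustrative.

```python
import statistics

def coefficient_of_variation(sample):
    """Sample standard deviation as a percentage of the sample mean."""
    return statistics.stdev(sample) / statistics.mean(sample) * 100

heights_cm = [4.0, 7.0, 6.0, 5.0, 8.0]          # hypothetical sample
weights_g = [400.0, 410.0, 395.0, 405.0, 390.0]  # hypothetical sample

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_g)

# The first sample is relatively more variable even though
# its standard deviation is smaller in absolute terms.
print(round(cv_heights, 2), round(cv_weights, 2))
```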
when comparing means using confidence intervals, what do we do if the intervals overlap to find if there is a significant difference?
construct a confidence interval on the difference in the two means. if this interval contains 0 then they are not significantly different
Null hypothesis
- the status quo - includes an equal sign - often reverse of what experimenter believes
Alternative hypothesis
- opposite of null hypothesis
type 1 error and type 2 error
type 1 - false positive; reject the null hypothesis when it is really true. type 2 - false negative; fail to reject the null hypothesis when it is not true
where on the normal distribution graph would you reject the null hypothesis and fail to reject the null hypothesis
If the sample mean falls in the tails then you reject, if it falls in the middle region then you fail to reject
when would you use a two tailed hypothesis test?
when H1 has a not equal to sign
you reject H0 if ____
|z|>|zc|
the smallest level of significance at which the null hypothesis can be rejected
p value
if p value is less than or equal to alpha
reject null hypothesis
if p value is greater than alpha
fail to reject null hypothesis
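The rejection rules on the cards above (compare |z| to the critical value, or compare the p value to alpha) lead to the same decision. The sketch below shows this for a two-tailed z test, assuming the population standard deviation is known; the function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z, p_value, z_crit, reject) for H0: mean = mu0 vs H1: mean != mu0."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Area in both tails beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Critical value that leaves alpha/2 in each tail.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    reject = p_value <= alpha
    return z, p_value, z_crit, reject

z, p_value, z_crit, reject = two_tailed_z_test(
    sample_mean=52, mu0=50, sigma=6, n=36, alpha=0.05
)

# Both rules agree: |z| > z_crit exactly when p_value <= alpha.
print(z, round(p_value, 4), round(z_crit, 3), reject)
```

Rerunning the same data with alpha=0.01 gives a larger critical value, and the null hypothesis is no longer rejected.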
a two sample t-test is only appropriate if the following criteria are met:
1) the two samples are random and independent 2) both populations are normally distributed 3) the population variances are unknown but are the same for both populations
what is a regression?
functional relationship between two or more correlated variables that is often empirically determined by data, and is used to predict the value of one variable when the values of other variables are known
the variable you want to predict. 
the dependent variable (y)
the variable(s) that you actually measure or use to explain the variation in the dependent variable
independent variable (x)
what is the linear correlation coefficient "r"
strength and direction of the x/y relationship
what is the coefficient of determination "r squared"
measure of the strength of the linear relationship between the two variables - the percentage of the variation in y that can be explained by the regression line
total variation = _____
explained variation + unexplained variation
equation for r squared
explained variation/total variation
the measure of variability around the regression line
standard error of estimate
Sb1 = ____ Sxy = ____
Sb1 = SE of the slope; Sxy = SE of the estimate
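The regression cards above fit together in one small calculation: fit the line, split the total variation into explained and unexplained parts, then derive r squared, r, and the two standard errors. This is a sketch for simple linear regression with one x variable, using hypothetical data; the n - 2 divisor in the standard error of the estimate is the usual choice for that case.

```python
import math

x = [1, 2, 3, 4, 5]   # hypothetical independent variable
y = [2, 4, 5, 4, 5]   # hypothetical dependent variable
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar
predicted = [intercept + slope * xi for xi in x]

# Total variation = explained variation + unexplained variation.
total = sum((yi - y_bar) ** 2 for yi in y)
explained = sum((pi - y_bar) ** 2 for pi in predicted)
unexplained = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

r_squared = explained / total
r = math.copysign(math.sqrt(r_squared), slope)   # r takes the sign of the slope

se_estimate = math.sqrt(unexplained / (n - 2))   # variability around the line
se_slope = se_estimate / math.sqrt(sxx)          # standard error of the slope

print(slope, intercept, r_squared, r, se_estimate, se_slope)
```

Here the line explains 60% of the variation in y, so r squared is 0.6.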
most popular way to measure size of a population
capture mark recapture
Lincoln-Petersen
N = [(M+1)(C+1)/(R+1)] - 1, where M = total caught and marked on 1st visit; C = total captured on 2nd visit; R = ones caught on 2nd visit that are marked
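The capture-mark-recapture formula above translates directly into a function. Note that the version with the +1 terms is the bias-corrected form often attributed to Chapman; the function name and the example counts are assumptions.

```python
def estimate_population(marked_first, caught_second, recaptured):
    """Population estimate N = (M + 1)(C + 1) / (R + 1) - 1."""
    if recaptured > min(marked_first, caught_second):
        raise ValueError("recaptures cannot exceed either catch")
    return (marked_first + 1) * (caught_second + 1) / (recaptured + 1) - 1

# Hypothetical survey: 9 marked on the first visit,
# 9 caught on the second visit, 4 of which carried a mark.
n_hat = estimate_population(marked_first=9, caught_second=9, recaptured=4)
print(n_hat)  # 19.0
```

The fewer marked animals turn up in the second catch, the larger the estimate, because the marked group is then a smaller share of the whole population.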
Capture mark recapture assumptions
- population is closed and N is constant - all individuals have the same chance of getting caught - marking individuals doesn't affect their chance of being caught - individuals don't lose their marks
three types of sampling
- simple random - systematic - stratified
simple random sampling
taking random sample of n units from a population size of N
systematic sample
sampling every kth unit from a population
stratified sample
dividing population into non overlapping blocks (strata) and taking random samples within strata
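The three sampling designs defined above differ mainly in how units are picked. The sketch below shows one possible way to draw each kind of sample from a numbered population; the function names, the population of 100 units, and the two strata are all hypothetical.

```python
import random

def simple_random_sample(population, n, rng):
    """Pick n units at random, without replacement."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick a random start among the first k units, then every kth unit."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Take a separate simple random sample inside each stratum."""
    return {
        name: rng.sample(units, n_per_stratum[name])
        for name, units in strata.items()
    }

rng = random.Random(42)              # fixed seed so the run is repeatable
population = list(range(1, 101))     # units numbered 1..100

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)

# Non-overlapping strata that together cover the population.
strata = {"upland": population[:70], "wetland": population[70:]}
stratified = stratified_sample(strata, {"upland": 7, "wetland": 3}, rng)

print(srs)
print(systematic)
print(stratified)
```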
advantages and disadvantages of simple random
Advantages: - requires minimum prior knowledge of population - unbiased - easiest Disadvantages: - sub groups may lead to lower estimate - doesn't guarantee perfect representation
simple random sampling steps
- define sampling unit - randomly select number of sampling units - visit and measure the random units - compute point or interval estimates using the sample data
how would one randomly select the sampling units for a random sample?
random number table
systematic sampling advantages
- ensures good coverage of population - easier to locate sampling units - sub groups are likely to be sampled
systematic disadvantages
may result in very biased point and interval estimates, especially if you hit a pattern
systematic sampling steps
- define sampling unit - randomly select first unit - develop method for locating subsequent sampling units - visit and measure units - analyze as you would for simple
how to calculate the representative area for systematic sampling
distance between plots x distance between lines
how to calculate representative area for systematic sampling
tract area / plots needed
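The two representative-area cards above describe the same quantity from different starting points: plot spacing, or tract area and plot count. A short sketch with hypothetical numbers, assuming distances in metres and areas in square metres:

```python
def area_from_spacing(plot_spacing_m, line_spacing_m):
    """Area each plot represents, from the sampling grid spacing."""
    return plot_spacing_m * line_spacing_m

def area_from_tract(tract_area_m2, plots_needed):
    """Area each plot represents, from the tract area and number of plots."""
    return tract_area_m2 / plots_needed

# Hypothetical 50 ha tract (500,000 square metres) sampled with 25 plots,
# laid out 100 m apart along lines that are 200 m apart.
by_spacing = area_from_spacing(100, 200)
by_tract = area_from_tract(500_000, 25)

print(by_spacing, by_tract)  # both 20,000 square metres, i.e. 2 ha per plot
```

Working backwards, the second formula gives the area each plot must represent, and the first then suggests a grid spacing whose product matches it.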
simple random and systematic are both applied in situations where ______
we're only interested in making an inference about the entire population
when subgroups in nature are non overlapping they are referred to as _____
strata
ways to allocate sample size to strata
- proportional allocation (the bigger the stratum, the larger its sample size) - by variance
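Proportional allocation can be sketched as splitting a fixed total sample size according to each stratum's share of the population. The strata names, their sizes, and the total of 20 samples are hypothetical; rounding to whole samples is an assumption and may leave the total off by one or two in less tidy cases.

```python
def proportional_allocation(stratum_sizes, total_samples):
    """Split total_samples across strata in proportion to stratum size."""
    total_size = sum(stratum_sizes.values())
    return {
        name: round(total_samples * size / total_size)
        for name, size in stratum_sizes.items()
    }

# Hypothetical strata areas in hectares.
stratum_sizes = {"pine": 60, "hardwood": 30, "wetland": 10}
allocation = proportional_allocation(stratum_sizes, total_samples=20)

print(allocation)  # {'pine': 12, 'hardwood': 6, 'wetland': 2}
```

Allocation by variance follows the same pattern but would weight each stratum by its variability rather than its size alone.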
steps in stratified sampling
- determine different strata - calculate sample statistics for each strata - determine proportional allocation or weight for each strata - calculate number of samples necessary for certain level of Allowable Error - multiply total number of samples by each strata's proportional alloca…
formalized process by which scientists gain new knowledge
scientific method
steps to scientific method
- state problem - form hypothesis - design experiment - make observations/collect data - interpret data - draw conclusion
what is deductive reasoning?
- using one or more general premises to reach a logically certain conclusion - general premises --> specific examples
what is inductive reasoning?
taking specific examples and making sweeping general conclusions
advantage of inductive reasoning
provides us with new ideas and can expand our knowledge
____ is the variable expected to change whenever the ____ changes
dependent variable (y); independent variable (x)
