FANR 3000: Final Exam
79 Cards in this Set
Front | Back |
---|---|
data can be ____ or ____
|
quantitative or qualitative
|
two types of qualitative data
|
Nominal - no natural order
Ordinal - natural rank or order
|
Two types of quantitative data
|
Discrete - counts; integer values only
Continuous - measurements; any value within a range
|
The closeness of a measured value to the known or true value
|
accuracy
|
the closeness of two or more measurements to each other
|
precision
|
Low accuracy and low precision
|
scattered all over
|
low accuracy and high precision
|
close together but not near the center
|
high accuracy and low precision
|
near the center but not together
|
high accuracy and high precision
|
all close together and in the center
|
incorrect measurements due to carelessness
|
mistakes or gross errors
|
errors of the same size or magnitude with each subsequent measurement
|
systematic error
|
error that is always present in a measurement
|
random error
|
Number of observations in each variable class
|
frequency
|
fraction of the total observations in each variable class
|
relative frequency
|
frequency of a variable class plus the frequencies of the classes below it
|
cumulative frequency
|
relative frequency of a variable class plus the relative frequencies of the classes below it
|
cumulative relative frequency
|
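The four frequency definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the clutch-size data are hypothetical and do not come from the course material.

```python
from collections import Counter
from itertools import accumulate


def frequency_table(observations):
    """Return one row per class: (class, freq, rel freq, cum freq, cum rel freq)."""
    counts = Counter(observations)
    total = len(observations)
    classes = sorted(counts)                 # classes in ascending order
    freqs = [counts[c] for c in classes]
    cum_freqs = list(accumulate(freqs))      # running total of frequencies
    return [
        (c, f, f / total, cf, cf / total)
        for c, f, cf in zip(classes, freqs, cum_freqs)
    ]


# Hypothetical clutch sizes, used only for illustration.
clutch_sizes = [2, 3, 3, 4, 4, 4, 5, 5, 3, 4]
table = frequency_table(clutch_sizes)
for row in table:
    print(row)
```

The last row's cumulative frequency equals the number of observations, and its cumulative relative frequency equals 1.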
what do we do to reduce bias in data?
|
randomize it (shuffle it)
|
Formula for Mean
|
μ=Σx/N
|
Variance Formula
|
s² = Σ(xᵢ - x̄)² / (n - 1)
|
Formula for standard deviation
|
square root of variance
|
Formula for Standard Error
|
SD / √n
|
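As a worked example of the mean, variance, standard deviation, and standard error formulas above, here is a minimal Python sketch. The `heights` data are hypothetical, and the sample (n - 1) form of the variance is assumed throughout.

```python
import math
import statistics

# Hypothetical sample of five measurements, used only for illustration.
heights = [12.0, 15.0, 11.0, 14.0, 13.0]

n = len(heights)
mean = sum(heights) / n                                     # sum of x over n
variance = sum((x - mean) ** 2 for x in heights) / (n - 1)  # sample variance
sd = math.sqrt(variance)                                    # square root of variance
se = sd / math.sqrt(n)                                      # SD over root n

print(mean, variance, sd, se)

# Cross-check against the standard library's sample statistics.
assert math.isclose(variance, statistics.variance(heights))
assert math.isclose(sd, statistics.stdev(heights))
```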
formula for median
|
the value at position (n+1)/2 in the ordered data (with an even n, average the two middle values)
|
how to determine which observation to use for the 3 quartiles
|
Q1 = (n+1)/4
Q2 = median
Q3 = 3(n+1)/4
|
how to determine if a data point is or isn't an outlier?
|
it is an outlier if it lies above Q3 + 1.5(IQR)
or below Q1 - 1.5(IQR)
|
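The quartile positions and the 1.5(IQR) rule can be combined into a short outlier check. The sketch below assumes linear interpolation when a position such as (n+1)/4 is not a whole number, which is one common convention and may differ from the one used in class. The helper names and the `dbh` data are hypothetical.

```python
def value_at_position(ordered, position):
    """Value at a 1-based position in sorted data, interpolating between neighbours."""
    lower = int(position)                  # whole part of the position
    fraction = position - lower            # how far toward the next observation
    if lower < 1:
        return float(ordered[0])
    if lower >= len(ordered):
        return float(ordered[-1])
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


def outlier_fences(data):
    """Return (lower fence, upper fence) using Q1 - 1.5(IQR) and Q3 + 1.5(IQR)."""
    ordered = sorted(data)
    n = len(ordered)
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Hypothetical tree diameters with one suspicious value.
dbh = [10, 12, 13, 14, 15, 16, 40]
low, high = outlier_fences(dbh)
outliers = [x for x in dbh if x < low or x > high]
print(low, high, outliers)
```

With n = 7 the quartile positions are whole numbers (2 and 6), so no interpolation is needed in this example.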
The Empirical Rule
|
if the data set is a bell shaped curve, then:
68% within mean +/- Std Dev
95% within mean +/- 2x Std Dev
99.7% within mean +/- 3x Std Dev
|
what is the formula for the Z score
|
Z = (x - x̄)/SD
|
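A quick numeric illustration of the Z score formula, tied back to the empirical rule. The function name and the numbers are hypothetical.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Hypothetical values: an observation of 19 from data with mean 13 and SD 3.
z = z_score(19.0, 13.0, 3.0)
print(z)

# Under the empirical rule, |z| <= 2 places the value inside the ~95% band.
within_95_band = abs(z) <= 2
print(within_95_band)
```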
when would you use the z score and z table?
|
if the population standard deviation is known
|
when would you use the t scores and t table?
|
if the population standard deviation is unknown
|
how many standard deviations away from the mean our value is
|
z score
|
what is the central limit theorem
|
regardless of the shape of the original distribution of the data, the sampling distribution of the mean will be approximately normal when the sample size n is large enough. As n increases, the sample mean tends to cluster more tightly around the true population mean
|
Three things that impact the width of the confidence interval?
|
1) confidence level - as the confidence level increases, the interval gets wider
2) variability - populations with more variability generate wider CIs
3) sample size - smaller sample sizes generate wider intervals
|
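The three effects on interval width can be checked numerically. This sketch assumes a z-based interval with a known population standard deviation, so the full width is 2 * z_c * sigma / √n. The function name `ci_width` and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def ci_width(sigma, n, confidence):
    """Full width of a z-based confidence interval for the mean."""
    # Two-tailed critical value, e.g. about 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z_crit * sigma / math.sqrt(n)


baseline = ci_width(sigma=10, n=25, confidence=0.95)
print(baseline)
print(ci_width(sigma=10, n=25, confidence=0.99))  # higher confidence level: wider
print(ci_width(sigma=20, n=25, confidence=0.95))  # more variability: wider
print(ci_width(sigma=10, n=9, confidence=0.95))   # smaller sample: wider
```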
statistic
|
value which summarizes a property of the sample. (sample mean, variance, or SD)
|
Parameter
|
A measurement or characteristic of the population
|
Coefficient of Variation formula (CV)
|
(SD / x̄) * 100
|
when comparing means using confidence intervals, what do we do if the intervals overlap to find if there is a significant difference?
|
construct a confidence interval on the difference in the two means. if this interval contains 0 then they are not significantly different
|
Null hypothesis
|
- the status quo
- includes an equal sign
- often reverse of what experimenter believes
|
Alternative hypothesis
|
- opposite of null hypothesis
|
type 1 error and type 2 error
|
type 1 - false positive; reject null hypothesis when it is really true
type 2 - false negative; fail to reject the null hypothesis when it is really false
|
where on the normal distribution graph would you reject the null hypothesis and fail to reject the null hypothesis
|
If the sample mean falls in the tails, you reject; if it falls in the middle region, you fail to reject
|
when would you use a two tailed hypothesis test?
|
when H1 has a not equal to sign
|
you reject H0 if ____
|
|z|>|zc|
|
the smallest level of significance at which the null hypothesis can be rejected
|
p value
|
if p value is less than or equal to alpha
|
reject null hypothesis
|
if p value is greater than alpha
|
fail to reject null hypothesis
|
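The p value decision rule can be sketched for a two-tailed z test. This assumes the test statistic z has already been computed and follows a standard normal distribution under H0. The function names and the default alpha of 0.05 are illustrative choices.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """Probability of a statistic at least as extreme as z, in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when p <= alpha, otherwise fail to reject."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


for z in (2.5, 1.0):  # hypothetical test statistics
    p = two_tailed_p_value(z)
    print(z, round(p, 4), decide(p))
```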
a two sample t-test is only appropriate if the following criteria are met:
|
1) the two samples are random and independent
2) both populations are normally distributed
3) population variances are unknown but are the same for both populations
|
what is a regression?
|
functional relationship between two or more correlated variables that is often empirically determined by data, and is used to predict the value of one variable when the values of other variables are known
|
the variable you want to predict.
|
the dependent variable (y)
|
the variable(s) that you actually measure or use to explain the variation in the dependent variable
|
independent variable (x)
|
what is the linear correlation coefficient "r"
|
strength and direction of the x/y relationship
|
what is the coefficient of determination "r squared"
|
measure of the strength of the linear relationship between the two variables
- the percentage of the variation in y that can be explained by the regression line
|
total variation = _____
|
explained variation + unexplained variation
|
equation for r squared
|
explained variation/total variation
|
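To see how total variation splits into explained and unexplained parts, the sketch below fits a simple least-squares line and computes r squared as explained / total. The function name and the data are hypothetical, and ordinary least squares with a single x variable is assumed.

```python
def variation_components(xs, ys):
    """Fit y = b0 + b1*x by least squares; return (explained, unexplained, total)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_sq_x = sum((x - x_bar) ** 2 for x in xs)
    sum_cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sum_cross / sum_sq_x              # slope
    b0 = y_bar - b1 * x_bar                # intercept
    predicted = [b0 + b1 * x for x in xs]
    total = sum((y - y_bar) ** 2 for y in ys)
    explained = sum((p - y_bar) ** 2 for p in predicted)
    unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    return explained, unexplained, total


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
explained, unexplained, total = variation_components(xs, ys)
r_squared = explained / total
print(explained, unexplained, total, r_squared)
```

For these data, about 60% of the variation in y is explained by the regression line.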
the measure of variability around the regression line
|
standard error of estimate
|
Sb1 = ____
Sxy = ____
|
SE of the slope
SE of the estimate
|
most popular way to measure size of a population
|
capture mark recapture
|
Lincoln-Petersen
|
N=[(M+1)(C+1)/(R+1)]-1
M = Total caught and marked on 1st visit
C = Total captured on second visit
R = Ones caught on 2nd visit that are marked
|
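The estimator on this card is a one-line calculation. The sketch below uses the formula exactly as written, with M, C, and R spelled out as parameter names. The function name and the capture counts are hypothetical.

```python
def estimate_population(marked_first, caught_second, recaptured):
    """N = [(M + 1)(C + 1) / (R + 1)] - 1, as given on the card."""
    return (marked_first + 1) * (caught_second + 1) / (recaptured + 1) - 1


# Hypothetical survey: 49 marked on visit 1, 39 caught on visit 2, 9 of them marked.
n_hat = estimate_population(marked_first=49, caught_second=39, recaptured=9)
print(n_hat)
```

Fewer recaptures for the same M and C imply a larger estimated population, because marked animals make up a smaller share of the second catch.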
Capture mark recapture assumptions
|
- Population is closed and N is constant
- all individuals have same chance of getting caught
- marking individuals doesn't affect their chance of being caught
- individuals don't lose their marks
|
three types of sampling
|
- simple random
- systematic
- stratified
|
simple random sampling
|
taking random sample of n units from a population size of N
|
systematic sample
|
sampling every kth unit from a population
|
stratified sample
|
dividing population into non overlapping blocks (strata) and taking random samples within strata
|
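The three sampling designs can be compared side by side with Python's `random` module. This is a sketch only: the function names, the unit IDs, and the stratum labels are hypothetical, and a seeded generator stands in for a random number table.

```python
import random


def simple_random_sample(units, n, rng):
    """Draw n units at random, without replacement, from the whole population."""
    return rng.sample(units, n)


def systematic_sample(units, k, rng):
    """Pick a random start among the first k units, then take every kth unit."""
    start = rng.randrange(k)
    return units[start::k]


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate random sample inside each non-overlapping stratum."""
    return {
        name: rng.sample(members, n_per_stratum[name])
        for name, members in strata.items()
    }


rng = random.Random(0)                      # seeded only so the run is repeatable
units = list(range(1, 101))                 # hypothetical plot IDs 1..100
strata = {"pine": units[:60], "hardwood": units[60:]}

srs = simple_random_sample(units, 10, rng)
systematic = systematic_sample(units, 10, rng)
stratified = stratified_sample(strata, {"pine": 6, "hardwood": 4}, rng)
print(srs, systematic, stratified, sep="\n")
```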
advantages and disadvantages of simple random
|
Advantages:
- requires minimum prior knowledge of population
- unbiased
- easiest
Disadvantages:
- sub groups may lead to lower estimate
- doesn't guarantee perfect representation
|
simple random sampling steps
|
- define sampling unit
- randomly select number of sampling units
- visit and measure the random units
- compute point or interval estimates using the sample data
|
how would one randomly select the sampling units for a random sample?
|
random number table
|
systematic sampling advantages
|
- ensures good coverage of population
- easier to locate sampling units
- sub groups are likely to be sampled
|
systematic disadvantages
|
may result in very biased point and interval estimates, especially if you hit a pattern
|
systematic sampling steps
|
- define sampling unit
- randomly select first unit
- develop method for locating subsequent sampling units
- visit and measure units
- analyze as you would for simple
|
how to calculate the representative area for systematic sampling
|
distance between plots x distance between lines
|
how to calculate representative area for systematic sampling
|
tract area / plots needed
|
simple random and systematic are both applied in situations where ______
|
we're only interested in making an inference about the entire population
|
when subgroups in nature are non overlapping they are referred to as _____
|
strata
|
ways to allocate sample size to strata
|
- proportional allocation (the bigger the stratum, the larger the sample size)
- by variance
|
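Proportional allocation can be sketched as each stratum's share of the total size multiplied by the total number of samples. The function name, the stratum names, and the areas below are hypothetical. Rounding fractional results to whole plots is left to the reader.

```python
def proportional_allocation(stratum_sizes, total_samples):
    """Give each stratum a share of the samples equal to its share of the total size."""
    total_size = sum(stratum_sizes.values())
    return {
        name: total_samples * size / total_size
        for name, size in stratum_sizes.items()
    }


# Hypothetical stratum areas (same units for all strata) and 20 plots to allocate.
areas = {"pine": 60, "hardwood": 30, "mixed": 10}
allocation = proportional_allocation(areas, total_samples=20)
print(allocation)
```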
steps in stratified sampling
|
- determine different strata
- calculate sample statistics for each stratum
- determine proportional allocation or weight for each stratum
- calculate number of samples necessary for certain level of Allowable Error
- multiply total number of samples by each strata's proportional alloca…
|
formalized process by which scientists gain new knowledge
|
scientific method
|
steps to scientific method
|
- state problem
- form hypothesis
- design experiment
- make observations/collect data
- interpret data
- draw conclusion
|
what is deductive reasoning?
|
- using one or more general premises to reach a logically certain conclusion
- general premises --> specific examples
|
what is inductive reasoning?
|
taking specific examples and making sweeping general conclusions
|
advantage of inductive reasoning
|
provides us with new ideas and can expand our knowledge
|
____ is the variable expected to change whenever the ____ changes
|
dependent variable (y)
independent variable (x)
|