Front Back
data editing
the inspection and correction of the data received from each element of the sample
Primary tasks in editing process
- convert all responses into consistent units - assess degree of non response - check for consistency across responses - look for evidence that the respondent wasn't really thinking about the answers - verify that the branching questions were followed correctly - add any needed codes…
data coding
the process of transforming raw data into symbols
how to code close ended items: check all that apply questions
1 if checked, 0 if not
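A minimal sketch of the 1/0 coding rule above. The option names and the `code_check_all` helper are hypothetical, chosen only for illustration:

```python
# Hypothetical answer options for one "check all that apply" question.
OPTIONS = ["email", "phone", "mail"]

def code_check_all(checked):
    """Return one indicator per option: 1 if checked, 0 if not."""
    return {option: int(option in checked) for option in OPTIONS}

# A respondent who checked "email" and "mail" but not "phone".
coded = code_check_all({"email", "mail"})
print(coded)
```

Each option becomes its own 1/0 variable, so a single question produces as many columns as it has options.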
how to code factual open ended items
code the numerical variables
how to code exploratory open ended items
1. identify useable response 2. develop categories for response 3. sort responses into categories using multiple codes 4. assess the degree of agreement between coders
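Step 4 can be sketched with simple percent agreement, which is only one of several possible agreement measures. The category labels and the `percent_agreement` helper are assumptions made for this example:

```python
def percent_agreement(coder_a, coder_b):
    """Share of responses that two coders placed in the same category."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same responses")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

# Hypothetical category codes assigned to the same four open-ended responses.
coder_a = ["price", "service", "price", "other"]
coder_b = ["price", "service", "quality", "other"]

agreement = percent_agreement(coder_a, coder_b)
print(agreement)  # 3 of 4 responses match
```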
nominal can be used for
mode, frequency distribution
ordinal can be used for
mode, median, frequency distribution, range
interval/ ratio can be used for
mode, median, mean, frequency distribution, range, standard deviation
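As a rough illustration of which statistics fit which scale, the sketch below uses Python's standard `statistics` module on made-up data (the variable names and values are assumptions):

```python
import statistics
from collections import Counter

brand = ["A", "B", "A", "C", "A"]        # nominal: labels only
rank = [1, 2, 2, 3, 5]                   # ordinal: ordered categories
spend = [10.0, 12.0, 14.0, 16.0, 18.0]   # interval/ratio: true numbers

nominal_summary = {
    "mode": statistics.mode(brand),
    "frequencies": Counter(brand),
}
ordinal_summary = {
    "mode": statistics.mode(rank),
    "median": statistics.median(rank),
    "range": max(rank) - min(rank),
}
metric_summary = {
    "mean": statistics.mean(spend),
    "stdev": statistics.stdev(spend),  # sample standard deviation
}
print(nominal_summary, ordinal_summary, metric_summary)
```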
chi square analysis
test for significance between the frequency distributions of 2 or more nominally scaled variables to determine if there is an association between the variables
This term defines how well the observed frequencies fit the pattern of expected frequencies
chi square
crosstabs
way to organize data by groups or categories, thus facilitating comparisons; joint frequency distribution of observations on two or more sets of variables
this term defines how certain variables differ among various subgroups of the total sample
crosstabs
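A crosstab is just a joint frequency count. The sketch below builds one with `collections.Counter`; the respondent data and category labels are hypothetical:

```python
from collections import Counter

# Each tuple is one respondent: (gender, would_buy).
respondents = [
    ("male", "yes"), ("male", "no"), ("female", "yes"),
    ("female", "yes"), ("male", "yes"), ("female", "no"),
]

# Joint frequency distribution: how many respondents fall in each cell.
crosstab = Counter(respondents)

for gender in ("male", "female"):
    row = [crosstab[(gender, answer)] for answer in ("yes", "no")]
    print(gender, row)
```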
when is a chi square test appropriate?
when the data are nominally scaled and you are testing for a difference between independent groups
t-tests require what sort of data?
interval or ratio
t-tests determine
if the difference between the 2 sample means occurred by chance
what test should you use when the sample size is less than 30 and standard deviation is unknown
t-test
null hypothesis for t-test says
group means are equal
independent samples
2 or more groups of responses tested as if they came from different populations
related samples
2 or more groups of responses that originate from the sample population
paired sample t-test
tests the difference in means between 2 variables measured on the same sample
ANOVAs
determines if 3 or more means are different from each other
null hypothesis for ANOVA says
all means are equal
the dependent variable in an ANOVA must be
measurable (interval/ ratio)
the independent variable in an ANOVA must be
nominal
one-way ANOVAs have only one
independent variable
what test do ANOVAs use
f-test
f-tests are used to
evaluate the differences between the group means in ANOVA
how to determine significance using f-test
larger f-ratio value = reject null = group mean differences are significant
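The F-ratio is the between-group variance divided by the within-group variance. A minimal sketch, with a hypothetical `one_way_f` helper and made-up attitude scores:

```python
import statistics

def one_way_f(groups):
    """F-ratio for a one-way ANOVA: mean square between / mean square within."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical scores for light, medium, and heavy users.
light, medium, heavy = [1, 2, 3], [2, 3, 4], [5, 6, 7]
f_ratio = one_way_f([light, medium, heavy])
print(f_ratio)
```

Whether a given F-ratio is large enough to reject the null depends on the degrees of freedom, which this sketch does not look up.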
ANOVAs do NOT tell us
where the difference is
ANOVAs ONLY tell us
a difference exists
how to find where the difference is in an ANOVA
use a follow up/ post hoc test
a follow up/ post hoc test conducts
multiple pairwise comparisons of means to determine where the differences lie
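The pairwise structure of a follow-up test can be sketched as below. This only lists the mean difference for every pair of groups. A real post hoc procedure would also test each difference with an adjustment for multiple comparisons, which is not shown. The group data are hypothetical:

```python
import statistics
from itertools import combinations

groups = {"light": [1, 2, 3], "medium": [2, 3, 4], "heavy": [5, 6, 7]}

# One mean difference per pair of groups.
pairwise = {
    (a, b): statistics.mean(groups[a]) - statistics.mean(groups[b])
    for a, b in combinations(groups, 2)
}
for pair, diff in pairwise.items():
    print(pair, diff)
```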
what CAN you determine from the mean
if numbers are above or below the mean
what CAN NOT be determined by mean
- if there will be outliers - what individual scores were
outliers substantially distort
mean
what is usually the best choice to describe data without outliers
mean
what CAN you determine from median
- if numbers are above or below the halfway point
what CAN NOT be determined by the median
- if there were outliers - what individual scores were
what is the best choice to describe data when there are outliers
median
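A small made-up example of why the median is preferred when outliers are present (the income values are assumptions):

```python
import statistics

incomes = [30, 32, 35, 38, 40]
with_outlier = incomes + [500]  # one extreme value added

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before)  # 35 35
print(mean_after, median_after)    # 112.5 36.5
```

The single outlier more than triples the mean, while the median barely moves.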
what CAN you determine from mode
what are the most frequent numbers
what CAN NOT be determined by mode
- where the number is in the group of data - what all of the other numbers are
this is the best choice to describe data if you want to select the most popular value
mode
what CAN you determine from range
how far apart the numbers are
what CAN NOT be determined by range
what the numbers are in the data
what is the best choice to describe the spread of the data
range
this is the best choice to show how much a typical number in the set differs from the mean
standard deviation
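The sketch below contrasts the two spread measures on two hypothetical data sets that share the same range but differ in standard deviation:

```python
import statistics

clustered = [0, 5, 5, 5, 10]    # most values sit at the mean
spread_out = [0, 0, 5, 10, 10]  # most values sit at the extremes

range_clustered = max(clustered) - min(clustered)
range_spread = max(spread_out) - min(spread_out)
sd_clustered = statistics.stdev(clustered)
sd_spread = statistics.stdev(spread_out)

print(range_clustered, range_spread)  # 10 10
print(sd_clustered, sd_spread)
```

The range only looks at the two extreme values, so it cannot tell these data sets apart. The standard deviation uses every value.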
null hypothesis
claims that a population value is equal to some specified value
alt hypothesis
claims a value is different from null value
the p-value is a measure of
significance
if p-value is small
there is strong evidence for alt hypothesis, reject null
if p-value is large
there is insufficient evidence against the null, do not reject null
what is considered a small p-value
less than or equal to .05
null hypothesis assumes
no difference, association or relationship between variables
alt hypothesis assumes
a difference, association or relationship between variables
if p-value is ≤ .05
reject null
if p-value is > .05
fail to reject null
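The decision rule can be written as a one-line function. The `decide` helper is hypothetical, and the .05 cutoff is the convention used in these cards, not a universal constant:

```python
ALPHA = 0.05  # significance level used throughout these cards

def decide(p_value, alpha=ALPHA):
    """Apply the p-value decision rule at the given significance level."""
    return "reject null" if p_value <= alpha else "fail to reject null"

print(decide(0.03))
print(decide(0.20))
```

A large p-value means the data do not contradict the null. It does not prove the null is true, which is why the wording is "fail to reject".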
chi square test determines
if there is a significant difference between expected and observed frequencies in one or more categories
chi square test requirements
1. one or more categories (nominal data) 2. adequate sample size (at least 10) 3. simple random sample 4. data in frequency form 5. all observations must be used
What test determines whether the observed frequencies differ from the expected ones?
chi square test
null hypothesis for chi square claims
no significant difference between expected and observed
alt hypothesis for chi square claims
there IS a significant difference between expected and observed
(detail) what do you assume when rejecting the null hypothesis in a chi square test
the difference is unlikely to be due to chance or sampling error. There is a REAL difference between expected and observed frequencies
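A minimal sketch of the chi square statistic, the sum of (observed - expected)^2 / expected over all categories. The counts are made up. The critical value 3.841 is the standard table value for 1 degree of freedom at the .05 level:

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E across categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical: 50 respondents choosing between 2 brands,
# with an even split expected under the null hypothesis.
observed = [30, 20]
expected = [25, 25]

chi2 = chi_square_statistic(observed, expected)
CRITICAL_05_DF1 = 3.841  # chi square table value, df = 1, alpha = .05
reject_null = chi2 > CRITICAL_05_DF1
print(chi2, reject_null)  # 2.0 False
```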
you want to know if the mean from one population is larger than the mean for another. what do you use?
independent sample t-test
what does independent samples mean?
you have DIFFERENT individuals in your two sample groups
examples of independent sample t-test
- compare sales volume for stores that advertise vs. stores that don't - compare speed of survey programming for students that have completed some type of training vs no training
the null hypothesis in an independent sample claims
difference between the 2 means is 0
the alt hypothesis in an independent sample claims
difference between the 2 means is above/ below/ not equal to 0
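A sketch of the independent sample t statistic, assuming the pooled-variance (equal variances) form. The `independent_t` helper and the sales figures are hypothetical:

```python
import math
import statistics

def independent_t(sample_1, sample_2):
    """Pooled-variance t statistic for the difference between 2 group means."""
    n1, n2 = len(sample_1), len(sample_2)
    pooled_var = (
        (n1 - 1) * statistics.variance(sample_1)
        + (n2 - 1) * statistics.variance(sample_2)
    ) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(sample_1) - statistics.mean(sample_2)) / standard_error

# Hypothetical sales volume: stores that advertise vs. stores that do not.
advertisers = [12, 14, 16]
non_advertisers = [8, 10, 12]

t_stat = independent_t(advertisers, non_advertisers)
print(t_stat)
```

The statistic is compared against a t table with n1 + n2 - 2 degrees of freedom to get the p-value.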
what does paired sample t-test mean?
you have the SAME individuals in both of your sample groups
what do paired sample t-tests compare?
compares the mean difference of values to 0
what is needed for paired sample t-tests to be valid?
difference between paired values should be approximately normally distributed
examples of paired sample t-test
- compare the weight of people on the show before the season begins and after the show ends. - are workers more productive 6 months after they attend training vs. before training?
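A paired sample t statistic works on the per-person differences and compares their mean to 0. A minimal sketch with a hypothetical `paired_t` helper and made-up weights:

```python
import math
import statistics

def paired_t(before, after):
    """t statistic for the mean of the paired differences (after - before)."""
    diffs = [a - b for b, a in zip(before, after)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

# Hypothetical weights of the SAME four people before and after a season.
before = [200, 210, 190, 220]
after = [190, 205, 185, 200]

t_stat = paired_t(before, after)
print(t_stat)  # negative: weights went down on average
```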
ANOVAs are used to compare
3 or more means
ANOVAs ask us
"do all our groups come from populations with the same mean?"
A one-way ANOVA compares 3 or more means with?
only one independent variable
example of ANOVA
- comparing light, medium, and heavy consumers of Starbucks' attitudes towards an advertisement. - comparing light, medium, and heavy users of paper coupons with their likelihood to use mobile coupons
f-test is used to compare
variances from two normal populations.
the peak of any F-distribution is close to what number?
1
What values provide evidence against the null in an f-test?
values far from 1
crosstabs are what sort of variate technique?
multivariate
what multivariate technique studies the relationship between 2 or more categorical variables?
crosstabs
this technique constructs joint distributions of sample elements across variables
crosstabs
the independent variable is also known as?
the causal variable
the dependent variable is also known as?
the outcome variable
what is a banner?
a series of crosstabs between an outcome and several explanatory variables in a single table
What does the "Pearson chi square test of independence" test for?
tests for significance between the frequency distributions of 2 or more nominal variables to determine if there is any association between variables
what tests the null hypothesis claiming categorical variables are independent of each other?
Pearson chi square test of independence
the same proportion of variable X make up each of the response categories for variable Y
null hypothesis of Pearson chi square test for independence
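Under the null of independence, each expected cell count is (row total x column total) / grand total. The sketch below applies that to a hypothetical 2x2 crosstab. The helper names are assumptions, and 3.841 is the standard critical value for 1 degree of freedom at the .05 level:

```python
def expected_counts(table):
    """Expected cell counts if the row and column variables were independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Pearson chi square statistic for a crosstab of observed counts."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Hypothetical crosstab: rows = customer / non-customer, columns = yes / no.
table = [[30, 10], [20, 40]]
chi2 = chi_square_independence(table)
print(expected_counts(table))  # [[20.0, 20.0], [30.0, 30.0]]
print(chi2 > 3.841)            # True: reject independence at the .05 level
```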
independent sample t-tests for the mean determine?
- determine whether 2 groups differ on some characteristics assessed on a continuous measure
what test is used to compare means of 2 groups to see if they are significantly DIFFERENT?
independent sample t-test
μ1 = μ2. this test's null hypothesis symbolizes that the two means of (interval/ ratio variable x) are equal
independent sample t-test
example of independent sample t-test?
- satisfaction rating of men vs. women - age in years, customers vs. non customers
paired sample t-tests are used to?
used to compare 2 means when scores for both variables are provided by the same sample
paired sample t-tests are good for measuring?
good for measuring "before and after" results
this test is good for applying same measures to different objects
paired sample t-tests
what test would we use to compare light, medium, and heavy users of crystal meth on their attitudes towards the show "Breaking Bad"?
ANOVA
an ANOVA's independent variable must be?
nominal
an ANOVA's dependent variable must be?
interval/ ratio
an ANOVA's null hypothesis claims?
all means are equal. μ1 = μ2 = μ3
what does a larger number mean in the F-ratio?
reject null, group means are significantly different
covariation is?
the amount of change in one variable in relation to the amount of change in another
scatter diagram is?
graphical plot of the relative position of 2 variables
scatterplots/ scattergrams/ scatter diagrams are?
a graph of 2 numerical variables (x,y)
what is a 2 dimensional graph representing 2 variable measures from the same set of subject elements?
a scatterplot
if the relationship of 2 variables forms a straight line (linear), variables are considered?
correlated
what variable is typically placed on the x axis?
the more controlled variable
what variable is typically placed on the y axis?
the response variable
Pearson correlation coefficient (r) measures?
measures the strength and direction of a linear relationship between 2 variables
Pearson correlation coefficient varies between?
varies between -1.00 & +1.00
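A minimal sketch of how r is computed, using a hypothetical `pearson_r` helper and made-up data that hit both ends of the range:

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariation of x and y scaled by their spreads."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])   # y rises with x in a straight line
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # y falls with x in a straight line
print(r_positive, r_negative)  # 1.0 -1.0
```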
a higher (r) value means what?
it means a stronger level of association (judged by the absolute value of r)
(r) can be ______ or _______ ?
can be positive or negative
what range of r depicts a very strong relationship?
±0.81 to ±1.00
what range of r depicts NO relationship?
±0.00 to ±0.20
what range of r depicts a moderate relationship?
±0.41 to ±0.60
Properties of Pearson correlation consist of?
- values of r that don't depend on the units of measurement - values of r that don't depend on which variable is labeled x or y (x&y = y&x)
r=+1 is what type of linear relationship?
a perfect POSITIVE linear relationship
r=-1 is what type of linear relationship?
a perfect NEGATIVE linear relationship
-1≤r≤+1. positive value of r means? negative value means?
- positive means positive linear. - negative means negative linear.
Value of r close to Zero means what?
means no linear relation
"no LINEAR relation" does NOT mean what?
it does not mean there is "NO relation AT ALL"
r ONLY measures what sort of relations?
this value only measures LINEAR relationships
there still may be non-linear relations if r is close to?
close to Zero
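A made-up example of a perfect but non-linear relation where r comes out as zero. The `pearson_r` helper is a hypothetical hand-rolled version of the usual formula:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]
y = [value ** 2 for value in x]  # y is fully determined by x, but as a U-shape

r = pearson_r(x, y)
print(r)  # 0.0: no LINEAR relation, yet y depends entirely on x
```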
Assumptions in Pearson correlation coefficient
- both variables are measured using interval or ratio scales - nature of relationship is linear - Both variables come from a bivariate, normally distributed population
Causation is?
∆ in x CAUSES ∆ in y
common response is?
both x and y respond to changes in some unobserved variable
confounding is?
the effect of x on y is mixed up with the effects of other explanatory variables on y
example of confounding is?
tylenol and placebo effect
examples of common response are?
- ice cream sales and # of shark attacks are positively correlated (because ice cream sales and swimming occur in the same seasons) - # of cavities in elementary school children and vocabulary size are positively correlated (because # of cavities and vocabulary both increase with age, but they do…
example of causation is?
football weekends CAUSE heavier traffic, more food sales, etc. (because the game directly CAUSES this)
Spearman Rank Order Correlation measures?
measures the strength and direction of the (monotonic) association between 2 ORDINALLY scaled (RANK ORDER) variables
Key differences between Spearman and Pearson?
- Spearman = ordinal variables - Pearson = interval/ ratio variables
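Spearman works on ranks, not raw values. The sketch below uses the shortcut formula 1 - 6Σd² / (n(n² - 1)), which assumes there are no tied values. The helper names and data are hypothetical:

```python
def ranks(values):
    """Rank each value from 1 (smallest) upward. Assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman rank order correlation via the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Two judges ranking the same five brands.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]
rho = spearman_rho(judge_1, judge_2)
print(rho)  # 0.8
```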
Regression analysis is used to?
used to derive an equation representing the influence of a single or multiple independent variables on a continuous dependent variable
what is used to predict the value of a Dependent variable based on the value of at least 1 Independent variable
you use regression analysis
what explains the impact of changes in an independent variable, on the dependent variable?
regression analysis explains...
in a regression analysis, what variable do we wish to explain?
dependent variable
this variable is used to explain the other variable in a regression analysis
independent variable
what describes relationships in a linear function?
regression analysis
what assumes ∆ dependent variables are CAUSED by ∆ in independent variables?
regression analysis assumes
R^2 is also called ?
coefficient of multiple determination
R^2 is a measure representing what?
it is a measure representing total variation in the dependent variable that can be explained or accounted for by a fitted regression equation
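A sketch of a bivariate least-squares fit and its R^2, computed as 1 - (unexplained variation / total variation). The `fit_line` and `r_squared` helpers and the data are assumptions made for this example:

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(x, y, slope, intercept):
    """Share of the variation in y accounted for by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total

x = [1, 2, 3, 4]   # independent variable
y = [2, 4, 5, 7]   # dependent variable
slope, intercept = fit_line(x, y)
r2 = r_squared(x, y, slope, intercept)
print(slope, intercept, r2)
```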
what is R^2 called when there is only ONE predictor variable?
referred to as the "coefficient of determination"
key terms for, R
- correlation coefficient, indicating strength and direction of relationship
key terms for, R^2
- coefficient of determination, % of variation in one variable accounted for by another variable
Adjusted R^2 adjusts the statistic based on what?
adjusts based on the number of independent variables in the model
how does Adjusted R^2 adjust R^2?
adjusts R^2 such that an independent variable that HAS a correlation to Y increases adjusted R^2, and any variable WITHOUT a strong correlation will make adjusted R^2 decrease.
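The usual adjustment formula is 1 - (1 - R^2)(n - 1) / (n - k - 1), where n is the sample size and k is the number of independent variables. A minimal sketch with a hypothetical helper and made-up R^2 values:

```python
def adjusted_r_squared(r2, n, k):
    """Adjust R^2 for k independent variables and n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n = 21
# Model with 2 predictors.
base = adjusted_r_squared(0.50, n, k=2)
# Adding a weak third predictor nudges R^2 up only slightly...
with_weak = adjusted_r_squared(0.505, n, k=3)

print(base, with_weak)  # ...so adjusted R^2 goes DOWN
```

Plain R^2 can never fall when a variable is added, which is why the adjusted version is the better guide when comparing multiple regression models.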
use R^2 for what variate regression?
use for bivariate regression
use Adjusted R^2 for what variate regression?
use for multiple regression
unstandardized regression coefficients are used in what?
used in simple/ bivariate regressions
large coefficients are good predictors for what type of regression coefficients?
good for unstandardized regression coefficients
