Psych 524
Andrew Ainsworth
Data Screening 1

Data entry check
• One of the first steps in proper data screening is to ensure the data are correct
• Check each person's entry individually
  • Makes sense for a small data set or when a proper data-checking procedure is in place
  • Can be too costly, so…
• At a minimum, the range of the data should be checked

Assumption Checking

Normality
• All of the continuous data we are covering need to follow a normal curve
• Skewness (univariate) – how asymmetric the distribution is (a longer tail on one side than the other)
  • The skewness statistic is output by SPSS; its standard error is $SE_{skewness} = \sqrt{6/N}$
  • $Z_{skewness} = S_{skewness} / SE_{skewness}$; $|Z_{skewness}| > 3.2$ indicates a violation of the skewness assumption
• Kurtosis (univariate) – how peaked the data are
  • The kurtosis statistic is output by SPSS; its standard error is $SE_{kurtosis} = \sqrt{24/N}$
  • $Z_{kurtosis} = S_{kurtosis} / SE_{kurtosis}$; $|Z_{kurtosis}| > 3.2$ indicates a violation of the kurtosis assumption
  • For most statistics the skewness assumption is more important than the kurtosis assumption

Skewness and Kurtosis

Outliers
• Technically, an outlier is a data point outside of your distribution; it is potentially detrimental because it may have an undue effect on the distribution
• Univariate (brains in Arc)
  • Should always check that the data are coded correctly
  • Two ways of looking at it
    • A data point is an outlier if it is disconnected from the rest of the distribution
    • A data point is an outlier if it has a Z-score above 3.3
  • If there is a concern, run the analysis with and without the case to see whether it has any influence on the results
• Leverage – how far away a case is from the rest of the data
• Discrepancy – the degree to which a data point lines up with the rest of the data
• Influence – the amount of change in the regression equation (the Bs) when a case is deleted; calculated as a combination of leverage and discrepancy

Dealing w/ univariate outliers
• Once you find outliers
  • Look into the case to see if there are indicators that the case is not part of your intended sample
  • If this is true, delete the case
• Reduce the influence of the outlier
  • Move the value inward toward the rest of the distribution, while still leaving it extreme

Multivariate Outliers
• A subject's score may not be an outlier on any single variable, but on a combination of variables the subject is an outlier
• "Being a teenager is normal, making $50,000 a year is normal, but a teenager making $50,000 a year is a multivariate outlier."
• Mahalanobis distance – a measure of deviance from the centroid (the center of the multivariate distribution, created by the means of all the variables)
• Mahalanobis distances follow a chi-square distribution
  • $\chi^2$ with df = number of variables
• Look up the critical value (with α = .001); if a participant's MD is above the CV, the participant is a multivariate outlier
• If multivariate outliers are found, there is not much to do except delete the case

Linearity
• Relationships among variables are linear in nature; this is an assumption in most analyses
• Example: resptran in Arc

Homoscedasticity (geese in Arc)
• For grouped data this is the same as homogeneity of variance
• For ungrouped data – the variability of one variable is the same at all levels of another variable (no variance interaction)

Multicollinearity/Singularity
• If the correlation between two variables is excessive (e.g. .95), this represents multicollinearity
• If the correlation is 1, you have singularity
• Multicollinearity/singularity often occurs in data because one variable is a near duplicate of another (e.g. variables used plus a composite of the
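
The skewness and kurtosis checks from the Normality notes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the statistics are computed from simple population moments, so they will differ slightly from the bias-corrected values SPSS prints. The standard errors sqrt(6/N) and sqrt(24/N) and the 3.2 cutoff come from the notes; applying the cutoff to the absolute value of Z is an assumption made here so that negative skew is also caught.

```python
import math


def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (assumed formulas, not SPSS's)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal curve
    return skew, excess_kurtosis


def normality_z_tests(values, cutoff=3.2):
    """Z = statistic / SE, with SE_skew = sqrt(6/N) and SE_kurt = sqrt(24/N)."""
    n = len(values)
    skew, kurt = skew_and_excess_kurtosis(values)
    se_skew = math.sqrt(6.0 / n)
    se_kurt = math.sqrt(24.0 / n)
    z_skew = skew / se_skew
    z_kurt = kurt / se_kurt
    return {
        "se_skew": se_skew,
        "se_kurt": se_kurt,
        "z_skew": z_skew,
        "z_kurt": z_kurt,
        # Assumption: the cutoff is applied to |Z| (both tails).
        "skew_violation": abs(z_skew) > cutoff,
        "kurt_violation": abs(z_kurt) > cutoff,
    }


if __name__ == "__main__":
    # Hypothetical right-skewed scores: many low values and one very high one.
    print(normality_z_tests([0] * 99 + [100]))
```

Because both standard errors shrink as N grows, even mild skewness can exceed the cutoff in very large samples, so it is worth reading the Z values alongside a histogram.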
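
The Z-score rule for univariate outliers (a Z-score above 3.3) can be sketched as follows. The function name and the example scores are hypothetical, the sample standard deviation (n - 1 denominator) is an assumption, and the cutoff is applied to the absolute Z-score so that extreme low values are flagged as well.

```python
import statistics


def flag_univariate_outliers(values, cutoff=3.3):
    """Return the indices of cases whose |Z-score| exceeds the cutoff."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return [
        index
        for index, x in enumerate(values)
        if abs((x - mean) / sd) > cutoff
    ]


if __name__ == "__main__":
    # Hypothetical scores: fifty ordinary cases plus one extreme case.
    scores = [48, 49, 50, 51, 52] * 10 + [120]
    flagged = flag_univariate_outliers(scores)
    print("flagged case indices:", flagged)

    # "Run with and without": compare the mean with the flagged cases removed.
    kept = [x for i, x in enumerate(scores) if i not in flagged]
    print("mean with:", statistics.mean(scores))
    print("mean without:", statistics.mean(kept))
```

In very small samples no case can reach a Z-score of 3.3 at all, because the outlier itself inflates the standard deviation, so this rule is only informative with a reasonably large N.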
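
The Mahalanobis distance check for multivariate outliers can be illustrated for the two-variable case, where the covariance matrix can be inverted by hand. This is a minimal sketch, not the SPSS procedure: the function names and the age/income data are hypothetical, and it works with the squared distance, which is the quantity compared against chi-square. For df = 2 the chi-square critical value has the closed form -2 ln(α), which is about 13.82 at α = .001; with more variables the critical value would have to come from a table or a statistics library.

```python
import math


def mahalanobis_sq_2d(xs, ys):
    """Squared Mahalanobis distance of each (x, y) case from the centroid."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Sample variances and covariance (n - 1 denominator).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular (r = 1 or -1)")
    distances = []
    for x, y in zip(xs, ys):
        dx, dy = x - mx, y - my
        # d' S^-1 d, written out with the closed-form 2x2 inverse.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def flag_multivariate_outliers_2d(xs, ys, alpha=0.001):
    """Indices of cases whose squared distance exceeds the df = 2 critical value."""
    critical_value = -2.0 * math.log(alpha)  # valid for df = 2 only
    distances = mahalanobis_sq_2d(xs, ys)
    return [i for i, d in enumerate(distances) if d > critical_value]


if __name__ == "__main__":
    # Hypothetical data: income rises with age, plus one 16-year-old earning $50,000.
    ages = [15 + i for i in range(40)] + [16]
    incomes = [5000 + 1500 * i + ((7 * i) % 5 - 2) * 500 for i in range(40)] + [50000]
    print(flag_multivariate_outliers_2d(ages, incomes))
```

In the example data neither the age of 16 nor the income of $50,000 falls outside the range of the other cases; it is the combination that lies far from the centroid once the age-income correlation is taken into account.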
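
The correlation cutoffs in the Multicollinearity/Singularity notes can be screened with a plain Pearson correlation. The sketch below is an assumption-laden illustration: the function names and example scores are hypothetical, .95 is used as the cutoff only because the notes give it as an example, and the check is applied to the absolute correlation so that strong negative relationships are also caught.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def classify_pair(xs, ys, cutoff=0.95):
    """Label a pair of variables by the size of their correlation."""
    r = abs(pearson_r(xs, ys))
    if math.isclose(r, 1.0):
        return "singularity"
    if r >= cutoff:
        return "multicollinearity"
    return "ok"


if __name__ == "__main__":
    # Hypothetical scores for one variable and three candidate partners.
    x = [1, 2, 3, 4, 5]
    print(classify_pair(x, [2, 4, 6, 8, 10]))  # exact rescaling of x
    print(classify_pair(x, [1, 2, 3, 4, 6]))   # near duplicate of x
    print(classify_pair(x, [2, 1, 4, 3, 5]))   # related but distinct
```

A pairwise screen like this only catches redundancy between two variables at a time; a composite built from several variables can be redundant with the set as a whole without any single pairwise correlation reaching the cutoff.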


CSUN PSY 524 - Data Screening 1
