Chapter 12: Preparation of Data Analysis

Here is the situation so far:
- Managers have problems
- Researchers come to help
- Surveys to get info
- Did their sample sizes
- Fielded the studies
- Questionnaires have been returned

Now we will talk about the data analysis: extracting information from the numbers given.

Big data
- Computers are collecting so much information from online activity that the data is being called "big data"
- Talking about lots of petabytes of data; so much data to look at
- Suffering from data overload

Steps in Data Preparation

1. Editing
   Data is going to come in in a lot of different ways.
   Types of data editing:
   1. Personal editing: a human manually checks the questionnaire
   2. Computer editing: computer programs written to look for possible errors (the computer alerted us that the bubbles weren't bubbled in dark enough)
   3. Field editing: done out in the field, with a supervisor doing the editing
   4. Office editing: everything is just shipped off to another office
   Looking for completeness, legibility, consistency, and accuracy.
   Fundamental issue here: GET THE ERRORS OUT OF THE DATA.
   Types of errors:
   - Interviewer error: interviewer marks an incorrect response category (1 vs. 2)
   - Respondent error: inconsistent answers
   - Wrong informant: survey completed by someone other than the intended respondent (sisters did the survey for mom)
   - Return to sender: not the correct address
   - Illegible writing: cannot be read
   - Incomplete responses: VERY COMMON (taking a test on the scantron and you skip numbers 7 and 8, or Christmas-treeing a scantron)
   - Damaged measuring instrument: portions are missing or damaged (ripped, or lemonade spilt on them)
   - Apparently confused respondent: individual didn't understand the directions; HAVE TO SPOT VISUALLY
   - Lack of variance among responses: artificial consistency (bubbled in the same answer for all the questions)
   - Lack of consistency among responses: provides answers that are contradictory (will put the same question in twice and see if the respondent answered the same)
   - Late responses: responses arrive after the cutoff date
   - No valid response: missing data point
   If there are enough problems with a questionnaire, we will toss it out entirely.

2. Coding
   Once the data has been edited, it must be put into a form that the computer will read.
   - Close-ended questions are easy to code
   - The coding system must be both mutually exclusive (a response can only go into one category) and collectively exhaustive (every possible response has a category) (example in PowerPoint, page 4)
   - Coding missing responses: give missing values a number
   - Coding open-ended questions: have to code all of the written material and turn all the information into numbers so the computer can read it. Must create categories for the computer to read. The process is highly subjective (example in PowerPoint, page 5)
   - Precoded questionnaires: the numbers are already there. Can clutter up the questionnaire
   - Codebook: a copy of the questionnaire that shows how each response is coded. Blueprint for properly coding the data
   - The codebook includes the variable name: the name of the variable to be coded (male column and female column; 1 = male and 0 = female)
   - Translating scratch marks into numbers

3. Entering data
   Physical process of getting numbers into the computer.
   Common problems:
   1. Transposing of numbers: people key-punching numbers in the old-fashioned way. Will have more than one person enter the data so they can spot inconsistencies
   2. Entering a number outside of the code

4. Tabulating data
   Tabulation consists of arranging data in tabular form.
   - Simple tabulation: one-way tabulation, like age (example in PowerPoint, page 7). A frequency distribution. DESCRIPTION: a snapshot of the data
   - Cross tabulation: two-way frequency tabulation, like age and income (example in PowerPoint, page 7). Shows the RELATIONSHIP between variables
   - Why would we have a two-way tabulation? Real-world examples:
     - Know grade and gender for a test: who did better, male or female?
     - Dr Pepper drink ads for men only: a study must have been done to show that men like this drink more than women do

5. Reviewing tabulation
   - Even when coding, editing, and data entry are carefully done, ERRORS CAN STILL OCCUR
   - A frequency distribution can check for some data entry errors
   - A frequency distribution shows values for each of the category responses, so we can spot the errors

Chapter 13: Descriptive Analysis

Percentage of major religions (on slide from PowerPoint)
- Turned a frequency distribution into a graph/chart
- This is a descriptive graph; graphs have more of an impact on people than just words
- This graph is getting into the idea of collecting data

Statistical analysis technique
Statistics are used to describe the sample and to make inferences about a population from a sample.

Statistical methods:
1. Descriptive statistics
2. Inferential statistics
   - Differences
   - Relationships: correlation and regression analysis (correlation is a test of 2 variables; regression can examine more than two)

Questions addressed by descriptive statistics:
1. What is the average income of the sample?
2. How old is the average employee?
3. How different are the ages?
4. How spread out is the income data?

Descriptive Statistics
1. Central tendency: where do people congregate?
2. Dispersion: how much spread is in the data?
3. Shape of the distribution: is it a funny shape?

Central Tendency
A single number used to represent a group of numbers.
3 measures of central tendency: arithmetic mean, median, and mode.
- Arithmetic mean: the average; can be VERY misleading. Sum of values / number of values
- Median: the middle number. For an odd number of values there is an exact middle number; for an even number of values you have to take the average of the two middle numbers. PREFERRED OVER THE MEAN OR AVERAGE
  Characteristics of the median:
  - Positional average
  - Not defined algebraically
  - Sometimes can't be computed exactly
  - Centrally located
- Mode: the value that occurs most frequently in the distribution
  - Bimodal or multimodal data: a data set that has two or more modes
  Characteristics of the mode:
  - Highest frequency in a set of values
  - Not affected by extreme values
  - Can't be computed from continuous data
Examples in PowerPoint.

Dispersion
Describes the dissimilarity or variability of responses.
Typical measures: frequency distribution, range, standard deviation, and variance.
- Range: lowest value and highest value
- Used for any type of scale; data can be characterized by its range and how spread out it is
- Raw data: data not organized numerically
- Array: data arranged in either ascending or descending order
- Frequency array: repeating values arranged together (example in PowerPoint)
- Frequency distribution: table showing data grouped by quantity and the frequency of each group (example in PowerPoint). Describes how spread out the data is
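
To make the central tendency and dispersion measures above concrete, here is a minimal Python sketch. It is only an illustration, not an example from the PowerPoint: the function name describe and the list of ages are hypothetical, and the standard deviation is computed as a population standard deviation, which is an assumption since the notes do not say which version the course uses.

```python
from collections import Counter
from statistics import mean, median, multimode, pstdev


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    array = sorted(values)  # "array": raw data arranged in ascending order
    return {
        "mean": mean(array),                # sum of values / number of values
        "median": median(array),            # middle value, or average of the two middle values
        "modes": multimode(array),          # every value tied for the highest frequency
        "range": array[-1] - array[0],      # highest value minus lowest value
        "std_dev": pstdev(array),           # population standard deviation (assumed)
        "frequency": dict(Counter(array)),  # frequency distribution: value -> count
    }


# Hypothetical raw data: ages of eight respondents
ages = [22, 25, 25, 31, 40, 25, 31, 58]
summary = describe(ages)

for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up sample the single 58-year-old pulls the mean (32.125) above the median (28.0), which is the sense in which the notes call the mean misleading. The frequency entry doubles as a data-entry check: a value outside the valid codes would show up as its own row.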