Chapter 11

Editing
- Definition: the inspection and correction of the data received from each element of the sample or census.
- The basic purpose of editing is to make certain that the raw data meet minimum quality standards.
- During the editing process you must also decide what to do about cases with incomplete answers.
  o If more than half of the answers are missing, drop that case entirely.

Coding
- Definition: the process of transforming raw data into symbols.
- Most often the symbols are numerals, because computers can easily handle them.
- The task is to transform respondents' answers, or other information to be coded, into numbers representing the answers.
- Coding closed-ended items
  o In descriptive research, most of the items included in a questionnaire are likely to be closed-ended.
  o Closed-ended means that the question provides a limited number of response categories and asks the respondent to choose the best response.
  o When only a single response is possible, use one variable for the question and simply assign a code to each response. Ex: 1 = female, 2 = male.
  o When respondents can indicate more than one answer for a given question, create one variable per possible answer (for example, six variables to represent six possible answers).
  o Record a 1 if the respondent selected a response and a 0 if she didn't.
- Coding open-ended items
  o Coding factual open-ended items
    - Example of a factual question: What year were you born?
    - These types of questions are coded by their actual response (here, the actual year).
  o Coding exploratory open-ended items
    - The first step is to go through each questionnaire and highlight each separate response given by each individual.
    - Next, specify the categories or classes into which the responses are to be placed.
    - The categories must be mutually exclusive and exhaustive, so that every open-ended response logically falls into one category.
    - Each response identified in the first step must be given the code number for one of the categories developed in the second step.
    - Multiple coders help reduce bias in the interpretation
      of the different responses.
    - When each coder has coded all the responses, the coders meet to compare the results, discuss differences in the codes assigned to particular responses, and assign a final code for each response.

Building the data file
  o To use a computer to analyze the data, the codes representing respondents' answers must be placed in a data file that the computer can read.
  o When the process is automatic, the data are normally stored in spreadsheet format and can be downloaded for further analysis.
  o Regardless of how the data were entered, it helps to visualize the input as a multiple-column record, where columns represent different variables based on items from the questionnaire and rows represent different respondents.
  o The codebook
    - Definition: a document that contains explicit directions about how data from data collection forms are coded into the data file.
    - At minimum, the codebook must contain:
      - The variable name to be used in statistical analyses for each variable included in the data file
      - A description of how each variable is coded
      - An explanation of how missing data are treated in the data file
    - The codebook is a map that helps the researcher navigate from completed questionnaires to the data file.

Cleaning the data
- Blunders: office errors that occur during editing, coding, or (especially) data entry when done by hand.
  o Most frustrating because they are caused by simple carelessness.
  o A blunder can be seen by performing a frequency count.
  o A frequency count tells us all of the different responses coded for a variable, along with how many cases responded in each way.
  o Blunders can be located by:
    - Examining frequency distributions on all variables
    - Checking a sample of questionnaires against the data file
    - Double entry of data, in which data are entered into two separate data files and then compared for discrepancies (preferred)
    - Optical scanning, which can be used to read responses
- Double entry requires the data to be entered by two separate people in two separate data files, and then the
  data files to be compared for discrepancies.
  o This approach would likely provide the cleanest data file.
  o This technique requires greater resources: time, effort, money.
- Optical scanning takes information directly from the data collection form and reads it into the data file.

Handling missing data
- Item nonresponse: a source of nonsampling error that arises when a respondent agrees to an interview but refuses, or is unable, to answer specific questions.
- Forcing answers will likely lead to:
  o Response error, when respondents simply choose a response so that they can get on with the survey
  o Nonresponse error, when individuals become frustrated and simply terminate the process
- Possible strategies
  o Eliminate the case with the missing items from all further analysis.
    - This extreme strategy results in a pure data set with no missing information at all.
    - This strategy excludes data that may be perfectly useful for some analyses.
  o Eliminate the case with the missing item only in analyses using that variable.
    - When using this approach, you'll need to continually report the number of cases on which an analysis is based, because the sample size won't be constant across analyses.
    - Advantage: all available data are used for each analysis.
  o Substitute values for missing items.
    - Substitute values based on responses to other related items, or by determining the mean, median, or mode for the variable.
    - The substitution of values makes maximum use of the data, because all the reasonably good cases are used.
    - It involves more work for the analysts and has some potential for bias.
  o Contact the respondent again.
    - This approach is especially applicable if it appears that the respondent simply missed the item altogether.

Chapter 12

Introduction
- Data analysis hinges on two considerations about the variable to be analyzed:
  o Will the variable be analyzed in isolation (univariate analysis) or in relationship to one or more other variables (multivariate analysis)?
  o What level of measurement (nominal, ordinal, interval, ratio) was used to measure the
    variable?

Basic univariate statistics: categorical measures
  o Categorical measures: a commonly used expression for nominal and ordinal measures.

Frequency analysis
  o Definition: consists of counting the number of cases that fall into the various response categories.
  o Commonly used to report the overall results of marketing research studies.
  o Percentages, along with raw counts, should always be included in a frequency analysis.
  o Percentages should be rounded to whole numbers.

Other uses for frequencies
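As a rough illustration of the frequency analysis described in these notes, the sketch below counts the cases in each response category and reports raw counts alongside whole-number percentages. Everything in it is assumed for illustration: the function name `frequency_table`, the sample data, the use of `None` to mark a missing answer, and the choice to base percentages on valid (non-missing) cases. The 1 = female, 2 = male coding is the example from Chapter 11.

```python
from collections import Counter

# Hypothetical codebook entry, reusing the coding example from the notes.
GENDER_LABELS = {1: "female", 2: "male"}

# Hypothetical coded responses. None marks item nonresponse, and the
# stray 3 stands in for a data-entry blunder (it is not in the codebook).
gender = [1, 2, 1, 1, 2, 1, 2, 1, 3, 1, 2, None, 1, 2, 1, 2, 1, 2, 1, 2, 1]


def frequency_table(codes, labels):
    """Return ([(code, label, count, percent), ...], n_missing).

    Percentages are based on valid cases only (an assumption of this
    sketch) and are rounded to whole numbers.
    """
    valid = [c for c in codes if c is not None]
    n_valid = len(valid)
    counts = Counter(valid)
    rows = []
    for code in sorted(counts):
        # Codes absent from the codebook are flagged rather than dropped.
        label = labels.get(code, "not in codebook")
        percent = round(100 * counts[code] / n_valid)
        rows.append((code, label, counts[code], percent))
    return rows, len(codes) - n_valid


rows, n_missing = frequency_table(gender, GENDER_LABELS)
for code, label, count, percent in rows:
    print(f"{code}  {label:<16} n = {count:>2}  {percent:>3}%")
print(f"missing: {n_missing}")
```

Any row labeled "not in codebook" is a candidate blunder to check against the original questionnaire, which is how a frequency count helps with data cleaning. Note that Python's built-in `round` rounds exact halves to the nearest even number, so a different rounding rule may be preferred when reporting.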