Business Statistics Exam 1, October 6, 2010

Chapter 2
Descriptive Statistics: coping with numbers. Draw a picture (graphs, charts, etc.). Calculate a few numbers which summarize the data (mean, median, percentile).
Inferential Statistics: drawing conclusions based on a sample of a population.
Variable: the aspect or characteristic that differs from subject to subject (individual to individual). Ex: age, sex, major.
Data: the values of the variables. Ex: 20, male, engineering.
Quantitative (numerical) Variable: numbers, measurements. Ex: income, age, height.
Discrete Variables: there is a natural gap between values. Ex: number of kids (you can't have 2.5 kids).
Continuous Variables: the values can be arbitrarily close together. Ex: weight, height, age.
Qualitative (categorical) Variables: classifying each observation. Ex: sex, year, major.
Ordinal Variables: categories have a natural ordering. Ex: year in school, grades, preference (strongly agree, agree, etc.).
Nominal Variables: categories that have no natural ordering. Ex: major, eye color, state.
Identifier Variable: when there are as many categories as individuals and only one individual in each category. Ex: student ID number.
Interval Data: no meaningful zero point; can't multiply or divide, but the difference between two values is meaningful.
Ratio Data: meaningful zero point; can multiply and divide. Ex: income, weight, height.
Time Series Data: ordered data values across time. Ex: temperature; data from one city over multiple years.
Cross-Sectional Data: data values observed at a single point in time. Ex: one year of data from multiple cities.

Chapter 3
Bias: sample does not represent population. Generalizations are no longer valid; conclusions may no longer be true.
Selection Bias: sample does not represent the entire population; systematic tendency to exclude one kind of individual from the survey.
Non-response Bias: subjects do not answer or skip questions.
Response Bias: subjects lie; interviewer affects the answer of the subject.
How to get rid of biases: randomize.
Sampling Error: sample-to-sample variation; different
results from different random samples.
Undercoverage: some proportion of the population is not sampled at all, or has a smaller representation in the sample than it has in the population. Always possible. Ex: mail survey.
Population: the entire group of individuals in which we are interested but can't usually assess directly.
Parameter: a number describing a characteristic of the population.
Sampling Frame: a list of individuals from which the sample is drawn.
Sample: the part of the population we actually examine and for which we do have data. Sample size is what matters, NOT the fraction of the population.
Statistic: a number describing a characteristic of a sample.

Non-statistical Sampling Techniques
Convenience: collected in the most convenient manner for the researcher. Ex: reporter on street. Bias: opinions limited to individuals present (selection bias).
Voluntary: individuals choose to be involved. Ex: Internet poll. Bias: non-response bias; sample design favors a particular outcome.

Statistical Sampling Techniques
Simple Random Sampling (SRS): every possible sample of a given size has an equal chance of being selected. Ex: draw names out of a hat, computer random number generator.
Stratified Random Sampling: divide population into homogeneous subgroups (strata) according to some characteristic (ex: gender, income level, age); select a simple random sample from each subgroup; combine samples from subgroups into one.
Cluster Sampling: divide population into several heterogeneous clusters, each representative of the population; select a simple random sample of clusters; combine selected clusters into one group.
Systematic Random Sampling: select every nth individual, starting from a randomly chosen individual.
Multistage Samples: sampling schemes that combine several methods.
Census: when every member of the population is included in the sample.

Survey Design: define the issue; define the population of interest; develop survey questions; pre-test the survey; determine the sample size and sampling method; select sample and administer the survey.
Types of questions:
Closed-end: select from a list.
Open-end: write any response.
Demographic: about personal characteristics.

Chapter 4
Frequency Table: shows the count (number of occurrences) for each category.
Relative Frequency Table: shows percentages of the values in each category.
Cross-Classification Table (Contingency Table or Two-Way Table): summarizes the relationship between two categorical variables, with row and column totals.

Categorical Data
Bar Graph: displays the distribution of a categorical variable, showing the counts for each category next to each other for easy comparison.
Relative Frequency Bar Graph: replaces the counts with relative frequency. Categories cannot overlap.
Segmented Bar Graph: treats each bar as the whole and divides it proportionally into segments corresponding to the percentages in each group. Is similar to comparing multiple pie charts.
Pie Chart: shows the whole group broken into several categories, displayed in proportion to the fraction of the whole in each category. Categories cannot overlap.
Area Principle: the area occupied by a part of the graph should correspond to the magnitude of the value it represents. Perspective (3-dimensional) distorts the graph.
Contingency Table: shows how the individuals are distributed along each variable depending on (contingent on) the value of the other variable.
Marginal Distributions: the frequency distribution of either one of the variables; can be viewed in the right or bottom margins of the contingency table.
Conditional Distribution: shows how one variable is affected by the other one by looking at only one row (or column) of a contingency table. Can compare conditional distributions in pie charts.
Independent Variables: when the distribution of one variable is the same for all categories of another, there is no association between the variables.
Simpson's Paradox: combining percentages over several categories can give different results than looking at the categories' percentages separately. This is because lurking variables can distort the total percentages, but when broken down we
are able to see the true data. Ex: the lurking variable in the hospital example is patient condition; the total percentage may show that Hospital B had a lower death percentage, but Hospital A had more patients in poor condition.

Chapter 5
Histogram: plots the bin counts as the height of the bars. Only for continuous data.
1. Determine the number of bins and their boundaries (class width).
2. Determine the frequency in each bin. If a data value falls on an endpoint, it goes into the upper bin.
When looking at histograms, notice shape, center
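
The two histogram steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bin_counts`, the starting point, the class width, and the ages are all made up for the example and do not come from the course. The point is the endpoint rule: each bin includes its lower boundary and excludes its upper one, so a value sitting exactly on a boundary is counted in the upper bin.

```python
import math

def bin_counts(values, start, width, num_bins):
    """Count how many values fall in each of num_bins equal-width bins.

    Bin i covers [start + i*width, start + (i+1)*width), so a value that
    falls exactly on a boundary is counted in the upper bin.
    """
    counts = [0] * num_bins
    for v in values:
        i = math.floor((v - start) / width)  # index of the bin containing v
        if not 0 <= i < num_bins:
            raise ValueError(f"{v} is outside the bin range")
        counts[i] += 1
    return counts

# Made-up ages, for illustration only.
ages = [18, 19, 20, 20, 22, 25, 29]

# Step 1: choose the bins (here 15-20, 20-25, 25-30, class width 5).
# Step 2: count the frequency in each bin.
counts = bin_counts(ages, start=15, width=5, num_bins=3)
print(counts)  # [2, 3, 2]: both 20s land in the 20-25 bin, 25 lands in 25-30
```

Each count is then the height of the corresponding bar.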
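
Going back to the Simpson's Paradox hospital example from Chapter 4, a small numeric sketch may help show how the total percentage and the broken-down percentages can disagree. The counts below are hypothetical, invented purely for illustration (they are not the figures from the course example); they are chosen so that Hospital A treats many more patients in poor condition.

```python
# Hypothetical (deaths, patients) counts, invented for illustration only.
hospitals = {
    "A": {"good": (6, 600), "poor": (57, 1500)},
    "B": {"good": (8, 600), "poor": (8, 200)},
}

def overall_rate(groups):
    """Death rate with all patient conditions combined."""
    deaths = sum(d for d, _ in groups.values())
    patients = sum(n for _, n in groups.values())
    return deaths / patients

# Total percentages: patient condition (the lurking variable) is ignored.
overall = {name: overall_rate(groups) for name, groups in hospitals.items()}

# Percentages broken down by patient condition.
by_condition = {
    name: {cond: d / n for cond, (d, n) in groups.items()}
    for name, groups in hospitals.items()
}

for name in hospitals:
    print(name, f"overall {overall[name]:.1%}",
          {cond: f"{r:.1%}" for cond, r in by_condition[name].items()})
```

With these made-up counts, Hospital B has the lower death percentage in total, yet Hospital A has the lower death percentage within each patient condition. The reversal comes from Hospital A having far more patients in poor condition.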