MSU STAT 421 - STAT 421 Chapter 1

- Pages:
- 15
- School:
- Montana State University - Bozeman
- Course:
- Stat 421 - Probability Theory

Chapter 1

1 K M Introduction, p. 1

What is statistics? It consists of three major areas:

- Data Collection: sampling plans and experimental designs.
- Descriptive Statistics: numerical and graphical summaries of the data collected from a sample.
- Inferential Statistics: estimation, confidence intervals, and hypothesis testing of parameters of interest.

Statistical procedures are part (steps 2-5 below) of the Scientific Method, first espoused by Sir Francis Bacon (1561-1626), who wrote that to learn the secrets of nature involves collecting data and carrying out experiments. The modern methodology:

1. Observe some phenomenon.
2. State a hypothesis explaining the phenomenon.
3. Collect data.
4. Test: Does the data support the hypothesis?
5. Conclusion: If the test fails, go back to step 2.

Application of statistical thinking does not include whining or emotional arguments. If you encounter a scientific claim that you disagree with, scrutinize the steps of the scientific method used.

"Statistics don't lie, but liars do statistics." - Mark Twain

What is mathematical statistics? The study of the theoretical foundation of statistics.

What is probability theory? The theoretical foundation of statistics.

POPULATION vs. SAMPLE

Individuals (subjects, units): The objects from which data is collected. Individuals may be people, places, animals, things, even time periods.

Population: The entire group of individuals, which can be either existent or conceptual, that we want information about. For example:
- all grizzly bears in Yellowstone National Park
- all G.E. light bulbs (made now and in the future)
- all tosses with a weighted die

Sample: A subset of the population from which data is collected. For example:
- 22 tagged grizzly bears in Yellowstone National Park
- 1 box of G.E. light bulbs
- 100 tosses with a weighted die

Typically it is unrealistic to obtain data from the entire population of interest. So one collects data from a sample and uses the sample results to draw conclusions about the population. This process is called Inference.

GOAL OF STATISTICS: To make an inference about a population based on information contained in a sample from that population, and to provide an associated measure of goodness for the inference.[1]

Variable: Any characteristic of an individual which can be measured.

Two Types of Variables

Categorical (or Qualitative): The possible values are categories. Beware: some category names are actually numbers, e.g., zip codes and dates.

Numerical (or Quantitative): The possible values are numbers, so that mathematical operations such as averaging make sense.

QUESTION: Categorical or Numerical? Individuals? Population of interest?
1. Lifetime of a battery
2. Type of battery
3. Distance to school
4. UPC

Two Types of Numerical Variables

Discrete: The possible values are isolated points on the number line. Discrete variables can be either
- finite, e.g., the number of beers left in a six-pack: 0, 1, 2, 3, 4, 5, or 6
- infinite, e.g., the number of full minutes until the next terrorist attack: 0, 1, 2, 3, ...

Continuous: The possible values are an interval on the number line, e.g., the distance between any two students in this classroom (in feet) is in the interval [0, 50): all real numbers between 0 and 50, including 0 and excluding 50.

QUESTION: Discrete or Continuous?
1. Amount of money on you
2. Your height
3. Reaction time
4. Number of children you have

[1] p. 2-3

1.1 The statistical software package R

The authors of your textbook offer a set of applets for use online at http://www.thomsonedu.com/statistics/book_content/0495110817_wackerly/applets/seeingstats/index.html. Feel free to use these. However, students last semester reported problems accessing these applets from the computers available in Math. I am not aware of Windows machines having any problems.

We'll be using the software package R whenever convenient. It's free, powerful, and ubiquitous.

1.1.1 Obtaining and Installing R

Here are some install instructions.[2]

1. Get on the Internet and go to the web address http://cran.r-project.org. This is the official site of the Comprehensive R Archive Network (CRAN). Bookmark this address.
Lots of information (manuals, answers to frequently asked questions, etc.) can be downloaded from this site.
2. The first box on this page is labeled "Download and Install R". In that box, click on the appropriate link. For example, Mac users will click on "MAC OS X", and Microsoft Windows users will click on the link "Windows". The rest of these instructions are specific to Windows users.
3. On the new page, click on the link named "base".
4. Click on the link "Download R 2.13.1 for Windows". Download the setup program R-2.13.1-win.exe to the hard drive on your computer.
5. Exit from your Internet browser and open Windows Explorer. Go to the folder in which you saved R-2.13.1-win.exe and run the program.
6. You will be guided through the installation by a Setup Wizard.

There are many excellent resources for using R. For example, check out http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf, written by John Verzani.

Special-purpose software routines are bundled as separate packages. Some packages are automatically downloaded when base R is downloaded. To download additional packages, start R on your PC and then click on the tab "Packages" among the tabs at the top of the screen. From the drop-down menu, click on "Install package(s)" and then choose the package(s) that you want to download. The packages that we may need to download for this course are the following:
- lattice
- pastecs

MASS is another package which we will be using. You do NOT need to download it because it is installed along with base R.

[2] Revised from what appears in Chapter 7.1 of Robert Boik's Course Notes, Statistics for Researchers, STAT401, Fall 2006.

1.1.2 Entering Data into R

A researcher is interested in determining whether adding a certain type of bacteria, called PC, helps increase the firmness of cottage cheese. Seven dairies make two identical batches of cottage cheese, one with and one without the bacteria PC. The results of the experiment are in a text file called dairy.txt, which is shown below and available at the course web site.

Farm  Treatment  Firmness
A     withPC     68
A     withoutPC  61
B     withPC     75
B     withoutPC  69
C     withPC     62
C     withoutPC  64
D     withPC     86
D     withoutPC  76
E     withPC     52
E     withoutPC  52
F     withPC     46
F     withoutPC  38
G     withPC     72
G     withoutPC  68

Text data files that are tab- or space-delimited can be imported into R (.xls can also be directly imported). Because spaces separate the columns, the names of the variables in the file cannot have spaces in them (e.g., don't use "Cheese Firmness"). To get dairy.txt into R, execute the following command:

D <- read.table("dairy.txt", header
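As a minimal sketch of the import step, the block below embeds the dairy data shown above as a string, so it runs without dairy.txt on disk. The object names dairy_text and dairy are hypothetical, and the arguments header = TRUE and stringsAsFactors = TRUE are assumptions made for illustration, not necessarily the arguments used in the course's own command.

```r
# Sketch only: the file contents are embedded as a string (hypothetical name
# dairy_text) so the example does not depend on dairy.txt being on disk.
dairy_text <- "Farm Treatment Firmness
A withPC 68
A withoutPC 61
B withPC 75
B withoutPC 69
C withPC 62
C withoutPC 64
D withPC 86
D withoutPC 76
E withPC 52
E withoutPC 52
F withPC 46
F withoutPC 38
G withPC 72
G withoutPC 68"

# Assumption: header = TRUE treats the first line as the variable names;
# stringsAsFactors = TRUE stores the text columns as categorical factors.
dairy <- read.table(text = dairy_text, header = TRUE, stringsAsFactors = TRUE)

# Inspect the structure: one row per batch, three variables.
str(dairy)

# Check the variable types discussed earlier in the chapter.
is.factor(dairy$Treatment)   # categorical variable
is.numeric(dairy$Firmness)   # numerical variable

# Mean firmness within each treatment group.
tapply(dairy$Firmness, dairy$Treatment, mean)
```

Read this way, Farm and Treatment come in as categorical variables and Firmness as a numerical variable, which matches the variable types defined earlier in the chapter. With the real file, the string would be replaced by the file name.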

