UCLA STATS 10 - Stats 10 Lab2


Anna Audler
Stats 10: Section 3A
UID: 304295761
TA: Luis Sosa
Lab #2 Batter Up

Question 1: According to the graph, there is a positive linear relationship: as the number of at-bats increases, the number of runs also tends to increase. However, the association between at-bats and runs is moderately weak, since the data are fairly scattered. As a result, our ability to predict the number of runs from the number of at-bats is not strong.

[Figure: Batting11 scatter plot of runs vs. at_bats]

[Figure: Batting11 scatter plot of runs vs. at_bats with least squares line: runs = 0.6305 at_bats - 2.79e+03; sum of squares = 123700; r^2 = 0.37]

Question 2: If I had to summarize the graph with a single line, I would place it so that it passes through the middle of the points, with roughly equal numbers of points above and below the line. With the movable line, I noticed that the “sum of squares” decreases as the line fits the data better; a lower sum of squares means that the line describes the data more accurately.

Question 3: The sum of squares for the “least squares line” is much smaller than for the line I chose, because Fathom places the line exactly where the sum of squares is at its minimum.

[Figure: Batting11 scatter plot of runs vs. at_bats with line: runs = 1.0335 at_bats - 5.03e+03; sum of squares = 160600]

Question 4: The residual plot suggests that a linear model is appropriate for the relationship between at-bats and runs, because the residuals fall both above and below zero with no apparent pattern.

Question 5: If a team manager saw the regression line and not the actual data, he would predict that a team with 5508 at-bats would score 684 runs.
This prediction underestimates the Cleveland Indians (704 runs; error +20) and overestimates the Los Angeles Angels (667 runs; error -17), Chicago White Sox (654 runs; error -30), and Florida Marlins (625 runs; error -59).

[Figure: Batting11 scatter plot of runs vs. at_bats with least squares line (runs = 0.631 at_bats - 2.79e+03; sum of squares = 123700; r^2 = 0.37) and residual plot against at_bats]

[Figure: Batting11 scatter plot of runs vs. at_bats with least squares line: runs = 0.6305 at_bats - 2.79e+03; r^2 = 0.37]

Question 6: Given my strong knowledge of baseball, I expected batting average to have a lower sum of squares than at-bats, because batting average is the ratio of a batter’s hits to official at-bats, which should be more closely associated with the number of runs scored. The batting average graph shows a more clearly linear relationship than the at-bats graph. In addition, the sum of squares for batting average is substantially lower than for at-bats, which means that batting average describes the association with runs scored better than the number of at-bats does.

Question 7: The slope of the regression equation for the at-bats graph is 0.6305, which means that each additional at-bat is associated with an average increase of about 0.63 runs. The slope of the regression equation for the batting average graph is 5.24e+03 runs per unit of batting average, which means that for every 0.01 increase in batting average, teams should expect about 52.4 more runs scored (5.24e+03 x 0.01).

Question 8: The R^2 value for at-bats is 0.37 and the R^2 value for batting average is 0.66. This means that the relationship between batting average and runs is stronger than the relationship between at-bats and runs. A higher R^2 value means that more of the variation in runs is explained by the variable on the x-axis, so we want R^2 to be as close to 1 as possible.
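The arithmetic behind Questions 5 and 7 can be reproduced with a short Python sketch. This is only an illustration: the variable and function names are made up for this example, and the coefficients are the rounded values printed on the plots, so evaluating the line directly at 5508 at-bats gives roughly 683 rather than the 684 quoted in the report (which presumably comes from the unrounded fit). The residuals use the report's figure of 684, and the slope for batting average is in runs per one full unit of batting average, so a 0.01 change corresponds to the slope times 0.01.

```python
# Least squares line for runs vs. at_bats, using the ROUNDED coefficients
# printed on the plot (runs = 0.6305 at_bats - 2.79e+03).
SLOPE_AT_BATS = 0.6305
INTERCEPT_AT_BATS = -2.79e3


def predict_runs(at_bats: float) -> float:
    """Predicted runs from the rounded at-bats line (illustrative helper)."""
    return SLOPE_AT_BATS * at_bats + INTERCEPT_AT_BATS


# Evaluating the rounded line at 5508 at-bats (close to, but not exactly,
# the 684 runs quoted in the report).
rounded_prediction = predict_runs(5508)

# Residual = observed - predicted, using the report's prediction of 684.
REPORTED_PREDICTION = 684
observed_runs = {
    "Cleveland Indians": 704,
    "Los Angeles Angels": 667,
    "Chicago White Sox": 654,
    "Florida Marlins": 625,
}
residuals = {
    team: runs - REPORTED_PREDICTION for team, runs in observed_runs.items()
}

# Question 7: the batting-average slope is per 1.0 of batting average,
# so scale it by 0.01 to get the expected change per 0.01.
SLOPE_BAT_AVG = 5.24e3
runs_per_hundredth = SLOPE_BAT_AVG * 0.01

for team, err in residuals.items():
    print(f"{team}: {err:+d}")
print(f"Rounded-line prediction at 5508 at-bats: {rounded_prediction:.1f}")
print(f"Change in runs per 0.01 of batting average: {runs_per_hundredth:.1f}")
```

A positive residual means the line underestimated the team's runs, and a negative residual means it overestimated them.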
[Figure: Batting11 scatter plot of runs vs. bat_avg with least squares line: runs = 5.24e+03 bat_avg - 643; sum of squares = 67850; r^2 = 0.66]

Question 9: Another variable that predicts runs scored better is total bases. The relationship is linear, the sum of squares is lower than that of batting average, and R^2 is 0.89, which means that there is a stronger association between total bases and runs scored. Total bases is therefore a better predictor of runs scored than batting average, which is not a surprise to me, since a base runner needs to advance around all 4 bases in order to score.

[Figure: Batting11 scatter plot of runs vs. totalbases with least squares line: runs = 0.424 totalbases - 242; sum of squares = 22070; r^2 = 0.89]

Question 10: According to the data for the newer variables, the researchers succeeded in finding better predictors of runs scored. For example, the variable “on-base plus slugging” (OPS) had a much stronger positive linear relationship, an even lower sum of squares, and an R^2 of 0.93, which is closer to 1 than the value for total bases (0.89) in the original data. This result makes sense because the more often a batter is on base and the more powerful a hitter he is, the more likely he is to score.

Summary Question: The topics covered in this lab correspond to the material in chapter 4 of the textbook, which includes residuals and linear models. The lecture topics covered included correlation, linear models, residuals, “best fit” meaning least squares, the least squares line, slope, R^2, and regression. Some of these concepts were covered in the previous homework assignment for chapter 4 and in the midterm review topics. Topics not included were how intercepts figure into the least squares line and the idea of extrapolation.

[Figure: NewBatting11 scatter plot of runs vs. OPS with least squares line: runs = 1.92e+03 OPS - 687; sum of squares = 12840; r^2 = 0.93]
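As a rough consistency check on the numbers quoted above, the sum of squares and R^2 reported for each model can be tied together. For a least squares line with an intercept, R^2 = 1 - SSE/SST, where SSE is the sum of squared residuals and SST is the total sum of squares of runs around its mean, so each plot implies SST = SSE / (1 - R^2). The Python sketch below applies this to the four reported pairs; the names are made up for this example, and because r^2 is printed to only two decimals the implied totals are approximate. The OPS model comes from the NewBatting11 collection, so it is assumed here, not confirmed by the report, that it uses the same runs values.

```python
# (sum of squared residuals, r^2) exactly as printed under each plot.
reported = {
    "at_bats": (123700, 0.37),
    "bat_avg": (67850, 0.66),
    "totalbases": (22070, 0.89),
    "OPS": (12840, 0.93),  # from NewBatting11; same response assumed
}


def implied_total_ss(sse: float, r2: float) -> float:
    """Total sum of squares implied by R^2 = 1 - SSE/SST."""
    return sse / (1.0 - r2)


implied = {
    name: implied_total_ss(sse, r2) for name, (sse, r2) in reported.items()
}

for name, sst in implied.items():
    print(f"{name:>10}: implied total sum of squares ~ {sst:,.0f}")
```

The implied totals land in the same general range for all four models, which is what one would expect if every model is explaining the same variation in runs; the spread between them is within what two-decimal rounding of r^2 can produce.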

