UCLA STATS 10 - Ch07 - D2491827

School name: University of California, Los Angeles

Course: Stats 10 - Introduction to Statistical Reasoning

Pages: 14

This preview shows page 1-2-3-4-5 out of 14 pages.

STAT 10, UCLA, Ivo Dinov

Slide 1
UCLA STAT 10: Introduction to Statistical Reasoning
Instructor: Ivo Dinov, Asst. Prof. in Statistics and Neurology
Teaching Assistants: Yan Xiong, Will Anderson, UCLA Statistics
University of California, Los Angeles, Winter 2002
http://www.stat.ucla.edu/~dinov/

Slide 2
Chapters 7-10: Lines in 2D (Regression and Correlation)
Vertical lines
Horizontal lines
Oblique lines
Increasing/decreasing
Slope of a line
Intercept
Math equation for the line? Y = αX + β, in general.

Slide 3
Chapters 7-10: Lines in 2D (Regression and Correlation)
Draw the following lines:
Y = 2X + 1
Y = -3X - 5
Line through (X1,Y1) and (X2,Y2).
Math equation for the line? (Y-Y1)/(Y2-Y1) = (X-X1)/(X2-X1).

Slide 4
Approaches for modeling data relationships: Regression and Correlation
There are random and nonrandom variables.
Correlation applies if both variables (X/Y) are random (e.g., the previous example of systolic vs. diastolic blood pressure, SISVOL/DIAVOL) and are treated symmetrically.
Regression applies when you want to single out one of the variables (response variable, Y) and use the other variable as a predictor (explanatory variable, X), which explains the behavior of the response variable, Y.

Slide 5
Causal relationship? Infant death rate (per 1,000) in 14 countries
[Figure: infant death rate plotted against % breast feeding at 6 months and against % access to safe water.]
Predict behavior of Y (response) based on the values of X (explanatory var.).
Strategies for uncovering the reasons (causes) for an observed effect.
Strong evidence (linear pattern) of death rate increase with increasing level of breastfeeding (BF)? Naïve conclusion: breast feeding is bad? But high rates of BF are associated with lower access to H2O.

Slide 6
Regression relationship = trend + residual scatter
[Figure (a): Sales/income; x-axis: Disposable income ($).]
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 1999.
Regression is a way of studying relationships between variables (random/nonrandom) for predicting or explaining the behavior of one variable (response) in terms of others (explanatory variables or predictors).

Slide 7
[Figure (b): Oxygen uptake; x-axis: Ventilation.]
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 1999.
Trend (does not have to be linear) + scatter (could be of any type/distribution)

Slide 8
[Figure (c): Liver lengths; x-axis: Gestational age (wk).]
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 1999.
Trend + scatter (fetus liver length in mm)
Change of scatter with age

Slide 9
Trend + scatter
Dotted curves (confidence intervals) represent the extent of the scatter.
[Figure 3.1.7: Displacement versus weight for 74 models of automobile. Panels: (a) Scatter plot, (b) With trend plus scatter; outliers marked; x-axis: Weight (lbs).]
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.

Slide 10
Looking vertically
The flatter line gives better prediction, since it approximately goes through the middle of the Y-range for each fixed x-value (vertical line).
[Figure 3.1.8: Educating the eye to look vertically. Panels: (a) Which line? (b) Flatter line gives better predictions.]
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.

Slide 11
Outliers: odd, atypical observations (errors, B, or real data, A)
[Figure 3.1.9: Scatter plot from the heart attack data; x-axis: Diastolic volume; points A and B marked.]
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.

Slide 12
A weak relationship
58 abused children are rated (by non-abusive parents and teachers) on a psychological disturbance measure.
How do we quantify a weak vs. strong relationship?
[Figure 3.1.10: Parent's rating versus teacher's rating for abused children.]
From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.

Slide 13
A note of caution!
In observational data, strong relationships are not necessarily causal. It is virtually impossible to conclude a cause-and-effect relationship between variables using observational data!

Slide 14
Essential Points
1. What essential difference is there between the correlation and regression approaches to a relationship between two variables? (In correlation: independent variables; in regression: the response variable depends on the explanatory variable.)
2. What are the most common reasons why people fit regression models to data? (To predict Y or to unravel reasons/causes of behavior.)
3. Can you conclude that changes in X caused the changes in Y seen in a scatter plot if you have data from an observational study? (No. There could be lurking variables, i.e., hidden effects/predictors that are also associated with the predictor X itself; time is often a lurking variable. It may also be that changes in Y cause changes in X, instead of the other way around.)

Slide 15
Essential Points
5. When can you reliably conclude that changes in X cause the changes in Y? (Only when controlled randomized experiments are used: levels of X are randomly distributed to available experimental units, or experimental conditions need to be identical for different levels of X; this includes time.)

Slide 16
Correlation Coefficient
Correlation coefficient (-1 <= R <= 1): a measure of linear association, or clustering around a line, of multivariate data.
The relationship between two variables (X, Y) can be summarized by (µX, σX), (µY, σY) and the correlation coefficient, R.
R = 1: perfect positive correlation (straight line relationship); R = 0: no correlation (random cloud scatter); R = -1: perfect negative correlation.
Computing R(X,Y) (standardize, multiply, average):
\[ R(X,Y) = \frac{1}{N-1} \sum_{k=1}^{N} \left( \frac{x_k - \mu_X}{\sigma_X} \right) \left( \frac{y_k - \mu_Y}{\sigma_Y} \right) \]
X = {x1, x2, …, xN}, Y = {y1, y2, …, yN}
(µX, σX), (µY, σY): sample mean / SD.
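The "standardize, multiply, average" recipe for R(X,Y) on Slide 16 can be sketched in Python. This is a minimal sketch, not the course's own code: the function name `correlation` and the sample data are hypothetical, and it assumes the sample SD (N - 1 denominator) with the final average also taken over N - 1, which is how the slide's "sample mean / SD" note is read here.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Correlation coefficient R(X, Y) via standardize, multiply, average."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mu_x, mu_y = mean(xs), mean(ys)
    # Assumption: sample SDs (N - 1 denominator), per "sample mean / SD".
    sd_x, sd_y = stdev(xs), stdev(ys)
    # Standardize each observation and multiply the paired standard scores.
    products = [((x - mu_x) / sd_x) * ((y - mu_y) / sd_y) for x, y in zip(xs, ys)]
    # Average the products, dividing by N - 1 to match the sample SDs.
    return sum(products) / (len(xs) - 1)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_up = correlation(x, [2 * v + 1 for v in x])     # points exactly on Y = 2X + 1
r_down = correlation(x, [-3 * v - 5 for v in x])  # points exactly on Y = -3X - 5
print(r_up, r_down)
```

Because the two hypothetical data sets lie exactly on the lines from Slide 3, the printed values should be very close to 1 and -1 respectively, up to floating-point rounding. A constant sample has SD 0, so this sketch would fail with a division by zero in that case.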
