# Pitt CS 2750 - Linear regression

## Linear regression


- Pages: 13
- School: University of Pittsburgh
- Course: CS 2750 - Machine Learning


CS 2750 Machine Learning, Lecture 8: Linear regression (cont.) and linear methods for classification.
Milos Hauskrecht, milos@cs.pitt.edu, 5329 Sennott Square

### Coefficient shrinkage

- The least squares estimates often have low bias but high variance.
- The prediction accuracy can often be improved by setting some coefficients to zero: this increases the bias but reduces the variance of the estimates.
- Solutions: subset selection, ridge regression, principal component regression.
- Next: ridge regression.

### Ridge regression

Error function for the standard least squares estimates:

$$J_n(\mathbf{w}) = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \mathbf{w}^T \mathbf{x}_i\right)^2$$

We seek:

$$\mathbf{w}^* = \arg\min_{\mathbf{w}} \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \mathbf{w}^T \mathbf{x}_i\right)^2$$

Ridge regression:

$$J_n(\mathbf{w}) = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \mathbf{w}^T \mathbf{x}_i\right)^2 + \lambda \lVert\mathbf{w}\rVert^2$$

where $\lVert\mathbf{w}\rVert^2 = \sum_{i=0}^{d} w_i^2$ and $\lambda \ge 0$. What does the new error function do?

- The term $\lambda \lVert\mathbf{w}\rVert^2$ penalizes non-zero weights with a cost proportional to the shrinkage coefficient $\lambda$.
- If an input attribute $x_j$ has a small effect on improving the error function, it is "shut down" by the penalty term.
- Inclusion of a shrinkage penalty is often referred to as regularization.

### Supervised learning

- Data: $D = \{d_1, d_2, \ldots, d_n\}$, a set of $n$ examples $d_i = \langle \mathbf{x}_i, y_i \rangle$, where $\mathbf{x}_i$ is an input vector and $y_i$ is the desired output (given by a teacher).
- Objective: learn the mapping $f: X \to Y$ such that $y_i \approx f(\mathbf{x}_i)$ for all $i = 1, \ldots, n$.
- Two types of problems:
  - Regression: $Y$ is continuous. Example: predicting a company's stock price from earnings and product orders.
  - Classification: $Y$ is discrete. Example: predicting a disease from temperature and heart rate.
- Today: binary classification problems.

### Binary classification

- Two classes: $Y = \{0, 1\}$.
- Our goal is to learn to classify correctly two types of examples: Class 0 (labeled as 0) and Class 1 (labeled as 1).
- We would like to learn $f: X \to \{0, 1\}$.
- Zero-one error (loss) function:

$$\mathrm{Error}_1(\mathbf{x}_i, y_i) = \begin{cases} 1 & \text{if } f(\mathbf{x}_i; \mathbf{w}) \ne y_i \\ 0 & \text{if } f(\mathbf{x}_i; \mathbf{w}) = y_i \end{cases}$$

- The error we would like to minimize: $E_{(\mathbf{x}, y)}\left[\mathrm{Error}_1(\mathbf{x}, y)\right]$.
- First step: we need to devise a model of the … (preview truncated)
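The ridge objective above has a closed-form minimizer: setting the gradient of $J_n(\mathbf{w})$ to zero gives $(X^T X / n + \lambda I)\,\mathbf{w} = X^T \mathbf{y} / n$. The sketch below, on hypothetical synthetic data (the function name and example values are illustrative, not from the lecture), solves this system and shows the shrinkage effect:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: minimizes
    (1/n) * sum_i (y_i - w^T x_i)^2 + lam * ||w||^2.
    Setting the gradient to zero gives the linear system
    (X^T X / n + lam * I) w = X^T y / n."""
    n, d = X.shape
    A = X.T @ X / n + lam * np.eye(d)
    b = X.T @ y / n
    return np.linalg.solve(A, b)

# Hypothetical synthetic data: two of five true coefficients are zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = np.array([2.0, -1.0, 0.0, 0.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=200)

w_ols = ridge_fit(X, y, lam=0.0)     # lam = 0 recovers ordinary least squares
w_ridge = ridge_fit(X, y, lam=10.0)  # large lam shrinks weights toward zero

# The penalty term reduces the overall weight magnitude:
assert np.linalg.norm(w_ridge) < np.linalg.norm(w_ols)
```

With `lam=0.0` the solution matches standard least squares; increasing `lam` trades a little bias for lower variance, as the slide describes.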
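The zero-one loss in the last slide simply counts misclassified examples; averaging it over the dataset estimates the expected error $E_{(\mathbf{x}, y)}[\mathrm{Error}_1]$. A minimal sketch (function name and sample labels are illustrative):

```python
import numpy as np

def zero_one_error(y_true, y_pred):
    """Mean zero-one loss: each example contributes 1 when the
    prediction f(x_i; w) differs from y_i, and 0 otherwise; the
    mean over the dataset estimates the expected error."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return np.mean(y_true != y_pred)

# Hypothetical example: predictions disagree on 2 of 5 labels.
print(zero_one_error([0, 1, 1, 0, 1], [0, 1, 0, 1, 1]))  # 0.4
```

Because this loss is piecewise constant (and hence not differentiable), classifiers are typically trained by minimizing a smooth surrogate instead, which motivates the models introduced next in the lecture.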
