Berkeley COMPSCI 188 - Lecture 10: Perceptrons

This preview shows pages 1, 2, 14, 15, 30, and 31 of the 31-page document.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Contents: CS 188: Artificial Intelligence Spring 2006; Announcements; Today; General Naïve Bayes; Example: Spam Filtering; Estimation: Laplace Smoothing; Estimation: Linear Interpolation; Real NB: Smoothing; Tuning on Held-Out Data; Spam Example; Confidences from a Classifier; Precision vs. Recall; Errors, and What to Do; What to Do About Errors?; Features; Feature Extractors; Generative vs. Discriminative; Some (Vague) Biology; The Binary Perceptron; Example: Spam; Binary Decision Rule; The Multiclass Perceptron; Example; The Perceptron Update Rule; Mistake-Driven Classification; Properties of Perceptrons; Issues with Perceptrons; Summary

CS 188: Artificial Intelligence, Spring 2006
Lecture 10: Perceptrons
2/16/2006
Dan Klein – UC Berkeley
Many slides from either Stuart Russell or Andrew Moore

Announcements

Office hours: Dan's W 3-5 office hours moved to F 3-5 (just this week).
Project 2: Out now.
Written 1 back (check in front).
Fill in your final exam time surveys! (in front)

Today

Naïve Bayes models:
- Smoothing
- Real-world issues
Perceptrons:
- Mistake-driven learning
- Data separation, margins, and convergence

General Naïve Bayes

This is an example of a naive Bayes model:
P(C, E_1, ..., E_n) = P(C) Π_i P(E_i | C)
Total number of parameters is linear in n!
[Figure: the naive Bayes graphical model, a class node C with children E_1, E_2, ..., E_n]

Example: Spam Filtering

Model: P(C | w_1, ..., w_n) ∝ P(C) Π_i P(w_i | C)
Parameters:
- P(C): ham: 0.66, spam: 0.33
- P(w | ham): the: 0.016, to: 0.015, and: 0.012, ..., free: 0.001, click: 0.001, ..., morally: 0.001, nicely: 0.001, ...
- P(w | spam): the: 0.021, to: 0.013, and: 0.011, ..., free: 0.005, click: 0.004, ..., screens: 0.000, minute: 0.000, ...

Estimation: Laplace Smoothing

Laplace's estimate: pretend you saw every outcome once more than you actually did:
P_LAP(x) = (c(x) + 1) / (N + |X|)
Can derive this as a maximum a posteriori estimate using Dirichlet priors (see cs281a).
(Running example: observed coin flips H, H, T.)

Laplace's estimate (extended): pretend you saw every outcome k extra times:
P_LAP,k(x) = (c(x) + k) / (N + k|X|)
What's Laplace with k = 0? k is the strength of the prior.
Laplace for conditionals: smooth each condition independently:
P_LAP,k(x | y) = (c(x, y) + k) / (c(y) + k|X|)

Estimation: Linear Interpolation

In practice, Laplace often performs poorly for P(X|Y):
- When |X| is very large
- When |Y| is very large
Another option: linear interpolation:
- Get the unconditional P(X) from the data
- Make sure the estimate of P(X|Y) isn't too different from P(X)
P_LIN(x | y) = α P_ML(x | y) + (1 − α) P_ML(x)
What if α is 0? What if α is 1?
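The two smoothed estimators above fit in a few lines of code. Here is a minimal Python sketch, assuming counts live in a Counter; the helper names laplace and interpolate are illustrative, not from the lecture:

```python
from collections import Counter

def laplace(counts, domain, k=1.0):
    """Add-k (Laplace) smoothing: pretend every outcome in the domain
    was seen k extra times. k = 0 recovers the maximum-likelihood
    estimate; larger k means a stronger (more uniform) prior."""
    n = sum(counts[x] for x in domain)
    return {x: (counts[x] + k) / (n + k * len(domain)) for x in domain}

def interpolate(counts_given_y, counts, domain, alpha):
    """Linear interpolation: P_LIN(x|y) = a*P_ML(x|y) + (1-a)*P_ML(x).
    alpha = 1 trusts the sparse conditional counts; alpha = 0 backs off
    entirely to the unconditional distribution."""
    n_y = sum(counts_given_y[x] for x in domain)  # assumes some data for y
    n = sum(counts[x] for x in domain)
    return {x: alpha * counts_given_y[x] / n_y + (1 - alpha) * counts[x] / n
            for x in domain}

# The slide's coin example: observed flips H, H, T.
flips = Counter({"H": 2, "T": 1})
print(laplace(flips, ["H", "T"], k=1))  # {'H': 0.6, 'T': 0.4}
```

Note that k and α are hyper-parameters rather than parameters, which is exactly why the slides below tune them on held-out data rather than on the training set.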
For even better ways to estimate parameters, as well as details of the math, see cs281a and cs294-5.

Real NB: Smoothing

For real classification problems, smoothing is critical... and usually done badly, even in big commercial systems.
New odds ratios:
- helvetica: 11.4, seems: 10.8, group: 10.2, ago: 8.4, areas: 8.3, ...
- verdana: 28.8, credit: 28.4, order: 27.2, <font>: 26.9, money: 26.5, ...
Do these make more sense?

Tuning on Held-Out Data

Now we've got two kinds of unknowns:
- Parameters: the probabilities P(X|Y), P(Y)
- Hyper-parameters, like the amount of smoothing to do: k, α
Where to learn?
- Learn parameters from training data
- Must tune hyper-parameters on different data. Why?
- For each value of the hyper-parameters, train and test on the held-out data
- Choose the best value and do a final test on the test data

Spam Example

Word      P(w|spam)  P(w|ham)  Tot Spam  Tot Ham
(prior)   0.33333    0.66666   -1.1      -0.4
Gary      0.00002    0.00021   -11.8     -8.9
would     0.00069    0.00084   -19.1     -16.0
you       0.00881    0.00304   -23.8     -21.8
like      0.00086    0.00083   -30.9     -28.9
to        0.01517    0.01339   -35.1     -33.2
lose      0.00008    0.00002   -44.5     -44.0
weight    0.00016    0.00002   -53.3     -55.0
while     0.00027    0.00027   -61.5     -63.2
you       0.00881    0.00304   -66.2     -69.0
sleep     0.00006    0.00001   -76.0     -80.5

(The "Tot" columns are running sums of log probabilities.)
P(spam | w) = 0.989

Confidences from a Classifier

The confidence of a probabilistic classifier:
- Posterior over the top label
- Represents how sure the classifier is of the classification
- Any probabilistic model will have confidences
- No guarantee confidence is correct
Calibration:
- Weak calibration: higher confidences mean higher accuracy
- Strong calibration: confidence predicts accuracy rate
What's the value of calibration?

Precision vs. Recall

Let's say we want to classify web pages as homepages or not.
- In a test set of 1K pages, there are 3 homepages.
- Our classifier says they are all non-homepages: 99.7% accuracy!
- Need new measures for rare positive events.
Precision: fraction of guessed positives which were actually positive.
Recall: fraction of actual positives which were guessed as positive.
Say we guess 5 homepages, of which 2 were actually homepages:
- Precision: 2 correct / 5 guessed = 0.4
- Recall: 2 correct / 3 true = 0.67
Which is more important in customer support email automation?
Which is more important in airport face recognition?
[Figure: Venn diagram of the guessed-positive and actual-positive sets]

Precision vs. Recall (continued)

Precision/recall tradeoff:
- Often, you can trade off precision and recall
- Only works well with weakly calibrated classifiers
To summarize the tradeoff:
- Break-even point: precision value when p = r
- F-measure: harmonic mean of p and r: F = 2pr / (p + r)
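To make these definitions concrete, here is a small sketch that computes precision, recall, and the F-measure, checked against the slide's homepage numbers (the function name is mine):

```python
def precision_recall_f(predicted, actual):
    """predicted, actual: parallel lists of booleans (True = positive).
    Returns (precision, recall, F-measure)."""
    tp = sum(p and a for p, a in zip(predicted, actual))  # true positives
    guessed = sum(predicted)   # everything we called positive
    real = sum(actual)         # everything that truly is positive
    p = tp / guessed if guessed else 0.0
    r = tp / real if real else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0  # harmonic mean of p and r
    return p, r, f

# Slide's example: guess 5 homepages, 2 of them correct, 3 real in total.
predicted = [True, True, True, True, True, False, False]
actual    = [True, True, False, False, False, True, False]
print(precision_recall_f(predicted, actual))  # ≈ (0.4, 0.67, 0.5)
```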
Errors, and What to Do

Examples of errors:

"Dear GlobalSCAPE Customer, GlobalSCAPE has partnered with ScanSoft to offer you the latest version of OmniPage Pro, for just $99.99* - the regular list price is $499! The most common question we've received about this offer is - Is this genuine? We would like to assure you that this offer is authorized by ScanSoft, is genuine and valid. You can get the . . ."

". . . To receive your $30 Amazon.com promotional certificate, click through to http://www.amazon.com/apparel and see the prominent link for the $30 offer. All details are there. We hope you enjoyed receiving this message. However, if you'd rather not receive future e-mails announcing new store launches, please click . . ."

What to Do About Errors?

Need more features: words aren't enough!
- Have you emailed the sender before?
- Have 1K other people just gotten the same email?
- Is the sending information consistent?
- Is the email in ALL CAPS?
- Do inline URLs point where they say they point?
- Does the email address you by (your) name?
Naïve Bayes models can incorporate a variety of features, but tend to do best in homogeneous cases (e.g. all features are word occurrences).

Features

A feature is a function which signals a property of the input. Examples:
- ALL_CAPS: value is 1 iff email in all caps
- HAS_URL: value is 1 iff email has a URL
- NUM_URLS: number of URLs in email
- VERY_LONG: 1 iff email is longer than 1K
- SUSPICIOUS_SENDER: 1 iff reply-to domain doesn't match originating server
Features are anything you can think of code to evaluate on an input, as the sketch below illustrates.
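A minimal sketch of feature extraction in this spirit; the regular expression, the 1 KB reading of "longer than 1K", and the domain comparison are my illustrative stand-ins, not the lecture's definitions:

```python
import re

def extract_features(email_text, reply_to="", sender=""):
    """Each feature is just a function evaluated on the input email."""
    urls = re.findall(r"https?://\S+", email_text)
    letters = [c for c in email_text if c.isalpha()]
    return {
        "ALL_CAPS": int(bool(letters) and all(c.isupper() for c in letters)),
        "HAS_URL": int(bool(urls)),
        "NUM_URLS": len(urls),
        "VERY_LONG": int(len(email_text) > 1024),  # "longer than 1K"
        # Crude stand-in for "reply-to domain doesn't match originating server":
        "SUSPICIOUS_SENDER": int(reply_to.split("@")[-1] != sender.split("@")[-1]),
    }

features = extract_features("FREE CREDIT: http://example.com",
                            reply_to="a@x.com", sender="a@y.com")
# HAS_URL=1, NUM_URLS=1, SUSPICIOUS_SENDER=1; ALL_CAPS=0 here, since the
# URL's lowercase letters count against the all-caps test.
```

A classifier such as naive Bayes, or the perceptron introduced later in the deck, then consumes a feature dictionary like this instead of raw word counts.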

