BU CS 565 - Lecture Outline

Lecture outline
• Classification
• Naïve Bayes classifier
• Nearest-neighbor classifier

Eager vs Lazy learners
• Eager learners: learn the model as soon as the training data becomes available
• Lazy learners: delay model-building until a test record needs to be classified
  – Rote classifier: memorizes the entire training data

k-nearest neighbor classifiers
[Figure: (a) 1-nearest neighbor, (b) 2-nearest neighbor, (c) 3-nearest neighbor of a record x]
The k-nearest neighbors of a record x are the data points that have the k smallest distances to x.

k-nearest neighbor classification
• Given a data record x, find its k closest points
  – Closeness: Euclidean, Hamming, or Jaccard distance
• Determine the class of x based on the classes in the neighbor list
  – Majority vote
  – Distance-weighted vote, e.g., weight factor w = 1/d²
  – Probabilistic voting
(A minimal code sketch of these voting schemes appears after the next slide.)

Characteristics of nearest-neighbor classifiers
• An instance of instance-based learning
• No model building (lazy learners)
  – Lazy learners: pay the computational cost at classification time
  – Eager learners: pay the computational cost at model-building time
• Decision trees try to find global models; k-NN takes local information into account
• k-NN classifiers depend heavily on the choice of proximity measure
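
The voting schemes above are easy to make concrete. The following is a minimal Python sketch, not from the slides, of a k-NN classifier supporting both majority voting and distance-weighted voting with w = 1/d²; the toy dataset, the function names, and the small guard against zero distance are all illustrative assumptions.

import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two numeric records."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def knn_classify(train, x, k=3, weighted=False):
    """Classify record x given train = [(features, label), ...].

    weighted=False: simple majority vote among the k nearest neighbors.
    weighted=True:  each neighbor votes with weight w = 1/d^2, as in the slides.
    """
    # The k-nearest neighbors of x: the k training records closest to x.
    neighbors = sorted(train, key=lambda rec: euclidean(rec[0], x))[:k]
    votes = Counter()
    for features, label in neighbors:
        if weighted:
            d = euclidean(features, x)
            votes[label] += 1.0 / (d * d + 1e-12)  # guard against d == 0
        else:
            votes[label] += 1.0
    return votes.most_common(1)[0][0]

# Toy usage with made-up 2-D points:
train = [((1.0, 1.0), "A"), ((1.2, 0.9), "A"), ((5.0, 5.0), "B"), ((4.8, 5.1), "B")]
print(knn_classify(train, (1.1, 1.0), k=3))                  # majority vote -> A
print(knn_classify(train, (4.9, 5.0), k=3, weighted=True))   # weighted vote -> B

Note that this brute-force version scans the whole training set per query, which is exactly the lazy-learner trade-off described above: no work at training time, all of it at classification time.
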
Bayes Theorem
• X, Y: random variables
• Joint probability: Pr(X = x, Y = y)
• Conditional probability: Pr(Y = y | X = x)
• Relationship between the joint and conditional probability distributions:
  Pr(X, Y) = Pr(Y | X) · Pr(X) = Pr(X | Y) · Pr(Y)
• Bayes theorem:
  Pr(Y | X) = Pr(X | Y) · Pr(Y) / Pr(X)

Bayes Theorem for Classification
• X: attribute set
• Y: class variable
• Y depends on X in a non-deterministic way
• We can capture this dependence using Pr(Y | X), the posterior probability, as opposed to Pr(Y), the prior probability

Building the Classifier
• Training phase: learn the posterior probabilities Pr(Y | X) for every combination of X and Y from the training data
• Test phase: for a test record X', assign the class Y' that maximizes the posterior probability Pr(Y' | X')

Bayes Classification: Example
X' = (HomeOwner = No, MaritalStatus = Married, AnnualIncome = 120K)
Compute Pr(Yes | X') and Pr(No | X'), and pick the class with the larger probability.
How can we compute these probabilities?

Computing posterior probabilities
• Bayes theorem: Pr(Y | X) = Pr(X | Y) · Pr(Y) / Pr(X)
• Pr(X) is the same for every class and can be ignored
• Pr(Y): estimated from the training data as the fraction of training records in each class
• Pr(X | Y)?

Naïve Bayes Classifier
• Attribute set X = {X1, …, Xd} consists of d attributes
• Conditional independence: X is conditionally independent of Y given Z when
  Pr(X | Y, Z) = Pr(X | Z), or equivalently Pr(X, Y | Z) = Pr(X | Z) · Pr(Y | Z)
• Naïve Bayes assumes the attributes are conditionally independent given the class:
  Pr(X | Y = y) = ∏_{i=1..d} Pr(X_i | Y = y)
• The posterior is then
  Pr(Y | X) = Pr(Y) · ∏_{i=1..d} Pr(X_i | Y) / Pr(X)

Conditional probabilities for categorical attributes
• For a categorical attribute X_i, Pr(X_i = x_i | Y = y) is the fraction of training instances in class y that take value x_i on the i-th attribute
• Examples: Pr(HomeOwner = Yes | No) = 3/7, Pr(MaritalStatus = Single | Yes) = 2/3

Estimating conditional probabilities for continuous attributes?
• Discretization? How can we discretize?
• (A Gaussian-density alternative is sketched at the end of these notes.)

Naïve Bayes Classifier: Example
• X' = (HomeOwner = No, MaritalStatus = Married, Income = 120K)
• Need to compute Pr(Y | X'), i.e., compare Pr(Y) · Pr(X' | Y) across the classes
• Pr(X' | Y):
  – Y = No: Pr(HO = No | No) × Pr(MS = Married | No) × Pr(Inc = 120K | No) = 4/7 × 4/7 × 0.0072 ≈ 0.0024
  – Y = Yes: Pr(HO = No | Yes) × Pr(MS = Married | Yes) × Pr(Inc = 120K | Yes) = 1 × 0 × 1.2×10⁻⁹ = 0
• But Pr(X' | Y = Yes) is 0: a single zero conditional probability wipes out the entire product, no matter what the other attributes say
• Correction process (m-estimate; see the code sketch below):
  Pr(X_i = x_i | Y = y_j) = (n_c + m·p) / (n + m)
  – n_c: number of training examples from class y_j that take value x_i
  – n: total number of instances from class y_j
  – m: equivalent sample size (balances prior and posterior)
  – p: user-specified parameter (the prior probability)

Characteristics of Naïve Bayes Classifier
• Robust to isolated noise points: such points are averaged out
• Handles missing values: records with missing values are ignored when estimating the probabilities
• Robust to irrelevant attributes: if X_i is irrelevant, Pr(X_i | Y) becomes almost uniform
• Correlated attributes degrade the performance of naïve Bayes
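
To make the whole pipeline concrete, here is a minimal Python sketch, not from the slides, of a categorical naïve Bayes classifier: priors Pr(Y) estimated as class fractions, conditional probabilities Pr(X_i | Y) estimated with the m-estimate correction (n_c + m·p)/(n + m), and classification by maximizing Pr(Y) · ∏ Pr(X_i | Y) with Pr(X) ignored. The function names, the toy records, and the defaults m = 3, p = 0.5 are illustrative assumptions.

from collections import Counter, defaultdict

def train_nb(records):
    """records = [(attrs, label), ...], attrs a dict of categorical attributes.
    Count class sizes and, per class, how often each (attribute, value) occurs."""
    class_counts = Counter()
    value_counts = defaultdict(Counter)  # value_counts[label][(attr, val)]
    for attrs, label in records:
        class_counts[label] += 1
        for attr, val in attrs.items():
            value_counts[label][(attr, val)] += 1
    return class_counts, value_counts

def cond_prob(class_counts, value_counts, label, attr, val, m=3.0, p=0.5):
    """m-estimate of Pr(attr = val | label): (n_c + m*p) / (n + m).
    With m = 0 this is the plain fraction n_c / n from the slides, which can
    be zero for an unseen value and wipe out the whole product."""
    n_c = value_counts[label][(attr, val)]
    n = class_counts[label]
    return (n_c + m * p) / (n + m)

def nb_classify(record, class_counts, value_counts, m=3.0, p=0.5):
    """Pick the class maximizing Pr(Y) * prod_i Pr(X_i | Y); Pr(X) is the same
    for every class, so it is ignored."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior Pr(Y): fraction of training records in class
        for attr, val in record.items():
            score *= cond_prob(class_counts, value_counts, label, attr, val, m, p)
        scores[label] = score
    return max(scores, key=scores.get)

# Toy usage with a made-up dataset (not the one behind the slides' 3/7 and 2/3):
data = [({"HomeOwner": "Yes", "MaritalStatus": "Single"}, "No"),
        ({"HomeOwner": "No",  "MaritalStatus": "Married"}, "No"),
        ({"HomeOwner": "No",  "MaritalStatus": "Single"}, "Yes")]
cc, vc = train_nb(data)
print(nb_classify({"HomeOwner": "No", "MaritalStatus": "Married"}, cc, vc))

Because of the m-estimate, the value Married, unseen in class Yes, still gets a small nonzero probability instead of zeroing out that class, which is exactly the failure mode in the Y = Yes computation above.
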


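One answer to the continuous-attribute question above, and an approach consistent with the density values in the worked example, is to skip discretization and fit a per-class Gaussian: estimate the mean and variance of the attribute over the training records of each class, then use the normal density at the test value in place of a categorical fraction. The class statistics below are assumptions chosen for illustration, not taken from the slides; with mean 110 and variance 2975, the density at AnnualIncome = 120 comes out to roughly 0.0072, matching the value used for class No.

import math

def gaussian_cond_prob(x, mean, var):
    """Per-class Gaussian estimate of Pr(X_i = x | Y): the normal density with
    the sample mean and variance of X_i over the training records of class Y."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Illustrative class-conditional statistics for AnnualIncome (in thousands);
# these numbers are assumptions, not from the slides.
print(gaussian_cond_prob(120.0, mean=110.0, var=2975.0))  # ~0.0072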