Unformatted text preview:

Lecture outline• Classification• Naïve Bayes classifier• Nearest-neighbor classifierEager vs Lazy learners• Eager learners: learn the model as soon as the training data becomes available• Lazy learners: delay model-building until testing data needs to be classified– Rote classifier: memorizes the entire training datak-nearest neighbor classifiersXXX(a) 1-nearest neighbor (b) 2-nearest neighbor (c) 3-nearest neighbork-nearest neighbors of a record x are data points that have the k smallest distance to xk-nearest neighbor classification• Given a data record x find its k closest points– Closeness: Euclidean, Hamming, Jaccard distance • Determine the class of x based on the classes in the neighbor list– Majority vote– Weigh the vote according to distance• e.g., weight factor, w = 1/d2– Probabilistic votingCharacteristics of nearest-neighbor classifiers• Instance of instance-based learning• No model building (lazy learners)– Lazy learners: computational time in classification– Eager learners: computational time in model building• Decision trees try to find global models, k-NN take into account local information• K-NN classifiers depend a lot on the choice of proximity measureBayes Theorem• X, Y random variables• Joint probability: Pr(X=x,Y=y)• Conditional probability: Pr(Y=y | X=x)• Relationship between joint and conditional probability distributions• Bayes Theorem:)Pr()|Pr()Pr()|Pr(),Pr( XXYYYXYX)Pr()Pr()|Pr()|Pr(XYYXXYBayes Theorem for Classification• X: attribute set• Y: class variable• Y depends on X in a non-determininstic way• We can capture this dependence using Pr(Y|X) : Posterior probabilityvsPr(Y): Prior probabilityBuilding the Classifier• Training phase:– Learning the posterior probabilities Pr(Y|X) for every combination of X and Y based on training data• Test phase:– For test record X’, compute the class Y’ that maximizes the posterior probabilityPr(Y’|X’)Bayes Classification: ExampleX’=(Home Owner=No, Marital Status=Married, AnnualIncome=120K)Compute: Pr(Yes|X’), Pr(No|X’) pick No or Yes with max Prob.How can we compute these probabilities??Computing posterior probabilities• Bayes Theorem• P(X) is constant and can be ignored• P(Y): estimated from training data; compute the fraction of training records in each class• P(X|Y)?)Pr()Pr()|Pr()|Pr(XYYXXYNaïve Bayes Classifier• Attribute set X = {X1,…,Xd} consists of dattributes• Conditional independence:– X conditionally independent of Y, given X: Pr(X|Y,Z) = Pr(X|Z)– Pr(X,Y|Z) = Pr(X|Z)xPr(Y|Z)diiyYXyYX1)|Pr()|Pr(Naïve Bayes Classifier• Attribute set X = {X1,…,Xd} consists of dattributesdiiyYXyYX1)|Pr()|Pr()Pr()|Pr()Pr()|Pr(1XYXYXYdiiConditional probabilities for categorical attributes• Categorical attribute Xi• Pr(Xi = xi|Y=y): fraction of training instances in class y that take value xion the i-th attributePr(homeOwner = yes|No) = 3/7Pr(MaritalStatus = Single| Yes) = 2/3Estimating conditional probabilities for continuous attributes?• Discretization?• How can we discretize?Naïve Bayes Classifier: Example• X’ = (HomeOwner = No, MaritalStatus = Married, Income=120K)• Need to compute Pr(Y|X’) or Pr(Y)xPr(X’|Y)• But Pr(X’|Y) is– Y = No:• Pr(HO=No|No)xPr(MS=Married|No)xPr(Inc=120K|No) = 4/7x4/7x0.0072 = 0.0024– Y=Yes:• Pr(HO=No|Yes)xPr(MS=Married|Yes)xPr(Inc=120K|Yes) = 1x0x1.2x10-9= 0Naïve Bayes Classifier: Example• X’ = (HomeOwner = No, MaritalStatus = Married, Income=120K)• Need to compute Pr(Y|X’) or Pr(Y)xPr(X’|Y)• But Pr(X’|Y = Yes) is 0?• Correction process:mnmpnyYxXcjii)|Pr(nc: number of training examples from class yjthat take value xin: total number of instances from class yjm: equivalent sample size (balance between prior and posterior)p: user-specified parameter (prior probability)Characteristics of Naïve BayesClassifier• Robust to isolated noise points – noise points are averaged out• Handles missing values – Ignoring missing-value examples• Robust to irrelevant attributes– If Xiis irrelevant, P(Xi|Y) becomes almost uniform• Correlated attributes degrade the performance of NB


View Full Document

BU CS 565 - Lecture notes

Documents in this Course
Load more
Download Lecture notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?