CS 188: Artificial Intelligence, Spring 2009
Written Assignment 4: Classification
Due April 30 at the beginning of lecture

You can work on this in groups, but everyone should turn in his/her own work. Don't forget your name, login, and GSI.

Name: ____________  Login: ____________  GSI: ____________

1 Question 1: Course Gossip (9 points)

In this question, you will train classifiers to predict whether sentences are about CS 188 or CS 186 (databases) using a bag-of-words Naive Bayes classifier. Each sentence is labeled with the class to which it pertains.

Training set:                      Held-out set:                 Test set:
(188) agents need good models.     (188) agents need memory.     (186) data have data models.
(188) agents need data.            (186) DBs have data.
(186) buffers need memory.
(186) DBs need data models.

a) Write down all of the maximum likelihood (relative frequency) parameters for a bag-of-words Naive Bayes classifier trained on the training set above. Let Y be the class for a sentence, and W be a word. You may omit any parameters equal to 0. Ignore punctuation. Note: bag-of-words classifiers assume that the words at every sentence position are identically distributed. Repeated words affect both the likelihood of a word during estimation and sentence probabilities during inference.

b) According to your classifier, what is the probability that the first held-out sentence "agents need memory" is about 188?

c) Using Laplace (i.e., add-one) smoothing for all of your parameters, what is the probability of seeing the test sentence "data have data models": P(W1 = data, W2 = have, W3 = data, W4 = models)? Assume that the only words you ever expect to see are those in your training and held-out sets. Hint: sum over Y, using the new estimates after smoothing.

d) Using Laplace smoothing, what is the probability according to your classifier that the test sentence "data have data models" is about 186?

e) Suggest an additional feature that would allow the classifier to correctly conclude that "data have data models" is about 186 when trained on this training set.
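The estimates in parts (a)-(d) are mechanical enough to check by machine. Below is a minimal Python sketch (not part of the original handout; the function names and structure are my own) that estimates bag-of-words Naive Bayes parameters by relative frequency, or by Laplace add-one smoothing when smoothing=1.0 is passed, and computes the class posterior for a sentence.

    from collections import Counter, defaultdict

    def train_nb(sentences, smoothing=0.0, vocab=None):
        # sentences: list of (label, word_list) pairs.
        # smoothing=0.0 gives maximum-likelihood (relative frequency)
        # estimates; smoothing=1.0 gives Laplace add-one estimates.
        # vocab: every word you ever expect to see (for part (c), the
        # words of the training and held-out sets together).
        if vocab is None:
            vocab = {w for _, words in sentences for w in words}
        label_counts = Counter(label for label, _ in sentences)
        word_counts = defaultdict(Counter)
        for label, words in sentences:
            word_counts[label].update(words)
        priors = {y: n / len(sentences) for y, n in label_counts.items()}
        likelihoods = {}
        for y in label_counts:
            total = sum(word_counts[y].values()) + smoothing * len(vocab)
            likelihoods[y] = {w: (word_counts[y][w] + smoothing) / total
                              for w in vocab}
        return priors, likelihoods

    def posterior(words, priors, likelihoods):
        # P(Y | W1..Wn) by Bayes' rule.  Returns None when every class
        # has zero joint probability, which can happen without smoothing.
        joint = {y: priors[y] for y in priors}
        for y in joint:
            for w in words:
                joint[y] *= likelihoods[y].get(w, 0.0)
        z = sum(joint.values())
        return {y: p / z for y, p in joint.items()} if z > 0 else None

    train = [("188", "agents need good models".split()),
             ("188", "agents need data".split()),
             ("186", "buffers need memory".split()),
             ("186", "DBs need data models".split())]
    priors, likes = train_nb(train)  # part (a): MLE parameters
    print(posterior("agents need memory".split(), priors, likes))  # part (b)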
2 Question 2: Red Light, Green Light (7 points)

You meet the chief administrative officer for intersection oversight for the California Department of Transportation. Her job is to report whether the stoplights at intersections are working correctly. You remark, "I bet I could automate your job." She scoffs in disbelief, but humors you by showing you some data.

X1           X2           Y
red (-1)     red (-1)     broken (+1)
red (-1)     green (+1)   working (-1)
green (+1)   red (-1)     working (-1)
green (+1)   green (+1)   broken (+1)

a) The data above have been annotated with feature and output values. Y is the label variable to predict. Circle all of the following feature sets that make the training data linearly separable.

(i) X1, X2    (ii) X1 only    (iii) min(X1, X2), X1, X2    (iv) min(X1, X2), X2    (v) |X1 + X2|

b) Using just X1 and X2 as features, fill in the resulting weight vectors after two-class perceptron updates (one vector of weights) for the first and second data points sequentially, starting with the initial vector below.

                                                       X1    X2
Initial weights                                         2     1
Weights after observing (X1 = -1, X2 = -1, Y = +1)
Weights after observing (X1 = -1, X2 = +1, Y = -1)

c) Using just X1 and X2 as features, fill in the resulting weight vectors after two-class MIRA updates for the first and second data points sequentially, starting with the initial vector below. Assume the maximum update size C is 5.

                                                       X1    X2
Initial weights                                         1     2
Weights after observing (X1 = -1, X2 = -1, Y = +1)
Weights after observing (X1 = -1, X2 = +1, Y = -1)

d) Is it possible in any classification problem for a perceptron classifier to reach perfect training-set accuracy in fewer updates than a MIRA classifier, if they both start with the same initial weights and examine the training data in the same order? Briefly justify your answer.
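The tables in parts (b) and (c) exercise the two update rules directly. The sketch below (again not from the handout) implements both for the binary case, assuming the usual conventions: predict the sign of w · f and update only on mistakes. The MIRA step size uses the margin-1 solution tau = (1 - y(w · f)) / (f · f), capped at C; some presentations carry an extra factor of 2 in the denominator, so check the exact constant against the lecture formulation before trusting the numbers.

    def perceptron_update(w, f, y):
        # Binary perceptron: if (f, y) is misclassified, add y*f to w.
        score = sum(wi * fi for wi, fi in zip(w, f))
        if y * score <= 0:  # mistake (ties count as mistakes here)
            w = [wi + y * fi for wi, fi in zip(w, f)]
        return w

    def mira_update(w, f, y, C):
        # Binary MIRA: on a mistake, take the smallest step tau that
        # restores a margin of 1, capped at the maximum update size C.
        score = sum(wi * fi for wi, fi in zip(w, f))
        if y * score <= 0:
            tau = min(C, (1 - y * score) / sum(fi * fi for fi in f))
            w = [wi + tau * y * fi for wi, fi in zip(w, f)]
        return w

    # The first two data points from the table above:
    w = [2, 1]  # perceptron table, part (b)
    w = perceptron_update(w, [-1, -1], +1)
    w = perceptron_update(w, [-1, +1], -1)

    w = [1, 2]  # MIRA table, part (c)
    w = mira_update(w, [-1, -1], +1, C=5)
    w = mira_update(w, [-1, +1], -1, C=5)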
3 Question 3: One-Sentence Answers (4 points)

a) What is one advantage of Naive Bayes over MIRA?

b) What is one advantage of MIRA over Naive Bayes?

c) What is one advantage of a nearest-neighbor classifier over Naive Bayes?

d) Invent a method of combining a Naive Bayes classifier with a MIRA classifier and describe
