CMU CS 10601 - assignment - D667044

School: Carnegie Mellon University

Course: CS 10601 - Introduction to Machine Learning

Pages 2

This preview shows page 1 out of 2 pages.

10-601 Machine Learning: Assignment 3

• The assignment is due at 3:00pm (beginning of class) on Wednesday, Feb 20.
• Write your name at the top right-hand corner of each page submitted.
• Each student must hand in a writeup. See the course webpage for the collaboration policy.
• For the programming portions of the assignment, you can use whatever programming language you are comfortable with. Submit code into the /hw3/ folder in your afs directory.

1 Q1 Logistic Regression... [20 pts]

Exercise 3 from Naive Bayes and Logistic Regression reading.

2 Q2 ...And Gradient Descent [35 pts]

1. Download the portion of Fisher's Iris Flower data available on the class webpage. Implement logistic regression using gradient descent for weight updates, as discussed in the reading. Report the weights. Use 10-fold cross-validation and report the classification accuracy.
2. Now repeat the experiment, this time using regularization as described in the reading. Vary your parameters η and λ. What happens when you change these parameters?
3. Do you notice any differences in the results of the two different methods? Why might or might not that happen?

3 Q3 Statistical Tests [20 pts]

1. Depending on what we are trying to measure, we may choose to use a different hypothesis test.
   Suppose you had a set of webpages X, labeled with the type of page Y (product, shopping cart, etc.), $D = \{D_{ij} = \langle x_{ij}, y_{ij} \rangle\}$, representing sets of pages $i \in P$ from websites $j \in S$. You would like to compare hypotheses (in this case, classifiers) $h_A: X \to Y$ and $h_B: X \to Y$. You have three options for testing the hypotheses:
   Option 1: Split D into sets $D_{*1}, \ldots, D_{*S}$, where each set is all the pages from one website (note that some websites may have more pages than others). Do a paired test.
   Option 2: Split D evenly into |S| different sets (each set containing pages sampled from all websites), and perform a paired test.
   Option 3: Apply a test of $h_A$ and $h_B$ to all data in D, and compare.
   Which option would you expect to yield more accurate results for testing whether $h_A$ is more accurate than $h_B$ for future websites? Which option would be best for getting the true accuracy of $h_A$ and $h_B$? Explain your decisions.
2. Going back to the 20-sided die example, suppose that your friend's die scored a "critical hit" (19 or 20) 30 times out of 120 rolls. If $p_c$ is the probability of a critical hit ($p_c = 0.1$ in a fair die), what is the p-value of having a fair die in that situation? What if there were 50 critical hits in 200 rolls? Are the p-values the same? Why or why not?
3. Given a fair 20-sided die (p = .1), what is a 95% confidence interval for the number of critical hits you could expect out of 200 rolls? Out of 1000
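
The die questions in Q3 come down to two binomial computations, which can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the handout's intended solution: the p-value is taken to be the one-sided upper tail P(X >= k) under the fair-die hypothesis, the interval uses the normal approximation to the binomial, and the function names `binom_upper_tail` and `normal_approx_interval` are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_upper_tail(n, k, p):
    """Exact P(X >= k) for X ~ Binomial(n, p).

    Assumption: the p-value is the one-sided upper tail, i.e. the chance
    that a fair die produces at least k critical hits in n rolls.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def normal_approx_interval(n, p, level=0.95):
    """Central interval for the hit count, via the normal approximation.

    Assumption: n is large enough that Binomial(n, p) is reasonably
    approximated by Normal(n*p, n*p*(1-p)).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    mean = n * p
    half_width = z * sqrt(n * p * (1 - p))
    return mean - half_width, mean + half_width


if __name__ == "__main__":
    p_fair = 0.1  # P(19 or 20) on a fair 20-sided die
    for n, k in [(120, 30), (200, 50)]:
        print(f"{k} hits in {n} rolls: p-value = {binom_upper_tail(n, k, p_fair):.3e}")
    for n in (200, 1000):
        lo, hi = normal_approx_interval(n, p_fair)
        print(f"n = {n}: 95% interval for hit count is roughly ({lo:.2f}, {hi:.2f})")
```

Both scenarios have the same observed hit rate (25%), so comparing the two printed p-values shows how the sample size alone changes the strength of the evidence against a fair die.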
