Brandeis CS 101A - Programming Project #2

Programming Project #2
Along Party Lines

This problem set is due 11/20.

Since it is an important year for U.S. elections, we’ll do a political programming project. In a separate file, we’ve included the party membership and voting records of various members of Congress. We’d like you to use this data to classify congressmen as Republicans or Democrats, using Bayesian classifiers and decision trees. We’ve included their votes on a number of issues. The data set is from the 80s, so the issues aren’t current (you can envision the congressmen with funny hair if you’d like). The issues are:

1. Handicapped Infants
2. Water Project Cost Sharing
3. Adopt the Budget Resolution
4. Physician Fee Freeze
5. Aid to El Salvador
6. Religious Groups in Schools
7. Anti Satellite Test Ban
8. Aid to Nicaraguan Contras
9. MX Missile
10. Immigration
11. Synfuels Corporation Cutback
12. Education Spending
13. Superfund Right to Sue
14. Crime
15. Duty Free Exports
16. South Africa Export Administration Act

An individual congressman is represented by a line in the training file:

no,yes,yes,no,no,maybe,yes,yes,yes,yes,yes,no,maybe,yes,yes,yes,democrat

This line shows their votes on the 16 issues, in the order listed above, followed by their political party. Congressmen who didn’t vote on a particular issue are given a “maybe”. You may notice that some congressmen have not voted at all.

You have three files:

1. training-large.csv: The main training set
2. training-small.csv: A smaller training set
3. testing.csv: The testing set

1 Part 1: Classification

For each part, create a classifier using the suggested Python modules. Train the classifier on the small training set and then test it using the test data. Collect precision and recall information. Do the same thing for the larger training set.

Please document your code and make the structure clean, readable, and intuitive.

1.1 Part 1a: Bayesian Classifier

Use a naïve Bayes classifier to classify the politicians.

You should use the Reverend classifier: http://divmod.org/trac/wiki/DivmodReverend.
You cannot simply plug in the classifier; you will have to write a tokenizer to split up the training input appropriately.

1.2 Part 1b: Decision Trees

You can find a Python implementation of the ID3 algorithm here (along with a long tutorial): http://www.onlamp.com/pub/a/python/2006/02/09/ai_decision_trees.html

2 Part 2: Discussion

Now that you’ve classified things, please discuss the different classification techniques. Did different classifiers do better with this data set? How did the size of the training set affect the results? Did different methods work better with different size training sets?

This should definitely be longer than a paragraph, and should be as long as you need to get your ideas
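
As a starting point for the tokenizer and the precision and recall measurements asked for in Part 1, here is a minimal sketch in Python. All function names (split_record, tokenize_votes, precision_recall) are hypothetical and are not part of the handout or of Reverend. The sketch assumes each record is a single comma-separated line with the party in the last field, as in the example line above. Tagging each vote with its issue number is one possible design choice: without it, a "yes" on issue 4 and a "yes" on issue 9 would look like the same token to a word-based classifier. How the resulting tokens are handed to a particular classifier depends on that library's interface and is left out here.

```python
# Hypothetical helpers for illustration only; not taken from the handout or from Reverend.

def split_record(line):
    """Split one comma-separated record into (list of votes, party label)."""
    fields = [field.strip() for field in line.strip().split(",")]
    return fields[:-1], fields[-1]


def tokenize_votes(votes):
    """Tag each vote with its 1-based issue number, e.g. '4=no'."""
    return [f"{issue}={vote}" for issue, vote in enumerate(votes, start=1)]


def precision_recall(actual, predicted, positive):
    """Compute precision and recall for one class, treated as the positive label."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == positive and p == positive)
    false_pos = sum(1 for a, p in pairs if a != positive and p == positive)
    false_neg = sum(1 for a, p in pairs if a == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# The example record from the handout.
line = "no,yes,yes,no,no,maybe,yes,yes,yes,yes,yes,no,maybe,yes,yes,yes,democrat"
votes, party = split_record(line)
tokens = tokenize_votes(votes)

# A made-up set of four predictions, used only to exercise precision_recall.
actual = ["democrat", "democrat", "republican", "republican"]
predicted = ["democrat", "republican", "republican", "republican"]
dem_precision, dem_recall = precision_recall(actual, predicted, "democrat")
```

In the made-up example, one of the two Democrats is labeled correctly and no Republican is labeled as a Democrat, so precision for "democrat" is 1.0 and recall is 0.5. Reporting these numbers per party, and once for each training set, gives the comparison that Part 2 asks you to discuss.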

