DOC PREVIEW
CMU CS 10701 - Active Learning

This preview shows page 1-2-3-4-27-28-29-30-55-56-57-58 out of 58 pages.

Save
View full document
Premium Document
Do you want full access? Go Premium and unlock all 58 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Ac ve Learning Burr Se0les Machine Learning 10 701 15 781 April 19 2011 some slides adapted from Aarti Singh Rui Castro Rob Nowak Supervised Learning raw unlabeled data random sample x 1 x2 x3 labeled training instances x1 y1 x2 y2 x3 y3 supervised learner induces a classifier expert oracle analyzes experiments to determine labels Semi Supervised Learning exploit the structure in unlabeled data raw unlabeled data random sample x 1 x2 x3 labeled training instances x1 y1 x2 y2 x3 y3 semi supervised learner induces a classifier expert oracle analyzes experiments to determine labels Ac ve Learning inspect the unlabeled data raw unlabeled data x 1 x2 x3 request labels for selected data x1 x2 active learner induces a classifier x1 y1 x2 y2 expert oracle analyzes experiments to determine labels The 20 Ques ons Game Are you female No Do you have a moustache Yes our goal is to pose the most informa ve queries how can we automate this process Thought Experiment suppose you are on an Earth convoy sent to colonize planet Zelgon people who ate the smooth Zelgian fruits found them tasty people who ate the spikey Zelgian fruits got sick Determining Poison vs Yummy Fruits there is a con nuous range of spikey to smooth fruit shapes on Zelgon you need to learn how to recognize fruits as poisonous or safe and you need to do this while risking as li0le as possible i e colonist health Learning a Change Point goal learn threshold as accurately as possible using as few labeled instances as possible The Problem with Passive Learning in passive supervised learning the instances must be chosen before any tests are done error rate requires us to risk O 1 people s health Can We Do Be0er this is just a binary search requiring O 1 fruits samples and only O log2 1 tests queries your rst ac ve learning algorithm Rela onship to Ac ve Learning key idea the learner chooses the training data on Zelgon whether a fruit was poisonous safe in general the true label of some instance goal reduce the training costs on Zelgon the number of lives at risk in general the number of queries labor costs disk storage space etc Prac cal Query Scenarios query synthesis Anguin 1988 selec ve sampling Atlas et al 1989 pool based ac ve learning Lewis Gale 1994 Query Synthesis learner xi yi label add to training set Input Space or Distribu on xi query oracle expert xi generate an informative instance ne novo Problems with Query Synthesis an early real world applica on neural net queries synthesized for handwri0en digits Lang Baum 1992 problem humans couldn t interpret the queries ideally we can ensure that the queries come from the underlying natural distribu on 14 Selec ve Sampling learner xi yi label add to training set xi observe a new instance xi decide to query or ignore oracle expert Con nuous Data Source Selec ve Sampling 2 advantage ensures that query instances come from the true underlying data distribu on assump on drawing an instance from the distribu on is signi cantly less expensive than obtaining its label onen true in prac ce e g downloading Web documents vs assigning topic labels to them Pool Based Ac ve Learning learner xi yi label add to training set Large Pool of Unlabeled Data xi query oracle expert xi choose the best out of the sample pool Learning Curves active learning better passive learning text classification baseball vs hockey Who Uses Ac ve Learning Sentiment analysis for blogs Noisy relabeling Prem Melville Biomedical NLP IR Computer aided diagnosis Balaji Krishnapuram MS Outlook voicemail plug in Kapoor et al IJCAI 07 A variety of prototypes that are in use throughout the company Eric Horvitz While I can confirm that we re using active learning in earnest on many problem areas I really can t provide any more details than that Sorry to be so opaque David Cohn OK How Do We Select Queries let s interpret our Zelgian fruit binary search in terms of a probabilis6c classi er 1 0 0 0 0 5 0 5 0 5 Lewis Gale SIGIR 94 Uncertainty Sampling query instances the learner is most uncertain about 400 instances sampled from 2 class Gaussians random sampling 30 labeled instances accuracy 0 7 uncertainty sampling 30 labeled instances accuracy 0 9 Uncertainty Measures least confident smallest margin entropy note for binary tasks these are equivalent Mul Class Uncertainty illustration of preferred dark red posterior distributions in a 3 label classification task note for mul class tasks these are not equivalent Informa on Theore c Interpreta on the surprisal I is a measure in bits nats etc of the informa on content for outcome y of variable Y so this is how informa ve the oracle s label y will be but the learner doesn t know the oracle s answer yet we can es mate it as an expecta6on over all possible labels which is entropy based uncertainty sampling Uncertainty Sampling in Prac ce pool based ac ve learning evaluate each x in U rank and query the top K instances retrain repeat selec ve sampling threshold a region of uncertainty e g 0 2 0 8 observe new instances but only query those that fall within the region retrain repeat Simple and Widely Used text classi ca on Lewis Gale ICML 94 POS tagging Dagan Engelson ICML 95 Ringger et al ACL 07 disambigua on Fujii et al CL 98 parsing Hwa CL 04 informa on extrac on Sche er et al CAIDA 01 Se0les Craven EMNLP 08 word segmenta on Sassano ACL 02 speech recogni on Tur et al SC 05 translitera on Kuo et al ACL 06 transla on Ha ari et al NAACL 09 Cohn et al ML 94 Uncertainty Sampling FAIL initial random sample misses the right triangle neural net uncertainty sampling only queries the left side What To Do plain uncertainty sampling only uses the con dence of a single classi er some mes called a point es mate for parametric models this classi er can become overly con dent about instances is really knows nothing about instead let s consider a di erent no on of uncertainty about the classi er itself Remember Version Spaces the set of all classi ers that are consistent with the labeled training data the larger the version space V the less likely each possible classi er is we want queries to reduce V Alien Fruits Revisited let s try interpre ng our binary search in terms of a version space search possible classi ers thresholds 8 42 1 Simple Version Space Algorithm enumerate all legal hypotheses or compute V analy cally the op mal query is the one that most reduces the size of V in expecta on over y ideally we can halve the size of the version space binary search does this in 1D e g Zelgian fruits


View Full Document

CMU CS 10701 - Active Learning

Documents in this Course
lecture

lecture

12 pages

lecture

lecture

17 pages

HMMs

HMMs

40 pages

lecture

lecture

15 pages

lecture

lecture

20 pages

Notes

Notes

10 pages

Notes

Notes

15 pages

Lecture

Lecture

22 pages

Lecture

Lecture

13 pages

Lecture

Lecture

24 pages

Lecture9

Lecture9

38 pages

lecture

lecture

26 pages

lecture

lecture

13 pages

Lecture

Lecture

5 pages

lecture

lecture

18 pages

lecture

lecture

22 pages

Boosting

Boosting

11 pages

lecture

lecture

16 pages

lecture

lecture

20 pages

Lecture

Lecture

20 pages

Lecture

Lecture

39 pages

Lecture

Lecture

14 pages

Lecture

Lecture

18 pages

Lecture

Lecture

13 pages

Exam

Exam

10 pages

Lecture

Lecture

27 pages

Lecture

Lecture

15 pages

Lecture

Lecture

24 pages

Lecture

Lecture

16 pages

Lecture

Lecture

23 pages

Lecture6

Lecture6

28 pages

Notes

Notes

34 pages

lecture

lecture

15 pages

Midterm

Midterm

11 pages

lecture

lecture

11 pages

lecture

lecture

23 pages

Boosting

Boosting

35 pages

Lecture

Lecture

49 pages

Lecture

Lecture

22 pages

Lecture

Lecture

16 pages

Lecture

Lecture

18 pages

Lecture

Lecture

35 pages

lecture

lecture

22 pages

lecture

lecture

24 pages

Midterm

Midterm

17 pages

exam

exam

15 pages

Lecture12

Lecture12

32 pages

lecture

lecture

19 pages

Lecture

Lecture

32 pages

boosting

boosting

11 pages

pca-mdps

pca-mdps

56 pages

bns

bns

45 pages

mdps

mdps

42 pages

svms

svms

10 pages

Notes

Notes

12 pages

lecture

lecture

42 pages

lecture

lecture

29 pages

lecture

lecture

15 pages

Lecture

Lecture

12 pages

Lecture

Lecture

24 pages

Lecture

Lecture

22 pages

Midterm

Midterm

5 pages

mdps-rl

mdps-rl

26 pages

Load more
Download Active Learning
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Active Learning and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Active Learning and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?