Read Chapter 7 of Machine Learning. Suggested exercises: 7.1, 7.2, 7.5, 7.7.

Function Approximation

Given:
- Instance space X. E.g., X is the set of boolean vectors of length n: $x = \langle 0, 1, 1, 0, 0, 1 \rangle$.
- Hypothesis space H: a set of functions $h : X \to Y$. E.g., H is the set of boolean functions ($Y = \{0, 1\}$) defined by conjunctions of constraints on the features of x.
- Training examples D: a sequence of positive and negative examples of an unknown target function $c : X \to \{0, 1\}$, namely $\langle x_1, c(x_1) \rangle, \ldots, \langle x_m, c(x_m) \rangle$.

Determine:
- A hypothesis h in H such that $h(x) = c(x)$ for all x in X (what we want).
- A hypothesis h in H such that $h(x) = c(x)$ for all x in D (what we can observe).

(A sketch of a simple learner for this conjunction hypothesis space appears at the end of these notes.)

Instances, Hypotheses, and More-General-Than

[Figure slide: instances in X and hypotheses in H, connected by the more-general-than partial order.]

When the learner may pose its own instances as queries, the optimal strategy chooses instances whose labels split the version space in half, i.e., it minimizes the number of queries needed to converge to the correct hypothesis.

True Error of a Hypothesis

- D: set of training examples, instances drawn at random from probability distribution P(x).
- Can we bound $\text{error}_{true}(h)$ in terms of $\text{error}_{train}(h)$ on D?

Exhausting the Version Space

- Consider any learner that outputs a hypothesis consistent with all training examples, i.e., an h contained in $VS_{H,D}$; we want its true error to be less than $\epsilon$.
- What it means [Haussler, 1988]: the probability that the version space is not $\epsilon$-exhausted after m training examples is at most $|H| e^{-\epsilon m}$.
- Suppose we want this probability to be at most $\delta$:
  1. How many training examples suffice? $m \ge \frac{1}{\epsilon} \left( \ln |H| + \ln \frac{1}{\delta} \right)$.
  2. If $\text{error}_{train}(h) = 0$, then with probability at least $(1 - \delta)$: $\text{error}_{true}(h) \le \frac{1}{m} \left( \ln |H| + \ln \frac{1}{\delta} \right)$.

(A worked calculation of this sample-complexity bound appears at the end of these notes.)

PAC Learnability

- Sufficient condition: holds if L requires only a polynomial number of training examples, and processing per example is polynomial.
- The gap $\text{error}_{true}(h) - \text{error}_{train}(h)$ is the degree of overfitting.

Additive Hoeffding Bounds: Agnostic Learning

- Given m independent flips of a coin with Pr(heads) = $\theta$, bound the error in the estimate $\hat{\theta}$: $\Pr[\theta > \hat{\theta} + \epsilon] \le e^{-2m\epsilon^2}$.
- Relevance to agnostic learning: for any single hypothesis h, $\Pr[\text{error}_{true}(h) > \text{error}_{train}(h) + \epsilon] \le e^{-2m\epsilon^2}$.
- But we must consider all hypotheses in H: $\Pr[(\exists h \in H)\; \text{error}_{true}(h) > \text{error}_{train}(h) + \epsilon] \le |H| e^{-2m\epsilon^2}$.
- So, with probability at least $(1 - \delta)$, every h satisfies $\text{error}_{true}(h) \le \text{error}_{train}(h) + \sqrt{\frac{\ln |H| + \ln \frac{1}{\delta}}{2m}}$ (evaluated numerically at the end of these notes).

General Hoeffding Bounds

- When estimating a parameter $\theta$ inside interval $[a, b]$ from m examples: $\Pr[|\hat{\theta} - \mathbb{E}[\hat{\theta}]| > \epsilon] \le 2 e^{-2m\epsilon^2 / (b - a)^2}$.
- When estimating a probability, $\theta \in [0, 1]$, so $\Pr[|\hat{\theta} - \mathbb{E}[\hat{\theta}]| > \epsilon] \le 2 e^{-2m\epsilon^2}$.
- And if we're interested in only one-sided error, then $\Pr[\hat{\theta} - \mathbb{E}[\hat{\theta}] > \epsilon] \le e^{-2m\epsilon^2}$.
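As a concrete companion to the function-approximation setup, here is a minimal sketch of a learner for the conjunction hypothesis space, in the spirit of the book's Find-S algorithm: start from the most specific hypothesis and generalize only as far as the positive examples force. The function name, toy data, and the '?' encoding for "don't care" are illustrative assumptions, not taken from these slides.

```python
# A minimal sketch, in the spirit of Mitchell's Find-S, of learning a
# conjunction of constraints on boolean features from the examples in D.
# Names and the '?' encoding are illustrative, not from the slides.

def find_s(examples, n):
    """examples: list of (x, label) pairs, x a length-n tuple of 0/1 values.
    Returns per-feature constraints: 0, 1, or '?' meaning "any value"."""
    h = [None] * n                 # most specific hypothesis: nothing covered yet
    for x, label in examples:
        if label != 1:             # Find-S generalizes on positives only
            continue
        for i, v in enumerate(x):
            if h[i] is None:       # first positive example pins every feature
                h[i] = v
            elif h[i] != v:        # disagreement -> relax to "don't care"
                h[i] = '?'
    return h

# Toy training set D over n = 6 boolean features
D = [((0, 1, 1, 0, 0, 1), 1),
     ((0, 1, 0, 0, 0, 1), 1),
     ((1, 0, 1, 1, 0, 0), 0)]
print(find_s(D, 6))                # -> [0, 1, '?', 0, 0, 1]
```

The returned hypothesis is consistent with D (it lies in the version space), which is exactly the kind of learner the Haussler bound above speaks about.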
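To make the bound $m \ge \frac{1}{\epsilon}(\ln|H| + \ln(1/\delta))$ concrete, the sketch below plugs in a hypothesis space of conjunctions over n boolean features, where each feature can be constrained to 0, constrained to 1, or left unconstrained, giving $|H| = 3^n$. The particular n, $\epsilon$, and $\delta$ values are assumptions chosen for illustration.

```python
import math

# Haussler-style sample complexity: m >= (1/eps) * (ln|H| + ln(1/delta)).
# |H| = 3**n counts the conjunctions over n boolean features
# (each feature: required 0, required 1, or unconstrained).
def sample_complexity(H_size, eps, delta):
    return math.ceil((math.log(H_size) + math.log(1 / delta)) / eps)

n = 10                                          # number of boolean features (assumed)
print(sample_complexity(3 ** n, eps=0.1, delta=0.05))
# (10 ln 3 + ln 20) / 0.1 = (10.986 + 2.996) / 0.1 -> 140 examples
```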
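And for the agnostic case, the same quantities give the width of the generalization gap from the Hoeffding slide: with probability at least $1 - \delta$, every h satisfies $\text{error}_{true}(h) \le \text{error}_{train}(h) + \sqrt{(\ln|H| + \ln(1/\delta)) / (2m)}$. Again, the concrete $|H|$, m, and $\delta$ below are illustrative assumptions.

```python
import math

# Agnostic Hoeffding gap: with probability >= 1 - delta, for every h in H,
#   error_true(h) <= error_train(h) + sqrt((ln|H| + ln(1/delta)) / (2m)).
def agnostic_gap(H_size, m, delta):
    return math.sqrt((math.log(H_size) + math.log(1 / delta)) / (2 * m))

print(round(agnostic_gap(H_size=3 ** 10, m=1000, delta=0.05), 3))
# sqrt((10.986 + 2.996) / 2000) -> about 0.084
```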