CS446: Machine Learning, Fall 2017

Lecture 1: Introduction to Machine Learning

Lecturer: Sanmi Koyejo                                   Scribe: Omar Elabd, Aug. 29th, 2017

Introduction to Machine Learning

Machine learning is an application of artificial intelligence that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. Table 1 shows a non-comprehensive list of machine learning algorithms categorized by their types.

                 Probabilistic                            Non-Probabilistic
  Supervised     Naive Bayes                              Decision Trees
                 Artificial Neural Networks               Random Forests
                 Support Vector Machines (SVM)            Support Vector Machines (SVM)
                                                          Linear Discriminant Analysis (LDA)
                                                          Adaptive Boosting (AdaBoost)
  Unsupervised   Generative adversarial networks (GANs)   k-means clustering
                 Latent Dirichlet allocation (LDA)        Principal Component Analysis (PCA)
                 Gaussian mixture model (GMM)             t-distributed Stochastic Neighbor Embedding (t-SNE)
                                                          Hierarchical clustering

Table 1: Categorized Machine Learning Algorithms

• Gradient Descent and Expectation Maximization are both techniques for optimizing machine learning algorithms.

Probabilistic vs. Non-Probabilistic

• Probabilistic: built around learning a probability distribution in order to perform machine learning.
• Non-Probabilistic: does not involve learning probability distributions directly.

Supervised Learning

Supervised learning is the machine learning task of inferring a function from labeled training data.

• The training data is defined as $\{(x_i, y_i)\}$, where the $x_i$ are instances or inputs and the $y_i$ are the corresponding labels.
• The goal of supervised machine learning is to accurately predict a label $y$ for an unseen $x$.

Unsupervised Learning

Unsupervised learning is the machine learning task of inferring a function to describe hidden structure from "unlabeled" data.

• The training data is defined as $\{x_i\}$, where the $x_i$ are instances or inputs; there are no labels associated with the training data.
• The goal of unsupervised machine learning is to accurately model the underlying structure of the $x_i$'s.

Semi-supervised Learning

Semi-supervised learning is a class of supervised learning tasks and techniques that make use of both labeled and unlabeled data for training.

Reinforcement Learning

Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward.

• Could be considered probabilistic and unsupervised.
• Will be covered at the end of the semester, time permitting.

Learning

• Given a set of $n$ pairs $(x_1, y_1), \ldots, (x_n, y_n)$, where $x_i \in \mathcal{X} = \mathbb{R}^d$ and $y_i \in \{-1, +1\}$.
  – Note that $y_i$ is binary: either $-1$ or $+1$.
• Our goal is to learn a function $h$ such that $h(x_{n+1}) \to \hat{y}_{n+1}$ results in an accurate prediction.
• Approach 1: Use a constant function $h(x) = c$, where $c$ is a constant. For example, we can define $c$ to be $+1$ or $-1$; if we choose $c = +1$, then regardless of the input $x$, the classifier always returns $+1$. A minimal sketch of this baseline follows the pros and cons below.

  Pros:
  – Given new datasets, will give consistent results.

  Cons:
  – Does not adapt to the problem we are trying to solve.
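To make Approach 1 concrete, here is a minimal Python sketch. It is illustrative rather than from the lecture; the class name ConstantClassifier and the fit/predict interface are assumptions.

import numpy as np

class ConstantClassifier:
    """Approach 1: always predict the same constant label c."""

    def __init__(self, c=+1):
        assert c in (-1, +1), "labels are binary: -1 or +1"
        self.c = c

    def fit(self, X, y):
        # No training: the prediction ignores the data entirely.
        return self

    def predict(self, X):
        # Return c for every input row, regardless of x.
        return np.full(len(X), self.c)

# Usage: consistent results on any dataset, but no adaptation to the problem.
X = np.array([[0.0, 1.0], [2.0, 3.0]])
print(ConstantClassifier(c=+1).predict(X))   # [1 1]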
• Approach 2: Memoization. As input points are seen, they are stored in a database. If an input point has been seen before with a certain label, we return the result that was previously seen (a dictionary-based sketch follows the pros and cons below):

  $$h(x) = \begin{cases} y(x) & \text{if } x \text{ has been seen before} \\ ? & \text{if } x \text{ is new} \end{cases}$$

  Pros:
  – Training costs nothing, as no training is involved.
  – Can in some cases perform well.

  Cons:
  – Not 100% accurate, as we can have the same $x_i$ with different $y_i$'s.
    ∗ No noise tolerance.
  – High storage cost.
  – High cost for performing a lookup.
    ∗ This cost can be reduced by using techniques such as hashing.
  – Testing is expensive.
    ∗ Cost depends on the performance of the lookup function.
    ∗ The size of the model scales with the size of the dataset.
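Below is a minimal dictionary-based sketch of the memoization approach. It is illustrative, not from the lecture: how repeated $x$'s with different labels are resolved, and what to return for a new $x$ (the notes leave it as "?"), are assumptions.

import numpy as np

class MemoizingClassifier:
    """Approach 2: store every (x, y) pair seen; answer by exact lookup."""

    def __init__(self):
        self.table = {}  # maps hashable x -> label; hashing keeps lookup cheap

    def fit(self, X, y):
        # "Training" is just storage. Later duplicates of the same x overwrite
        # earlier labels -- one arbitrary way to handle conflicting y's.
        for xi, yi in zip(X, y):
            self.table[tuple(xi)] = yi
        return self

    def predict_one(self, x):
        # Stored label if x was seen before; 0 stands in for the "?" case.
        return self.table.get(tuple(x), 0)

# Usage: a seen point is answered from the table; a new point gets the "?" value.
X = np.array([[0.0, 1.0], [2.0, 3.0]])
y = np.array([+1, -1])
clf = MemoizingClassifier().fit(X, y)
print(clf.predict_one(np.array([0.0, 1.0])))  # 1
print(clf.predict_one(np.array([9.0, 9.0])))  # 0

As the cons above note, both the table size and the lookup cost grow with the size of the dataset.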
Inductive Bias

• Assumptions that a machine learning model makes in order to make predictions on new data.

Nearest Neighbor Classifier

The Nearest Neighbor algorithm is a classification algorithm which uses the labels of the $k$ closest neighbors to determine the label of a new, unseen instance $x^*$. For values of $k$ other than 1, a majority vote can be used to determine the label of $x^*$.

• The Nearest Neighbor Classifier carries the implicit assumption (inductive bias) that nearby points have similar labels. This assumption may not be accurate for all datasets.

1-Nearest Neighbor

For a single nearest neighbor, the classifier $h$ returns the label of the closest point:

$$h(x) = y_{N_1(x, \mathcal{D})}$$

where $N_1(x, \mathcal{D})$ denotes the single point closest to $x$ in a dataset $\mathcal{D}$:

$$N_1(x, \mathcal{D}) = \operatorname*{argmin}_{z \in \mathcal{D}} d(x, z)$$

and $d$ is some distance function (e.g., Euclidean or Mahalanobis).

k-Nearest Neighbors

For $k$ nearest neighbors, the classifier $h$ can be expressed as:

$$h(x) = \operatorname{sign}\!\left(\frac{1}{k} \sum_{z \in N_k(x, \mathcal{D})} y_z\right)$$

where $N_k(x, \mathcal{D})$ is the set of the $k$ points closest to $x$ in the dataset $\mathcal{D}$. A sketch implementing this rule follows.
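The following sketch implements the k-NN rule above. Euclidean distance is an assumption for concreteness (the notes allow any distance $d$); setting k = 1 recovers the 1-nearest-neighbor classifier.

import numpy as np

def knn_predict(x, X_train, y_train, k=1):
    """Predict a label in {-1,+1} via h(x) = sign(mean of the k nearest labels)."""
    # Distance from x to every training point (one distance per row).
    dists = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k closest training points: N_k(x, D).
    nearest = np.argsort(dists)[:k]
    # Sign of the average neighbor label; a tie (sum of 0) would need a
    # tie-breaking rule, which the notes do not specify.
    return int(np.sign(y_train[nearest].mean()))

# Usage: two well-separated clusters with opposite labels.
X_train = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
y_train = np.array([-1, -1, +1, +1])
print(knn_predict(np.array([0.1, 0.0]), X_train, y_train, k=1))  # -1
print(knn_predict(np.array([5.0, 5.1]), X_train, y_train, k=3))  # +1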
Nearest Neighbor Performance

• Performance depends on hyperparameters (e.g., the choice of distance function and the choice of the number of nearest neighbors $k$); a small selection sketch follows this list.
• Hyperparameters are the parameters that are set before the training process, as opposed to parameters derived in training.
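One common way to set such hyperparameters is to score each candidate value on held-out data and keep the best. The sketch below is an illustration of that idea, not a procedure prescribed in the notes; it reuses knn_predict and the toy data from the previous sketch, and the candidate set {1, 3} is arbitrary.

import numpy as np

def validation_accuracy(k, X_train, y_train, X_val, y_val):
    # Fraction of held-out points the k-NN rule labels correctly.
    preds = [knn_predict(x, X_train, y_train, k=k) for x in X_val]
    return np.mean(np.array(preds) == y_val)

# Hold out one point near each cluster, then pick the k that scores best.
X_val = np.array([[0.1, 0.2], [4.9, 5.0]])
y_val = np.array([-1, +1])
best_k = max([1, 3], key=lambda k: validation_accuracy(k, X_train, y_train, X_val, y_val))
print(best_k)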