UT Dallas CS 6375 - 01 - A Checkers Learning Problem

A CHECKERS LEARNING PROBLEM
From "Machine Learning" by Tom Mitchell

PROBLEM
- Task T: playing checkers
- Performance measure P: percent of games won in the world tournament
- Training experience E: games played against itself

APPROACH
1. The exact type of knowledge to be learned
2. A representation for this target knowledge
3. A learning mechanism
The type of training experience available can have a significant impact on the success or failure of the learner.

TARGET FUNCTION
- The goal is to reduce the problem of improving performance P at task T to the problem of learning some particular target function.
- (Diagram) Indirect training experience: sequences of legal moves together with the final outcome, won or lost. Direct training experience: individual board states together with the moves required to win.

TARGET FUNCTION
- We want an evaluation function that assigns a numerical score to any given board state.
- We write V : B → ℝ to denote that V maps any legal board state from the set B to some real value (ℝ denotes the set of real numbers).
1. If b is a final board state that is won, then V(b) = 100.
2. If b is a final board state that is lost, then V(b) = -100.
3. If b is a final board state that is drawn, then V(b) = 0.
4. If b is not a final state in the game, then V(b) = V(b'), where b' is the best final board state that can be achieved starting from b and playing optimally until the end of the game (assuming the opponent plays optimally as well).

TARGET FUNCTION
- This recursive definition is correct but not efficiently computable, so an operational description of the ideal target function V is required.
- A learning algorithm is expected to acquire only some approximation to the target function; for this reason the process of learning the target function is often called function approximation.
On one hand, we wish to pick a very expressive representation, to allow representing as close an approximation as possible to the ideal target function V.
On the other hand, the more expressive the representation, the more training data the program will require in order to choose among the alternative hypotheses it can represent.

PROBLEM REPRESENTATION
A simple representation: for any given board state, the function will be calculated as a linear combination of the following board features.
- x1: the number of black pieces on the board
- x2: the number of red pieces on the board
- x3: the number of black kings on the board
- x4: the number of red kings on the board
- x5: the number of black pieces threatened by red (i.e., which can be captured on red's next turn)
- x6: the number of red pieces threatened by black

TARGET FUNCTION
Thus, our learning program will represent V̂(b) as a linear function of the form
V̂(b) = w0 + w1*x1 + w2*x2 + w3*x3 + w4*x4 + w5*x5 + w6*x6
where w0 through w6 are numerical coefficients, or weights, to be chosen by the learning algorithm. Learned values for the weights w1 through w6 will determine the relative importance of the various board features in determining the value of the board, whereas the weight w0 will provide an additive constant to the board value.

ESTIMATING TRAINING VALUES
- In order to learn the target function we require a set of training examples, each describing a specific board state b and the training value Vtrain(b) for b. In other words, each training example is an ordered pair of the form <b, Vtrain(b)>.
- Rule for estimating training values: Vtrain(b) <- V̂(Successor(b)), where Successor(b) is the next board state following b in which it is again the program's turn to move.

ADJUSTING THE WEIGHTS
- One common approach is to define the best hypothesis, or set of weights, as the one that minimizes the squared error E between the training values and the values predicted by the hypothesis V̂:
E = sum over training examples <b, Vtrain(b)> of (Vtrain(b) - V̂(b))^2
- Thus, we seek the weights, or equivalently the V̂, that minimize E for the observed training examples.

LMS TRAINING
Least mean squares, or LMS, training is one of several algorithms that incrementally refine the weights. The LMS weight-update rule:
- For each training example <b, Vtrain(b)>:
  - Use the current weights to calculate V̂(b).
  - For each weight wi, update it as wi <- wi + η (Vtrain(b) - V̂(b)) xi
where η is a small constant that moderates the size of the weight update.

THE FINAL DESIGN
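The pieces above fit together into a small training loop, which can be sketched as follows. This is a minimal illustration, not Mitchell's implementation: the feature vector is assumed to be ordered (x1, ..., x6) as listed under Problem Representation, and the learning-rate default `eta=0.01` is an arbitrary choice for the sketch.

```python
def v_hat(weights, features):
    """Linear evaluation: V^(b) = w0 + w1*x1 + ... + w6*x6.

    `weights` has 7 entries (w0 is the additive constant);
    `features` has 6 entries (x1..x6).
    """
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def estimate_training_value(weights, successor_features):
    """Bootstrap rule: Vtrain(b) <- V^(Successor(b)),
    i.e. score b using the current estimate of its successor state."""
    return v_hat(weights, successor_features)

def lms_update(weights, features, v_train, eta=0.01):
    """One LMS step: wi <- wi + eta * (Vtrain(b) - V^(b)) * xi.

    The constant weight w0 is updated with x0 = 1.
    Returns the new weight vector.
    """
    error = v_train - v_hat(weights, features)
    new_weights = [weights[0] + eta * error]  # bias term, x0 = 1
    new_weights += [w + eta * error * x for w, x in zip(weights[1:], features)]
    return new_weights
```

For example, starting from all-zero weights, a won final position (Vtrain = 100) with 12 black and 12 red pieces produces error 100, so w1 and w2 each move to 0.01 * 100 * 12 = 12.0 after one update. Repeating such updates over many self-play games drives E downward.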

