Berkeley COMPSCI 182 - Reinforcement Learning


CS 182
Reinforcement Learning

An example RL domain
• Solitaire
  – What is the state space?
  – What are the actions?
  – What is the transition function?
    • Is it deterministic?
  – What are the rewards?
• (What about Tetris?)

MDPs
• Markov Decision Processes
• What makes them “Markov”?
• General routine
  – Start with a state, s
  – a = π(s)
  – s' = T(s,a)
  – r = R(s,a,s')
  – s = s'; repeat

Policies and values
• What are policies?
• What are value functions?
• How are they related?

Bellman equation
• How are V(s) and Q(s,a) related?

Reward and utility
• Do you keep track of utility?
• Do you have a value function V(s) or Q(s,a)?
• How do you value future rewards?

Policies etc.
• Consider “micro pac-man world”
  – 4 squares, 1 ghost, move in 4 cardinal directions or stay still
  – What's a reasonable policy for the domain?
  – What are the Q-values for this policy?
  – What would the RL algorithms do from here?
    • value iteration a.k.a. dynamic programming
    • Q-learning

Issues with RL
• What happens when the state space gets big?
  – or continuous?
• What if there's someone else in the environment?
• How do you learn faster than thousands of
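
The general routine, the relation between V(s) and Q(s,a), and value iteration from the slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 4-state chain world, the action names, the reward of 1 for entering the goal, and the discount factor of 0.9 are all hypothetical and are not the “micro pac-man world” from the slides. The names T, R, and the policy follow the slide notation; everything else is made up for the example.

```python
# Hypothetical 4-state chain world (an assumption, not the slides' pac-man world).
STATES = [0, 1, 2, 3]
ACTIONS = ["left", "right", "stay"]
GOAL = 3
GAMMA = 0.9  # assumed discount factor: how much future rewards are worth


def T(s, a):
    """Deterministic transition function: returns the next state s'."""
    if s == GOAL:  # the goal state is absorbing
        return s
    if a == "left":
        return max(s - 1, 0)
    if a == "right":
        return min(s + 1, GOAL)
    return s  # "stay"


def R(s, a, s_next):
    """Reward of 1 for entering the goal, 0 otherwise (assumed)."""
    return 1.0 if s != GOAL and s_next == GOAL else 0.0


def rollout(policy, s, steps):
    """General routine: a = pi(s), s' = T(s,a), r = R(s,a,s'), s = s'; repeat."""
    total, discount = 0.0, 1.0
    for _ in range(steps):
        a = policy(s)
        s_next = T(s, a)
        r = R(s, a, s_next)
        total += discount * r  # accumulate discounted reward (utility)
        discount *= GAMMA
        s = s_next
    return total


def q_from_v(V, s, a):
    """Deterministic Bellman backup: Q(s,a) = R(s,a,s') + gamma * V(s')."""
    s_next = T(s, a)
    return R(s, a, s_next) + GAMMA * V[s_next]


def value_iteration(tol=1e-9):
    """Repeat V(s) <- max_a Q(s,a) until the values stop changing."""
    V = {s: 0.0 for s in STATES}
    while True:
        new_V = {s: max(q_from_v(V, s, a) for a in ACTIONS) for s in STATES}
        if max(abs(new_V[s] - V[s]) for s in STATES) < tol:
            return new_V
        V = new_V


def greedy_policy(V):
    """Policy that picks the action with the highest Q-value in each state."""
    return lambda s: max(ACTIONS, key=lambda a: q_from_v(V, s, a))


V = value_iteration()
pi = greedy_policy(V)
```

In this toy world the greedy policy moves right toward the goal, and a rollout of that policy from state 0 collects the same discounted reward that value iteration assigns to V(0). Q-learning would aim at the same Q-values but estimate them from sampled transitions rather than from known T and R.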

