
Chapter 6: Adaptation

Learning
Previous chapters were about calculated rationality: some sort of choice in a static system. We assumed that people weren't learning anything and that the world was static; neither of the two parties reacted to the first trial.
Here we assume that a person acts, the world reacts, and the person then needs to choose how to adapt to the world's response. Now we get into the reality of interaction with history.
Example: you consider the history of a friendship when making choices. Thus: adaptively rational.

Adaptation and Communication
Rats in a maze: the cheese is a message that means "turn toward here."
o The rat is assumed to be working in a T-maze.
o Probability of going left and right; cheese is the reward.
o Studying how we learn: the cheese is the message; the rat receives the message and searches for the reward.
o The rat arranges his life so he gets more rewards.
Ex: You see what things get you dumped and you quit doing them.
Ex: A kid gets in trouble to get attention; the trouble is a message that says "attend to me."
There are all kinds of other applications: you teach your boyfriend "do this and not that" by kissing, etc.
Someone (or the world) communicates to a person, who then learns and adapts.
Teaching, persuading, influencing, attracting: does all communication ask for adaptation by the receiver?

Basic Model
Reinforcement learning is the main topic of the chapter: you are rewarded for one choice and not rewarded for another.
o Simplified further: no punishments
o Also known as operant behavior
o Alternative learning models: S-R learning, modeling behavior, etc.
Ex: the rat Alfred in a maze
o Initially random behavior
o Finds some reinforcement
o Adapts behavior to the prospect of reward

Alfred: Trial-and-Error Learning
Behavior becomes less and less random.
At first you are at time 0, getting ready for trial 1: $p_R(0) = p_L(0) = 0.50$. You absolutely have to keep track of what trial you are on.
$p_R$ is the probability of a right turn; 0 refers to time zero, before the first trial.
Since the food is always on the right, Alfred will learn to turn right, and the probability of turning right will increase over time. His probability at time 0 is 0.50, but that will rise.
o $p_R(t+1) = p_R(t) + \text{increment}$
After you find cheese you are more likely to go right again, so we add some increment to the equation. The problem is then to model the increment, so that we know how quickly he adapts.

Modeling the Increment
Model 1: a constant-increment model (a bad model; we won't use it)
Suppose the increment is 0.2 and the probability goes up evenly:
o $p_R(0) = 0.50$
o $p_R(1) = 0.70$ (probability on trial 1, in anticipation of trial 2)
o $p_R(2) = 0.90$
o $p_R(3) = 1.10$ (probability on trial 3, in anticipation of trial 4)
It looked reasonable for a while, but a probability of 1.10 is impossible, so we have to do something other than add a constant increment. The constant-increment model is bad because it leads to impossible predictions.

Model 2: a constant-proportion model (the good model; the one we will actually use)
If the cheese is always on the right, then $1 - p_R(t)$ is what Alfred has left to learn at time t. It is the amount of error: how much Alfred has yet to learn.
Suppose we model him this way: at each trial, Alfred learns a constant proportion of what he has left to learn. Every time he learns, he reduces what he has left to learn by that proportion.
Model 2 requires a constant a (it will be given to us) that represents the proportion Alfred learns on each trial.
Modeling the increment:
o $p_R(1) = p_R(0) + a\,[1 - p_R(0)]$
This means the probability of going right on some trial equals the probability from the previous trial plus an increment: the rate of learning a times what had yet to be learned on the prior trial.
Little a is measurable, and it won't be the same from person to person. Some people learn faster than others. Some people get a reward and do the same thing over and over; some people get a reward and then think they could get a bigger reward by doing something different next time.
Model 2 never gives a result where the probability is 1 or greater than 1. You will never get all the way up to 100%: since a is a fraction, at each point you just add a fraction of what was left to learn.
It produces an asymptotic curve where Alfred's behavior approaches $p_R = 1$, the asymptote (p. 255).

Modeling Adaptation
Model 2 fits observed results reasonably well. It predicts:
o Learning occurs.
o Probabilities rise faster at the start of learning: you have more to correct at the beginning, and you correct a fraction of what remains every time.
o Stupid choices remain possible, though decreasingly likely (the rat can say "oh, what the hell, let's go to the left").
o We can measure the learning rate a: how long does it take us to learn? It will be important in applications.

Adaptation Equations: Assumptions
Expanding on Model 2, the constant-proportion model. 6 assumptions specified:
1. Alternate behaviors are available (e.g., Left, Right).
2. The state of the individual is described by probabilities that add up to 1. So if your probability of going right is 0.40, then your probability of going left is 0.60.
3. The world responds differently to various behaviors (e.g., Cheese, No Cheese). The world will reward some things you do and not reward other things you do. We are ignoring the possibility of punishments.
4. The state of the world is described by the probabilities of various responses to each behavior (e.g., Right: 70% cheese). So far we've had cheese on the right 100% of the time, but as we move forward there could be cheese on the left 30% of the time, or even cheese on both sides.
5. The set of possible events is the combination of possible behaviors crossed by possible world responses (e.g., Left/No Reward, Left/Reward).
   o 4 possibilities: left and cheese, left and no cheese, right and cheese, right and no cheese
6. Specify adaptation equations for each possible event (e.g., event 1: Left/No Reward). There are 4 possibilities, so there will be 4 equations.

Adaptation Equations
The model generates sets of equations that describe the learning (adaptation) process. They show how the probabilities will change over time. Eventually there will be a 99% probability that the mouse goes right, having started at 50/50.
o Accomplished by explaining how the probabilities change
The text model will be restricted to:
o Two choices (e.g., Left, Right) …
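
The two increment models from these notes can be compared with a short sketch. It assumes the cheese is always on the right, so every trial counts as a learning step, and a starting probability of 0.50. The learning rate a = 0.2 for Model 2 is only an illustrative value (the notes give 0.2 for Model 1's increment but no value for a), and the function names are hypothetical.

```python
def constant_increment(p0, step, trials):
    """Model 1: add the same fixed step to p_R after every trial."""
    probs = [p0]
    for _ in range(trials):
        probs.append(probs[-1] + step)
    return probs


def constant_proportion(p0, a, trials):
    """Model 2: on each trial, learn a fixed fraction a of what is
    left to learn, i.e. p_R(t+1) = p_R(t) + a * (1 - p_R(t))."""
    probs = [p0]
    for _ in range(trials):
        p = probs[-1]
        probs.append(p + a * (1 - p))
    return probs


# Assumed values: start at 0.50; increment and learning rate both 0.2.
model1 = constant_increment(0.50, 0.2, 3)
model2 = constant_proportion(0.50, 0.2, 3)

print([round(p, 2) for p in model1])  # [0.5, 0.7, 0.9, 1.1]
print([round(p, 3) for p in model2])  # [0.5, 0.6, 0.68, 0.744]
```

Model 1 passes 1 on the third step, while Model 2 keeps shrinking the remaining error: unrolling the recursion gives $1 - p_R(t) = (1 - a)^t\,[1 - p_R(0)]$, which gets closer to 0 on every trial but never reaches it for 0 < a < 1.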



UMD COMM 402 - Chapter 6: Adaptation
