Study Guide
Review Sheet for COMM402 Final Exam

Chapter 6: Adaptive behavior

1. What is the general picture of adaptation and its relationship to communication?
o An action is taken.
o The world responds to the action.
o The individual infers something about the world.
o The individual then adapts his behavior so as to secure desirable responses.
Thus humans are adaptively rational: they are assumed to learn in a regular manner from trial and error.
Adaptation = learning. Here we assume that the person acts, the world reacts, and then the person needs to choose to adapt to the world's response. Someone (or the world) communicates to a person, who then learns and adapts.
All of the observations discussed in the chapter involve a responder whose responses seem to change over time. The responder appears to adapt on the basis of experience.
We expect choice processes to increase the effectiveness of the behavior in achieving individual goals (thus "adaptively rational").
Ex: rats in a maze: the cheese is a message that means "turn towards here."
Ex: a kid gets in trouble to get attention: the trouble is a message that says "attend to me."
Teaching, persuading, influencing, attracting.
False learning: when people, learning in an apparently intelligent way, come to believe things that are not true (a type of adaptation model).

2. What is the basic model of the chapter?
Reinforcement learning is the main topic of the chapter. Alternative learning models are S-R learning, modeling behaviors, etc.
Ex: rat Alfred in a maze: initial random behavior, finds some reinforcement, adapts behavior to the prospect of reward.

3. Understand the Alfred example.
Trial-and-error learning. At first, rat Alfred is placed at the head of the maze, then wanders around exploring until he eventually ends up in one of the two goal boxes. All of the doors are one-way, so once he goes in he cannot come back out, and he cannot see what is in the box until after he enters. Eventually Alfred turns left, but he does not know that the reward (the food) is in the right box. So
the next time, at some trial half an hour later, he goes from the beginning to the right-hand box and gets the food. During the early trials Alfred will turn left occasionally, but as the experiment continues he will gradually turn right more often, and eventually he will turn right every time. So Alfred's behavior becomes less and less random.
At first, $P_r(0) = P_l(0) = 0.50$, where 0 refers to time zero (before the first trial), $P_r$ is the probability of going right, and $P_l$ is the probability of going left. Since the food is always on the right, the probability of turning right increases over time:
$P_r(t+1) = P_r(t) + \text{some increment}$
So we can understand our problem as needing to model the increment.

4. How could we model the increment? What are the possible models, and which one works?
Model 1: A constant increment model. This model is incorrect because it leads to impossible results. We will assume that Alfred initially turned right and was rewarded, and that the learning increment for turning right is 0.20, so the equation is $P_r(t+1) = P_r(t) + 0.20$.
o $P_r(0) = 0.50$ (no learning increment yet at time 0)
o $P_r(1) = 0.70 = 0.50 + 0.20$
o $P_r(2) = 0.90 = 0.70 + 0.20$
o $P_r(3) = 1.10 = 0.90 + 0.20$
This looks reasonable for a while, but eventually the probability exceeds 1, which is not possible, and the model does not fit the data well.
Model 2: A constant proportion model. This model requires some constant $a$ that represents the proportion Alfred learns at each trial. The quantity $(1 - P_r)$ represents the amount that Alfred has yet to learn about his maze. We assume that in each trial Alfred learns a constant proportion of the amount he has yet to learn, so the increment is $a(1 - P_r)$. $a$ (the rate) reflects how smart you are on that particular problem.
$P_r(1) = P_r(0) + \text{increment}$, so $P_r(1) = P_r(0) + a(1 - P_r(0))$: how right he was in the first place, plus a proportion $a$ of how wrong he was in the first place.
This means that the probability of going right on some trial equals the probability from the previous trial plus the rate of learning $a$ times what had yet to be learned on the prior trial. $a$ is always
positive and always remains constant. You can see in the equation that every time he finds cheese, the probability of him going right increases; however, the increment gets smaller and smaller as he gets closer to 100% correct behavior.
THIS WORKS because Model 2 never gives a result where the probability is greater than 1. Since $a$ is a fraction, at each point you just add a fraction of what was left to learn. This produces an asymptotic curve where Alfred's behavior approaches $P_r = 1$. So behavior that is reinforced tends to become more frequent (or probable).
Model 2 predicts that:
o Learning occurs.
o Probabilities rise faster at the start of learning.
o Stupid choices remain possible, though decreasingly likely.
o We can measure the learning rate $a$.

5. What are the assumptions of the various adaptation equations?
  1. Alternative behaviors are available to the individual (ex: left or right; Betsy may or may not have an accident; Herman may go to the lab or to class).
  2. The state of the individual is described by probabilities that add up to 1 (ex: Alfred's initial state is a 0.50 probability of going left and a 0.50 probability of going right).
  3. The world responds differently to various behaviors (ex: give the mouse cheese or don't; smiling, giving love, giving attention, biting, etc.). These are the alternative responses of the world.
  4. The state of the world is described by the probabilities of various responses to each behavior (ex: give food on the left on 70% of the trials and on the right on 30% of the trials, or always give food on both sides). These are the rules given.
  5. The set of possible events is the combination of possible behaviors crossed with possible world responses (going left or going right, crossed with reward or no reward). This combines assumptions 1 and 3 and covers all possible events that can occur within the model, along with their probabilities.
  6. Specify an adaptation equation for each possible event:
  o Event 1: individual goes left, reward is given
  o Event 2: individual goes left, no reward is given
  o Event 3: individual goes right, reward is given
  o Event 4: individual goes right, no reward is given

6. Make
sure you understand the four adaptation equations on Page 259.
The model generates sets of equations that describe the learning (adaptation); this is accomplished by exploring how the probabilities change. The model is restricted to two choices (left or right) and two outcomes or responses (cheese/reward or no cheese/no reward). There is only one person or party doing the learning.
o E1 (left and reward): $P_l(t+1) = P_l(t) + a(1 - P_l(t))$; increases $P_l$, decreases $P_r$
o E2 (left and NO reward): $P_l(t+1) = P_l(t) - bP_l(t)$; decreases $P_l$, increases $P_r$
o E3 right …
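
As a numerical check on question 4, here is a minimal Python sketch comparing the two increment models. It assumes Alfred turns right and is rewarded on every trial, starting from a probability of 0.50. The increment of 0.20 for Model 1 comes from the worked example; reusing 0.20 as the value of a for Model 2 is only an illustrative assumption, since no value of a is given here. The function names are hypothetical.

```python
def constant_increment(p0, step, trials):
    """Model 1: add the same fixed step after every rewarded trial."""
    probs = [p0]
    for _ in range(trials):
        probs.append(probs[-1] + step)
    return probs


def constant_proportion(p0, a, trials):
    """Model 2: add a fixed fraction a of what is left to learn, (1 - p)."""
    probs = [p0]
    for _ in range(trials):
        p = probs[-1]
        probs.append(p + a * (1 - p))
    return probs


# Assumed values: start at 0.50, reward on every trial, a = 0.20 for Model 2.
model1 = constant_increment(0.50, 0.20, 3)
model2 = constant_proportion(0.50, 0.20, 3)

for t, (m1, m2) in enumerate(zip(model1, model2)):
    print(f"t={t}  Model 1: {m1:.2f}  Model 2: {m2:.3f}")
```

Under these assumptions Model 1 reaches 1.10 at trial 3, which is impossible for a probability, while Model 2 gives 0.600, 0.680, and 0.744, rising by smaller steps each trial and staying below 1.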