CHAPTER 6: ADAPTIVE BEHAVIOR
05/13/2012

o An action is taken, the world responds to the action, and the individual infers something about the world and then adapts his behavior so as to secure desirable responses.
o Consider humans as adaptively rational: they are assumed to learn from trial and error.
   o All examples involve a response of a human or an animal, and that response changes over time.
   o Humans experiment with various alternatives and choose some more often than others because of the pleasurable or unpleasurable consequences they have experienced following the choice.
   o Choice processes increase the effectiveness of behavior in achieving individual goals.
o False Learning: when people, learning in an apparently intelligent way, come to believe things that are not true.

THE BASIC MODEL

o Reinforcement Learning: the main adaptation process focused on in the chapter.
o Observe the adaptive behavior of an animal in a maze.
   o T-maze: named for its T shape.
   o The mouse is placed in a starting box at the head of the maze, wanders around, and ends up in 1 of 2 goal boxes.
   o All doors are one-way: once the mouse goes through them, it can't go back out.
   o The mouse cannot see what's in a goal box until it enters it.
o ALFRED
   o Food is put in the right-hand goal box and the left goal box is left empty.
   o Alfred is then placed in the starting box with no idea that there is food in the right goal box.
   o First trial: Alfred eventually goes left after some time.
   o Half an hour later: he leaves the starting box quickly and eventually turns right instead of left, discovering the food.
   o Every half hour, food is put in the right-hand goal box, Alfred is put in the starting box, and his behavior (learning) is observed.
   o After more and more trials, he eventually turns right every time.
   o His original behavior was random until he learned that some kinds of actions bring pleasant rewards; he now performs those actions instead. Overall, he has adapted to the situation.
o Behavior (turning left or right) that is reinforced (rewarded) becomes more frequent, whereas behavior that is not rewarded becomes less frequent.
o This applies to humans as well. Example: a mother teaching her child the alphabet holds up a letter and the child makes a sound. The mother smiles if it is correct; if incorrect, no smile.
   o Subject: the child
   o Behavior: the sound the child makes
   o Reward: the mother's smile
o $P_R(0)$ = initial probability of turning right, before the 1st trial.
o $P_L(0)$ = initial probability of turning left, before the 1st trial.
o Over time $P_R$ will increase and $P_L$ will decrease, because only the right goal box has food and the animal is capable of learning the environment.
o $P_R(t+1) = P_R(t) + \text{(some increment)}$
   o $P_R$ at time $t+1$ is related to $P_R$ at time $t$.
   o $P_R(t+1)$ has increased, and the amount of this increase (the amount that has been learned as a result of the trial) is the increment.

A Constant Increment Model (fails)

o Assume Alfred happened to turn right initially, was rewarded, and that the learning increment for turning right is 0.2:
   $P_R(t+1) = P_R(t) + 0.2$
o If Alfred was originally neutral in turning preference, that is $P_R(0) = 0.5$, and if he happens to turn right on the 1st trial, then
   $P_R(1) = P_R(0) + 0.2 = 0.5 + 0.2 = 0.7$
o At the beginning of the 2nd trial, Alfred's probability of turning right is 0.7. If he turns right on the 2nd trial:
   $P_R(2) = P_R(1) + 0.2 = 0.7 + 0.2 = 0.9$
o After 2 trials Alfred has almost completely adapted to the situation: a 90% chance of turning correctly (right).
o One more rewarded right turn gives $P_R(3) = P_R(2) + 0.2 = 0.9 + 0.2 = 1.1$, which is impossible for a probability.
o The trouble is with the assumption of a constant learning increment: 0.2 could be too large, but making the increment smaller still leads to impossible numbers after enough trials.
o You need a different adaptation equation, one that stays within the 0 to 1 probability range.
o You need a model with a variable increment.

A Constant Proportion Model

o The quantity $1 - P_R$ represents the amount that Alfred has yet to learn about the maze.
o If the current probability of turning right is 0.7, then the amount he has left to learn is $1 - 0.7 = 0.3$.
o Assume that in each trial Alfred learns a constant proportion of the amount he has left to learn.
   o $a$ = learning proportion (rate)
   o The increment is then $a[1 - P_R]$.
o Suppose Alfred's initial chance of turning right is 0.5, the learning rate $a$ is 0.3, and he turns right the 1st time:
   $P_R(1) = P_R(0) + \text{increment}$
   $P_R(1) = P_R(0) + a[1 - P_R(0)] = 0.5 + 0.3(1 - 0.5) = 0.5 + 0.15 = 0.65$
o Alfred again makes the correct turn (right) on the 2nd trial:
   $P_R(2) = P_R(1) + a[1 - P_R(1)] = 0.65 + 0.3(1 - 0.65) = 0.755$
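The two worked examples above can be checked with a short Python sketch. It is only an illustration under the notes' own assumptions (Alfred turns right and is rewarded on every trial); the function names `constant_increment` and `constant_proportion` are hypothetical and do not come from the chapter.

```python
def constant_increment(p0, increment, trials):
    """Add a fixed increment to P_R after each rewarded right turn."""
    probs = [p0]
    for _ in range(trials):
        probs.append(probs[-1] + increment)
    return probs


def constant_proportion(p0, a, trials):
    """Move P_R a fraction `a` of the remaining distance to 1 after each rewarded right turn."""
    probs = [p0]
    for _ in range(trials):
        p = probs[-1]
        probs.append(p + a * (1 - p))
    return probs


if __name__ == "__main__":
    # Values from the notes: P_R(0) = 0.5, increment 0.2, learning rate a = 0.3.
    print([round(p, 4) for p in constant_increment(0.5, 0.2, 3)])
    print([round(p, 4) for p in constant_proportion(0.5, 0.3, 2)])
```

With the values from the notes, the constant increment sequence passes 1 on the third rewarded trial, while the constant proportion sequence reproduces 0.65 and 0.755 and can never exceed 1, because each step adds only a fraction of the distance that remains.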
o The learning increment is always positive because he learns something on every trial.
o The increment gets smaller as he gets closer to 100% correct behavior.
o General Principle of Learning: behavior that is reinforced tends to become more frequent (more probable).

6 ASSUMPTIONS OF OUR MODEL

o 1. Alternative behaviors for the individual: may turn left or right.
o 2. State of the individual: the probabilities of the behaviors, which add up to 1.
o 3. Alternative responses of the world: cheese or no cheese.
o 4. State of the world: the world has some set of rules for its own behavior, that is, probabilities of the various responses to each behavior (e.g., right: 70% cheese).
o 5. Set of possible events: the combination of possible behaviors crossed with possible world responses (e.g., go left and no reward; go right and reward).
o 6. Adaptation equations: a specific equation for each possible event (e.g., the event "left, no reward" has its own equation).

ADAPTATION EQUATIONS

o E1 (Left and Reward): $P_L(t+1) = P_L(t) + a[1 - P_L(t)]$
o E2 (Left and No Reward): $P_L(t+1) = P_L(t) - bP_L(t)$
o E3 (Right and Reward): $P_R(t+1) = P_R(t) + a[1 - P_R(t)]$
o E4 (Right and No Reward): $P_R(t+1) = P_R(t) - bP_R(t)$
o $a$ = learning rate associated with rewarded behavior.
   o A number in the range 0 to 1.
   o Shows the rate of response to reinforcement: low values mean slow learning, high values mean fast learning.
o $b$ = learning rate associated with nonreward.
   o Its range of possible values and its interpretation are identical to those of $a$.
o These 4 equations rest on a few simple principles:
   o Behavior that is reinforced becomes more probable, and behavior that is not reinforced becomes less probable.
   o Rewards cause behavior to change at rate $a$; nonrewards cause behavior to change at rate $b$.
   o The amount of adaptation on any trial is always a constant fraction ($a$ or $b$) of the amount left to be learned.

MEANING OF PROBABILITY IN ADAPTATION MODELS

o The probability assumption is a simplification of the real world.
o We only need to observe behavior.
o Probability is used as a shorthand aggregation of the individual's complexity.
o What matters is that the world's behavior is not constant.

SIGNIFICANCE OF A AND B

o $a$ = learning rate associated with rewarded behavior.
o $b$ = learning rate associated with nonreward.
   o Both are numbers between 0 and 1.
   o Small values mean slow changes; higher values mean rapid changes.
o Motivation: used to summarize the attributes of a reward, or the current state of the subject, that make a given reward important to a given subject at a specific time.
o Learning ability: the ability of the subject to draw inferences and modify behavior on the basis of experience.
o If ability or motivation vary from situation to situation or …
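As a rough illustration of how the four adaptation equations (E1 to E4) fit together with the six model assumptions, here is a minimal Python sketch. It is not taken from the chapter: the function names `adapt` and `simulate`, the seed, and the parameter values in the usage lines (including b = 0.1 and a 0% reward chance on the left) are assumptions chosen for illustration. Only a = 0.3 and the 70% reward chance on the right echo numbers in the notes.

```python
import random


def adapt(p_right, turned_right, rewarded, a, b):
    """Apply the adaptation equation for one event and return the new P_R.

    P_L is always 1 - P_R, so events E1 and E2 update P_L and the result
    is converted back to P_R.
    """
    if turned_right:
        if rewarded:
            return p_right + a * (1.0 - p_right)   # E3: right and reward
        return p_right - b * p_right               # E4: right and no reward
    p_left = 1.0 - p_right
    if rewarded:
        p_left = p_left + a * (1.0 - p_left)       # E1: left and reward
    else:
        p_left = p_left - b * p_left               # E2: left and no reward
    return 1.0 - p_left


def simulate(p_right, a, b, reward_right, reward_left, trials, seed=0):
    """Run repeated trials; the world rewards each turn with a fixed probability."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    history = [p_right]
    for _ in range(trials):
        turned_right = rng.random() < p_right
        chance = reward_right if turned_right else reward_left
        rewarded = rng.random() < chance
        p_right = adapt(p_right, turned_right, rewarded, a, b)
        history.append(p_right)
    return history


if __name__ == "__main__":
    # Assumed world: right pays off 70% of the time, left never does.
    run = simulate(p_right=0.5, a=0.3, b=0.1,
                   reward_right=0.7, reward_left=0.0, trials=50)
    print(round(run[-1], 3))
```

Tracking only P_R is a design choice that relies on assumption 2 (the probabilities add up to 1). Varying `a` and `b` in this sketch is one way to see the "slow changes" versus "rapid changes" interpretation of the two learning rates.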


UMD COMM 402 - CHAPTER 6
