Show Me the Money
Dmitry Kit

Outline
- Overview
- Reinforcement Learning
- Other Topics
- Conclusions

Learning Models
- Hebbian learning
  - Strengthens the relationship between neurons that exhibit similar activity patterns and/or are in close proximity
  - Might explain topological features of the brain
    - Population coding
    - Basis function learning
    - Area allocation to different functions
- Reinforcement learning (RL)
  - Strengthens the relationship between choices that are causally connected in obtaining some reward

RL Framework and the Brain
- Reward signal representation
  - Dopamine
- Local action selection structures
  - Lateral intraparietal area (LIP)
  - Supplementary eye field (SEF)
  - Frontal eye field (FEF)
- Global mechanism for action selection
  - Basal ganglia

Outline
- Overview
- Reinforcement Learning
  - Reward signal
    - Dopamine
  - Decision variables
    - SEF
    - LIP
    - Other
  - Global mechanism for choice
    - Basal ganglia
- Other Topics
- Conclusions

Reward Signal: Dopamine
- Located in the substantia nigra pars compacta (SNc)
- Modulates neurons in many different regions
- Tonic low-frequency activity
- Only sends the error signal between expected and actual rewards

RL and Dopamine

Decision Variables in LIP
- Contains neurons that code for expected gain (relative rewards between different actions)
- This activity was observed before the choices were actually presented and the movement was made
  - Suggests that these neurons were used to decide on the appropriate action

LIP Neuron Activity
- Expectation of high reward produced a high firing frequency (black line)
- Expectation of low reward produced a low firing frequency (gray line)
- The firing rate was correlated with gain expectation early in the trial

Overall Neural Activity in LIP
- A large portion of the examined neurons showed significant activity related to gain expectation, outcome probability, and estimated value
  - This activity was mostly exhibited in the early part of the trial
- These neurons were also modulated by the actual movement

Neural Features of SEF
- Three types of neurons found
- Active upon failure to perform task
- Active upon success
- Not responsible for executing actions
- Not related to spatial stimuli
- Possible error signal coding
- Not a response to visual stimuli
- Not responsible for motor control
- Related to some internal coding of performance
- Active before and during the delivery of reinforcement
- Possibly interconnected to other regions of the brain
- Seem to code expected reward versus actual reward received

Function of SEF
- Monitoring and controlling perception and production systems during decision making
  - Error correction
  - Production of responses that are not well learned
  - Overcoming habitual responses
- Evidence
  - Neurons do not generate eye movements
  - Neurons monitor performance and reward

Reward Coding in Other Structures
- Neurons in orbitofrontal cortex
  - Show selectivity to the type of physical reward (solid, liquid, etc.)
  - Distinguish between rewards and punishers
- Some neurons in the amygdala respond to the magnitude of reward

Local Choices
- Multiple areas are in charge of decision making
  - Frontal eye field (FEF)
  - LIP
  - Supplementary eye field (SEF)
  - Etc.
- They might have different goals
- Need a global mechanism to arbitrate between these different goals

Physiology: Basal Ganglia
- Located at the base of the cerebrum
- Consists of
  - Caudate nucleus (CD) and putamen (PUT), collectively called the striatum
    - Input from cerebral cortex and part of the thalamus
  - Globus pallidus
    - External segment (GPe)
    - Internal segment (GPi)
  - Subthalamic nucleus (STN)
    - Receives direct input from cerebral cortex
  - Substantia nigra
    - Pars reticulata (SNr)
    - Pars compacta (SNc)
- Output stations: GPi and SNr
  - To thalamus and brain stem motor areas

Anatomical Locations: BG

BG Function
- Controls thalamocortical networks
  - Mainly involved in hand or arm movements
- Controls brain stem motor networks
  - Superior colliculus (SC): eye-head orienting
  - Pedunculopontine nucleus: locomotion
  - Periaqueductal gray: vocalization, autonomic responses

BG-SC Connection
- Exists in many lower mammals
- Method of control
  - CD inhibits neural activity in SNr
  - SNr projects inhibitory connections to SC
- Inhibition is the main method of control
  - The appropriate action is selected by inhibiting all except the desired action

Neural Properties of BG
- Contains memory-guided neurons
- Contains neurons that code expectation of task-specific events
- SNr
  - Only affected by planned movements
  - Response fields of its neurons are the same as those of the SC neurons they connect to

Circuit Diagram

Coordinated Activity Model
- Use GPe to select just the activity you need (focus)
- Use STN to inhibit a planned future activity (sequencing)
- Might be an incorrect model if we emphasize the direct cortical input to the STN
  - Direct control over movement suppression

Learning of Sequential Procedures
- Frontoparietal association cortices and the anterior part of the basal ganglia learn new sequences
  - Use visuospatial coordinates
- Motor and premotor cortices and the mid-posterior part of the basal ganglia exploit learned sequences
  - Use motor coordinates

BG and Decision Making
- Ventral striatum receives input from neocortical (cognition) and limbic (emotional) areas
  - Speed of saccades is related to emotional or motivational state
- As with SEF and LIP, many BG neurons respond to the expectation of reward
- Uses dopaminergic neurons to
  - Modulate the selectivity of individual neurons
  - Modulate the response magnitude of individual neurons

Circuit Diagram Revisited

Consequences of BG Disorders
- Involuntary movement
- Random movement
- Shorter saccades
- Problems with coordinated movements
- Response deficit in memory-guided saccades
- Trouble holding fixation
- Visually guided saccades
- Especially if STN is damaged
- Inability to learn sequential procedures
- Lack of motivation to perform actions

Why Disinhibition?
- Possibly an evolutionary by-product
- Need a gating mechanism, not an enhancement mechanism

Outline
- Overview
- Reinforcement Learning
- Other Topics
  - Attention vs. Reward
  - Credit Assignment Problem
- Conclusions

Attention or Reward?
- Attention is a more global concept than reward
- Attention can modulate neurons before the onset of stimuli, just like reward expectation neurons
- Attention is dependent on task difficulty
- How does one distinguish between a reward expectation signal and attention to a particular stimulus at a single neuron?
- Attention: defined as the study of vigilance, selective processing of stimuli, and control systems for complex behavior
- Some studies of attention might have been looking at the same neural signal as those studying reward
- Provide better definitions for reward and attention
  - Attention might be defined in terms of rewards

The Credit Assignment Problem
- What chain of actions resulted in reward?
- Which of the actions to the right got you your
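One standard way to frame the credit assignment question, together with the error signal between expected and actual reward attributed to dopamine above, is temporal-difference (TD) learning. The slides do not name a specific algorithm, so the sketch below is an assumption: a TD(0) value update on a hypothetical five-state chain in which reward arrives only at the final step. The function name `td0_chain`, the chain task, and the parameters `alpha` (learning rate) and `gamma` (discount factor) are illustrative choices, not taken from the talk.

```python
def td0_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9):
    """TD(0) value learning on a deterministic chain 0 -> 1 -> ... -> n_states-1.

    Hypothetical task: a reward of 1.0 is delivered only on the transition
    out of the last state; every other transition gives 0.
    Returns the learned state values and the per-episode prediction errors.
    """
    # One extra entry represents the terminal state, whose value stays 0.
    values = [0.0] * (n_states + 1)
    delta_history = []

    for _ in range(episodes):
        deltas = []
        for s in range(n_states):
            reward = 1.0 if s == n_states - 1 else 0.0
            # Prediction error: (actual reward + discounted next estimate)
            # minus the current estimate for this state.
            delta = reward + gamma * values[s + 1] - values[s]
            # Move the estimate a fraction alpha toward the target.
            values[s] += alpha * delta
            deltas.append(delta)
        delta_history.append(deltas)

    return values[:n_states], delta_history


if __name__ == "__main__":
    values, delta_history = td0_chain()
    print("learned values:", [round(v, 3) for v in values])
    print("errors, first episode:", delta_history[0])
    print("errors, last episode:", [round(d, 6) for d in delta_history[-1]])
```

In this sketch the prediction error initially appears only at the rewarded step and shrinks as the reward becomes expected, while earlier states gradually acquire value. That is one way credit can reach choices that preceded the reward, under the assumptions stated above.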