Reinforcement Learning and Beliefs

In defining 'Feedback and Learning' for the Encyclopedia of the Sciences of Learning, we wrote that "feedback plays a crucial role in learning" and that it is both external (i.e., coming from the environment) and internal, that is to say "when it is the outcome of introspection processes in which individual choices are confronted with internalized beliefs, or social and ethical norms and so with past experience and learning".

Contributions in neuroeconomics have started to investigate the neural circuitry involved in strategic interactions, where behavior is led by a continuous process of reinforcement learning and belief adaptation. A recent paper in PNAS confirms that this dual channel of learning from feedback is also at work at the neural level.

A brief description of the work follows.

Lusha Zhu, Kyle E. Mathewson, and Ming Hsu, Dissociable neural representations of reinforcement and belief prediction errors underlie strategic learning, PNAS 2012

Abstract
Decision-making in the presence of other competitive intelligent agents is fundamental for social and economic behavior. Such decisions require agents to behave strategically, where in addition to learning about the rewards and punishments available in the environment, they also need to anticipate and respond to actions of others competing for the same rewards. However, whereas we know much about strategic learning at both theoretical and behavioral levels, we know relatively little about the underlying neural mechanisms. Here, we show using a multi-strategy competitive learning paradigm that strategic choices can be characterized by extending the reinforcement learning (RL) framework to incorporate agents’ beliefs about the actions of their opponents. Furthermore, using this characterization to generate putative internal values, we used model-based functional magnetic resonance imaging to investigate neural computations underlying strategic learning. We found that the distinct notions of prediction errors derived from our computational model are processed in a partially overlapping but distinct set of brain regions. Specifically, we found that the RL prediction error was correlated with activity in the ventral striatum. In contrast, activity in the ventral striatum, as well as the rostral anterior cingulate (rACC), was correlated with a previously uncharacterized belief-based prediction error. Furthermore, activity in rACC reflected individual differences in degree of engagement in belief learning. These results suggest a model of strategic behavior where learning arises from interaction of dissociable reinforcement and belief-based inputs.
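To make the two notions of prediction error in the abstract concrete, here is a minimal textbook-style sketch in Python. It is not the authors' computational model: the function names, the learning rates `alpha` and `eta`, and the toy payoff matrix are all assumptions chosen for illustration. The reinforcement prediction error compares the reward received with the value of the action taken, while the belief prediction error compares what the opponent actually did with what the player expected.

```python
def rl_update(values, action, reward, alpha=0.2):
    """Reinforcement learning: update only the value of the chosen action.

    alpha is a hypothetical learning rate, not a value from the paper.
    """
    delta = reward - values[action]  # reinforcement prediction error
    new_values = list(values)
    new_values[action] += alpha * delta
    return new_values, delta


def belief_update(beliefs, opp_action, eta=0.2):
    """Belief learning: update the estimated probabilities of the opponent's actions.

    eta is a hypothetical learning rate, not a value from the paper.
    """
    # Belief prediction error per opponent action: observed (1 or 0) minus predicted probability.
    deltas = [(1.0 if k == opp_action else 0.0) - p for k, p in enumerate(beliefs)]
    new_beliefs = [p + eta * d for p, d in zip(beliefs, deltas)]
    return new_beliefs, deltas


def expected_payoffs(payoff, beliefs):
    """Expected payoff of each own action given beliefs about the opponent.

    payoff[a][k] is the player's payoff for own action a against opponent action k.
    """
    return [sum(p * row[k] for k, p in enumerate(beliefs)) for row in payoff]


# Toy example (illustrative numbers only): two actions per player.
values, rpe = rl_update([0.0, 0.0], action=0, reward=1.0)
beliefs, bpe = belief_update([0.5, 0.5], opp_action=1)
payoffs = expected_payoffs([[1.0, 0.0], [0.0, 1.0]], beliefs)
```

In this sketch the reinforcement channel only learns about the action that was actually played, whereas the belief channel revises the expected payoff of every action at once, which is one simple way to see why the two error signals can be dissociated.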
