Department of Psychiatry, Yale University School of Medicine, New Haven, Connecticut 06519
Human Informatics and Interaction Research Institute, the National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba 305-8560, Japan.
eNeuro. 2024 Feb 27;11(2). doi: 10.1523/ENEURO.0437-23.2024. Print 2024 Feb.
Organisms learn to gain reward and avoid punishment through action-outcome associations. Reinforcement learning (RL) offers a critical framework to understand individual differences in this associative learning by assessing learning rate, action bias, Pavlovian factor (i.e., the extent to which action values are influenced by stimulus values), and subjective impact of outcomes (i.e., motivation to seek reward and avoid punishment). Nevertheless, how these individual-level metrics are represented in the brain remains unclear. The current study leveraged fMRI in healthy humans and a probabilistic learning go/no-go task to characterize the neural correlates involved in learning to seek reward and avoid pain. Behaviorally, participants showed a higher learning rate during pain avoidance relative to reward seeking. Additionally, the subjective impact of outcomes was greater for reward trials and was associated with lower response randomness. Our imaging findings showed that individual differences in learning rate and performance accuracy during avoidance learning were positively associated with activity in the dorsal anterior cingulate cortex, midcingulate cortex, and postcentral gyrus. In contrast, the Pavlovian factor was represented in the precentral gyrus during pain avoidance and in the superior frontal gyrus (SFG) during reward seeking. Individual variation in the subjective impact of outcomes was positively predicted by activation of the left posterior cingulate cortex. Finally, action bias was represented by the supplementary motor area (SMA) and pre-SMA, whereas the SFG played a role in restraining this action tendency. Together, these findings highlight for the first time the neural substrates of individual differences in the computational processes during RL.
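To make the four behavioral parameters concrete, the sketch below shows one common way they enter a go/no-go RL model. The abstract does not give the study's equations, so everything here is an assumption for illustration: the function names, the delta-rule update, the logistic choice rule, the `noise` term standing in for response randomness, and the 80% outcome validity are all hypothetical and are not taken from the paper.

```python
import math
import random


def go_probability(q_go, q_nogo, v_stim, bias, pav, noise=0.0):
    """Probability of a "go" response for one cue (assumed model form).

    bias  : action bias, a constant push toward "go"
    pav   : Pavlovian factor, how strongly the stimulus value v_stim
            leaks into the weight of the "go" action
    noise : hypothetical lapse term mixing in random responding
    """
    w_go = q_go + bias + pav * v_stim
    w_nogo = q_nogo
    p_softmax = 1.0 / (1.0 + math.exp(-(w_go - w_nogo)))
    return (1.0 - noise) * p_softmax + noise / 2.0


def update(value, outcome, lr, rho):
    """Delta-rule update: lr is the learning rate, rho scales the
    subjective impact of the outcome (assumed model form)."""
    return value + lr * (rho * outcome - value)


def simulate_cue(n_trials, go_is_correct, good, bad, lr, rho, bias, pav,
                 p_valid=0.8, seed=0):
    """Simulate one cue and return the fraction of correct responses.

    good / bad are the outcome values, e.g. (1, 0) for a reward cue or
    (0, -1) for a pain-avoidance cue. p_valid is an assumed probability
    that a correct response yields the good outcome.
    """
    rng = random.Random(seed)
    q = {"go": 0.0, "nogo": 0.0}
    v = 0.0
    n_correct = 0
    for _ in range(n_trials):
        p_go = go_probability(q["go"], q["nogo"], v, bias, pav)
        action = "go" if rng.random() < p_go else "nogo"
        correct = (action == "go") == go_is_correct
        n_correct += int(correct)
        p_good = p_valid if correct else 1.0 - p_valid
        outcome = good if rng.random() < p_good else bad
        q[action] = update(q[action], outcome, lr, rho)
        v = update(v, outcome, lr, rho)
    return n_correct / n_trials


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the code.
    acc = simulate_cue(n_trials=200, go_is_correct=False, good=0.0, bad=-1.0,
                       lr=0.3, rho=3.0, bias=0.2, pav=0.5)
    print(f"no-go-to-avoid-pain accuracy: {acc:.2f}")
```

In this form a positive `bias` raises the probability of responding regardless of what has been learned, while a positive `pav` promotes responding to cues with positive stimulus value and suppresses it for cues with negative value, which is why the two parameters can be estimated separately.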