Index_2018_Goal-Directed-Decision-Making

INDEX
Note: Page numbers followed by “f” indicate figures, “t” indicate tables, and “b” indicate boxes.
A
Abstract hierarchical action spaces, 113–114
N-acetylcysteine, 378
Action–outcome evaluations, 287
Action spaces
abstract hierarchical action spaces, 113–114
creation, 117–118
temporally abstract actions, 114–115
Affiliative motivations, 311
Age-related trajectories, 296
Alcohol, 377
self-administration, 369–370
ALCOVE neural network model, 73
α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptor subunits, 376
Anorexia nervosa (AN), 335
Anterior cingulate cortex (ACC), 117
Apparent decision-making behavior, 389
Arbitration
alternative models
Bayesian estimation of reliability, 165
cost–benefit hypothesis, 166
metacontrol model, 165
model-free and model-based learning strategies, 167
value of information, 166
cost–benefit model, 164
habit and planning, cost–benefit trade-off
“adaptive decision-makers”, 159–160
Daw two-step task, 157–158
demand selection task, 159
dorsolateral prefrontal cortex (dlPFC), 158
executive functioning, 158
“expected value of control”, 159
model-based control, 157–158
working memory, 157–158
Attentional set-shifting, 204
Autism spectrum disorder (ASD), 350
Autoassociative encoding, 81
AX-CPT task, 295
B
Backward induction, 57–58
Basolateral amygdala (BLA), 185, 200
Bayesian perspective on multistep choice
inference process, 63
joint probability distribution, 61
loopy belief propagation, 63
model-based reinforcement learning, 62
planning-as-inference model, 61–62, 61f
reinforcement learning (RL), 62
restrictions, 63
reward function, 62
Behavioral tasks, 279
Bias-variance trade-off
Bayesian reasoning, 76
case-based control, 75
case-based reasoning (CBR), 76
continuous state/action spaces, 75
model-based controller, 75–76
Nadaraya–Watson kernel regression estimator, 74–75
similarity function, 74–75
Binge eating disorder (BED), 335, 342
BOLD responses, 32–34, 392–393, 397–399
C
Cached value, 243, 251–252
Cannabinoids impair memory, 138
Case-based decision theory (CBDT)
agent’s memory, 69
averaged similarity, 70
axiomatic derivations, 70
calculation example, 70, 70t
computational characterizations
bias-variance trade-off, 74–76
case-based reasoning (CBR), 74
empirical studies
ALCOVE neural network model, 73
computational models, 73
experimental tests, 72
market conditions, 72
participants made choices, 72–73
interactions between systems
case-based and model-based systems, 85–88
case-based and model-free systems, 83–85
dopamine modulates hippocampal plasticity, 81–82
dopamine release from locus coeruleus, 81–82
episodic memory, 82–83
neural pathways, 76–81
primitive concepts, 69
similarity-based judgment, 68
similarity functions
alignment-based models, 71
feature-based models, 71
geometric models, 70–71
transformational models, 71
small world, 69
Case-based reasoning (CBR), 67–68
aerospace industry, 74
Clavier, 74
commercial tasks, 74
Causal models, 400–403
CBM. See Cognitive bias modification (CBM)
Chi-square test, 393–394
Chronic/acute stress, 335–336
Chronic dietary restriction, 344–345
Chronic drug users, 371
Clavier, 74
Clonidine, 138
CM. See Contingency management (CM)
Cocaine, 210
self-administration, 338–339
Cognitive bias modification (CBM), 356–357
Cognitive map
coordinate system, 127–128
internal representation, 126
situation-action decisions, 126
Cognitive models, 294–298
Cognitive remediation therapies (CRTs), 354–355
Cold cognitive reason, 387–388
Common valuation system, 315–316
Competition
arbitration
alternative models, 164–167
cost–benefit model, 164
habit and planning, cost–benefit trade-off, 157–160
control–reward trade-off, 160–161, 161f
distinguishing habit from planning
Daw two-step task, 155–158, 155f
model-free and model-based control, 155–157
probability of repeating, 156–157, 156f
novel two-step paradigm
control–reward trade-off, 162–163, 163f
Daw two-step task, 162
design and state transition structure, 161–162, 162f
model-free and model-based control, 162
Compulsivity disorders
challenges, 351–353
cognitive bias modification (CBM), 356–357
cognitive remediation therapies (CRTs), 354–355
contingency management (CM), 356
deep brain stimulation (DBS), 357
episodic future thinking (EFT), 354–355
exposure response prevention (ERP), 356
future directions, 351–353
goal-directed action and habit, 331–332
goal-directed and habitual control
anorexia nervosa (AN), 335
autism spectrum disorder (ASD), 350
binge eating disorder (BED), 335, 342
chronic/acute stress, 335–336
chronic dietary restriction, 344–345
cocaine self-administration, 338–339
development of insensitivity, 333–334
Diagnostic and Statistical Manual of Mental Disorders (DSM), 335
disruptions, 336–351
drug craving, 339–340
drug-seeking habits, 337
eating disorders, 343–346
experimental assessment, 332–334
factors, 332, 333f
food addiction, 340–341
Gilles de la Tourette syndrome (GTS), 350
habit propensity, 335–336, 349–351
habitual behavior, 338–339
International Classification of Diseases, 335
LiCl pairings, 337
neural basis, 334
obesity, 340–343, 341f
obsessive–compulsive disorder (OCD), 346–349, 349f
Parkinson disease (PD), 351
satiation, 336–337
schizophrenia, 350
self-starvation, 344
social anxiety disorder (SAD), 351
substance abuse, 336–340
transdiagnostic perspectives, 335–336
ventromedial prefrontal cortex (vmPFC), 334, 337–338
habit reversal therapy (HRT), 356
habit triggers
cues, 355
removal/avoidance, 355
implementation intentions, 355–356
Pavlovian–instrumental transfer (PIT) task, 352–353
pharmacological and neuromodulatory treatment, 357
repetitive transcranial magnetic stimulation (rTMS), 357
response–outcome (R-O) associations, 331–332
selective serotonin reuptake inhibitors (SSRIs), 357
strengthening goal-directed control, 354–355
treatment, 353–357, 354f
Computational reinforcement theory, 12–13
Conditioned reflexes, 447
Conditioning theories, 446–449
Conscious emotions, 317–318
Contingency management (CM), 356
Continuous performance test (CPT), 294–295
Control–reward trade-off
novel two-step paradigm, 162–163, 163f
two-step task, 160–161, 161f
Cooperation
habitual goal selection, 171–175, 173f
model-based simulation
brief rest (quiet wakefulness), 169f, 170
Dyna, 168
revaluation effect, 169f, 170
sample complexity, 168
sequential decision problem, 168–169, 169f
standard model-free learning algorithms, 169–170
time complexity, 168
partial evaluation, 170–171, 171f
Cortical determinants, 179–180
anatomical considerations and functional contribution, 189f, 190
cortico–cortical interactions, 186–187
homologous regions, rat and primate, 188–189
insular cortex (IC), 184–186
medial prefrontal cortex (mPFC), 180–181
neuromodulatory systems, 187–188
orbitofrontal cortex (OFC), 181–184
Cortico–basal ganglia loops, 108–109
Cortico–cortical interactions, 186–187
Cost–benefit trade-off
arbitration between habit and planning, 157–160
CRTs. See Cognitive remediation therapies (CRTs)
Cue–reward relationships, 252–254
Curse of dimensionality, 262–263
D
Daw two-step task
cognitive control ability, 158
control–reward trade-off, 160–161, 161f
design and state transition structure, 155–156, 155f
dorsolateral prefrontal cortex (dlPFC), 158
first-stage stimuli, 155–156
low-probability transitions, 156–157
numerical Stroop task, 157–158
second-stage states, 155–156
DBS. See Deep brain stimulation (DBS)
Decision policies, 315
Deep brain stimulation (DBS), 357
Degradation/devaluation treatments, 430
Delay discounting, 206–207
“Detonator synapses”, 81
Development, decision-making
action–outcome evaluations, 287
age-related trajectories, 296
AX-CPT task, 295
behavioral tasks, 279
cognitive models, 294–298
construction, 289–294
computational analysis, 285
continuous performance test (CPT), 294–295
distinct subregions, 286–287
dopaminergic system, 288
experimental manipulation, 280–281
feedback-driven learning process, 286–287
frontal and parietal regions, 297–298
goal-directed action, 294–298
goal-directed behavior, 280–283, 286–289
goal-directed instrumental action, 283–286
habitual, model-free learner, 281–282
hippocampal activity, 291–292
hippocampus, 287
medial temporal lobe regions, 290–291
“model-based” algorithms, 281
model-free and model-based prediction errors, 286–287
model-free learning, 285
neurocircuitry, 286–289
orbitofrontal cortex (OFC), 287
prediction error signals, 286–287
prefrontal cortex, 287
proactive control, 296–298
“proactive possible” trials, 295
proliferative period, 287–288
reactive control, 297–298
reward probabilities, 281–282
rostrolateral prefrontal cortex, 290–291
sleep, 292–293
slower second-stage responses, 293
“tree-search” process, 281
two-step task, 281–282, 282f
ventromedial prefrontal cortex (vmPFC), 287
Diagnostic and Statistical Manual of Mental Disorders (DSM), 335
Disruptions, 336–351
Distinct subregions, 286–287
Dopamine (DA)
cocaine, 210
delay discounting, 207
depletion, 207–208
firing patterns, 85
NAc core, 203
NAc shell, 202
optical stimulation, 202
receptor expression, 208–209
signaling, 108–109
Dopaminergic error signals
cached value, 243, 251–252
cue–reward relationships, 252–254
dopamine prediction error, 248–249
generic value, 251–252
midbrain dopamine system, 244
nucleus accumbens (NaCC), 253
Pavlovian conditioning theory, 246–247
phasic stimulation, 249–251
prediction errors, 245–248
reinforcement learning, 245–248
Rescorla–Wagner model, 247–248
reward-paired cues, 244–245
rewards, 244–245
synaptic plasticity, 247–248
temporal difference reinforcement learning (TDRL), 248
ventral tegmental area (VTA), 244
Dopaminergic system, 288
Dopamine transporter binding (DAT), 375
Dorsal striatum, 372
Dorsolateral prefrontal cortex (dlPFC), 158
DREADD approach, 379–380
Drift-diffusion model
boundary separation, 51
evidence integrator, 50
integration process, 51
log likelihood ratio, 50
nondecision time, 52
one-step binary choice, 49–50, 50f
parameters, 51
perceptual decision-making tasks, 51
posterior ratio, 50
situational context, 51
state variable, 49–50
Drug addictions
N-acetylcysteine, 378
alcohol, 377
alcohol self-administration, 369–370
α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptor subunits, 376
chronic drug users, 371
chronic exposure to psychostimulants, 375–376
chronic pretreatment, 370
dopamine transporter binding (DAT), 375
dorsal striatum, 372
DREADD approach, 379–380
excitatory postsynaptic potentials (EPSCs), 378
habit learning process, 367–368
habitual behavior, 369–371
high-impulsivity traits, 374–375
insensitivity to devaluation, 373
long-term depression (LTD), 377
medium spiny neuron (MSN), 372
methamphetamine exposure, 376–377
N-methyl-D-aspartate (NMDA) receptor, 376
neurocircuitry, 371–374
nicotine self-administration, 369–370
orbital frontal cortex (OFC), 379–380
paradigms, 370–371
Pavlovian–instrumental transfer, 378–379
psychostimulants, 375–377
rescuing adaptive behavior, 377–380
self-administration training, 369
stimulus–response associations, 367–368
ventral striatum, 372
Drug craving, 339–340
Drug-seeking habits, 337
Dual-system theory
belief criterion
“flying-in-that-direction schemata”, 5–6
intentional account of behavior, 6
interval training, 4
issue, 4–5
sodium appetite, 5
causal, 2
desire criterion, 1
animal’s propensity, 2–3
devaluation, 3
food caching, 20–21
variable interval schedule, 3
food caching, 21
habits
automaticity, 7
motivation, 8–9
outcome expectations, 9–12
stimulus-response/reinforcement mechanisms, 6–7
instrumental learning
avoidance, 17–18
choice training, 17
computational reinforcement theory, 12–13
extended training, 15–16, 16f
model-based computations, 12–13
rate correlational theory, 13–14
ratio and interval training, 14–15, 15f–16f
system integration, 18–19
Pavlovian conditioning, 20
rational, 2
Dyna, 168, 170, 415
E
Eating disorders, 343–346
Ecological behaviors, 420–421
Effortful responding, 204–205
EFT. See Episodic future thinking (EFT)
Emotions, 316–318
conscious emotions, 317–318
nonconscious emotions, 318
to social goals, 318–319
Entorhinal cortex (EC), 80
Episodic future thinking/mental time travel, 125–126
cannabinoids impair memory, 138
clonidine, 138
nonlocal firing, 131
phase precession, 135
place cells, 131
place fields, 131
pyramidal cells, 131
reactivation during sleep, 134–135
sequences depict future trajectories to home location, 133, 134f
sequences during sleep, 134–135
sharp-wave ripple complexes (SWR), 131–134, 132f
theta oscillations, 131–132, 132f, 135, 138
theta sequences, 136–138
two-step decision task, 130–131, 130f
vicarious trial and error (VTE), 137
Episodic memory, 82–83
Epistemic action, 411
ERP. See Exposure response prevention (ERP)
Excitatory postsynaptic potentials (EPSCs), 378
Expected Utility Theory, 316
Exposure response prevention (ERP), 356
F
Feedback-driven learning process, 286–287
Flexible goal-directed behavior
attentional set-shifting, 204
canonical test, 204–205
effortful responding, 204–205
Pavlovian-to-instrumental transfer task, 205–206
reversal learning, 204
G
Generalization
construct, modify, and consolidate memory, 77
“grandmother cells”, 78
“Jennifer Aniston neurons”, 78
medial temporal lobe (MTL), 76–78
Parkinson disease (PD) patients, 77
retrieve relevant memories, 77
Generic value, 251–252
Gilles de la Tourette syndrome (GTS), 350
Goal-directed action
conditioned reflexes, 447
conditioning theories, 446–449
degradation and devaluation treatments, 430
habits, 430–439
implications, 438–439
incentive learning, 433
infralimbic cortices, 431
law of effect, 431–433
motivational processes, 439–446
affective change, 439–443
drive-incentive perspectives, 441–443
emotional feedback, 443–446
reinforcement, 439–443
reward, 443–446
stimulus–action (S-R) association, 442f
subsequent positions, 439–440
neural feedback processes, 433–438
arkypallidal neuronal projection, 437
basolateral amygdala (BLA), 433–434
central nucleus of the amygdala (CeN), 436–437
cholecystokinin, 435–436
instrumental actions, 437
intercalated cells (ITCs), 437–438
medial orbital cortex (mOFC), 435–436
nucleus accumbens core (NAco), 434–435
reinforcement, 436–438
reward, 433–436
substantia nigra pars compacta (SNc), 436–437
posterior part of dorsomedial striatum (pDMS), 430–431, 434–435
prelimbic involvement, 430–431
reinforcement, 431–433
reward, 431–433
specific stimulus–response (S-R), 431
S-R association, 431–433
Goal-directed deficits, 387–406
Goal-directed social behavior
emotions, 316–318
conscious emotions, 317–318
nonconscious emotions, 318
to social goals, 318–319
goal pursuit, 309
guilt, 322–323
motivation, 310–314
negative affect, 320–322
quantitative disciplines, 314–315
social-affective control model, 320, 320f
social goal representation, 310–314
affiliative motivations, 311
harm prevention, 310–311
self-benefit, 313–314
uncertainty, 313
translating goals to action, 314–316
common valuation system, 315–316
decision policies, 315
Expected Utility Theory, 316
Weak Axiom of Revealed Preferences (WARP), 316
ultimatum game (UG), 321
Goal selection, 418–420
“Grandmother cells”, 78
Guilt, 322–323
Gustatory cortex (GC), 184
H
Habit propensity, 335–336
Habits, 153, 430–439
automaticity, 7
motivation, 8–9
outcome expectations
contextual stimuli, 10
habitual S-R/reinforcement mechanism, 10
ideomotor theory, 11
Pavlovian stimulus, 9
specific PIT, 11–12
stimulus-response/reinforcement mechanisms, 6–7
Habitual behavior, 369–371, 407–408
Habitual goal selection, 171–175, 173f
Harm prevention, 310–311
Hebbian plasticity, 411
Hedonic processing, 202
Hierarchical RL (HRL), 114–115
Hippocampal activity, 291–292
Hippocampal–striatal connectivity, 83–84
Hippocampus, 88
anterior dorsolateral striatum, 127–128
cognitive map
coordinate system, 127–128
internal representation, 126
situation-action decisions, 126
dopaminergic neurons, 82
episodic future thinking/mental time travel, 125–126
cannabinoids impair memory, 138
clonidine, 138
nonlocal firing, 131
phase precession, 135
place cells, 131
place fields, 131
pyramidal cells, 131
reactivation during sleep, 134–135
sequences depict future trajectories to home location, 133, 134f
sequences during sleep, 134–135
sharp-wave ripple complexes (SWR), 131–134, 132f
theta oscillations, 131–132, 132f, 135, 138
theta sequences, 136–138
two-step decision task, 130–131, 130f
vicarious trial and error (VTE), 137
generalization
construct, modify, and consolidate memory, 77
“grandmother cells”, 78
“Jennifer Aniston neurons”, 78
medial temporal lobe (MTL), 76–78
Parkinson disease (PD) patients, 77
retrieve relevant memories, 77
Hullian situation-action process, 127–128
imagination, 125
learning transition probabilities, 85–86
medial prefrontal cortex (mPFC), 128–129
mental simulation, 86–87
“model-based decision making”, 126
neural computations
autoassociative encoding, 81
computational linchpin, 80
“detonator synapses”, 81
entorhinal cortex (EC), 80
reencoding, 81
plus maze task, 126–127, 127f
prefrontal cortex, 128–129
reinforcement learning, 126
sequences generated, 139–140
sequences improve/increase/predict planning
pairwise spiking activity, 140–141, 141f
place cell firing coordination, 140–141
SWR sequences, 140–142, 142f
stimulus associations
“delay conditioning”, 78–79
enhanced pattern similarity, 79
neural pattern similarity, 79–80
sensory preconditioning, 79
“trace conditioning”, 78–79
ventromedial prefrontal cortex, 125–126
vicarious trial and error (VTE), 128
Hot hedonic desire, 387–388
Hullian situation-action process, 127–128
I
Ideomotor theory, 11
Impulsivity, 408
Incentive learning, 433
Inferior parietal lobule (IPL), 32–34
Infralimbic cortices, 431
Instrumental contingency, 31
Instrumental divergence
boundary condition on goal-directedness, 40f
behavior–reward correlations, 41
formalizing arbitration, 44
strategy arbitration, 42
system-balancing task, 43–44
system imbalance and monetary loss, 42
expected utility, 28, 28f
information-theoretic formalization
defined, 29–30
Jensen–Shannon (JS) divergence, 29
Shannon entropy, 30
intrinsic utility of control, 43f
conventional RL agent, 36
flexible instrumental control, 34, 35f
high- and zero-divergence rooms, 36–37
mean choice proportions, 37–38, 38f
model-based RL agents, 35–36
Oxford–Liverpool Inventory of Feelings and Experiences (O-LIFEs), 39
perceptual outcome diversity, 36–37
psychopathology, 38–39
RL predictions and mean choice proportions, 35–36, 36f
self-play vs. autoplay manipulation, 37–38
“sense of agency” (SOA), 38–39
neural correlates
Bayesian model selection analysis, 32–34
formal accounts of goal-directed action values, 31
inferior parietal lobule (IPL), 32–34
instrumental contingency, 31
motivational and information-theoretic variables, 32
parametric modulation, 32–34, 33f
state prediction error, 31–32
transition probabilities, 31–32
trial in choice task, 32, 33f
probability distributions over three potential outcomes, 27–28, 28f
Instrumental learning
avoidance, 17–18
choice training, 17
computational reinforcement theory, 12–13
extended training, 15–16, 16f
model-based computations, 12–13
rate correlational theory, 13–14
ratio and interval training, 14–15, 15f–16f
system integration, 18–19
Insular cortex (IC)
anatomical considerations, 184
and goal-directed behavior, 184–185
summary, 186
Integration, 388–389
Intertemporal choice, 87
J
“Jennifer Aniston neurons”, 78
Jensen–Shannon (JS) divergence, 29
K
Kolmogorov complexity, 71
L
Latent spaces, 116
Law of effect, 431–433
LiCl pairings, 337
Long-term depression (LTD), 377
Loopy belief propagation, 63
M
Markov decision process (MDP), 412b
Markovian property, 262
Medial prefrontal cortex (mPFC), 200, 211
anatomical considerations, 180
and goal-directed behavior, 180–181
summary, 181
Medial temporal lobe (MTL), 76–77, 290–291
Mediodorsal thalamic nucleus (MD), 186–187
Medium spiny neuron (MSN), 372
Methamphetamine, 376–377
N-methyl-D-aspartate (NMDA) receptor, 376
Midbrain dopamine system, 244
“Model-based” algorithms, 281
Model-based reinforcement learning, 62
Model-free control, 419b
Model-free learning, 285
Model-free RL algorithms, 108
artificial agent, 108–109
limitations, 109
probabilistic reward learning, 109
reward prediction error, 108
Motivation, 310–314, 439–446
affective change, 439–443
drive-incentive perspectives, 441–443
emotional feedback, 443–446
reinforcement, 439–443
reward, 443–446
stimulus–action (S-R) association, 442f
subsequent positions, 439–440
Multiple state spaces, 112–113
Multistep drift-diffusion models, 58f
accuracy and reaction times, 55–56, 56f
backward induction, 57–58
decision difficulty, 55
decision trees, 54–55, 54f
deliberation process, 56–57
eye tracking, 59
one-step decision problems, 55
principles, 60
properties, 56–57
results of experiment 1, 55, 56f
results of experiment 2, 56–57, 57f
serial sampling procedure, 59
simulation of both experiments, 58–59, 59f
simulation of experiment 1, 60, 60f
worse quantitative fit, 60
Multivariate pattern analysis techniques, 264–265
N
Negative affect, 320–322
Negative prediction errors, 391–394
Neural feedback processes, 433–438
arkypallidal neuronal projection, 437
basolateral amygdala (BLA), 433–434
central nucleus of the amygdala (CeN), 436–437
cholecystokinin, 435–436
instrumental actions, 437
intercalated cells (ITCs), 437–438
medial orbital cortex (mOFC), 435–436
nucleus accumbens core (NAco), 434–435
reinforcement, 436–438
reward, 433–436
substantia nigra pars compacta (SNc), 436–437
Neurocircuitry, 286–289, 371–374
Neuromodulatory systems, 187–188
Nicotine self-administration, 369–370
Nonconscious emotions, 318
Non-Markovian environments, 265
Nucleus accumbens (NAc), 211–212
anatomical organization, 200–201
behavioral tasks, 201, 201t
decision-making
DA-related proteins, 207
D1/D2 antagonist, 208–209
delay discounting tasks, 206
delay sensitivity, 207–208
D1 mRNA, 208–209
pharmacological inactivation, 208
flexible goal-directed behavior
attentional set-shifting, 204
canonical test, 204–205
effortful responding, 204–205
Pavlovian-to-instrumental transfer task, 205–206
reversal learning, 204
guiding goal-directed behavior, 199
hedonic processing, 202
lateral hypothalamus (LH), 211
limbic-motor interface, 199
medial prefrontal cortex (mPFC), 211
NMDA receptor blockade, 201
non-NMDA glutamate receptors, 201
PrL cortex, 211
psychostimulant effects, 209–210
reward learning, 202–203
O
Obesity, 340–343, 341f
Obsessive–compulsive disorder (OCD), 346–349, 349f
“Options framework”, 114–115
Orbital frontal cortex (OFC), 287, 379–380
anatomical considerations, 181–182
belief states, 265–268, 267f
curse of dimensionality, 262–263
decision-making, 272–273
definition, 260–261, 260f
and goal-directed behavior, 182–183
Markovian property, 262
multivariate pattern analysis techniques, 264–265
non-Markovian environments, 265
OFC-lesioned animals, 263–264
partially observable, 262
Pavlovian–instrumental transfer, 267–268
prefrontal cortex (PFC), 260–261
reinforcement learning (RL), 259
representational similarity approach, 266–267
state inferences, 265–268, 267f
state-space theory, 259–265
summary, 183–184
value-independent outcome identities, 270–271
value signals, 268–272
Oxford–Liverpool Inventory of Feelings and Experiences (O-LIFEs), 39
P
Parkinson disease (PD), 351
Partial evaluation, 170–171, 171f
Pavlovian conditioning theory, 246–247
Pavlovian–instrumental transfer (PIT), 8, 11–12, 267–268, 352–353, 378–379, 399–400
action–outcome learning neuronal circuitry, 226–228
basolateral complex (BLA), 225–226
central nucleus of the amygdala (CeA), 225–226
cue-based choice
cholinergic interneurons (CINs), 232–234
cholinergic system, 231–232
confluence of signals, 234–236
modulator, 232–234
neuroanatomy, 224–229
neurobiology, 229–236
neuromodulation signals, 229–231
nucleus accumbens shell, 229–231
nucleus accumbens shells, 234–236, 235f
dorsolateral striatum (DLS), 227–228
dorsomedial part of the striatum (DMS), 227–228
experimental approaches, 222f
infralimbic (IL) cortices, 227
neuronal integration, 228–229
orbitofrontal cortex (OFC), 224–225
prelimbic (PL) cortices, 227
“specific” form, 221–223
stimulus–outcome learning, 224–226
Perceptual outcome diversity, 36–37
Phase precession, 135
“Phasic” activity, 202–203
Phasic stimulation, 249–251
Place fields, 131
Planning-as-inference model, 61–62, 61f
“Planning-until-habit”, 170–171
Posterior dorsomedial striatum (pDMS), 181, 430–431, 434–435
Prediction errors
feedback learning, 83–84
signals, 286–287
Prefrontal cortex (PFC), 260–261, 287
Prelimbic (PL) cortices, 180–181, 227, 430–431
Proactive control, 296–298
Proactive possible trials, 295
Proliferative period, 287–288
Pseudocode, 109–110, 111t
Psychostimulants, 209–210, 375–377
Pyramidal cells, 131
Q
Q-learning model, 393–394, 394f
Quantitative disciplines, 314–315
R
Rate correlational theory, 13–14
Reactive control, 297–298
Real-life behaviors, 420–421
Real-life complexity, 420
“Reference class forecasting”, 86
Reinforcement, 431–433
Reinforcement learning (RL), 62, 84, 245–248, 259
action spaces
abstract hierarchical action spaces, 113–114
temporally abstract actions, 114–115
addiction, 408
algorithms, 106–108, 106f–107f
automaticity vs. control, 416–417
belief-based scheme, 413
belief-based vs. belief-free control, 411–413
belief-free agent, 411–413
brain, 107f, 108–109
create state/action spaces, 117–118
epistemic action, 411
framing problem, 109–110, 111t
goal-directed decision-making, 408
goal-directed system structure, 415–416
habits and goals
development, 417–418
ecological behaviors, 420–421
goal selection, 418–420
interactions, 421–422
model-free control, 419b
neural correlates, 418
pursuing, 418–420
real-life behaviors, 420–421
real-life complexity, 420
habitual behavior, 407–408
Hebbian plasticity, 411
impulsivity, 408
latent state and action spaces, 115–116
limitations, 109
Markov decision process (MDP), 412b
model-based control, 416
model-based learning systems, 408
model-based vs. model-free dichotomy, 413–415
model-free and model-based control, 154–155
moral judgment, 408
state spaces
multiple state spaces, 112–113
simplifying, 110–112
taxonomy, 408–409
mapping animal behavior, 407–408
value-based vs. value-free control, 409–411, 410f
Repetitive transcranial magnetic stimulation (rTMS), 357
Rescorla–Wagner model, 247–248
Rescuing adaptive behavior, 377–380
Response–outcome (R-O) associations, 331–332
Reversal learning, 204
Reward, 431–433
learning, 202–203
prediction error, 108
Reward-paired cues, 244–245
Reward probabilities, 281–282
Rewards, 244–245
Rho kinase, 180
Rostrolateral prefrontal cortex, 290–291
“Rules” or task sets, 113
S
Sample complexity, 168
Schizophrenia, 350
apparent decision-making behavior, 389
BOLD responses, 392–393
causal models, 400–403
Chi-square test, 393–394
contemporary evidence, 394–399
driving performance, 391–392
goal-directed deficits, 387–406
historical evidence, 389–391
incentive value/reward, 388–389
integration, 388–389
learning causal models, 400–403
negative prediction errors, 391–394
Pavlovian-to-instrumental transfer (PIT), 399–400
prefeeding, 397
Q-learning model, 393–394, 394f
stimulus–outcome (S-O) learning, 399–400
stimulus-response tasks, 389–390
third possibility, 396
two-stage task, 395–396
Wisconsin Card Sorting, 390–391
Selective serotonin reuptake inhibitors (SSRIs), 357
Self-administration training, 369
Self-benefit, 313–314
Self-starvation, 344
“Sense of agency” (SOA), 38–39
Shannon entropy, 30
Sharp-wave ripple complexes (SWR), 131–132, 132f
cannabinoids, 138
electrical stimulation, 133–134
memory consolidation, 132–133
pairwise spiking activity, 140–141, 141f
place cell firing coordination, 140–141
predict future paths, 141–142, 142f
vs. theta oscillations, 138
Similarity-based judgment, 68
Similarity functions
alignment-based models, 71
feature-based models, 71
geometric models, 70–71
transformational models, 71
Simple binary choice
best-fitting drift rate, 52
collapsing decision, 53
correct-and-error reaction times, 52
decision difficulty, 53
independent integrators, 52–53
mean accuracy, 52
memory retrieval, 53
neural implementation, 52–53
Slower second-stage responses, 293
Social-affective control model, 320, 320f
Social anxiety disorder (SAD), 351
Social goal representation, 310–314
affiliative motivations, 311
harm prevention, 310–311
self-benefit, 313–314
uncertainty, 313
Specific stimulus–response (S-R), 431
S-R association, 431–433
State inferences, 265–268, 267f
State spaces
creation, 117–118
multiple state spaces, 112–113
simplifying, 110–112
State-space theory, 261–265
Stimulus–response associations, 367–368, 389–390
Substance abuse, 336–340
T
Taxonomy, 408–409
Temporal difference reinforcement learning (TDRL), 248
Temporal dynamics of reward-based goal-directed decision-making
Bayesian perspective on multistep choice
inference process, 63
joint probability distribution, 61
loopy belief propagation, 63
model-based reinforcement learning, 62
planning-as-inference model, 61–62, 61f
reinforcement learning (RL), 62
restrictions, 63
reward function, 62
drift-diffusion model. See Drift-diffusion model
multistep drift-diffusion models. See Multistep drift-diffusion models
simple binary choice
best-fitting drift rate, 52
collapsing decision, 53
correct-and-error reaction times, 52
decision difficulty, 53
decision problems, 54
independent integrators, 52–53
mean accuracy, 52
memory retrieval, 53
neural implementation, 52–53
Temporally abstract actions, 114–115
Theta oscillations, 131–132, 132f
Theta sequences, 132f, 137
entorhinal cortex, 139–140
go backward, 142–143
and phase precession, 136, 139–140
Time complexity, 168
Transdiagnostic perspectives, 335–336
Tree-search process, 281
Two-step task, 281–282, 282f
U
Ultimatum game (UG), 321
Uncertainty, 313
V
Value-based vs. value-free control, 409–411, 410f
Value-independent outcome identities, 270–271
Value of information, 166
Ventral striatum, 372
Ventral subiculum, 200
Ventral tegmental area (VTA), 203, 244
Ventromedial prefrontal cortex (vmPFC), 86–87, 125–126, 287, 334
Vicarious trial and error (VTE), 86–87, 128, 137
W
WARP. See Weak Axiom of Revealed Preferences (WARP)
Weak Axiom of Revealed Preferences (WARP), 316
Working memory, 157–158