Prévia do material em texto
INDEX ‘Note: Page numbers followed by “f ” indicate figures, “t” indicate tables, and “b” indicate boxes.’ A Abstract hierarchical action spaces, 113e114 N-acetylcysteine, 378 Actioneoutcome evaluations, 287 Action spaces abstract hierarchical action spaces, 113e114 creation, 117e118 temporally abstract actions, 114e115 Affiliative motivations, 311 Age-related trajectories, 296 Alcohol, 377 self-administration, 369e370 ALCOVE neural network model, 73 a�amino- 3-hydroxy-5-methyl-4-isoxazolepro- pionic acid (AMPA) receptor subunits, 376 Anorexia nervosa (AN), 335 Anterior cingulate cortex (ACC), 117 Apparent decision-making behavior, 389 Arbitration alternative models Bayesian estimation of reliability, 165 costebenefit hypothesis, 166 metacontrol model, 165 model-free and model-based learning strategies, 167 value of information, 166 costebenefit model, 164 habit and planning, cost-benefit trade-off “adaptive decision-makers”, 159e160 Daw two-step task, 157e158 demand selection task, 159 dorsolateral prefrontal cortex (dlPFC), 158 executive functioning, 158 “expected value of control”, 159 model-based control, 157e158 working memory, 157e158 Attentional set-shifting, 204 Autism spectrum disorder (ASD), 350 Autoassociative encoding, 81 AX-CPT task, 295 B Backward induction, 57e58 Basolateral amygdala (BLA), 185, 200 Bayesian perspective on multistep choice inference process, 63 joint probability distribution, 61 loopy belief propagation, 63 model-based reinforcement learning, 62 planning-as-inference model, 61e62, 61f reinforcement learning (RL), 62 restrictions, 63 reward function, 62 Behavioral tasks, 279 Bias-variance trade-off Bayesian reasoning, 76 case-based control, 75 case-based reasoning (CBR), 76 continuous state/action spaces, 75 model-based controller, 75e76 NadarayaeWatson kernel regression estimator, 74e75 similarity function, 74e75 Binge eating disorder (BED), 335, 342 BOLD responses, 32e34, 392e393, 397e399 C Cached value, 243, 251e252 Cannabinoids impair memory, 138 Case-based decision theory (CBDT) agent’s memory, 69 averaged similarity, 70 axiomatic derivations, 70 calculation example, 70, 70t computational characterizations bias-variance trade-off, 74e76 case-based reasoning (CBR), 74 empirical studies ALCOVE neural network model, 73 computational models, 73 experimental tests, 72 market conditions, 72 participants made choices, 72e73 457 Case-based decision theory (CBDT) (Continued) interactions between systems case-based and model-based systems, 85e88 case-based and model-free systems, 83e85 dopamine modulates hippocampal plasticity, 81e82 dopamine release from locus coeruleus, 81e82 episodic memory, 82e83 neural pathways, 76e81 primitive concepts, 69 similarity-based judgment, 68 similarity functions alignment-based models, 71 feature-based models, 71 geometric models, 70e71 transformational models, 71 small world, 69 Case-based reasoning (CBR), 67e68 aerospace industry, 74 Clavier, 74 commercial tasks, 74 Causal models, 400e403 CBM. See Cognitive bias modification (CBM) Chi-square test, 393e394 Chronic/acute stress, 335e336 Chronic dietary restriction, 344e345 Chronic drug users, 371 Clavier, 74 Clonidine, 138 CM. See Contingency management (CM) Cocaine, 210 self-administration, 338e339 Cognitive bias modification (CBM), 356e357 Cognitive map coordinate system, 127e128 internal representation, 126 situation-action decisions, 126 Cognitive models, 294e298 Cognitive remediation therapies (CRTs), 354e355 Cold cognitive reason, 387e388 Common valuation system, 315e316 Competition arbitration alternative models, 164e167 costebenefit model, 164 habit and planning, costebenefit trade-off, 157e160 controlereward trade-off, 160e161, 161f distinguishing habit from planning Daw two-step task, 155e158, 155f model-free and model-based control, 155e157 probability of repeating, 156e157, 156f novel two-step paradigm controlereward trade-off, 162e163, 163f Daw two-step task, 162 design and state transition structure, 161e162, 162f model-free and model-based control, 162 Compulsivity disorders challenges, 351e353 cognitive bias modification (CBM), 356e357 cognitive remediation therapies (CRTs), 354e355 contingency management (CM), 356 deep brain stimulation (DBS), 357 episodic future thinking (EFT), 354e355 exposure response prevention (ERP), 356 future directions, 351e353 goal-directed action and habit, 331e332 goal-directed and habitual control anorexia nervosa (AN), 335 autism spectrum disorder (ASD), 350 binge eating disorder (BED), 335, 342 chronic/acute stress, 335e336 chronic dietary restriction, 344e345 cocaine self-administration, 338e339 development of insensitivity, 333e334 Diagnostic and Statistical Manual of Mental Disorders (DSM), 335 disruptions, 336e351 drug craving, 339e340 drug-seeking habits, 337 eating disorders, 343e346 experimental assessment, 332e334 factors, 332, 333f food addiction, 340e341 Gilles de la Tourette syndrome (GTS), 350 habit propensity, 335e336, 349e351 habitual behavior, 338e339 International Classification of Diseases, 335 LiCl pairings, 337 neural basis, 334 obesity, 340e343, 341f obsessiveecompulsive disorder (OCD), 346e349, 349f 458 Index Parkinson disease (PD), 351 satiation, 336e337 schizophrenia, 350 self-starvation, 344 social anxiety disorder (SAD), 351 substance abuse, 336e340 transdiagnostic perspectives, 335e336 ventromedial prefrontal cortex (vmPFC), 334, 337e338 habit reversal therapy (HRT), 356 habit triggers cues, 355 removal/avoidance, 355 implementation intentions, 355e356 Pavlovianeinstrumental transfer (PIT) task, 352e353 pharmacological and neuromodulatory treatment, 357 repetitive transcranial magnetic stimulation (rTMS), 357 responseeoutcome (R-O) associations, 331e332 selective serotonin reuptake inhibitors (SSRIs), 357 strengthening goal-directed control, 354e355 treatment, 353e357, 354f Computational reinforcement theory, 12e13 Conditioned reflexes, 447 Conditioning theories, 446e449 Conscious emotions, 317e318 Contingency management (CM), 356 Continuous performance test (CPT), 294e295 Controlereward trade-off novel two-step paradigm, 162e163, 163f two-step task, 160e161, 161f Cooperation habitual goal selection, 171e175, 173f model-based simulation brief rest (quiet wakefulness), 169f, 170 Dyna, 168 revaluation effect, 169f, 170 sample complexity, 168 sequential decision problem, 168e169, 169f standard model-free learning algorithms, 169e170 time complexity, 168 partial evaluation, 170e171, 171f Cortical determinants, 179e180 anatomical considerations and functional contribution, 189f, 190 corticoecortical interactions, 186e187 homologous regions, rat and primate, 188e189 insular cortex (IC), 184e186 medial prefrontal cortex (mPFC), 180e181 neuromodulatory systems, 187e188 orbitofrontal cortex (OFC), 181e184 Corticoebasal ganglia loops, 108e109 Corticoecortical interactions, 186e187 Costebenefit trade-off arbitration between habit and planning, 157e160 CRTs. See Cognitive remediation therapies (CRTs) Cueereward relationships, 252e254 Curse of dimensionality, 262e263 D Daw two-step task cognitive control ability, 158 controlereward trade-off, 160e161, 161f design and state transition structure, 155e156, 155f dorsolateral prefrontal cortex (dlPFC), 158 first-stage stimuli, 155e156 low-probability transitions, 156e157 numerical Stroop task, 157e158 second-stage states, 155e156 DBS. See Deep brain stimulation (DBS) Decision policies, 315 Deep brain stimulation (DBS), 357 Degradation/devaluation treatments, 430 Delay discounting, 206e207 “Detonator synapses”, 81 Development, decision-making actioneoutcome evaluations, 287 age-related trajectories, 296 AX-CPT task, 295 behavioral tasks, 279 cognitive models, 294e298 construction, 289e294 computational analysis, 285 continuous performance test (CPT), 294e295 distinct subregions, 286e287 dopaminergic system, 288 experimental manipulation, 280e281 feedback-driven learning process, 286e287 frontal and parietal regions, 297e298 goal-directedaction, 294e298 goal-directed behavior, 280e283, 286e289 goal-directed instrumental action, 283e286 habitual, model-free learner, 281e282 Index 459 Development, decision-making (Continued) hippocampal activity, 291e292 hippocampus, 287 medial temporal lobe regions, 290e291 “model-based” algorithms, 281 model-free and model-based prediction errors, 286e287 model-free learning, 285 neurocircuitry, 286e289 orbitofrontal cortex (OFC), 287 prediction error signals, 286e287 prefrontal cortex, 287 proactive control, 296e298 “proactive possible” trials, 295 proliferative period, 287e288 reactive control, 297e298 reward probabilities, 281e282 rostrolateral prefrontal cortex, 290e291 sleep, 292e293 slower second-stage responses, 293 “tree-search” process, 281 two-step task, 281e282, 282f ventromedial prefrontal cortex (vmPFC), 287 Diagnostic and Statistical Manual of Mental Disorders (DSM), 335 Disruptions, 336e351 Distinct subregions, 286e287 Dopamine (DA) cocaine, 210 delay discounting, 207 depletion, 207e208 firing patterns, 85 NAc core, 203 NAc shell, 202 optical stimulation, 202 receptor expression, 208e209 signaling, 108e109 Dopaminergic error signals cached value, 243, 251e252 cueereward relationships, 252e254 dopamine prediction error, 248e249 generic value, 251e252 midbrain dopamine system, 244 nucleus accumbens (NaCC), 253 Pavlovian conditioning theory, 246e247 phasic stimulation, 249e251 prediction errors, 245e248 reinforcement learning, 245e248 RescorlaeWagner model, 247e248 reward-paired cues, 244e245 rewards, 244e245 synaptic plasticity, 247e248 temporal difference reinforcement learning (TDRL), 248 ventral tegmental area (VTA), 244 Dopaminergic system, 288 Dopamine transporter binding (DAT), 375 Dorsal striatum, 372 Dorsolateral prefrontal cortex (dlPFC), 158 DREADD approach, 379e380 Drift-diffusion model boundary separation, 51 evidence integrator, 50 integration process, 51 log likelihood ratio, 50 nondecision time, 52 one-step binary choice, 49e50, 50f parameters, 51 perceptual decision-making tasks, 51 posterior ratio, 50 situational context, 51 state variable, 49e50 Drug addictions N-acetylcysteine, 378 alcohol, 377 alcohol self-administration, 369e370 a�amino- 3-hydroxy-5-methyl-4-isoxazolepro- pionic acid (AMPA) receptor subunits, 376 chronic drug users, 371 chronic exposure to psychostimulants, 375e376 chronic pretreatment, 370 dopamine transporter binding (DAT), 375 dorsal striatum, 372 DREADD approach, 379e380 excitatory postsynaptic potentials (EPSCs), 378 habit learning process, 367e368 habitual behavior, 369e371 high-impulsivity traits, 374e375 insensitivity to devaluation, 373 long-term depression (LTD), 377 medium spiny neuron (MSN), 372 methamphetamine exposure, 376e377 N-methyl-D-aspartate (NMDA) receptor, 376 neurocircuitry, 371e374 nicotine self-administration, 369e370 orbital frontal cortex (OFC), 379e380 paradigms, 370e371 Pavlovianeinstrumental transfer, 378e379 460 Index psychostimulants, 375e377 rescuing adaptive behavior, 377e380 self-administration training, 369 stimuluseresponse associations, 367e368 ventral striatum, 372 Drug craving, 339e340 Drug-seeking habits, 337 Dual-system theory belief criterion “flying-in-that-direction schemata”, 5e6 intentional account of behavior, 6 interval training, 4 issue, 4e5 sodium appetite, 5 causal, 2 desire criterion, 1 animal’s propensity, 2e3 devaluation, 3 food caching, 20e21 variable interval schedule, 3 food caching, 21 habits automaticity, 7 motivation, 8e9 outcome expectations, 9e12 stimulus-response/reinforcement mechanisms, 6e7 instrumental learning avoidance, 17e18 choice training, 17 computational reinforcement theory, 12e13 extended training, 15e16, 16f model-based computations, 12e13 rate correlational theory, 13e14 ratio and interval training, 14e15, 15fe16f system integration, 18e19 Pavlovian conditioning, 20 rational, 2 Dyna, 168, 170, 415 E Eating disorders, 343e346 Ecological behaviors, 420e421 Effortful responding, 204e205 EFT. See Episodic future thinking (EFT) Emotions, 316e318 conscious emotions, 317e318 nonconscious emotions, 318 to social goals, 318e319 Entorhinal cortex (EC), 80 Episodic future thinking/mental time travel, 125e126 cannabinoids impair memory, 138 clonidine, 138 nonlocal firing, 131 phase precession, 135 place cells, 131 place fields, 131 pyramidal cells, 131 reactivation during sleep, 134e135 sequences depict future trajectories to home location, 133, 134f sequences during sleep, 134e135 sharp-wave ripple complexes (SWR), 131e134, 132f theta oscillations, 131e132, 132f, 135, 138 theta sequences, 136e138 two-step decision task, 130e131, 130f vicarious trial and error (VTE), 137 Episodic memory, 82e83 Epistemic action, 411 ERP. See Exposure response prevention (ERP) Excitatory postsynaptic potentials (EPSCs), 378 Expected Utility Theory, 316 Exposure response prevention (ERP), 356 F Feedback-driven learning process, 286e287 Flexible goal-directed behavior attentional set-shifting, 204 canonical test, 204e205 effortful responding, 204e205 Pavlovian-to instrumental-transfer task, 205e206 reversal learning, 204 G Generalization construct, modify, and consolidate memory, 77 “grandmother cells”, 78 “Jennifer Aniston neurons”, 78 medial temporal lobe (MTL), 76e78 Parkinson disease (PD) patients, 77 retrieve relevant memories, 77 Generic value, 251e252 Gilles de la Tourette syndrome (GTS), 350 Goal-directed action conditioned reflexes, 447 conditioning theories, 446e449 Index 461 Goal-directed action (Continued) degradation and devaluation treatments, 430 habits, 430e439 implications, 438e439 incentive learning, 433 infralimbic cortices, 431 law of effect, 431e433 motivational processes, 439e446 affective change, 439e443 drive-incentive perspectives, 441e443 emotional feedback, 443e446 reinforcement, 439e443 reward, 443e446 stimuluseaction (S-R) association, 442f subsequent positions, 439e440 neural feedback processes, 433e438 arkypallidal neuronal projection, 437 basolateral amygdala (BLA), 433e434 central nucleus of the amygdala (CeN), 436e437 cholecystokinin, 435e436 instrumental actions, 437 intercalated cells (ITCs), 437e438 medial orbital cortex (mOFC), 435e436 nucleus accumbens core (NAco), 434e435 reinforcement, 436e438 reward, 433e436 substantia nigra pars compacta (SNc), 436e437 posterior part of dorsomedial striatum (pDMS), 430e431, 434e435 prelimbic involvement, 430e431 reinforcement, 431e433 reward, 431e433 specific stimuluse response (S-R), 431 S-R association, 431e433 Goal-directed deficits, 387e406 Goal-directed social behavior emotions, 316e318 conscious emotions, 317e318 nonconscious emotions, 318 to social goals, 318e319 goal pursuit, 309 guilt, 322e323 motivation, 310e314 negative affect, 320e322 quantitative disciplines, 314e315 social-affective control model, 320, 320f social goal representation, 310e314 affiliative motivations, 311 harm prevention, 310e311 self-benefit, 313e314 uncertainty, 313 translating goals to action, 314e316 common valuation system, 315e316 decision policies, 315 Expected Utility Theory, 316 Weak Axiom of Revealed Preferences (WARP), 316 ultimatum game (UG), 321 Goal selection, 418e420 “Grandmother cells”, 78 Guilt, 322e323 Gustatory cortex (GC), 184 H Habit propensity, 335e336 Habits, 153, 430e439 automaticity, 7 motivation, 8e9 outcome expectations contextual stimuli, 10 habitual S-R/reinforcement mechanism, 10 ideomotor theory, 11 Pavlovian stimulus, 9 specific PIT, 11e12 stimulus-response/reinforcement mechanisms, 6e7 Habitual behavior, 369e371, 407e408 Habitual goal selection, 171e175, 173f Harm prevention, 310e311 Hebbian plasticity, 411 Hedonic processing, 202 Hierarchical RL (HRL), 114e115 Hippocampal activity, 291e292 Hippocampalestriatal connectivity, 83e84 Hippocampus, 88 anterior dorsolateral striatum, 127e128 cognitive map coordinate system, 127e128 internal representation, 126 situation-action decisions, 126 dopaminergic neurons, 82 episodic future thinking/mental time travel, 125e126 cannabinoidsimpair memory, 138 clonidine, 138 nonlocal firing, 131 phase precession, 135 place cells, 131 462 Index place fields, 131 pyramidal cells, 131 reactivation during sleep, 134e135 sequences depict future trajectories to home location, 133, 134f sequences during sleep, 134e135 sharp-wave ripple complexes (SWR), 131e134, 132f theta oscillations, 131e132, 132f, 135, 138 theta sequences, 136e138 two-step decision task, 130e131, 130f vicarious trial and error (VTE), 137 generalization construct, modify, and consolidate memory, 77 “grandmother cells”, 78 “Jennifer Aniston neurons”, 78 medial temporal lobe (MTL), 76e78 Parkinson disease (PD) patients, 77 retrieve relevant memories, 77 Hullian situation-action process, 127e128 imagination, 125 learning transition probabilities, 85e86 medial prefrontal cortex (mPFC), 128e129 mental simulation, 86e87 “model-based decision making”, 126 neural computations autoassociative encoding, 81 computational linchpin, 80 “detonator synapses”, 81 entorhinal cortex (EC), 80 reencoding, 81 plus maze task, 126e127, 127f prefrontal cortex, 128e129 reinforcement learning, 126 sequences generated, 139e140 sequences improve/increase/predict planning pairwise spiking activity, 140e141, 141f place cell firing coordination, 140e141 SWR sequences, 140e142, 142f stimulus associations “delay conditioning”, 78e79 enhanced pattern similarity, 79 neural pattern similarity, 79e80 sensory preconditioning, 79 “trace conditioning”, 78e79 ventromedial prefrontal cortex, 125e126 vicarious trial and error (VTE), 128 Hot hedonic desire, 387e388 Hullian situation-action process, 127e128 I Ideomotor theory, 11 Impulsivity, 408 Incentive learning, 433 Inferior parietal lobule (IPL), 32e34 Infralimbic cortices, 431 Instrumental contingency, 31 Instrumental divergence boundary condition on goal-directedness, 40f behaviorereward correlations, 41 formalizing arbitration, 44 strategy arbitration, 42 system-balancing task, 43e44 system imbalance and monetary loss, 42 expected utility, 28, 28f information-theoretic formalization defined, 29e30 JenseneShannon (JS) divergence, 29 Shannon entropy, 30 intrinsic utility of control, 43f conventional RL agent, 36 flexible instrumental control, 34, 35f high- and zero-divergence rooms, 36e37 mean choice proportions, 37e38, 38f model-based RL agents, 35e36 OxfordeLiverpool Inventory of Feelings and Experiences (O-LIFEs), 39 perceptual outcome diversity, 36e37 psychopathology, 38e39 RL predictions and mean choice proportions, 35e36, 36f self-play vs. autoplay manipulation, 37e38 “sense of agency” (SOA), 38e39 neural correlates Bayesian model selection analysis, 32e34 formal accounts of goal-directed action values, 31 inferior parietal lobule (IPL), 32e34 instrumental contingency, 31 motivational and information-theoretic variables, 32 parametric modulation, 32e34, 33f state prediction error, 31e32 transition probabilities, 31e32 trial in choice task, 32, 33f probability distributions over three potential outcomes, 27e28, 28f Index 463 Instrumental learning avoidance, 17e18 choice training, 17 computational reinforcement theory, 12e13 extended training, 15e16, 16f model-based computations, 12e13 rate correlational theory, 13e14 ratio and interval training, 14e15, 15fe16f system integration, 18e19 Insular cortex (IC) anatomical considerations, 184 and goal-directed behavior, 184e185 summary, 186 Integration, 388e389 Intertemporal choice, 87 J “Jennifer Aniston neurons”, 78 JenseneShannon ( JS) divergence, 29 K Kolmogorov complexity, 71 L Latent spaces, 116 Law of effect, 431e433 LiCl pairings, 337 Long-term depression (LTD), 377 Loopy belief propagation, 63 M Markov decision process (MDP), 412b Markovian property, 262 Medial prefrontal cortex (mPFC), 200, 211 anatomical considerations, 180 and goal-directed behavior, 180e181 summary, 181 Medial temporal lobe (MTL), 76e77, 290e291 Mediodorsal thalamic nucleus (MD), 186e187 Medium spiny neuron (MSN), 372 Methamphetamine, 376e377 N-methyl-D-aspartate (NMDA) receptor, 376 Midbrain dopamine system, 244 “Model-based” algorithms, 281 Model-based reinforcement learning, 62 Model-free control, 419b Model-free learning, 285 Model-free RL algorithms, 108 artificial agent, 108e109 limitations, 109 probabilistic reward learning, 109 reward prediction error, 108 Motivation, 310e314, 439e446 affective change, 439e443 drive-incentive perspectives, 441e443 emotional feedback, 443e446 reinforcement, 439e443 reward, 443e446 stimuluseaction (S-R) association, 442f subsequent positions, 439e440 Multiple state spaces, 112e113 Multistep drift-diffusion models, 58f accuracy and reaction times, 55e56, 56f backward induction, 57e58 decision difficulty, 55 decision trees, 54e55, 54f deliberation process, 56e57 eye tracking, 59 one-step decision problems, 55 principles, 60 properties, 56e57 results of experiment 1, 55, 56f results of experiment 2, 56e57, 57f serial sampling procedure, 59 simulation of both experiments, 58e59, 59f simulation of experiment 1, 60, 60f worse quantitative fit, 60 Multivariate pattern analysis techniques, 264e265 N Negative affect, 320e322 Negative prediction errors, 391e394 Neural feedback processes, 433e438 arkypallidal neuronal projection, 437 basolateral amygdala (BLA), 433e434 central nucleus of the amygdala (CeN), 436e437 cholecystokinin, 435e436 instrumental actions, 437 intercalated cells (ITCs), 437e438 medial orbital cortex (mOFC), 435e436 nucleus accumbens core (NAco), 434e435 reinforcement, 436e438 reward, 433e436 substantia nigra pars compacta (SNc), 436e437 Neurocircuitry, 286e289, 371e374 Neuromodulatory systems, 187e188 Nicotine self-administration, 369e370 Nonconscious emotions, 318 464 Index Non-Markovian environments, 265 Nucleus accumbens (NAc), 211e212 anatomical organization, 200e201 behavioral tasks, 201, 201t decision-making DA-related proteins, 207 D1/D2 antagonist, 208e209 delay discounting tasks, 206 delay sensitivity, 207e208 D1 mRNA, 208e209 pharmacological inactivation, 208 flexible goal-directed behavior attentional set-shifting, 204 canonical test, 204e205 effortful responding, 204e205 Pavlovian-to instrumental-transfer task, 205e206 reversal learning, 204 guiding goal-directed behavior, 199 hedonic processing, 202 lateral hypothalamus (LH), 211 limbic-motor interface, 199 medial prefrontal cortex (mPFC), 211 NMDA receptor blockade, 201 non-NMDA glutamate receptors, 201 PrL cortex, 211 psychostimulant effects, 209e210 reward learning, 202e203 O Obesity, 340e343, 341f Obsessiveecompulsive disorder (OCD), 346e349, 349f “Options framework”, 114e115 Orbital frontal cortex (OFC), 287, 379e380 anatomical considerations, 181e182 belief states, 265e268, 267f curse of dimensionality, 262e263 decision-making, 272e273 definition, 260e261, 260f and goal-directed behavior, 182e183 Markovian property, 262 multivariate pattern analysis techniques, 264e265 non-Markovian environments, 265 OFC-lesioned animals, 263e264 partially observable, 262 Pavlovianeinstrumental transfer, 267e268 prefrontal cortex (PFC), 260e261 reinforcement learning (RL), 259 representational similarity approach, 266e267 state inferences, 265e268, 267f state-space theory, 259e265 summary, 183e184 value-independent outcome identities, 270e271 value signals, 268e272 OxfordeLiverpool Inventory of Feelings and Experiences (O-LIFEs), 39 P Parkinson disease (PD), 351 Partial evaluation, 170e171, 171f Pavlovian conditioning theory, 246e247 Pavlovianeinstrumental transfer (PIT), 8, 11e12, 267e268, 352e353, 378e379, 399e400 actioneoutcome learning neuronal circuitry, 226e228 basolateral complex (BLA), 225e226 central nucleus of the amygdala (CeA), 225e226 cue-based choice cholinergic interneurons (CINs), 232e234 cholinergic system, 231e232 confluence of signals, 234e236 modulator, 232e234 neuroanatomy, 224e229 neurobiology, 229e236 neuromodulation signals, 229e231 nucleus accumbens shell, 229e231 nucleus accumbens shells, 234e236, 235f dorsolateral striatum (DLS), 227e228 dorsomedialpart of the striatum (DMS), 227e228 experimental approaches, 222f infralimbic (IL) cortices, 227 neuronal integration, 228e229 orbitofrontal cortex (OFC), 224e225 prelimbic (PL) cortices, 227 “specific” form, 221e223 stimuluseoutcome learning, 224e226 Perceptual outcome diversity, 36e37 Phase precession, 135 “Phasic” activity, 202e203 Phasic stimulation, 249e251 Place fields, 131 Planning-as-inference model, 61e62, 61f “Planning-until-habit”, 170e171 Posterior dorsomedial striatum (pDMS), 181, 430e431, 434e435 Index 465 Prediction errors feedback learning, 83e84 signals, 286e287 Prefrontal cortex (PFC), 260e261, 287 Prelimbic (PL) cortices, 180e181, 227, 430e431 Proactive control, 296e298 Proactive possible trials, 295 Proliferative period, 287e288 Pseudocode, 109e110, 111t Psychostimulants, 209e210, 375e377 Pyramidal cells, 131 Q Q-learning model, 393e394, 394f Quantitative disciplines, 314e315 R Rate correlational theory, 13e14 Reactive control, 297e298 Real-life behaviors, 420e421 Real-life complexity, 420 “Reference class forecasting”, 86 Reinforcement, 431e433 Reinforcement learning (RL), 62, 84, 245e248, 259 action spaces abstract hierarchical action spaces, 113e114 temporally abstract actions, 114e115 addiction, 408 algorithms, 106e108, 106fe107f automaticity vs. control, 416e417 belief-based scheme, 413 belief-based vs. belief-free control, 411e413 belief-free agent, 411e413 brain, 107f, 108e109 create state/action spaces, 117e118 epistemic action, 411 framing problem, 109e110, 111t goal-directed decisionmaking, 408 goal-directed system structure, 415e416 habits and goals development, 417e418 ecological behaviors, 420e421 goal selection, 418e420 interactions, 421e422 model-free control, 419b neural correlates, 418 pursuing, 418e420 real-life behaviors, 420e421 real-life complexity, 420 habitual behavior, 407e408 Hebbian plasticity, 411 impulsivity, 408 latent state and action spaces, 115e116 limitations, 109 markov decision process (MDP), 412b model-based control, 416 model-based learning systems, 408 model-based vs. model-free dichotomy, 413e415 model-free and model-based control, 154e155 moral judgment, 408 state spaces multiple state spaces, 112e113 simplifying, 110e112 taxonomy, 408e409 mapping animal behavior, 407e408 value-based vs. value-free control, 409e411, 410f Repetitive transcranial magnetic stimulation (rTMS), 357 RescorlaeWagner model, 247e248 Rescuing adaptive behavior, 377e380 Responseeoutcome (R-O) associations, 331e332 Reversal learning, 204 Reward, 431e433 learning, 202e203 prediction error, 108 Reward-paired cues, 244e245 Reward probabilities, 281e282 Rewards, 244e245 Rho kinase, 180 Rostrolateral prefrontal cortex, 290e291 “Rules” or task sets, 113 S Sample complexity, 168 Schizophrenia, 350 apparent decision-making behavior, 389 BOLD responses, 392e393 causal models, 400e403 Chi-square test, 393e394 contemporary evidence, 394e399 driving performance, 391e392 goal-directed deficits, 387e406 historical evidence, 389e391 incentive value/reward, 388e389 integration, 388e389 learning causal models, 400e403 466 Index negative prediction errors, 391e394 Pavlovian-to-instrumental transfer (PIT), 399e400 prefeeding, 397 Q-learning model, 393e394, 394f stimuluseoutcome (S-O) learning, 399e400 stimulus-response tasks, 389e390 third possibility, 396 two-stage task, 395e396 Wisconsin Card Sorting, 390e391 Selective serotonin reuptake inhibitors (SSRIs), 357 Self-administration training, 369 Self-benefit, 313e314 Self-starvation, 344 “Sense of agency” (SOA), 38e39 Shannon entropy, 30 Sharp-wave ripple complexes (SWR), 131e132, 132f cannabinoids, 138 electrical stimulation, 133e134 memory consolidation, 132e133 pairwise spiking activity, 140e141, 141f place cell firing coordination, 140e141 predict future paths, 141e142, 142f vs. theta oscillations, 138 Similarity-based judgment, 68 Similarity functions alignment-based models, 71 feature-based models, 71 geometric models, 70e71 transformational models, 71 Simple binary choice best-fitting drift rate, 52 collapsing decision, 53 correct-and-error reaction times, 52 decision difficulty, 53 independent integrators, 52e53 mean accuracy, 52 memory retrieval, 53 neural implementation, 52e53 Slower second-stage responses, 293 Social-affective control model, 320, 320f Social anxiety disorder (SAD), 351 Social goal representation, 310e314 affiliative motivations, 311 harm prevention, 310e311 self-benefit, 313e314 uncertainty, 313 Specific stimuluse response (S-R), 431 S-R association, 431e433 State inferences, 265e268, 267f State spaces creation, 117e118 multiple state spaces, 112e113 simplifying, 110e112 State-space theory, 261e265 Stimuluseresponse associations, 367e368, 389e390 Substance abuse, 336e340 T Taxonomy, 408e409 Temporal difference reinforcement learning (TDRL), 248 Temporal dynamics of reward-based goal-directed decision-making Bayesian perspective on multistep choice inference process, 63 joint probability distribution, 61 loopy belief propagation, 63 model-based reinforcement learning, 62 planning-as-inference model, 61e62, 61f reinforcement learning (RL), 62 restrictions, 63 reward function, 62 drift-diffusion model. See Drift-diffusion model multistep drift-diffusion models. See Multistep drift-diffusion models simple binary choice best-fitting drift rate, 52 collapsing decision, 53 correct-and-error reaction times, 52 decision difficulty, 53 decision problems, 54 independent integrators, 52e53 mean accuracy, 52 memory retrieval, 53 neural implementation, 52e53 Temporally abstract actions, 114e115 Theta oscillations, 131e132, 132f Theta sequences, 132f, 137 entorhinal cortex, 139e140 go backward, 142e143 and phase precession, 136, 139e140 Time complexity, 168 Transdiagnostic perspectives, 335e336 Index 467 Tree-search process, 281 Two-step task, 281e282, 282f U Ultimatum game (UG), 321 Uncertainty, 313 V Value-based vs. value-free control, 409e411, 410f Value-independent outcome identities, 270e271 Value of information, 166 Ventral striatum, 372 Ventral subiculum, 200 Ventral tegmental area (VTA), 203, 244 Ventromedial prefrontal cortex (vmPFC), 86e87, 125e126, 287, 334 Vicarious trial and error (VTE), 86e87, 128, 137 W WARP. See Weak Axiom of Revealed Preferences (WARP) Weak Axiom of Revealed Preferences (WARP), 316 Working memory, 157e158 468 Index Index A B C D E F G H I J K L M N O P Q R S T U V W