Abstract
Rationale
During operant conditioning, animals associate actions with outcomes. However, patterns and rates of operant responding change over the course of learning, which makes it difficult to distinguish changes in learning from general changes in performance or movement. Thus, understanding how task parameters influence movement execution is essential.
Objectives
To understand how specific operant task parameters influence the repetition of future operant responses, we investigated the ability of operant conditioning schedules and contingencies to promote reproducible bouts of five lever presses in mice.
Methods
Mice were trained on one of four operant tasks to test three distinct hypotheses: (1) whether a cue presented concurrently with sucrose delivery influenced the pattern of lever pressing; (2) whether requiring animals to collect earned sucrose promoted the organization of responses into bouts; and (3) whether reinforcing only those bouts whose interresponse time (IRT) variance was below a target promoted reproducible patterns of operant behavior.
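The variance criterion in hypothesis (3) can be illustrated with a minimal sketch. It is not the authors' task code: the function names, the use of population variance, timestamps in seconds, and the target value in the usage note are all assumptions made for illustration. Only the five-press bout length and the below-target rule come from the text above.

```python
from statistics import pvariance

BOUT_LENGTH = 5  # five lever presses per bout, as stated in the abstract


def irt_variance(press_times):
    """Variance of the interresponse times (IRTs) within one bout.

    press_times: lever-press timestamps in seconds, in ascending order.
    Population variance is an assumption; the abstract does not say
    which estimator was used.
    """
    irts = [later - earlier for earlier, later in zip(press_times, press_times[1:])]
    return pvariance(irts)


def bout_is_reinforced(press_times, target_variance):
    """Return True if a complete bout has IRT variance below the target.

    target_variance is a hypothetical parameter; the abstract does not
    report the value used.
    """
    if len(press_times) != BOUT_LENGTH:
        return False
    return irt_variance(press_times) < target_variance
```

For example, under an assumed target of 0.01 s², evenly spaced presses at 0, 0.5, 1.0, 1.5, and 2.0 s would meet the criterion, whereas presses at 0, 0.1, 1.0, 1.2, and 3.0 s would not.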
Results
(1) Signaling reinforcer delivery with a cue increased learning rates but resulted in mice pressing the lever in fast succession until the cue turned on, rather than executing discrete bouts. (2) Requiring mice to collect the reinforcer between bouts had little effect on behavior. (3) A training strategy that directly reinforced bouts with low-variance IRTs was not more effective than a traditional fixed-ratio schedule at promoting reproducible action execution.
Conclusions
Together, our findings provide insights into the parameters of behavioral training that promote reproducible actions and that should be carefully selected when designing operant conditioning experiments.
Funding
This work was supported by NIH grants DA055380 and DA048931 to E.S.C. and 5T32MH065215-18 to M.C. and M.Z.L., as well as by funds from the Brain and Behavior Research Foundation, the Whitehall Foundation, and the Edward Mallinckrodt, Jr. Foundation to E.S.C.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Follman, E.G., Chevée, M., Kim, C.J. et al. Task parameters influence operant response variability in mice. Psychopharmacology 240, 213–225 (2023). https://doi.org/10.1007/s00213-022-06298-z
DOI: https://doi.org/10.1007/s00213-022-06298-z