Abstract
“To act or not to act” is a fundamental decision made in daily life. However, it is unknown how the relevant signals are transmitted to the secondary motor cortex (M2), which is the cortical origin of motor initiation. Here, we found that in a decision-making task in male mice, inputs from the thalamus to M2 positively regulated the action while inputs from the lateral part of the orbitofrontal cortex (LO) negatively regulated it. The motor thalamus that received the basal ganglia outputs transmitted action value-related signals to M2 regardless of whether the animal acted or not. By contrast, a large subpopulation of LO inputs showed decreased activity before and during the action, regardless of the action value. These results suggest that M2 integrates the positive signal of the action value from the motor thalamus with the negative action-biased signal from the LO to finally determine whether to act or not.
Introduction
To do or not to do, is that not the question? In addition to multiple choices with different actions, the choice of whether to act or not is an important decision to be made1. Through learning, animals come to select an action to obtain reward and avoid punishment with every decision made. In this decision, animals sometimes appear to choose non-action, even if the non-action results in no overt reward or does not avoid overt punishment2,3. How does the cerebral cortex decide whether or not to act and drive the action-related cortical activity when an action is about to be executed?
In rodents, the secondary motor cortex (M2) plays critical roles in value-based decision-making, motor planning, and motor initiation in response to sensory stimuli. In a two-choice decision-making task, the action-choice signal appears earlier in M2 than in the orbitofrontal cortex (OFC), medial prefrontal cortex (mPFC), and striatum4. M2 neurons projecting to the spinal cord regulate the action selection and execution5. Voluntary movement initiates when M2 activity exceeds a threshold6. Motor-related activity in M2 spreads over many cortical areas7. M2 activity required for the forelimb lever-pull movement precedes the activity of the primary motor cortex (M1)8.
Many studies strongly suggest that integration of reward-associated sensorimotor signals from other areas may be necessary in M2 for appropriate motor initiation4,9,10,11,12,13. M2 receives direct projections from numerous brain areas, including the sensory cortex, parietal cortex, motor thalamic nuclei (mTh), and OFC13,14,15,16,17. M2 and M1 receive outputs from the striatum, particularly the dorsal striatum, via the ventromedial (VM) and ventroanterior (VA) nuclei included in the mTh18,19. VM inputs to M2 are required for motor initiation12. In turn, M2 sends strong projections to both dorsomedial striatum (DMS) and dorsolateral striatum (DLS), whereas M1 sends projections to the DLS, but not the DMS20. The action value is strongly represented in the DMS and ventral striatum, but not the DLS4,21,22. DLS and M1 are more directly associated with activity related to movement8,23,24. Considering that M2 activities are relevant to the value-based decision-making and motor initiation, M2 is assumed to be influenced by the action value signals in the striatum. However, it has not been demonstrated whether the thalamic neurons that mediate the pathway from the striatum to M2 convey the action value and bias the choice to act or not to act when an action occurs probabilistically in response to a certain sensory stimulus. In addition, the lateral part of the OFC (LO) in the mouse suppresses compulsive behavior through its DMS-projecting pathway25. However, it is unknown whether the LO directly inhibits M2 activity for M2-dependent goal-directed actions.
In the current study, we trained head-fixed male mice to choose to either pull or not to pull a lever after presentation of one of two tone cues that were assigned to different reward probabilities2,3. When the cue with the low reward probability was presented, sometimes the mice pulled the lever, and sometimes they did not. While the mice performed this task, we conducted optogenetic stimulation of thalamic and LO inputs to M2. We also conducted two-photon calcium imaging of those thalamic axons in layer 1 (L1) of M2 that mainly originated from VM neurons (VM→M2L1 axons), axons of LO neurons that projected to L1 of M2 (LO→M2L1 axons), and layer 2/3 (L2/3) neurons in M2. Our results suggest that the decision-making on whether to act or not is regulated by convergence in M2 of the action value-related signal from the motor thalamus and the action-suppression signal from the LO.
Results
Mice show both action and non-action in response to a tone with lower reward probability in a two-tone lever-pull task
We trained head-fixed male mice to perform a lever-pull task with different reward probabilities (Fig. 1a). In each trial, one of the two pure-tone cues A and B was presented for 0.8–1.2 s (cue period) and then a go sound (white or pink noise) was presented for 300 ms. When the mice used their right forelimb to pull the lever within 1 s after the go sound onset (response period) and maintained the pulling for 200 ms (successful lever-pull), a water reward was delivered at a probability of 100% following the cue A presentation (cue A trials) and 20% following the cue B presentation (cue B trials). If they did not pull the lever, they did not receive a reward or punishment. The cue in the next trial was presented 4–5 s after the end of the response period. The period for 1 s before the cue onset was defined as the pre-cue period. In pre-training, the mice received a water reward every time they pulled the lever after either tone cue was presented. As the training days progressed, the lever-pull rate (the number of successful lever pulls divided by the number of trials without an early pull, where an early pull was defined as pulling during the cue period) remained high in cue A trials (~0.9), whereas it decreased to ~0.5 in cue B trials (Fig. 1b and Supplementary Fig. 1a). The reaction time from the go sound onset to the onset of the successful lever-pull was 100–200 ms, and was slightly but significantly shorter in cue A trials than in cue B trials (Supplementary Fig. 1b). The early-pull rate was 0.1–0.2, and was higher in cue A trials than in cue B trials (Supplementary Fig. 1b). Licking during the cue period (anticipatory licking) was infrequent, and licking started immediately after the go sound onset in successful lever-pull trials in cue A and B trials (Apull and Bpull trials, respectively), but not in non-lever-pull trials in cue B trials (Bnon-pull trials) (Fig. 1c). 
These results suggest that after the session had sufficiently progressed, the mice lowered the value of pulling the lever in cue B trials compared with cue A trials.
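As a minimal sketch of the lever-pull rate defined above (successful pulls divided by trials without an early pull), the calculation can be written as follows. The outcome labels and the example session are hypothetical and chosen only for illustration; they are not taken from the recorded data.

```python
from collections import Counter


def lever_pull_rate(outcomes):
    """Successful pulls divided by the number of trials without an early pull.

    `outcomes` holds one hypothetical label per trial:
    "pull"     - successful lever pull in the response period
    "non_pull" - no successful pull
    "early"    - lever pulled during the cue period (excluded from the denominator)
    """
    counts = Counter(outcomes)
    valid = len(outcomes) - counts["early"]  # trials without an early pull
    if valid == 0:
        raise ValueError("no trials without an early pull")
    return counts["pull"] / valid


# Made-up cue B session: 4 successful pulls, 4 non-pulls, 2 early pulls
cue_b = ["pull"] * 4 + ["non_pull"] * 4 + ["early"] * 2
rate_b = lever_pull_rate(cue_b)
```

With these made-up labels the early-pull trials are dropped, so the rate is computed over eight trials rather than ten.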
a Schematic of the two-tone lever-pull task. X = 100 for cue A trials and X = 20 for cue B trials. ITI, intertrial interval. b Lever-pull rate in the last training session (n = 89 mice, ***P < 0.001, Wilcoxon signed-rank test, two-sided). Black lines indicate the mean. c Averaged lever trajectory and lick probability for each trial type after learning (n = 89 mice; mean ± SEM; red, Apull trials; blue: Bpull trials; light blue: Bnon-pull trials). The solid vertical line, the onset of the cue presentation. The dotted vertical line, the onset of the go sound presentation. White, black, and gray bars at the top indicate the pre-cue, cue, and response periods, respectively. Scale bars, 2.5 mm for lever trajectory and 0.2 for lick probability. d Lever-pull rate after administration of artificial cerebrospinal fluid (ACSF; open box) and muscimol (closed box) to M2. Gray lines indicate each mouse. Black bars indicate mean ± SEM, n = 6, *P < 0.05, Wilcoxon signed-rank test, two-sided. e Top left and right, retrograde staining with FastBlue (green) put onto the M2 surface (2.34 mm anterior to the bregma). White, nuclear staining. Bottom, section images including retrogradely stained bilateral LO (left, 3.08 mm anterior to the bregma) and ipsilateral VM (right, 1.22 mm posterior to the bregma). Scale bars: 1 mm (top left), 100 µm (others). AM, anteromedial thalamic nucleus. f, g Effects of photoactivation and photoinactivation of mTh→M2 axons (f) and LO→M2 axons (g) on the lever-pull rate. Blue, ChR2 photostimulation trials (n = 11 mice for mTh→M2 axons and 8 for LO→M2 axons). Yellow, eNpHR photostimulation trials (n = 5 mice for mTh→M2 axons and 6 for LO→M2 axons). Black bars indicate mean ± SEM. *P < 0.05, ***P < 0.001, Wilcoxon signed-rank test, two-sided. h Effects of photostimulation of GFP-expressing mTh→M2 axons on the lever-pull rate (n = 4 mice). Black bars indicate mean ± SEM. Source data are provided as a Source Data file.
The mTh→M2 pathway positively biases the choice to pull the lever and the LO→M2 pathway negatively biases it
Muscimol inactivation of left M28 after training sessions reduced the lever-pull rate in both cue A and B trials (Fig. 1d). Thus, as in other types of forelimb movement tasks and decision-making tasks4,8, M2 was important for the performance of the task presented in this study. When a retrograde tracer was applied onto the cortical surface of M2, it strongly stained the ventromedial nucleus (VM) in the thalamus (Fig. 1e), as it did when it was applied onto the surface of M126,27,28. In addition, this tracer strongly stained bilateral LO (Fig. 1e).
Next, we examined whether activation and inactivation of the pathway from mTh (including VM and the ventrolateral nucleus [VL]) to M2 (mTh→M2 pathway) and that from LO to M2 (LO→M2 pathway) affected the task performance. The mTh and LO project their axons to other layers of M2 as well as to L1 (Supplementary Fig. 1c–e). Blue light illumination of ChR2-expressing axons from the left mTh in left M2 during the cue and response periods increased the lever-pull rate in cue B trials, whereas yellow light illumination of NpHR-expressing axons from the left mTh in left M2 tended to decrease the lever-pull rate in cue B trials (Fig. 1f, Supplementary Fig. 1f, g and Supplementary Table 1). By contrast, blue light illumination of ChR2-expressing axons from the bilateral LO in left M2 decreased the lever-pull rate in cue A and B trials (although this decrease was not significant in cue B trials), whereas yellow light illumination of NpHR-expressing axons from the bilateral LO in left M2 increased the lever-pull rate in cue B trials (Fig. 1g and Supplementary Fig. 1h, i). The lever-pull rate did not change when GFP was expressed in mTh neurons (Fig. 1h). Except for the increase in the early-pull rate in cue A trials in response to photostimulation of ChR2-expressing axons from the left mTh in left M2, the reaction time and early-pull rate did not change in either of the photostimulation experiments (Supplementary Fig. 1j, k). These results suggest that the mTh→M2 pathway positively regulated the lever-pull action after the go-cue onset and that the LO→M2 pathway negatively regulated it.
To examine whether the effects of these photostimuli were M2-specific, we also photostimulated the pathway from the motor thalamus to M1 (mTh→M1 pathway) and that from LO to M1 (LO→M1 pathway), although an anatomical LO→M1 pathway was not apparently observed (Supplementary Fig. 1d). Neither photoactivation nor photoinactivation of these pathways in M1 changed any of the lever-pull or early-pull rates (Supplementary Fig. 2a–g). However, as demonstrated in our previous study26, when ChR2-expressing mTh→M1 axons were stimulated while the forelimb was left floating without a lever, the forelimb moved (Supplementary Fig. 2h–j). Thus, simple photoactivation of the mTh→M1 pathway in our experimental condition did not directly induce the goal-directed lever-pull movement. These results suggest that mTh→M2 and LO→M2 pathways, but not mTh→M1 and LO→M1 pathways, were important for decision-making with regard to whether or not to pull the lever.
Two-photon calcium imaging of VM→M2L1 axons and LO→M2L1 axons during task performance
Next, to examine the neuronal activity of VM→M2L1 and LO→M2L1 pathways in the two-tone lever-pull task, we conducted two-photon imaging of GCaMP6s- or R-CaMP1.07-expressing VM→M2L1 axons and LO→M2L1 axons in L1 of M2 during task performance (Fig. 2a, b and Supplementary Fig. 3a–e). After imaging, we corrected motion artifacts, extracted active axonal boutons, and calculated their denoised fluorescence change (ΔF/F) (see “Methods” for details). We considered that multiple boutons that showed high pairwise correlations in activity (>0.6) originated from the same neurons26, and used their averaged activity as that of one axon (Supplementary Fig. 3f–h).
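The grouping of boutons into putative axons can be illustrated with a short sketch. The text states only that boutons with high pairwise correlations (>0.6) were treated as one axon and averaged; the use of Pearson correlation, the transitive (connected-component) grouping, and the toy traces below are assumptions made for illustration, not the procedure described in "Methods".

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def group_boutons(traces, threshold=0.6):
    """Group bouton indices whose pairwise correlation exceeds `threshold`.

    ASSUMPTION: groups are connected components (union-find), so two boutons
    can share a group through a third bouton.
    """
    parent = list(range(len(traces)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(traces)):
        for j in range(i + 1, len(traces)):
            if pearson(traces[i], traces[j]) > threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(traces)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def axon_traces(traces, groups):
    """Average the traces of the boutons in each group, time point by time point."""
    n_t = len(traces[0])
    return [[sum(traces[i][t] for i in g) / len(g) for t in range(n_t)] for g in groups]


# Toy dF/F traces (made up): boutons 0 and 1 co-vary, bouton 2 does not
traces = [
    [0.0, 1.0, 2.0, 1.0, 0.0],
    [0.1, 1.1, 1.9, 0.9, 0.0],
    [2.0, 0.0, 1.0, 0.0, 2.0],
]
groups = group_boutons(traces)
axons = axon_traces(traces, groups)
```

In this toy case the first two boutons are merged into one axon and the third is kept as a separate axon.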
a Top, schematic of the two-photon calcium imaging of VM→M2L1 and LO→M2L1 axons. Bottom, representative sagittal slice image of GCaMP6s-expressing LO neurons and R-CaMP1.07-expressing thalamic neurons. b Left, representative image of R-CaMP1.07-expressing VM→M2L1 axons (red) and GCaMP6s-expressing LO→M2L1 axons (green) in L1 of M2. Right, expansion of the boxed region in the left subimage. Scale bars: 20 μm (left), 10 μm (right). c, d Mean lever trajectory and lick probability (25 sessions from 10 mice) (c) and mean trial-averaged activity of VM→M2L1 and LO→M2L1 axons (n = 1932 and 1697, respectively, in the same sessions as those in c) (d) during the imaging experiment. Shading indicates ±SEM. Red: Apull trials; blue: Bpull trials; light blue: Bnon-pull trials. The black, white, and gray rectangles at the top represent the pre-cue, cue, and response periods, respectively. The cue period (0.8–1.2 s) was scaled to 1 s by equalizing the cue period length. Scale bars: 2.5 mm for lever trajectory, 0.2 for lick probability, and 1% ΔF/F for axonal activity. e Trial-averaged activity patterns in Apull (red), Bpull (blue), and Bnon-pull (light blue) trials for six clusters of VM→M2L1 and LO→M2L1 axons. Each row represents an individual axon and the horizontal axis represents time. The green and orange bars in the right column indicate VM→M2L1 and LO→M2L1 axons, respectively. White, black, and gray bars at the bottom indicate the pre-cue, cue, and response periods, respectively. f Time course of the mean activity of each cluster in the Apull (red), Bpull (blue), and Bnon-pull (light blue) trials shown in (e). Shading indicates ±SEM. Scale bars: 1% ΔF/F. g Proportions of axons classified into each cluster for VM→M2L1 (green) and LO→M2L1 (orange) axons (25 sessions from 10 mice). 
**P < 0.01, ***P < 0.001, the proportions of the corresponding clusters were significantly different from the expected value (16.7%), chi-squared test with post-hoc residual analysis (Bonferroni correction, six repeats), two-sided. The test was separately performed in VM→M2L1 (green) and LO→M2L1 (orange) axons. Source data are provided as a Source Data file.
We analyzed the neuronal activity in the following three types of trials: Apull trials, Bpull trials, and Bnon-pull trials (Fig. 2c). The Bnon-pull trials did not include trials in which the lever was pulled but not enough to obtain the reward, or trials with early pulls. The averaged activity of the VM→M2L1 and LO→M2L1 axons largely increased while the lever was pulled (Fig. 2d). In addition, during the cue period, VM→M2L1 axons showed an increase in averaged activity and maintained this increase until the go-cue onset in Apull and Bpull trials, with this increase appearing to be higher in Apull trials than in Bpull trials (Fig. 2d). In LO→M2L1 axons, the increase in the averaged activity occurred after the cue onset in Apull trials, but was not maintained until the go-cue onset (Fig. 2d). In Bpull trials, the averaged activity during the cue period appeared to decrease below both the baseline activity and the averaged activity in Bnon-pull trials (Fig. 2d). Thus, VM→M2L1 and LO→M2L1 axons appeared to represent the task-related information in different ways.
To objectively determine which types of activity patterns were included in these pathways in the cue and response periods, we used spectral analysis29 to classify the VM→M2L1 and LO→M2L1 axons into six clusters (clusters 1–6; Fig. 2e, f and Supplementary Fig. 3i, j; see “Methods” for details). The major characteristics of each cluster are summarized in Fig. 2f. Clusters 1 and 2 showed a sharp increase in activity at the onset of the lever-pull, then activity decreased before the lever was returned. The peak timing was earlier in cluster 2 than in cluster 1, and cluster 2 was therefore inferred to be more strongly related to motor initiation. Cluster 3 showed a decrease in activity in the lever-pull trials. Clusters 4 and 5 showed a slower increase in activity than clusters 1 and 2, and the reward dependency was high (Supplementary Fig. 4a, b). The activity of cluster 5 was higher in non-rewarded trials than in rewarded trials. Cluster 6 showed a transient increase in activity during the cue period in Apull trials. The proportion of the total axons contributed by each cluster differed between VM→M2L1 and LO→M2L1 axons (Fig. 2g). In VM→M2L1 axons, the proportion of each cluster ranged from ~10% to ~20%. In LO→M2L1 axons, the proportion represented by cluster 3 was particularly high at ~50%. The proportions of the other clusters were in the range of 5%–15%, with the smallest proportion being from cluster 2 at ~5%. Thus, the clustering analysis demonstrated that VM→M2L1 and LO→M2L1 axons represent different types of task-related information in different ways.
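The comparison of cluster proportions against an equal share (1/6, i.e., 16.7%) can be sketched as a goodness-of-fit statistic followed by per-cluster residuals with Bonferroni correction. The standardized-residual form and the normal approximation used below are one common choice and are assumed here; the counts are invented for illustration and are not the reported data.

```python
import math


def residual_analysis(counts):
    """Chi-squared statistic against equal proportions, plus per-category residuals.

    Returns (chi2, [(z, p_bonferroni), ...]).
    ASSUMPTION: residuals are standardized as (O - E) / sqrt(E * (1 - p0)) and
    converted to two-sided P-values with a normal approximation.
    """
    k = len(counts)
    n = sum(counts)
    p0 = 1.0 / k            # expected proportion per cluster (16.7% for k = 6)
    expected = n * p0
    chi2 = sum((c - expected) ** 2 / expected for c in counts)
    results = []
    for c in counts:
        z = (c - expected) / math.sqrt(expected * (1.0 - p0))
        p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
        p_bonf = min(1.0, p_two_sided * k)  # Bonferroni correction, k repeats
        results.append((z, p_bonf))
    return chi2, results


# Made-up counts for six clusters, with cluster 3 over-represented
counts = [20, 10, 100, 25, 25, 20]
chi2, results = residual_analysis(counts)

# Equal counts give a zero statistic and no significant residuals
chi2_eq, results_eq = residual_analysis([10] * 6)
```

The P-value of the overall chi-squared statistic is omitted because the Python standard library has no chi-squared distribution function.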
The lever-pull preference of activity during the cue period showed opposing traits between VM→M2L1 and LO→M2L1 axons
Next, we compared the activity between Apull and Bpull trials in each cluster to estimate the tone preference of each cluster. We also compared the activity between Bpull and Bnon-pull trials to estimate the pull preference. Most VM→M2L1 axon clusters, except for cluster 3, showed positive tone and pull preferences during the cue and response periods (Fig. 3a, b). These were related to the fact that the averaged activity was higher in Apull trials than in Bpull trials, and was higher in Bpull trials than in Bnon-pull trials (Fig. 2d). Only cluster 5 during the response period showed a preference for cue B trials over cue A trials, with the activity being higher in non-rewarded trials than in rewarded trials (Supplementary Fig. 4a, b). By contrast, in LO→M2L1 axons, cluster 3 showed negative tone and pull preferences during the cue and response periods (Fig. 3a, b). These corresponded to the higher averaged activity in Bpull trials than in Apull trials, and higher averaged activity in Bnon-pull trials than in Bpull trials (Fig. 2d). The tone and pull preferences of the other clusters were weak in the cue period, whereas in the response period, the tone and pull preferences were significant for all clusters except for cluster 6. Thus, the tone and pull preferences also differed between VM→M2L1 and LO→M2L1 axons. In particular, clusters 1, 2, and 4–6 of the VM→M2L1 axons showed a significant pull preference in both cue and response periods, whereas cluster 3 of the LO→M2L1 axons showed a significant non-pull preference. This is consistent with the photostimulation results in cue B trials: activation of mTh→M2 axons increased the pull rate while activation of LO→M2 axons tended to decrease the pull rate.
a Time courses of the averaged pull preferences of clusters 1–3 (mean ± SEM, cluster 1, n = 650 axons; cluster 2, n = 417 axons, cluster 3, n = 1247 axons). The mean preference over the pre-cue period was subtracted. Horizontal bars indicate the time points at which the preference was significantly different from that in the pre-cue period (Wilcoxon signed-rank test with Bonferroni correction, 30 repeats, two-sided). b Violin plots of the tone (top) and pull (bottom) preferences calculated from the averaged activity during the cue (left) and response (right) periods. The mean preference over the pre-cue period was subtracted. Thick lines indicate the mean. *P < 0.05, **P < 0.01, ***P < 0.001, Wilcoxon signed-rank test with Bonferroni correction (six repeats), two-sided. c Plots of the cue period noise activity of VM→M2L1 and LO→M2L1 axons in the simultaneous imaging experiments (n = 25 sessions from 10 mice). Top, Apull trials; Middle, Bpull trials; Bottom, Bnon-pull trials. Each dot indicates each trial (the total trial number was 882 in Apull trials, 1139 in Bpull trials, and 1517 in Bnon-pull trials). For each axon, the activity in each trial minus the within-session-averaged activity was calculated. For each trial, activity was averaged over cluster 3 in LO→M2L1 axons and clusters 1, 2, and 4–6 in VM→M2L1 axons to represent the noise activity. Spearman’s ρ and P-values (two-sided) are also shown. Source data are provided as a Source Data file.
The axonal activity during the cue period might largely reflect the infrequent anticipatory lick. However, even when all trials with an anticipatory lick were removed, the proportions of clusters and the tone and pull preferences in the cue period did not apparently change (Supplementary Fig. 4c–e). If the common upstream neurons positively regulated VM→M2L1 axons and negatively regulated LO→M2L1 axons, the trial-to-trial fluctuation should be negatively correlated between these axons. However, in each type of trial, no such noise correlation in the cue period was detected between clusters 1, 2, and 4–6 of VM→M2L1 axons, and cluster 3 of LO→M2L1 axons (Fig. 3c). This result suggests that the VM→M2L1 and LO→M2L1 pathways were independently generated and differently affected the decision-making regarding whether or not to pull the lever.
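The noise-correlation analysis can be sketched as follows: for each axon, the within-session mean is subtracted from the activity in each trial, the result is averaged over the axons of interest, and Spearman's ρ is computed across trials between the two pathways. The function names and the toy values are hypothetical, and the rank correlation is implemented by hand only to keep the example self-contained.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def cluster_noise(activity):
    """Per-trial noise activity averaged over axons.

    `activity[a][t]` is the cue-period activity of axon `a` in trial `t`
    (all trials of one type within one session).
    """
    per_axon = []
    for trials in activity:
        mean = sum(trials) / len(trials)
        per_axon.append([v - mean for v in trials])  # trial minus session mean
    n_trials = len(activity[0])
    return [sum(ax[t] for ax in per_axon) / len(per_axon) for t in range(n_trials)]


# Toy values (made up): two VM axons and one LO axon over four trials
vm_noise = cluster_noise([[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 5.0, 6.0]])
lo_noise = cluster_noise([[4.0, 3.0, 2.0, 1.5]])
rho = spearman(vm_noise, lo_noise)
```

The toy values are deliberately anti-correlated, which is the pattern a shared upstream drive would predict and which was not observed in the recorded data.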
The action value for tone B trials increased after the reward probability was suddenly increased
The striatum is thought to be critical for calculation of the action value in real-time. The VM strongly receives the final outputs from the basal ganglia18,26,30; thus, such an action value might be transmitted to M2 through the VM→M2L1 pathway. If so, VM→M2L1 axons would show a strong pull preference because the action value and lever-pull probability should be correlated. Alternatively, whether the lever was pulled or not could be determined during the cue period, and this signal might be transmitted to M2 through the VM→M2L1 pathway. LO inhibits compulsive behaviors through the pathway from LO to the dorsal striatum25. Thus, the LO→M2L1 pathway might convey either the action value or information on whether the lever was pulled or not as a negative signal.
To determine whether VM→M2L1 and LO→M2L1 pathways conveyed the action value, we investigated whether the neuronal activity that could not be explained by the presence or absence of the lever-pull action could be explained by the action value inferred from action-choice history. To compare the neuronal activity across cue B trials with largely differing action values within the same session, we introduced a value-shift task in which the reward probability in cue B trials was shifted from 20% to 100% 5 min after the session started, while the probability in cue A trials remained at 100%. After the mice that expressed GCaMP6s in VM→M2L1 or LO→M2L1 axons (Supplementary Fig. 5a, b) were trained to perform the standard two-tone lever-pull task, they then performed the value-shift task. During the first 20-min period in this task, we conducted two-photon imaging of the VM→M2L1 and LO→M2L1 axons. The mice showed an increased lever-pull rate in cue B trials after the shift (Fig. 4a). To explain the transition of the lever-pull rate in cue B trials in the value-shift task, we constructed a behavioral state model. The relative action value in cue B trials, ΔQ, which indicates the degree to which pulling the lever is better than not pulling it, was updated every cue B trial according to whether the lever was pulled or not and whether the reward was obtained or not (Fig. 4b). The probability of the lever pull was determined in each trial according to the sigmoidal transformation of ΔQ (see “Methods” for details). We constructed four models with and/or without the lever-pulling cost term (θ), which decreased ΔQ when the lever was pulled but the reward was not received, and the saving term (ψ), which decreased ΔQ and increased the probability of the following non-pull choice when the lever was not pulled2. 
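The structure of such a behavioral state model can be sketched in a few lines. The trial variable U(t) follows the description in the text (1 for a rewarded pull, −θ for an unrewarded pull, −ψ for a non-pull). The exact update equation is given in "Methods" and is not reproduced in this section, so the delta-rule update and the logistic form of the sigmoid below are assumed stand-ins, and all parameter values are arbitrary.

```python
import math


def trial_variable(pulled, rewarded, theta=0.0, psi=0.0):
    """U(t): 1 for a rewarded pull, -theta for an unrewarded pull, -psi for a non-pull."""
    if pulled and rewarded:
        return 1.0
    if pulled:
        return -theta
    return -psi


def pull_probability(dq):
    """ASSUMED logistic sigmoid mapping the relative action value to P(pull)."""
    return 1.0 / (1.0 + math.exp(-dq))


def update_dq(dq, pulled, rewarded, alpha, eps=0.0, theta=0.0, psi=0.0):
    """ASSUMED delta rule: move dq toward U(t) at learning rate alpha, plus bias eps."""
    u = trial_variable(pulled, rewarded, theta, psi)
    return dq + alpha * (u - dq) + eps


def run_model(trials, alpha, eps=0.0, theta=0.0, psi=0.0, dq0=0.0):
    """Return [(dq, p_pull), ...] evaluated before each cue B trial."""
    dq = dq0
    trace = []
    for pulled, rewarded in trials:
        trace.append((dq, pull_probability(dq)))
        dq = update_dq(dq, pulled, rewarded, alpha, eps, theta, psi)
    return trace


# Made-up cue B history: (pulled, rewarded) per trial
history = [(True, False), (False, False), (True, True), (True, True)]
# Model with psi and without theta (theta = 0); parameter values are arbitrary
trace = run_model(history, alpha=0.2, psi=0.5)
```

Setting θ and ψ to zero or to free values gives the four model variants compared in the text; under the assumed rule, a run of rewarded pulls drives ΔQ upward and a non-pull with ψ > 0 drives it downward from zero.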
The result of the model evaluation using the Akaike information criterion and Bayesian information criterion did not contradict the result of our previous study using the similar two-tone lever-pull task, in which the model with ψ and without θ is the best model2 (Fig. 4c). In this model, ΔQ increased after the change in the reward probability, and this increase well explained the increase in the lever-pull rate in cue B trials (Fig. 4d and Supplementary Fig. 5c).
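For reference, the two criteria can be computed from the log-likelihood of the observed pull and non-pull choices under a model's predicted pull probabilities. The sketch below uses the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L; the predicted probabilities, choices, and parameter counts are made up for illustration.

```python
import math


def bernoulli_log_likelihood(p_pull, pulled):
    """Log-likelihood of binary choices given per-trial pull probabilities."""
    tiny = 1e-12  # clip to avoid log(0)
    ll = 0.0
    for p, y in zip(p_pull, pulled):
        p = min(max(p, tiny), 1.0 - tiny)
        ll += math.log(p) if y else math.log(1.0 - p)
    return ll


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_trials):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_trials) - 2.0 * log_likelihood


# Made-up example: four cue B trials, constant predicted pull probability
p_pred = [0.5, 0.5, 0.5, 0.5]
choices = [True, False, True, True]
ll = bernoulli_log_likelihood(p_pred, choices)

# Hypothetical parameter counts: a baseline model and a model with one extra term
delta_aic = aic(ll, 3) - aic(ll, 2)
delta_bic = bic(ll, 3, len(choices)) - bic(ll, 2, len(choices))
```

With an identical likelihood, the extra parameter costs 2 in AIC and ln n in BIC, so a model with an additional term is favored only if it improves the fit by more than this penalty.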
a Change in lever-pull rate in cue A (red) and B (blue) trials over 5 min in the value-shift task. The reward probability in Bpull trials was changed from 20% to 100% 5 min after the imaging started (n = 13 sessions from 8 mice). Each thin line indicates each session. Thick lines are mean ± SEM. **P < 0.01, Wilcoxon signed-rank test with Bonferroni correction (six repeats), two-sided. b Schematic of the behavioral state model. The relative action value in the (t + 1)th cue B trial (ΔQ(t + 1)) is updated according to ΔQ(t), action in the tth trial (P(t)), the trial variable in the tth trial (U(t)), the learning rate (α), and the constant bias term (ε). U(t) is 1 when the lever was pulled and the reward was delivered, −θ when the lever was pulled and the reward was not delivered, and −ψ when the lever was not pulled. See “Methods” for details. c Validation of four behavioral state models using Akaike’s information criterion (AIC) and the Bayesian information criterion (BIC). The difference in AIC or BIC between each model and the model with ψ of 0 and θ of 0 is indicated as ΔAIC or ΔBIC, respectively (mean ± SEM, n = 13). *P < 0.05, **P < 0.01, Wilcoxon signed-rank test with Bonferroni correction (six repeats), two-sided. d Representative trace of the recorded lever-pull rate (black) and the lever-pull rate predicted from the model with ψ as a variable and θ of 0 (orange) in cue B trials. The lever-pull rate is shown as the 8-trial moving average. The filled red, filled gray, and open gray dots indicate rewarded trials, pull trials, and non-pull trials, respectively. The trace of ΔQ is also shown (magenta). Source data are provided as a Source Data file.
VM→M2L1 axons represent the action value for cue B trials during the cue and response periods, whereas LO→M2L1 axons do not
The imaged axonal boutons were classified into six clusters that were determined in Fig. 2f (six original clusters) according to the similarity of their activity to the averaged activity in each of the original clusters. Here, we focused on the activities of clusters 1, 2, and 4–6 of VM→M2L1 axons and cluster 3 of LO→M2L1 axons in cue B trials. We classified cue B trials with ΔQ values from the top 80% to 0% in the model with ψ and without θ into three groups of high, middle, and low values (Supplementary Fig. 5d; see “Methods” for details), and then separated the classified groups into Bpull and Bnon-pull trials.
In both cue and response periods, the averaged activity of clusters 1, 2, and 4–6 of VM→M2L1 axons in Bpull and Bnon-pull trials increased as ΔQ increased (Fig. 5a). In the cue period, the activity in the trials with the low and middle groups of ΔQ did not differ between Bpull and Bnon-pull trials. This indicates that the activity in the cue period did not largely reflect the motor preparatory activity. In the response period, there was a tendency for correlation between the activity and ΔQ in both Bpull and Bnon-pull trials. The activity within the same ΔQ group differed substantially between Bpull and Bnon-pull trials (Fig. 5a). To determine whether behavioral variables could explain the correlation tendency between the neuronal activity and ΔQ, and the difference in activity between Bpull and Bnon-pull trials in the response period, we constructed an encoding model to explain the neuronal activity from the variables of the task events (cue A or B, rewarded or non-rewarded, and pull or non-pull) and the behaviors that could be measured (lever trajectory, pull onset, pull end, and lick rate) (Supplementary Fig. 5e). Then, we examined the properties of the residual activity (the activity minus the encoding model-fitted activity) of the axons that were significantly explained by the encoding model (see “Methods” for details). The positive correlation tendency between the residual activity and ΔQ was again detected in both cue and response periods (Fig. 5b, c). Furthermore, the residual activity in the same group of trials did not differ between Bpull and Bnon-pull trials in the cue period. The result for the model with neither ψ nor θ was very similar (Supplementary Fig. 5f–h). These results suggest that the majority of VM→M2L1 axons possessed the relative action value throughout the cue and response periods, and did not possess the decision to pull or not, at least during the cue period.
a, d The mean activity of clusters 1, 2, and 4–6 of VM→M2L1 axons (a) and cluster 3 of LO→M2L1 axons (d) during the cue (left) and response (right) periods in Bpull (blue) and Bnon-pull (cyan) trials. Trials were divided into three groups according to the ΔQ(t) value in each session. Bars indicate mean ± SEM. n = 518 axons in 6 sessions from 4 mice in (a). n = 629 axons in 7 sessions from 4 mice in (d). Differences between Bpull and Bnon-pull trials in the trials with the same group of ΔQ and differences between the trials with the different groups of ΔQ in the same trial type were tested with Wilcoxon rank sum test with Bonferroni correction (three repeats) (*P < 0.05, ***P < 0.001, two-sided). b, e Time course of the residual activity of clusters 1, 2, and 4–6 of VM→M2L1 axons (b) and cluster 3 of LO→M2L1 axons (e) in Bpull (top) and Bnon-pull (bottom) trials. Trials were divided into three groups according to ΔQ(t) (from dark to light in each color: high, middle, and low). Shading indicates ±SEM. c, f The mean residual activity of clusters 1, 2, and 4–6 of VM→M2L1 axons (c) and cluster 3 of LO→M2L1 axons (f) during the cue (left) and response (right) periods. Bars indicate mean ± SEM. Differences between Bpull and Bnon-pull trials in the trials with the same group of ΔQ and differences between the trials with the different groups of ΔQ in the same trial type were tested with Wilcoxon rank sum test with Bonferroni correction (three repeats) (*P < 0.05, **P < 0.01, ***P < 0.001, two-sided). Source data are provided as a Source Data file.
The averaged activity of cluster 3 of LO→M2L1 axons in cue B trials did not show a tendency to correlate with ΔQ in the model with ψ and without θ in both cue and response periods (Fig. 5d). By contrast, activity was higher in Bnon-pull trials than in Bpull trials, except for the trials with the middle ΔQ value in the cue period (Fig. 5d). After the encoding model processing, this difference was detected in all three types of trials during the cue period (Fig. 5e, f). Thus, this different activity did not reflect the difference in the history of the action and reward. Although a large difference in the activity between Bnon-pull and Bpull trials in the response period was reduced in the residual activity, the residual activity still did not correlate with ΔQ (Fig. 5e, f). The result for the model with neither ψ nor θ was very similar (Supplementary Fig. 5i–k). These results suggest that the activity of cluster 3 of LO→M2L1, which showed a decrease in the lever-pull trials, possessed the negative signal regarding the decision to pull in the cue and response periods, rather than the relative action value.
mTh→M2 inputs might strongly affect the M2 activity relevant to the action initiation
We then asked the question “Were the six patterns of VM→M2L1 and LO→M2L1 axonal activity embodied in the activity of M2 neurons?” We conducted two-photon calcium imaging of jRGECO1a-expressing neurons in M2 L2/3 during the standard two-tone lever-pull task (Fig. 6a, b and Supplementary Fig. 6a). We classified the imaged M2 neurons into the six clusters that were determined for VM→M2L1 and LO→M2L1 axons, according to the similarity of each M2 neuron’s activity to the averaged activity in each of the six original clusters (Fig. 6c). In M2 L2/3 neurons, cluster 2 neurons (which were probably the most related to the lever-pull initiation) formed the highest proportion, at ~30% (Fig. 6d). Cluster 1 and 2 neurons showed a positive lever-pull preference during the cue and response periods, which was similar to the preference of VM→M2L1 axons (Fig. 6e and Supplementary Fig. 6b). Although the proportion represented by cluster 3 was lower in M2 neurons than in VM→M2L1 axons, cluster 3 still showed the negative lever-pull preference during the cue and response periods, which was similar to the preference of LO→M2L1 axons (Fig. 6e and Supplementary Fig. 6b). These results suggest that the prominent properties of VM→M2L1 and LO→M2L1 axons were inherited by subsets of M2 L2/3 neurons.
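The template-based assignment described above can be sketched as follows. This is only an illustration: the similarity measure is not specified in this section, so Pearson correlation is an assumption here, and the function names (`pearson`, `assign_cluster`) and the toy traces are hypothetical rather than taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat trace is treated as uncorrelated
    return cov / sqrt(vx * vy)


def assign_cluster(trace, templates):
    """Return the 1-based index of the template most similar to the trace.

    `templates` holds the averaged activity of each original axonal cluster.
    Similarity is Pearson correlation (an assumption, see the lead-in).
    """
    scores = [pearson(trace, t) for t in templates]
    return scores.index(max(scores)) + 1


# Toy templates (hypothetical values, not real data).
templates = [
    [0, 1, 2, 3],  # activity ramps up
    [3, 2, 1, 0],  # activity ramps down
    [0, 2, 2, 0],  # transient activity
]
neuron = [0.1, 0.9, 2.2, 2.8]
cluster = assign_cluster(neuron, templates)
```

With six templates instead of three, the same function would place every imaged neuron into one of the six original clusters, which is what allows the cluster proportions of neurons and axons to be compared.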
a Schematic of two-photon calcium imaging of non-specific M2 neurons in L2/3. b Representative image of M2 L2/3. Scale bar: 100 μm. c Activity patterns of each cluster of M2 L2/3; n = 497 neurons for cluster 1, 983 for cluster 2, 893 for cluster 3, 494 for cluster 4, 692 for cluster 5, and 234 for cluster 6. Shading indicates ±SEM. Scale bar: 1% ΔF/F. d Proportions of the six clusters among the M2 L2/3 neurons (n = 18 sessions from 5 mice). ***P < 0.001, the proportions of the corresponding clusters were significantly different from the expected value (16.7%), chi-squared test with post-hoc residual analysis (Bonferroni correction, six repeats), two-sided. e Violin plots of the pull preference of each cluster during the cue (left) and response (right) periods. The preference in the pre-cue period was subtracted. Thick lines indicate the mean. ***P < 0.001, Wilcoxon signed-rank test with Bonferroni correction (six repeats), two-sided. Source data are provided as a Source Data file.
When spectral clustering was directly applied to M2 neuronal activities, they were classified into four clusters (M2 clusters; Supplementary Fig. 6c–e). According to the similarities of the time courses of activity (Supplementary Fig. 6d and Fig. 6c), the M2 clusters 1, 2, 3, and 4 roughly correspond to the original clusters 3, 2, 5, and 4, respectively. The proportions of M2 clusters 1 and 2 were high (Supplementary Fig. 6f), and their pull preferences were negative (but not significant in the cue period) and positive, respectively (Supplementary Fig. 6g). These properties were consistent with those of clusters 3 and 2 of M2 neurons, respectively (Fig. 6d, e).
Finally, we examined what activities were exhibited by M2 neurons that were synaptically coupled with mTh and LO. Taking advantage of the anterograde transsynaptic spread property of AAV131, we injected AAV1-hSyn-Cre into mTh or LO, and AAV-flex-jRGECO1a into M2 (Fig. 7a). Very few M2 L2/3 neurons projected to the motor thalamus (Supplementary Fig. 7a–f), as described previously32, indicating that when AAV1-hSyn-Cre was injected into mTh, jRGECO1a-expressing L2/3 neurons in M2 were transsynaptically labeled in the anterograde direction. We refer to these neurons as M2←mTh neurons. Since LO and M2 project bidirectionally17, it was possible that M2 neurons projecting to LO were retrogradely transfected and that these transfected neurons also expressed jRGECO1a. By comparing the numbers of fluorescent neurons with retrograde transfections of AAV1 and AAVrg (an AAV retrograde serotype)33, following the methods used in a previous study34, we estimated that when AAV1-hSyn-Cre was injected into the LO, ~70% of jRGECO1a-expressing L2/3 neurons in M2 showed anterograde transsynaptic transfection, while the remaining 30% were neurons with retrograde transfection (Supplementary Fig. 7g–o). Thus, we considered that the majority of jRGECO1a-expressing M2 L2/3 neurons were those receiving synaptic inputs from LO, and refer to them as M2←LO neurons.
a Schematics of two-photon calcium imaging of M2←LO neurons (top) and M2←mTh neurons (bottom). b Representative images of M2←LO neurons (top) and M2←mTh neurons (bottom). Scale bar, 50 μm. c Time course of the mean activity of each cluster of M2←LO neurons (top) and M2←mTh neurons (bottom) in Apull (red), Bpull (blue), and Bnon-pull (cyan) trials. Shading indicates ±SEM. Scale bar: 1% ΔF/F. d Proportions of the six clusters of M2←mTh neurons (green) and M2←LO neurons (orange). *P < 0.05, **P < 0.01, ***P < 0.001, the proportions of the corresponding clusters were significantly different from the expected value (16.7%), chi-squared test with post-hoc residual analysis (Bonferroni correction, six repeats), two-sided. The test was separately performed in M2←mTh (green) and M2←LO (orange) neurons. e Violin plots of the pull preference of each cluster of M2←mTh neurons (green) and M2←LO neurons (orange) during the cue (left) and response (right) periods. The preference in the pre-cue period was subtracted. Thick lines indicate the mean. *P < 0.05, **P < 0.01, ***P < 0.001, Wilcoxon signed-rank test with Bonferroni correction (six repeats), two-sided. Source data are provided as a Source Data file.
We imaged M2←mTh neurons and M2←LO neurons during task performance (Fig. 7b and Supplementary Fig. 8a), and classified these neurons into six clusters, as we did for non-specific M2 neurons (Fig. 7c). The cluster 2 proportion was highest in M2←mTh neurons, whereas the cluster 3 proportion was highest in M2←LO neurons (Fig. 7d). Many clusters, except for cluster 3, showed positive pull preferences, and the negative pull preference of cluster 3 was detected in both types of neurons in the response period (Fig. 7e and Supplementary Fig. 8b). As in the photostimulation experiments, these imaged neurons should include neurons that received inputs from other thalamic nuclei and neurons receiving the inputs in layers 2–6 of M2. Nevertheless, they shared some properties that were observed in VM→M2L1 and LO→M2L1 axons: a higher proportion of cluster 2 and a lower proportion of cluster 3 in M2←mTh neurons than in M2←LO neurons (Fig. 7d), the positive pull preference tendency of clusters 1, 2, and 4–6 in M2←mTh neurons, and the highest proportion and negative pull preference of cluster 3 in M2←LO neurons (Fig. 7e).
When M2←mTh neurons and M2←LO neurons were classified into the four M2 clusters, the different proportions of M2 clusters 1 and 2 between M2←mTh and M2←LO neurons were relatively conserved (Supplementary Fig. 8c, d). Considering that the proportion of cluster 2 was the highest in the non-specific M2 neurons and M2←mTh neurons (Figs. 6d and 7d), mTh→M2 inputs, including VM→M2L1 inputs, might play an important excitatory role in the decision to pull the lever and generation of the motor initiation signal.
Discussion
We have revealed that VM→M2L1 and LO→M2L1 axons showed distinct activities in a decision-making task regarding whether to act or not. There was no trial-to-trial negative correlation between the activity in clusters 1, 2, and 4–6 of VM→M2L1 axons and that in cluster 3 of LO→M2L1 axons, suggesting that the signals in the VM→M2L1 and LO→M2L1 pathways are likely to be independently generated. Even if the decreased activity in LO might directly affect the DMS or the ventral striatum25,35, this signal should be processed along with many other signals throughout the basal ganglia. As a result, the final output in the VM (VM→M2L1 axons) might not correlate with the activity of LO→M2L1 axons. We propose that the balance between VM and LO signals plays an important role in the decision-making process in M2 regarding whether to act or not.
The effect of the action value signal of the mTh→M2 pathway on the M2 activity involved in motor initiation
Our behavioral and encoding models revealed that VM→M2L1 axons possessed the relative action value in each cue B trial during the cue and response periods, irrespective of whether the lever was pulled or not. We also showed that activation and inactivation of the motor thalamic inputs to M2 can bias the choice towards action and non-action, respectively. These results directly connect the report stating that the chosen-action signal is strongly represented in M2 with reports stating that the action value is strongly represented in the ventral striatum and DMS in left or right choice tasks4,21. Although the action value is widely distributed across brain areas including the LO, mPFC, striatum, M2, and hippocampus36, it is strongly represented and updated in VS and DMS4,21,37. Anatomically, mPFC, OFC, and the motor cortical areas are included in different cortico-basal ganglia loops, with these including the ventral striatum, DMS (or caudate nucleus), and DLS (or putamen), respectively18,19,38. These loops are segregated, but partially overlap, e.g., mPFC and DMS are linked with the mediodorsal thalamus, M2 also projects to DMS, and DMS can affect M2 and M1 activities through the motor thalamus18,20,39. Thus, we consider that the relative action value represented in DMS would be transmitted to the VM→M2L1 pathway. In addition, in trials with the same relative action value, the activity of clusters 1, 2, and 4–6 in VM→M2L1 axons during the cue period clearly increased as much when the lever was not pulled as when it was pulled. This suggests that the chosen-action signal was not strongly represented in VM→M2L1 axons before the go-cue onset. The mPFC maintains the action value over a long timescale in a dynamic foraging task, and represents the action value better than M2 in the pre-cue period40. 
These results imply that the action value is updated in DMS every trial, is maintained in mPFC and DMS in the intertrial interval, and is then sent to M2 after the cue presentation.
The cluster proportion was similar between non-specific M2 L2/3 neurons and M2←mTh L2/3 neurons, suggesting that VM→M2L1 axons widely distributed the relative action value signal to M2 neurons, which could bias the choice to action or non-action. However, the pull preference in the cue period was very weak in clusters 4 and 5 of the non-specific M2 L2/3 neurons. By contrast, a strong pull preference was detected in clusters 1 and 2, which showed strong activity immediately after the lever-pull onset. The relative action value signal might be strongly incorporated into the generation of the motor initiation signal. We therefore propose that whether to act or not is determined by M2 rather than VM→M2 axons or their upstream striatum.
Decreased activity of the majority of the LO→M2 axons in lever-pull trials
In contrast to the activity of VM→M2L1 axons, approximately half of the LO→M2L1 axons (cluster 3) showed decreased activity during the cue and response periods of the lever-pull trials, irrespective of whether the relative action value was low or high. This result is consistent with the report that there are OFC neurons that increase their activity during behavioral inhibition, and that this stopping-related activity is unrelated to reward-related activity in the macaque41. However, this result is inconsistent with reports that the action value, but not the chosen action, is represented in the LO to some extent4,42, and that ChR2 stimulation of LO increased devalued lever pressing43. This discrepancy might be because the proportion of cluster 3 was small in the LO neurons (including those without M2 projections), and some other neurons might represent the action value and their activation might induce the lever pressing through other pathways17.
The gradual decrease in cluster 3 activity of LO→M2L1 axons during the cue and response periods was also detected in the cue A trials in which mice consistently pulled the lever. Thus, the baseline activity of the LO→M2 pathway might generally serve to inhibit M2-triggered movements. However, photoinhibition of this pathway did not increase the early-pull rate. Thus, both the go signal for the motor initiation that was partly provided by the motor thalamus and a decrease in the LO→M2 inputs would be necessary for initiation of the goal-directed movement.
M2 neuron types that may be downstream of the LO→M2 pathway
Considering that activation of the pathway from the LO to the DMS activates fast-spiking neurons to reduce the activity of medium-spiny neurons25, the decreased activity of LO→M2L1 axons might reduce the activity of a subset of M2 inhibitory neurons and disinhibit the excitatory neurons that are required for motor initiation. In fact, the interplay between the parvalbumin-positive and somatostatin-positive neurons in M2 that receive external inputs is important in the initiation of locomotion44. Even without taking into account the direct projection from LO→M2 axons to M2 inhibitory neurons, considering that the proportion of cluster 3 in M2←LO neurons was as high as that in LO→M2L1 axons, it may simply be that the decreased excitatory inputs lower the activity of their postsynaptic excitatory neurons in M2, which engages the M2 population activity into a state necessary for the motor initiation and execution45. Alternatively, reduction of the LO→M2 inputs might reduce the activity of the subthalamic nucleus-projecting neurons that might inhibit the movement as the hyperdirect pathway46,47,48. It should be clarified whether the LO→M2 pathway plays a role in movement initiation and inhibition in other types of tasks such as go/no-go and stop-signal tasks49,50,51, and how different subtypes of inhibitory neurons and subcortical projecting neurons in M2 change their activities in these tasks.
Possible functions of the LO→M2 pathway in action inhibition
Although it is still a matter of debate as to whether OFC has behavioral inhibitory functions in humans and rodents50,51,52, it is known that dysfunction of the OFC is related to obsessive-compulsive disorder, and that activation of the pathway from the LO to the DMS suppresses compulsive behaviors25,53. Within the basal ganglia, the indirect pathway including dopamine D2 receptor-expressing neurons is related to suppression of inappropriate actions54. These cortico-striatal and subcortical pathways are strongly influenced by dopaminergic projections. Therefore, their activation might accompany the updating of the action value. Considering that M2 modulates M1 activity to execute appropriate movements in context-dependent manners8,55, the cortico-cortical LO→M2 pathway might play a role in inhibition of a certain action in a certain context without updating of its value.
In humans, the right lateral OFC is activated when it is necessary to suppress the response to another preferred item in order to select the most preferred item56. In the current task, this would correspond to the suppression of the lever-pull in cue B trials because lever pulling in cue B trials was not punished, but was rewarded at 20% probability, and in fact, the mice pulled the lever in ~50% of cue B trials. In addition, LO-lesioned rats are not able to wait long enough to receive a large reward, suggesting that LO activity may be necessary to wait for one57. Thus, the sustained activity of cluster 3 in cue B trials without pulling might reflect a decision to skip these cue B trials and wait for a tone A presentation. This interpretation suggests that non-action selection1,2 can be discussed in the context of two-choice decision-making.
Area 47/12 in rhesus and marmoset monkeys, which roughly corresponds to the LO in the rodent, innervates the premotor cortex, although the connection is not strong58,59. It should also be clarified whether this pathway is involved in decision-making and response inhibition in primates.
Limitations
The optogenetic experiments did not have a sufficiently large number of animals to permit statistical tests. In addition, the mTh→M2 and LO→M2 axonal pathways, which were not imaged in L1, were also photostimulated. We used only one combination of the reward shift (from 20% to 100% reward probability in tone B trials) in the value-shift task, and did not examine the condition with other reward shifts (for example, from 100% to 20% reward probability in tone A trials, and shifts between the reward probabilities assigned to tones A and B). It should be clarified whether the very different functional properties of LO→M2 and mTh→M2 pathways are generally applied to other decision-making tasks, including two-choice tasks and go/no-go tasks. The estimation of the anterograde transsynaptic transfection and retrograde transfection with AAV1-Cre assumed that the retrograde transfection efficiency of AAV1-Cre is approximately the same in M2-striatal and M2-LO pathways.
Methods
Animals
All animal experiments were approved by the Animal Experimental Committee of the University of Tokyo. Wild-type C57BL/6 mice (male, aged 8–25 weeks; Japan SLC) were used for all behavioral experiments. Transgenic mice with Cre-dependent tdTomato expression (Ai14, B6.Cg-Gt(ROSA)26Sortm14(CAG-tdTomato)Hze/J, Strain #:007914, Jackson Laboratory)60 were obtained by crossing homozygous transgenic mice with wild-type C57BL/6 mice in our laboratory. The animals used for the experiments were hemizygous, and their age ranged from 8 to 30 weeks. In the histological experiments using Ai14 mice, both sexes were used. All mice were provided with food and water ad libitum and were housed in a 12:12 h light-dark cycle (light cycle; 8 AM–8 PM). Ambient temperature and humidity were 22–25 °C and 40–80%, respectively.
Virus production
For imaging of R-CaMP1.07, the GCaMP3 DNA of pAAV-human synapsin I promoter (hSyn)-GCaMP3-WPRE-hGH polyA61 was replaced with R-CaMP1.07 DNA from a pN1-R-CaMP1.07 vector construct62. The packaging plasmids (pAAV-RC1 and pHelper) and transfer plasmid (pAAV-hSyn-R-CaMP1.07) were transfected into HEK293T cells using the calcium phosphate method. AAV vector particles were purified by serial ultracentrifugation with cesium chloride. The purified particles were dialyzed with phosphate-buffered saline (PBS) containing 0.001% Pluronic F-68 (Sigma-Aldrich), followed by concentration with an Amicon 10 K MWCO filter (Merck Millipore). The copy number of the viral genome (vg) (~1.3 × 10¹³ vg/mL) was determined by real-time quantitative PCR63.
Surgery for in vivo optogenetic and imaging experiments
After mice were anesthetized by intraperitoneal injection of a mixture of ketamine (74 mg/kg) and xylazine (10 mg/kg), atropine sulfate (0.5 mg/kg) was administered intraperitoneally and an eye ointment (Tarivid; Santen Pharmaceutical) was applied. After the head of the mouse was sterilized with 70% ethanol, and the skin covering the neocortex was incised, a custom head plate (Tsukasa Giken) was attached to the skull using dental cement (Fuji lute BC, GC; and Bistite II or Estecem II, Tokuyama Dental)8. The surface of the intact skull was coated with dental adhesive resin cement (Superbond; Sun Medical). After the attachment, a recovery period of at least 1 day was allowed before any other procedure. Isoflurane (1%–1.5%) inhalation was used for anesthesia throughout surgery.
For photostimulation experiments, AAV9-hSyn-hChR2(H134R)-EYFP (titer 3.6 × 10¹³ vg/mL, obtained from Addgene, #26973-AAV9) was injected into the left motor thalamus (centered at 1.2 mm posterior [AP –1.2 mm] and 0.8 mm lateral [ML 0.8 mm] to the bregma, and 3.75 mm ventral to the cortical surface [DV 3.75 mm]; 100 or 200 nL) or the left OFC (AP 2.8 mm and ML 1.8 mm, and DV 2.0 mm; 100 or 200 nL) with a Nanoject III injector (Drummond Scientific Company). When injecting into the thalamus, the injection needle was inserted 3.75 mm vertically into the cortical surface at AP –1.2 mm and ML 0.8 mm; when injecting into the lateral OFC, the injection needle was inserted 3.5 mm into the cortical surface at a 52° angle at AP 0 mm and ML 1.8 mm to prevent the viral solution from leaking into the left side of M2. For photoinhibition experiments, AAV2/9-EF1α-DIO-eNpHR3.0-EYFP64 (≥1.0 × 10¹³ vg/mL, Addgene, #26966-AAV9) and AAV9-CaMKII0.4-Cre-SV40 (≥1.0 × 10¹³ vg/mL, Addgene, #105558-AAV9) were mixed to the final concentrations of 5-fold and 2-fold dilutions, respectively. The injection volume for each site was 100 nL. The solution was injected into the bilateral LO in six mice, the left thalamus in five mice, and the left and right thalamus in one mouse. As a control experiment, AAV9-hSyn-eGFP-WPRE-bGH (≥1 × 10¹³ vg/mL, Penn vector core) was injected into the left thalamus.
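As a rough consistency check on the angled OFC approach, the needle-tip position can be worked out with basic trigonometry. The sketch below assumes that the 52° angle is measured from the vertical and that the needle is tilted anteriorly within the parasagittal plane; the text does not state the reference axis, so both are assumptions. The function name `tip_offset` is hypothetical.

```python
import math


def tip_offset(insert_mm, angle_deg):
    """Return (horizontal shift, vertical depth) of the needle tip in mm.

    The angle is taken from the vertical axis (an assumption, see lead-in).
    """
    angle = math.radians(angle_deg)
    horizontal = insert_mm * math.sin(angle)
    depth = insert_mm * math.cos(angle)
    return horizontal, depth


ap_entry = 0.0  # entry point at AP 0 mm, as stated in the text
ap_shift, depth = tip_offset(3.5, 52.0)
ap_tip = ap_entry + ap_shift  # anterior position of the needle tip

print(f"tip at AP {ap_tip:.2f} mm, depth {depth:.2f} mm")
```

Under these assumptions the tip lands at about AP 2.76 mm and about 2.15 mm below the entry point, close to the stated OFC target (AP 2.8 mm, DV 2.0 mm).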
At least 5 days before the optogenetics experiment, craniotomy was conducted over an area of AP 3.2 mm to −1.1 mm and ML 0 mm to 2.3 mm. A 2 × 4 mm cover glass (thickness 0.35–0.45 mm, Matsunami Glass) and a 3 × 5 mm thin cover glass (thickness 0.13–0.17 mm, Matsunami Glass) were laminated with UV-curing resin (NOA81, Norland Products) and placed over the craniotomized area. The space between the skull and glass was filled with Vetbond (3M) and bonded seamlessly. The exposed part of the skull was coated with Superbond (Sun Medical). For photoinhibition of LO→M2 axons in some mice, a 1.6-mm-diameter area centered at AP 2.5 mm and ML 0.8 mm was craniotomized, and a 1.5-mm-diameter cover glass (thickness 0.45–0.60 mm, Matsunami Glass) and a thin 2-mm-diameter cover glass (thickness No. 1, Matsunami Glass) were used.
For two-photon calcium imaging of VM→M2L1 and LO→M2L1 axons, AAV1-Syn-GCaMP6s-WPRE-SV4065 (Addgene viral prep #100843-AAV1, 2.2 × 10¹³ vg/mL) or AAV1-hSyn-R-CaMP1.07 was injected into the left LO and the left thalamus. The injection volume for each site and the injection sites and methods were the same as in the optogenetic experiments. For M2 imaging, an area of 3.2–0.9 mm AP and 0–2.3 mm ML was craniotomized at least 3 days before the imaging experiments. A 2 × 2 mm cover glass (Matsunami Glass, thickness No. 5) and a 3 × 3 mm thin cover glass (Matsunami Glass, 0.08–0.12 mm) were laminated and placed over the craniotomized area. The glass window was 0.2 mm thicker than those previously used26 to reduce the vertical motion artifact of the brain during imaging. In addition, the glass window was placed before performing the imaging experiment, rather than immediately after the virus injection, to maintain strong adhesion between the glass window and the skull.
For two-photon calcium imaging of M2 neurons, AAV1-hSyn-NES-jRGECO1a-WPRE-SV40 (Penn Vector Core) was injected into the left M2. The 5-fold diluted viral solution was injected at three sites: (AP, ML) = (2.5 mm, 0.5 mm), (2.5 mm, 1.0 mm), and (2.0 mm, 0.5 mm). The total volume was 210 nL (70 nL/site). The laminated 2 × 2 mm and 3 × 3 mm cover glasses were placed on the left M2.
For two-photon calcium imaging of M2 neurons that were synaptically coupled with the motor thalamus or LO (M2←mTh neurons or M2←LO neurons, respectively), we used AAV1, which shows an anterograde transsynaptic spread property31. The mixture of AAV1-hSyn-Cre (Addgene viral prep # 105553-AAV1; final concentrations of a 2-fold dilution) and AAV-carrying GFP (AAV9-hSyn-GFP [Addgene viral prep # 50465-AAV9], AAV5-CAG-GFP [Addgene viral prep # 37825-AAV5], or AAV9-CAG-GFP [Addgene viral prep # 37825-AAV9]; final concentrations of a 30-fold dilution) was injected into the left VM or LO, and AAV1-hSyn-flex-NES-jRGECO1a-WPRE-SV40 was injected into the left M2. AAV-carrying GFP was used for checking the spread of viral expression. Each injection site is described above. After the injection, the laminated 2 × 2 mm and 3 × 3 mm cover glasses were placed in position.
Virus transfections to evaluate the efficiency of AAV1 retrograde transfection
Although AAV1 has a property of anterograde transsynaptic spreading, it is also incorporated into neurons retrogradely31,34. Thus, we determined to what extent fluorescent expression of M2←mTh neurons and M2←LO neurons was caused by the retrograde labeling. In all these experiments, we used Ai14 mice60.
To determine whether M2←mTh neurons included retrogradely transfected M2 neurons, we injected a solution (60 nL) including AAV1-hSyn-Cre (6.0 × 10¹² vg/mL) and AAV5-CAG-GFP (3.33 × 10¹¹ vg/mL), or a solution (60 nL) including AAVrg-hSyn-Cre (Addgene viral prep # 105553-AAVrg; 6.0 × 10¹² vg/mL) and AAV5-CAG-GFP (3.33 × 10¹¹ vg/mL), into the motor thalamus.
To estimate to what extent M2←LO neurons included retrogradely labeled neurons, we first adjusted the virus titers of AAV1 and AAVrg for the injection into LO. As it was reported that the probability of retrograde transfection with AAV1-hSyn-Cre is approximately a quarter of that of AAVrg in a population of neurons expressing the target protein by AAV1-hSyn-Cre in the reciprocally connected projection34, the number of retrogradely labeled neurons was expected to be lower than that of the anterogradely labeled neurons. Therefore, we employed a low-titer AAV carrying FlpO that is retrogradely transfected and a high-titer AAV carrying a FlpO-dependent fluorescent reporter gene. A mixed solution (60 nL) including AAV1-hSyn-Cre-WPRE (6.0 × 10¹² vg/mL) and AAVrg-EF1a-FlpO66 (2.55 × 10⁹ vg/mL) was injected into the left dorsal striatum (AP 0.5 mm, ML 2.0 mm, and DV 2.6 mm), and a solution (200 nL) of AAV1-EF1a-fDIO-eYFP (2.1 × 10¹³ vg/mL) was injected into the left M2. In this case, the number of tdTomato-expressing L2/3 neurons in M2 was similar to that of eYFP-expressing L2/3 neurons in M2 (Supplementary Fig. 7o). As the connection between M2 and the dorsal striatum is unidirectional (only from M2 to the striatum)67, we assumed that when the same titers and volumes of these AAVs as used in the striatum injection experiment were injected into LO, the number of neurons with anterograde transsynaptic labeling of tdTomato by AAV1 could be estimated as the number of tdTomato-expressing neurons minus the number of eYFP-expressing neurons. Therefore, we injected the solution (60 nL) including AAV1-hSyn-Cre (6.0 × 10¹² vg/mL) and AAVrg-EF1a-FlpO (2.55 × 10⁹ vg/mL) into LO, and the solution (200 nL) including AAV1-EF1a-fDIO-eYFP (2.1 × 10¹³ vg/mL) into M2. In this experiment, we assumed that, regardless of the projection pathways (cortico-cortical and cortico-striatal), the transfection efficiencies of AAV1-Cre and AAVrg-FlpO were approximately constant.
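The subtraction described above amounts to simple arithmetic, sketched below. The function name `anterograde_estimate` and the cell counts in the example are hypothetical; the counts are chosen only to illustrate a case in which about 70% of labeled neurons are attributed to anterograde transsynaptic labeling.

```python
def anterograde_estimate(n_tdtomato, n_eyfp):
    """Estimate anterograde transsynaptic labeling after an LO injection.

    n_tdtomato: tdTomato-expressing M2 L2/3 neurons (anterograde + retrograde)
    n_eyfp:     eYFP-expressing M2 L2/3 neurons (taken as the retrograde count)

    Returns (anterograde count, anterograde fraction). Relies on the stated
    assumption that the two vectors label retrogradely at matched rates.
    """
    if n_tdtomato <= 0:
        raise ValueError("n_tdtomato must be positive")
    if not 0 <= n_eyfp <= n_tdtomato:
        raise ValueError("n_eyfp must lie between 0 and n_tdtomato")
    n_antero = n_tdtomato - n_eyfp
    return n_antero, n_antero / n_tdtomato


# Hypothetical counts for illustration only (not data from the study).
n_antero, frac_antero = anterograde_estimate(n_tdtomato=200, n_eyfp=60)
print(n_antero, frac_antero)
```

The estimate is only as good as the matched-efficiency assumption, which is why the striatal injection, where labeling of M2 neurons can only be retrograde, serves as the calibration.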
Behavioral training
After at least 5 days of recovery from the head plate implantation, the mice were water-deprived in their home cages. One session of the behavioral task was conducted per day. Sessions were conducted during the light period. Training was conducted 4–7 days per week. Additional amounts of water were administered daily in the cages to maintain 80%–85% body weight. The behavioral apparatus (sound attenuation chamber, head-fixing frame, body holder, sound presentation system, water-supply system, and integrated lever device) was manufactured by O’Hara & Co. Ltd. The lever position was monitored by a magnetic sensor and was continuously recorded at an acquisition rate of 1000 Hz by a NI-DAQ (USB-6001, USB-6221, USB-6229, or PCIe-6361; National Instruments). The sound presentation and water delivery were controlled using a program written in LabVIEW (National Instruments).
Pre-training
Pre-training consisted of three steps. First, in order for the mice to learn that they would receive a water reward (4 µL) if they pulled the lever after the go sound presentation (white or pink noise sound, 300 ms), they were trained to perform a sound-triggered licking task (sound-lick task), in which they obtained a water reward immediately after they licked the spout within 1 s after the go sound was presented. The go sound was presented at 4–5 s intervals. If they licked before the go sound presentation, the interval was extended by 1.5–2.5 s. The sound-lick task was performed for 1 or 2 sessions. Next, the mice were trained to perform a sound-triggered lever-pull task (sound lever-pull task), in which a reward was delivered when the lever was pulled within 1 s after the go sound was presented (response period). As in the sound-lick task, the go sound was presented at 4–5 s intervals, and if the lever was pulled before the go sound presentation, the next sound was presented 1.5–2.5 s after the lever was returned. The sound lever-pull task was performed over four sessions, and the lever-pull time required for reward was extended from 100 ms to 200 ms. Third, for the mice to learn to wait without pulling the lever during the cue presentation period, a tone cue (tone A or tone B, 10-kHz or 6-kHz sine wave sound) was presented for 0.8–1.2 s before the go sound was presented. The tone cue was randomly presented as either tone A or B at a 50% probability. The probability of reward was set to 100% for both tone A and B trials. If the lever was pulled during the cue presentation period, the intertrial interval was extended to 4–5 s. This third step was conducted over at least two sessions.
Two-tone lever-pull task with different reward probabilities (standard two-tone lever-pull task)
After finishing the three steps of pre-training, the two-tone lever-pull task with different reward probabilities was started. During the cue period, either tone A or B was presented (cue A or B trial, respectively). The probability of being presented with tone A was 50% or 30% (30% in the imaging experiment and 50% in the other experiments). The length of the cue period was randomly chosen from 0.8 to 1.2 s for each trial so that the mice would not come to expect that the timing of the go sound presentation was fixed. Trials in which the lever was pulled during the cue period were defined as “early-pull trials”, and when an early-pull trial occurred, the following intertrial interval (ITI) was extended by 4–5 s (randomly chosen for each trial) as a punishment time. The early-pull trials were excluded from the behavioral and neuronal analyses. Immediately after the end of the cue period, the go sound was presented. We defined a “pull trial” as a trial in which the mouse continuously pulled the lever with a pull distance that exceeded 20% (1 mm) of the lever-pull maximal distance (5 mm) for more than 200 ms within the response period, and the other trials were defined as “non-pull” trials. The magnitude of the lever pull distance and the weight of the lever pull were set as in previous studies2. The probability of reward delivery was set to 100% for cue A trials and 20% for cue B trials. In the non-pull trials, no reward was delivered in either cue A or B trials. After the end of the response period, the ITI period started, and after 1.5–2.5 s, the cue period for the next trial began. If the lever was pulled during the ITI period, the ITI period was extended and the cue presentation for the next trial began 1.5–2.5 s after the lever was returned. The mice were trained for 4–15 days until the session-averaged lever-pull rate of cue A trials over the previous 3 days was higher than that of cue B trials. 
If this criterion was achieved earlier than the expression period of the target protein (3–4 weeks after the AAV injection), training was continued until the end of the expression period. For each mouse, pharmacological, optogenetic, or axonal imaging experiments were started after the last training session. For behavioral experiments, only male mice were used to prevent potential behavioral variations caused by the estrous cycle.
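The pull-trial definition given above can be expressed as a short classifier over the lever trace recorded during the response period. This is a minimal sketch, assuming a 1000 Hz trace in millimetres and reading "more than 200 ms" as more than 200 consecutive suprathreshold samples; the function name `is_pull_trial` and the toy traces are hypothetical.

```python
def is_pull_trial(lever_mm, fs_hz=1000, threshold_mm=1.0, min_hold_ms=200):
    """Return True if the lever stays above threshold long enough.

    lever_mm:     lever positions (mm) sampled during the response period
    threshold_mm: 20% of the 5 mm maximal pull distance, i.e. 1 mm
    min_hold_ms:  the pull must last longer than this, without interruption
    """
    min_samples = int(min_hold_ms * fs_hz / 1000)
    run = 0  # length of the current suprathreshold run, in samples
    for pos in lever_mm:
        if pos > threshold_mm:
            run += 1
            if run > min_samples:
                return True
        else:
            run = 0  # the pull was interrupted, so start counting again
    return False


# Toy 1 s response-period traces (hypothetical values).
long_pull = [0.0] * 100 + [2.0] * 250 + [0.0] * 650
short_pull = [0.0] * 100 + [2.0] * 150 + [0.0] * 750
split_pull = [2.0] * 150 + [0.5] + [2.0] * 150 + [0.0] * 699

print(is_pull_trial(long_pull), is_pull_trial(short_pull), is_pull_trial(split_pull))
```

Requiring an uninterrupted run means that two brief pulls separated by a release do not add up to a pull trial, which matches the word "continuously" in the definition.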
In the analyses of behavioral task performance, trials performed within 30 min after the start of the session were used, and sessions in which the percentage of early-pull trials for either cue A or B trials was more than 30% were excluded. In the imaging experiment, the neuronal activity was recorded for 15 min, at least 2 min after the start of the session.
Two-tone lever-pull task with a shift in the cue B reward probability (value-shift task)
A subgroup of mice (n = 4 for mTh→M2 axonal imaging and n = 4 for LO→M2 axonal imaging) that had learned the standard two-tone lever-pull task performed the value-shift task. In this task, the reward probability in cue B trials was shifted from 20% to 100% at 5 min after the start of the imaging. The mice performed this task for one to three sessions. The next session after this task was always the standard two-tone lever-pull task to confirm that the lever-pull rate for cue B trials was lower than that for cue A trials. For the analyses, we used the sessions in which the lever-pull rate of cue B trials over 5 min after the start of the imaging was lower than that of cue A trials and the lever-pull rate of cue B trials over 15–20 min after the start of the imaging was at least 10% higher than that of the cue B trials over the first 5 min. Sessions in which the percentage of early-pull trials for either tone A or B trials was more than 30% were excluded from the analyses.
Pharmacological inactivation of M2
The effects of administration of muscimol on task performance were examined in mice that were trained to perform the standard two-tone lever-pull task. At least 1 day before administration of muscimol, a 1-mm-diameter circle of skull was removed by craniotomy centered on the M2 area of the left hemisphere (AP 2.5 mm, ML 0.8 mm). The craniotomy site was covered with a silicone seal (Kwik-Cast; World Precision Instruments) to prevent drying and infection. Under light anesthesia with 0.8% isoflurane, muscimol dissolved in artificial cerebrospinal fluid (ACSF) (5 µg/µL, 80 nL) was administered at a depth of 0.5 mm from the cortical surface using a Nanoject III. The craniotomy site was then covered again with a silicone seal. After the animals had been allowed to rest in the cage for 30 min after administration, behavioral tasks were performed for 30 min8. As a control experiment, 80 nL of ACSF was administered to M2 in the same manner as for the muscimol, and the behavioral task was performed following the same methods.
Optogenetic experiments
Mice expressing ChR2 in mTh or LO neurons with a glass window above the left motor cortex were used in photostimulation experiments of mTh→M2 and LO→M2 axons. A fiber-optic patch cable (core diameter 200 µm, aperture 0.48, Thorlabs) was fixed perpendicular to the glass window26 on the left M2 (centered at AP 2.5 mm and ML 0.8 mm) 3–4 weeks after the AAV injection. As an optical fiber with a core diameter of 200 µm and an aperture of 0.48 and a 500-µm-thick glass were used, the illumination area on the cortical surface was estimated to be a 746-µm-diameter circular area. For the light illumination on mTh→M1 and LO→M1 axons, the optical fiber was placed at the top of the left M1 (centered at AP 0 mm and ML 1.2 mm). The same mice were used daily for either the M2 or M1 photostimulation session. There was an interval of up to 1 week between the M2 and M1 photostimulation sessions, during which the two-tone lever-pull task was performed.
For photostimulation, a blue laser light (450 nm, 15–20 mW, BioRay; Coherent) was delivered at 20 Hz (20 ms exposure, 30 ms pause) in 25% of the trials (pseudorandomly chosen). The light was illuminated for a total of 1.8–2.2 s from the cue onset to the end of the response period. In all trials, a bright blue LED (20 Hz, 20-ms exposure, 30-ms pause) placed near the eyes was turned on as a masking light during the cue and response periods to prevent the animals from recognizing the presence of the laser stimulus.
In the experiment to examine whether the photostimulation of mTh→M1 axons induced the forelimb movement shown in our previous study26, the head-fixed mice were trained to perform the sound-pull task in sessions carried out over 5 days. In each session after the task training, a bright blue LED (20 Hz, 20-ms illumination, 30-ms pause, 1 s) placed near the eyes was turned on as a masking light every 10 to 30 s for 5 min. These training sessions were conducted to habituate the mice to the lever-pull movement and light illumination. Then, we compared the photostimulation-induced forelimb movement in the head-fixed mice between conditions with and without the lever. The photostimulation frequency was set to 10 Hz, as in Tanaka et al.26. The movement of the right forelimb was recorded at 70 Hz using a high-speed camera (scA800-510um; Basler) with a machine vision lens (focal length; 12 mm; Edmund Optics). The right forelimb was tracked with DeepLabCut68 using the tip of the middle finger as a representative point. To exclude trials with unstable position estimates, those with frames having a likelihood below 0.8 from 1 s before to 2 s after the onset of photostimulation were excluded from the analysis. Additionally, in the analysis of sessions where the forelimb was placed on the lever, trials in which the forelimb was detached from the lever at the onset of photostimulation were also excluded. Forelimb movement was calculated as the Euclidean distance between the frames of limb position and was smoothed with a moving average of 300 ms. For statistical analysis, the averaged movement during 1 s of photostimulation was compared with the averaged movement for 1 s before the photostimulation onset.
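The movement measure described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the DeepLabCut output has already been loaded into an array of (x, y) positions and a per-frame likelihood vector, and the function names are hypothetical.

```python
import numpy as np

FRAME_RATE_HZ = 70        # camera frame rate from the text
LIKELIHOOD_MIN = 0.8      # likelihood cutoff from the text
SMOOTH_WINDOW_S = 0.3     # moving-average width from the text


def forelimb_movement(xy, frame_rate=FRAME_RATE_HZ, window_s=SMOOTH_WINDOW_S):
    """Frame-to-frame Euclidean displacement, smoothed with a moving average."""
    step = np.linalg.norm(np.diff(xy, axis=0), axis=1)   # n_frames - 1 values
    step = np.concatenate([[0.0], step])                  # pad to n_frames
    width = max(1, int(round(window_s * frame_rate)))     # 21 frames at 70 Hz
    return np.convolve(step, np.ones(width) / width, mode="same")


def trial_is_usable(likelihood, onset, frame_rate=FRAME_RATE_HZ):
    """Keep a trial only if every frame from -1 s to +2 s is reliably tracked."""
    start, stop = onset - frame_rate, onset + 2 * frame_rate
    if start < 0 or stop > len(likelihood):
        return False
    return bool(np.all(likelihood[start:stop] >= LIKELIHOOD_MIN))


def pre_and_stim_movement(movement, onset, frame_rate=FRAME_RATE_HZ):
    """Mean movement in the 1 s before and the 1 s after stimulation onset."""
    pre = movement[onset - frame_rate:onset].mean()
    stim = movement[onset:onset + frame_rate].mean()
    return pre, stim


# Synthetic trial: the limb is still, then moves 1 pixel per frame after onset.
n_frames, onset = 4 * FRAME_RATE_HZ, 2 * FRAME_RATE_HZ
xy = np.zeros((n_frames, 2))
xy[onset:, 0] = np.arange(n_frames - onset)
movement = forelimb_movement(xy)
pre, stim = pre_and_stim_movement(movement, onset)
```

The paired `pre` and `stim` values per trial are what a paired test would then compare.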
For photoinhibition of eNpHR-expressing axons in M2 or M1, the glass window was placed over the motor cortex. At 3–4 weeks after the AAV injection, a fiber-optic patch cable was placed over the left M2 or M1 and the standard two-tone lever-pull task was performed. As in the ChR2 stimulation experiment, orange laser light (594 nm, 20–30 mW; Coherent) was continuously illuminated during the cue and response periods in 25% of the trials. The lever-pull rate in each session was calculated for each mouse. In all trials, a bright orange LED (continuous illumination) placed near to the eyes was used as a masking light. In all photostimulation and photoinhibition experiments, the first session with a lever-pull rate of 90% or less in cue B trials without the laser illumination was used for the analysis in each mouse.
Two-photon calcium imaging
Imaging was conducted with two-photon microscopy (FVMPE-RS; Olympus) with a broadly tunable ultrafast laser (InSight DS+Dual; Spectra Physics). The wavelength of the laser was set at 940 nm for GCaMP6s imaging and 1040 nm for R-CaMP1.07 imaging. A 25× objective (XLPLN25XWMP2, back aperture diameter 15.1 mm, numerical aperture 1.05; Olympus) was used. To obtain bright images, the effective aperture of the objective lens was set below 0.5 by narrowing the laser beam diameter in the back aperture to 7.2 mm, in a similar manner to that in a previous study69. Our previous study70 showed that, under a similar imaging condition, the overlap of thalamocortical axon terminal positions detected in three planes separated by 8 µm in depth remained low, at less than 4% of all axon terminals. In addition, the detection of axon terminal ROIs using CaImAn71, as described below, can separate axon terminals by image processing even if they are present in overlapping positions. Fluorescence was split into red light from R-CaMP1.07 and green light from GCaMP6s using a dichroic mirror (FV30-FDR; Olympus) and was detected using a GaAsP photomultiplier tube (Hamamatsu Photonics). The field of view (FOV) size for all axonal imaging was set to 127.2 × 127.2 µm with 512 × 512 pixels. In the left M2 and M1, the imaging depth was set to 10–80 µm from the cortical surface, which corresponded to L1. One FOV was imaged per session. When imaging multiple sessions (the maximum was five sessions, including the two types of two-tone lever-pull tasks) from the same mouse, imaging was performed in different FOVs.
In the simultaneous two-color imaging of VM→M2L1 and LO→M2L1 axons, the green image excited by the 940-nm laser and the red image excited by the 1040-nm laser were alternately acquired by switching the laser each time one image was acquired to minimize leakage between the color channels. Imaging was conducted for 15 min at a frame rate of 30 Hz (15 Hz each for the green and red channel images), and 13,500 green channel images and 13,500 red channel images were acquired. The numbers of FOVs (sessions) used for the analyses were 13 from 6 mice that expressed GCaMP6s in VM→M2L1 axons and R-CaMP1.07 in LO→M2L1 axons, and 12 from 4 mice expressing R-CaMP1.07 in VM→M2L1 axons and GCaMP6s in LO→M2L1 axons.
For imaging of VM→M2L1 axons and LO→M2L1 axons in the value-shift task, GCaMP6s-derived fluorescence images were acquired using the 940-nm laser. Imaging was conducted at a frame rate of 30 Hz for 20 min.
For imaging of the somata of L2/3 neurons in the left M2, jRGECO1a-derived fluorescence images were acquired using an 1100-nm two-photon laser. Imaging was conducted at a frame rate of 30 Hz for 15 or 20 min. For the 20-min imaging session, the data for the first 15 min were analyzed. The size of the FOV was set to 424 × 424 µm or 339.2 × 339.2 µm, with 512 × 512 pixels. The imaging depth was set to 100–200 µm from the cortical surface. One FOV was imaged per session. When imaging was conducted across multiple sessions (the maximum was four sessions) from the same mouse, different FOVs were used.
Histology
The mice were deeply anesthetized by intraperitoneal injection of a mixture of ketamine and xylazine, and were then perfused transcardially with PBS followed by a solution of 4% paraformaldehyde. Their brains were then removed and stored in the fixative overnight. Coronal or sagittal sections of 100-µm-thickness were made using a vibratome (VT1000S; Leica Microsystems). In most mice, nuclear or fluorescent Nissl staining was performed using DAPI (1:1000; Invitrogen) or NeuroTrace (1:200; Invitrogen). Fluorescence images were obtained using an epifluorescence microscope (IX83; Olympus) or a confocal microscope (A1R; Nikon). For mice without nuclear or Nissl staining, the location of the LO was estimated by comparing the shape of the brain sections with a mouse brain atlas72 and was defined as the lateral OFC and dorsolateral OFC in the atlas.
For retrograde staining of M2 L1, M2 (AP 2.5 mm, ML 0.8 mm as the center) was craniotomized and a 500-µm square filter paper soaked with FastBlue (1% w/w in water; Cat#17740; PolySciences) was placed on the cortical surface for 5 min and then removed. The dura mater of the craniotomized area was removed before the filter paper was placed to allow the FastBlue to penetrate efficiently into the cortex. The craniotomy site was then covered with silicone sealant (Kwik-Cast). For retrograde staining of M2, FastBlue (1.25% w/w in water, 100 nL) was injected into M2 (AP 2.5 mm, ML 0.8 mm, depth 0.3 mm) at a speed of 10 nL/min. After 1 week, the brain was perfused. Coronal sections of 100-µm-thickness were reacted with propidium iodide (2 µg/mL, Cat#29037; Nacalai Tesque) for 30 min.
When the AAVs were injected into the motor thalamus as the target, the target protein was expressed mainly in the VM and VL. The transfection of AAV into only VM was difficult. In addition, parts of the medial subthalamic nucleus, the posterior thalamic nucleus (Po), and the dorsal medial nucleus were transfected. However, when the retrograde tracer was injected into M2, the stained neurons were mainly detected in VM, VL, VA, and Po in the thalamus, but only in the LO in and around the OFC. Thus, we consider that the photostimulation of M2 axons mainly stimulated the motor thalamic neurons or LO neurons. When the retrograde tracer was injected into L1 of M2, the stained neurons were mainly detected in VM around the AAV-injected thalamic regions. Thus, we considered that the thalamic axons that were imaged in L1 mainly originated from the VM.
The Ai14 mice with AAV injections were euthanized and decapitated 14–15 days after the injections, and brain sections were obtained using similar procedures to those described above. The brain sections were washed in PBS-X (0.5% Triton X in PBS) containing 10% normal goat serum, and then incubated with the primary antibodies (1:500, rabbit anti-GFP; A-6455, polyclonal, lot: 2901490; Invitrogen) overnight at 4 °C. Afterwards, sections were washed in 0.3% PBS-X and then incubated for 2 h with Alexa Fluor 488-conjugated secondary antibody (1:500, goat anti-rabbit IgG; A-11034, polyclonal, lot: 1705912; Invitrogen). Then, the sections were washed twice, the cell nuclei were stained with DAPI, and the sections were mounted on glass slides. Fluorescence images were acquired with a benchtop fluorescence microscope (APX100, Evident, Japan) with 4× and 10× objectives. Machine learning-assisted cell detection was performed on the acquired images with ilastik73, and the segmented images were analyzed with Fiji software74. The sections that ranged from AP 2.8 mm to AP 2.0 mm were used to count the number of fluorescence-positive neurons.
Behavioral state model
We modeled the lever-pull probability in the tth trial, P(t), in cue B trials in the value-shift task. We defined a state value (relative action value) function in the tth trial, ΔQ(t), which was updated by the reward in the tth trial, R(t), and action in the tth trial, Pull(t), where R(t) = 1 for the rewarded trial, R(t) = 0 for the unrewarded trial, Pull(t) = 1 for the pull trial, and Pull(t) = 0 for the non-pull trial. ΔQ(t) was updated by the trial variable in the tth trial, U(t), calculated from the action and reward obtained in the tth trial as follows:
where the learning rate α was the update rate of ΔQ(t) per trial and the bias term ε was a constant. The trial variable U(t) was defined by the following equation:
where θ and ψ are constants. From Eq. (2), in the trials in which the lever was pulled and the reward was received, U(t) = 1. For trials in which the lever was pulled but no reward was received, U(t) = −θ. For trials in which the lever was not pulled, U(t) = −ψ. In other words, θ is the lever-pulling cost term that decreased ΔQ(t) when the lever was pulled but the reward was not received, and ψ is the term that decreased ΔQ(t) and increased the probability of the non-pull when waiting without pulling the lever, with the saving of the lever-pulling cost expressed as a covert reward (saving) term2. The sigmoid transformation of ΔQ(t) was used to calculate the lever-pulling action probability P(t) as follows:
where the inverse temperature β is a constant that determines how much P(t) changes with the magnitude of ΔQ(t).
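The display equations referred to as Eqs. (1)–(3) are not reproduced in this text. A reconstruction consistent with the surrounding definitions is sketched below. Eq. (2) follows directly from the three cases spelled out in the prose, and Eq. (3) is the standard sigmoid with inverse temperature β. The form of Eq. (1), a delta-rule update with an additive bias, is an assumption and may differ from the published expression.

```latex
\begin{align}
% Assumed form: delta-rule update toward U(t) with learning rate alpha and bias epsilon
\Delta Q(t+1) &= \Delta Q(t) + \alpha\,\bigl(U(t) - \Delta Q(t)\bigr) + \varepsilon \tag{1}\\[4pt]
% From the text: 1 if pulled and rewarded, -theta if pulled and unrewarded, -psi if not pulled
U(t) &=
\begin{cases}
1 & \text{if } \mathrm{Pull}(t) = 1 \text{ and } R(t) = 1,\\
-\theta & \text{if } \mathrm{Pull}(t) = 1 \text{ and } R(t) = 0,\\
-\psi & \text{if } \mathrm{Pull}(t) = 0
\end{cases} \tag{2}\\[4pt]
% Sigmoid transformation of the relative action value
P(t) &= \frac{1}{1 + \exp\bigl(-\beta\,\Delta Q(t)\bigr)} \tag{3}
\end{align}
```

Eq. (2) can equivalently be written in one line as U(t) = Pull(t)[R(t) − θ(1 − R(t))] − ψ(1 − Pull(t)), which reduces to the same three cases.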
The five variables: the learning rate α, bias term ε, inverse temperature β, cost term θ, and saving term ψ, were estimated independently for each session. We searched for the values of these variables that provided the best fit for the model-estimated P(t) to the actual lever-pulling behavior Pull(t). The goodness of fit was calculated using Akaike’s information criterion (AIC) and the Bayesian information criterion (BIC). Parameter estimation was performed by grid search; specifically, a quasi-random matrix with 5 rows × 100,000 columns was generated using the MATLAB function haltonset, and the model was fit 100,000 times using the quasi-random matrix as variables. The set of variables with the lowest AIC among them was used as the estimate for each variable. The search ranges for each variable were 0 < α < 1, 0 < θ < 1, 0 < ψ < 1, −1 < ε < 1, and 0 < β < 5. The ranges of α, θ, ψ, and ε are defined as 0 to 1 or −1 to 1, because the reward function R(t) is in the range of 0 to 1. Since expanding the search range of β did not improve the fit, the analysis was limited to the range of 0–5.
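The quasi-random parameter search can be sketched in Python as below. This is an illustration under stated assumptions rather than the authors' MATLAB code: `scipy.stats.qmc.Halton` stands in for `haltonset`, the ΔQ update rule (delta rule with an additive bias) is an assumed form, the initial value ΔQ = 0 is an assumption, and far fewer candidates than 100,000 are drawn to keep the example fast. All function names are hypothetical.

```python
import numpy as np
from scipy.special import expit
from scipy.stats import qmc

# Search ranges from the text, ordered as alpha, theta, psi, epsilon, beta.
LOWER = np.array([0.0, 0.0, 0.0, -1.0, 0.0])
UPPER = np.array([1.0, 1.0, 1.0, 1.0, 5.0])


def pull_probability(params, pull, reward, dq0=0.0):
    """Model-estimated P(t) for one session (ASSUMED update rule)."""
    alpha, theta, psi, eps, beta = params
    dq = dq0                                   # assumed initial relative value
    p = np.empty(len(pull))
    for t in range(len(pull)):
        p[t] = expit(beta * dq)                # sigmoid of the relative value
        if pull[t] == 1:
            u = 1.0 if reward[t] == 1 else -theta
        else:
            u = -psi
        dq = dq + alpha * (u - dq) + eps       # assumed form of the update
    return p


def aic(params, pull, reward):
    """Akaike's information criterion under a Bernoulli likelihood."""
    p = np.clip(pull_probability(params, pull, reward), 1e-9, 1 - 1e-9)
    log_lik = np.sum(pull * np.log(p) + (1 - pull) * np.log(1 - p))
    return 2 * len(params) - 2 * log_lik


def fit_session(pull, reward, n_samples=2000):
    """Evaluate a Halton sequence of candidates and keep the lowest AIC."""
    sampler = qmc.Halton(d=5, scramble=False)
    candidates = qmc.scale(sampler.random(n_samples), LOWER, UPPER)
    scores = np.array([aic(c, pull, reward) for c in candidates])
    best = int(np.argmin(scores))
    return candidates[best], scores[best]


# Synthetic session used only to exercise the code.
rng = np.random.default_rng(0)
pull = rng.integers(0, 2, 200)
reward = pull * (rng.random(200) < 0.5)
best_params, best_aic = fit_session(pull, reward)
```

Because every candidate is scored independently, the search parallelizes trivially, and BIC can be obtained by replacing the penalty term `2 * len(params)` with `len(params) * np.log(len(pull))`.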
As cue B trials with very high relative action values continued, the mice might no longer need to discriminate the tone type or decide whether to pull in each trial, and might also become less thirsty. Therefore, their behavioral strategy might change. To exclude this possibility, we analyzed the relationship between the relative action value and the neuronal activity in cue B trials that showed a relative action value of 0%–80% of the highest relative action value for each session. For each session, the trials with a state value of 0%–27%, 27%–54%, and 54%–80% of the highest value were classified into groups of low, middle, and high relative action values, respectively.
Image processing for two-photon calcium imaging data
MATLAB (2019a, 2020b, or 2022b; MathWorks), Python 3.8, and ImageJ (National Institutes of Health) were used for analyses. For motion correction of images, TurboReg75 and NoRMCorre76 were used. In TurboReg, each pixel of the entire FOV was moved by the same distance, whereas in NoRMCorre, a 512 × 512-pixel image was divided into 64 × 64-pixel images and the pixels were moved by different distances in each divided image. By combining the two motion correction methods, both the displacement of the entire field and local distortions were eliminated from most sessions. In addition, all motion-corrected images were checked visually, and FOVs with apparent remaining displacement were excluded from the analysis.
Using the motion-corrected images, axonal boutons were detected as regions of interest (ROIs) using CaImAn71. Furthermore, using a modified CaImAn code to extract active axonal boutons70, we defined the average luminance in non-ROI areas as the luminance of the neuropil (the value calculated as the “Background” in CaImAn), and calculated the luminance time series for each ROI by subtracting the luminance of the neuropil from the average luminance within each ROI70. Detrended relative fluorescence changes (ΔF/F) were calculated using the 15th-percentile value over an interval of ±15 s around each sample time point as the baseline. Spike timing was also inferred from the ΔF/F values by spike estimation using the “constrained nonnegative matrix factorization” algorithm77 implemented in CaImAn. For the analysis of neuronal activity, the inferred firings were smoothed with a Gaussian filter with a full width at half maximum of 40 ms. The neural data from simultaneous two-color imaging at 15 Hz were linearly interpolated to 30 Hz using the MATLAB function interp1 before smoothing was applied.
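The detrending, smoothing, and interpolation steps can be illustrated with the short Python sketch below. It is not the CaImAn pipeline itself: it assumes a neuropil-subtracted trace with a positive baseline, uses SciPy filters in place of the original implementation, and all function names are hypothetical. The Gaussian width is converted from full width at half maximum (FWHM) using σ = FWHM / (2√(2 ln 2)).

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d, percentile_filter

FRAME_RATE_HZ = 30.0


def detrended_dff(trace, frame_rate=FRAME_RATE_HZ, half_window_s=15.0, pct=15):
    """dF/F with a running 15th-percentile baseline over a +/-15 s window."""
    size = int(2 * half_window_s * frame_rate) + 1       # 901 samples at 30 Hz
    baseline = percentile_filter(trace, percentile=pct, size=size, mode="nearest")
    return (trace - baseline) / baseline                 # assumes baseline > 0


def smooth_fwhm(x, fwhm_s=0.040, frame_rate=FRAME_RATE_HZ):
    """Gaussian smoothing specified by its full width at half maximum."""
    sigma_s = fwhm_s / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return gaussian_filter1d(x, sigma=sigma_s * frame_rate, mode="nearest")


def upsample_15_to_30(x15):
    """Linear interpolation of a 15 Hz series onto a 30 Hz time base."""
    t15 = np.arange(len(x15)) / 15.0
    t30 = np.arange(2 * len(x15) - 1) / 30.0
    return np.interp(t30, t15, x15)


# Synthetic trace: flat baseline of 100 with a 1 s transient of +50.
trace = np.full(1800, 100.0)
trace[900:930] += 50.0
dff = detrended_dff(trace)

impulse = np.zeros(101)
impulse[50] = 1.0
smoothed = smooth_fwhm(impulse)
upsampled = upsample_15_to_30(np.array([0.0, 2.0, 4.0]))
```

At 30 Hz a 40-ms FWHM corresponds to a σ of roughly half a frame, so the filter mainly spreads each inferred event over its immediate neighbors.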
According to the magnitude of the activity correlation between ROIs, those ROIs with correlation above a certain threshold were taken as originating from the same axon, and the average activity between these ROIs was used for the analysis. The threshold value was set at the trough of the distribution in the histogram of the correlation distribution of all ROI pairs in each FOV (MATLAB function islocalmin; Supplementary Fig. 3f, g). The average threshold value for all sessions used in the analysis was ~0.6, regardless of the territory or calcium indicator type, and was consistent with the criterion used in previous studies26,78,79. The percentage of ROIs the axons contained after averaging is shown in Supplementary Fig. 3h.
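One way to merge ROIs that exceed the correlation threshold is sketched below. The text does not state how ROI pairs were combined into axons, so treating them as connected components of a thresholded correlation graph is an assumption, as are the function name and the fixed threshold of 0.6 used here in place of the per-FOV histogram trough.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def group_rois(activity, threshold):
    """Merge ROIs whose pairwise correlation exceeds the threshold.

    activity: array of shape (n_rois, n_frames).
    Returns one label per ROI and the mean activity of each putative axon.
    """
    corr = np.corrcoef(activity)
    adjacency = csr_matrix((corr > threshold).astype(int))
    n_axons, labels = connected_components(adjacency, directed=False)
    axon_activity = np.vstack(
        [activity[labels == k].mean(axis=0) for k in range(n_axons)]
    )
    return labels, axon_activity


# Synthetic example: ROIs 0 and 1 share a signal, ROI 2 is independent.
rng = np.random.default_rng(0)
shared = rng.standard_normal(500)
activity = np.vstack([
    shared + 0.1 * rng.standard_normal(500),
    shared + 0.1 * rng.standard_normal(500),
    rng.standard_normal(500),
])
labels, axon_activity = group_rois(activity, threshold=0.6)
```

With connected components, two ROIs can end up in the same axon through a chain of intermediate ROIs even if their own correlation is below the threshold, which is a design choice worth checking against the data.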
Tone and pull preferences
For Apull and Bpull trials, the tone preference of the neuronal activity during the cue and response periods was calculated using the average activity during the cue and response periods, respectively. The area under the receiver operating characteristics curve (AUROC) for estimation of the difference in the averaged activity between Apull and Bpull trials was used as the tone preference value. For the pull preference, we quantified the difference in the averaged activity between Bpull trials and Bnon-pull trials using the AUROC. The time course of the preference was quantified using the AUROC for the average activity at each time point. The pre-cue period (1 s before the cue presentation), cue period, and response period were each divided into ten time points. To create ten time points for the cue period of 0.8–1.2 s, the neural activity during the cue period for all trials (the processed frame rate was 30 Hz, so there were 24–36 data points) was linearly interpolated to 30 time points using the MATLAB function interp1 (hereafter this operation is referred to as “cue period length equalization”). The preference value averaged over the pre-cue period was subtracted from the value at each time point.
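The preference index and the cue period length equalization can be sketched as follows. This is a minimal NumPy illustration, not the authors' MATLAB code: the endpoint alignment used for the interpolation and the averaging of three samples per time point are assumptions, and the function names are hypothetical.

```python
import numpy as np


def equalize_cue_length(cue_trace, n_points=30):
    """Linearly resample a 24-36 sample cue-period trace to 30 points."""
    old = np.linspace(0.0, 1.0, len(cue_trace))
    new = np.linspace(0.0, 1.0, n_points)
    return np.interp(new, old, cue_trace)


def auroc(pos, neg):
    """Area under the ROC curve: P(pos > neg) + 0.5 * P(pos == neg)."""
    pos = np.asarray(pos, dtype=float)[:, None]
    neg = np.asarray(neg, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))


def preference_time_course(trials_a, trials_b, n_bins=10):
    """AUROC at each of ten time points; inputs are (n_trials, 30) arrays."""
    a = trials_a.reshape(len(trials_a), n_bins, -1).mean(axis=2)
    b = trials_b.reshape(len(trials_b), n_bins, -1).mean(axis=2)
    return np.array([auroc(a[:, k], b[:, k]) for k in range(n_bins)])


# Synthetic example: a 0.8 s cue period (24 samples) stretched to 30 points.
short_cue = np.linspace(0.0, 1.0, 24)
equalized = equalize_cue_length(short_cue)

rng = np.random.default_rng(0)
course = preference_time_course(rng.standard_normal((20, 30)) + 1.0,
                                rng.standard_normal((25, 30)))
```

An AUROC of 0.5 indicates no preference, and values toward 1 or 0 indicate higher activity in the first or second trial type, respectively.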
PAIRS analysis
PAIRS (projection angle index of response similarity) analysis80 was performed to investigate whether the neuronal activity of individual VM→M2L1 axons and LO→M2L1 axons in the behavioral task was segregated into multiple cluster groups81. First, we conducted cue period length equalization, then each task period (pre-cue, cue, and response periods) was divided into three time bins and the neuronal activity for each trial was calculated as the total of nine bins. Next, the average activity for each of the three trial types (Apull, Bpull, and Bnon-pull trials) was calculated, and the activity of each neuron was expressed as 27 time-series features (9 time bins × 3 trial types) by concatenating the activity of the three trial types. Since the proportion of the non-lever pull in cue A trials was less than 10%, we did not analyze the data in these trials. Next, the data for VM→M2L1 axons (n = 1931 axons) and LO→M2L1 axons (n = 1694 axons) in the lever-pull trials with different reward probabilities were combined to form 3625 axons, and the 27 time-series features were dimensionally compressed by principal component analysis (PCA). PCA on the 3625-axon × 27-bin matrix yielded a 26 principal component (PC) × 27-bin matrix (PC time series) and a 3625-axon × 26 PC weight matrix that showed how much of each time-series feature was present in each axon. The purpose of the PCA was to reduce the number of dimensions while efficiently extracting the neural activity patterns that each axon had in common, and its use presented the question of how many PCs should be used in the calculation. To determine the number of dimensions to be used, 3625 axons were randomly divided into two equal groups (axon populations I and II), and PCA was performed on each group to extract two PC time series with 26 PCs × 27 bins. Pairwise correlations were calculated between the respective PCs of the two PC time series computed from axon populations I and II. 
In the case of PCs that were common to many axons, the PC time series could be expected to appear in the same dimension for both axon populations I and II. The random separation into axon populations I and II was repeated 100 times, and the average value was calculated. The resulting pairwise correlations up to PC 11 were 0.8 or more, while those from PC 12 onward dropped to 0.7 or less. Based on this result, the subsequent analyses were conducted using dimensions up to PC 11, with these dimensions explaining 78.8% of the variance in the data. Thus, the PCA extracted a neuronal pattern of 3625 axons × 11 PCs for the VM→M2L1 and LO→M2L1 axons combined. Finally, VM→M2L1 and LO→M2L1 axons were separated into two matrices and examined for categorical type as matrices of 1931 axons × 11 PCs and 1694 axons × 11 PCs, respectively. As random data for comparison, matrices of the same size as those of the VM→M2L1 and LO→M2L1 axons were generated 10,000 times each from a Gaussian distribution. The average value of the angle between a data point and its neighbor data points was calculated.
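The split-half check on the number of PCs and the neighbor-angle measure can be sketched as below. Several details are assumptions rather than the published procedure: the absolute correlation is used because the sign of a PC is arbitrary, the number of neighbors `k` is a placeholder not given in the text, and the function names are hypothetical.

```python
import numpy as np


def pca_components(X, n_components):
    """Leading principal axes (rows) of a samples x features matrix."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:n_components]


def split_half_pc_correlation(X, n_components, n_repeats=20, seed=0):
    """Mean |correlation| between matching PCs of two random halves."""
    rng = np.random.default_rng(seed)
    total = np.zeros(n_components)
    half = len(X) // 2
    for _ in range(n_repeats):
        perm = rng.permutation(len(X))
        pcs_1 = pca_components(X[perm[:half]], n_components)
        pcs_2 = pca_components(X[perm[half:]], n_components)
        for k in range(n_components):
            total[k] += abs(np.corrcoef(pcs_1[k], pcs_2[k])[0, 1])
    return total / n_repeats


def mean_neighbor_angle(weights, k=3):
    """Mean angle (rad) between each point and its k most aligned neighbors."""
    unit = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(unit @ unit.T, -1.0, 1.0)
    np.fill_diagonal(cos, -np.inf)               # exclude self-comparison
    nearest = np.sort(cos, axis=1)[:, -k:]
    return np.arccos(nearest).mean(axis=1)


# Synthetic data: two shared patterns across 27 features plus noise.
rng = np.random.default_rng(1)
latent = rng.standard_normal((400, 2)) * np.array([5.0, 2.0])
X = latent @ rng.standard_normal((2, 27)) + 0.1 * rng.standard_normal((400, 27))
stability = split_half_pc_correlation(X, n_components=4)

clustered = (np.repeat(rng.standard_normal((2, 11)), 100, axis=0)
             + 0.01 * rng.standard_normal((200, 11)))
random_ref = rng.standard_normal((200, 11))
angle_clustered = np.median(mean_neighbor_angle(clustered))
angle_random = np.median(mean_neighbor_angle(random_ref))
```

Smaller neighbor angles than in the Gaussian reference indicate that points concentrate along a limited set of directions, which is the signature of categorical structure that the comparison is designed to detect.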
Classification of neuronal activity patterns
We used spectral clustering, which is an unsupervised learning method for cluster classification of multidimensional data, to perform clustering based on the distance between data points29,82,83. In spectral clustering, the distance between data points is calculated using a similarity matrix, and the clusters are classified based on the distance between them using k-means clustering. In the current analysis, a similarity matrix was created using the combined axonal activity patterns (3625 axons × 11 PCs) of VM→M2L1 and LO→M2L1 axons, using the same method as for the PAIRS analysis. Spectral clustering was performed using the scikit-learn function sklearn.cluster.SpectralClustering.
Two hyperparameters for spectral clustering needed to be determined: the number of clusters and the number of near neighbors (NN) to be included in the calculation of the similarity matrix. To determine these, 90% of all axons were randomly chosen 100 times and spectral clustering was performed on each group of axons. The similarity between the clusters of the two axon groups was calculated using the adjusted Rand index (ARI)29, which indicates the similarity between two clusterings on a scale from 0 to 1. The ARI is 1 when the cluster classifications are perfectly matched and 0 when the cluster classifications do not match for any data pair. The ARI was calculated using the scikit-learn function sklearn.metrics.adjusted_rand_score. The hyperparameters with the highest ARI (the number of clusters was 6 and NN was 140) were used. For clustering of the activity of VM→M2L1 and LO→M2L1 axons in the no-lick trials, the activity of VM→M2L1 and LO→M2L1 axons in the value-shift task, and the activity of M2 neurons, each neuronal activity that was averaged over 20 min was classified according to its highest correlation with the average activity within each cluster in the clustering of VM→M2L1 and LO→M2L1 axons that was described above.
For spectral clustering of non-specific M2 neurons, the same procedure was used with the activity patterns of 3793 neurons × 9 PCs that were determined from the PAIRS analysis and explained 80.3% of the variance, the cluster number of 4, and NN of 120.
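The hyperparameter selection can be illustrated with the scikit-learn calls named above. The sketch below makes several assumptions that the text does not confirm: the similarity matrix is built with `affinity="nearest_neighbors"`, the ARI is computed on the axons shared by two random 90% subsets, and the data are small synthetic clusters rather than the 3625 × 11 matrix. Function names are hypothetical.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import adjusted_rand_score


def cluster(weights, n_clusters, n_neighbors, seed=0):
    """Spectral clustering on a nearest-neighbor similarity graph."""
    model = SpectralClustering(
        n_clusters=n_clusters,
        affinity="nearest_neighbors",      # assumed affinity type
        n_neighbors=n_neighbors,
        assign_labels="kmeans",
        random_state=seed,
    )
    return model.fit_predict(weights)


def subsample_stability(weights, n_clusters, n_neighbors,
                        n_repeats=3, frac=0.9, seed=0):
    """Mean ARI between clusterings of two random subsets (shared points)."""
    rng = np.random.default_rng(seed)
    n, size = len(weights), int(frac * len(weights))
    scores = []
    for _ in range(n_repeats):
        idx_1 = rng.choice(n, size, replace=False)
        idx_2 = rng.choice(n, size, replace=False)
        lab_1 = dict(zip(idx_1, cluster(weights[idx_1], n_clusters, n_neighbors)))
        lab_2 = dict(zip(idx_2, cluster(weights[idx_2], n_clusters, n_neighbors)))
        shared = np.intersect1d(idx_1, idx_2)
        scores.append(adjusted_rand_score([lab_1[i] for i in shared],
                                          [lab_2[i] for i in shared]))
    return float(np.mean(scores))


# Synthetic stand-in for the axon x PC weight matrix: three separated groups.
rng = np.random.default_rng(0)
centers = 8.0 * np.eye(3, 11)
weights = np.vstack([c + rng.standard_normal((50, 11)) for c in centers])
labels = cluster(weights, n_clusters=3, n_neighbors=15)
stability = subsample_stability(weights, n_clusters=3, n_neighbors=15)
```

In a full search, `subsample_stability` would be evaluated over a grid of cluster numbers and neighbor counts and the pair with the highest mean ARI retained. scikit-learn may warn that the neighbor graph is not fully connected when groups are well separated, as in this toy example.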
Residual analysis of the regression model of neuronal activity
To construct the encoding model, we used the neuronal activity from the onset of the pre-cue period to the end of the response period in cue B trials in the value-shift task. Individual axonal activity was z-scored and down-sampled to 10 Hz. At this point, cue period length equalization was not conducted. First, the activity of each axon in the tth cue B trial was predicted by ridge regression with the task variables (Pull(t), R(t), One-back Tone(t), One-back Pull(t), One-back R(t)) and behavioral variables (lever trajectory, lever pull, lever return, and lick rate). Since the number of predictors was large (114 task variables and 32 behavioral variables), ridge regression, which includes an L2 regularization term, was used to prevent overfitting. As in a previous study84, each task variable took a value of 0 or 1 at each time point. Pull(t) was considered only for the response period and was set to 0 until the go sound onset. R(t) was set to 1 from the time after the reward was delivered to the end. The One-back Tone(t) was 1 if the one-back trial of the tth cue B trial was a cue A trial, and was 0 if the one-back trial was a cue B trial. One-back Pull(t) or One-back R(t) was 1 if the one-back trial of the tth cue B trial was a pull or rewarded trial, respectively, and was 0 if the one-back trial of the tth cue B trial was a non-pull or non-rewarded trial, respectively. The behavioral variables were lever trajectories (normalized to the range 0–1) that were down-sampled to 10 Hz, lever pulls (1 at the time of lever pull, 0 otherwise), lever returns (1 at the time of lever return, 0 otherwise), and the lick rate, with a time bin of 100 ms. The lick rate was computed as the number of licks in each 100-ms time bin. For the behavioral variables, values from 0.5 s in the past to 0.2 s in the future were used.
Ridge regression was implemented using the scikit-learn function sklearn.linear_model.RidgeCV. The response variable (axonal activity) and the predictor variables (task and behavioral variables) were time series of cue B trials concatenated in the temporal domain. The accuracy was calculated using five-fold cross-validation. For this cross-validation, the data were randomly divided into five segments from all trials in the session, regardless of the time of the trial. The accuracy was quantified according to the explanatory power of the variance, expressed as a value from 0 to 100%, which indicated how well the predicted data computed by regression explained the variance (EV) in the original data:
where y is the original data, \(\bar{y}\) is the predicted data, and Var(x) is the variance of x. The explanatory power of the variance of the ridge regression model in each axon was 14.5% ± 0.6% for VM→M2L1 axons (n = 518 axons in 6 sessions from 4 mice) and 4.4% ± 0.3% for LO→M2L1 axons (n = 629 axons in 7 sessions from 4 mice). For each axon, the regression residual was calculated by subtracting the predicted activity from the original z-scored activity. Finally, we conducted cue period length equalization for the regression residuals, and the residual value averaged over the pre-cue period was subtracted.
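The cross-validated regression and the residual calculation can be sketched with scikit-learn as below. The display equation for EV is not reproduced in this text, so the form used here, EV = (1 − Var(y − ŷ) / Var(y)) × 100, is an assumption based on the stated definitions. The regularization grid, the synthetic data, and the function names are likewise illustrative only.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold


def explained_variance_pct(y, y_pred):
    """EV (%) = (1 - Var(y - y_pred) / Var(y)) * 100 (ASSUMED form)."""
    return 100.0 * (1.0 - np.var(y - y_pred) / np.var(y))


def cross_validated_fit(X, y, trial_ids, n_splits=5, seed=0):
    """Five-fold cross-validation over trials; returns predictions and residuals."""
    trials = np.unique(trial_ids)
    y_pred = np.zeros(len(y), dtype=float)
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in folds.split(trials):
        train = np.isin(trial_ids, trials[train_idx])
        test = np.isin(trial_ids, trials[test_idx])
        # Hypothetical grid of regularization strengths.
        model = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X[train], y[train])
        y_pred[test] = model.predict(X[test])
    return y_pred, y - y_pred


# Synthetic session: 40 trials x 25 time bins, 10 predictors.
rng = np.random.default_rng(0)
trial_ids = np.repeat(np.arange(40), 25)
X = rng.standard_normal((1000, 10))
y = X @ rng.standard_normal(10) + 0.1 * rng.standard_normal(1000)
y_pred, residual = cross_validated_fit(X, y, trial_ids)
ev = explained_variance_pct(y, y_pred)
```

Splitting by trial rather than by time bin keeps all samples of a trial in the same fold. On held-out data this EV measure can fall below 0 when the prediction is worse than the mean, so how such cases map onto the stated 0 to 100% range would need to be confirmed against the original analysis.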
Statistical analysis
Statistical analyses were performed using MATLAB’s statistics and machine learning toolbox. Data are presented as the mean ± standard error of the mean (SEM), and error bars in graphs represent the SEM unless otherwise stated. The Wilcoxon rank sum test, Wilcoxon signed-rank test, and chi-squared test with post-hoc residual analysis were used for statistical comparisons. Bonferroni correction was applied for multiple comparisons. All tests were two-tailed, except for the PAIRS analysis (Supplementary Fig. 3i). Spearman’s rank correlation coefficient was used to calculate the correlation in the activity between VM→M2L1 axons and LO→M2L1 axons (Fig. 3c). Statistical methods were not employed to predetermine sample sizes. No blinding or randomization was performed in this study. Information on all statistical tests is summarized in Supplementary Data 1.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Source data are provided with this paper. Raw data from immunohistochemistry and imaging experiments are too large to deposit; however, any additional requests for information can be directed to, and will be fulfilled by, the corresponding authors.
Code availability
The code used to fit the state-space model is available on figshare (https://doi.org/10.6084/m9.figshare.25894588).
References
Kühn, S. & Brass, M. When doing nothing is an option: the neural correlates of deciding whether to act or not. Neuroimage 46, 1187–1193 (2009).
Tanimoto, S., Kondo, M., Morita, K., Yoshida, E. & Matsuzaki, M. Non-action learning: saving action-associated cost serves as a covert reward. Front. Behav. Neurosci. 14, 141 (2020).
Nishio, M., Kondo, M., Yoshida, E. & Matsuzaki, M. Medial prefrontal cortex suppresses reward-seeking behavior with risk of punishment by reducing sensitivity to reward. Front. Neurosci. 18, 1412509 (2024).
Sul, J. H., Jo, S., Lee, D. & Jung, M. W. Role of rodent secondary motor cortex in value-based action selection. Nat. Neurosci. 14, 1202–1210 (2011).
Li, N., Chen, T. W., Guo, Z. V., Gerfen, C. R. & Svoboda, K. A motor cortex circuit for motor planning and movement. Nature 519, 51–56 (2015).
Murakami, M., Vicente, M. I., Costa, G. M. & Mainen, Z. F. Neural antecedents of self-initiated actions in secondary motor cortex. Nat. Neurosci. 17, 1574–1582 (2014).
Makino, H. et al. Transformation of cortex-wide emergent properties during motor learning. Neuron 94, 880–890.e8 (2017).
Terada, S., Kobayashi, K. & Matsuzaki, M. Transition of distinct context-dependent ensembles from secondary to primary motor cortex in skilled motor performance. Cell Rep. 41, 111494 (2022).
Barthas, F. & Kwan, A. C. Secondary motor cortex: Where ‘sensory’ meets ‘motor’ in the rodent frontal cortex. Trends Neurosci. 40, 181–193 (2017).
Ebbesen, C. L. & Brecht, M. Motor cortex—to act or not to act? Nat. Rev. Neurosci. 18, 694–705 (2017).
Kondo, M. & Matsuzaki, M. Neuronal representations of reward-predicting cues and outcome history with movement in the frontal cortex. Cell Rep. 34, 108704 (2021).
Takahashi, N. et al. Thalamic input to motor cortex facilitates goal-directed action initiation. Curr. Biol. 31, 4148–4155.e4 (2021).
Esmaeili, V. et al. Rapid suppression and sustained activation of distinct cortical regions for a delayed sensory-triggered motor response. Neuron 109, 2183–2201.e9 (2021).
Yamawaki, N., Radulovic, J. & Shepherd, G. M. G. A corticocortical circuit directly links retrosplenial cortex to M2 in the mouse. J. Neurosci. 36, 9365–9374 (2016).
Reep, R. L., Goodwin, G. S. & Corwin, J. V. Topographic organization in the corticocortical connections of medial agranular cortex in rats. J. Comp. Neurol. 294, 262–280 (1990).
Hoover, W. B. & Vertes, R. P. Anatomical analysis of afferent projections to the medial prefrontal cortex in the rat. Brain Struct. Funct. 212, 149–179 (2007).
Zingg, B. et al. Neural networks of the mouse neocortex. Cell 156, 1096–1111 (2014).
Aoki, S. et al. An open cortico-basal ganglia loop allows limbic control over motor output via the nigrothalamic pathway. eLife 8, 1–29 (2019).
Haber, S. N. & Knutson, B. The reward circuit: linking primate anatomy and human imaging. Neuropsychopharmacology 35, 4–26 (2010).
Hooks, B. M. et al. Topographic precision in sensory and motor corticostriatal projections varies across cell type and cortical area. Nat. Commun. 9, 3549 (2018).
Ito, M. & Doya, K. Validation of decision-making models and analysis of decision variables in the rat basal ganglia. J. Neurosci. 29, 9861–9874 (2009).
Samejima, K., Ueda, Y., Doya, K. & Kimura, M. Representation of action-specific reward values in the striatum. Science 310, 1337–1340 (2005).
Rueda-Orozco, P. E. & Robbe, D. The striatum multiplexes contextual and kinematic information to constrain motor habits execution. Nat. Neurosci. 18, 453–462 (2015).
Lemke, S. M., Ramanathan, D. S., Guo, L., Won, S. J. & Ganguly, K. Emergent modular neural control drives coordinated motor actions. Nat. Neurosci. 22, 1122–1131 (2019).
Burguière, E., Monteiro, P., Feng, G. & Graybiel, A. M. Optogenetic stimulation of lateral orbitofronto-striatal pathway suppresses compulsive behaviors. Science 340, 1243–1246 (2013).
Tanaka, Y. H. et al. Thalamocortical axonal activity in motor cortex exhibits layer-specific dynamics during motor learning. Neuron 100, 244–258.e12 (2018).
Rubio-Garrido, P., Pérez-De-Manzo, F., Porrero, C., Galazo, M. J. & Clascá, F. Thalamic input to distal apical dendrites in neocortical layer 1 is massive and highly convergent. Cereb. Cortex 19, 2380–2395 (2009).
Kuramoto, E. et al. Ventral medial nucleus neurons send thalamocortical afferents more widely and more preferentially to layer 1 than neurons of the ventral anterior-ventral lateral nuclear complex in the rat. Cereb. Cortex 25, 221–235 (2015).
Hirokawa, J., Vaughan, A., Masset, P., Ott, T. & Kepecs, A. Frontal cortex neuron types categorically encode single decision variables. Nature 576, 446–451 (2019).
Kuramoto, E. et al. Complementary distribution of glutamatergic cerebellar and GABAergic basal ganglia afferents to the rat motor thalamic nuclei. Eur. J. Neurosci. 33, 95–109 (2011).
Zingg, B. et al. AAV-mediated anterograde transsynaptic tagging: Mapping corticocollicular input-defined neural pathways for defense behaviors. Neuron 93, 33–47 (2017).
Harris, J. A. et al. Hierarchical organization of cortical and thalamic connectivity. Nature 575, 195–202 (2019).
Tervo, D. G. R. et al. A designer AAV variant permits efficient retrograde access to projection neurons. Neuron 92, 372–382 (2016).
Zingg, B., Peng, B., Huang, J., Tao, H. W. & Zhang, L. I. Synaptic specificity and application of anterograde transsynaptic AAV for probing neural circuitry. J. Neurosci. 40, 3250–3267 (2020).
Voorn, P., Vanderschuren, L. J. M. J., Groenewegen, H. J., Robbins, T. W. & Pennartz, C. M. A. Putting a spin on the dorsal-ventral divide of the striatum. Trends Neurosci. 27, 468–474 (2004).
Shin, E. J. et al. Robust and distributed neural representation of action values. eLife 10, 1–17 (2021).
Kim, H., Lee, D. & Jung, M. W. Signals for previous goal choice persist in the dorsomedial, but not dorsolateral striatum of rats. J. Neurosci. 33, 52–63 (2013).
Balleine, B. W. & O’Doherty, J. P. Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology 35, 48–69 (2010).
de Kloet, S. F. et al. Bi-directional regulation of cognitive control by distinct prefrontal cortical output neurons to thalamus and striatum. Nat. Commun. 12, 1994 (2021).
Bari, B. A. et al. Stable representations of decision variables for flexible behavior. Neuron 103, 922–933.e7 (2019).
Balasubramani, P. P., Pesce, M. C. & Hayden, B. Y. Activity in orbitofrontal neuronal ensembles reflects inhibitory control. Eur. J. Neurosci. 51, 2033–2051 (2020).
Sul, J. H., Kim, H., Huh, N., Lee, D. & Jung, M. W. Distinct roles of rodent orbitofrontal and medial prefrontal cortex in decision making. Neuron 66, 449–460 (2010).
Gremel, C. M. & Costa, R. M. Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nat. Commun. 4, 2264 (2013).
Zhang, C. L. et al. Inhibitory control of synaptic signals preceding locomotion in mouse frontal cortex. Cell Rep. 37, 110035 (2021).
Kaufman, M. T., Churchland, M. M., Ryu, S. I. & Shenoy, K. V. Cortical activity in the null space: permitting preparation without movement. Nat. Neurosci. 17, 440–448 (2014).
Koketsu, D., Chiken, S., Hisatsune, T., Miyachi, S. & Nambu, A. Elimination of the cortico-subthalamic hyperdirect pathway induces motor hyperactivity in mice. J. Neurosci. 41, 5502–5510 (2021).
Li, B., Nguyen, T. P., Ma, C. & Dan, Y. Inhibition of impulsive action by projection-defined prefrontal pyramidal neurons. Proc. Natl. Acad. Sci. USA 117, 17278–17287 (2020).
Karube, F., Takahashi, S., Kobayashi, K. & Fujiyama, F. Motor cortex can directly drive the globus pallidus neurons in a projection neuron type-dependent manner in the rat. eLife 8, 1–25 (2019).
Tremblay, L. & Schultz, W. Reward-related neuronal activity during go-nogo task performance in primate orbitofrontal cortex. J. Neurophysiol. 83, 1864–1876 (2000).
Aron, A. R., Robbins, T. W. & Poldrack, R. A. Inhibition and the right inferior frontal cortex: one decade on. Trends Cogn. Sci. 18, 177–185 (2014).
Stalnaker, T. A., Cooch, N. K. & Schoenbaum, G. What the orbitofrontal cortex does not do. Nat. Neurosci. 18, 620–627 (2015).
Bari, A. & Robbins, T. W. Inhibition and impulsivity: behavioral and neural basis of response control. Prog. Neurobiol. 108, 44–79 (2013).
Milad, M. R. & Rauch, S. L. Obsessive-compulsive disorder: beyond segregated cortico-striatal pathways. Trends Cogn. Sci. 16, 43–51 (2012).
Maia, T. V. & Frank, M. J. An integrative perspective on the role of dopamine in schizophrenia. Biol. Psychiatry 81, 52–66 (2017).
Omlor, W. et al. Context-dependent limb movement encoding in neuronal populations of motor cortex. Nat. Commun. 10, 1–16 (2019).
Arana, F. S. et al. Dissociable contributions of the human amygdala and orbitofrontal cortex to incentive motivation and goal selection. J. Neurosci. 23, 9632–9638 (2003).
Mar, A. C., Walker, A. L. J., Theobald, D. E., Eagle, D. M. & Robbins, T. W. Dissociable effects of lesions to orbitofrontal cortex subregions on impulsive choice in the rat. J. Neurosci. 31, 6398–6404 (2011).
Barbas, H. & Pandya, D. N. Architecture and frontal cortical connections of the premotor cortex (area 6) in the rhesus monkey. J. Comp. Neurol. 256, 211–228 (1987).
Burman, K. J., Bakola, S., Richardson, K. E., Reser, D. H. & Rosa, M. G. P. Patterns of afferent input to the caudal and rostral areas of the dorsal premotor cortex (6DC and 6DR) in the marmoset monkey. J. Comp. Neurol. 522, 3683–3716 (2014).
Madisen, L. et al. A robust and high-throughput Cre reporting and characterization system for the whole mouse brain. Nat. Neurosci. 13, 133–140 (2010).
Masamizu, Y. et al. Two distinct layer-specific dynamics of cortical ensembles during learning of a motor task. Nat. Neurosci. 17, 987–994 (2014).
Ohkura, M., Sasaki, T., Kobayashi, C., Ikegaya, Y. & Nakai, J. An improved genetically encoded red fluorescent Ca2+ indicator for detecting optically evoked action potentials. PLoS ONE 7, e39933 (2012).
Kobayashi, K. et al. Survival of corticostriatal neurons by Rho/Rho-kinase signaling pathway. Neurosci. Lett. 630, 45–52 (2016).
Gradinaru, V. et al. Molecular and cellular approaches for diversifying and extending optogenetics. Cell 141, 154–165 (2010).
Chen, T. W. et al. Ultrasensitive fluorescent proteins for imaging neuronal activity. Nature 499, 295–300 (2013).
Fenno, L. E. et al. Targeting cells with single vectors using multiple-feature Boolean logic. Nat. Methods 11, 763–772 (2014).
Kitanishi, T., Tashiro, M., Kitanishi, N. & Mizuseki, K. Intersectional, anterograde transsynaptic targeting of neurons receiving monosynaptic inputs from two upstream regions. Commun. Biol. 5, 149 (2022).
Mathis, A. et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21, 1281–1289 (2018).
Kondo, M., Kobayashi, K., Ohkura, M., Nakai, J. & Matsuzaki, M. Two-photon calcium imaging of the medial prefrontal cortex and hippocampus without cortical invasion. eLife 6, e26839 (2017).
Yoshida, E. et al. In vivo wide-field calcium imaging of mouse thalamocortical synapses with an 8 K ultra-high-definition camera. Sci. Rep. 8, 8324 (2018).
Giovannucci, A. et al. CaImAn an open source tool for scalable calcium imaging data analysis. eLife 8, 1–45 (2019).
Franklin, K. B. J. & Paxinos, G. The Mouse Brain in Stereotaxic Coordinates 3rd edn (Academic Press, 2008).
Berg, S. et al. Ilastik: interactive machine learning for (bio)image analysis. Nat. Methods 16, 1226–1232 (2019).
Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012).
Thévenaz, P., Ruttimann, U. E. & Unser, M. A pyramid approach to subpixel registration based on intensity. IEEE Trans. Image Process. 7, 27–41 (1998).
Pnevmatikakis, E. A. & Giovannucci, A. NoRMCorre: an online algorithm for piecewise rigid motion correction of calcium imaging data. J. Neurosci. Methods 291, 83–94 (2017).
Pnevmatikakis, E. A. et al. Simultaneous denoising, deconvolution, and demixing of calcium imaging data. Neuron 89, 285–299 (2016).
Petreanu, L. et al. Activity in motor-sensory projections reveals distributed coding in somatosensation. Nature 489, 299–303 (2012).
Livneh, Y. et al. Estimation of current and future physiological states in insular cortex. Neuron 105, 1094–1111.e10 (2020).
Raposo, D., Kaufman, M. T. & Churchland, A. K. A category-free neural population supports evolving demands during decision-making. Nat. Neurosci. 17, 1784–1792 (2014).
Ganguli, S. & Sompolinsky, H. Compressed sensing, sparsity, and dimensionality in neuronal information processing and data analysis. Annu. Rev. Neurosci. 35, 485–508 (2012).
Shi, J. & Malik, J. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22, 888–905 (2000).
Von Luxburg, U. A tutorial on spectral clustering. Stat. Comput. 17, 395–416 (2007).
Musall, S., Kaufman, M. T., Juavinett, A. L., Gluf, S. & Churchland, A. K. Single-trial neural dynamics are dominated by richly varied movements. Nat. Neurosci. 22, 1677–1686 (2019).
Acknowledgements
We thank M. Nishiyama for animal care. This work was supported by Grants-in-Aid for Scientific Research on Innovative Areas (17H06309 to M.M.), for Transformative Research Areas (A) (22H05163 to K.N., 23H04676 to S.I., and 22H05160 to M.M.), and for Scientific Research (A) (19H01037 and 23H00388 to M.M.) from the Ministry of Education, Culture, Sports, Science, and Technology, Japan; AMED (JP22dm0207001 to M.M.; JP18dm0207027 to M.M.; and JP22dm0107150 to M.M.); Joint Research Project (20-253) of National Institute for Physiological Sciences (to M.M. and K.K.); and the Nakatani Foundation (to M.M.).
Author information
Contributions
E.Y., M.K., and M.M. designed the experiments. E.Y. and M.K. conducted the experiments and data analyses. E.Y., K.N., and S.I. constructed the models. R.A., N.H., and L.L. conducted part of the histology. S.-I.T. analyzed the photostimulation-induced forelimb movement. K.K. prepared AAV1-hSyn-R-CaMP1.07. E.Y., M.K., and M.M. wrote the paper, with comments from all other authors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Yoshida, E., Kondo, M., Nakae, K. et al. Whether or not to act is determined by distinct signals from motor thalamus and orbitofrontal cortex to secondary motor cortex. Nat Commun 16, 3106 (2025). https://doi.org/10.1038/s41467-025-58272-w
DOI: https://doi.org/10.1038/s41467-025-58272-w