Introduction

The sensory environment is a key part of our social lives. For example, in many social animals like humans and birds, dynamic vocal exchanges shape interactions and impact subsequent encounters. Social cues such as vocalizations are processed hierarchically in the brain along the ascending auditory pathway from hindbrain to pallium. In parallel, several limbic and hypothalamic regions are critically involved in the control of social behaviors; these ‘social behavior’ nuclei are densely interconnected, enriched in steroid signaling, and are often referred to as the Social Behavior Network (SBN)1,2. The SBN is thought to control social behavior in part by integrating external social signals, like vocalizations, with internal state. In order to understand social behavioral control, it is essential to understand how socially-relevant cues, once fully unpacked by sensory systems, reach the SBN to modulate its output.

It has long been known that SBN nuclei respond to all major sensory modalities tested (e.g., olfaction3,4,5; auditory6,7,8,9,10,11; somatosensory12,13,14; visual15,16,17), suggesting rich connections with sensory systems that are critical for how these nuclei participate in the control of social behavior18,19. Social stimulus selectivity in sensory processing areas20,21,22,23,24,25,26, taken together with the rich sensory representations in the SBN, appears to blur the lines between regions delineated as strictly sensory processing and those that control social behavior18. This is most evident in studies that directly link the social behavioral impacts of olfactory processing in the pallium of mammals to specific SBN circuit nodes1,27,28,29,30,31,32,33,34. For example, in mice, juvenile pheromones processed in olfactory pallium (i.e., the vomeronasal organ) decreased sexual receptivity in females; this sensory-guided social decision-making relied on SBN nuclei31. For mammals, network models and schematics of social circuits now regularly include specific olfactory pallial areas18,33,35.

When it comes to other sensory modalities prominent across vertebrate species, such as vision and audition, less is known about the circuits that transform sensory representations into social behavior. For the auditory system, brain areas relatively ‘early’ in the ascending auditory pathway clearly provide input to the SBN. Comparative work in frogs, birds, and rodents has revealed that the auditory midbrain and thalamus influence the SBN11,36,37,38,39,40,41. However, auditory signals in many species are complex and are not fully processed until reaching higher-order circuits such as the auditory pallium. In many vertebrate taxa, the auditory thalamus sends a large projection to a primary recipient area of the auditory pallium (including the cortex in mammals); this primary area then projects to a number of secondary auditory pallial areas42,43,44,45,46. In mice, inactivation or lesion of the primary auditory cortex impacts pup retrieval in mothers but not virgin females47,48. In birds, lesions of auditory pallium alter female preferences for male vocalizations, disrupt pair-bond formation, and alter entire social networks of individuals49,50,51. The auditory pallium is clearly instrumental for the complete processing of sociosensory stimuli, including dissection of auditory scenes, extracting meaning from complex auditory signals (e.g., language, song), learning new sounds, and recognizing conspecifics25,52,53,54,55,56,57. These functions are vital to social interactions, and so these higher-order sensory percepts must eventually reach circuits that control social behavior.

To uncover the influence of auditory pallium on circuits controlling social behavior, birds are an excellent study system. Zebra finches (Taeniopygia guttata) live in large social groups and can remember upwards of forty individuals by their vocalizations alone58, and the auditory pallium of this species has been well-characterized (e.g.21,43,49,52,59,60,61,62,63,64,65,66). In birds, current evidence supports the possibility that auditory pallium is functionally connected to SBN nuclei, though the available data are correlational. Auditory-evoked immediate early gene expression in secondary auditory pallium and SBN nuclei are highly correlated in female Zebra finches7, and secondary auditory pallium projects to potential intermediate nuclei, which may then project to the SBN67,68.

We asked whether auditory pallium is necessary for auditory responses observed in SBN nuclei, consistent with an auditory pallial influence on circuits that control social behavior. Alternatively, auditory pallium could process sensory cues in parallel to the SBN, and in this way, both systems could guide behaviors via efferents to effector regions. In this latter case, sensory representations in SBN nuclei would be unaltered by disruption of the auditory pallium. We focused on female Zebra finches in this study, given the well-studied social role of females in this species in evaluating conspecific vocalizations (e.g.69). We transiently inactivated the auditory pallium of female Zebra finches during male song playback and examined song-induced immediate early gene activation in SBN nuclei. We observed an effect of pallial inactivation solely in a specific nucleus of the SBN: the lateral ventromedial nucleus of the hypothalamus (VMHl). We then characterized the auditory responses of VMHl cells using extracellular electrophysiology. Our findings identify the first known functional connection between auditory pallium and a specific node of the SBN, the VMHl.

Results

Primary auditory pallial inactivation suppressed activity in a secondary auditory pallial region

To determine whether inactivation of primary auditory pallium disrupted coding in secondary auditory regions, we injected GABA receptor agonists (baclofen & muscimol) centered on primary auditory pallium (i.e., Field L) and recorded extracellularly from the caudomedial nidopallium (NCM), a secondary auditory pallial region that receives dense synaptic input from Field L. Prior to injection in Field L, multiunit traces in NCM displayed characteristic baseline and stimulus-response properties as in previous work (e.g. ref. 70, Fig. 1B). In multiunit traces, NCM is characterized by irregular spontaneous activity and heightened activity throughout the duration of natural acoustic sounds such as bird songs and calls. Five minutes after baclofen & muscimol injection into Field L, spontaneous NCM activity was entirely altered, qualitatively shifting to intermittent bursts of activity separated by periods of inactivity. With respect to stimulus-driven activity, song playback now evoked only brief onset responses to some syllables, and overall evoked firing was suppressed. We moved the probe 100 μm ventral to a second recording site to confirm this suppression was not due to attrition of the signal, and recorded at this site at three further time points, 10, 17, and 60 minutes, to assess the duration of the effect GABA receptor agonists in Field L had on NCM. We conducted this experiment on one bird, using a linear mixed model with auditory stimulus as a random effect to determine the effect of time of recording relative to GABA receptor agonist infusion in Field L on NCM firing rates (2 recording sites total; 5 trials of each of 4 auditory stimuli at each of 5 time points, i.e., 20 multiunit firing rates per time point; see Methods and Fig. 1 caption for further detail).
This model revealed a main effect of time of recording on song-evoked multiunit firing rates (F(4,92) = 14.89, p < 0.001), with responses 5 min before GABA receptor agonist infusion in Field L differing from all other time points (compared to 5 min post-infusion, t92 = 7.30, p < 0.0001; to 10 min post-infusion, t92 = 5.78, p < 0.0001; to 17 min post-infusion, t92 = 4.06, p = 0.001; to 60 min post-infusion, t92 = 4.05, p = 0.001; Fig. 1C). Song-evoked multiunit firing rates slowly increased between the 5 min and 60 min recording (5 min post-infusion to 60 min post-infusion; t92 = −3.25, p = 0.014), but the qualitative change in baseline activity and song-evoked firing rates to stimuli did not fully recover (Fig. 1C). Recordings at various sites in NCM as the electrode was retracted confirmed that this suppression was evident for at least 400 μm dorsal to the initial recording site. Although we did not directly test the difference in multiunit recovery between individual stimuli, conspecific song processing appeared most affected, and therefore we used conspecific song stimuli in our subsequent experiment (see below). Thus, inactivation of primary auditory pallium using GABA receptor agonists leads to suppression of secondary auditory processing regions for at least 60 min.

Fig. 1: Successful targeting of the primary auditory pallium, Field L, with GABA receptor agonists results in impaired auditory processing in a secondary auditory pallial region.

A A sagittal schematic of the Zebra finch brain, and the major route of auditory information flow to the pallium (shaded with lines) from subpallium (white background). Prior to, and following, an infusion of baclofen & muscimol (GABA receptor agonists), in the primary auditory pallium (Field L; shown in green), extracellular physiological recordings were made in a secondary auditory pallial region that receives dense synaptic input from Field L, the caudomedial nidopallium (NCM). Secondary auditory regions (NCM, and caudomedial mesopallium (CMM)) are shown in blue. Subpallial auditory regions (mesencephalicus lateralis, pars dorsalis (MLd), and nucleus ovoidalis (Ov)) are shown in gray. B and C show an experiment conducted in N = 1 female Zebra finch. Recordings were made at 2 sites in NCM: 1 recording site for 5 min prior to and 5 min following infusion of baclofen & muscimol into Field L; 1 recording site 100 μm ventral from the first for the 3 other time points following infusion (10, 17, and 60 min). At each time point, we conducted playbacks consisting of 5 trials of each of 4 auditory stimuli (i.e., 20 multiunit firing rates at each of 5 total time points). B Raster plots and summary histograms of the multiunit response in NCM before (top) and five min following (bottom) an infusion of baclofen & muscimol into Field L. A sonogram of the Zebra finch song played beginning at time 0 is depicted in green above each plot. C Mean multiunit evoked firing rates to various auditory stimuli recorded at various time points before and after baclofen & muscimol (GABA receptor agonists) infusion into Field L. Transparent data points represent individual trial firing rates at a given time point; color and shape of individual trial firing rates correspond to the color and shape of the overall mean at that time point. The vertical red dashed line represents the time of infusion. * represents p < 0.003 for a given time point compared to pre-infusion.

GABA receptor agonists in auditory pallium disrupted the immediate early gene response to song specifically in the lateral ventromedial nucleus of the hypothalamus

Next, we asked how disruption of auditory pallium might impact nuclei in the Social Behavior Network. To do this, we targeted bilateral cannulae centered on the dorsal primary auditory pallium, Field L. We confirmed cannula placement across treatment groups (see Materials and Methods for detail; Fig. 2A, B). We asked whether song exposure, GABA receptor agonist infusion in the auditory pallium, or the interaction of these two treatment levels, affected the number of egr-1 cells labeled in our brain regions of interest in the Social Behavior Network (Fig. 2C–E; Fig. 3A). Negative binomial models were used to account for overdispersed cell counts (see Statistics section for detail). Because sample size varied across experimental groups and brain regions due to tissue damage during brain processing, we refer the reader to the Cell counting section of the data analysis methods (the same information can also be found in the Fig. 2 caption) for exact sample sizes for each experimental group for each brain region.
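As a rough illustration of why overdispersed cell counts motivate a negative binomial rather than a Poisson model, the sketch below computes the variance-to-mean ratio of a set of counts and a method-of-moments estimate of the negative binomial size parameter. The counts are hypothetical (not data from this study), and the function name `dispersion_summary` is an illustrative assumption; the models actually fitted are those described in the Statistics section.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Return the mean, sample variance, variance-to-mean ratio, and a
    method-of-moments estimate of the negative binomial size parameter."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    ratio = v / m         # about 1 for Poisson-like counts, > 1 if overdispersed
    # For a negative binomial, Var = mu + mu**2 / theta, so
    # theta = mu**2 / (Var - mu); this is only defined when Var > mu.
    theta = m * m / (v - m) if v > m else None
    return {"mean": m, "variance": v, "ratio": ratio, "theta": theta}


# Hypothetical egr-1 cell counts for one treatment group (illustration only).
example_counts = [12, 30, 7, 55, 21]
print(dispersion_summary(example_counts))
```

A ratio well above 1, as in this toy example, indicates more spread than a Poisson model allows, which is the situation a negative binomial model is designed to handle.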

Fig. 2: Targeting of cannulas and regions of interest.

A Rostral-to-caudal map of bilateral cannula barrel tip locations at maximum depth in animals (total N = 26) receiving saline infusions (N = 13; left series) or baclofen & muscimol infusions (N = 13; right series). The playback condition is indicated by open circles (N = 12; silence condition) versus closed circles (N = 14; song playback condition). B (Left) Exemplar whole section image of cannula tracks at maximum depth. The green around cannula tip represents Alexa 488 fluorophore in infusion. Blue is used for contrast. (Right) A close-up of a cannula barrel tip imaged for DAPI (blue), Alexa 488 (green), and myelinated fibers (white; imaged using the reflective properties of myelin, see materials and methods; white arrows indicate examples of fibers). C Coronal plane depicted regions of interest quantified for egr-1 labeling. LSc.vl = lateral septum, caudoventrolateral; BSTm = bed nucleus of the stria terminalis, medial; mPOAc = medial preoptic area, caudal; VMHl and VMHm = ventromedial nucleus of the hypothalamus, lateral and medial respectively. D Bilateral view of the ventromedial nucleus of the hypothalamus (VMH) in a coronal plane. The lateral VMH is defined as the extent of the population of cells expressing the enzyme aromatase (solid outline; anti-aromatase = magenta), and the medial VMH is an ovoid nucleus ventromedial to VMHl (dashed outline). DAPI (blue) and parvalbumin (gold) are used as counterstains. E Exemplar images of egr-1 expression (green) and DAPI (blue) in the VMHl (left column) and VMHm (right column) taken from the positions of boxes in B. Each row represents an experimental treatment group, including whether a bird received a saline infusion or baclofen & muscimol infusion (Bac+Musc; GABA receptor agonists) into auditory pallium, and whether a bird was exposed to silence or song playback.

Fig. 3: Effects of sound and cannula infusion treatments on egr-1 cell counts.

A Schematic of experimental treatment groups. Female Zebra finches received either saline or baclofen & muscimol (Bac+Musc; GABA receptor agonists) in the auditory pallium. Then, birds were either exposed to silence or playback of conspecific male songs. B–F Box plots of cell counts of tissue immunolabeled for egr-1 in multiple regions of the Social Behavior Network. * = p < 0.05. Variations in tissue quality led to different sample sizes across brain regions. For medial bed nucleus of the stria terminalis (BSTm) and caudoventrolateral lateral septum (LSc.vl), N = 7 for Sound/Bac+Musc, and N = 6 for all other groups. For the caudal medial preoptic area (mPOAc), N = 7 for the Sound/Saline and Sound/Bac+Musc groups, and N = 6 for Silence/Saline and Silence/Bac+Musc groups. Tissue damage was more frequent at the ventral tip of the hypothalamus. For the medial ventromedial nucleus of the hypothalamus (VMHm), N = 5 for the Sound/Bac+Musc group and N = 4 for other groups. For lateral VMH (VMHl), N = 5 for the Silence/Saline and Sound/Bac+Musc groups, and N = 4 for Sound/Saline and Silence/Bac+Musc groups. G The green area represents primary auditory pallium, and blue areas represent secondary auditory pallium. (Left) When birds receive a saline infusion in auditory pallium, egr-1 expression in the VMHl is higher in birds exposed to playback than birds exposed to silence. (Right) When birds receive a GABA receptor agonist infusion in the auditory pallium, there is no difference in egr-1 expression in VMHl between silence and playback-exposed birds.

For the lateral ventromedial nucleus of the hypothalamus (VMHl) egr-1 counts, Cook’s Distance identified one outlier greater than 2 standard deviations from the group mean. Due to small sample sizes, we ran analyses with and without this outlier and obtained qualitatively identical results, so here we report analyses using all data, with the outlier identified on plots in Fig. 3. A negative binomial model revealed a significant main effect of cannula infusion treatment (Z = −3.07, p = 0.002), and a significant interaction between song treatment and cannula infusion treatment (Z = 2.32, p = 0.020). Consistent with previous accounts that VMHl is auditory responsive6, Tukey HSD post hoc tests revealed a significant song playback effect for birds that received a saline infusion in auditory pallium; birds that heard song had greater numbers of egr-1 cells compared to birds that did not (Z = 3.51, p = 0.003; Fig. 3D). By contrast, there was no such difference for birds that received GABA receptor agonist infusion in auditory pallium; birds that heard song had equivalent numbers of egr-1 cells compared to birds that did not. Unexpectedly, birds that received GABA receptor agonist infusions in auditory pallium, regardless of song exposure, had higher egr-1 cell counts compared to birds that received saline infusions and heard silence (Silence & Baclofen Muscimol vs. Silence & Saline, Z = 3.07, p = 0.012; Sound & Baclofen Muscimol vs. Silence & Saline, Z = 3.61, p = 0.002; Fig. 3D). No other groups were statistically different from one another (all other p > 0.9; Fig. 3D).

No other brain region measured, including the neighboring medial VMH (VMHm), the medial bed nucleus of the stria terminalis (BSTm), the caudoventrolateral lateral septum (LSc.vl), and the caudal medial preoptic area (mPOAc), showed a significant difference in egr-1 cell counts among the 4 treatment groups (all p > 0.12 for all main effects and interactions in negative binomial models in each region; Fig. 3B, C, E, F; Supplementary Fig. 1). Thus, these findings reveal a clear descending network influence of the auditory pallium on a specific nucleus in the Social Behavior Network, the lateral VMH (Fig. 3G).

Auditory pallial disruption alters foraging behavior

Next, we examined how experimental treatments affected behavior to ascertain whether disruption of auditory pallium influenced how birds attend to sounds, as well as other control behaviors. Behavior was recorded for a subset of the total experimental animals (Silence/Saline, N = 5; Song/Saline, N = 7; Silence/Baclofen&Muscimol, N = 4; Song/Baclofen&Muscimol, N = 7). We compared four behavioral measures between groups: two attentive behaviors (alert postures and head tilting), beak gaping (a behavior sometimes observed following handling stress), and foraging (i.e., feeding) behavior (Fig. 4A–D). A negative binomial regression revealed a significant main effect of cannula infusion treatment in auditory pallium on bouts of foraging behavior (Z = 2.01, p = 0.044; Fig. 4D), as well as a non-significant trend toward an interaction between song treatment and cannula infusion treatment (Z = 1.86, p = 0.063). Tukey’s HSD post hoc tests showed that birds receiving GABA receptor agonist infusions in auditory pallium engaged in significantly more foraging bouts than birds given saline infusions in auditory pallium (Z = 4.38, p < 0.0001; Fig. 4D). No other significant differences in behavior were identified across treatments.

Fig. 4: Effects of sound and cannula infusion treatments on number of discrete behaviors.

A–D We observed A alert responses, B total time beak gaping, C number of head tilts, and D bouts of foraging behavior. Bac+Musc = baclofen & muscimol (GABA receptor agonists). Final behavioral sample sizes per group: Silence/Saline, N = 5; Sound/Saline, N = 7; Silence/Bac+Musc, N = 4; Sound/Bac+Musc, N = 7. E–H For birds that heard song playback, we also calculated the proportions of behavior specifically when song playback was on compared to when song playback was off. The red dashed line at 0.5 denotes where 50% of behavior was during song playback periods. Red ‘X’s at zero on the Y-axis denote animals that did not perform the behavior during the paradigm; because these animals by definition cannot have a proportion of behavior performed during either song or silence, these animals were excluded from the analysis. I Exemplar ethograms over the 30 min period of song playback for a bird with saline infused in the auditory pallium (top) and baclofen & muscimol (Bac+Musc) infused in the auditory pallium (bottom). The inset below each ethogram is an enlargement of the ethogram between 10 and 20 minutes. Shaded columns are bounded by the duration of song playback periods to demonstrate alignment of behavior during versus outside song. J Scatterplot of foraging behavior and egr-1 immunolabeling in the lateral ventromedial nucleus of the hypothalamus (VMHl). Concatenation of samples with both behavioral recordings and egr-1 cell counts led to final sample sizes of Silence/Saline, N = 4; Sound/Saline, N = 4; Silence/Bac+Musc, N = 2; Sound/Bac+Musc, N = 5. Colors represent the experimental treatment group. The empty circle represents the identified outlier (see main text).

It was unexpected that GABA receptor agonist infusions in auditory pallium would affect foraging behavior; we reasoned that disruption of high-level auditory signal processing may have led birds to divert time allocation to other behaviors like foraging. As a follow-up analysis, we compared the proportions of behavior during periods of playback versus periods of silence (playback treatments consisted of alternating periods of 30 sec playback and 30 sec silence, see Methods). Across groups, birds performed more alert behaviors (Saline infusion and Song: t4 = 3.18, p = 0.034; GABA receptor agonist infusion and Song: t5 = 3.27, p = 0.022) and head tilts (Saline infusion and Song: t5 = 5.27, p = 0.003; GABA receptor agonist infusion and Song: t6 = 5.05, p = 0.002) during song compared to silence (Fig. 4E, G, I). In contrast, the number of foraging behaviors and beak gaping did not differ between song and silence (Fig. 4F, H; all p > 0.3; with the caveat that few birds in the song & saline group foraged at all). Therefore, neither the number of attentive behaviors nor their timing explains the increase in foraging behavior caused by the infusion of GABA receptor agonists in auditory pallium.
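The comparison of behavior during playback versus silence can be sketched as follows. This is a minimal illustration under stated assumptions: we assume each bird's proportion of behavior during playback is compared against 0.5 with a one-sample t-test (suggested by the 0.5 reference line in Fig. 4E–H and the reported t statistics, but not confirmed by this section), and the counts below are hypothetical. The function names are illustrative, not taken from the authors' code.

```python
from math import sqrt
from statistics import mean, stdev


def playback_proportions(during, outside):
    """Proportion of each bird's behavior that fell within playback periods.

    Birds that never performed the behavior are skipped, because their
    proportion is undefined (0 / 0); this mirrors the exclusion described
    in the Fig. 4 caption.
    """
    props = []
    for d, o in zip(during, outside):
        total = d + o
        if total > 0:
            props.append(d / total)
    return props


def one_sample_t(values, mu0=0.5):
    """t statistic and degrees of freedom for H0: mean(values) == mu0."""
    n = len(values)
    t = (mean(values) - mu0) / (stdev(values) / sqrt(n))
    return t, n - 1


# Hypothetical counts of one behavior per bird (illustration only).
during_song = [8, 6, 0, 9, 7, 5]
during_silence = [2, 4, 0, 3, 3, 5]

props = playback_proportions(during_song, during_silence)
t_stat, df = one_sample_t(props)
print(props, round(t_stat, 2), df)
```

Because the third hypothetical bird never performed the behavior, it drops out and the test runs on five birds with four degrees of freedom, which is how excluded animals reduce the degrees of freedom in analyses of this kind.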

The specificity with which GABA receptor agonists in auditory pallium affected both foraging behavior as well as egr-1 expression in VMHl recalls previous literature that demonstrated the involvement of VMH in feeding behavior in rodents and birds71,72,73,74,75,76. We asked whether individual variation in egr-1 induction in VMHl was related to changes in foraging behavior. Overlap of the subset of birds for which behavioral videos and VMHl cell counts were available led to the following sample sizes: Silence/Saline, N = 4; Song/Saline, N = 4; Silence/Baclofen&Muscimol, N = 2; Song/Baclofen&Muscimol, N = 5. As above (Fig. 3D), we ran analyses with and without the identified outlier in VMHl egr-1 counts due to small sample size. When excluding the outlier, a negative binomial regression with foraging behavior, song treatment and cannula infusion treatment, and the interaction of these two treatments, revealed a significant song X cannula infusion interaction (Z = 4.08, p < 0.001; as expected from findings above in Fig. 3D). Additionally, foraging behavior significantly explained VMHl egr-1 labeling (Fig. 4J; Z = 2.17, p = 0.030). However, when the same analysis was run including the outlier (open circle in Fig. 4J), no predictor variables were significant (song X cannula infusion: Z = 1.75, p = 0.080; foraging behavior: Z = 0.45, p = 0.65). Thus, foraging behavior may be specifically related to auditory input to VMHl, but this warrants further, formalized exploration.

A majority of VMHl single units respond to social and non-social auditory stimuli

Although our egr-1 experiment demonstrated a genomic reaction to song playback, it is unclear to what extent VMHl neurons actually respond to or encode vocalizations in real-time. Using tungsten electrodes, we targeted VMHl for recordings in N = 3 adult females in response to auditory playback of conspecific vocalizations and a non-social stimulus, white noise (Fig. 5A, B). We asked how many single units had a significant increase in firing rate in response to at least one of our stimuli (defined from here on as an “auditory response”). We encountered single units with strong, obvious auditory responses, and others with little to no response across all stimuli (Fig. 5C). In total, approximately 75% of single units (n = 40 out of 54) had a significant auditory response to at least one auditory stimulus (Fig. 5D). To examine the characteristics of auditory responsive units further, we examined the proportion of single units that responded to each stimulus type (Fig. 5E) as well as the number of stimuli for which a given single unit had a significant auditory response (Fig. 5F). A high proportion of single units was responsive to each stimulus, and the proportion of single units that responded significantly to all 5 stimuli was similar to the proportion responding significantly to only 1 stimulus (Fig. 5E, F).
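The tallies summarized in Fig. 5D–F can be sketched from a per-unit table of significant responses. The sketch below is illustrative only: the table is hypothetical, the function name `summarize_responses` is an assumption, and the significance test that fills the table is the one described in the Methods, not reproduced here.

```python
from collections import Counter


def summarize_responses(responses):
    """Summarize a table mapping unit id -> list of booleans, one per stimulus
    (True = significant increase in firing rate over baseline)."""
    n_units = len(responses)
    n_driven = {unit: sum(flags) for unit, flags in responses.items()}
    responsive = [unit for unit, k in n_driven.items() if k >= 1]
    n_stimuli = len(next(iter(responses.values())))
    # Fraction of auditory-responsive units driven by each stimulus (cf. Fig. 5E).
    per_stimulus = [
        sum(responses[u][s] for u in responsive) / len(responsive)
        for s in range(n_stimuli)
    ]
    # Number of responsive units driven by 1, 2, ... stimuli (cf. Fig. 5F).
    breadth = Counter(n_driven[u] for u in responsive)
    return {
        "fraction_responsive": len(responsive) / n_units,
        "per_stimulus": per_stimulus,
        "breadth": dict(breadth),
    }


# Hypothetical table: four units, five stimuli (illustration only).
toy = {
    "u1": [True, True, True, True, True],
    "u2": [False, True, False, False, False],
    "u3": [False, False, False, False, False],
    "u4": [True, True, False, False, True],
}
print(summarize_responses(toy))

# The fraction reported in the text, 40 of 54 single units:
print(round(40 / 54, 3))  # 0.741
```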

Fig. 5: Electrophysiological responses of single units in the VMHl to auditory stimuli.

A Experimental design, showing (left) a tungsten electrode targeted at VMHl on a medial sagittal schematic of the Zebra finch brain; (right) we recorded from the VMHl of awake female Zebra finches (N = 3; n = 54 single units total) during playback of vocalizations and a non-social foil stimulus, white noise. B Example of a lesion within the boundaries of VMHl. A coronal section showing the ventral hypothalamus is Nissl-stained. Ovals are drawn around the borders of VMH subregions for reference (left ventral oval = VMHm; right and dorsal oval = VMHl). A glial scar is visible within the right oval (VMHl). A dashed line is drawn on the midline dividing hemispheres. C Exemplar raster plots and histograms showing (top) a single unit that responded to a male song (green sonogram above the raster plots) across ten different presentations compared to (bottom) a single unit that did not respond to the song. D Quantification of the number of single units recorded that responded to at least one auditory stimulus presented with a significantly higher firing rate during stimulus presentation compared to its baseline firing rate (n = 40 out of n = 54 total; hereafter referred to as ‘auditory-responsive single units’). E Proportion of auditory-responsive single units that responded to each stimulus used during recordings. F Proportion of auditory-responsive single units that had a significant response to one, two, three, four, or all five auditory stimuli. G Population firing rate responses in Hertz (Hz) of auditory responsive single units during Baseline periods (lavender bars) and during subsequent Stimulus periods (magenta bars). * = p < 0.001 and $ = responses to white noise differ from other auditory stimuli specifically during the Stimulus period with p < 0.001 for all contrasts.
H Rcorr accuracy is derived from a pattern classifier that detects a pattern in the timing of action potentials to a given auditory stimulus; it receives an accuracy score for its ability to match individual spike trains to the correct stimulus (out of the five presented). Red dashed line at 0.2 = chance accuracy levels. * = p < 0.001.

We next examined the population firing rate during baseline compared to during playback periods, focusing analyses on auditory responsive single units. A linear mixed model using a nested random effect structure to account for differences across birds, and differing numbers of single units recorded in each bird, revealed significant main effects of both the period of sampling (i.e., baseline firing versus stimulus firing; χ²(1) = 483.98, p < 0.001), as well as stimulus type (χ²(4) = 22.26, p < 0.001). Additionally, there was a period by stimulus type interaction (χ²(4) = 15.86, p = 0.003). Bonferroni-corrected posthoc contrasts for this interaction confirmed significant increases in firing rates for each auditory stimulus during stimulus presentation compared to baseline (Fig. 5G; female call, t477 = 10.44, p < 0.001; male song 1, t477 = 9.66, p < 0.001; male song 2, t477 = 8.07, p < 0.001; male song 3, t477 = 8.12, p < 0.001; white noise, t477 = 12.90, p < 0.001). The interaction reflected that, though there were no differences in firing rates across stimulus types at baseline, during stimulus, white noise had higher average firing rates compared to other auditory stimuli (Fig. 5G; largely due to two single units strongly driven by white noise; white noise to female call, t477 = 3.88, p = 0.005; white noise to male song 1, t477 = 4.14, p = 0.002; white noise to male song 2, t477 = 4.09, p = 0.002; white noise to male song 3, t477 = 5.28, p < 0.001).

Finally, in VMHl spike trains, we often observed some degree of time specificity in auditory responses across trials (e.g., Fig. 5C). Are VMHl neurons acting solely as detectors of auditory signals that pass a threshold, or do timing patterns in responses to stimuli carry sufficient information to detect differences among stimulus categories? We used a custom pattern classifier to ask whether individual spike trains could be accurately sorted by the classifier using only patterns in spike timing for auditory responsive single units (see methods for details). Classifier accuracy was significantly above chance levels, evaluated using Bonferroni-corrected one-sample Wilcoxon Signed Rank tests, for all stimuli except for one conspecific song (Fig. 5H; female call, V = 766, p < 0.001; male song 1, V = 733, p < 0.001; male song 2, V = 732, p < 0.001; male song 3, V = 573, p = 0.14; white noise, V = 725, p < 0.001), and on average, 70% of single units across stimuli were above chance levels (min = 50%, max = 83.33%). Collectively these data implicate a strong majority of VMHl single units in processing social and non-social auditory stimuli, including spike pattern timing information that suggests VMHl single units carry sufficient information to distinguish individual stimuli.
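To make the above-chance test concrete, the sketch below computes the Wilcoxon signed-rank statistic V for per-unit classifier accuracies against chance (0.2, i.e., 1 of 5 stimuli) and applies a Bonferroni correction across the five stimuli. It is a minimal illustration, not the authors' analysis code: the accuracies are hypothetical, the p-value uses the large-sample normal approximation without tie or continuity corrections, and a two-sided test is assumed. For reference, with n non-zero differences the largest possible V is n(n + 1)/2, which is 820 for n = 40.

```python
from math import erf, sqrt


def signed_rank_V(values, mu0):
    """Sum of the ranks of positive differences from mu0 (the V statistic).

    Zero differences are dropped and tied absolute differences share their
    average rank. Ties are detected by exact equality.
    """
    diffs = [v - mu0 for v in values if v != mu0]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    V = sum(r for r, d in zip(ranks, diffs) if d > 0)
    return V, len(diffs)


def normal_approx_p(V, n):
    """Two-sided p-value for V from the normal approximation."""
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (V - mu) / sigma
    return 1 - erf(abs(z) / sqrt(2))


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)


# Hypothetical per-unit accuracies for one stimulus (illustration only).
accuracies = [0.45, 0.30, 0.15, 0.60, 0.25, 0.35, 0.10, 0.50]
V, n = signed_rank_V(accuracies, mu0=0.2)
print(V, n, bonferroni(normal_approx_p(V, n), n_tests=5))
```

With so few hypothetical units the normal approximation is crude; an exact test would be preferable for small samples.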

Discussion

Auditory responses in nuclei critical for social behavioral control have long been known to be sensitive to social context6,7,8,9,10. This auditory information is critical for social decision-making, and is processed, in part, by the auditory pallium25,52,53,54,55,56,57. In the present study, we reveal a functional link from the auditory pallium to a specific hypothalamic nucleus critical for social behavior, the VMHl. Accordingly, electrophysiological recordings of single units in VMHl revealed a majority of recorded units that were responsive to social, as well as non-social auditory cues. The timing of action potential responses to auditory cues additionally contained sufficient information such that a pattern classifier, which used action potential timing as an input, was able to identify stimulus type using only the neurons’ electrophysiological responses. Thus, we provide evidence that VMHl (1) receives information about social auditory cues processed by the pallium and (2) encodes individualized sensory representations of auditory stimuli, including complex conspecific vocalizations. Descending connections from the auditory pallium to the hypothalamus may therefore be critical pathways for social decision-making.

At present, the available data leave open the possibility for both direct and indirect pathways from auditory pallium to VMHl. In a recent study focused on VMHm68, retrograde tracing identified sparse inputs from NCM, a secondary auditory pallial region, but it is unclear to what extent VMHl is included in these direct descending projections. There are at least three main possibilities for indirect pathways from auditory pallium to VMHl. One is through the shell of the auditory thalamus. This shell pathway projects to VMHl68, though it is unclear whether this shell area receives descending projections from the auditory pallium (though see Mello et al.77 for a possible indirect descending route from auditory pallium to auditory thalamus shell through RA cup). In mammals, a similar shell structure around the auditory thalamus has emerged as a candidate “secondary thalamic” area78. Also in mice, a multimodal thalamic nucleus that receives descending input from the auditory cortex sends a major projection to the paraventricular nucleus of the hypothalamus41. A second possibility is via amygdalar regions such as the medial arcopallium in birds, which projects to VMH and is auditory-responsive (refs. 6,68,79; also see ref. 67 for one possible multisynaptic pathway through the ventral arcopallium and the mesolimbic reward system). Furthermore, the medial arcopallium is also thought to be analogous to the mammalian medial amygdala80,81,82; in mammals the medial amygdala is thought to route salient olfactory sensory information from the pallium to the VMH to guide decision-making83. Finally, prefrontal cortex may be important in conveying highly processed sensory information to the hypothalamus in general (e.g.84,85), though no major efferent pathways to the hypothalamus from the avian analog have yet been described. An important next step will be using intersectional tracing approaches to identify brain regions that link auditory pallium and the VMHl.

Regardless of the exact synaptic path linking auditory pallium to the VMHl, electrophysiological recordings in this study demonstrated that a majority of neurons in this region respond to social as well as non-social auditory stimuli. Although VMHl auditory activity has been demonstrated in birds6, previous work has been restricted to immediate early gene expression. One study in doves used electrophysiology to characterize VMH neurons that respond to conspecific calls, and only identified six neurons across all recording sites and animals (6.8% of all recorded single units)11. It is possible this study targeted a nearby subdivision of the VMH, which would be consistent with our recording tracks that produced misses in VMHl (neither recording tracks lateral nor medial to VMHl yielded obvious auditory responses), or that this represents a species difference. Our data provide important insight into how VMHl might encode socially-relevant auditory stimuli, given that we identified single units that responded to all auditory stimuli presented as well as equal numbers that were highly selective (i.e., responded to only one of five stimuli).

Although VMHl single units responded strongly to conspecific vocalizations, population firing rate responses were highest to white noise in the present study (whereas in auditory pallium, response to white noise can be lower than to conspecific vocalizations, e.g., Fig. 1C). This was unexpected, as previous work in songbirds showed that VMHl egr-1 expression was higher in response to songs compared to pure tone controls6. One explanation for our data is based on experimenter observations that white noise could have been aversive to birds (movement was qualitatively more common during white noise presentation), and that VMHl responds strongly to aversive sounds. This would be consistent with previous studies in finches that use white noise as a punishment during operant conditioning52. Alternatively, VMHl may strongly encode broadband noise. Zebra finches time their breeding with seasonal Australian rains86, a naturally occurring broadband auditory stimulus, and VMH is strongly implicated in female reproduction across species87,88,89. Future studies will be needed to assess the extent to which broadband noises vs. natural rainfall sounds drive VMHl activity and subsequent reproductive behaviors.

Studies in mice that have recorded from VMH4 (specifically the ventrolateral VMH) demonstrate that olfactory responses in this nucleus are dependent on learning. For example, investigation of scents from animals that socially dominated the focal animal elicits much higher activity90. Similarly, here the timing of spikes from single units in VMHl in zebra finches is somewhat differentiated by stimulus category. Previous work demonstrates that songbird auditory pallium can carry stimulus-specific information in spike timing91,92,93, and this feature may be inherited by VMHl as a mechanism for differentiating auditory stimuli.

An intriguing aspect of our findings is that GABA receptor agonist treatment in the auditory pallium, regardless of song treatment, increased egr-1 expression in VMHl. First, this emphasizes that pharmacological inhibition of auditory pallium in the absence of external auditory activation is sufficient to reveal the functional connection from auditory pallium to VMHl. One possibility is that auditory pallial inputs to VMHl (whether direct or indirect) are normally suppressive. Disrupting auditory pallium may disinhibit VMHl, leading to increases in general excitability. However, this interpretation is insufficient to explain data in the present study. For example, playback of auditory cues leads to firing rate increases in both auditory pallium and VMHl (Fig. 5), which is inconsistent with a simple, descending inhibitory projection. There are several possible models that could account for these data (e.g., auditory pallium activates interneurons in VMHl, driving sparse, rebound excitation of VMHl neurons, a circuit mechanism outlined in refs. 94,95), and so future experiments will be necessary to tease apart local circuits in VMHl as well as how incoming auditory information synapses onto various cell types in the nucleus.

Another possible explanation is that, with GABA receptor agonists in the auditory pallium, egr-1 expression in VMHl reflects neuronal activity associated with other behaviors. The VMH is implicated in the control of multiple social behaviors across vertebrate species35,87,88,96, and is also involved in aspects of homeostasis, including feeding state and fat regulation72,74,75,76,87,97, similar to other hypothalamic and preoptic regions that have dual roles in social behavior and homeostatic functions98,99. In the present study, one unexpected finding was that inactivation of auditory pallium had parallel effects on social signal processing in VMHl and frequency of foraging behavior. Furthermore, although further work would be needed to begin to test this hypothesis, if foraging behavior and immediate early gene expression in the VMHl are related, this predicts that VMHl balances feeding drive and motivation to engage in social behaviors (as suggested by the near absent foraging behaviors in birds with intact auditory pallium exposed to conspecific songs, Fig. 4J). Alternatively, inactivation of auditory pallium could increase VMHl egr-1 expression and foraging behavior by completely separate mechanisms. These findings generate new hypotheses for circuit mechanisms by which hypothalamic nuclei influence motivational state depending on social sensory input.

Our paradigm detected an effect of song on egr-1 expression specifically in the lateral VMH. This is consistent with a previous study in female White-throated sparrows that showed, without estradiol treatment, that VMHl was the only SBN nucleus that differentiated songs from tones6. Our data do not, however, suggest VMHl is the only subpallial nucleus critical for social behavior that receives input from auditory pallium. Maney et al. 6 detected several nuclei with auditory responses in the SBN framework that were not detected in the present study. This may be due partly to the choice of study species. Zebra finches are socially monogamous and opportunistic breeders that form pair-bonds, whereas White-throated sparrows are seasonally-breeding and highly territorial songbirds. Thus, in our paradigm, female finches may have evaluated stranger song for reproductive salience, but this type of stimulus in this species may not have activated cell populations in other SBN regions that may reflect social vigilance, anxiety, aggression, etc. Also, although we focused on regions defined by the SBN framework, there are other regions not included in our study that are critical for social decision-making (e.g., see frameworks that expand the focus beyond the SBN18,89, or proposals to reconsider these network frameworks entirely100). Future studies can expand upon the richness of social stimuli and naturalistic contexts and can also examine males in these contexts, as there may be sex differences in auditory pallial input to nuclei involved in social behavioral control.

Although the present study focuses on how the hypothalamus receives sensory input from the auditory pallium, the auditory pallium also receives information from the hypothalamus to modulate sensory representations. In mammals, oxytocin neurons in the paraventricular nucleus of the hypothalamus project to the primary auditory cortex to modulate auditory responses to infant vocalizations47. In birds, the medial preoptic area, through indirect projections to the pallial premotor nucleus HVC, modulates the motivation to produce courtship song101,102. These observations, in combination with the present study, predict the occurrence of bidirectional loops linking higher pallium with hypothalamic and limbic regions in the SBN that continuously modulate sensory processing in concert with the motivational state.

In developing the SBN framework, Sarah Newman originally used data from male rodents to propose that social decision-making emerged from the integration of internal and external cues1. In this context, ‘external cues’ referred to the coding of social scents, while ‘internal cues’ referred to sex steroid signaling that altered motivational states within the SBN. In a variety of systems, nuclei within this SBN framework are now known to process a wide array of sensory cues3,4,5,6,7,8,9,10,12,13,14,15,16,17,99,103,104. Thus, external sensory inputs to brain regions that control social behavior are critical components of social circuitry. Yet in many cases, such ‘sensory-social’ brain pathways have remained undefined. In this study in female songbirds, we have identified two components of such a pathway functionally connecting auditory pallium and the VMHl. These findings highlight the importance of expanding our understanding of social circuits to include pallial structures providing input to SBN nuclei from sensory modalities critical for social interactions.

Materials and Methods

Animals

We used adult female Zebra finches (Taeniopygia guttata; >100 days post hatch) housed in unisex aviaries under a photoperiod of 14-h light: 10-h dark. Birds were in acoustic and visual contact with a neighboring unisex aviary of male birds. Food and water were provided ad libitum, and birds received weekly dietary enrichment (e.g., egg food, fresh millet branches, cuttlebone). For electrophysiology, N = 1 female was used to confirm network effects of pharmacological manipulation of Field L. N = 28 females were used in cannulation surgeries and subsequent behavioral testing and immediate early gene labeling. All procedures and protocols adhered to the guidelines of the National Institutes of Health Guide for the Care and Use of Laboratory Animals, and were approved by the University of Massachusetts, Amherst Institutional Animal Care and Use Committee.

Anesthetized electrophysiology for verification of pharmacological treatment

We performed extracellular recordings in two separate experiments. In the first, N = 1 female zebra finch was recorded in vivo under urethane anesthesia to confirm that a spatially restricted manipulation in auditory pallium disrupts auditory pallium writ large. Urethane anesthesia followed methods in previous studies70: injections of 30 µL of 20% urethane were made in the pectoral muscle every 45 min, totaling 90–120 µL (the specific amount depended on the mass of the bird). We then moved the anesthetized bird to a custom stereotaxic apparatus, where we performed a craniotomy to expose the brain surface dorsal to Field L and the caudomedial nidopallium (NCM; a secondary auditory region that receives dense synaptic input from Field L), and fixed a stainless steel headpost to the head using acrylic cement. Immediately following, we moved the bird to a sound-attenuation booth (Industrial Acoustics) on an air table (TMC, Peabody, MA) where the bird was fixed to a custom stereotax (Herb Adams Engineering) at a 45° head angle using the attached headpost. Experiments were conducted in the left hemisphere.

Coordinates used were in reference to the caudal edge of the bifurcation of the sagittal sinus with the head tilted at a 45° angle (Field L = 1.8 mm rostral, 1.2 mm lateral, 1.7 mm ventral; NCM = 1.1 mm rostral, 0.7 mm lateral, 1.75 mm ventral). Prior to recordings, a glass pipette (tip diameter: 30 µm) was loaded with mineral oil and attached to a Nanoject III (Drummond Scientific Company, Broomall, PA). Using the Nanoject, a solution of GABA receptor agonists was drawn into the pipette tip, containing 1 mM baclofen, 0.1 mM muscimol, and 5% fluorescent dextran amines (3000 MW; Life Technologies Corporation, Carlsbad, CA) in 0.9% saline. The Nanoject was attached to one of two micromanipulators (World Precision Instruments w/ Kantetec, Sarasota, FL), and the glass pipette was lowered to reach Field L. Using the other micromanipulator, a tungsten electrode (0.5 MΩ; A-M Systems, Sequim, WA) was lowered into NCM.

We conducted auditory playback experiments as in previous studies70,105, using 3 conspecific vocalizations (1 contact call; directed songs from 2 different unfamiliar males) and white noise, pseudorandomly presented 5 times each, with an interstimulus interval of 10 ± 2 sec. Recordings were made using the tungsten electrode in NCM. After two playback trials, we injected 300 nL of the GABA receptor agonist solution into Field L. We then repeated auditory playback at the same NCM site 5 minutes after the injection, and at an additional NCM site 100 μm ventral at 10, 17, and 60 minutes post-injection. Following the experiment, while retracting the tungsten electrode from the brain, we performed qualitative checks of the response of other NCM sites to auditory playback to confirm the observed responses were not an artifact of the chosen sites. Following recordings, the bird was transcardially perfused and the brain was extracted for anatomical confirmation of the Field L injection site (localized using fluorescence in injection cocktail).

Awake electrophysiological recordings

We performed awake, head-fixed in vivo electrophysiological recordings in the VMHl of N = 3 female zebra finches. Birds were anesthetized with isoflurane, after which they were fixed to a custom stereotax as above and bilateral craniotomies were made over VMHl with respect to the bifurcation of the mid-sagittal sinus, with the head tilted 50° below the horizontal. Also as above, stainless steel headposts were then fixed to the skull using acrylic cement. Coordinates to VMHl were adapted from Wild et al., 68, in which the authors targeted the medial subnucleus (VMHm) for tract tracing in Zebra finches. We used a lesion protocol by passing current through the tungsten electrode to confirm the location of our recordings (see below) and were able to successfully target VMHl using the following coordinates: 1.25–1.3 mm rostral, 0.3–0.35 mm lateral, 6–6.5 mm ventral.

On a separate day (1-4 days following surgery), birds were moved to a sound attenuation booth and recordings were conducted as above using a tungsten electrode. Auditory playback during recordings consisted of unfamiliar conspecific vocalizations and white noise. In each recording, birds were exposed to one conspecific call, three male songs (from different males), and white noise. Every three trials, male songs were swapped with a separate set. In total, there were 9 different male songs used during trials. In each trial, auditory stimuli were pseudorandomly presented 10 times each, with an interstimulus interval of 10 ± 2 sec.
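The playback structure described above (five stimuli, ten pseudorandom presentations each, interstimulus interval of 10 ± 2 sec) can be illustrated with a short sketch. This is not the playback software used in the study: the function name, the stimulus labels, the full shuffle of the presentation order, and the uniform draw of each interval between 8 and 12 sec are all assumptions made for illustration.

```python
import random

def build_playback_schedule(stimuli, n_repeats=10, isi_mean_s=10.0,
                            isi_jitter_s=2.0, seed=0):
    """Return a list of (stimulus, interval_before_next_s) tuples.

    Assumptions (not stated in the text): the order is a full shuffle of all
    presentations, and each interval is drawn uniformly from mean +/- jitter.
    """
    rng = random.Random(seed)
    # One entry per presentation: each stimulus appears n_repeats times.
    order = [stim for stim in stimuli for _ in range(n_repeats)]
    rng.shuffle(order)
    low, high = isi_mean_s - isi_jitter_s, isi_mean_s + isi_jitter_s
    return [(stim, rng.uniform(low, high)) for stim in order]

# Hypothetical labels for one call, three male songs, and white noise.
stimuli = ["call", "song_A", "song_B", "song_C", "white_noise"]
schedule = build_playback_schedule(stimuli)
```

With the defaults, one trial contains 50 presentations, and a fixed seed makes the pseudorandom order reproducible.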

At the end of each recording, a lesion was made at the lowest recording site by passing current in a square pulse for 16 s through the tungsten electrode (see Fig. 5B for typical lesion histology).

Cannulation surgeries

One week prior to test day, we implanted N = 28 female zebra finches with bilateral cannulae (Plastics One; 2.4 mm center-to-center distance) targeting Field L, the primary thalamorecipient of ascending auditory signals in the avian pallium. Birds were fasted ~30 min prior to surgery, anesthetized with 2% isoflurane (VetOne) in 2 L/min O2, fixed to a custom stereotax (Herb Adams Engineering) equipped with a heating pad (DC Neurocraft) at a 45° head angle, and maintained on 1.5% isoflurane, 1 L/min O2 for the duration of the surgery. Points overlying Field L were marked by scoring the skull lateral and rostral to our coordinates (i.e., a crosshair), and a craniotomy exposed the brain surface (Field L coordinates = 1.8 mm rostral, 1.2 mm lateral of stereotaxic zero, defined as the caudal edge of the bifurcation of the midsagittal sinus). Dummy bilateral cannulae were fitted into guide cannulae and lowered 1.4 mm below the brain surface, just dorsal to the main myelinated fiber tracts terminating in Field L (as shown in Fig. 2B). The cannulae were targeted at positions where data suggest mid-to-high range frequencies are processed in Field L in order to disrupt key spectral features of zebra finch vocalizations without damaging Field L with the cannulae (see Figs. 2,3 in Kim and Doupe 2011106). Metabond (C&B) was applied to the skull around the cannula as an anchor for the subsequent application of dental cement, which secured the guide cannula to the skull.

Testing paradigm

24 hours before the experiment began, birds were habituated with a mock injection, which mimicked the handling and time course of a real injection. Birds were isolated in an acoustic chamber overnight with a playback speaker and a video camera (Sony), and all experiments were run the following day between 8:30 AM and 11 AM. Birds were given one of four treatments in a 2 × 2 factorial design: (1) Silence (i.e., heard no song) with saline infusion; (2) Silence with GABA receptor agonist infusion (baclofen (1 mM) and muscimol (0.1 mM) dissolved in saline, as described above); (3) Song (i.e., playback of conspecific zebra finch songs) with saline infusion; (4) Song with GABA receptor agonist infusion (N = 7 per treatment; N = 28 birds total). All solutions included 0.2% Alexa 488 fluorophore to aid in visualizing the site of infusion (Life Technologies, Carlsbad, CA). Birds were run in a randomized block design: treatments were randomized within cohorts of 4 birds, such that each treatment combination was represented once per cohort.
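A randomized block assignment of this kind can be sketched as follows. The sketch assumes that each consecutive group of four birds forms one cohort and that the four treatment combinations are shuffled independently within each cohort. The bird identifiers, function name, and seed are hypothetical and do not reflect the authors' actual procedure.

```python
import random

# The four cells of the 2 x 2 design (playback condition, infusion).
TREATMENTS = [
    ("Silence", "Saline"),
    ("Silence", "Baclofen+Muscimol"),
    ("Song", "Saline"),
    ("Song", "Baclofen+Muscimol"),
]

def assign_blocks(bird_ids, seed=0):
    """Assign treatments so every cohort of 4 birds gets each treatment once."""
    if len(bird_ids) % len(TREATMENTS) != 0:
        raise ValueError("number of birds must be a multiple of 4")
    rng = random.Random(seed)
    assignment = {}
    for start in range(0, len(bird_ids), len(TREATMENTS)):
        cohort = bird_ids[start:start + len(TREATMENTS)]
        shuffled = TREATMENTS[:]      # copy, then shuffle within the cohort
        rng.shuffle(shuffled)
        for bird, treatment in zip(cohort, shuffled):
            assignment[bird] = treatment
    return assignment

birds = [f"bird_{i:02d}" for i in range(1, 29)]  # hypothetical IDs, N = 28
assignment = assign_blocks(birds)
```

With 28 birds this yields seven cohorts and, by construction, seven birds per treatment.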

For infusion, the dummy cannula was removed and replaced with an infusion cannula. In all treatments, we infused 300 nL using 15 µL Hamilton syringes over the course of 1 min, driven by a Harvard Apparatus pump (Harvard Apparatus, Holliston, MA), followed by a 2 min wait to allow the solution to diffuse. The infusion cannula was then withdrawn and replaced with the dummy cannula. Tubing filled with ddH2O connected the syringes to the infusion cannula. We confirmed flow through the cannula by following the boundary (determined by Alexa 488 dye) between the treatment solution and the ddH2O filling the rest of the tubing. The infusion procedure took approximately 5 minutes from the time the bird was caught to when it was returned to the chamber. Song playback or silence commenced 5 min after the bird was returned to the chamber, and lasted 30 min. We used a triplicate song playback protocol, which is used to drive high levels of egr-1 expression in the brain107. This consists of playback of three conspecific songs from three separate individuals unknown to any female in the present study (with 5 sec interstimulus intervals) for 30 sec, followed by 30 sec of silence, repeated 30 times. All playbacks were normalized to 70 dB. Following these 30 min, we waited an additional 30 min to allow for egr-1 expression6,108. During this time, birds were video recorded for offline quantification of behavior. Birds were then rapidly captured, overdosed on isoflurane, and transcardially perfused with approximately 30 mL of chilled 0.01 M phosphate-buffered saline (PBS), followed by approximately 30 mL of chilled 4% paraformaldehyde.
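As a check on the timing arithmetic, 30 cycles of 30 sec playback plus 30 sec silence add up to the stated 30 min playback period. The sketch below reproduces that calculation and the time elapsed from return to the chamber until the end of the expression wait. The variable names are illustrative only.

```python
# Durations taken from the protocol description above.
N_CYCLES = 30          # playback/silence cycles
PLAYBACK_S = 30        # seconds of song per cycle
SILENCE_S = 30         # seconds of silence per cycle

playback_total_min = N_CYCLES * (PLAYBACK_S + SILENCE_S) / 60

# Timeline after the bird is returned to the chamber (minutes).
timeline = [
    ("wait before playback or silence", 5.0),
    ("playback or silence", playback_total_min),
    ("wait for egr-1 expression", 30.0),
]
total_min = sum(duration for _, duration in timeline)
```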

Two birds died between cannula implantation and trial day, leaving pre-tissue processing sample sizes at: Silence/Saline, N = 6; Song/Saline, N = 7; Silence/Baclofen&Muscimol, N = 6; Song/Baclofen&Muscimol, N = 7.

Tissue Processing and Histology

Brains were dissected from the skull and left overnight in 4% paraformaldehyde, then transferred to 30% sucrose in 0.01 M PBS to dehydrate until the brain sank. We then froze brains by immersing them in cryo-embedding compound (Ted Pella Inc., Redding, CA) in 2 × 2 × 2 inch plastic blocks at −80 °C. Brains were sectioned in the coronal plane in a cryostat (Leica) at 30 μm. Sections were stored at −20 °C in cryoprotectant solution (30% sucrose, 30% ethylene glycol, 1% polyvinylpyrrolidone, in 0.1 M phosphate buffer).

To verify cannula placement, every fourth section was collected onto glass slides and coverslipped with Prolong antifade mounting medium with a fluorescent stain for DAPI (Invitrogen, catalog #P36962). Cannula tracks were identified under a light microscope (Zeiss) and the proximity of the track to Field L was first confirmed through comparison to available Zebra Finch brain atlases109,110. We then used a confocal microscope (Nikon) to assess the presence of Alexa fluorophore at the cannula tips. As an additional confirmation of placement, we layered a second image on the same microscope using spectral confocal reflectance (SCoRe) microscopy to visualize the dense myelin that characterizes Field L2a in this region of the pallium109,111,112. In all but 2 cases, cannula barrel tips were either within the boundaries of Field L or just dorsal to it (i.e., within 300 microns dorsal of fibers; see Fig. 2A). In the 2 cases that cannula barrel tips were greater than 300 microns dorsal to Field L fibers, barrel tips were within the ventral caudal mesopallium, a part of the auditory pallium. For one of the two animals, that received a saline infusion, analyses were unchanged by its inclusion or removal. For the other, which received a baclofen & muscimol infusion, this animal was identified in outlier analyses in every brain region as having >2 SD egr-1 expression compared to other animals in its group, and thus analyses are presented with and without this data point independent of consideration of cannula placement (see statistics below).

We conducted fluorescent immunolabeling to detect immediate early gene expression and protein markers that aid in defining brain regions of interest. Tissue sections were washed 5 times for 5 min each in 0.01 M PBS, 3 times for 10 min each in 0.01 M PBS with 0.3% triton (PBT), blocked for 1 hr using 10% normal donkey serum, and incubated for two days at 4 °C in primary antibodies mixed in blocking serum. Tissue sections were run in two series: one series was incubated with rabbit anti-egr1 at a dilution of 1:1,000 (RRID: AB_2231020) and mouse anti-parvalbumin at a dilution of 1:10,000 (RRID: AB_2174013); the second series was incubated with rabbit anti-aromatase at a dilution of 1:2,000 (aromatase antibody provided as a generous gift from Dr. Colin Saldanha) and mouse anti-tyrosine hydroxylase at a dilution of 1:2,000 (RRID: AB_572268).

Sections were then moved to room temperature and washed 3 times for 10 min each in 0.1% PBT, followed by a 1 hr incubation in secondary antibodies at a dilution of 1:500 in 0.3% PBT. Secondary antibodies used included donkey anti-rabbit Alexa 488 (RRID: AB_2556546), and donkey anti-mouse 594 (RRID: AB_2556543). From the secondary antibody step forward, tissue was covered to prevent photobleaching of fluorescence. Sections were finally washed 3 times for 10 min each in 0.1% PBT, mounted onto subbed microscope slides (Fisher), and coverslipped using Prolong antifade mounting medium with a fluorescent stain for DAPI (Invitrogen, catalog #P36962). Slides dried overnight at room temperature and were stored at 4 °C until imaging.

To confirm location of electrophysiology probes for the awake recordings, we first collected and processed brains as above. For these brains, every other brain section was collected on glass slides for Nissl histology (Fig. 5B).

Imaging and Regions of Interest

Sections were imaged on a confocal microscope (A1SP; Nikon, Tokyo, Japan) at the UMass light microscopy core facility. Images were acquired using NIS-Elements software (RRID: SCR_002776). We determined gain and laser intensity separately for each tissue section to minimize background fluorescence.

We focused our analysis on three nodes of the Social Behavior Network that have previously exhibited song-induced egr-1 expression in female songbirds and two regions that did not, based on previous work6. Regions measured include the ventrolateral subdivision of the lateral septum (LSvl), the medial bed nucleus of the stria terminalis (BSTm), the caudal medial preoptic area (mPOAc; also abbreviated as POM in other bird literature), and the medial and lateral subdivisions of the ventromedial nucleus of the hypothalamus (VMHm and VMHl). For all regions, we took 10 μm Z stacks with 2 μm steps at either 20x or 40x (6 images per Z stack total). Brain regions were localized using landmarks established in previous literature (refs. 6,17,68; Fig. 2C,D). For LS, the area quantified was a square with 195 μm sides. For BSTm, the area quantified was a rectangle 400 μm by 300 μm. For mPOAc, the area quantified was a square with 318 μm sides. For VMH, we took Z stacks across tiled large images (3 wide by 4 tall) with 5% stitching. To define subregions of VMH, we used guidance from previous literature6,68. VMHm is a compact ovoid nucleus at the level of the anterior commissure that sits above the optic tract. The area quantified was an ellipse 500 μm on the dorsal/ventral axis and 250 μm on the medial/lateral axis (area ≈ 98,000 μm2). For VMHl, we quantified the area dorsolateral to VMHm, the extent of which was determined based on aromatase fluorescent immunolabeling on alternate sections (Fig. 2D). The area quantified was an ellipse 800 μm on the max axis and 250 μm on the minimum axis (area ≈ 157,000 μm2); using the max axis at 90 degrees as a reference point, the ellipse was tilted approximately ±15 degrees to match the slope of the aromatase positive population on alternate sections.
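The ellipse areas follow from the stated axis lengths via A = π·a·b, where a and b are the semi-axes (half of each full axis length). A minimal sketch of the arithmetic, with a hypothetical helper name, shows that the two ellipses cover roughly 98,000 and 157,000 μm2 (about 0.098 and 0.157 mm2):

```python
import math

def ellipse_area_um2(axis_1_um, axis_2_um):
    """Area of an ellipse given its two full axis lengths in micrometers."""
    return math.pi * (axis_1_um / 2) * (axis_2_um / 2)

vmhm_area = ellipse_area_um2(500, 250)   # VMHm ellipse
vmhl_area = ellipse_area_um2(800, 250)   # VMHl ellipse

# Convert to mm^2 (1 mm^2 = 1e6 um^2) for comparison across regions.
vmhm_area_mm2 = vmhm_area / 1e6
vmhl_area_mm2 = vmhl_area / 1e6
```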

Data analysis

Electrophysiological recordings were amplified, bandpass filtered (300 to 5000 Hz; A-M Systems), and digitized at 16.67 kHz (Micro 1401, Spike2 software; Cambridge Electronic Design). Data were processed in Spike2 (version 7.04). Recordings were thresholded by the experimenter using a level crossed only by high-amplitude events, excluding all small-amplitude events, as in previous work (e.g.62,113).

For recordings in NCM, peristimulus time histograms and raster plots were generated to examine multiunit responses to auditory stimuli. Mean multiunit firing rates were calculated in Hz for each auditory stimulus across trials at each time point relative to baclofen & muscimol infusion into Field L.

For recordings in VMHl, we sorted data to isolate single units as in previous studies (e.g.105). Units were isolated by their waveform characteristics, by selecting only units whose interspike intervals were all >1 ms, and were confirmed using a principal component analysis (i.e., non-overlapping clusters). Baseline noise in VMHl was consistently and relatively high across all recordings (compared to neighboring brain areas dorsal and lateral), and so sorting was performed conservatively, with typically 1–2 well-isolated single units per recording site. Total sample size from N = 3 females was n = 54 single units.
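The interspike interval criterion can be expressed as a simple check on the sorted spike times of a candidate unit. The sketch below is illustrative only (sorting in the study was performed in Spike2); the function names and the example spike times are hypothetical.

```python
def min_isi_ms(spike_times_s):
    """Smallest interspike interval in milliseconds (inf if fewer than 2 spikes)."""
    times = sorted(spike_times_s)
    if len(times) < 2:
        return float("inf")
    return min((later - earlier) * 1000.0
               for earlier, later in zip(times, times[1:]))

def passes_isi_criterion(spike_times_s, min_allowed_ms=1.0):
    """True if every interspike interval exceeds the refractory threshold."""
    return min_isi_ms(spike_times_s) > min_allowed_ms

# Made-up spike times in seconds, for illustration only.
clean_unit = [0.010, 0.015, 0.030]         # intervals of 5 ms and 15 ms
contaminated_unit = [0.010, 0.0105, 0.030] # includes a 0.5 ms interval
```

A cluster that fails this check likely contains spikes from more than one neuron, since a single neuron cannot fire twice within its refractory period.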

We calculated mean firing rates in Hz at baseline and for each auditory stimulus to assess the extent to which single units responded significantly to auditory stimuli (see statistics).
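A mean firing rate in Hz is the number of spikes in a time window divided by the window duration, averaged across trials. The sketch below assumes spike times are expressed in seconds relative to stimulus onset. The window boundaries, function names, and example trials are hypothetical and not taken from the study.

```python
def firing_rate_hz(spike_times_s, window_start_s, window_end_s):
    """Spikes per second within [window_start_s, window_end_s)."""
    duration = window_end_s - window_start_s
    if duration <= 0:
        raise ValueError("window must have positive duration")
    n_spikes = sum(1 for t in spike_times_s
                   if window_start_s <= t < window_end_s)
    return n_spikes / duration

def mean_rate_across_trials(trials, window_start_s, window_end_s):
    """Average the per-trial firing rate over all trials of one stimulus."""
    rates = [firing_rate_hz(trial, window_start_s, window_end_s)
             for trial in trials]
    return sum(rates) / len(rates)

# Made-up spike times (s, relative to stimulus onset) for three trials.
example_trials = [[0.1, 0.5, 1.9], [0.2, 2.5], []]
stimulus_rate = mean_rate_across_trials(example_trials, 0.0, 2.0)
```

The same function applied to a pre-stimulus window (for example, negative times) gives the baseline rate against which the stimulus rate is compared.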

To assess whether VMHl single units carried information that could allow differentiation amongst auditory stimuli, we used a custom pattern classifier as in previous studies91,92,105. Briefly, the classifier attempts to identify, for each single unit, a consistent pattern of time-locked spiking across trials that characterizes a response to one stimulus versus another. To do this, it pseudorandomly picks one spike train response per stimulus as a template. The percentage of trials the classifier correctly assigns to each stimulus category, averaged across 1000 renditions of the procedure above, was used as an accuracy measure to assess information consistently represented in VMHl single unit spike trains.
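The template-matching idea can be sketched in a few lines. This is not the authors' classifier: the text does not specify how a trial is compared to a template, so the sketch assumes spike trains are binned into fixed-width bins and each non-template trial is assigned to the stimulus whose template is closest in Euclidean distance. The bin width, response duration, and all function names are assumptions.

```python
import random

def bin_spikes(spike_times_s, duration_s, bin_s):
    """Convert spike times (s from stimulus onset) into per-bin spike counts."""
    n_bins = int(round(duration_s / bin_s))
    counts = [0] * n_bins
    for t in spike_times_s:
        if 0 <= t < duration_s:
            counts[min(int(t / bin_s), n_bins - 1)] += 1
    return counts

def distance(a, b):
    """Euclidean distance between two binned spike trains (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def classify_unit(responses, duration_s=1.0, bin_s=0.01, n_iter=1000, seed=0):
    """responses: dict mapping stimulus -> list of spike trains (one per trial).

    Returns dict mapping stimulus -> percent of non-template trials assigned
    to the correct stimulus, pooled over n_iter random template draws.
    """
    rng = random.Random(seed)
    binned = {s: [bin_spikes(tr, duration_s, bin_s) for tr in trials]
              for s, trials in responses.items()}
    correct = {s: 0 for s in binned}
    total = {s: 0 for s in binned}
    for _ in range(n_iter):
        # Pick one trial per stimulus to serve as that stimulus's template.
        template_idx = {s: rng.randrange(len(v)) for s, v in binned.items()}
        for s, trials in binned.items():
            for i, trial in enumerate(trials):
                if i == template_idx[s]:
                    continue  # the template itself is not classified
                # Ties go to the first stimulus in dict order (a simplification).
                best = min(binned, key=lambda c: distance(
                    trial, binned[c][template_idx[c]]))
                correct[s] += int(best == s)
                total[s] += 1
    return {s: 100.0 * correct[s] / total[s] for s in binned}
```

Under this scheme, a unit whose spike timing carries no stimulus information would be expected to score near 100 / k percent for k stimuli (20% for five stimuli), which is the chance level against which accuracy is tested.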

To analyze behavioral data, two observers unaware of experimental treatment quantified behavioral videos for the following behaviors: alert responses, which were defined by a quick straightening of posture in which the bird lengthened its neck; head tilts, which were defined as a head movement that changed the angle of the head relative to the ground; beak gaping, which was defined as beak opening not associated with foraging (both number of bouts and bout length were quantified); foraging, which was defined as pecking at either the food bowl, or directed at objects such as spilled seed on the cage floor. Video camera issues prevented quantification in 3 cases, making sample sizes for behavior as follows: Silence/Saline, N = 5; Song/Saline, N = 7; Silence/Baclofen&Muscimol, N = 4; Song/Baclofen&Muscimol, N = 7. One observer quantified all videos, and the second observer quantified 75% of videos – videos scored by both observers were used to confirm consistency of scoring. Each observer scored each video without audio. Following behavioral scoring, observers recorded timestamps for when individual playback periods began and ended; these timestamps were used later to test whether behavior in experimental treatments that heard playback differed between silent periods and playback periods. Observers quantified behavior for each animal for 35 minutes: this included the 5 min period immediately following Field L infusion, as well as the following 30 min of playback or silence. Behavior from all 35 min was used to assess group differences in the total number of a given behavior. For analyses assessing the proportion of behavior during song playback or silence, we used behavior from the 30 min period of playback to control for the amount of time animals spent listening to song playback versus sitting in silence.

For cell counting analyses, cells positive for egr-1 immunolabeling were quantified using Nikon NIS Elements at the UMass light microscopy core facility. Automatic detection of positive cells in NIS Elements was curated using the following parameters: 1) A threshold above background levels of fluorescence, 2) 3-dimensional volume of label across Z-stacks, and 3) Sphericity (to filter out elongated objects). To ascertain the quality of automated counts, a person, unaware of treatment condition and of the result of automated counts, manually performed cell counting in a subset of images in each brain region.

Tissue damage during brain processing made some regions unquantifiable, leading to the final sample sizes for each brain region. For BSTm and LSc.vl egr-1, N = 7 for Sound/Bac+Musc, and N = 6 for all other groups. For mPOAc egr-1, N = 7 for the Sound/Saline and Sound/Bac+Musc groups, and N = 6 for Silence/Saline and Silence/Bac+Musc groups. Tissue damage was more frequent at the ventral tip of the hypothalamus. For VMHm egr-1, N = 5 for the Sound/Bac+Musc group and N = 4 for other groups. For VMHl egr-1, N = 5 for the Silence/Saline and Sound/Bac+Musc groups, and N = 4 for Sound/Saline and Silence/Bac+Musc groups.

Statistics and Reproducibility

Data processing was conducted in both Python (version 3.7; using Spyder (version 4.0.1114), and the pandas115 and numpy116 libraries), and R (version 4.2.0; using R Studio117 and the tidyverse package118); all statistical analyses were conducted in R.

To test whether GABA receptor agonist infusion into Field L disrupted the response of NCM to auditory playback, we collected data from N = 1 adult female zebra finch, from which we recorded multiunit firing rate activity at one site before GABA receptor agonist infusion, and 5 minutes after, followed by a second site at three further time points (10, 17, and 60 minutes), for a total of 5 time points. Each trial consisted of multiunit recordings of each of 4 auditory stimuli, each repeated 5 times. We used a linear mixed model with stimulus firing rate (Hz) as the dependent variable, and the recording time course as an independent variable, with auditory stimulus ID as a random effect. Posthoc linear contrasts were evaluated using Tukey’s HSD posthoc tests (emmeans package119 in R).

To test whether egr-1 labeling or behavioral counts differed between our experimental treatments, we used regression models for count data. Specifically, because our count data had variances unequal to the mean in all cases (violating the Poisson assumption that the variance equals the mean), we used negative binomial models120 (glm.nb function in MASS package121 in R). These models included main effects of song stimulus (i.e., song, or no song) and of cannula infusion treatment (i.e., with GABA receptor agonists baclofen & muscimol, or with saline), and their interaction. If significant main effects or interactions were identified in omnibus ANOVA tests, we then conducted Tukey’s HSD posthoc tests (emmeans package). Sample sizes for each brain region are as follows: For BSTm and LSc.vl egr-1, N = 7 for Sound/Bac+Musc, and N = 6 for all other groups. For mPOAc egr-1, N = 7 for the Sound/Saline and Sound/Bac+Musc groups, and N = 6 for Silence/Saline and Silence/Bac+Musc groups. Tissue damage was more frequent at the ventral tip of the hypothalamus. For VMHm egr-1, N = 5 for the Sound/Bac+Musc group and N = 4 for other groups. For VMHl egr-1, N = 5 for the Silence/Saline and Sound/Bac+Musc groups, and N = 4 for Sound/Saline and Silence/Bac+Musc groups. We quantified cells in each hemisphere on adjacent tissue sections that contained each brain region (range of hemispheres quantified: 1-4), and took the mean of cell counts across hemispheres. Sample sizes for behavioral counts are as follows: Silence/Saline, N = 5; Song/Saline, N = 7; Silence/Baclofen&Muscimol, N = 4; Song/Baclofen&Muscimol, N = 7. Behavioral counts were summed across the length of the video recording, resulting in one count per behavior per video.
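The choice between a Poisson and a negative binomial model rests on comparing the variance of the counts to their mean: a Poisson model assumes the two are equal. A minimal sketch of that comparison is shown below. The counts are made-up illustrative values rather than data from this study, and the function name is hypothetical. The models themselves were fit in R, which this sketch does not attempt to reproduce.

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Sample variance divided by mean; about 1 is expected under a Poisson model."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; ratio is undefined")
    return variance(counts) / m

# Made-up counts for illustration only (not data from the study).
example_counts = [2, 10, 0, 25, 7]
ratio = dispersion_ratio(example_counts)
```

A ratio well above 1, as in this toy example, indicates overdispersion, which is the situation negative binomial models are designed to handle through an additional dispersion parameter.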

To compare the proportion of behaviors performed during song playback between birds that received GABA receptor agonists and birds that received saline in the auditory pallium, we used two-sample Welch’s t-tests. Data were log-transformed if assumptions for normality and equal variances were not met (log-transformed data met these assumptions). Sample sizes for behavioral proportion analyses are as follows: Song/Saline, N = 7; Song/Baclofen&Muscimol, N = 7. The number of behavioral counts performed during song playback was divided by the total behavioral count across the entire length of the video recording, resulting in one proportion score per behavior per video.
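The proportion score and the comparison between the two infusion groups can be sketched as follows. This is a minimal Python illustration of a Welch's t statistic for two independent samples, with Welch-Satterthwaite degrees of freedom; it is not the analysis code used in this study. The function names and the example values are hypothetical, and the p-value lookup from the t distribution is omitted.

```python
from math import sqrt
from statistics import mean, variance


def proportion_during_playback(count_during_playback, total_count):
    """Behavioral count during playback divided by the count over the whole video."""
    if total_count == 0:
        raise ValueError("proportion is undefined when the behavior never occurred")
    return count_during_playback / total_count


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical (count during playback, total count) pairs, one per video.
saline_videos = [(6, 10), (9, 12), (4, 8), (7, 10), (5, 9), (8, 11), (6, 9)]
agonist_videos = [(3, 10), (4, 12), (2, 8), (5, 11), (3, 9), (4, 10), (2, 7)]

saline = [proportion_during_playback(c, n) for c, n in saline_videos]
agonist = [proportion_during_playback(c, n) for c, n in agonist_videos]

t_stat, dof = welch_t(saline, agonist)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```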

For assessing the relationship between behavior and egr-1 labeling in VMHl, we used a negative binomial regression that included the main effects of song stimulus and of cannula infusion treatment and their interaction (as above), and foraging behavior as an additional predictor variable. Combining available data for VMHl cell counts and behavioral videos, sample sizes were: Silence/Saline, N = 4; Song/Saline, N = 4; Silence/Baclofen&Muscimol, N = 2; Song/Baclofen&Muscimol, N = 5. Replicates for cell counting and behavior were determined as described above.

To determine significant responses to auditory stimuli, we used paired Wilcoxon Signed Rank tests to compare the stimulus firing rates to baseline firing rates on a per-stimulus basis. A single unit was considered auditory responsive if it had a significant response to at least one auditory stimulus. For assessing population differences in firing rates, we used a linear mixed model with bird ID as a random effect and single unit nested within bird as a random effect. To assess whether the pattern classifier values were above chance for each auditory stimulus, we used Bonferroni-corrected one-sample Wilcoxon Signed Rank tests. The total sample was n = 54 single units from N = 3 females. Firing rates and classifier input were calculated using 10 presentations of each of 5 auditory stimuli.
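The responsiveness criterion and the Bonferroni adjustment can be illustrated with a small Python sketch. It is not the analysis code used in this study. It computes an exact two-sided Wilcoxon signed-rank p-value by enumerating sign assignments, which is feasible for 10 presentations, and it drops zero differences. The function names are hypothetical, and the significance threshold of 0.05 is an assumption that is not stated in the text.

```python
from itertools import product


def signed_rank_exact_p(differences):
    """Exact two-sided Wilcoxon signed-rank p-value (zero differences dropped)."""
    d = [x for x in differences if x != 0]
    n = len(d)
    if n == 0:
        return 1.0
    # Rank the absolute differences, averaging ranks across ties.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    observed = min(w_plus, total - w_plus)
    # Enumerate all 2**n sign assignments under the null hypothesis.
    hits = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            hits += 1
    return hits / 2 ** n


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def is_auditory_responsive(stimulus_rates, baseline_rates, alpha=0.05):
    """True if at least one stimulus differs from baseline (alpha is assumed)."""
    for stim, base in zip(stimulus_rates, baseline_rates):
        diffs = [s - b for s, b in zip(stim, base)]
        if signed_rank_exact_p(diffs) < alpha:
            return True
    return False
```

In this sketch, `is_auditory_responsive` takes one list of per-presentation firing rates for each stimulus, paired with the matching baseline rates, and `bonferroni` would be applied to the set of per-stimulus p-values from the classifier tests.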

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.