Introduction

Language-related impairment, such as difficulty in speech comprehension and semantic/grammatical coherence, is a key symptom in schizophrenia1,2, which often results in poor social functioning, including social engagement3,4,5, and consequently leads to poor quality of life6. These language-related symptoms have been suggested to be influenced by several cognitive function domains, e.g., verbal fluency, executive function, and semantic memory1. Therefore, refining the assessment of semantic memory may help provide useful information to better predict social/personal outcomes in patients with schizophrenia.

The verbal fluency tests (VFTs) are widely used as a measure of cognitive function related to verbal abilities. Among the VFTs, the letter fluency test (LFT) requires subjects to generate as many words as possible beginning with a designated letter (e.g., “F”) within a designated time (usually 1 min). On the other hand, the category fluency test (CFT) requires participants to produce words within a designated category (e.g., animals). Research has indicated that scores on the VFTs, especially those on the CFT, are significantly lower in schizophrenia patients7,8. This may be related to the idea that poor performance on the CFT represents semantic memory disorganization in patients with schizophrenia, which is more pronounced compared with the case for the LFT9,10. In light of this, the CFT has been specifically used to assess semantic memory11,12,13,14,15,16,17. Moreover, semantic memory disorganization has been identified as an intermediate cognitive phenotype among patients with schizophrenia18.

So far, the number of words produced in the CFTs, a conventional parameter of verbal fluency, has been associated with social functioning and quality of life19,20,21,22, in addition to negative symptoms23, alogia23,24,25, disorganization26, and delusion25. In contrast, higher cognitive function, e.g., semantic memory, as derived from CFT data, has rarely been explored in relation to social functioning in patients with schizophrenia, in spite of its association with psychiatric symptoms including alogia23, formal thought disorder24,27, and delusion15. It is possible, however, that semantic memory parameters, based on performance on the CFT, may provide an indicator of patient functionality, e.g., social functioning, and that these parameters are more sensitive than simple word counts.

In terms of evaluating semantic memory disorganization using CFT responses, previous studies have often focused on group-level analyses13,14,15,16,23,27,28,29,30, in spite of the importance of identifying the parameters of semantic organization at an individual level. Specifically, obtaining individual-level scores representing semantic memory organization may facilitate the exploration of its relationship with social functioning. The absence of such an endeavor may be related to the fact that individual-level analyses have traditionally been labor-intensive, requiring detailed manual scoring that is time-consuming and prone to human error31, due partly to the lack of automated scoring methods, including text mining technology. Moreover, the lack of such methods has hampered objective and reliable individual-level analyses. To address this deficit, in this study we used text mining, also referred to as text data mining, which is a process of extracting meaningful information from natural language text by identifying patterns, relationships, and contextual connotations32. More specifically, we applied advanced text-mining approaches to CFT responses, thereby generating individual-level semantic memory scores.

Research on semantic memory measurement using the CFT has been advanced by several investigators. The clustering and switching method of Troyer et al.11 has been refined by Kim et al.31, who developed an automated scoring method to assess switching between sub-categories. The automated scoring method uses distributional representations to calculate the similarity between two consecutive words based on the word2vec model33, with a switch counted when the similarity between two consecutive words falls below a predefined threshold31. Further, Ovando-Tellez et al.34 developed a more efficient and more readily interpretable method to assess semantic memory. Specifically, information on various aspects of switching and clustering (e.g., the variety of subclusters referenced by responses and the number of switches between subclusters) was collected, and a principal component analysis (PCA) was conducted. In this approach, the dimensionality of the data was reduced by combining multiple metrics into fewer composite variables, thereby enhancing the validity of the analysis. As a result, two component scores were derived: the switching component and the clustering component. In this study, we used the automated scoring and PCA methods to enhance the efficiency and validity of semantic memory analysis.

The present study has two main objectives. First, to circumvent the above problems, we sought to determine whether advanced analysis, based on text mining approaches, would provide an effective measure of semantic memory organization in individual patients. Second, we explored the influence of semantic memory deficits on social functioning. It is speculated that the organization of semantic memory is associated with social functioning in patients with schizophrenia.

Methods

Participants

This study recruited two groups of participants: 239 patients with schizophrenia and 98 healthy controls (HC). Of the initial 239 patients, 53 were excluded because they did not have a confirmed diagnosis of schizophrenia or because they withdrew consent. Additionally, 47 patients were excluded due to missing data for the main analysis, resulting in a final sample of 139 patients. The original recruitment took place between December 2013 and July 2024, targeting patients with schizophrenia within 5 years of onset who visited the Early Detection and Intervention Center for Schizophrenia (EDICS) at the National Center of Neurology and Psychiatry in Tokyo, Japan. EDICS follows up with patients annually to monitor their progress. All patients received a diagnosis of schizophrenia based on the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition35. Patients were excluded if their physician determined that participation could be detrimental to them. Separately, the 98 HC were recruited at Fukushima University in Fukushima, Japan, between May 2000 and March 2005. The HC data used in this study were collected as part of another study at Fukushima University. The HC group consisted of individuals with no psychiatric diagnoses and was recruited from vocational schools and universities.

The study design was approved by the Institutional Review Board of the National Center of Neurology and Psychiatry (Approval No. A 2023-1306), and written informed consent was obtained from all participants. When the participant was a minor (i.e., <20 years of age), written consent was obtained from a parent or guardian, with additional written assent obtained from patients aged 16–19 years. The use of HC data was approved by the Research Ethics Committee of Fukushima University.

Premorbid IQ

The 25-item short version of the Japanese Adult Reading Test (JART25)36 was used to estimate premorbid IQ. The JART25 measures the ability to read 25 kanji characters.

Psychiatric symptoms

The Brief Psychiatric Rating Scale (BPRS) is an 18-item scale used to assess psychiatric symptoms37. The items are somatic concern, anxiety, emotional withdrawal, conceptual disorganization, feelings of guilt, tension, mannerisms and posturing, grandiosity, depressive mood, hostility, suspiciousness, hallucinatory behavior, motor retardation, uncooperativeness, unusual thought content, blunted affect, excitement, and disorientation. Each item is rated on a seven-point Likert scale, ranging from 1 to 7, with higher scores indicating more severe symptoms. The primary measure of this scale is the total score across the 18 items.

Cognitive function

Cognitive function was assessed using the Brief Assessment of Cognition in Schizophrenia (BACS) Japanese version38,39. This instrument evaluates six cognitive domains: verbal memory, working memory, motor speed, attention and speed of information processing, verbal fluency, and executive function. The verbal fluency domain consists of two types of VFT: the CFT and the LFT. In the CFT, participants are asked to generate as many words as possible orally within 1 min, with “animal” as the category cue. The LFT requires participants to generate as many words as possible beginning with designated letters (i.e., “KA” and “TA”). The primary measure of the BACS is a composite score calculated from the six domains, with higher scores indicating better cognitive functioning. The composite score has high test-retest reliability in patients with schizophrenia and healthy controls (intraclass correlations >0.80)39.

Social functioning

An abbreviated version of the Japanese version of the Specific Levels of Functioning Scale (SLOF), a 24-item instrument designed to evaluate social functioning, was used in accordance with well-established methodological practices from prior studies40,41. This scale covers three domains: interpersonal relationships, activities, and work. Ratings were obtained through interviews conducted by trained clinicians, with input from both patients and caregivers. Each item is scored on a five-point Likert scale, and higher scores reflect better functioning. In this study, the primary measure was the total score of the SLOF, i.e., the sum of its three domain scores ranging from 24 to 120. This method is based on the recommendation reported in previous studies40,41.

Calculation of semantic memory scores based on the CFT

The structure of semantic memory was quantified using data from the CFT, following the method originally developed by Troyer et al.11 and later automated by Kim et al.31. Specifically, the following five indicators were measured.

  • Number of produced words: Total count of unique words produced within a 1-min timeframe.

  • Rank of the first switch: Position of the first response where the participant generated a second subcluster. In cases where no switches occurred, the rank of the first switch was set to zero.

  • Number of subclusters: Number of distinct subclusters that the responses reference.

  • Number of switches: Number of occurrences where two consecutive words refer to different subclusters.

  • Largest cluster size: The largest number of consecutive responses that refer to a single subcluster.

Figure 1 shows the outline of the computation of the semantic memory scores. Solely for the collection of the list of animal words, we used both baseline and follow-up CFT responses (i.e., a total of 581 instances); however, for the multivariate analyses, we only used 139 CFT responses from the baseline assessment after excluding missing data.

Fig. 1: Process of calculating semantic component scores.

Initially, to evaluate the structure of semantic memory, preprocessing was conducted to exclude intrusion and repetition errors and to standardize word notation. Intrusion errors were defined as words belonging to categories other than the target category (animals), while repetition errors referred to repeated words. In the notation standardization process, words with the same meaning were unified under a single notation.

Next, the switch threshold, which marks the boundary between semantic clusters, was calculated. The evaluation of switches followed the method proposed by Kim et al.31. Specifically, cosine similarity, which measures the similarity between two vectors based on the cosine of the angle between them, was used to determine the semantic relationship between words, with higher cosine similarity indicating greater semantic similarity. The cosine similarity between two word vectors (W_A, W_B) in n-dimensional space is calculated as follows:

$$\text{Given vectors}\quad W_{A}=\begin{pmatrix}W_{A1}\\ \vdots \\ W_{An}\end{pmatrix},\quad W_{B}=\begin{pmatrix}W_{B1}\\ \vdots \\ W_{Bn}\end{pmatrix}\in \mathbb{R}^{n},$$

The cosine of the angle, or cosine similarity, between the two vectors is given by:

$$\begin{aligned}\text{Cosine similarity}\left(W_{A},W_{B}\right) &=\frac{\left\langle W_{A},W_{B}\right\rangle }{\lVert W_{A}\rVert \,\lVert W_{B}\rVert }\\ &=\frac{\sum_{j=1}^{n}W_{Aj}W_{Bj}}{\sqrt{\sum_{j=1}^{n}W_{Aj}^{2}}\,\sqrt{\sum_{j=1}^{n}W_{Bj}^{2}}}\end{aligned}$$
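The formula can be written directly as a short function. The following is a minimal sketch in plain Python; the function name `cosine_similarity` and the input checks are illustrative choices, not part of the original analysis code.

```python
import math


def cosine_similarity(w_a, w_b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(w_a) != len(w_b):
        raise ValueError("vectors must have the same dimension")
    # Numerator: inner product <W_A, W_B>
    dot = sum(a * b for a, b in zip(w_a, w_b))
    # Denominator: product of the two Euclidean norms
    norm_a = math.sqrt(sum(a * a for a in w_a))
    norm_b = math.sqrt(sum(b * b for b in w_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Vectors pointing in the same direction give a value of 1, orthogonal vectors give 0, and opposite vectors give −1, regardless of vector length.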

To calculate cosine similarity, word vector representations (numerical vectors that capture the semantic meaning of words) were derived from chiVe, a pre-trained Japanese word2vec model33,42. This model was trained on a large-scale Japanese web corpus consisting of approximately 100 million web pages, allowing for accurate semantic comparisons between words. The threshold for determining a switch was calculated by computing the cosine similarity for all pairs of response words (430 words, forming 92,235 pairs). The threshold was set at the median of these values, 0.235 (Fig. 2), because the median showed the strongest correlation with traditional manual switch detection methods31. Figure 2 shows a histogram of the cosine similarity values, which provides a visual representation of the distribution of semantic similarity between word pairs and of where the threshold falls within that distribution. A switch was defined as occurring when the similarity between two consecutive words fell below this threshold31, while a cluster was defined as a sequence of two or more consecutive words with a similarity above the threshold.
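The median-based threshold described above can be sketched as follows. This is an illustration under stated assumptions: the three-dimensional `toy_vectors` are invented stand-ins for chiVe embeddings, and `median_pairwise_similarity` is a hypothetical helper name. The pair count follows from n(n − 1)/2, which for 430 words gives the 92,235 pairs reported in the text.

```python
import itertools
import math
import statistics


def cosine_similarity(w_a, w_b):
    dot = sum(a * b for a, b in zip(w_a, w_b))
    norm_a = math.sqrt(sum(a * a for a in w_a))
    norm_b = math.sqrt(sum(b * b for b in w_b))
    return dot / (norm_a * norm_b)


def median_pairwise_similarity(vectors):
    """Median cosine similarity over all unordered pairs of distinct words."""
    sims = [
        cosine_similarity(vectors[first], vectors[second])
        for first, second in itertools.combinations(sorted(vectors), 2)
    ]
    return statistics.median(sims), len(sims)


# Hypothetical toy embeddings; real vectors would come from the word2vec model.
toy_vectors = {
    "dog": [0.9, 0.1, 0.0],
    "cat": [0.8, 0.2, 0.1],
    "whale": [0.1, 0.9, 0.2],
    "dolphin": [0.0, 0.8, 0.3],
}

threshold, n_pairs = median_pairwise_similarity(toy_vectors)

# Number of unordered pairs among the 430 distinct response words
n_pairs_in_study = math.comb(430, 2)
```

Because the median splits the pairwise distribution in half, roughly half of all possible word pairs fall below the threshold; the proportion of switches among consecutive responses can still differ, since consecutive responses are not a random sample of pairs.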

Fig. 2: Histogram showing the relative frequency of cosine similarity values.

The distribution of cosine similarity values between all response word pairs (430 words forming 92,235 pairs). The distribution peaks around median cosine similarity 0.235.

Finally, based on these definitions of switches and clusters, the five indicators were evaluated. An example of the evaluation of these parameters is shown in Fig. 3. After scoring the structure of semantic memory, each score was standardized.

Fig. 3: Example of the semantic parameters.

The figure illustrates an example of cosine similarity between adjacent words in a semantic fluency sequence. The number of switches was counted when the similarity between two consecutive words fell below a predefined threshold. Conversely, words were grouped within the same subcluster when the similarity between two consecutive words exceeded this threshold. Following this method, five semantic memory parameters (number of produced words, rank of the first switch, number of subclusters, number of switches, and largest cluster size) were calculated. Dashed line, threshold level of 0.235.
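The scoring rules above can be sketched in a few lines. This is not the authors' implementation: it assumes that the input is the list of cosine similarities between adjacent words after preprocessing, that a value exactly equal to the threshold does not count as a switch, that the rank of the first switch is the 1-based position of the word that opens the second run, and that a subcluster is a run of two or more words (following the cluster definition in the text). The function name and dictionary keys are hypothetical.

```python
def semantic_indicators(similarities, threshold=0.235):
    """Compute the five indicators from adjacent-word cosine similarities.

    similarities[i] is the similarity between response i + 1 and response
    i + 2 (1-based), so a sequence of k words yields k - 1 values.
    Assumes at least one word was produced.
    """
    run_lengths = []        # lengths of maximal runs not broken by a switch
    current_run = 1
    first_switch_rank = 0   # stays 0 when no switch occurs, as in the text
    for index, similarity in enumerate(similarities):
        if similarity < threshold:          # a switch closes the current run
            run_lengths.append(current_run)
            current_run = 1
            if first_switch_rank == 0:
                first_switch_rank = index + 2
        else:
            current_run += 1
    run_lengths.append(current_run)
    return {
        "n_words": len(similarities) + 1,
        "first_switch_rank": first_switch_rank,
        # Assumption: only runs of two or more words count as subclusters
        "n_subclusters": sum(1 for length in run_lengths if length >= 2),
        "n_switches": len(run_lengths) - 1,
        "largest_cluster": max(run_lengths),
    }


example = semantic_indicators([0.5, 0.4, 0.1, 0.3, 0.1, 0.2])
```

For this seven-word example, the runs have lengths 3, 2, 1, and 1, so there are three switches, the first switch occurs at the fourth word, and the largest cluster contains three words.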

Subsequently, PCA was performed using all of the indicators, following the approach of Ovando-Tellez et al.34.
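For readers unfamiliar with PCA on standardized indicators, the following sketch extracts the first two components of a correlation matrix by power iteration with deflation. This is a swapped-in technique chosen to keep the example dependency-free; the text does not state which routine was used, and a numerical library would normally be preferred. The toy data (three indicators for six hypothetical participants) and all function names are invented for illustration.

```python
import math


def standardize(column):
    """Return z-scores (sample standard deviation, n - 1 in the denominator)."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def correlation_matrix(columns):
    z = [standardize(c) for c in columns]
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def leading_eigenpair(matrix, n_iter=1000):
    """Power iteration; assumes the start vector is not orthogonal to the answer."""
    k = len(matrix)
    vec = [1.0 / math.sqrt(k)] * k
    for _ in range(n_iter):
        new = mat_vec(matrix, vec)
        norm = math.sqrt(sum(v * v for v in new))
        if norm < 1e-12:          # nothing left to extract
            return 0.0, vec
        vec = [v / norm for v in new]
    eigenvalue = sum(v * w for v, w in zip(vec, mat_vec(matrix, vec)))
    return eigenvalue, vec


def first_two_components(columns):
    """Return [(explained variance ratio, loading vector), ...] for two components."""
    corr = correlation_matrix(columns)
    k = len(corr)
    results = []
    for _ in range(2):
        eigenvalue, vec = leading_eigenpair(corr)
        results.append((eigenvalue / k, vec))
        # Deflation: remove the component just found before the next pass
        corr = [[corr[i][j] - eigenvalue * vec[i] * vec[j] for j in range(k)]
                for i in range(k)]
    return results


# Hypothetical toy data: one list per indicator, one value per participant
toy_columns = [
    [10, 12, 15, 18, 20, 25],   # number of produced words
    [3, 5, 4, 8, 6, 9],         # number of switches
    [4, 3, 6, 3, 7, 5],         # largest cluster size
]
(ratio_1, loadings_1), (ratio_2, loadings_2) = first_two_components(toy_columns)
```

Each explained variance ratio is the eigenvalue divided by the number of indicators, because the trace of a correlation matrix equals the number of variables.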

Statistics

A two-tailed t-test was conducted to examine group differences in the semantic memory scores.

Multivariate analyses were conducted using four generalized linear models (GLMs), one for the SLOF total score and one for each of the three SLOF domains, to investigate the association between the semantic memory scores (derived from the PCA) and the SLOF scores. A GLM was chosen because it can accommodate the bounded nature of the SLOF scores, allowing more accurate modeling within the scale’s limits. Independent variables included in all four models were age, sex (female = 0, male = 1), education (in years), and the PCA components. Since the SLOF score is a bounded discrete variable, the dependent variable was transformed to a scoring rate to ensure proper scaling, and a GLM with a binomial distribution and a logit link function was used to prevent fitted scores from exceeding the upper limit. The goodness-of-fit of the GLMs was evaluated using McFadden’s pseudo-R², a commonly used metric for logistic regression models. The three domain-level models (interpersonal relationships, activities, and work) were treated as additional exploratory analyses. Pearson correlation analyses were also performed to explore the associations between the semantic memory scores and specific clinical traits (JART, BACS, BPRS conceptual disorganization).
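The modeling idea can be illustrated with a single predictor. The sketch below makes several assumptions that the text does not confirm: the scoring rate is taken to be the total score divided by the maximum of 120, the model has one predictor instead of the five used in the study, the fit uses a hand-written Newton-Raphson loop, and the log-likelihood omits the binomial coefficient, which changes the numeric value of the pseudo-R² relative to software that keeps it. The data are invented.

```python
import math

SLOF_MAX = 120  # maximum SLOF total score (24 items x 5 points), from the text


def scoring_rate(score, max_score=SLOF_MAX):
    """Assumed transform: total score divided by the maximum possible score."""
    return score / max_score


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def fit_logit(xs, rates, n_iter=50):
    """Newton-Raphson fit of rate ~ intercept + slope * x with a logit link."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, rates):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p               # gradient, intercept
            g1 += (y - p) * x         # gradient, slope
            h00 += w                  # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def log_likelihood(rates, probs, n_trials=SLOF_MAX):
    """Binomial log-likelihood up to an additive constant."""
    return sum(
        n_trials * (y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
        for y, p in zip(rates, probs)
    )


def mcfadden_pseudo_r2(xs, rates):
    """1 - logL(model) / logL(intercept-only model)."""
    b0, b1 = fit_logit(xs, rates)
    fitted = [sigmoid(b0 + b1 * x) for x in xs]
    null_prob = sum(rates) / len(rates)   # MLE of the intercept-only model
    ll_model = log_likelihood(rates, fitted)
    ll_null = log_likelihood(rates, [null_prob] * len(rates))
    return 1.0 - ll_model / ll_null


# Hypothetical data: a standardized component score and SLOF totals
component_scores = [-1.5, -0.5, 0.0, 0.5, 1.5]
slof_totals = [60, 78, 84, 90, 108]
rates = [scoring_rate(s) for s in slof_totals]
intercept, slope = fit_logit(component_scores, rates)
pseudo_r2 = mcfadden_pseudo_r2(component_scores, rates)
```

Because the logit link maps any linear predictor to a value between 0 and 1, the fitted scoring rate, and hence the implied SLOF score, cannot exceed the scale maximum.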

Results were considered statistically significant at p-values of less than 0.05. All statistical analyses were performed with Python version 3.11.7. As this study is exploratory in nature, no formal power analysis was conducted to determine the sample size.

Results

Demographic data

The demographic characteristics of the study participants are presented in Table 1. In the schizophrenia group, the mean ± SD age of the participants was 27.63 ± 9.26 [15 to 57] years, while the average length of education was 13.98 ± 2.40 [9 to 21] years. In the HC group, the mean ± SD age of the participants was 26.08 ± 10.23 [18 to 55] years, while the average length of education was 13.97 ± 1.68 [11 to 19] years. There were no significant differences between the two groups in age (t = 1.21, p = 0.23) or education (t = 0.04, p = 0.97).
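As a consistency check, the reported t values can be approximately recomputed from the summary statistics above. The sketch assumes a pooled-variance (Student) t-test and group sizes of 139 and 98, neither of which is stated explicitly for these comparisons; because the inputs are rounded, only approximate agreement is expected.

```python
import math


def pooled_t_statistic(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Two-sample t statistic with a pooled variance estimate."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / n_1 + 1.0 / n_2))
    return (mean_1 - mean_2) / standard_error


# Schizophrenia group first, HC group second (values from the text)
t_age = pooled_t_statistic(27.63, 9.26, 139, 26.08, 10.23, 98)
t_education = pooled_t_statistic(13.98, 2.40, 139, 13.97, 1.68, 98)
```

Under these assumptions, the statistics round to 1.21 for age and 0.04 for education, in line with the reported values.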

Table 1 Demographic and clinical profiles of the participants.

Calculation of semantic memory scores based on the CFT

After quantifying the five indicators, we conducted a PCA of these indicators. Consistent with previous research34, two components were identified that together explained 77% of the variance. Specifically, the first component explained 45% of the variance, and the second component explained 32% of the variance. The component loadings are presented in Table 2. The first component is characterized by the number of subclusters and the number of switches, suggesting that it represents switching between semantic subclusters. The second component is characterized by the number of produced words, the rank of the first switch, and the largest cluster size, indicating that it likely reflects the clustering of responses within semantic subclusters11. Therefore, the first component score was defined as the switching component score, and the second component score was defined as the clustering component score. This interpretation is consistent with prior research34, which also identified switching and clustering as key cognitive processes underlying verbal fluency performance. Each principal component score was calculated from the standardized indicators as follows:

  • Switching component score = 0.27 × number of produced words + (−0.11) × rank of the first switch + 0.63 × number of subclusters + 0.62 × number of switches + (−0.37) × largest cluster size.

  • Clustering component score = 0.67 × number of produced words + 0.45 × rank of the first switch + 0.08 × number of subclusters + 0.06 × number of switches + 0.58 × largest cluster size.
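The two weighted sums above can be applied to one participant's standardized indicators as follows. The weights are copied from the text; the dictionary keys, the function name, and the example z-scores are hypothetical.

```python
SWITCHING_WEIGHTS = {
    "n_words": 0.27,
    "first_switch_rank": -0.11,
    "n_subclusters": 0.63,
    "n_switches": 0.62,
    "largest_cluster": -0.37,
}
CLUSTERING_WEIGHTS = {
    "n_words": 0.67,
    "first_switch_rank": 0.45,
    "n_subclusters": 0.08,
    "n_switches": 0.06,
    "largest_cluster": 0.58,
}


def component_score(z_scores, weights):
    """Weighted sum of standardized indicators for one participant."""
    return sum(weights[name] * z_scores[name] for name in weights)


# Hypothetical participant: every indicator one SD above the sample mean
z_example = {name: 1.0 for name in SWITCHING_WEIGHTS}
switching_score = component_score(z_example, SWITCHING_WEIGHTS)
clustering_score = component_score(z_example, CLUSTERING_WEIGHTS)

# Properties expected of PCA weight vectors, up to rounding of the weights
switching_norm_sq = sum(w * w for w in SWITCHING_WEIGHTS.values())
clustering_norm_sq = sum(w * w for w in CLUSTERING_WEIGHTS.values())
weights_dot = sum(
    SWITCHING_WEIGHTS[name] * CLUSTERING_WEIGHTS[name] for name in SWITCHING_WEIGHTS
)
```

The two weight vectors have squared lengths close to 1 and an inner product close to 0, as expected of orthonormal principal axes reported to two decimal places.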

Table 2 Component loading scores of the semantic memory scores.

Comparisons of semantic memory scores between HC and schizophrenia patients

The results of the t-tests comparing the schizophrenia group with the HC group are shown in Table 3. The rank of the first switch, the largest cluster size, and the clustering component score were significantly lower in the schizophrenia group than in the HC group (t = 2.14, p = 0.03; t = 2.05, p = 0.04; and t = 2.04, p = 0.04, respectively). In contrast, no significant differences were found between the two groups in the other scores, i.e., the number of produced words, the number of subclusters, the number of switches, or the switching component score (t = 1.89, p = 0.06; t = −1.41, p = 0.16; t = −1.74, p = 0.08; and t = −1.48, p = 0.14, respectively). Thus, the two groups differed in the clustering component score but not in the switching component score, indicating an asymmetry between the two components across groups.

Table 3 Comparisons of semantic memory scores between healthy control and schizophrenia patients.

Generalized multiple linear regression analysis

Table 4 shows the results of the GLM analysis in the schizophrenia group. The independent variables included in all four models were age, sex (female = 0, male = 1), education (in years), and the PCA components: the switching component score and the clustering component score. The GLM for the SLOF total score had a pseudo-R² of 0.29. The clustering component score, but not the switching component score, was significantly associated with the SLOF total score. The relevant statistical information is provided in Table 4. Figure 4 presents the individual data points illustrating the relationships between social functioning (SLOF total scores) and the switching component scores (left panel) and clustering component scores (right panel).

Fig. 4: Associations between scores on the Specific Levels of Functioning Scale Japanese version vs. those on the switching component (left panel) or clustering component (right panel).

Table 4 Generalized multiple linear regression analysis with Specific Levels of Functioning Scale total scores as the dependent variable.

Exploratory analysis: linking semantic memory scores to clinical and social functioning

Additional analyses were conducted to examine factors influencing each SLOF domain (i.e., interpersonal relationships, activities, and work). The model for the SLOF interpersonal relationships score had a pseudo-R² of 0.05. The clustering component score was significantly associated with this domain score (β = 0.05, p = 0.04, standardized β = 0.06). For the activities domain, the pseudo-R² was 0.54. Both the switching and clustering component scores were significantly associated with this domain score (switching: β = 0.05, p = 0.02, standardized β = 0.09; clustering: β = 0.17, p < 0.01, standardized β = 0.21). For the SLOF work skills domain, the pseudo-R² was 0.11. There was no evidence that the semantic memory component scores had an influence on the work skills domain scores (data not shown).

Pearson correlation analyses revealed a significant positive correlation between the switching component score and the BACS score (r = 0.21, p = 0.01). The clustering component score, in turn, was significantly correlated with both the JART premorbid IQ (r = 0.21, p = 0.01) and the BACS score (r = 0.32, p < 0.01). No significant associations were found between the BPRS conceptual disorganization score and either semantic memory component score.

Discussion

To our knowledge, this study is the first to examine the association between semantic memory organization and social functioning in patients with schizophrenia. The results of the PCA using a data mining approach are consistent with those of a previous study34. In addition, the clustering component score was lower in schizophrenia patients than in HC. Importantly, the results from the GLM analysis showed that the clustering component score of semantic memory was associated with the SLOF total score.

The results of the PCA underscored the notion that there are two distinct aspects of the semantic memory parameters generated by the automated scoring of CFT data: a switching component and a clustering component. These results are consistent with those from previous research34, where similar semantic memory scores contributed to defining the same PCA component scores. Hence, both components were interpreted similarly; the switching component represents transitions between semantic subclusters, while the clustering component reflects the grouping of responses within semantic subclusters34. Also, given that traditional methods define switching and clustering based on the number of switches and mean cluster size11, it was expected that switching and clustering would influence each other (i.e., larger clusters would result in fewer switches). Indeed, the rank of the first switch and the largest cluster size, both of which load strongly and positively on the clustering component, loaded negatively on the switching component, as revealed by the results from the PCA (Table 2).

The observed asymmetry between the switching and clustering components across groups deserves some deeper theoretical consideration. In line with the exploration-exploitation framework, memory retrieval can be seen as a foraging-like process43,44. Individuals first exploit a cluster of highly associated items until their retrieval rate starts to decline44. Once the return from this cluster diminishes, they switch to exploring a new cluster in search of additional items44. In our study, participants who had a low clustering component score may have compensated by switching clusters more frequently, thereby sustaining their overall word production.

The absence of significant differences in the number of produced words between the patients and HC subjects, studied here, was not consistent with the results of previous meta-analyses7,8. This may be due to the characteristics of the research setting. The EDICS is an advanced medical care center for early schizophrenia; most participants had already received treatment and were at the initial phase of the disease.

When including these PCA component scores, the comparison of the semantic memory parameters between the HC subjects and schizophrenia patients suggests that there is semantic memory disorganization in the latter group. This is indicated by the lower scores on some semantic memory parameters, namely, the rank of the first switch, the size of the largest cluster, and the clustering score. These findings confirm the importance of not only the quantitative, but also the qualitative aspects of word production when it comes to fully understanding the semantic network disturbances associated with schizophrenia. Our data also demonstrate the utility of PCA in examining semantic memory organization.

Participants with lower semantic memory scores, particularly those with lower clustering component scores, exhibited poorer social functioning. Given that previous research has not extensively explored the link between semantic memory and social functioning, our findings are potentially of high importance. Such an association suggests that a well-organized semantic memory may be key to navigating and managing daily living activities in patients with schizophrenia. This aligns with the notion that language-related symptoms in schizophrenia, driven by semantic memory disorganization1, lead to poor social functioning3,4,5. In contrast, the switching component was related only to the activities domain and did not exhibit a broad influence across the other social functioning domains. Since the study was cross-sectional, longitudinal studies are needed to determine whether semantic memory organization predicts changes in social functioning over time.

An exploratory analysis was conducted to examine the associations between the distinct aspects of semantic memory and specific clinical traits. The results revealed that the switching component score was significantly associated only with cognitive function, while the clustering component score was correlated with both premorbid IQ and cognitive function. In contrast, conceptual disorganization measured by the BPRS did not show significant associations with either aspect of semantic memory. These findings suggest that the two aspects of semantic memory may have different underlying mechanisms and should be treated separately from conceptual disorganization in future research.

Based on the discussion so far, assessing higher cognitive functions, particularly semantic memory as measured by the CFT, may provide a novel approach to the estimation of social functioning. As considerable effort is usually required to manage patients, the CFT’s ability to feasibly assess semantic memory may help estimate social functioning in clinical settings with time constraints. While developing automated analysis systems may be challenging, the availability of pre-trained word2vec models may facilitate the creation and distribution of standardized analysis tools for broader use in clinical practice. Further, if considerable impairments of semantic networks are identified by the above method, some treatments, such as Semantic Feature Analysis to enhance word retrieval45,46,47, may be effective in improving social outcomes in patients with schizophrenia.

The limitations of this study should be acknowledged. First, patient data were collected from a single institution, which may introduce sample bias, as most participants had already received treatment and were in the early phase of the illness. This may also limit the generalizability of the findings. Second, the test-retest reliability of these scores was not examined, as this study focused solely on the initial assessment of semantic memory scores. Future research should evaluate the stability of these scores over time to further validate these measures. Third, the generalizability of our analysis remains unclear because only a single category (i.e., animals) was used. It is uncertain whether our analytical approach would yield consistent results across different word categories. Finally, while automated methods offer consistency and scalability, they require custom programming due to the current lack of standardized analysis tools, which, as noted above, may limit their applicability.

Conclusions

The results of this study suggest that semantic memory scores derived using a text mining approach facilitate the assessment of semantic memory and the analysis of social functioning in patients with schizophrenia. Prior to this study, assessing higher cognitive functioning, such as semantic memory, was challenging due to technical constraints. Thus, the use of text mining techniques, as reported in the present study, may pave the way for the development of automated methods (e.g., deep learning approaches) to further refine semantic memory analysis. These endeavors should provide a framework for the exploration of social functioning by assessing semantic memory organization in patients with schizophrenia.