1 Introduction

Lung adenocarcinoma (LUAD) is the most common subtype of lung cancer worldwide, accounting for approximately 40% of all lung cancer cases [1, 2]. It originates from the bronchial mucosal epithelium, primarily due to prolonged smoking and environmental influences. Compared to squamous cell carcinoma, it exhibits different genetic mutations and alterations in signaling pathways [3]. Although various therapeutic strategies such as radiotherapy, surgical resection, targeted therapy, and immunotherapy have been used in the clinical treatment of LUAD in recent years, the 5-year overall survival rate of LUAD is only 15% [4]. Consequently, it’s critically important to develop effective biomedical predictive models that can assist in categorizing and managing LUAD patients and clinical diagnosis and treatment.

The centrosome is a multifunctional, membraneless organelle. It consists of two centrioles surrounded by pericentriolar material (PCM) and serves as the primary microtubule-organizing center (MTOC) in most animal cells [5]. Centrosomes are considered to play a central role in the life of the cell, primarily regulating the cell cycle and maintaining cell polarity. Centrosome abnormalities are closely associated with many cancers, including ovarian cancer, breast cancer, lung cancer, and multiple myeloma [6,7,8,9,10]. Centrosome abnormalities can be categorized into numerical abnormalities (copy number increase) and structural abnormalities (such as centrosome amplification). Studies have shown that centrosome dysfunction is linked to the tumor immune microenvironment [11]. The latest view is that centrosome aberrations (CA) are closely related to cancer metastasis [12]. Chromosomal instability due to centrosome amplification contributes to the acquisition of a more malignant phenotype, which promotes tumor progression [13]. However, there are no studies reporting the potential association of centrosome-related genes with the prognosis of LUAD.

Polyamine metabolism is intimately linked to the tumor microenvironment (TME) and plays an important role in regulating anti-tumor immune responses [14]. Polyamines mainly include spermine, spermidine, and putrescine [15]. Spermidine interacts with nucleic acids and proteins, among others, through electrostatic binding and plays a crucial role in cell division and proliferation by maintaining DNA genome stability and regulating gene transcription and translation [16]. Polyamine metabolic enzymes are often dysregulated in tumor cells, with high levels of polyamines supporting tumor cell growth and inhibiting the differentiation of immune cell subsets. Research shows that tumor cell-derived spermidine inhibits T cell activation and suppresses CD8 T cell activity by reducing cholesterol levels [17]. Spermidine synthase (SRM) is involved in the biosynthesis of spermidine [18]. In renal clear cell carcinoma, SRM has been identified as a core gene in polyamine metabolism highly associated with poor prognosis [19]. In bladder cancer, knocking out ODC1 and SRM can sensitize T24/THP cells to chemotherapy, while the overexpression of these two enzymes can enhance chemotherapy resistance [20]. miR-1908-mediated SRM regulation promotes cancer progression by controlling extracellular vesicle (EV) secretion in prostate cancer (PCa) [21]. However, in LUAD, SRM and polyamine metabolism have not been adequately studied, and their molecular mechanisms affecting LUAD progression need to be further investigated.

In our study, 14 centrosome-related genes (CRGs) were first collected from the literature, and 351 CRGs that were differentially expressed between normal and LUAD tissues were identified in the TCGA dataset by the ssGSEA and WGCNA (Weighted Gene Co-Expression Network Analysis) algorithms. We then used univariate Cox regression to screen for prognostically relevant CRGs, and consensus clustering to differentiate between two centrosome subtypes. Then, we screened 10 prognostic CRGs of LUAD patients using the least absolute shrinkage selection operator (LASSO) regression analysis. Based on these 10 genes, we constructed centrosome-related prognostic models and centrosome-related risk scores (CRS) for LUAD patients. Based on the CRS, we categorized LUAD patients into high-risk and low-risk groups. We characterized CRS at the bulk transcriptome and single-cell transcriptome levels, with the high CRS group tending to have multiple oncogenic pathway activation and immunosuppression, and the low CRS group having higher T- and B-cell infiltration and more benefit from immunotherapy. We confirmed that the high-risk group had a poorer prognosis, a higher frequency of genomic mutations, a lower level of immune infiltration, and a “cold” tumor phenotype closer to the proliferative active state. In conclusion, we conclude that this predictive model provides a novel and powerful reference for the diagnosis and treatment of LUAD.

2 Methods

2.1 Data collection and processing

Transcriptomic data of LUAD patients and corresponding clinical data were obtained from the TCGA (https://portal.gdc.cancer.gov/) database and GEO (http://www.ncbi.nlm.nih.gov/geo/) database. 500 LUAD cases (primary tumor samples with complete survival information) from the TCGA database were used to construct relevant prognostic signatures, and GSE31210 (n = 226), GSE50081 (n = 127), GSE72094 (n = 398), and GSE3141 (n = 111) were used to validate the centrosome-related genes for centrosomal prognosis based on LUAD related characteristics. Fourteen important centrosome genes were collected from published articles [22]. Somatic mutation data were obtained from the TCGA data portal, and somatic mutation data sorted in Mutation Annotation Format (MAF) were analyzed using the R package “maftools”. The RNA-seq data types for TCGA processing were all log2(TPM + 1). The corresponding data for this study can be found in the supplementary data files.

2.2 WGCNA identifies centrosome-related genes

The WGCNA package in R version 4.0.2 implements WGCNA [23]. In the TCGA cohort, the expression profiles of the top 10,000 genes, ranked by median absolute deviation (MAD), were input into WGCNA for subsequent analysis. The correlation between genes and centrosome scores is quantified by gene significance, while the correlation between module-representative genes and gene expression profiles is represented by module membership. The β = 5 power automatically calculated by the pickSoftThreshold function and the scale-free R2 = 0.9 calculated by the softConnectivity function were set to the softthreshold parameter to ensure a scale-free topological network and to generate the TOM matrix. The correlation between the blue module and the centrosome score and Stage staging defined by ssGSEA was 0.77 and 0.17, respectively, suggesting that the blue module was significant. The correlation between the blue module, centrosome, and clinicopathologic staging was 0.77 and 0.17, respectively. The blue module is considered centrosome-specific and has been selected for further analysis. The correlation between module membership and gene significance is visualized using verboseScatterplot. Genes within the blue module were therefore selected for GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) functional enrichment analysis.

2.3 Collection and processing of data for single-cell RNA-seq analysis

We used single-cell RNA sequencing data (50,093 transcriptomes after quality control and filtering) from 10 samples (5 normal lungs and 5 LUAD samples) from the single-cell sequencing data of Bischoff et al. Single-cell sequencing data were analyzed using the “Seurat” software package. Quality control (QC) was performed by retaining cells with less than 10% of mitochondrial genes and genes with expression ranging from 100 to 8000 in at least three cells. We then identified highly variable genes and set the number of highly variable genes to 2000 for subsequent analysis. The “Harmony” software package was used to remove batch effects. We constructed cell clusters using the “FindClusters” and “FindNeighbors” functions and visualized them using the “t-SNE” method. Finally, we performed cellular annotation based on the marker genes of different cell types. The CRS model gene set was quantified using the AddModuleScore, Singsore, AUCell, and UCell algorithms in the “irGSEA” package [24]. Cell-cell interactions were analyzed using the “cellchat” package, and we also used Monocle2 to infer cell trajectories. After downscaling and cell sequencing, differentiation trajectories were inferred using standard parameters.

2.4 Description and validation of centrosome-related subtypes

The 788 genes extracted from the blue module were intersected with the differential expressed genes (DEGs) of the TCGA cohort, 351 genes were analyzed by univariate Cox regression analysis, and prognostic genes were identified with the criterion of P < 0.05. Based on the 296 prognostic genes, unsupervised cluster analysis was performed using the “K-means method” in the “ConsensusClusterPlus” R package. Cumulative Distribution Function (CDF) and Consensus Heatmap were used to evaluate the optimal K-value of 2. Survival analysis was performed after clustering to determine the effect of clustering on prognosis. Principal component analysis (PCA) was performed on subgroups to determine differences between samples. In addition, additional validation was performed in the GSE31210 and GSE50081 cohorts.

2.5 Functional enrichment analysis

“limma” package for analyzing DEGs between high and low-risk groups [25], with the thresholds set at False Discovery Rate (FDR) < 0.05 and |log2 (Fold Change) |> 1. GO and KEGG pathway enrichment analysis was performed using the “clusterProfiler” R package [26]. Gene Set Enrichment Analysis (GSEA) was used to explore altered biological processes for centrosome-related prognostic signatures. “PROGENy” R package was used to assess 14 signaling pathway activities (androgen, estrogen, EGFR, hypoxia, JAK-STAT, MAPK, NFkB, PI3K, p53, TGFb, TNFa, Trail, VEGF and WNT) in patients [27].

2.6 Gene set variation analysis (GSVA)

We investigated the variation of biological processes between the two clusters using the ssGSEA and GSVA algorithms in the R package “GSVA”. Clear biological features were obtained from the Hallmark gene set (h.all.v2023.1.Hs.symbols) and the gene set constructed by Mariathasan et al. (from the IMvigor210CoreBioologies package) [28]. The gene set “c2.cp.kegg.v7.4.symbols.gmt” was also downloaded from MSigDB as a reference gene set.

2.7 Assessment of the immune infiltration microenvironment

XCELL, MCPCOUNTER, CIBERSORT, TIMER, EPIC, QUANTISEQ, and IPS algorithms for immuno-microenvironmental quantification using the R package “IOBR” [29]. Single-sample gene enrichment analysis (ssGSEA) was introduced to quantify the relative infiltration of 28 immune cell types in the TME, with unique combinations of characteristic genes for each immune cell subtype taken from the most recent literature. Immunophenoscore (IPS), a machine learning algorithm from The Cancer Immunome Atlas (TCIA), was used to quantify the relative infiltration of the 28 immune cell types in the TME [30], which is used to quantify the IPS score of LUAD patients based on their transcriptional profiles, indicating response to immunotherapy. The Tumor Immune Dysfunction and Exclusion (TIDE) algorithm is also used to assess a patient’s response to immunotherapy [31].

2.8 Construction and validation of centrosome-related prognostic risk profiles

To further investigate the prognostic value of CRGs for LUAD, centrosome-related prognostic characteristics were determined by Lasso analysis and stepwise multivariate Cox regression analysis. The CRS for each LUAD patient was then calculated based on the risk coefficients and LUAD expression profiles obtained in the multivariate Cox regression analysis, using the formula: CRS= − 0.653 * DBF4 + 0.201 * ANLN − 0.274 * MCM2 + 0.352 * GNPNAT1 + 0.251 * SRM + 0.369 * LDHA + 0.159 * UCK2 + 0.226 * CCT6A + 0.229 * MELTF + 0.206 * RCCD1. Subsequently, we used LUAD patients from the TCGA cohort as the training set and the GSE31210, GSE50081, GSE72094, and GSE3141 datasets from the GEO database as the validation sets. Based on the median risk score, we divided LUAD patients into two groups: low-risk and high-risk. The receiver operating characteristic (ROC) analysis was performed using the “timeROC” R package to analyze the predictive performance of risk characteristics.

2.9 Risk characterization and nomogram construction

To construct a nomogram model for clinical use, we first performed univariate and multivariate Cox regression analyses of clinicopathologic and risk characteristics. A nomogram was constructed based on clinical characteristics and risk scores to predict LUAD prognosis using the “rms” package. Calibration curves were generated to assess the predictive accuracy of the models. The reliability of the model was assessed using decision curve analysis (DCA).

2.10 Prediction of immunotherapeutic response and drug response

The IMvigor210 cohort, a cohort of uroepithelial cancers treated with the anti-PD-L1 antibody, was used to predict patient response to immunotherapy. Based on the Creative Commons 3.0 license, full expression data and clinical data are available from http://research-pub.Gene.com/IMvigor210CoreBiologies [32]. The GSE78220 cohort, a melanoma cohort treated with anti-PD-1 (pembrolizumab or nivolumab) immunotherapy, was also included to predict response to immunotherapy [33]. The GSE100797 cohort, a melanoma cohort treated with adoptive T-cell therapy (ACT), was also included as a predictor of response to immunotherapy [34]. We used the “oncoPredict” R package to assess the predictive power of CRS chemotherapeutic agents by calculating patient IC50s for various common chemotherapeutic agents. The Wilcoxon rank test was then used to compare IC50 differences between high/low-risk groups [35].

2.11 Exploration of scRNA

The CancerSEA online database [36] was used to study the function of CCT6A and SRM in LUAD at single-cell resolution in 14 different states, including angiogenesis, apoptosis, invasion, EMT, differentiation, proliferation, DNA damage, metastasis, hypoxia, inflammation, cell cycle, DNA repair, stemness, and quiescence. To gain more insight into the differential expression of model genes at the single-cell level, we used single-cell sequencing data from multiple NSCLC cohorts in the TISCH2 database (http://tisch.comp-genomics.org/home/) [37].

2.12 Cell culture and RT-qPCR

Human LUAD cell lines A549, H1299, and H1975 and human normal lung epithelial cell line BEAS-2B were purchased from the cell bank of the Chinese Academy of Sciences (Shanghai, China). We used A549, H1299, and H1975 cells for in vitro culture in DMEM medium and RPMI 1640 medium (Gibco, ThermoFisher Scientific, United States) supplemented with 10% fetal bovine serum, 1% penicillin and streptomycin (Gibco). SYBR Green qPCR mixture (Vazyme, China) was used to synthesize cDNA for real-time PCR, and GAPDH was standardized as a reference gene. Primer sequences: SRM, F-GTGGTGGCCTATGCCTACTG, R-CTCCTGGAAGTTCGTGCTCG; GAPDH, F-GGAGCG AGATCCCTCCAAAAT, R-GGCTGTTGTCATACTTCTCATGG.

2.13 Immunohistochemistry

We collected 5 lung tumor tissues and 5 peripheral lung tissues from patients with LUAD. The following primary antibody and antigen recovery protocol was used: SRM (Proteintech, 19858-1-AP, China). Embedded wax blocks were sliced and deparaffinized with a gradient of xylene and alcohol. After rinsing with water, endogenous peroxidase interference was removed by hydrogen peroxide immersion, followed by antigen repair by adding EDTA in an autoclave. After cooling at room temperature, it was rinsed with PBS buffer, closed by adding serum, and incubated in an incubator at 37 °C for 30 min. Primary and secondary antibodies were incubated separately, and then color was developed using DAB and background stained with hematoxylin.

2.14 Statistical analysis

Univariate Cox analysis of OS was performed to identify related genes and their prognostic value. Kaplan-Meier survival curves were generated and compared between the two groups using log-rank tests. Associations between prognostic model risk scores and gene expression, signaling pathway scores, and immune scores were assessed using Spearman’s correlation analysis. Two pairs of between-group tests were compared using the Wilcoxon test, and categorical variables were compared using the chi-square test or Fisher’s exact test. All statistical analyses were performed using R version 4.0.2 and its appropriate software packages. Differences were considered statistically significant at *p < 0.05, **p < 0.01, ***p < 0.001 and ****p < 0.0001.

3 Results

3.1 Identification of centrosome-related genes

We sought to determine the prognostic value of centrosomes in LUAD by studying centrosome-related genes using WGCNA. After stratifying patients by the high or low median level of centrosomes calculated by the ssGSEA algorithm, survival analyses revealed a clear difference between the two subgroups (Supplementary Figure S1A). To assess the potential prognostic value of centrosomes, we performed WGCNA using the expression profiles of the top 10,000 genes in the TCGA cohort as input to search for centrosome-specific genes. The power β = 5 was chosen as a soft threshold for scale-free network construction, and 13 modules were recognized by clustering dendrograms, with the blue modules selected for subsequent analysis (Fig. 1A). We further found a significant positive correlation between module members and centrosome-related genes (cor = 0.93) (Fig. 1B). Through the DESeq2 package, 5421 DEGs were identified. We then intersected these with 788 blue module genes, resulting in 351 CRGs (Fig. 1C). GO enrichment analysis indicates that CRGs are significantly enriched in chromosome segregation, karyokinesis, spindle, and cytokinesis. KEGG analysis showed enrichment in the cell cycle, oocyte meiosis, cellular senescence, and the p53 signaling pathway (Fig. 1D, E).

Fig. 1
figure 1

WGCNA algorithm to identify centrosome-related genes. A Correlation analysis of module and centrosome scores. B Scatter plot between blue gene significance (GS) and module membership (MM). C Venn diagram of DEGs and blue module genes. D GO enrichment analysis of centrosome-related genes. E KEGG enrichment analysis of centrosome-related genes

3.2 Construction and validation of centrosome-related features

Subsequently, we conducted univariate Cox regression analysis on 351 CRGs to obtain 296 prognosis-related CRGs for subsequent centrosome feature generation. KM clustering was performed on LUAD patients with corresponding gene expression data in the TCGA cohort. The optimal number of clusters was assessed using the ConsensusClusterPlus package. The results were most stable when the number of clusters was 2 (Fig. 2A, B). Survival analysis of both clusters confirmed that cluster 2 has a poorer survival prognosis (Fig. 2C). PCA shows that the cluster can significantly distinguish samples in the TCGA cohort (Fig. 2D). In addition, we performed external validation in the GSE31210 and GSE50081 cohorts and obtained similar results (Supplementary Figure S1B, D). A delineated group based on 296 prognostic centrosome-related genes showed different clinical features and centrosome-level patterns (Fig. 2E). Associations between estimated scores of immune infiltrating microenvironments and clusters as well as immune cell levels were examined in the TCGA cohort (Fig. 2F). ESTIMATEScores, ImuneScores, and StromalScores were higher in Cluster 1 than in Cluster 2. GSVA enrichment analyses showed that Cluster 2 was highly expressed in cell cycle pathways such as E2F targets, G2M checkpoints, and mitotic spindles; Cluster 1 was significantly enriched in IL2 STAT5 signaling, IL6 JAK STAT3 signaling, and inflammation-responsive immune-related signaling pathways (Fig. 2G). To further explore the relationship between centrosome-related clusters and TME characteristics, cluster 1 showed significantly higher levels of infiltration of central memory CD4T cells, activated B cells, myeloid-derived suppressor cells (MDSCs), and mast cells, while cluster 2 showed significantly up-regulated levels of infiltration of activated CD4T cells, type 2 helper T cells, and memory B cells (Fig. 2H). Further to explore the possible alterations of centrosome-related clusters in the cancer-immunity cycle, we found that the activity of the highly proliferative Cluster 2 was significantly down-regulated in the second (antigen presentation and processing), third (T-cell activation), and fifth (tumor-tissue T-cell infiltration) steps (Fig. 2I). Then, we investigated the expression levels of major immune checkpoints between the two clusters and found that all were significantly upregulated in Cluster 2 except BTLA (such as LAG3, CD274, IDO1, and PDCD1) (Fig. 2J). Finally, we found that the TIDE score was significantly higher in cluster 2, suggesting a lower likelihood of immunotherapeutic benefit in cluster 2 patients; and the centrosome score also showed a significant upregulation (Fig. 2K, L).

Fig. 2
figure 2

Construction and validation of centrosome-related clusters. A Consensus heatmap matrix for TCGA queue. B Consensus matrix CDF curves. C Kaplan-Meier survival analysis of the two clusters. D Sample clustering of the TCGA dataset using PCA. E The heatmap shows good separation of the two clusters identified by the KM algorithm through the features. F ESTIMATEScores, ImuneScores, and StromalScores for the two clusters in TCGA. G Heatmap showing the enriched biological pathways between the two centrosome-related clusters as calculated by the GSVA algorithm. H Expression abundance of 28 infiltrating immune cell types in the two clusters. I Box plot showing the difference in the fraction of cancer-immunity cycles between the two clusters. J Differences in expression of immune checkpoint genes. K Difference in TIDE scores between the two clusters. L Difference in centrosome scores between two clusters

3.3 Construction of centrosome-related prognostic signature

The above prognostically relevant CRGs were extracted for Lasso regression and stepwise multifactorial Cox regression analysis to construct a prognostic signature based on 10 core centrosome prognostic-related genes, including eight high-risk genes (LDHA, GNPNAT1, SRM, MELTF, CCT6A, RCCD1, ANLN, and UCK2) and two low-risk genes (MCM2 and DBF4) (Fig. 3A–C). To make the CRS more clinically applicable, we developed a nomogram by integrating CRS with clinical characteristics (Fig. 3D). The calibration curves showed a strong consistency between the nomogram predictions and the actual observations (Fig. 3E). Decision curve analysis (DCA) further revealed that the nomogram provides better clinical benefits compared to other clinical features (Fig. 3F). LUAD patients were divided into high-risk and low-risk groups based on the median risk score. Survival analysis showed that patients in the high-risk group had a shorter OS (log-rank test, p < 0.0001; Fig. 3G), and time-dependent ROC curves were plotted, with the area under curve (AUC) values at 1-, 2-, 3-, 4-, and 5-years of 0.74, 0.75, 0.73, 0.74, and 0.69 (Fig. 3H). Excellent predictive performance was also demonstrated in the GEO cohort, with a 1-year AUC of 0.9 in the GSE31210 cohort (Fig. 3I, J), and 1-, 3-, and 5-year AUC values for the GSE50081 and GSE72094 cohorts are all greater than 0.7, and excellent predictive performance was also shown in the GSE3141 cohort (Fig. 3K–P). In addition, to comprehensively compare CRS with other prognostic signatures, we searched the literature on relevant prognostic characteristics published in the past 5 years and incorporated 10 different signatures [38,39,40,41,42,43,44,45,46,47] that CRS exhibited better C-index performance than other models (Supplementary Figure S1C). Univariate and multivariate Cox regressions were performed to assess the independent predictive power of risk scores, and the results showed that risk scores (p < 0.001) could be used as independent predictors (Supplementary Figure S1E, F). In addition, we determined the cell cycle score by a z-score algorithm and observed a significant correlation between CRS z-score and cell cycle z-score (R = 0.77, p < 0.0001) in the TCGA pan-cancer group (Supplementary Figure S1G). These results indicated that the centrosome-related prognostic model based on 10 prognostically relevant CRGs demonstrated high accuracy and considerable stability in predicting LUAD prognosis.

Fig. 3
figure 3

Construction and validation of centrosome-related prognostic signatures. A Trajectories of each independent variable with lambda. B The plot of coefficient distributions of the log(lambda) series used for parameter selection (lambda). C Multivariate Cox coefficients for each gene in the risk profile. D Nomogram combining age, grade, gender, T stage, stage, and risk score. E Calibration curves for the constructed 1-, 3-, and 5-year survival line plots. F DCA decision curve analysis. GP K-M curves and ROC curves of risk models were constructed for 10 genes in the TCGA cohort and GEO cohort

3.4 Clinical characterization and functional enrichment of centrosome-related prognostic signature

We first analyzed the correlation between CRS and clinical characteristics. In the TCGA-LUAD cohort, there were significant differences in Stage, T stage, and N stage between the high-risk and low-risk groups (p < 0.01, chi-square test); Additionally, the high-risk group had a higher proportion of cluster 2-type patients (Fig. 4A). The box plot demonstrated that the risk score for centrosome-related cluster 2 is relatively high, while the risk score increased with stage, T stage, and N Stage (Fig. 4B). GSEA analysis was conducted on the DEGs of high-risk and low-risk groups to investigate the specific mechanisms of CRS-mediated prognostic stratification. The results showed that patients in the high-risk group were mainly enriched in the cell cycle, DNA replication, ECM receptor interactions, the P53 signaling pathway, and pyrimidine metabolism. Low-risk patients were primarily enriched in immune response-related pathways such as allograft rejection, cell adhesion molecules, and the intestinal immune network for IgA production (Fig. 4C, D). In addition, signaling pathways such as cell cycle progression (E2F targets, G2M checkpoints), MYC targets, EMT, and TGF-β signaling pathway are enriched in patients with a high CRS (Fig. 4E). Finally, we calculated the scores for 14 cancer-related pathways, and the results showed that most oncogenic pathways were enriched in the high CRS group. Moreover, the hypoxia feature score of the high-risk group was significantly higher than that of the low-risk group (Fig. 4F).

Fig. 4
figure 4

Clinical features and functional enrichment of centrosome-related prognostic signature. A Correlation of low-risk and high-risk groups with clinical features. B Differences in risk scores across clusters and pathologic staging. C, D Gene set enrichment analysis (GSEA) demonstrating enrichment pathways in high-risk and low-risk groups. E GSEA demonstrating Hallmark pathway enrichment results. F Box line plot demonstrating the differences of 14 cancer-related pathways between high and low risk groups

3.5 Characterization of CRS in pan-cancer, mutation and drug sensitivity

To further explore the panorama of CRS in pan-cancer, using a cancer-wide analysis of expression and survival data from 32 cancer types in the TCGA database, we found that tumors originating from the skin (UVM and SKCM) and the secretory glands (PAAD, CHOL) usually have higher CRS, such as those of brain origin (LGG and GBM), thyroid carcinoma (THCA), and kidney smoky cell carcinoma (KICH), which usually have lower CRS (Fig. 5A). Based on the transcriptomes of the two tumor groups with the highest and lowest 30% CRS, the relevant cellular signaling of centrosomes in cancer was investigated by GSEA. The results showed that cell cycle-related pathways (G2M, E2F), genome repair (DNA repair), protein expression (unfolded protein response), and glycolysis were enriched in the high CRS group, while in most of the tumors we found that immune-related pathways (interferon and inflammatory response, IL6 JAK STAT3 signaling) were associated with lower CRS, and at the same time we paid attention to oncogenic pathway EMT pathway was significantly activated in the high CRS group (Fig. 5B). Several studies have shown that mutation patterns play a significant role in tumor progression. Then assessing the level of tumor mutational burden (TMB), we found that the high-risk group had higher TMB scores (Fig. 5C). Prognostic analysis indicates that high TMB levels were associated with favorable outcomes (Fig. 5D). Comparing the mutational landscapes between patients in different risk groups, different mutation rates were observed between the high/low-risk groups (Fig. 5E). Since TMB and CRS may have a synergistic effect on prognosis, we integrated TMB and CRS for a new stratified prognosis analysis. The results showed that patients with high CRS and low TMB had the worst prognosis, while those with low CRS and high TMB had a better prognosis (Fig. 5F). These results indicate that somatic mutations are more frequent in high-risk patients, and the combined use of CRS and TMB may optimize the ability of CRS to predict the prognosis of LUAD patients. Finally, we analyzed the differences in the drug half-maximal inhibitory concentration (IC50) between high- and low-risk groups to predict treatment response, and 5-fluoropyrimidine (chemotherapeutic agent), gefitinib (targeted agent, epidermal growth factor receptor-TK inhibitor), dasatinib (targeted agent, Src/Bcr-Abl inhibitor), and ulitinib (targeted agent, ERK inhibitor) may be more effective for high-CRS group of patients may be more effective. In addition, oxaliplatin (chemotherapeutic agent), MK2206 (targeted agent, AKT inhibitor), BMS.754,807 (targeted agent, IGF-1R/IR inhibitor), and sorafenib (targeted agent, Raf inhibitor) may be more effective for patients in the low CRS group (Fig. 5G).

Fig. 5
figure 5

Mutational landscape of centrosome-related prognostic signature. A Differences in CRS scores and prognostic performance in different cancers. B Pathway enrichment analysis between tumor tissues of high CRS and low CRS groups. C Box-and-line graph plot demonstrating TMB differences between high and low risk groups. D Survival curves of LUAD patients between high and low TMB groups. E Waterfall plot demonstrating the mutational landscape between high and low risk groups. F Survival curve analysis of OS by combining TMB score and risk score. G Drug sensitivity analysis between high and low risk groups

3.6 CRS-based immune landscape and immunotherapy response prediction

The ESTIMATE algorithm was used to analyze the overall infiltration levels between risk groups. Patients in the low CRS group have higher stromal and immune scores, with the difference in immune scores being more significant (Fig. 6A). It is speculated that the poor prognosis in high-risk populations may be due to reduced or ineffective immune cell infiltration (immune-desert or immune-exclusion phenotypes). To further reveal the immune landscape heterogeneity in LUAD patients, we analyzed the distribution differences of immune cells in the TME of LUAD patients using seven immune infiltration algorithms. The results showed that the infiltration scores of B cells, CD8 T cells, and CD4 T cells (TIMER, MCPcounter) was significantly reduced in the high CRS group; and fibroblasts (XCELL), and M2-type macrophages (quantlseq) had higher infiltration in the low CRS group (Fig. 6B). Assessing specific pathways and immune scores by the ssGSEA algorithm, the high CRS group exhibited higher scores in cell cycle, DNA damage repair, and pan-fibroblast TGF-β response signature (Pan-F-TBRS), and the low CRS group exhibited higher cytolytic activity (CYT) scores and T-cell inflammation GEP scores (Fig. 6C). This suggests that patients in the low CRS group not only have a better prognosis but may also benefit more from immunotherapy. Meanwhile, in the low CRS group, there was a significant increase in the infiltration of activated CD8 T cells and activated B cells (Fig. 6D). In addition, the expression of most immune checkpoints is higher in the low CRS group (Fig. 6E). We analyzed the differences in TIDE and IPS scores, and our results suggested that patients in the low CRS group might be more responsive to immunotherapy (Fig. 6F, G). Pathway correlation analysis also found that risk scores were associated with oncogenic pathways (EMT1/2 and WNT target) as well as cell cycle regulation (Fig. 6H). In the cancer-immunity cycle, we found high and low-risk groups with centrosome-related clusters with similar manifestations (Fig. 6I). In summary, patients in different risk groups exhibited significant differences in immune cell infiltration and tumor biological functions, and patients in the high-risk group exhibited immune desert and immunosuppressive phenotypes. Next, we selected three independently published immunotherapy cohorts to further evaluate the accuracy of the risk score in predicting immunotherapy responses. First, we analyzed the urothelial carcinoma cohort (IMvigor210) that received anti-PD-L1 treatment. Compared to the high-risk group, the low-risk group showed better prognostic outcomes (Fig. 7A). Meanwhile, patients in the low-risk group have a higher response rate to immunotherapy (Fig. 7B, C). Moreover, it demonstrated greater predictive capability in Stage I & II patients (Fig. 7D, E). In the Willy Hugo et al. (GSE78220) cohort, we found that the proportion of PD/SD patients is higher in the high-risk group (Fig. 7F–H). In the Martin Lauss et al. (GSE100797) cohort, patients in the low-risk group exhibit excellent prognosis and benefit significantly from immunotherapy (Fig. 7I–K). In the TCGA cohort, using the TIDE algorithm to assess the response of LUAD patients to anti-PD1 and anti-CTLA4 treatments, the bar chart indicates that the low-risk group benefits more from immunotherapy (Fig. 7L). Overall, the prognostic signature constructed based on centrosome-related genes can effectively predict the prognosis and immunotherapy response of LUAD patients.

Fig. 6
figure 6

TME changes in centrosome-related prognostic signatures. A Differences in stromal and immune scores between high and low risk groups. B Heatmap demonstrating differences in immune cell infiltration between high and low risk groups as assessed by multiple immune infiltration algorithms. C Score changes for specific gene sets assessed by the GSVA algorithm. D Differences in expression abundance of 28 infiltrating immune cell types between high and low risk groups. E Differences in expression of immune checkpoint genes between high and low risk groups. F, G Differences in TIDE and IPS scores between the two groups. H Correlation between risk scores and tumor-related function. I Box plots show differences in cancer-immunity cycle scores between high and low risk groups

Fig. 7
figure 7

CRS predicts immunotherapy response. A Prognostic differences between risk score groups in the IMvigor210 cohort; B differences in risk scores for immunotherapy response in the IMvigor210 cohort; C distribution of immunotherapy response in risk score groups in the IMvigor210 cohort; D prognostic differences between risk score groups of patients in early stages of the IMvigor210 cohort; E prognostic differences between risk score groups of late-stage patients in the IMvigor210 cohort. FH Prognostic differences among risk-scoring groups and differences in scores and distribution of immunotherapy responses (GSE782220). IK Prognostic differences among risk-scoring groups, as well as differences in scores and distribution of immunotherapy responses (GSE100797). L Differences in the distribution of immunotherapy responses between the two groups in the TCGA-LUAD cohort

3.7 Characterization of CRS at the single-cell level

We utilized the single-cell sequencing data of LUAD from Bischoff et al. Using marker genes for different cell types, we categorized the cells into 10 major clusters: T cells, NK cells, macrophages, B cells, epithelial cells, dendritic cells, endothelial cells, mast cells, tumor cells, and fibroblasts (Fig. 8A). To investigate the specific role of CRS in the TME at the single-cell transcriptome level, we further quantified CRS scores at the scRNAseq level using the AddModuleScore, singsore, AUcell, and Ucell algorithms. All algorithms showed higher CRS scores for epithelial cells, tumor cells, and fibroblasts, and lower CRS scores for T, B, and NK cells (Fig. 8B, C). We analyzed the expression patterns of CCT6A, SRM, LDHA, GNPNAT1, UCK2, MELTF, RCCD1, DBF4, MCM2, and ANLN in different cell types, and we found that SRM, CCT6A, and LDHA were broadly expressed in immune cells such as in dendritic cells, NK cells, macrophages, and fibroblasts with high levels of expression (Fig. 8D, E). Similar results were also obtained in the NSCLC dataset in the TISCH2 database (Supplementary Fig. 1A–C), in addition we found that in Treg cells, Tprolif cells and CD8Tex, suggesting that SRM, CCT6A and LDHA may be involved in regulating the formation of the immunosuppressive microenvironment. To capture the gene expression at single-cell resolution and its correlation with the functional status of cancer in LUAD, we used the GSE85534 dataset, which was analyzed through the CancerSEA database, and as shown the expression of CCT6A and SRM was positively correlated with the functional status of proliferation, cell cycle, DNA damage, and so on in LUAD (Fig. 8F). Using the CRS model formula, we calculated the CRS of each cell in the single-cell transcriptome and classified tumor cells into high-risk scoring and low-risk scoring groups and investigated their interactions with other cell types in the TME. The results showed that tumor cells with high-risk scores communicated with more types of TME cells and acted as stronger receivers and influencers in TGFb signaling, Notch signaling, EPHB signaling, and HGF signaling. Thus, they may regulate tumor cell proliferation, differentiation, adhesion and migration, and have an impact on cancer cell survival (Fig. 8G–J). Macrophages play an important role as TMEs and most CRGs were highly expressed in macrophages, so we extracted macrophages for further analysis, by reclustering and annotating them by dimensionality reduction and labeled CD163 + macrophages (Fig. 8K). Then by macrophage trajectory analysis, we found that CD163 + macrophages were at a later stage of differentiation time and were proportionally higher in states 2 and 3 (Fig. 8L). Meanwhile, the expression of CCT6A, SRM, and LDHA was also upregulated with a differentiation time expression, consistent with the differentiation trajectory of CD163 + macrophages (Supplementary Fig. 2D).

Fig. 8
figure 8

Characterization of CRS at the single-cell level. A t-SNE profiles of cells in all scRNA-seq samples, annotated and colored by cell type. B, C Bubble plots and violin plots showing the enrichment scores for the CRS gene set for each cell type using AUCell, UCell, singscore, and AddModulescore. D, E Expression of model genes in different cell types. F Correlation between gene expression and different functional states in LUAD. GJ Circos plots showing multiple signaling pathway networks, as well as heat maps showing the roles of different cell types in the pathway networks. K t-SNE mapping of CD163+ macrophages and other cell types. L Macrophage trajectory analysis

3.8 Comprehensive analysis of key genes

To further characterize hub genes, survival analysis was performed on ten modeled genes, and nine genes except DBF4 were found to have poor overall survival (OS) based on Cox regression analysis (Supplementary Figure S3A–J). Subsequently, we examined their expression between normal and tumor samples and found that they were all significantly upregulated in tumors (Supplementary Figure S3K). We also analyzed the potential pathways associated with each risk gene. A total of 39 pathways were significantly associated with these nine genes, including angiogenesis, apical junctions, and apoptosis (Fig. 9A, B). Our data showed that RCCD1, MCM2, SRM, CCT6A, and UCK2 were significantly negatively correlated with the stromal score, the immunity score, and the estimated score; however, MELTF was significantly positively correlated with the immunity score (Fig. 9C). Correlation analysis showed that UCK2, CCT6A, SRM, GNPNAT1, and RCCD1 were significantly negatively correlated with T-cell infiltration, and CCT6A and SRM, similar to RCCD1, may be involved in the composition of the immunosuppressive microenvironment (Fig. 9D). And SRM was significantly positively correlated with Treg cells. In addition, LDHA, GNPNAT1, and UCK2 were significantly positively correlated with M0 macrophages and neutrophils (Fig. 9E). In addition, we calculated the relationship between 28 immune cells and 9 genes by the ssGSEA algorithm and found that all were significantly positively correlated with Th2 cells and most of them were significantly negatively correlated with CD8 T cells (Supplementary Figure S4A–I). In addition, we analyzed the correlation between 9 genes and 14 identified centrosome genes, and the results showed that almost all of them were positively correlated with the expression of centrosome genes, and significantly positively correlated with both key centrosome-encoding genes PLK4 and SASS6 (Fig. 9F). Then, PPI network analysis was performed to identify hub genes for centrosome-related prognostic signatures, and we found that SRM was associated with most of the modeled genes, combined with the above results SRM was identified as a hub gene (Fig. 9G). Finally, we further analyzed the clinical value of SRM expression in LUAD. The ROC curve showed that SRM had a strong diagnostic ability in LUAD (AUC = 0.921) (Fig. 9H), while the difference in expression of SRM between tumor tissues and paired normal tissues was significant (Fig. 9I); and the level of mRNA expression of SRM increased with the increase of pathological stage (Fig. 9J–L). The results obtained from the TCGA database were validated using RT-qPCR experiments using normal and LUAD cells. The results showed that SRM expression was significantly increased in LUAD cells (A549, H1299, and H1975) compared to BEAS-2B, with the highest expression in H1299 cells (Fig. 9M). In addition, the IHC results verified that SRM expression was significantly higher in LUAD samples than in adjacent normal lung tissues (Fig. 9N) (Supplementary Figure S4J), and again confirmed by data from the CPTAC database (Fig. 9O).

Fig. 9
figure 9

Comprehensive analysis of key genes. A Heatmap of gene-pathway correlations. B Heatmap of enrichment scores for key pathways. C Correlation of 9 prognostic signatures with TME score. D, E Correlation of 9 prognostic signatures with immune cells (Spearman). F Correlation analysis of 9 prognostic signatures with 14 centrosome genes. G Modeling genes constructed in the PPI network. H ROC curves show the sensitivity and specificity of SRM for diagnosing lung adenomas. I Expression differences between tumor and normal tissue paired samples in the TCGA cohort. JL Differences in the expression of SRM in different stages and pathologic stages (T, N). M RT-qPCR validation of SRM expression. N The typical IHC staining images of SRM in normal lung tissues and LUAD tissues. O Comparison of LUAD and normal tissue SRM protein expression in CPTAC database

4 Discussion

Currently, LUAD remains the leading cause of death and health risks worldwide [48]. As a highly heterogeneous disease, LUAD exhibits extremely variable outcomes among individuals [49, 50]. With the adoption of various therapeutic strategies such as molecularly targeted drugs, surgery, radiotherapy, and immunotherapy, the 5-year survival rate has improved dramatically. In recent years, studies in genomics, transcriptomics, and proteomics, as well as single-cell multi-omics data, have deepened our understanding and knowledge of LUAD [51,52,53,54]. A novel HIF1α-STIL-FOXM1 axis regulating tumor metastasis was identified in a study by Wang et al. [22], STIL was associated with the regulation of centrosome integrity and mitotic spindle organization [55]. In lung cancer, STIL is highly expressed and highly correlated with poor prognosis, and STIL overexpression activates the EMT pathway. However, most of these studies have focused primarily on cellular functions and specific molecular targets, with few studies focusing on the role of certain subcellular structures.

In this study, we identified 351 CRGs that were differentially expressed between normal and LUAD tissues by the WGCNA algorithm and screened for prognostically relevant CRGs using univariate Cox regression. Consensus clustering identified two centrosome-related subtypes. Significant differences in survival prognosis, immune landscapes, and biological processes were observed between the subgroups. We then identified candidate genes for centrosome-related prognostic signatures by Lasso regression and multivariate Cox regression, including LDHA, GNPNAT1, MELTF, CCT6A, RCCD1, ANLN, UCK2, MCM2, DBF4, and SRM.

4.1 Biological functions of model genes

Abnormal enhancement of glycolytic metabolism is one of the key characteristics of cancer cells, with LDHA being a major key enzyme [56]. GNPNAT1 is a key regulatory enzyme in the hexosamine biosynthesis pathway (HBP), and the end products of HBP are involved in glycosylation modification [57, 58]. MELTF plays a role in various metabolic processes and has been identified as a serum biomarker for several types of cancer [59]. CCT6A regulates cytoskeletal folding, and CCT6A can inhibit the transcription factor SMAD2 downstream of the TGF-β pathway, thereby suppressing NSCLC metastasis [60]. RCCD1 is a tumor promoter in a variety of cancers, playing a role in α-tubulin acetylation and microtubule stabilization, thereby promoting cell proliferation and metastasis [61]. ANLN is an actin-binding protein that is significantly upregulated in various tumor tissues and enhances malignant tumor phenotypes [62]. In previous studies, ANLN was incorporated into mitotic spindle-related features [63].

UCK2 is a key rate-limiting enzyme in pyrimidine metabolism (salvage synthesis), upregulated in various cancers and promoting tumor progression through multiple oncogenic pathways [64]. UCK2 can support tumor proliferation by catalyzing sufficient nucleotides [65]. MCM2 belongs to the microchromosome maintenance (MCM) gene family, which plays a central role in the regulation of the initiation of DNA replication [66]. DBF4 and CDC7 interact to form an active DDK (DBF4-dependent kinase) complex, where DBF4 serves as the regulatory subunit of DDK, playing a critical role in DNA replication and cell proliferation [67]. Interestingly, research shows that MCM2 is a target regulated by Cdc7-Dbf4 during the initiation of DNA synthesis, and DDK regulates the initiation of DNA replication through the phosphorylation of the MCM hexameric complex to promote helicase activation [68, 69].

Subsequently, we assessed the prognostic prediction potential of the centrosome-related prognosis model in the TCGA cohort and four external validation cohorts. By using the median CRS value, we categorized LUAD patients into high-risk and low-risk groups. We analyzed the clinical characteristics, mutation landscape, and differences in drug sensitivity between the high and low CRS groups, and explored the differences in biological processes and the immune microenvironment of LUAD patients. Notably, the biological process changes in the high-risk group of patients are not only related to the cell cycle, genomic repair, and protein expression but also associated with various oncogenic signaling pathways such as mTORC1 signaling, epithelial–mesenchymal transition, and the TGF-β signaling pathway. Next, we predicted the differences in response efficacy to immunotherapy between the different risk groups, and by comprehensive analysis of modeled genes, SRM was identified hub gene, which is highly expressed in LUAD cells and clinical tissues, and significantly affects the prognosis and immune status of patients with LUAD. As the key enzyme in spermidine synthesis, the molecular mechanisms by which SRM affects cancer progression have not yet been fully studied, and we found that nine genes, including SRM, were significantly positively correlated with centrosome-related genes, and their interactions remain to be further investigated.

Recent studies have shown that centrosomal dysfunction (mainly consisting of numerical and structural abnormalities) promotes the secretion of multiple inflammatory factors and drives aging and tumor immune escape [11]. Centrosome amplification is the most common centrosome defect in tumors and is often accompanied by aneuploidy, multipolar mitosis, and chromosomal instability (CIN) [70]. In addition, recent studies have found that multiple centrosomes enhanced the migratory capacity and immune cell effector function of dendritic cells, and dendritic cells with additional centrosomes had a stronger capacity to secrete inflammatory cytokines and T-cell responses [71]. Therefore, we focused on centrosome-related gene profiles altered in TME, with multiple anti-tumor immune cells such as B-cells, dendritic cells, CD4 T-cells, and CD8T-cells infiltration being significantly down-regulated in the high-risk group. Correlation analysis of model genes also revealed that RCCD1 and SRM were significantly negatively correlated with multiple immune-related pathways (inflammatory response, IL2 STAT5 signaling, and IL6 JAK STAT3 signaling), which correlates with the immunosuppressive state of patients in the high-risk group. In previous studies, SRM has been mostly focused on the study of polyamine metabolism, which is closely related to the cell cycle and significantly affects cell proliferation and growth in a variety of tumors [72]. In addition, we noticed an upregulation of TP53 mutation rate in the high CRS group, and the mutational status of the TP53 gene plays an important role in the tumor immune microenvironment. According to recent studies, TP53-mutated tumor cells may promote the aggregation of tumor-associated macrophages through the secretion of interleukin 34 (IL-34), which suppresses the anti-tumor immune response of CD8 T cells [73]. This mechanism suggests that TP53 mutant tumors may require more personalized immunotherapy regimens.

Given the heterogeneity in the immune status of patients with different risk scores, we analyzed the differences in responses to immunotherapy among different groups of patients. We found that patients in the high-risk group were resistant to immunotherapy. We also validated the stability of CRS-predicted immunotherapy in other immunotherapy cohorts, including urothelial carcinoma (IMvigor210), melanoma cohorts treated with anti-PD-1 immunotherapy (GSE78220), and melanoma cohorts treated with adoptive T-cell therapy (ACT) (GSE100797). These results indicate that the high-risk population is resistant to immunotherapy and has a lower response rate. Additionally, we calculated the IC50 of several common chemotherapeutic drugs for LUAD, such as 5-fluorouracil and gefitinib, and found that high-risk patients might be more sensitive to these chemotherapeutic drugs.

At the bulk transcriptome and single-cell transcriptome level, we first noted that the high-risk group of genes in the EPIC algorithm had enriched cancer-associated fibroblasts (CAFs) and the CRS quantified by multiple single-cell algorithms was highest in tumor cells and fibroblasts, which represents a possible important role of fibroblasts in the TME in influencing LUAD progression and prognosis. In addition, the function of macrophages in TME is multifaceted, both in promoting tumor progression and in suppressing tumors by mediating tumor cell killing and activating adaptive immune responses. CD163 + macrophages, as a specific type of tumor-associated macrophages (TAMs), have a particularly important role in the TME. In the late stage of tumor progression, TAMs play an immunosuppressive role in the TME by releasing factors such as IL-1 and TGF-β, which promote tumor metastasis and remodeling of the extracellular matrix (ECM), and by secreting factors such as IL-10, TGF-β, and Prostaglandins, which promote Treg expansion and inhibit T cell metabolism [74]. And the expression of CCT6A, SRM and LDHA was correlated with the differentiation trajectory of CD163+ macrophages, suggesting that they may play an important role in macrophage differentiation and function. In addition, GSEA enrichment revealed that the EMT pathway was significantly enriched in the high CRS group and was significantly correlated with the GSVA enrichment scores of CRS with EMT signature gene sets, and cellular communication analysis also revealed that multiple EMT-related pathways play more important functions in high CRS tumor cells, thus CRS is closely associated with may be closely associated with EMT and metastatic processes in LUAD. In addition, the specific relationship between centrosome abnormalities and TME and the regulatory mechanisms need to be further explored.

However, our study still has some limitations. First, this study primarily analyzed public databases, so inherent case selection bias may have affected the results, and further validation in larger prospective trials is needed to assess the value of centrosome-related prognostic models and the potential for clinical applicability. Second, due to the limited sample size, large cohort studies are essential to assess the value of the model. Third, we investigated differences in response to CRS-predicted immunotherapy using cohorts such as IMvigor210, and the applicability to patients with LUAD remains to be further validated in clinical trials in LUAD. Fourth, further experimental studies are needed to elucidate the biological functions of these genes, for example, to explore the possible link between centrosome-related genes and centrosome dysfunction. Finally, we recognize that transferring CRS from bioinformatics analysis to clinical practice is a complex process with many hurdles to overcome. However, with a deeper understanding of the tumor microenvironment and advances in sequencing technology, we believe that CRS has the potential to become an important tool for guiding clinical diagnosis and prognostic assistance.

In summary, we developed a new centrosome-related prognostic model to predict the prognosis of patients with LUAD, which can be used to classify the prognosis of patients with LUAD. It contributes to a better understanding of the centrosome-related features of LUAD, and provides new insights into targeting and immunotherapy.