Comprehensive analysis of scRNA-seq and bulk RNA-seq data via machine learning and bioinformatics reveals the role of lysine metabolism-related genes in gastric carcinogenesis

Abstract

Background

Gastric cancer (GC) is a highly aggressive and heterogeneous cancer with extremely complex biological characteristics. Lysine and its metabolism are closely related to human cancer, but little is known about how lysine metabolism-related genes contribute to gastric carcinogenesis.

Methods

The roles of lysine metabolism-related genes in GC were investigated by in-depth analysis of single-cell RNA sequencing (scRNA-seq) and bulk RNA sequencing (RNA-seq) data via machine learning and multiple bioinformatics methods and confirmed by multiple cell and molecular biology methods.

Results

By systematically analyzing the heterogeneity of GC cells and interactions among cell subtypes, two key genes, solute carrier family 7 member 7 (SLC7A7) and vimentin (VIM), were innovatively identified as lysine metabolism-related genes involved in gastric carcinogenesis. The potential functional mechanisms involved immune infiltration, signaling pathway regulation, drug sensitivity, molecular regulatory networks, tumor regulatory genes, and metabolic pathways. A reliable prognostic risk nomogram was established for GC prognosis prediction. Moreover, the expression of the lysine metabolism-related genes SLC7A7 and VIM and their effect on cellular phenotypes in gastric carcinogenesis were verified in clinical samples and in vitro experiments, including cell proliferation, migration, invasion and cell cycle assays.

Conclusions

We explored the role of lysine metabolism-related genes and prognostic models in GC with multiple datasets, providing novel metabolic targets.

Background

Gastric cancer (GC) is one of the most common malignant tumors of the digestive system and ranks fifth in incidence and fifth in mortality worldwide [1]. Stomach adenocarcinoma (STAD) is the most common pathological type of GC and poses a serious threat to human health [2, 3]. At diagnosis, more than half of all GC patients are already in the advanced stage, at which point the optimal treatment window for surgical intervention has passed. Current clinical treatments for advanced GC include surgery, radiotherapy, chemotherapy, and immunotherapy. Although GC patients can benefit from these treatments, their 5-year survival rate is still less than 30% [4]. The factors leading to a poor prognosis for patients with GC include unclear early symptoms, genetic heterogeneity, and drug resistance. In addition, the unclear pathogenesis of GC greatly limits the effects of clinical molecular drug therapy. Therefore, exploring GC pathogenesis and related molecular targets is crucial for GC treatment and prognosis evaluation.

Metabolic reprogramming promotes rapid cell growth and proliferation by regulating energy metabolism and has been recognized in recent years as an important feature of cancer, as it shows great potential as a marker for cancer pathogenesis, early diagnosis, differential diagnosis, staging, and prognostic judgment [5,6,7]. Amino acid metabolism in cancer not only serves as a substrate for protein synthesis and allows tumors to maintain their proliferation but also plays a role in energy production, nucleoside synthesis, and the maintenance of cellular redox homeostasis [8].

Lysine is an essential amino acid. In recent years, studies have revealed that lysine metabolism is associated with cancer occurrence and development, and its role in cancer development has come into focus [9]. Lysine metabolism regulates the metabolic plasticity of cancer cells not only through posttranslational modifications of proteins such as acetylation and methylation but also through histone crotonylation to modulate cancer immunity [10, 11]. Furthermore, lysine-222 succinylation suppresses the lysosomal degradation of lactate dehydrogenase A and promotes cancer cell invasion and proliferation [12]. However, little is known about how lysine metabolism-related regulatory genes contribute to gastric carcinogenesis.

Therefore, in this study, we conducted a comprehensive investigation of lysine metabolism-related regulatory genes in GC by in-depth analysis of single-cell RNA sequencing (scRNA-seq) data combined with bulk RNA sequencing (RNA-seq) data, multiple bioinformatics methods, and molecular biology methods. Our research scheme is shown in Fig. 1. First, we conducted single-cell level analysis, cell subpopulation annotation, and intercellular communication analysis on the scRNA-seq data and identified key genes related to lysine metabolism in GC by utilizing bulk RNA-seq data and machine learning algorithms. Solute carrier family 7 member 7 (SLC7A7) and vimentin (VIM) were innovatively identified as key lysine metabolism-related genes involved in gastric carcinogenesis. Next, we explored the potential role of lysine metabolism-related genes SLC7A7 and VIM involved in gastric carcinogenesis, including immune infiltration, signaling pathway regulation, drug sensitivity, transcriptional regulatory networks and microRNA (miRNA) networks, and correlation analysis of tumor regulatory genes as well as the relationships among SLC7A7 and VIM, metabolism-related genes, and metabolic pathways. Moreover, SLC7A7 and VIM and important clinical indicators were selected to construct a prognostic risk model, and its clinical predictive performance was assessed. Finally, the expression of SLC7A7 and VIM and their effect on cellular phenotypes in gastric carcinogenesis were verified in clinical samples and in vitro experiments. In this study, we explored the role of lysine metabolism-related genes in gastric carcinogenesis and constructed prognostic models for GC with multiple datasets, providing novel metabolic targets.

Fig. 1

Research scheme. Comprehensive analysis of scRNA-seq and bulk RNA-seq data via machine learning and bioinformatics reveals the role of lysine metabolism-related genes in gastric carcinogenesis

Materials and methods

Data acquisition

The Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/info/datasets.html) is a gene expression database created and maintained by the National Center for Biotechnology Information (NCBI), USA. A single-cell data file for GSE163558 was downloaded from the NCBI GEO public database, and data from four samples (GSM5004180, GSM5004181, GSM5004182 and GSM5004183) with complete single-cell expression profiles were obtained for single-cell analysis. The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/), the largest database of cancer gene information, contains data on gene expression, miRNA expression, copy number variation, DNA methylation, single-nucleotide polymorphism (SNP), etc. The raw mRNA expression data of the processed STAD dataset were downloaded, including those from the normal control group (n = 36) and cancer group (n = 412). The Human Protein Atlas provides immunohistochemical staining images of gastric cancer or normal gastric tissues and cell location information.

Single-cell analysis

The expression profiles were first imported into the Seurat package, and low-quality cells were filtered out (cells with nFeature_RNA > 50 and percent.mt < 20 were retained). The data were sequentially processed for normalization, homogenization, and principal component analysis (PCA), and the optimal number of PCs was determined via ElbowPlot. The number of PCs was selected on the basis of two considerations: the overall trend of the points in the elbow plot and the distance between the points. Specifically, the optimal cutoff was taken as the point at which a PC contributes less than 5% of the standard deviation and the cumulative contribution reaches 90% of the standard deviation. The positional relationship between the clusters was subsequently obtained by t-distributed stochastic neighbor embedding (t-SNE) analysis. The clusters were annotated by the Celldex package according to whether the cell cluster has an important relationship with tumorigenesis. Genes with logFC > 0.585 and adjusted p value < 0.05 were selected as unique marker genes for each cell subtype [13, 14].
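
The quality-control thresholds and the PC cutoff rule described above can be illustrated with a minimal Python sketch. The study itself used the Seurat R package; the function names, the toy cell records, and the exact reading of the cutoff rule (the first PC that contributes less than 5% of the total standard deviation while the cumulative contribution exceeds 90%) are assumptions made here for illustration only.

```python
def passes_qc(cell, min_features=50, max_percent_mt=20.0):
    """Keep a cell with more than `min_features` detected genes and
    less than `max_percent_mt` percent mitochondrial reads."""
    return cell["nFeature_RNA"] > min_features and cell["percent_mt"] < max_percent_mt


def choose_n_pcs(stdevs, min_contrib=5.0, cum_cutoff=90.0):
    """Return the 1-based index of the first PC whose contribution (in percent
    of the summed standard deviations) is below `min_contrib` while the
    cumulative contribution exceeds `cum_cutoff`."""
    total = sum(stdevs)
    cumulative = 0.0
    for i, sd in enumerate(stdevs, start=1):
        pct = 100.0 * sd / total
        cumulative += pct
        if pct < min_contrib and cumulative > cum_cutoff:
            return i
    return len(stdevs)  # fall back to all PCs if the rule is never met


# Hypothetical cells: only c1 satisfies both thresholds.
cells = [
    {"id": "c1", "nFeature_RNA": 1200, "percent_mt": 4.5},
    {"id": "c2", "nFeature_RNA": 30, "percent_mt": 2.0},
    {"id": "c3", "nFeature_RNA": 900, "percent_mt": 35.0},
]
kept = [c["id"] for c in cells if passes_qc(c)]

# Hypothetical per-PC standard deviations (they sum to 100 for readability).
toy_stdevs = [30, 20, 15, 10, 8, 6, 4, 3, 2, 2]
n_pcs = choose_n_pcs(toy_stdevs)
```

With these toy values the seventh PC is the first to contribute less than 5% once the cumulative contribution has passed 90%, so `n_pcs` is 7.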

Ligand‒receptor interaction analysis

CellChat enables quantitative inference and analysis of intercellular communication networks from single-cell data and uses network analysis and pattern recognition methods to predict the main signal inputs and outputs of cells, as well as how these cells and signals coordinate their functions [15]. In this analysis, normalized single-cell expression profiles were used as input data, and cell subtypes derived from single-cell analysis were used as cellular information to analyze cell-related interactions and quantify the closeness of interactions in terms of weights and counts of interactions between cells to observe the activity and impact of each type of cell in carcinogenesis.
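
As a rough illustration of the counting step only, the sketch below sums outgoing and incoming interaction counts per cell type and picks the most connected one. It does not reproduce CellChat's communication-probability model; the function name and the toy count matrix are hypothetical.

```python
def interaction_totals(counts):
    """Sum outgoing and incoming interaction counts for each cell type.
    `counts[sender][receiver]` is the number of ligand-receptor pairs."""
    totals = {}
    for sender, row in counts.items():
        for receiver, n in row.items():
            totals[sender] = totals.get(sender, 0) + n
            totals[receiver] = totals.get(receiver, 0) + n
    return totals


# Hypothetical counts, not values from the study.
toy_counts = {
    "Macrophage": {"Fibroblast": 12, "T cell": 9},
    "Fibroblast": {"Macrophage": 10, "T cell": 3},
    "T cell": {"Macrophage": 6, "Fibroblast": 2},
}
totals = interaction_totals(toy_counts)
most_connected = max(totals, key=totals.get)
```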

Random survival forest analysis

The RandomForestSRC package was used for feature selection, and the random survival forest algorithm was applied to rank the importance of related differentially expressed genes (DEGs). Nrep = 100 indicates that the number of iterations was 100 in the Monte Carlo simulation. Genes with relative importance > 0.2 were identified as the final marker genes.
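
The thresholding step can be sketched as follows. This assumes that "relative importance" means each gene's variable importance divided by the largest importance in the set, which is an assumption rather than something stated in the text; the gene names and importance values are placeholders, and the random survival forest itself is not fitted here.

```python
def select_by_relative_importance(vimp, threshold=0.2):
    """Scale importances by the maximum (assumed definition of relative
    importance) and return genes above `threshold`, most important first."""
    top = max(vimp.values())
    relative = {gene: value / top for gene, value in vimp.items()}
    selected = sorted(
        (gene for gene, r in relative.items() if r > threshold),
        key=lambda gene: relative[gene],
        reverse=True,
    )
    return selected, relative


# Placeholder importances, not results from the study.
toy_vimp = {"GENE_A": 0.040, "GENE_B": 0.022, "GENE_C": 0.006, "GENE_D": 0.001}
selected, relative = select_by_relative_importance(toy_vimp)
```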

Immune infiltration analysis

CIBERSORT, a widely used method for evaluating immune cell types in microenvironments, is based on the principle of support vector regression and performs deconvolution analysis on the expression matrix of immune cell subtypes [16]. It contains 547 biomarkers distinguishing 22 human immune cell phenotypes, including T cells, B cells, plasma cells, and myeloid cell subsets. The CIBERSORT algorithm was used in this study to analyze patient data to infer the relative proportions of 22 types of infiltrating immune cells and conduct Spearman correlation analysis of gene expression and immune cell content.
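
The correlation step can be illustrated with a small, dependency-free Python sketch of the Spearman coefficient (Pearson correlation of average ranks). The CIBERSORT deconvolution is not reproduced; the expression values and immune cell fractions below are invented for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample gene expression and immune cell fraction.
expression = [2.1, 3.4, 5.0, 6.2, 8.9]
cell_fraction = [0.02, 0.05, 0.04, 0.10, 0.15]
rho = spearman(expression, cell_fraction)
```

For these five toy samples only one pair of ranks is swapped, giving rho = 0.9.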

Gene Set Enrichment Analysis (GSEA)

GSEA uses predefined gene sets to rank genes according to their differential expression levels in two types of samples and then tests whether the predefined gene set is enriched at the top or bottom of this ranked list [17]. In this study, the differences in Kyoto Encyclopedia of Genes and Genomes (KEGG) signaling pathways between the high-expression group and low-expression group were compared to explore the molecular mechanisms of key genes by GSEA. The number of permutations was set to 1000, and the type of permutation was set to phenotype.
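
The running-sum statistic behind this test can be sketched as below. This is a simplified version of the weighted enrichment score and omits the 1000 phenotype permutations used to assess significance; the gene names and ranking scores are hypothetical, and the function assumes the gene set has at least one hit and one miss in the ranked list.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Walk down the ranked list, stepping up at genes in the set (weighted by
    |score|) and down otherwise; return the maximum deviation from zero."""
    in_set = [gene in gene_set for gene in ranked_genes]
    hit_norm = sum(abs(s) ** weight for s, hit in zip(scores, in_set) if hit)
    miss_norm = len(ranked_genes) - sum(in_set)
    running, best = 0.0, 0.0
    for s, hit in zip(scores, in_set):
        if hit:
            running += abs(s) ** weight / hit_norm
        else:
            running -= 1.0 / miss_norm
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking: G1 is most upregulated in the high-expression group.
ranked = ["G1", "G2", "G3", "G4", "G5", "G6"]
scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = enrichment_score(ranked, scores, {"G1", "G2"})
es_bottom = enrichment_score(ranked, scores, {"G5", "G6"})
```

A set concentrated at the top of the list yields a positive score, and one concentrated at the bottom yields a negative score.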

Drug sensitivity analysis

The R package pRRophetic was used to predict the chemotherapy sensitivity of each cancer patient sample based on the pharmacogenomic database Genomics of Drug Sensitivity in Cancer (GDSC, https://www.cancerrxgene.org/). Regression methods were used to obtain estimated half-maximal inhibitory concentration (IC50) values for each specific chemotherapy drug, and 10-fold cross-validation was conducted on the GDSC training set to assess the accuracy of the regression and predictions. For all parameters, the default values were selected, including “combo”; this setting enables the removal of batch effects by using the average expression value for duplicate genes.

Transcriptional regulation analysis of key genes

The Cistrome DB (http://cistrome.org/db) is a comprehensive database used for researching chromatin immunoprecipitation-sequencing (ChIP-seq) and DNase-sequencing (DNase-seq) data. This study explored the regulatory relationships between transcription factors and key genes through the Cistrome DB; the genome file was set to hg38, the transcription start site was set to 10 kb, and visualization was carried out via Cytoscape.

MiRNA analysis

miRNAs are small noncoding RNAs that regulate target gene expression by promoting the degradation of mRNAs or inhibiting their translation. Therefore, this study further examined whether miRNAs associated with the key genes regulate the transcription or degradation of risk-associated genes. miRNAs related to key genes were obtained from the miRcode database (http://www.mircode.org/index.php), and the miRNA regulatory network of key genes was visualized via Cytoscape.

Nomogram model construction

The nomogram was constructed via regression analysis. Its predictive value was assessed by constructing a multivariate regression model, assigning a score to each level of influential factors according to their contribution to the outcome variable, and then summing the individual scores to obtain a total score.
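
The scoring idea can be sketched in Python as follows. This is a generic nomogram-style point assignment, assuming the predictor with the largest effect range is scaled to 100 points; the coefficients, value ranges, and patient values are hypothetical and are not taken from the fitted model in this study.

```python
def make_nomogram(coefs, ranges):
    """coefs: {name: regression coefficient}; ranges: {name: (low, high)}.
    Returns a function mapping a patient's values to (points per variable, total)."""
    spans = {k: abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs}
    max_span = max(spans.values())  # this predictor spans 0 to 100 points

    def score(patient):
        points = {}
        for name, beta in coefs.items():
            low, high = ranges[name]
            reference = low if beta >= 0 else high  # value that earns 0 points
            points[name] = 100.0 * beta * (patient[name] - reference) / max_span
        return points, sum(points.values())

    return score


# Hypothetical model: coefficients and ranges are placeholders.
score = make_nomogram(
    coefs={"age": 0.03, "stage": 0.5, "VIM": 0.2},
    ranges={"age": (30, 90), "stage": (1, 4), "VIM": (0, 8)},
)
points, total = score({"age": 60, "stage": 3, "VIM": 4})
```

In a real nomogram the total score is then mapped to a predicted probability through the fitted model; that mapping is omitted here.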

Clinical specimen collection

A total of 46 paired GC tissues and adjacent normal tissues were collected from surgical patients at The First Affiliated Hospital of Ningbo University between 2013 and 2022. None of the patients received radiotherapy or chemotherapy before surgery, and all cancer tissues were confirmed by pathological examination. The tissues were immersed in RNA-fixer Reagent (Bioteke, Beijing, China) and then stored at -80 °C until use. This study was approved by the Human Research Ethics Committee of Ningbo University and the ethics committee of the First Affiliated Hospital of Ningbo University (IRB No. 20120303 and No. KY20220101). Written informed consent was obtained from the patients.

RNA extraction and RT‒PCR detection

RNA was extracted from tissues and cells using TRIzol (Ambion, Carlsbad, CA) and reverse transcribed into cDNA using the GoScript Reverse Transcription System (Promega, Madison, WI, USA). Quantitative real-time reverse transcription‒polymerase chain reaction (RT‒qPCR) was performed using GoTaq qPCR master mix (Promega). The reaction conditions were as follows: predenaturation at 95°C for 5 min, then 40 cycles of denaturation at 95°C for 15 s, annealing at 54°C for 30 s, and extension at 72°C for 30 s. Glyceraldehyde 3-phosphate dehydrogenase (GAPDH) was used as the control. The primers used were as follows: 5’-CCTGCTTATATCCAGGACCAA-3’ (sense) and 5’-GGCCACTTCATACTCAGTGCT-3’ (antisense) for SLC7A7; 5’-AGTCCACTGAGTACCGGAGAC-3’ (sense) and 5’-CATTTCACGCATCTGGCGTTC-3’ (antisense) for VIM; and 5’-ACCCACTCCTCCACCTTTGAC-3’ (sense) and 5’-TGTTGCTGTAGCCAAATTCGTT-3’ (antisense) for GAPDH.

Cell culture

Human GC cell lines AGS (Cellosaurus ID: CVCL_0139), HGC-27 (Cellosaurus ID: CVCL_1279) and NUGC-3 (Cellosaurus ID: CVCL_1612) and the normal gastric mucosal epithelial cell line GES-1 (Cellosaurus ID: CVCL_EQ22) were purchased from the Cell Bank of the Chinese Academy of Sciences or Shanghai Institute of Biochemistry and Cell Biology. Cell lines were authenticated by short tandem repeat (STR) profiling and tested negative for mycoplasma. The cells were cultured in RPMI-1640 medium (Invitrogen, Grand Island, NY, USA) supplemented with 10% fetal bovine serum (FBS) and incubated at 37 °C with 5% CO2.

Cell transfection

The cells were seeded into 6-well plates at a density chosen to reach 30–60% confluence within 24 h. Lipofectamine 2000 (Invitrogen, Carlsbad, CA, USA) was used for the cell transfection experiments, and the transfection efficiency was verified by PCR. siRNA oligos were synthesized by Shanghai GenePharma Co., Ltd. The si-SLC7A7 sequences were as follows: the sense strand 5ʹ-GCCUGCAUUUGUCUCUUAATT-3ʹ and the antisense strand 5ʹ-UUAAGAGACAAAUGCAGGCTT-3ʹ. The si-VIM sequences were as follows: the sense strand 5ʹ-GCAGAAGAAUGGUACAAAUTT-3ʹ and the antisense strand 5ʹ-AUUUGUACCAUUCUUCUGCTT-3ʹ. Finally, the sequences corresponding to si-NC were as follows: the sense strand 5ʹ-UUCUCCGAACGUGUCACGUTT-3ʹ and the antisense strand 5ʹ-ACGUGACACGUUCGGAGAATT-3ʹ.
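
As a quick consistency check on the siRNA sequences listed above, the following Python sketch verifies that each antisense strand is the reverse complement of its sense strand once the 3ʹ TT overhang is removed. The helper names are hypothetical; the sequences are copied from the text.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_matched_duplex(sense, antisense, overhang="TT"):
    """True if, after removing the 3' overhang from both strands,
    the antisense strand is the reverse complement of the sense strand."""
    if not (sense.endswith(overhang) and antisense.endswith(overhang)):
        return False
    core_sense = sense[: -len(overhang)]
    core_antisense = antisense[: -len(overhang)]
    return reverse_complement_rna(core_sense) == core_antisense


sirnas = {
    "si-SLC7A7": ("GCCUGCAUUUGUCUCUUAATT", "UUAAGAGACAAAUGCAGGCTT"),
    "si-VIM": ("GCAGAAGAAUGGUACAAAUTT", "AUUUGUACCAUUCUUCUGCTT"),
    "si-NC": ("UUCUCCGAACGUGUCACGUTT", "ACGUGACACGUUCGGAGAATT"),
}
checks = {name: is_matched_duplex(s, a) for name, (s, a) in sirnas.items()}
```

All three duplexes pass this check, so the listed strands are internally consistent.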

Real-time analysis of cell proliferation

After transfecting logarithmic growth phase cells for 24 h, the cells were seeded into 96-well E-plates at a density of 5000 cells/well in a real-time cell analyzer (RTCA; ACEA Biosciences, San Diego, CA, USA). The RTCA Analyzer was programmed according to the manufacturer’s instructions. The RTCA software was used for real-time cell proliferation analysis, and a cell proliferation curve was generated after 96 h of continuous measurements.

Plate colony formation assay

The transfected cells were collected and inoculated into 6-well plates at 500 cells/well. Approximately 2 weeks later, the cell culture was terminated. The plates were washed with phosphate-buffered saline (Solarbio, Beijing, China), fixed with 4% paraformaldehyde (Solarbio) for 20 min, and then stained with 0.1% crystal violet (Solarbio) for 20 min.

Transwell migration and invasion assays

The cells were resuspended in serum-free medium as a single-cell suspension (1–1.5 × 10^5 cells/mL), 200 µL of the suspension was uniformly inoculated into the upper chambers of transwell chambers (Corning, NY, USA), and 600 µL of medium containing 30% FBS was added to the lower chambers. For the invasion assays, the upper chamber was pretreated with Matrigel (100 µg/mL). After 24 h of incubation, the cells were fixed with 4% paraformaldehyde for 20 min and stained with 0.1% crystal violet staining solution for 20 min.

Flow cytometric analysis

Transfected cells were collected after centrifugation and resuspended by adding precooled PBS, and this process was repeated twice. Cell cycle distributions were assessed using PI/RNase Staining Buffer (BD Biosciences, Sparks, MD) with a FACSCalibur flow cytometer (BD Biosciences).

Western blot analysis

Cellular proteins were extracted by a protein extraction kit (Solarbio) and quantified via a bicinchoninic acid (BCA) protein assay kit (Solarbio). After separation by 12% SDS‒polyacrylamide gel electrophoresis, the protein samples were transferred to a polyvinylidene fluoride (PVDF) membrane (Millipore, Billerica, MA). The membranes were blocked and incubated with primary antibody (VIM, Cat No. 80232-1-RR; SLC7A7, Cat No. 84743-1-RR) followed by secondary antibody (Cat no: RGAR001). Finally, the membranes were detected with a Clinx GenoSens 1600 integrated gel imaging analysis system (Clinx, Shanghai, China). All the antibodies were purchased from Proteintech Group.

Statistical analysis

Statistical analyses in this study were performed using R software (version 4.2) or Statistical Product and Service Solutions (SPSS) 19.0 software. The data were analyzed by Student’s t test, one-way ANOVA, Pearson’s correlation or Kaplan‒Meier analysis according to the actual conditions. R software and GraphPad Prism 5.0 (GraphPad Software, La Jolla, CA) software were used to construct the graphs. p < 0.05 was considered to indicate statistical significance.

Results

Single-cell level analysis via scRNA-seq

A single-cell data file for GSE163558 was downloaded from the NCBI GEO public database, and data from four samples (GSM5004180, GSM5004181, GSM5004182 and GSM5004183) with complete single-cell expression profiles were obtained for single-cell analysis. These comprised three primary gastric cancer samples from three different patients and one adjacent non-tumoral control (NT1) from one patient. nFeature_RNA, nCount_RNA and percent.mt were inspected, and the thresholds nFeature_RNA > 50 and percent.mt < 20 were used for preliminary screening of these samples (Fig. 2A-B); the 10 genes with the highest standard deviation are displayed (Fig. 2C). Principal component analysis (PCA) for dimensionality reduction revealed that the batch effect between samples was not significant (Fig. 2D), and the optimal number of principal components (PCs) was 15 according to the elbow plot (Fig. 2E). Finally, a total of 14 subgroups were obtained through t-SNE analysis (Fig. 2F), and the top 10 marker genes in terms of differential expression levels among subtypes were identified (Fig. 2G).

Fig. 2

Single-cell level analysis of gastric cancer via scRNA-seq. (A, B) scRNA-seq quality control was used to identify and exclude low-quality cells (cells with nFeature_RNA > 50 and percent.mt < 20 were retained). (C) The plot shows differences in gene expression in gastric cancer cells. (D) PCA dimensionality reduction revealed no significant batch effect on the cellular distribution of the samples. (E) The optimal number of PCs is shown in ElbowPlot. (F) t-SNE analysis revealed 14 cell subgroups. (G) Heatmap of marker gene expression levels between subtypes

Annotation of single-cell data

In this study, each subtype was annotated by the R package SingleR, and 14 clusters were annotated into 11 cell categories, including neutrophils, NK cells, CD8+ T cells, CD4+ T cells, epithelial cells, DCs, macrophages, B cells, fibroblasts, endothelial cells, and neurons (Fig. 3A). The samples from both the control and cancer groups were included in different cell categories (Fig. 3B). Marker genes specific to each cell subtype were extracted from the single-cell data via the FindAllMarkers function (Supplementary data).

Fig. 3

Annotation of cell types based on single-cell data. (A) Fourteen clusters were annotated into 11 cell categories. (B) Analysis of the proportions of cells in different cell categories

Intercellular communication analysis

Ligand‒receptor relationships of features in single-cell expression profiles were analyzed via the software package CellChat. There were complex interactions among these cell subtypes (Fig. 4A-B). Further statistical analysis of the interaction counts among cell types revealed that macrophages had the highest number of interactions, suggesting that macrophages play a dominant role in intercellular communication and have the closest potential interaction with other cells (Fig. 4C). Therefore, the marker genes of macrophages were ultimately selected as the candidate gene set. The gene set related to lysine was subsequently obtained from the GeneCards database (https://www.genecards.org/), and genes with a relevance score greater than 2 were extracted. The intersection of these genes with the macrophage gene set revealed a total of 60 intersecting genes (Fig. 4D).
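
The filtering and intersection step can be sketched in a few lines of Python. The gene names and relevance scores below are placeholders, and the function name is hypothetical; only the threshold (relevance score greater than 2) comes from the text.

```python
def lysine_macrophage_overlap(relevance_scores, macrophage_markers, min_score=2.0):
    """Keep lysine-related genes with a relevance score above `min_score`
    and intersect them with the macrophage marker genes."""
    lysine_genes = {gene for gene, s in relevance_scores.items() if s > min_score}
    return sorted(lysine_genes & set(macrophage_markers))


# Placeholder inputs, not data from GeneCards or the study.
toy_scores = {"GENE_A": 5.3, "GENE_B": 2.4, "GENE_C": 1.1, "GENE_D": 8.0}
toy_markers = ["GENE_A", "GENE_B", "GENE_C", "GENE_E"]
overlap = lysine_macrophage_overlap(toy_scores, toy_markers)
```

Here GENE_C is a macrophage marker but falls below the score threshold, and GENE_D passes the threshold but is not a marker, so neither appears in the overlap.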

Fig. 4

Intercellular communication analysis. (A) Counts of interactions between cellular subtypes. The sizes of the various colored circles on the periphery indicate the number of cells, with larger circles indicating a greater number of cells. The more ligand‒receptor pairs there are, the thicker the line becomes. (B) Weights and strength of interactions between cellular subtypes. Strength is the sum of probability values. (C) Identification of cellular subtypes with close interactions. (D) Venn diagram showing the intersection of lysine-related genes with macrophage-related genes

Machine learning algorithm for identifying key genes related to lysine metabolism

To further identify the key genes that affect GC, we downloaded the processed mRNA expression data of GC from the TCGA database and extracted 60 intersecting genes for random survival forest analysis. The genes with a relative importance > 0.2 were identified as the final markers, and the order of importance of the 7 genes was obtained (Fig. 5A). A survival analysis of these 7 genes with high importance was subsequently performed (Fig. 5B-H), and the results revealed that the solute carrier family 7 member 7 (SLC7A7) and vimentin (VIM) genes were significantly associated with survival (P < 0.05). Therefore, SLC7A7 and VIM were considered key genes for subsequent studies.
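
A survival comparison of this kind typically rests on the Kaplan‒Meier estimator. The sketch below implements the estimator in plain Python and splits samples into high- and low-expression groups at the median; the median cutoff is an assumption (the text does not state how groups were defined), and the times, events, and expression values are invented.

```python
import statistics


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.
    `events[i]` is 1 for death and 0 for censoring."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored samples leave the risk set
        i += leaving
    return curve


def split_by_median(expression):
    """Assumed grouping rule: above the median is 'high', otherwise 'low'."""
    cutoff = statistics.median(expression.values())
    high = [s for s, v in expression.items() if v > cutoff]
    low = [s for s, v in expression.items() if v <= cutoff]
    return high, low


# Hypothetical follow-up times (months) and event indicators.
curve = kaplan_meier(times=[5, 8, 12, 12, 20], events=[1, 0, 1, 1, 0])
high, low = split_by_median({"s1": 1.0, "s2": 4.0, "s3": 2.5, "s4": 7.0})
```

A log-rank test would then compare the curves of the two groups; it is not included in this sketch.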

Fig. 5

Identification of key lysine-related genes via machine learning algorithms. (A) Random survival forest analysis was used to rank the importance of the intersection genes. Relative importance refers to the mean decrease Gini score. The mean decrease Gini score is calculated on the basis of the reduction in Gini impurity. For classification problems, Gini impurity measures the degree of mixing of different categories within a node. When a feature leads to a significant reduction in Gini impurity during tree splitting, it is considered important. The genes with relative importance > 0.2 were identified as the final markers, and the order of importance of the 7 genes was obtained. (B-H) Survival curves of patients grouped on the basis of the expression levels of 7 genes with high importance (p < 0.05). The solute carrier family 7 member 7 (SLC7A7) and vimentin (VIM) genes were significantly associated with survival

Immune infiltration analysis

In addition to cancer cells, the tumor microenvironment is composed mainly of tumor-associated fibroblasts, immune cells, the extracellular matrix, various growth factors, inflammatory factors, and special physicochemical features, which significantly affect cancer diagnosis, survival outcomes, and the sensitivity of patients to clinical treatment. We analyzed the relationships between SLC7A7 and VIM and immune infiltration in a cancer dataset to explore the mechanism by which SLC7A7 and VIM affect the progression of GC. We analyzed the percentage of immune cells in each patient sample and the correlations among different immune cell types (Fig. 6A-B) and found that the cancer group samples had significantly greater proportions of naive B cells, activated CD4+ T cells, and M0 macrophages than did the normal samples (Fig. 6C). Next, the relationships between SLC7A7 and VIM and immune cells were further explored, and SLC7A7 and VIM were found to be highly correlated with the levels of immune cells. SLC7A7 was significantly positively correlated with the levels of M0 macrophages, M1 macrophages, etc., and significantly negatively correlated with the levels of resting mast cells, resting memory CD4+ T cells, etc. (Fig. 6D). Similarly, VIM was significantly positively correlated with the levels of M2 macrophages, gamma delta T cells, etc., and significantly negatively correlated with the levels of M0 macrophages, resting NK cells, etc. (Fig. 6E). Correlation analysis revealed that SLC7A7 and VIM were not correlated at the expression level (Fig. S1), indicating that the two genes do not confound each other. Moreover, we further explored the biological relevance of the cell type-specific expression patterns of SLC7A7 and VIM (Fig. 6F-H). The above analysis indicated that the key genes SLC7A7 and VIM were closely associated with the level of immune cell infiltration and the immune microenvironment.

Fig. 6

Immune infiltration analysis. (A) Percentages of immune cells in normal controls and cancer patients. (B) Correlation analysis between immune cells. (C) Differences in immune cell counts between cancer and normal control groups. (D) Correlation between immune cells and SLC7A7 expression. (E) Correlation between immune cells and VIM expression. (F) Annotation of SLC7A7 expression in various cells. (G) Annotation of VIM expression in various cells. (H) Distribution proportions of various cells

Signaling pathways involving SLC7A7 and VIM

The signaling pathways associated with SLC7A7 and VIM and the potential molecular mechanisms by which SLC7A7 and VIM affect cancer progression were explored. GSEA revealed that high SLC7A7 expression was enriched in cytokine-cytokine receptor interaction, lysosome, the Nod-like receptor signaling pathway, and the Toll-like receptor signaling pathway (Fig. S2A). Similarly, high VIM expression was associated with the arachidonic acid metabolism, drug metabolism by cytochrome P450, linoleic acid metabolism, and metabolism of xenobiotics by cytochrome P450 pathways (Fig. S2B).

Relationships between SLC7A7 and VIM and drug sensitivity

Surgery combined with chemotherapy is effective in the treatment of GC. Thus, we next assessed drug sensitivity data from the GDSC database, predicted the chemotherapy sensitivity of each cancer patient sample through the R package “pRRophetic”, and further explored the relationships between SLC7A7 and VIM and the sensitivity of common chemotherapeutic drugs. The results revealed that SLC7A7 and VIM expression levels were significantly associated with sensitivity to bortezomib, CMK, cytarabine, dasatinib, elesclomol, and embelin (Fig. S3).

Construction of key gene-related transcriptional regulatory and MiRNA networks

Transcription factors regulating SLC7A7 and VIM were predicted through the online Cistrome DB; a total of 52 and 102 transcription factors were predicted for SLC7A7 and VIM, respectively. Cytoscape was used to visualize a comprehensive transcriptional regulatory network of SLC7A7 and VIM in GC (Fig. S4A). In addition, reverse prediction of miRNAs targeting SLC7A7 and VIM was conducted through the miRcode database and yielded a total of 54 miRNAs and 68 mRNA‒miRNA regulatory pairs, which were all visualized using Cytoscape (Fig. S4B).

Nomogram construction and calibration curve analysis

A nomogram model was further constructed to predict the prognosis of patients with GC. Logistic regression analysis revealed that, across all our samples, the values of different clinical indicators of GC and the expression of SLC7A7 and VIM contributed to the total score to different degrees (Fig. 7A). The calibration curve was used to evaluate the consistency between the predicted and actual probabilities of overall survival (OS) at 3 and 5 years and indicated that the model has good predictive ability (Fig. 7B). Moreover, the GSE13861 dataset was used as an independent validation cohort to verify the prognostic value of the nomogram. The prediction efficiency of the nomogram was further confirmed by constructing a receiver operating characteristic (ROC) curve (Fig. 7C).
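
The area under a ROC curve, the usual summary of prediction efficiency, can be computed directly from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below is a minimal illustration with invented scores and labels; it is not the procedure or the data used for Fig. 7C.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical nomogram risk scores and outcomes (1 = event, 0 = no event).
risk_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
outcomes = [1, 1, 0, 1, 0, 0]
auc = roc_auc(risk_scores, outcomes)
```

In this toy example 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9.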

Fig. 7

Nomogram and calibration curve. (A) A nomogram combining SLC7A7 and VIM with clinical features was constructed to predict the prognosis of patients with GC. SLC7A7 and VIM expression levels were transformed using log2 (TPM + 1). (B) Calibration curves of the nomogram for overall survival prediction at 3 and 5 years. (C) The GSE13861 dataset was used as an independent validation cohort to verify the prognostic value of the nomogram. The prediction efficiency of the nomogram was further confirmed by constructing a receiver operating characteristic (ROC) curve

Correlation analysis between SLC7A7 and VIM and cancer-related genes

GC-related oncogenes were obtained from the GeneCards database (https://www.genecards.org/). Next, we analyzed the expression levels of SLC7A7 and VIM and the top 20 oncogenes with the highest relevance scores and found that the expression levels of SLC7A7 and VIM were significantly correlated with the expression levels of multiple GC-related oncogenes. VIM was significantly negatively correlated with cadherin 1 (CDH1) (r = -0.461), whereas SLC7A7 was significantly positively correlated with MutL homolog 1 (MLH1) (r = 0.199) (Fig. 8).

Fig. 8

Correlation analysis between key genes and cancer-related genes. The expression levels of two key genes, SLC7A7 and VIM, were significantly correlated with the expression levels of multiple gastric cancer-related oncogenes. VIM was significantly negatively correlated with CDH1 (r = -0.461), whereas SLC7A7 was significantly positively correlated with MLH1 (r = 0.199)

Coexpression of SLC7A7 and VIM and metabolism-related genes and their relationships with metabolic pathways in single-cell data

The expression of SLC7A7 and VIM in 11 cell types was analyzed (Fig. 9A-B), and the coexpression of SLC7A7 and VIM with the metabolism-related gene CYP2D6 in these 11 cell types was visualized (Fig. 9C-D). The single-sample gene set enrichment analysis (ssGSEA) algorithm was subsequently used to quantify the metabolic levels of the DEGs in all the samples, and the metabolic pathways were visualized with the “pheatmap” package. According to the metabolic pathway heatmap, cancer samples had higher scores than normal samples in all 6 metabolic pathway categories (Fig. 9E). In addition, the correlations of SLC7A7 and VIM with the 6 metabolic pathway categories are displayed (Figs. S5-6).
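The idea behind ssGSEA can be sketched as follows: within one sample, genes are ordered by expression, and the score accumulates the difference between a rank-weighted running sum over genes in the set and a running sum over genes outside it. This is a simplified illustration with hypothetical gene names and values. It omits the normalization applied by standard implementations, treats the rank exponent of 0.25 as an assumed default, and is not the code used in this study.

```python
def ssgsea_like_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score for one gene set.

    expression: dict mapping gene name -> expression value in one sample.
    gene_set:   set of gene names belonging to the pathway.
    alpha:      exponent applied to ranks of genes in the set (assumed 0.25).
    """
    # Order genes from highest to lowest expression.
    ordered = sorted(expression, key=expression.get, reverse=True)
    n_genes = len(ordered)
    in_set = [gene in gene_set for gene in ordered]
    n_in = sum(in_set)
    if n_in == 0 or n_in == n_genes:
        raise ValueError("gene set must cover some but not all genes")

    # The most highly expressed gene gets rank n_genes, the lowest rank 1.
    weights = [(n_genes - i) ** alpha if hit else 0.0
               for i, hit in enumerate(in_set)]
    total_weight = sum(weights)
    n_out = n_genes - n_in

    score = 0.0
    cum_in = 0.0   # weighted running sum over genes in the set
    cum_out = 0.0  # unweighted running sum over genes outside the set
    for weight, hit in zip(weights, in_set):
        if hit:
            cum_in += weight / total_weight
        else:
            cum_out += 1.0 / n_out
        score += cum_in - cum_out
    return score


# Hypothetical sample with six genes (illustration only).
sample = {"A": 9.0, "B": 8.0, "C": 3.0, "D": 2.0, "E": 1.0, "F": 0.5}
high_pathway = ssgsea_like_score(sample, {"A", "B"})  # set genes at the top
low_pathway = ssgsea_like_score(sample, {"E", "F"})   # set genes at the bottom
print(high_pathway, low_pathway)
```

A positive score means that the genes of the set tend to sit near the top of the expression ranking in that sample, and a negative score means that they sit near the bottom.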

Fig. 9
Coexpression of key genes and metabolism-related genes and their relationships with metabolic pathways in single-cell data. (A, B) Expression of key genes in 11 cell types. (C) Coexpression of the key gene SLC7A7 and the metabolism-related gene CYP2D6 in single cells. (D) Coexpression of the key gene VIM and the metabolism-related gene CYP2D6 in single cells. (E) Heatmap revealing the relationships between key genes and metabolic pathways

Validation of the differential expression and biological function of SLC7A7 and VIM in GC

To validate the differential expression, qRT‒PCR was performed to detect the expression levels of SLC7A7 and VIM in GC cell lines and tissues. The results revealed that SLC7A7 and VIM were both generally upregulated in GC cells compared with the normal gastric epithelial cell line GES-1 (Fig. 10A-B). Consistent with the cellular results, SLC7A7 and VIM expression levels were significantly increased in 69.57% (Fig. 10C) and 71.74% (Fig. 10D) of the GC tissues, respectively. SLC7A7 and VIM protein expression was higher in gastric cancer tissue than in normal gastric tissue according to the Human Protein Atlas (Fig. S7). Moreover, clinicopathological analysis confirmed that SLC7A7 expression was significantly correlated with tumor diameter and perineural invasion (Supplementary Table 1), whereas VIM expression was closely associated with perineural invasion (Supplementary Table 2).

Fig. 10
Validation of the differential expression and biological function of key genes in GC. (A) SLC7A7 expression in GC cells. (B) VIM expression in GC cells. (C) SLC7A7 expression levels in cancer tissues and adjacent normal tissues (n = 46). (D) VIM expression levels in cancer tissues and adjacent normal tissues (n = 46). (E) SLC7A7 expression was effectively knocked down by si-SLC7A7. (F) VIM expression was effectively knocked down by si-VIM. (G) Western blotting analyses revealed that SLC7A7 expression was effectively knocked down by si-SLC7A7. (H) Western blotting analyses revealed that VIM expression was effectively knocked down by si-VIM. (I) Silencing SLC7A7 and VIM expression significantly inhibited the proliferation of AGS GC cells. (J) Silencing SLC7A7 and VIM expression significantly inhibited the proliferation of HGC-27 GC cells. (K) Transwell migration assay. (L) Transwell invasion assay. (M) Plate colony formation assay. (N) Flow cytometry was used to analyze the cell cycle distribution of AGS cells. (O) Flow cytometry was used to analyze the cell cycle distribution of HGC-27 cells. The data are shown as the means ± SDs of three independent experiments. A lower ΔCt value indicates a higher gene expression level. Asterisks represent significant differences (*p < 0.05, **p < 0.01, ***p < 0.001)
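To make the ΔCt convention concrete, the sketch below computes ΔCt as the Ct of the target gene minus the Ct of a reference gene and converts the difference between two samples into a fold change with the 2^-ΔΔCt method. The Ct values and the use of a single reference gene are illustrative assumptions, not measurements or settings from this study.

```python
def delta_ct(ct_target, ct_reference):
    """Delta Ct: target gene Ct minus reference gene Ct in the same sample."""
    return ct_target - ct_reference


def fold_change(dct_sample, dct_control):
    """Relative expression by the 2^-ddCt method.

    Assumes roughly 100% amplification efficiency for both genes.
    """
    return 2.0 ** -(dct_sample - dct_control)


# Made-up Ct values (illustration only).
tumor_dct = delta_ct(ct_target=24.0, ct_reference=18.0)   # 6.0
normal_dct = delta_ct(ct_target=26.0, ct_reference=18.0)  # 8.0

# The tumor sample has the lower delta Ct, hence the higher expression.
relative_expression = fold_change(tumor_dct, normal_dct)  # 2 ** 2 = 4.0
print(tumor_dct, normal_dct, relative_expression)
```

Because each PCR cycle roughly doubles the amount of product, a ΔCt that is lower by 2 cycles corresponds to approximately fourfold higher expression under these assumptions.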

siRNAs were transfected into AGS and HGC-27 GC cells to manipulate SLC7A7 and VIM expression levels (Fig. 10E-H). qRT‒PCR and Western blotting analyses revealed that SLC7A7 and VIM expression was effectively knocked down by si-SLC7A7 and si-VIM, respectively. The effects of SLC7A7 and VIM on cellular phenotypes in gastric carcinogenesis were subsequently validated through real-time cellular analysis (RTCA) (Fig. 10I-J), transwell migration (Fig. 10K), invasion (Fig. 10L), and plate colony formation (Fig. 10M) assays. Cell experiments demonstrated that silencing the expression of SLC7A7 and VIM significantly inhibited GC cell proliferation, migration, and invasion. Moreover, flow cytometry analysis revealed that GC cells transfected with siRNA exhibited significant G2/M arrest (Fig. 10N-O).

Discussion

With extensive research on the molecular characteristics of human cancers, scRNA-seq and bulk RNA-seq have become two increasingly favored high-throughput sequencing technologies. Bulk RNA-seq is widely used for large-scale gene expression analysis [18]. scRNA-seq measures gene expression at the level of individual cells and can be used to explore heterogeneity among cancer cells, revealing new mechanisms of disease pathogenesis and more effective targets for individualized therapy [19, 20]. The combination of scRNA-seq and bulk RNA-seq has begun to be applied to the heterogeneity analysis of human malignant cancers, demonstrating great potential for elucidating the complex mechanisms of tumorigenesis, development and metastasis and providing new ideas for establishing precise treatment strategies [21,22,23,24].

GC is a highly molecularly and phenotypically heterogeneous disease with complex metabolic patterns, and this heterogeneity greatly limits the therapeutic response and prognosis of patients with GC [25, 26]. Moreover, the specific molecular biological mechanisms and gene expression patterns involved in gastric carcinogenesis have not been fully elucidated. Recently, an increasing number of studies have focused on the relationship between amino acids and cancer metabolism and revealed that lysine metabolism plays an important role in the occurrence and development of cancer and is a potential new target for cancer diagnosis and treatment [27, 28]. In this study, we conducted a comprehensive analysis of scRNA-seq and bulk RNA-seq data through machine learning algorithms to reveal the potential role of lysine metabolism-related genes in GC and to construct a new prognostic model for clinical use. Single-cell level analysis of the scRNA-seq data and single-cell data annotation were performed to obtain 11 cell categories. Complex and frequent cellular communication occurs among these cell subtypes, most prominently among macrophages and other cell types (Fig. 4). As a heterogeneous cell group in the tumor microenvironment (TME), macrophages can alter the overall metabolic profile of the TME, which in turn has an impact on cancer progression and drug resistance [29].

A machine learning algorithm was used to identify key genes related to lysine metabolism via random survival forest analysis and survival analysis. In this study, relative importance refers to the mean decrease Gini score. The mean decrease Gini score is calculated based on the reduction in Gini impurity. For classification problems, Gini impurity measures the degree of mixing of different categories within a node. When a feature leads to a significant reduction in Gini impurity during tree splitting, it is considered important. We identified the top 20% of important features, including 7 target genes, using random forests. Higher scores indicate a greater impact of the feature on the model. Next, we conducted survival analysis on these 7 highly important genes to validate their significance in GC patients. Our results revealed that the SLC7A7 and VIM genes were significantly related to survival (Fig. 5), so they were considered key genes for subsequent studies.
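The Gini-based importance described above can be made concrete with a small worked example. The sketch below computes the Gini impurity of a node and the impurity decrease produced by a split; in a random forest, the mean decrease Gini of a feature is the average of such decreases over all splits that use the feature. The labels and splits are toy data, and the function names are introduced here only for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Toy node with four samples of each class (illustration only).
parent = ["dead"] * 4 + ["alive"] * 4

# An informative feature separates the classes perfectly.
informative = gini_decrease(parent, ["dead"] * 4, ["alive"] * 4)

# An uninformative feature leaves both children as mixed as the parent.
mixed = ["dead", "dead", "alive", "alive"]
uninformative = gini_decrease(parent, mixed, mixed)

print(informative, uninformative)  # 0.5 and 0.0
```

Here the parent node has an impurity of 0.5, so the perfect split removes all of it, whereas the uninformative split removes none; features that repeatedly yield large decreases receive high importance scores.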

As a key gene in this study, SLC7A7 is a protein-coding gene and an essential member of the solute carrier superfamily [30]. SLC7A7 encodes y+L amino acid transporter 1 (y+LAT1), and its mutations can lead to urea cycle defects accompanied by protein intolerance, known as lysinuric protein intolerance (LPI) [31]. Previous studies have confirmed that SLC7A7 is significantly overexpressed in cancer patients and affects immune infiltration, radiotherapy sensitivity, and chemotherapeutic drug response in ovarian cancer and non-small cell lung cancer [32,33,34]. Similarly, VIM is an important gene associated with carcinogenesis. The VIM gene is located on human chromosome 10p13 and encodes a member of the intermediate filament protein family that is significantly overexpressed in a variety of epithelial cancers [35, 36]. The overexpression of VIM mediates the epithelial–mesenchymal transition (EMT), which affects the proliferation, apoptosis, adherence, and migration of cancer cells [35, 36]. Moreover, VIM has been studied extensively as a biomarker for several cancers and, in turn, has shown strong potential as a therapeutic target [37,38,39]. In our study, to validate the differential expression of these genes, qRT‒PCR was performed to detect the expression levels of SLC7A7 and VIM in gastric cancer cell lines and tissues. Our data confirmed that both SLC7A7 and VIM were significantly overexpressed in GC tissues and cells (Fig. 10A-D). Cell experiments demonstrated that silencing the expression of SLC7A7 and VIM significantly inhibited GC cell proliferation, migration and invasion and led to GC cell arrest in the G2/M phase (Fig. 10I-O). Therefore, SLC7A7 and VIM are regulatory genes that play significant roles in gastric carcinogenesis.

Through comprehensive analysis, we revealed the potential roles of SLC7A7 and VIM in gastric carcinogenesis. The metabolic balance of amino acids, which are crucial metabolites in cancer cells and the tumor microenvironment, influences the growth rate of cancer cells and the immune function of tumor-associated macrophages (TAMs); when TAMs present an M2 phenotype, they can promote cancer progression [40]. In this study, we first analyzed the relationships between SLC7A7 and VIM and immune infiltration in GC. Our results revealed that SLC7A7 was significantly positively correlated with the levels of M0 macrophages, M1 macrophages, etc., and significantly negatively correlated with the levels of resting mast cells, resting memory CD4+ T cells, etc., whereas VIM was significantly positively correlated with the levels of M2 macrophages, gamma delta T cells, etc., and significantly negatively correlated with the levels of M0 macrophages, resting NK cells, etc. (Fig. 6). These results confirm that the key genes SLC7A7 and VIM are closely associated with the level of immune cell infiltration in GC and play important roles in the immune microenvironment. Next, signaling pathways associated with the key genes SLC7A7 and VIM were explored. We found that SLC7A7 and VIM were enriched in cancer-associated signaling pathways, such as the Nod-like receptor signaling pathway, Toll-like receptor signaling pathway, and arachidonic acid metabolism (Fig. S2). These signaling pathways have been reported to be aberrantly activated in a variety of cancers, affecting the tumor microenvironment and thus mediating cancer development [41, 42]. The relationships between SLC7A7 and VIM and sensitivity to common chemotherapeutic drugs were further explored. Our data revealed that SLC7A7 and VIM were significantly associated with sensitivity to bortezomib, CMK, cytarabine, dasatinib, elesclomol, and embelin (Fig. S3).
Although 5-fluorouracil is still the mainstream drug used in GC chemotherapy, the combination of new drugs and traditional drugs has proven to have great potential [43]. Therefore, in clinical practice, combinations of chemotherapy drugs can be precisely matched according to the gene expression characteristics of patients with GC to develop a more personalized treatment plan and improve their quality of life. Afterward, transcriptional regulatory and miRNA networks involving SLC7A7 and VIM were constructed, and the relationships between SLC7A7 and VIM and cancer-related genes were analyzed. A complex regulatory network linking SLC7A7 and VIM with transcription factors, miRNAs and multiple gastric cancer-related oncogenes was revealed (Fig. S4), laying the foundation for studying the functional mechanisms of SLC7A7 and VIM in gastric carcinogenesis. Finally, we investigated the coexpression of SLC7A7 and VIM with metabolism-related genes and their relationships with metabolic pathways in single-cell data. As expected, the expression of SLC7A7 and VIM was significantly associated with that of CYP2D6 (Fig. 9C-D). Moreover, cancer patient samples had higher scores than normal control samples in all 6 metabolic pathway categories, again validating their metabolic relevance (Fig. 9E). The correlations of SLC7A7 and VIM with the 6 metabolic pathway categories are also displayed (Figs. S5-6). Overall, our study revealed the potential roles of SLC7A7 and VIM in gastric carcinogenesis, providing novel metabolic targets.

Current research on GC prognosis has focused predominantly on the identification of molecular biomarkers and the construction of predictive models using clinical and pathological data [44]. Traditional prognostic models, such as those based on TNM staging, have been widely used but often fail to capture the complexity of the disease [45]. More recently, the advent of high-throughput technologies, including genomics, transcriptomics, and proteomics, has enabled the identification of novel biomarkers and the development of more sophisticated prognostic models [46,47,48]. These models often incorporate multiomics data, machine learning algorithms, and clinical variables to increase their predictive accuracy. Nomograms are reliable tools for visually assessing risk by providing numerical estimates of the probability of specific clinical events while incorporating key factors of cancer [49]. In addition, nomograms can have greater accuracy than traditional TNM staging in predicting patient survival and have been used to assess the prognostic risk of multiple cancers [50, 51]. In this study, to assess the prognostic value of key lysine metabolism-related genes, we constructed a prognostic nomogram that integrated SLC7A7 and VIM and all the important clinical features and demonstrated the high predictive accuracy of the model by calibration curves (Fig. 7). Our study underscores the importance of integrating diverse prognostic factors to improve the predictive accuracy of GC models. In future studies, by leveraging advancements in machine learning and multiomics data, we can develop more robust and clinically relevant prognostic tools. Ultimately, these efforts will contribute to personalized treatment strategies and improved patient outcomes in the context of GC management.

Our study also has limitations. First, although two key genes, SLC7A7 and VIM, were identified and their potential roles in gastric carcinogenesis were analyzed, direct interactions between these genes and the predicted drugs were not demonstrated, and the clinical efficacy and safety of these drugs have not been verified in large-scale clinical trials. Second, specific signaling pathways and molecular mechanisms were not studied in sufficient depth, and how SLC7A7 and VIM affect the progression and prognosis of GC through specific signaling pathways remains to be elucidated. Third, this study investigated the relationship between lysine metabolism-related genes and gastric carcinogenesis from a macro perspective, relying primarily on large-scale sequencing data, bioinformatics, machine learning, and clinical samples. Although we verified some conclusions of this manuscript at the micro level through cell experiments and clinical patient samples, owing to time and funding constraints, clinical ethical restrictions, and limits on experimental volume, some conclusions and viewpoints still require verification through future molecular biology experiments.

Conclusion

In summary, we investigated the roles of lysine metabolism-related regulatory genes in GC by in-depth analysis of scRNA-seq data combined with bulk RNA-seq data, multiple bioinformatics methods and molecular biology experiments. We systematically analyzed the heterogeneity of GC cells and interactions among cell subtypes, identified SLC7A7 and VIM as two key genes related to gastric carcinogenesis, revealed their potential roles in immune infiltration, signaling pathway regulation, drug sensitivity, molecular regulatory networks, associations with tumor regulatory genes, and metabolic pathways, and established a reliable prognostic risk nomogram. Our study provides novel metabolic targets.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

  1. Bray F, Laversanne M, Sung H, Ferlay J, Siegel RL, Soerjomataram I, Jemal A. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2024;74(3):229–63.

  2. Ajani JA, Lee J, Sano T, Janjigian YY, Fan D, Song S. Gastric adenocarcinoma. Nat Rev Dis Primers. 2017;3:17036.

  3. Zavros Y, Merchant JL. The immune microenvironment in gastric adenocarcinoma. Nat Rev Gastroenterol Hepatol. 2022;19(7):451–67.

  4. Rugge M, Genta RM, Di Mario F, El-Omar EM, El-Serag HB, Fassan M, Hunt RH, Kuipers EJ, Malfertheiner P, Sugano K, et al. Gastric cancer as preventable disease. Clin Gastroenterol Hepatol. 2017;15(12):1833–43.

  5. Boroughs LK, DeBerardinis RJ. Metabolic pathways promoting cancer cell survival and growth. Nat Cell Biol. 2015;17(4):351–9.

  6. Martínez-Reyes I, Chandel NS. Cancer metabolism: looking forward. Nat Rev Cancer. 2021;21(10):669–80.

  7. Endicott M, Jones M, Hull J. Amino acid metabolism as a therapeutic target in cancer: a review. Amino Acids. 2021;53(8):1169–79.

  8. Vettore L, Westbrook RL, Tennant DA. New aspects of amino acid metabolism in cancer. Br J Cancer. 2020;122(2):150–6.

  9. Li Z, Zhang H. Reprogramming of glucose, fatty acid and amino acid metabolism for cancer progression. Cell Mol Life Sci. 2016;73(2):377–92.

  10. Lieu EL, Nguyen T, Rhyne S, Kim J. Amino acids in cancer. Exp Mol Med. 2020;52(1):15–30.

  11. Yuan H, Wu X, Wu Q, Chatoff A, Megill E, Gao J, Huang T, Duan T, Yang K, Jin C, et al. Lysine catabolism reprograms tumour immunity through histone crotonylation. Nature. 2023;617(7962):818–26.

  12. Li X, Zhang C, Zhao T, Su Z, Li M, Hu J, Wen J, Shen J, Wang C, Pan J, et al. Lysine-222 succinylation reduces lysosomal degradation of lactate dehydrogenase a and is increased in gastric cancer. J Exp Clin Cancer Res. 2020;39(1):172.

  13. Wu S, Wang Y, Duan J, Teng Y, Wang D, Qi F. Identification of a shared gene signature and biological mechanism between diabetic foot ulcers and cutaneous lupus erythematosus by transcriptomic analysis. Front Physiol. 2024;15:1297810.

  14. Cao H, Huang P, Qiu J, Gong X, Cao H. Immune landscape of hepatocellular carcinoma tumor microenvironment identifies a prognostic relevant model. Heliyon. 2024;10(3):e24861.

  15. Jin S, Plikus MV, Nie Q. CellChat for systematic analysis of cell-cell communication from single-cell transcriptomics. Nat Protoc. 2025;20(1):180–219.

  16. Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, Hoang CD, Diehn M, et al. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods. 2015;12(5):453–7.

  17. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102(43):15545–50.

  18. Thind AS, Monga I, Thakur PK, Kumari P, Dindhoria K, Krzak M, Ranson M, Ashford B. Demystifying emerging bulk RNA-Seq applications: the application and utility of bioinformatic methodology. Brief Bioinform. 2021;22(6):bbab259.

  19. Suvà ML, Tirosh I. Single-cell RNA sequencing in cancer: lessons learned and emerging challenges. Mol Cell. 2019;75(1):7–12.

  20. Ziegenhain C, Vieth B, Parekh S, Reinius B, Guillaumet-Adkins A, Smets M, Leonhardt H, Heyn H, Hellmann I, Enard W. Comparative analysis of single-cell RNA sequencing methods. Mol Cell. 2017;65(4):631–43.e4.

  21. Pang J, Yu Q, Chen Y, Yuan H, Sheng M, Tang W. Integrating Single-cell RNA-seq to construct a neutrophil prognostic model for predicting immune responses in non-small cell lung cancer. J Transl Med. 2022;20(1):531.

  22. Liu X, Yan G, Xu B, Yu H, An Y, Sun M. Evaluating the role of IDO1 macrophages in immunotherapy using scRNA-seq and bulk-seq in colorectal cancer. Front Immunol. 2022;13:1006501.

  23. Yu L, Shen N, Shi Y, Shi X, Fu X, Li S, Zhu B, Yu W, Zhang Y. Characterization of cancer-related fibroblasts (CAF) in hepatocellular carcinoma and construction of CAF-based risk signature based on single-cell RNA-seq and bulk RNA-seq data. Front Immunol. 2022;13:1009789.

  24. Ding S, Chen X, Shen K. Single-cell RNA sequencing in breast cancer: Understanding tumor heterogeneity and paving roads to individualized therapy. Cancer Commun (Lond). 2020;40(8):329–44.

  25. Smyth EC, Nilsson M, Grabsch HI, van Grieken NC, Lordick F. Gastric cancer. Lancet. 2020;396(10251):635–48.

  26. O’Connor JP, Rose CJ, Waterton JC, Carano RA, Parker GJ, Jackson A. Imaging intratumor heterogeneity: role in therapy response, resistance, and clinical outcome. Clin Cancer Res. 2015;21(2):249–57.

  27. Wang JH, Mao L, Wang J, Zhang X, Wu M, Wen Q, Yu SC. Beyond metabolic waste: lysine lactylation and its potential roles in cancer progression and cell fate determination. Cell Oncol (Dordr). 2023;46(3):465–80.

  28. Wu Z, Wei D, Gao W, Xu Y, Hu Z, Ma Z, Gao C, Zhu X, Li Q. TPO-induced metabolic reprogramming drives liver metastasis of colorectal cancer CD110+ tumor-initiating cells. Cell Stem Cell. 2015;17(1):47–59.

  29. Vitale I, Manic G, Coussens LM, Kroemer G, Galluzzi L. Macrophages and metabolism in the tumor microenvironment. Cell Metab. 2019;30(1):36–50.

  30. Borsani G, Bassi MT, Sperandeo MP, De Grandi A, Buoninconti A, Riboni M, Manzoni M, Incerti B, Pepe A, Andria G, et al. SLC7A7, encoding a putative permease-related protein, is mutated in patients with lysinuric protein intolerance. Nat Genet. 1999;21(3):297–301.

  31. IJzermans T, van der Meijden W, Hoeks M, Huigen M, Rennings A, Nijenhuis T. Improving a rare metabolic disorder through kidney transplantation: A case report of a patient with lysinuric protein intolerance. Am J Kidney Dis. 2023;81(4):493–6.

  32. Dai W, Feng J, Hu X, Chen Y, Gu Q, Gong W, Feng T, Wu J. SLC7A7 is a prognostic biomarker correlated with immune infiltrates in non-small cell lung cancer. Cancer Cell Int. 2021;21(1):106.

  33. Xie L, Song X, Yu J, Guo W, Wei L, Liu Y, Wang X. Solute carrier protein family may involve in radiation-induced radioresistance of non-small cell lung cancer. J Cancer Res Clin Oncol. 2011;137(12):1739–47.

  34. Cheng L, Lu W, Kulkarni B, Pejovic T, Yan X, Chiang JH, Hood L, Odunsi K, Lin B. Analysis of chemotherapy response programs in ovarian cancers by the next-generation sequencing technologies. Gynecol Oncol. 2010;117(2):159–69.

  35. Satelli A, Li S. Vimentin in cancer and its potential as a molecular target for cancer therapy. Cell Mol Life Sci. 2011;68(18):3033–46.

  36. Ivaska J, Pallari HM, Nevo J, Eriksson JE. Novel functions of vimentin in cell adhesion, migration, and signaling. Exp Cell Res. 2007;313(10):2050–62.

  37. Zhang N, Hua X, Tu H, Li J, Zhang Z, Max C. Isorhapontigenin (ISO) inhibits EMT through FOXO3A/METTL14/VIMENTIN pathway in bladder cancer cells. Cancer Lett. 2021;520:400–8.

  38. Chen Z, Fang Z, Ma J. Regulatory mechanisms and clinical significance of vimentin in breast cancer. Biomed Pharmacother. 2021;133:111068.

  39. Gu C, Wang X, Long T, Wang X, Zhong Y, Ma Y, Hu Z, Li Z. FSTL1 interacts with VIM and promotes colorectal cancer metastasis via activating the focal adhesion signalling pathway. Cell Death Dis. 2018;9(6):654.

  40. Mantovani A, Allavena P, Marchesi F, Garlanda C. Macrophages as tools and targets in cancer therapy. Nat Rev Drug Discov. 2022;21(11):799–820.

  41. Duan T, Du Y, Xing C, Wang HY, Wang RF. Toll-Like receptor signaling and its role in Cell-Mediated immunity. Front Immunol. 2022;13:812774.

  42. Liu P, Lu Z, Liu L, Li R, Liang Z, Shen M, Xu H, Ren D, Ji M, Yuan S, et al. NOD-like receptor signaling in inflammation-associated cancers: from functions to targeted therapies. Phytomedicine. 2019;64:152925.

  43. Kim R, Emi M, Arihiro K, Tanabe K, Uchida Y, Toge T. Chemosensitization by STI571 targeting the platelet-derived growth factor/platelet-derived growth factor receptor-signaling pathway in the tumor progression and angiogenesis of gastric carcinoma. Cancer. 2005;103(9):1800–9.

  44. Sato Y, Okamoto K, Kawano Y, Kasai A, Kawaguchi T, Sagawa T, Sogabe M, Miyamoto H, Takayama T. Novel biomarkers of gastric cancer: current research and future perspectives. J Clin Med. 2023;12(14):4646.

  45. Zhang H, Shi J, Xie H, Liu X, Ruan G, Lin S, Ge Y, Liu C, Chen Y, Zheng X, et al. Superiority of CRP-albumin-lymphocyte index as a prognostic biomarker for patients with gastric cancer. Nutrition. 2023;116:112191.

  46. Zhang X, Li Y, Chen Y. Development of a comprehensive gene signature linking hypoxia, glycolysis, lactylation, and metabolomic insights in gastric cancer through the integration of bulk and Single-Cell RNA-Seq data. Biomedicines. 2023;11(11):2948.

  47. Liu N, Wu Y, Cheng W, Wu Y, Wang L, Zhuang L. Identification of novel prognostic biomarkers by integrating multi-omics data in gastric cancer. BMC Cancer. 2021;21(1):460.

  48. Chen C, Chen X, Hu Y, Pan B, Huang Q, Dong Q, Xue X, Shen X, Chen X. Utilizing machine learning to integrate single-cell and bulk RNA sequencing data for constructing and validating a novel cell adhesion molecules related prognostic model in gastric cancer. Comput Biol Med. 2024;180:108998.

  49. Wo Y, Yang H, Zhang Y, Wo J. Development and external validation of a nomogram for predicting survival in patients with stage IA Non-small cell lung cancer ≤ 2 cm undergoing sublobectomy. Front Oncol. 2019;9:1385.

  50. Wu H, Ding P, Wu J, Sun C, Guo H, Chen S, Lowe S, Yang P, Tian Y, Liu Y, et al. A new online dynamic nomogram: construction and validation of a predictive model for distant metastasis risk and prognosis in patients with gastrointestinal stromal tumors. J Gastrointest Surg. 2023;27(7):1429–44.

  51. Ding J, Wang C, Sun Y, Guo J, Liu S, Cheng Z. Identification of an Autophagy-Related signature for prognosis and immunotherapy response prediction in ovarian cancer. Biomolecules. 2023;13(2):339.

Acknowledgements

We thank all contributors of high-quality data to these accessible public databases. We also thank the Core Facilities, Health Science Center, Ningbo University, for technical support, and AJE for language editing.

Funding

This study was supported by grants from the Key Scientific and Technological Projects of Ningbo (No. 2021Z133, No. 2022Z130), Ningbo Top Medical and Health Research Program (No. 2023020612), and the Youth Medical Backbone Talents Training Program of Ningbo.

Author information

Contributions

Ye and Shao made substantial contributions to conception and design of this manuscript. Shao, Yu and Chen were involved in drafting the manuscript and revising it critically for important intellectual content. Yu, Yan and Shao drew the figure in this manuscript. Shao and Guo reviewed and revised the final manuscript. All authors contributed to the figure discussions and approved the final manuscript submitted.

Corresponding authors

Correspondence to Junming Guo or Guoliang Ye.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Human Research Ethics Committee of Ningbo University and the ethics committee of the First Affiliated Hospital of Ningbo University (IRB No. 20120303 and No. KY20220101). Written informed consent was obtained from the patients.

Competing interests

The authors declare no competing interests.

Conflict of interest

The authors confirm that there are no conflicts of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Shao, Y., Chen, C., Yu, X. et al. Comprehensive analysis of scRNA-seq and bulk RNA-seq data via machine learning and bioinformatics reveals the role of lysine metabolism-related genes in gastric carcinogenesis. BMC Cancer 25, 644 (2025). https://doi.org/10.1186/s12885-025-14051-w

  • DOI: https://doi.org/10.1186/s12885-025-14051-w

Keywords