Introduction

Lung cancer is the second most prevalent cancer worldwide and is a leading cause of cancer-related mortality [1]. In China, lung cancer has the highest morbidity and mortality rates, with annual fatalities exceeding 600,000 [2, 3]. The prognosis of lung cancer depends predominantly on the stage at which the disease is diagnosed [4, 5]. For patients diagnosed with early-stage (stage IA) non-small cell lung cancer (NSCLC) who are candidates for surgical resection followed by adjuvant therapy, the 5-year survival rate ranges from 80 to 93% [6]. However, for patients diagnosed with advanced stage IV lung cancer, the 1-year survival rate remains below 20%, underscoring the importance of early diagnosis in improving prognosis [6]. Regrettably, only about 16% of patients are diagnosed during the early stages of lung cancer [4]. Therefore, improving the diagnostic rate of early-stage lung cancer is crucial for the effective and curative treatment of these patients.

Chest computed tomography (CT) screening plays a significant role in the early detection of lung cancer [7]. Two large randomized controlled trials, the Nederlands-Leuvens Longkanker Screenings ONderzoek Trial (NELSON) and the National Lung Screening Trial (NLST), have shown that lung cancer screening with low-dose computed tomography (LDCT) is linked to a decrease in mortality [8, 9]. A study conducted in China demonstrated that screening with LDCT resulted in a 31% reduction in lung cancer mortality [10]. With the extensive implementation of CT screening, pulmonary nodules have increasingly been identified as incidental findings [11]. Distinguishing between benign and malignant pulmonary nodules is a crucial objective in the management of patients presenting with these findings [12, 13]. Despite the increasing use of LDCT for lung cancer screening, its efficacy is compromised by a notably high false-positive rate. According to findings from the NLST, only 3.8% of positive results are ultimately confirmed as lung cancer [8]. The differentiation between benign and malignant solid pulmonary nodules is typically based on an analysis of clinical data, CT findings, and tumor biomarker levels specific to each patient [14]. For instance, the Mayo model incorporates three lung nodule features (spiculation, diameter, and upper lobe location) and three clinical characteristics (cigarette-smoking status, age, and history of cancer) [15]. In recent years, studies have suggested that blood tumor biomarkers may have a significant role in the management of patients with indeterminate lung nodules [16]. Yang et al. evaluated the lung cancer biomarker panel (LCBP) within a Chinese cohort and subsequently developed the LCBP nodule risk model, which incorporated patients’ clinical characteristics (age, sex, smoking status), CT features of the nodule (diameter, spiculation), and the blood biomarkers pro-gastrin-releasing peptide (Pro-GRP), squamous cell carcinoma antigen (SCC), cytokeratin-19 fragment (Cyfra21-1), and carcinoembryonic antigen (CEA) [17]. Hou et al. demonstrated that CEA, CYFRA21-1, and CT radiological scores serve as significant predictors of malignant lung nodules [18]. Previous studies have indicated that a panel of four marker proteins (4MP), comprising cancer antigen 125 (CA125), CEA, the precursor form of surfactant protein B (Pro-SFTPB), and Cyfra21-1, significantly enhances the efficacy of lung cancer risk assessment [19]. In a prior single-center study, we found that 4MP was highly effective in distinguishing between benign and malignant nodules [20]. Nevertheless, the specific role of 4MP in differentiating early-stage lung cancer from benign nodules among Chinese patients requires additional validation. Consequently, we recruited a new study cohort to further validate the performance of 4MP in distinguishing early-stage lung cancer from benign nodules within a Chinese population.

In this study, we examined the role of 4MP detection, clinical characteristics, and CT features of nodules in differentiating benign lung nodules from early-stage lung cancer. To improve the efficacy of the 4MP model, we devised a composite model that integrates 4MP, clinical characteristics, and CT features for the differential diagnosis of benign and malignant lung nodules in Chinese patients. The findings provide a more robust basis for the early diagnosis of lung cancer within the Chinese population.

Materials and methods

Study subjects

A cohort of 380 patients, who presented at Zhejiang Hospital between March 2021 and April 2024 with an initial diagnosis of pulmonary nodules, was selected for this study. The nodules were classified as either benign or malignant based on postoperative pathological findings. All patients in the cohort were required to meet the following criteria: (a) absence of extrathoracic malignant tumors; and (b) no history of chemotherapy or radiotherapy treatment within the 6 months preceding the study. Clinical data, including age, gender, body mass index (BMI), smoking history, alcohol consumption history, personal and family history of cancer, and tumor grade, were collected from all participants.

The study design is depicted in Fig. 1. A total of 380 patients were randomized and allocated into a training cohort (n = 228) and a validation cohort (n = 152) in a 6:4 ratio. In the training cohort, 55 patients with benign nodules and 173 patients with lung cancer were included. Similarly, the validation cohort comprised 36 patients with benign nodules and 116 patients with lung cancer. This work conformed to the ethical guidelines of the Declaration of Helsinki and has been approved by the Medical Ethics Committee of Zhejiang Hospital (approval No. 2021-141 K).
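
The 6:4 allocation described above can be illustrated with a minimal sketch. It assumes a simple seeded shuffle of patient identifiers; the actual randomization procedure is not specified in this study, and the function name, seed, and ID range are hypothetical.

```python
import random


def split_cohort(patient_ids, train_fraction=0.6, seed=0):
    """Shuffle patient IDs and split them into training and validation lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) makes the split reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical IDs 1..380 stand in for the 380 enrolled patients.
train_ids, validation_ids = split_cohort(range(1, 381))
print(len(train_ids), len(validation_ids))  # 228 152
```

A plain shuffle does not guarantee the same benign-to-malignant ratio in both cohorts; a stratified split would be needed to enforce that.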

Fig. 1

Flowchart showing the study design. SE: Sensitivity; SP: Specificity; 4MP: Pro-SFTPB, CA125, Cyfra21-1, and CEA; LCBP: lung cancer biomarker panel

Nodule assessment and histological diagnosis

Since LDCT provides a better evaluation of the morphologic features of pulmonary nodules, all patients underwent LDCT examination using the multiple contiguous sequential axial imaging procedure through the thorax. The PneuView system (Myrian, Paris, France) was used to analyze the features of nodules. Three radiologists specializing in thoracic imaging conducted a retrospective analysis of all CT characteristics of nodules, including size, number, margin, density, and shape, ultimately reaching a consensus. The postoperative histopathological examination serves as the gold standard for diagnosing both benign and malignant lesions. Histological assessments were conducted by a minimum of two independent histologists.

Detection of serum biomarker levels

Blood samples were collected from all patients when pulmonary nodules were detected on the first examination. Serum levels of CA125 (2K45.77, Abbott Laboratories), CEA (7K68.74, Abbott Laboratories), and Cyfra21-1 (2P55.74, Abbott Laboratories) were analyzed by the immunofluorescence assay on the ARCHITECT i2000SR platform (Abbott Laboratories). The serum Pro-SFTPB level was detected using the immunofluorescence Assay Kit (2024060601, Cosmos Wisdom) on the SMART 500S platform (KEYSMILE).

Statistical analysis

All statistical analyses were performed using SPSS 26 software. Normally distributed measures were expressed as the mean ± standard deviation (SD) and compared between groups using the independent samples t-test; non-normally distributed measures were expressed as the median with interquartile range, M (P25, P75), and compared between groups using the Mann–Whitney U test. The χ2 test was used to compare count data between groups. Clinical features were selected using the LassoCV method with fivefold cross-validation. Subsequently, the Lasso regression model was used to analyze the importance of these clinical features. The receiver operating characteristic (ROC) curve was then plotted with MedCalc 16.8.4 software to analyze the predictive performance of 4MP, clinical data, and CT imaging features in differentiating between benign and malignant lung nodules. Statistical significance was set at P < 0.05.
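
To make the ROC analysis concrete, the sketch below computes the area under the curve (AUC) in its rank-based form: the probability that a randomly chosen lung cancer patient has a higher model score than a randomly chosen benign nodule patient, with ties counted as one half. This is an illustration only, not the MedCalc procedure used in this study; the function name and the scores are hypothetical.

```python
def auc_from_scores(scores_cancer, scores_benign):
    """AUC as the fraction of (cancer, benign) pairs in which the cancer case
    has the higher score; tied pairs contribute one half."""
    wins = 0.0
    for c in scores_cancer:
        for b in scores_benign:
            if c > b:
                wins += 1.0
            elif c == b:
                wins += 0.5
    return wins / (len(scores_cancer) * len(scores_benign))


# Hypothetical model scores, for illustration only.
cancer = [0.9, 0.8, 0.6, 0.4]
benign = [0.7, 0.3, 0.2]
print(round(auc_from_scores(cancer, benign), 3))  # 0.833
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.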

Results

Subject characteristics

The training cohort comprised 55 patients with benign nodules and 173 patients with early-stage lung cancer. The patients’ clinical characteristics (age, gender, BMI, drinking history, smoking history, family and personal history of cancer) and the CT features of the nodules (size, number, margin, density, and shape) are detailed in Table 1. In the training cohort, nodule size, shape, margin, and density differed significantly between the two groups (P < 0.05). However, age, BMI, gender, smoking history, drinking history, personal and family history of cancer, and nodule number (a single pulmonary nodule is denoted by 1, and multiple pulmonary nodules by ≥ 2) did not differ significantly between the benign nodule group and the early-stage lung cancer group (P > 0.05).

Table 1 The characteristics of patients in the training and validation cohort

Analysis of serum biomarker levels

To investigate whether 4MP levels differed between patients with benign nodules and those with early-stage lung cancer, we compared the 4MP levels in the two groups. In the training cohort, CA125 levels differed significantly between the benign nodule group and the early-stage lung cancer group (P < 0.05). Serum Pro-SFTPB, Cyfra21-1, and CEA levels were higher in the early-stage lung cancer group than in the benign nodule group, but the differences were not significant (Fig. 2A–D).

Fig. 2

Serum levels of the biomarkers Cyfra21-1 (A), CEA (B), CA125 (C), and Pro-SFTPB (D) in patients in the training cohort. *P < 0.05

Diagnosis performance of the 4MP detection

We evaluated the efficacy of serum 4MP detection in distinguishing between benign nodules and early-stage lung cancer in the training cohort, which yielded an AUC value of 0.612 (Fig. 3A). We also analyzed the diagnostic performance of clinical characteristics (age, gender, BMI, smoking history, drinking history, personal and family history of cancer) and CT features of nodules (size, number, margin, density, and shape) in differentiating patients with benign nodules from those with early-stage lung cancer (Fig. 3A). The AUC values for clinical characteristics and CT features of nodules were 0.628 and 0.726, respectively (Table 2). Interestingly, clinical characteristics had high specificity (78.18%) and low sensitivity (48.55%). In comparison, CT features of the nodule and 4MP had high sensitivity (79.19% and 65.90%, respectively) and low specificity (58.18% and 54.55%, respectively) in distinguishing benign nodules from early-stage lung cancer (Table 2). Next, an independent validation cohort (n = 152) containing 36 benign nodule patients and 116 early-stage lung cancer patients was used to validate the efficacy of the 4MP, clinical characteristics, and CT features of nodules. In the validation cohort, the AUC values for 4MP, CT features of nodules, and clinical characteristics in distinguishing benign from malignant nodules were 0.686, 0.678, and 0.701, respectively (Fig. 3B, Table 2). These results suggest that combining patient clinical characteristics and CT features of nodules with serum 4MP detection may improve its performance.
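
The sensitivity and specificity values above follow directly from confusion-matrix counts. As a minimal sketch, the counts below are back-calculated from the percentages reported for clinical characteristics in the training cohort (173 lung cancer patients, 55 benign nodule patients); they are an inference from those percentages, not values taken from Table 2, and the function name is hypothetical.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = true_pos / (true_pos + false_neg)  # cancers correctly flagged
    specificity = true_neg / (true_neg + false_pos)  # benign nodules correctly cleared
    return sensitivity, specificity


# Inferred counts (assumption): 84 of 173 cancers and 43 of 55 benign nodules
# classified correctly reproduce the reported 48.55% and 78.18%.
sens, spec = sensitivity_specificity(true_pos=84, false_neg=89,
                                     true_neg=43, false_pos=12)
print(f"{sens:.2%} {spec:.2%}")  # 48.55% 78.18%
```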

Fig. 3

The receiver operating characteristic curve (ROC) of 4MP, CT features of nodule, and clinical characteristics for differentiating patients with benign nodules or lung cancer in the training cohort (A) and validation cohort (B). 4MP: Pro-SFTPB, CA125, Cyfra21-1, and CEA

Table 2 The discrimination performance of 4MP, clinical characteristics, CT features of nodule, and composite model in the training and validation cohort

Construction of a new composite model

We investigated the diagnostic performance of 4MP combined with clinical characteristics and CT features of nodules. LASSO regression analysis of the patients’ clinical data and CT characteristics identified 7 factors with non-zero coefficients in the training cohort: age, gender, BMI, family history of cancer, nodule size, nodule margin, and nodule density (Fig. 4A). Subsequently, Lasso regression was used to analyze variable importance. The variables, ranked in descending order of importance, were as follows: nodule density, nodule margin, gender, family history of cancer, nodule size, BMI, and age (Fig. 4B, Table 3). Based on the 4MP and these 7 factors, we constructed a new composite model (composite model = Pro-SFTPB + CA125 + CEA + Cyfra21-1 + age + gender + BMI + family history of cancer + nodule size + nodule margin + nodule density). In the training cohort, the AUC value of the new composite model was 0.808, with a sensitivity of 75.14% and a specificity of 74.55% (Fig. 4C, Table 2). The equation for calculating the probability of early-stage lung cancer was derived from logistic regression: logit(P) = 4.064 + 0.01*Pro-SFTPB − 0.103*CA125 − 0.053*CEA − 0.189*Cyfra21-1 − 0.025*age − 1.220*gender − 0.084*BMI + 0.713*family history of cancer + 0.121*nodule size + 1.269*nodule margin − 2.145*nodule density. In the validation cohort, the AUC value of the new composite model was 0.714, with a specificity of 44.44% and a sensitivity of 87.07% (Fig. 4D, Table 2). These results indicate that the new composite model discriminates between early-stage lung cancer and benign lung nodules better than 4MP alone, although its specificity was lower in the validation cohort.
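
The logistic regression equation above can be turned into a predicted probability through P = 1 / (1 + exp(−logit(P))). The sketch below uses the reported intercept and coefficients; the dictionary keys, the units of the biomarkers, and the numeric coding of gender, family history, nodule margin, and nodule density are assumptions, because the text does not state them, and the example patient is hypothetical.

```python
import math

# Intercept and coefficients as reported in the equation above.
INTERCEPT = 4.064
COEFFICIENTS = {
    "pro_sftpb": 0.01,
    "ca125": -0.103,
    "cea": -0.053,
    "cyfra21_1": -0.189,
    "age": -0.025,
    "gender": -1.220,
    "bmi": -0.084,
    "family_history": 0.713,
    "nodule_size": 0.121,
    "nodule_margin": 1.269,
    "nodule_density": -2.145,
}


def predicted_probability(features):
    """Apply the logistic model: sum the weighted features, then invert the logit."""
    logit = INTERCEPT + sum(COEFFICIENTS[name] * features[name]
                            for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical patient; every value and every coding scheme here is an assumption.
example_patient = {
    "pro_sftpb": 20.0, "ca125": 10.0, "cea": 2.0, "cyfra21_1": 2.0,
    "age": 60.0, "gender": 1.0, "bmi": 23.0, "family_history": 0.0,
    "nodule_size": 8.0, "nodule_margin": 1.0, "nodule_density": 1.0,
}
print(f"{predicted_probability(example_patient):.3f}")
```

The output is meaningful only if the inputs use the same units and category codes as the original model fit, which would need to be confirmed against the study data.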

Fig. 4

Selection of characteristics from LASSO regression (A, B), and the receiver operating characteristic curve (ROC) of composite model for differentiating patients with benign nodules or lung cancer in the training cohort (C) and validation cohort (D). 4MP: Pro-SFTPB, CA125, Cyfra21-1, and CEA

Table 3 Results of the importance analysis of the 7 factors

Diagnosis performance of the new composite model

To further evaluate the diagnostic efficacy of the new composite model in distinguishing between benign nodules and early-stage lung cancer, we analyzed the model’s performance across nodules of varying sizes (Fig. 5). The AUC of the new composite model was 0.734 for 42 controls and 178 cases with nodule sizes > 10 mm. For 49 controls and 153 cases with nodule sizes of ≤ 10 mm, the AUC was 0.756. Among 27 controls and 107 cases with nodule sizes of ≤ 8 mm, the AUC increased to 0.820. Furthermore, for 9 controls and 27 cases with nodule sizes of ≤ 6 mm, the model demonstrated an AUC of 0.835, with a sensitivity of 70.37% and a specificity of 88.89% (Table 4). These findings suggest that the new composite model may have higher diagnostic efficacy in smaller nodules, although the ≤ 6 mm subgroup was small.

Fig. 5

The receiver operating characteristic curve (ROC) of composite model for benign and malignant differentiation of lung nodules in different sizes. A: nodule > 10 mm; B: nodule ≤ 10 mm; C: nodule ≤ 8 mm; D: nodule ≤ 6 mm

Table 4 The discrimination performance of composite model in lung nodules of different sizes

In addition, we selected 154 patients (34 controls and 120 cases) from the study subjects to compare the performance of the new composite model with that of the LCBP nodule risk model. In this population, the AUC of the new composite model was 0.680, and the AUC of the LCBP nodule risk model was 0.599 (Fig. 6). Furthermore, the specificity of the new composite model and the LCBP nodule risk model was 70.59% and 61.76%, respectively, while their sensitivity was 67.50% and 60.83%, respectively (Table 5). These findings suggest that the composite model outperformed the LCBP nodule risk model.

Fig. 6

Comparison of the receiver operating characteristic (ROC) curves of the composite model and the LCBP nodule risk model. LCBP: lung cancer biomarker panel

Table 5 The discrimination performance comparison between the composite model and LCBP nodule risk model

Discussion

Although the response of patients to treatment has dramatically improved in recent years with the advent of precision therapies for lung cancer, such as immunotherapy and targeted therapy, the 5-year survival rate is still only 21% [21]. Early screening, diagnosis, and timely treatment are the keys to a good prognosis for lung cancer patients [22]. Clinically, the detection of small lung nodules on CT imaging is critical for identifying early lung cancer [23]. Nonetheless, not all lung nodules are malignant, and a large-scale study of CT screening for lung cancer found that about 49% of screen-detected cancers may be overdiagnosed [24, 25]. In recent years, the prevalence of pulmonary nodules identified through chest CT scans conducted during routine medical care has risen significantly. Consequently, the effective management of both incidental and screen-detected nodules has emerged as a critical public health concern [26]. Therefore, improving the ability to distinguish benign from malignant lung nodules is vital to treating lung cancer.

At present, the primary challenge in managing lung nodules lies in accurately identifying high-risk nodules with potential malignancy and stratifying patients into distinct risk categories to inform subsequent management strategies [26]. In the NELSON trial, participants with nodules measuring < 5 mm in diameter had a low probability of developing lung cancer [27, 28]. Conversely, those with nodules ranging from 5 to 10 mm had a moderate probability, while nodules ≥ 10 mm were associated with a significantly increased likelihood of lung cancer development [27, 28]. According to the Fleischner guidelines, solid and subsolid pulmonary nodules require distinct management strategies. Specifically, for solid pulmonary nodules > 8 mm in size, tissue sampling is advised [29]. In Japan, the protocol for LDCT lung cancer screening advises follow-up evaluations at 3, 6, 12, 24, 36, 48, and 60 months for nodules with an overall mean diameter of < 15 mm and a solid component measuring < 8 mm in diameter [30]. The Chinese Expert Consensus on the Diagnosis and Treatment of Pulmonary Nodules (2024) delineates 18 consensus points, underscoring the critical importance of early diagnosis and intervention. It recommends specific screening ages for high-risk populations, clarifies the definition of lung nodules and the methodologies for their assessment, and advocates for the integration of artificial intelligence to enhance diagnostic accuracy [31]. Ye et al. proposed for the first time an adjustment of the criteria for a positive result in chest CT screening for pure ground-glass nodules of the lung in a Chinese population [32]. They suggested that the threshold should be raised from 6 to 8 mm and that only pure ground-glass nodules with a diameter of 8 mm or more should require management [32].

Liquid biopsy, as a non-invasive approach, has received widespread attention for its ease of repeated analysis and its ability to monitor tumor recurrence, metastasis, and response to treatment in real-time [33]. With the rapid development of molecular techniques, circulating tumor cells, circulating tumor DNA, circulating cell-free RNA, circulating cell-free DNA, and extracellular vesicles show potential clinical value in the diagnosis, treatment, and prognosis of lung cancer, but the low concentration of them in the blood results in low sensitivity of liquid biopsies [33]. Recently, Chen et al. developed an epigenetic biomarker model based on circulating ribosomes that is particularly effective in identifying high-risk lung nodules [34]. For the first time, MD Anderson researchers found that the 4MP was helpful for lung cancer risk prediction, with a higher AUC value for the 4MP + smoking model than the smoking-based risk prediction model (0.83 vs. 0.73) [19]. Afterward, researchers analyzed the performance of 4MP in distinguishing lung cancer from benign lung nodules [35]. The researchers found that the 4MP + nodules size model had a higher AUC (0.895) than the model based on nodule size alone (AUC was 0.860) or 4MP (AUC was 0.757), and in the independent validation cohort, the AUC of 4MP was 0.87 [35]. In the past two years, to step forward to determine the role of 4MP in lung cancer, researchers explored the lung cancer risk prediction performance of the 4MP + PLCOm2012 model and found the 4MP can be used for lung cancer risk assessment, with AUC values of 0.80 for 4MP alone detection and 0.85 for the combined 4MP + PLCOm2012 model for sera from cases collected within 1-year preceding diagnosis [36]. Moreover, Vykoukal et al. analyzed the predictive performance of 4MP in distinguishing lung cancer patients from controls and found an AUC value of 0.80 for 4MP, whereas the AUC value for 4MP + miR-210-3p + miR-320a-3p + miR-21-5p was 0.81 [37]. 
In 2024, MD Anderson Cancer Center researchers analyzed repeated measurements of 4MP in pre-diagnostic serum from 2483 ever-smoker participants [38]. They improved the performance of 4MP in the early detection of lung cancer using a parametric empirical Bayes algorithm [38]. However, most of these studies have investigated clinical diagnoses in Western populations. Lung cancer types, environmental factors, and genetic susceptibility differ between Western and Asian populations [39]. Yao et al. found that 4MP combined with SCC, neuron-specific enolase (NSE), and pro-gastrin-releasing peptide (Pro-GRP) better distinguished lung cancer from other lung diseases, and between lung cancer pathology types, in Chinese patients [2]. In our previous study, we found that the 4MP significantly distinguished Chinese lung cancer patients from normal individuals [20]. The nodule risk model combining 4MP with nodule size (4MP + nodule size) showed good potential in the benign-malignant differential diagnosis of lung nodules [20]. As our previous study was based on a single center, we collected a new study cohort to validate the performance of 4MP in the differential diagnosis of benign nodule and early-stage lung cancer patients. In this study, the AUC of 4MP in distinguishing early-stage lung cancer from benign lung nodules in Chinese patients was 0.612 in the training cohort and 0.686 in the validation cohort. Therefore, we aimed to improve the diagnostic performance of 4MP by combining it with other factors.

Clinical studies have demonstrated that whether a lung nodule is benign or malignant correlates with patients’ clinical characteristics and the CT features of the nodule. The Mayo model, which incorporates patient age, history of cancer, cigarette-smoking status, spiculation, nodule diameter, and upper lobe location as predictors, had an AUC value of 0.833 [15]. However, researchers and clinicians have found that the Mayo model may not apply to Asians [40]. The Brock model uses age, gender, emphysema, family history of cancer, nodule size, total nodule number, solid nodule, spiculation parameters, and upper lobe involvement, and achieved AUCs of at least 0.94 in an external validation cohort [41]. Previous research found that differences in disease prevalence and environmental factors may have led to the limited applicability of the Brock model in Asian populations, with an AUC of 0.58 to 0.71 in the Chinese cohort [32]. In recent years, researchers have made several advances in the study of differential diagnosis and treatment of benign and malignant nodules. Miao et al. proposed a deep learning model combining CT images of lung nodules and intrathoracic fat images to differentiate between benign and malignant lung nodules, which significantly outperformed the model using CT images of lung nodules alone with an AUC of 0.910, 0.922, and 0.899 in the internal and external test cohorts, respectively [42]. Zhao et al. proposed the MAEMC-NET model based on self-supervised learning, which can effectively distinguish between benign and malignant isolated lung nodules by analyzing CT images of patients, with an AUC value of 0.962 [43]. Meng et al. constructed a new risk stratification model cLung-RADS®v2022 based on Lung-RADS®v2022 and CT features for predicting invasive pure ground-glass nodules in China, which had an AUC value of 0.718 and 0.693 in the training and validation sets, respectively [44]. 
We analyzed the diagnostic effects of clinical characteristics (age, gender, BMI, drinking history, smoking history, family and personal history of cancer), CT features of nodules (size, number, margin, density, and shape) in Chinese benign nodules patients and early-stage lung cancer patients. We found that the AUC values for clinical characteristics and CT features of nodules were 0.628 and 0.726, respectively.

Related studies have shown that blood biomarkers combined with clinical characteristics can significantly improve the predictive performance of risk models for early malignant lung nodules. Xu et al. constructed a network diagnostic model consisting of seven autoantibodies (CAGE, PGP9.5, GAGE7, MAGEA1, SOX2, GBU4-5, and P53), clinical characteristics (age, cancer history, smoking history), and imaging features (nodule size, total nodule number, property of nodule, spiculation, lobulated sign, vessel sign, bubble-like sign, and pleural indentation) for the diagnosis of lung nodules, which had an AUC value of 0.96 [45]. In addition, Yang et al. developed the LCBP nodule risk model; in the training cohort, the AUC of the nodule risk model was 0.9151, but in the validation cohort, the AUC was only 0.5836 [17]. Hou et al. developed a predictive model based on CEA, CYFRA21-1, and CT features to differentiate between benign and malignant lung nodules, which achieved an AUC of 0.85 and 0.76 in the training and validation groups, respectively [18]. In this study, we selected 7 factors from the patients’ clinical information and the CT features of nodules and developed a new composite model that integrated 4MP, clinical characteristics (age, gender, BMI, family history of cancer), and CT features of the nodule (nodule size, nodule margin, and nodule density). The composite model had good predictive performance, with an AUC value of 0.808 in the training cohort and 0.714 in the validation cohort.

In lung cancer screening, based primarily on retrospective analyses of data from the International Early Lung Cancer Action Program and the NLST, it is generally accepted that 6.0 mm is the threshold for positive results on the baseline scan [46]. It is important to note that this does not mean that cancers smaller than 6.0 mm cannot be detected on a baseline scan; it means only that nodules of this size have a low incidence of malignancy [46]. The probability of malignancy is 1–2% for nodules 6–8 mm and less than 1% for all nodules smaller than 6 mm [12]. In the NLST, the lung cancer probability was 0.3% when the nodule diameter was 4–6 mm [47]. Studies from the MD Anderson Cancer Center in Texas have shown that in patients with nodule size ≤ 6 mm, the combination of nodule size + 4MP performed exceptionally well, with an AUC of 0.95 [35]. In our study, the AUC values of the composite model were 0.820 and 0.835 in patients with pulmonary nodules of ≤ 8 mm and ≤ 6 mm, respectively. In addition, the performance of the composite model (AUC = 0.680) was better than that of the LCBP nodule risk model (AUC = 0.599). These observations suggest that the new composite model performs well in distinguishing benign lung nodules from early-stage lung cancer in Chinese patients and may perform better in smaller nodules. The new composite model is suitable for the adjunctive diagnosis of early-stage lung cancer: when LDCT is used to screen people at high risk of lung cancer, patients with difficult-to-identify lung nodules can be further assessed by combining their clinical characteristics, biomarker levels, and CT features.

There are several limitations to this study. First, although we analyzed the performance of the 4MP with patient data from different hospitals, continued multicenter studies are needed to comprehensively assess the applicability of the 4MP. Second, the new composite model relies on CT image characteristics, which may reduce the applicability of the model in resource-limited areas where patients may not have access to CT scans. Finally, the sample size needs to be expanded, especially for patients with lung nodules ≤ 6 mm, which will be the focus of our future studies.

Conclusions

In summary, we constructed a new composite model for the differential diagnosis of benign nodules and early-stage lung cancer in Chinese patients. The model can help diagnose malignant pulmonary nodules, particularly small ones, and aid in stratifying patients by lung cancer risk.