Introduction

Lung cancer is a malignant tumor characterized by uncontrolled cell growth in lung tissues1. According to 2020 global cancer statistics, lung cancer accounts for 11.4% of new cancer cases and remains the leading cause of cancer death (18%) worldwide2. Lung cancer may present with a range of symptoms, including cough, chest pain, dyspnea, anorexia, weight loss, and fatigue3. Previous studies have reported a mean survival time of 13 months, a median survival time of 4.8 months, and a 5-year survival rate of 17%4,5. Age and sex have been identified as predictors of survival (Abedi et al., 2019). Therefore, to reduce the disease burden, identifying risk factors and monitoring the health status of patients with lung cancer are imperative. Accelerated biological aging, characterized by a premature decline in homeostasis, is a major risk factor for age-related diseases and mortality6. Aging is a primary contributor to the development and progression of lung cancer7, and it is associated with molecular and physiological changes that impair lung function, reduce lung remodeling and regeneration capacity, and increase susceptibility to acute and chronic lung diseases, including lung cancer8,9. However, few population-based studies have explored the relationship between lung cancer and biological aging. Previous studies have shown that increased biological age is associated with increased risks of cancer, mortality, depression, anxiety, and chronic kidney disease10,11,12. Researchers have developed and validated a multisystem aging measure called phenotypic age (PhenoAge) as a surrogate marker of biological aging13. This measure uses clinical biomarkers to approximate an individual’s biological aging rate, rather than relying solely on genetic or molecular markers. It has been shown to be a reliable indicator of remaining life expectancy, morbidity, and mortality risk, making it a powerful tool for assessing biological aging in population studies13.
Previous studies have demonstrated a link between aging and an increased risk of various cancers, including lung cancer14. Recent studies have further explored the relationship between biological aging, as measured by DNA methylation-based biomarkers, and lung cancer risk. Age acceleration measures, such as PhenoAge and GrimAge, have been shown to be associated with increased cancer risk, particularly lung cancer13,15,16. A prospective analysis in the UK Biobank and a Mendelian randomization study revealed that PhenoAgeAccel was positively associated with lung cancer risk, independent of chronological age17. Additionally, smoking and electronic cigarette use have been linked to accelerated epigenetic aging in lung tissue, with both smokers and vapers showing significantly older GrimAge and shorter telomere lengths than nonsmokers18. These findings suggest that epigenetic age acceleration is an independent risk factor for lung cancer and may provide insights into the relationships among biological aging, smoking, and cancer risk13,15. However, compared with DNA methylation-based biological age measurements, PhenoAge is more accessible and less expensive. The present study is the first to use PhenoAge to explore the relationship between biological aging and lung cancer in the U.S. population. Identifying individuals whose biological age exceeds their chronological age may facilitate timely interventions, thereby helping to prevent disease and reduce mortality.

Therefore, the aim of this study was to analyze the relationship between biological aging and lung cancer, as well as its impact on mortality, on the basis of the NHANES database.

Methods

Study population

In this study, we analyzed data to examine the associations among accelerated biological aging, lung cancer risk, and all-cause mortality among individuals with lung cancer. Data were drawn from the U.S. National Health and Nutrition Examination Survey (NHANES), a continuous cross-sectional study conducted over 2-year cycles from 2001 to 2020. We included participants aged > 18 years with available data on the clinical biomarkers required for the calculation of PhenoAge and lung cancer status on the basis of self-reported physician diagnoses. For the mortality analysis, we further excluded individuals with missing death records.

This study was conducted in accordance with the Declaration of Helsinki and approved by the National Center for Health Statistics (NCHS) Research Ethics Review Board19. All participants provided written informed consent.

Ascertainment of outcomes

In the NHANES, all participants completed a questionnaire that included a cancer-related question: “Have you ever been told by a doctor or other health professional that you had cancer or a malignancy of any kind?” Participants who responded “yes” were asked a follow-up question (“What kind of cancer was it?”) to identify those who had been diagnosed with lung cancer.

For the longitudinal analysis, mortality status was ascertained from the NHANES Public Use Linked Mortality File, which provides vital status follow-up data through record linkage, enabling survival analyses of NHANES participants20.

Procedures

PhenoAge was computed for all NHANES participants as a measure of biological aging. It captures the influence of clinical biomarkers on health outcomes and has been shown to predict disease risk and aging-related outcomes more effectively than chronological age13. PhenoAge was derived using the Gompertz proportional hazards model, which is widely used in aging research to model mortality risk and its exponential increase with age, making it well suited to aging studies. The calculation follows a well-established protocol in which two Gompertz proportional hazards models are parameterized. The first model incorporates ten biomarkers and chronological age to estimate an individual’s biological aging status. The second model uses chronological age alone as the predictor, serving as a baseline for evaluating whether PhenoAge provides predictive value beyond chronological age (Levine, 2013). In accordance with the protocol outlined in related studies, the NHANES III dataset was used as the reference population, so that an individual’s PhenoAge reflects the chronological age of an individual with the same mortality risk21. PhenoAge was calculated with the following formula.

$$\begin{aligned} {\text{PhenoAge}} & = 143.5671 + \frac{\ln \left[ - 0.0059383581 \times \ln \left( 1 - {\text{mortality risk}} \right) \right]}{0.08548908} \\ {\text{mortality risk}} & = 1 - \exp \left( - e^{xb} \, \frac{\exp (120 \times \gamma ) - 1}{\gamma } \right) \\ \gamma & = 0.007354285 \end{aligned}$$

where xb = −19.907 − 0.0336 × Albumin + 0.0095 × Creatinine + 0.1953 × Glucose − 0.0120 × Lymphocyte Percent + 0.0268 × Mean Cell Volume + 0.3306 × Red Cell Distribution Width + 0.00188 × Alkaline Phosphatase + 0.0554 × White Blood Cell Count + 0.0804 × Chronological Age.
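As a rough illustration of how the expressions above fit together, the following Python sketch evaluates xb, the mortality risk, and PhenoAge with the constants printed in the text. It assumes that the mortality risk reads 1 − exp(−e^{xb} × (exp(120 × γ) − 1)/γ). The function names and dictionary keys are hypothetical, and the biomarker units, which the text does not state, are left to the caller.

```python
import math

# Coefficients copied from the xb expression in the text.
# The text does not state biomarker units, so none are assumed here.
INTERCEPT = -19.907
COEFFS = {
    "albumin": -0.0336,
    "creatinine": 0.0095,
    "glucose": 0.1953,
    "lymphocyte_percent": -0.0120,
    "mean_cell_volume": 0.0268,
    "red_cell_distribution_width": 0.3306,
    "alkaline_phosphatase": 0.00188,
    "white_blood_cell_count": 0.0554,
    "chronological_age": 0.0804,
}
GAMMA = 0.007354285


def linear_predictor(values):
    """Return xb for a dict holding one value per key in COEFFS."""
    return INTERCEPT + sum(COEFFS[name] * values[name] for name in COEFFS)


def mortality_risk(xb):
    """Gompertz mortality risk, under the assumed reading of the formula."""
    cumulative_hazard = math.exp(xb) * (math.exp(120 * GAMMA) - 1) / GAMMA
    return 1.0 - math.exp(-cumulative_hazard)


def phenoage(xb):
    """Map xb to PhenoAge; fails if the risk rounds to exactly 1."""
    risk = mortality_risk(xb)
    return 143.5671 + math.log(-0.0059383581 * math.log(1.0 - risk)) / 0.08548908


if __name__ == "__main__":
    # Illustrative xb values only; they do not come from the study data.
    for xb in (-9.0, -8.0):
        print(xb, round(mortality_risk(xb), 4), round(phenoage(xb), 2))
```

Under this reading, PhenoAge is linear in xb with slope 1/0.08548908, so a one-unit rise in xb corresponds to roughly 11.7 additional years of PhenoAge.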

Accelerated biological aging is defined as a biological age that exceeds an individual’s chronological age. It was quantified as the difference between the estimated biological age and the actual chronological age21.

On the basis of their clinical relevance and the existing literature12,22, the following covariates were included: age, sex (male, female), race (non-Hispanic white, non-Hispanic black, Mexican American, other), education level (grade school or less, high school, more than high school), poverty status (ratio of family income to poverty), smoking status (never, former, current smoker), drinking status (never, former, light or moderate, heavy), BMI, and history of emphysema. Smoking status was divided into three categories. Current smokers had smoked at least 100 cigarettes in their lifetime and smoked on a regular basis. Former smokers had smoked at least 100 cigarettes and had subsequently quit. Participants who had never smoked, or who had smoked fewer than 100 cigarettes, were classified as never smokers. Drinking status was categorized as follows: never (fewer than 12 alcoholic beverages in their lifetime), former (12 or more alcoholic beverages in a single year or at some point in their lifetime, but none in the preceding year), light/moderate (≤ 1 alcoholic beverage per day for women or ≤ 2 alcoholic beverages per day for men on average over the past year), or heavy (> 1 alcoholic beverage per day for women or > 2 alcoholic beverages per day for men on average over the past year). Body mass index (BMI) was grouped into three categories: ≤ 25, > 25 to < 30, and ≥ 30 kg/m². History of emphysema was based on participants’ self-report.

Study design

In this study, we employed a two-stage design to examine the relationships among accelerated biological aging, lung cancer risk, and all-cause mortality among individuals with lung cancer, utilizing data from the NHANES. In Stage 1, we performed a cross-sectional analysis to investigate the association between PhenoAge acceleration and the prevalence of lung cancer at baseline. Lung cancer status was determined on the basis of self-reported diagnoses collected through NHANES questionnaires. In Stage 2, a longitudinal analysis was conducted to evaluate the association between PhenoAge acceleration and all-cause mortality among participants diagnosed with lung cancer. Mortality data were obtained from the NHANES Public Use Linked Mortality File, enabling survival analysis during the follow-up period.

Statistical analysis

In the present study, we used NHANES data to describe the baseline characteristics of participants with and without lung cancer. Continuous data are expressed as means ± SDs (x̄ ± s). Differences among groups were assessed using either Student’s t test or one-way analysis of variance (ANOVA). For multiple comparisons, the SNK or LSD method was employed. Qualitative data are presented as percentages and were analyzed via either a chi-square (χ²) test or Fisher’s exact test, as appropriate. Missing values for categorical variables (BMI, smoking status, drinking status, education level, and history of emphysema) were treated as separate categories to preserve the full sample size and avoid potential selection bias. All reported P values were two-sided, and a value of less than 0.05 was considered to indicate statistical significance. In alignment with the NHANES analytic guidelines, the analyses accounted for the complex sampling design, including sampling weights, stratification (strata), and clustering (PSU, primary sampling units). All statistical models incorporated NHANES survey design variables to ensure correct variance estimation and population inference. Furthermore, all analyses were adjusted for potential confounding variables.

In this study, PhenoAge acceleration (PhenoAgeAccel) was initially calculated as a continuous variable using the R package BioAge. To explore the distribution of PhenoAgeAccel across participants, we also categorized it into quartiles on the basis of the PhenoAgeAccel distribution in the study cohort, with cutoffs corresponding to the 25th, 50th (median), and 75th percentiles. Furthermore, PhenoAge acceleration was computed by regressing PhenoAge against chronological age at the time of biomarker measurement, with the residuals defining the PhenoAgeAccel.
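The residual-based definition and the quartile grouping described above can be sketched in plain Python. This is only an illustration: the study used the R package BioAge, the function names here are hypothetical, and the percentile convention (the default of `statistics.quantiles`) is an assumption because the text does not state one.

```python
from statistics import mean, quantiles


def phenoage_accel_residuals(phenoage, age):
    """Residuals from a least-squares regression of PhenoAge on chronological age."""
    age_bar, pheno_bar = mean(age), mean(phenoage)
    sxx = sum((a - age_bar) ** 2 for a in age)
    sxy = sum((a - age_bar) * (p - pheno_bar) for a, p in zip(age, phenoage))
    slope = sxy / sxx
    intercept = pheno_bar - slope * age_bar
    return [p - (intercept + slope * a) for a, p in zip(age, phenoage)]


def quartile_labels(values):
    """Label each value Q1 to Q4 using the 25th, 50th, and 75th percentiles."""
    q1, q2, q3 = quantiles(values, n=4)  # default "exclusive" method (assumption)

    def label(v):
        if v <= q1:
            return "Q1"
        if v <= q2:
            return "Q2"
        if v <= q3:
            return "Q3"
        return "Q4"

    return [label(v) for v in values]


if __name__ == "__main__":
    # Made-up values for illustration; not study data.
    ages = [30, 40, 50, 60, 70, 80]
    phenoages = [28, 43, 49, 66, 68, 85]
    accel = phenoage_accel_residuals(phenoages, ages)
    print([round(r, 2) for r in accel])
    print(quartile_labels(accel))
```

By construction the residuals average to zero, so a positive residual marks a participant whose PhenoAge is higher than expected for their chronological age.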

In logistic regression models, PhenoAge acceleration was assessed both as a continuous variable and as a binary variable. ‘Accelerated biological aging’ was defined as a PhenoAge acceleration value exceeding zero, whereas ‘nonaccelerated aging’ was defined as a PhenoAge acceleration value of zero or less. Given the presence of zero inflation (many participants exhibited no evidence of accelerated aging), we used both continuous and binary classifications of PhenoAge acceleration to assess its impact on lung cancer risk.
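The binary classification described above amounts to a threshold at zero. A minimal sketch, using the difference-based acceleration for concreteness (a residual-based value could be passed to the same check); the function names are hypothetical:

```python
def phenoage_accel(phenoage, chronological_age):
    """PhenoAge acceleration as biological age minus chronological age."""
    return phenoage - chronological_age


def is_accelerated(accel):
    """True for accelerated aging (> 0); zero or less counts as nonaccelerated."""
    return accel > 0


if __name__ == "__main__":
    # Illustrative values only.
    print(is_accelerated(phenoage_accel(55.2, 50.0)))  # PhenoAge above age
    print(is_accelerated(phenoage_accel(47.0, 50.0)))  # PhenoAge below age
```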

The clinical outcome of all-cause mortality over time was described with Kaplan–Meier survival curves and compared by the log-rank test. Furthermore, Cox proportional hazards regression models were used to evaluate the relationships between PhenoAge and mortality in individuals with lung cancer. The confounders were selected on the basis of clinical interest, previous scientific literature and the identification of all significant covariates in the univariate analysis. The objective was to evaluate the association between lung cancer incidence and PhenoAge acceleration, and for this purpose, several models were employed. The crude model was not adjusted for any variables, whereas Model 1 was adjusted for age, sex, and race. Model 2 was further adjusted for education level, the family income-to-poverty ratio, smoking status, drinking status, BMI, and history of emphysema. This stepwise adjustment approach is widely used in epidemiological studies to evaluate whether biological aging provides additional predictive value beyond demographic and lifestyle factors22. It allows for a clearer understanding of the independent association between PhenoAge acceleration and lung cancer risk. The P value for the trend across increasing exposure groups was calculated using integer values (1, 2, 3, and 4). A restricted cubic spline (RCS) regression model was employed to assess the dose‒response relationships between biological age (PhenoAge) and the risk of developing lung cancer, as well as between biological age and all-cause mortality in individuals with lung cancer. The model was fitted using three knots placed at the 25th, 50th, and 75th percentiles of the PhenoAge distribution to capture the nonlinear relationships and better understand how biological age influences both outcomes.
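To make the three-knot restricted cubic spline concrete, the sketch below builds the spline terms for a single value using Harrell's truncated-power formulation, which is one common parameterization. The text does not say which software or basis was used, so this is an assumption, and the function names are hypothetical. With three knots there is one linear term and one nonlinear term, and the fitted curve is constrained to be linear beyond the outer knots.

```python
def _pos_cube(u):
    """Truncated cubic: u**3 when u is positive, otherwise 0."""
    return u ** 3 if u > 0 else 0.0


def rcs_basis_3knots(x, knots):
    """Return (linear term, nonlinear term) of a 3-knot restricted cubic spline."""
    t1, t2, t3 = sorted(knots)
    nonlinear = (
        _pos_cube(x - t1)
        - _pos_cube(x - t2) * (t3 - t1) / (t3 - t2)
        + _pos_cube(x - t3) * (t2 - t1) / (t3 - t2)
    )
    return x, nonlinear


if __name__ == "__main__":
    # Hypothetical knots standing in for the 25th, 50th, and 75th percentiles.
    knots = (0.0, 1.0, 3.0)
    for x in (-1.0, 0.5, 2.0, 5.0, 6.0, 7.0):
        print(x, rcs_basis_3knots(x, knots))
```

In a regression, both terms enter as covariates, and a test of whether the coefficient on the nonlinear term is zero gives the P value for nonlinearity.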

We performed several sensitivity analyses. First, participants whose values fell within the 1% extremes of the PhenoAge acceleration range were excluded to rule out the effects of extreme values. Second, the associations between PhenoAge acceleration and lung cancer and all-cause mortality were reanalyzed without accounting for the complex sampling design. Third, to determine the impact of various factors on the relationship between lung cancer and biological aging, we conducted subgroup analyses and interaction tests. Fourth, we conducted sensitivity analyses to assess whether missingness influenced the study outcomes. Finally, we applied several additional association inference models: propensity score adjustment (PSA), propensity score matching (PSM)23, inverse probability of treatment weighting (IPTW)24, standardized mortality ratio weighting (SMRW)25, pairwise algorithm (PA)26, and overlap weight (OW)27. The calculated effect sizes and P values from all these models were reported and compared. All the analyses were performed with the statistical software packages R 3.3.2 (http://www.R-project.org, The R Foundation) and the Free Statistics analysis platform (Version 1.9, Beijing, China, http://www.clinicalscientists.cn/freestatistics).
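For orientation, three of the weighting schemes named above have simple textbook forms once a propensity score e (the estimated probability of being in the exposed group, here accelerated aging) is available. The sketch below uses those standard definitions; it does not claim to reproduce the exact implementation used in the study, and the function name and scheme labels are hypothetical. The pairwise algorithm is omitted because the text does not describe it.

```python
def propensity_weight(exposed, e, scheme):
    """Weight for one participant given exposure status and propensity score e."""
    if not 0.0 < e < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    if scheme == "IPTW":   # inverse probability of treatment weighting
        return 1.0 / e if exposed else 1.0 / (1.0 - e)
    if scheme == "SMRW":   # standardized mortality ratio weighting
        return 1.0 if exposed else e / (1.0 - e)
    if scheme == "OW":     # overlap weighting
        return 1.0 - e if exposed else e
    raise ValueError(f"unknown scheme: {scheme}")


if __name__ == "__main__":
    # Illustrative propensity score only.
    for scheme in ("IPTW", "SMRW", "OW"):
        print(scheme, propensity_weight(True, 0.25, scheme),
              propensity_weight(False, 0.25, scheme))
```

IPTW targets the whole population, SMRW reweights the unexposed to resemble the exposed, and overlap weights emphasize participants whose propensity scores are close to 0.5.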

Results

Among the 97,657 participants recruited to the NHANES from 2001 to 2020, 46,499 participants had missing data on their lung cancer status and PhenoAge measurements. The percentages of missing data are summarized in Table 1. The rate of missing BMI data was 1.8% (n = 726). For smoking status, 2.9% of the data were missing (n = 1179). Drinking status had the highest proportion of missing data, with 10.5% missing values (n = 4295). Education level had a negligible missing data rate of 0.1% (n = 41). History of emphysema had a missing data rate of 3.5% (n = 1437). A total of 10,315 participants aged 18 years or younger and 78 people with missing death data were excluded, resulting in 40,765 people being included in the analysis (Fig. 1). The mean age of the participants included in our analyses was 48.3 years (SD 18.7), and 20,046 (51.4%) were male (Table 1).
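The exclusion counts reported above can be tallied as a quick consistency check. The variable names are hypothetical; the numbers are those given in the text.

```python
recruited = 97_657
missing_lung_cancer_or_phenoage = 46_499
aged_18_or_younger = 10_315
missing_death_data = 78

analytic_sample = (
    recruited
    - missing_lung_cancer_or_phenoage
    - aged_18_or_younger
    - missing_death_data
)
print(analytic_sample)  # 40765
```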

Fig. 1 Flowchart of participant selection from the US NHANES.

Table 1 Baseline characteristics of participants by lung cancer status.

Overall, participants with lung cancer were older, more likely to be Non-Hispanic White, more likely to be former or current smokers, and more likely to be former drinkers than participants without lung cancer were (p < 0.001 for all comparisons). Individuals with a history of emphysema also had a higher incidence of lung cancer than the general population did. There were 10,874 (26.7%) participants who were classified as having PhenoAge acceleration, and those with accelerated biological aging had a greater incidence of lung cancer than those without accelerated biological aging.

Cross-sectional analyses revealed statistically significant associations between biological aging and the prevalence of lung cancer for each model. Each 1-year increase in PhenoAge acceleration increased the risk of developing lung cancer by 4% (p < 0.001, Table 2, Model 2).

Table 2 Association between biological aging and the risk of developing lung cancer.

Individuals with accelerated aging had a greater risk of developing lung cancer than those with nonaccelerated aging, with a fully adjusted odds ratio of 2.60 (95% CI 1.59–4.22, p < 0.001). According to the fully adjusted models, participants in the Q4 subgroup had a 334% greater risk of developing lung cancer.

Kaplan–Meier analyses of individuals diagnosed with lung cancer revealed a greater risk of all-cause mortality for those with accelerated aging than for those with nonaccelerated aging (log rank test P = 0.028, Supplementary_Figure 1). Among lung cancer patients, the multivariate Cox proportional hazards model results, adjusted for potential confounders, revealed that each 1-unit increase in PhenoAge acceleration was associated with a 12% increase in all-cause mortality (95% CI 1.04–1.20, P = 0.003) (Table 3, Model 2).
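The Kaplan–Meier curves referred to above rest on the product-limit estimator, which can be sketched in a few lines. This is an unweighted illustration with a hypothetical function name and made-up follow-up times; it ignores the NHANES survey weights that the actual analysis accounted for.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Made-up follow-up times (months) and death indicators.
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
    # [(1, 0.75), (3, 0.375), (4, 0.0)]
```

Computing one such curve per group (accelerated versus nonaccelerated aging) gives the curves that the log-rank test compares.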

Table 3 Association between biological aging and the risk of all-cause mortality among lung cancer patients.

Moreover, individuals in the Q4 quartile presented a greater incidence of all-cause mortality (HR = 11.82, 95% CI 1.48–94.3, P = 0.02) than those in the Q1 quartile did, following adjustment for potential confounding variables. The mortality rate was 219% higher in the accelerated aging cohort than in the nonaccelerated aging cohort (95% CI 1.28–7.94, P = 0.013) in Model 2.
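Several results are reported as percentage increases rather than as ratios. The conversion between the two is simple arithmetic, sketched below with hypothetical function names. The ratios 3.19 and 4.34 shown here are back-calculated from the percentages in the text (219% and 334%), not quoted from the tables.

```python
def percent_increase(ratio):
    """Percentage increase implied by a hazard or odds ratio."""
    return (ratio - 1.0) * 100.0


def ratio_from_percent(percent):
    """Ratio implied by a stated percentage increase."""
    return 1.0 + percent / 100.0


if __name__ == "__main__":
    print(round(percent_increase(1.12), 1))    # 12.0
    print(round(ratio_from_percent(219), 2))   # 3.19
    print(round(ratio_from_percent(334), 2))   # 4.34
```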

The estimated dose‒response curves revealed a statistically significant linear correlation between PhenoAge acceleration and both lung cancer risk (Supplementary_Figure 2, P value for nonlinearity = 0.22) and all-cause mortality (Supplementary_Figure 3, P value for nonlinearity < 0.001). These analyses were adjusted for multiple variables using RCS regression and excluded the highest and lowest 0.5% of the PhenoAge acceleration measurements to ensure robustness. As PhenoAge acceleration increases, both lung cancer risk and mortality show a noticeable upward trend, suggesting that greater biological aging is associated with an increased risk of developing lung cancer and a greater likelihood of death in individuals with lung cancer.

In cross-sectional analyses, the correlation between PhenoAge acceleration and lung cancer was stronger in male participants than in their female counterparts. Additionally, this correlation was more pronounced in individuals without a history of emphysema than in those with a history of emphysema (Fig. 2). The associations between PhenoAge acceleration and the risk of developing lung cancer were more pronounced in Non-Hispanic White and Mexican American individuals and in those with BMIs between 25 and 30 kg/m2 than in those with other BMIs. No statistically significant interaction was observed in the cross-sectional analysis between the other stratified variables. Sensitivity analyses revealed that the results remained consistent and robust after excluding participants with PhenoAge acceleration values in the upper and lower 1% extremes (as detailed in Supplementary_Table 1). Furthermore, repeating the main analyses without considering complex sampling designs (as detailed in Supplementary_Table 2) yielded similar results. After PSM, a statistically significant positive association was identified between PhenoAge acceleration and lung cancer (OR = 2.32, 95% CI 1.41–3.83, p = 0.001). The results remained consistent when weighting analysis was employed with SMRW, PA, and OW (Supplementary_Table 3). For the analysis of all-cause mortality, PSM revealed that PhenoAge acceleration was significantly associated with increased all-cause mortality (HR = 2.27, 95% CI 2.18–2.36; P < 0.001). Weighted Cox regression models yielded consistent results, with HRs ranging from 1.91 to 1.93 across the IPTW, SMRW, PA, and OW methods (Supplementary_Table 4). Additionally, when excluding participants with missing data, the results remained unchanged, confirming the robustness of our findings (Supplementary_Table 5).

Fig. 2 Stratified analyses of the associations between biological aging and the risk of developing lung cancer.

Discussion

In this study, we explored the associations between accelerated biological aging and both lung cancer risk and all-cause mortality among individuals with lung cancer using data from the NHANES. Our findings indicate that PhenoAge acceleration, a clinical measure of biological aging, is significantly associated with an increased risk of developing lung cancer and increased all-cause mortality in lung cancer patients.

The relationship between aging and lung cancer risk has been attributed to three primary factors: first, the progressive accumulation of unresolved damage resulting from exposure to carcinogenic substances (such as nicotine)28; second, the age-related deterioration of immune function9; and third, increased cellular senescence29. Furthermore, oxidative stress has been shown to alter DNA repair mechanisms through epigenetic modifications. This process is also associated with premature aging, which can be attributed to telomere shortening caused by impaired DNA repair. Notably, telomere shortening has been linked to the development of lung cancer30. The concept of ‘lung age’, derived from the forced expiratory volume in the first second, body height, and age, has been shown to indicate obstructive pulmonary impairment and to be significantly associated with postoperative respiratory complications and survival in patients with lung cancer31. In addition, previous studies in the UK Biobank have shown that biological aging increases the risk of developing lung cancer, which is consistent with our findings12,17. Biological age, as determined by the PhenoAge method, reflects numerous factors, including metabolism, immunity, inflammation, and organ homeostasis. This approach can identify individuals who appear physiologically older than their chronological age would suggest32,33. These findings reinforce the understanding that accelerated biological aging increases the risk of developing lung cancer and suggest that higher levels of accelerated biological aging could serve as valuable predictors of lung cancer development.

To date, the relationship between lung cancer and accelerated biological aging has been explored in only a limited number of studies12,17. Our findings provide additional evidence linking lung cancer to biological aging and are the first to analyze how biological aging impacts mortality in lung cancer patients. The evidence suggests that aging plays a key role in the risk of death across populations34,35. The process of aging is typified by a series of gradual physiological declines, a reduction in cellular repair mechanisms, and the eventual occurrence of death36. The age component offers insight into the impact of physiological alterations resulting from the aging process over time37. The term “biological aging” is used to describe the decline in tissue and organismal function that occurs over time. In contrast, the term “chronological aging” is used to indicate the amount of time that has passed since an individual’s birth38. Therefore, it is more accurate to use biological age to predict mortality. Previous studies have shown that accelerated biological aging increases mortality in individuals with Alzheimer’s disease, type 2 diabetes and multivessel coronary artery disease39,40,41. Furthermore, individuals who were diagnosed with lung cancer and exhibited accelerated biological aging presented an increased mortality risk compared with those without lung cancer and without accelerated biological aging. Therefore, monitoring the biological aging status of individuals with and without lung cancer may be beneficial for preventing mortality.

The strengths of our study lie in the large sample size and the extended survey periods of the NHANES. These factors provide a robust foundation for our findings. Our study also has several limitations. First, owing to the cross-sectional nature of the NHANES dataset, causal relationships between accelerated biological aging and lung cancer cannot be established. It remains unclear whether accelerated aging precedes lung cancer development or if lung cancer itself accelerates biological aging. This temporal ambiguity limits the ability to draw definitive conclusions. Additionally, the study does not account for the differing rates of biological aging across individuals with the same chronological age, which could lead to variation in the relationship between biological aging and lung cancer risk. Future longitudinal studies are needed to clarify these temporal relationships. Second, our study focused on prevalent lung cancer cases, namely, individuals who were already diagnosed with lung cancer at the time of data collection. While this approach provides insights into the relationship between biological aging and lung cancer outcomes, it does not address how biological aging contributes to the onset of lung cancer. Incorporating incident cases in future research would help elucidate the role of biological aging in the early stages of lung cancer development. Third, lung cancer diagnoses in the NHANES are based on self-reported data, which may introduce reporting bias and affect the accuracy of the results. Additionally, NHANES lacks detailed clinical information on cancer subtype, histology, and aggressiveness, which are crucial for understanding the heterogeneity of lung cancer and its interaction with biological aging. Fourth, while PhenoAge has been shown to predict disease and mortality outcomes effectively, it may not capture the full complexity of biological aging mechanisms. 
Biomarkers such as DNA methylation-based measures could provide deeper insights into the molecular processes underlying aging and offer additional predictive power. Future studies integrating clinical and molecular data are necessary to improve the accuracy of biological aging assessments. Fifth, regarding missing data, we treated missing values as separate categories for categorical variables. This approach assumes that individuals with missing data are systematically different from those with complete data, which could introduce potential bias. However, we conducted sensitivity analyses to assess whether missingness influenced the study outcomes. The consistency of our findings across multiple analytic approaches suggests that missing data did not substantially impact the overall conclusions. Sixth, this study spans multiple NHANES cycles from 2001 to 2020, during which changes in healthcare policies, screening programs, and treatment advancements may have influenced both biological aging and lung cancer risk. For example, Medicaid expansion and the implementation of low-dose CT lung cancer screening could have affected early detection and disease incidence. However, our study did not adjust for these long-term population-level shifts, which could introduce unmeasured confounding. Future research should consider temporal adjustments to evaluate whether the observed associations remain stable over time. Finally, the number of lung cancer cases in our cohort was relatively small, and the results should be considered preliminary. However, the results remained consistent across multiple sensitivity analyses, including PSM, unweighted analyses, and RCS regression. These methods confirmed the robustness of the observed associations. Future studies with larger numbers of incident lung cancer cases and diverse populations are needed to confirm these findings and explore their applicability to broader demographic groups. 
The incorporation of molecular biomarkers and longitudinal designs would further enhance the understanding of how biological aging influences lung cancer risk and outcomes.

Conclusion

Despite these limitations, this study underscores the potential of accelerated biological aging, as measured by PhenoAge, to serve as an independent predictor of lung cancer risk and mortality. The consistent associations observed across multiple sensitivity analyses suggest that interventions aimed at slowing biological aging may offer promising strategies for reducing lung cancer risk and improving outcomes in lung cancer patients. Further research is needed to establish causal links and identify the mechanisms driving these associations, paving the way for targeted prevention and treatment approaches.