Introduction

Chronic obstructive pulmonary disease (COPD) is a heterogeneous respiratory disorder characterized by persistent, progressive airflow limitation, involving airway abnormalities (such as bronchiectasis) and chronic respiratory symptoms (such as dyspnea, cough, and sputum production)1. Globally, COPD has become the third leading cause of death, and its prevalence rises with advancing age2. A post-bronchodilator FEV1/FVC ratio below 70% is considered the gold standard for diagnosing COPD3. Decreased lung function in the general population is associated with systemic biomarkers4. Certain blood metabolites have been regarded as clinical markers of acute exacerbations of COPD5; however, changes in their levels are not consistent in COPD6. Evidence of causation is therefore particularly important for biomarkers7.

In recent years, the emergence of metabolomics as a component of systems biology has provided a new approach to studying the underlying mechanisms of disease. Metabolomics, in particular, has the ability to offer understanding into the biological processes of disease development through the detection of altered metabolites or metabolic pathways8,9. Several metabolomics studies have shown metabolic dysregulation in patients with COPD10,11. Genetic epidemiologic studies of COPD show associations between COPD-related genetic polymorphisms and blood biomarkers12. Nevertheless, the number of biomarkers linked to COPD that have been discovered so far is limited. While many researchers have previously utilized proteomic methods to discover fresh indicators, like plasma sRAGE for detecting the existence and advancement of emphysema13, others have employed metabolomic methods to identify potential markers of disease severity or therapeutic candidates. These markers include various types of lipids and derivatives (primarily phospholipids, but also ceramides, fatty acids, and arachidonoids), amino acids, coagulation factors, and nucleic acid components. These substances may play a role in their functions, proteolytic metabolism, energy production, oxidative stress, immune-inflammatory responses, and coagulation disorders14. Dysregulation of sphingolipid metabolism is frequently observed in individuals suffering from chronic obstructive pulmonary disease. There is increasing evidence indicating that sphingolipids have a crucial involvement in the development of several lung conditions, such as asthma, acute lung injury, emphysema, COPD, and cystic fibrosis15,16. Exacerbation of COPD is linked to decreased levels of sphingomyelin in the plasma, whereas elevated levels are connected to the rapid advancement of emphysema17,18. Furthermore, amino acids serve as the fundamental components of proteins and have a vital function in intermediary metabolism. 
Evidence suggests that dysregulated amino acid metabolism may be present in COPD patients, even during periods of rest. This indicates that an abnormal amino acid profile could potentially be a significant risk factor for COPD, as supported by research19,20. This suggests that targeting these metabolites may be a promising approach to treating COPD21. Regrettably, the precise connection between metabolites and COPD remains uncertain due to the absence of any prospective studies on metabolites and COPD thus far. The lack of clarity regarding the causal connection between metabolites and COPD arises from the inherent design flaws in conventional observational studies, including modifications in metabolites due to intentional lifestyle adjustments in patients following COPD diagnosis and alterations in metabolites caused by prolonged use of specific medications. Implementing rigorous randomized controlled trials (RCTs) is challenging due to ethical concerns, lengthy observation periods, costly funding, and various limitations; however, they hold the utmost credibility in showcasing causal effects in evidence-based medicine.

Mendelian randomization (MR) is a useful framework for studying causality using genetic instruments (e.g., single nucleotide polymorphisms [SNPs])22. In the absence of randomized controlled trials, MR is an effective method for investigating causal relationships between exposures and outcomes of interest. In addition, MR helps to reduce the confounding and reverse-causality bias inherent in observational studies. In this study, we investigated the potential causal effects of 486 genetically predicted blood metabolites on COPD within an MR framework, using publicly available GWAS summary statistics. The findings may provide additional evidence on the etiology of COPD.

Methods

Data on 486 blood metabolites and COPD from GWAS

This study used the genome-wide association study (GWAS) conducted by Shin et al. in 2014, which included 7,824 adults from two European cohorts, TwinsUK and KORA, and covered 2,163,597 SNPs23. Fasting serum samples were profiled by untargeted mass spectrometry together with high-performance liquid chromatography (HPLC) and gas chromatography-mass spectrometry (GC-MS), and 486 serum metabolites were successfully identified. Table S1 lists the names of the 486 metabolites, where the prefix X denotes a metabolite of unidentified chemical composition. Of these, 309 were identified through the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and categorized into eight biochemical classes: amino acids, carbohydrates, cofactors and vitamins, energy, lipids, nucleotides, peptides, and exogenous compounds. The remaining 177 unknown metabolites were not further analyzed due to their uncertain chemical nature. Complete summary statistics for these GWAS datasets are publicly accessible via the Metabolomics GWAS server (https://metabolomics.helmholtz-muenchen.de/gwas/). GWAS summary data for COPD, the outcome, were obtained from the FinnGen biospecimen repository (https://www.finngen.fi/en); the analysis included genotype data from 6,915 COPD patients and 186,723 controls. The data used in this study are publicly available and anonymized, so they are not personally identifiable. All participants in the original studies provided informed consent under local ethical approval, so no additional ethical approval was required for this study.

IVs selection

To assess the possible causal relationship between circulating metabolites and COPD, we performed a two-sample MR analysis24, with blood metabolites as the exposure and COPD as the outcome. For instrumental variables to be valid, three basic assumptions must be met25, as shown in Fig. 1. First, the genetic variants used as instrumental variables must be strongly associated with blood metabolites. Second, the genetic variants must be independent of any potential confounding variables. Third, the genetic variants must affect COPD only through blood metabolites and not through other pathways. To satisfy these three assumptions, and following earlier work26, we set the significance threshold for the SNP-metabolite association at P < 1 × 10⁻⁵. Furthermore, to guarantee the independence of SNPs and remove linkage disequilibrium (LD)27, we set the LD threshold at r² < 0.001 within a distance of 10,000 kb. We excluded palindromic SNPs and, using the PhenoScanner website (http://www.phenoscanner.medschl.cam.ac.uk/), SNPs significantly associated with known risk factors28. In the present study, we removed SNPs associated with smoking. This step aims to eliminate SNPs that may bias the MR estimates and to ensure more stable and reliable results.
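The filtering steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study (which was run in R): the `Snp` record, the function names, and the rsIDs are hypothetical, and LD clumping is deliberately left out because it requires a reference panel.

```python
from dataclasses import dataclass

# Complementary bases: a SNP whose two alleles are complements of each other
# (A/T or C/G) looks identical on both DNA strands, i.e. it is palindromic.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


@dataclass(frozen=True)
class Snp:
    rsid: str              # hypothetical identifier, for illustration only
    effect_allele: str
    other_allele: str
    pval_exposure: float   # P-value of the SNP-metabolite association


def is_palindromic(snp: Snp) -> bool:
    """True for A/T and C/G SNPs, whose strand cannot be inferred from alleles."""
    return COMPLEMENT.get(snp.effect_allele.upper()) == snp.other_allele.upper()


def select_instruments(snps, p_threshold=1e-5, excluded_rsids=frozenset()):
    """Keep SNPs that pass the P-value threshold, are not palindromic,
    and are not on the exclusion list (e.g. smoking-associated SNPs)."""
    return [
        s for s in snps
        if s.pval_exposure < p_threshold
        and not is_palindromic(s)
        and s.rsid not in excluded_rsids
    ]


# Made-up candidates; none of these values come from the study.
candidates = [
    Snp("rs_a", "A", "G", 3e-7),   # kept
    Snp("rs_b", "A", "T", 1e-8),   # palindromic, dropped
    Snp("rs_c", "C", "T", 2e-4),   # fails the P-value threshold, dropped
    Snp("rs_d", "G", "A", 5e-6),   # on the exclusion list, dropped
]
instruments = select_instruments(candidates, excluded_rsids={"rs_d"})
```

In practice the exclusion list would come from a phenotype lookup such as PhenoScanner, and the surviving SNPs would still need to be clumped for LD before use.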

Fig. 1

Workflow of the MR analysis.

Statistical analysis

To ensure reliable and accurate results, we used several robust statistical methods and performed sensitivity analyses to evaluate the possible influence of different sources of bias. The MR inverse variance weighting (MR-IVW) method was used as the primary analytical method29. We then applied the Bonferroni correction (α = 0.05; PB = 0.05/486 ≈ 1e-4) to control the type I error rate in multiple hypothesis testing and to identify metabolites with a significant causal association with COPD. Metabolites with P-values below the Bonferroni-corrected threshold of 1e-4 were considered statistically significant. However, because few results met this threshold, we also considered the range 1e-4 < P < 0.05, following earlier work30. Fifteen metabolites with P-values between 1e-4 and 0.05 were categorized as potentially statistically significant. To evaluate the robustness of the findings, we conducted supplementary sensitivity analyses using the weighted median approach31 and the MR-Egger method32. The weighted median approach yields valid estimates provided that more than 50% of the weight comes from valid instrumental variables (IVs).
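As a rough illustration of the estimators named above, the sketch below computes the Bonferroni threshold, a fixed-effect IVW estimate, and a weighted median estimate from per-SNP summary statistics. It is an assumption-laden sketch rather than the study code: the function names are hypothetical, the IVW variant is the simple fixed-effect form (the text does not say which variant was used), the weighted median uses first-order inverse-variance weights, and the input numbers are invented.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect IVW: regression of beta_y on beta_x through the origin,
    weighted by 1 / se_y**2. Returns (estimate, standard error)."""
    num = sum(bx * by / se ** 2 for bx, by, se in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / se ** 2 for bx, se in zip(beta_x, se_y))
    return num / den, math.sqrt(1.0 / den)


def weighted_median_estimate(beta_x, beta_y, se_y):
    """Weighted median of the per-SNP ratio estimates beta_y / beta_x,
    using first-order inverse-variance weights beta_x**2 / se_y**2."""
    pairs = sorted(
        (by / bx, bx ** 2 / se ** 2)
        for bx, by, se in zip(beta_x, beta_y, se_y)
    )
    total = sum(w for _, w in pairs)
    # Cumulative weight up to the midpoint of each SNP's own weight,
    # expressed as a fraction of the total weight.
    running, position = 0.0, []
    for _, w in pairs:
        running += w
        position.append((running - 0.5 * w) / total)
    if position[0] >= 0.5:   # a single SNP carries all the weight
        return pairs[0][0]
    below = max(i for i, p in enumerate(position) if p < 0.5)
    lo, hi = pairs[below][0], pairs[below + 1][0]
    frac = (0.5 - position[below]) / (position[below + 1] - position[below])
    return lo + (hi - lo) * frac


# Invented summary statistics: three equally weighted SNPs, one outlying ratio.
beta_x = [0.1, 0.1, 0.1]      # SNP-metabolite effects
beta_y = [0.01, 0.05, 0.2]    # SNP-COPD effects (log-odds scale)
se_y = [0.01, 0.01, 0.01]

threshold = bonferroni_threshold(0.05, 486)
ivw_beta, ivw_se = ivw_estimate(beta_x, beta_y, se_y)
wm_beta = weighted_median_estimate(beta_x, beta_y, se_y)
```

With these toy inputs the IVW estimate is pulled toward the outlying SNP while the weighted median stays at the middle ratio, which is the reason the latter is useful as a sensitivity analysis.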

After Bonferroni correction, we used the MR-Egger intercept to assess horizontal pleiotropy in the selected IVs32, with P > 0.05 indicating no evidence of horizontal pleiotropy. Heterogeneity among the SNP-specific effect estimates was assessed using Cochran’s Q test, with P > 0.05 indicating no significant heterogeneity33. Furthermore, we conducted a leave-one-out sensitivity analysis, removing one SNP at a time, to investigate whether individual SNPs had a disproportionate impact on the overall estimate. In addition, to guard against bias from weak IVs, we computed the F-statistic for each SNP (F = beta^2/se^2)34,35. All analyses were performed in R (version 4.3.2).
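The sensitivity statistics in this paragraph can be written down compactly. The following is a minimal sketch under stated assumptions, not the R code used in the study: all function names are hypothetical, the IVW estimate is the fixed-effect form, only point values are returned (P-values would additionally need chi-square and t distributions), and inputs are plain Python lists.

```python
def f_statistic(beta_x: float, se_x: float) -> float:
    """Per-SNP instrument strength, F = beta^2 / se^2 (exposure association)."""
    return beta_x ** 2 / se_x ** 2


def ivw(beta_x, beta_y, se_y):
    """Fixed-effect IVW slope (regression through the origin, weights 1/se_y^2)."""
    num = sum(bx * by / se ** 2 for bx, by, se in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / se ** 2 for bx, se in zip(beta_x, se_y))
    return num / den


def cochran_q(beta_x, beta_y, se_y):
    """Cochran's Q around the IVW estimate, with its degrees of freedom."""
    theta = ivw(beta_x, beta_y, se_y)
    q = sum((by - theta * bx) ** 2 / se ** 2
            for bx, by, se in zip(beta_x, beta_y, se_y))
    return q, len(beta_x) - 1


def egger_intercept_slope(beta_x, beta_y, se_y):
    """Weighted least squares of beta_y on beta_x with an intercept.
    SNPs are first oriented so that every beta_x is non-negative."""
    xs = [abs(bx) for bx in beta_x]
    ys = [by if bx >= 0 else -by for bx, by in zip(beta_x, beta_y)]
    ws = [1.0 / se ** 2 for se in se_y]
    w_sum = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(ws, ys)) / w_sum
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def leave_one_out(beta_x, beta_y, se_y):
    """IVW estimate recomputed with each SNP removed in turn."""
    return [
        ivw(beta_x[:i] + beta_x[i + 1:], beta_y[:i] + beta_y[i + 1:],
            se_y[:i] + se_y[i + 1:])
        for i in range(len(beta_x))
    ]


# Invented data with a built-in pleiotropic offset: beta_y = 0.02 + 0.5 * beta_x.
bx = [0.1, 0.2, 0.3]
by = [0.07, 0.12, 0.17]
se = [0.01, 0.01, 0.01]

intercept, slope = egger_intercept_slope(bx, by, se)
q_stat, q_df = cochran_q(bx, by, se)
loo = leave_one_out(bx, by, se)
```

A non-zero Egger intercept, as in this toy data set, is the signal of directional pleiotropy that the MR-Egger test looks for; leave-one-out estimates that move substantially when a SNP is dropped would flag an influential instrument.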

Results

After rigorous quality control of the instrumental variables, 486 blood metabolites were retained for MR analysis. LD clumping and removal of palindromic SNPs were performed to improve the accuracy and reliability of the selected IVs. A total of 491 MR analyses were performed for the 486 blood metabolites, because five metabolites were present in two different forms. We set the threshold for the F-statistic at 10; the F-statistic of every selected SNP exceeded 10, suggesting that our IVs were sufficiently strong. Results for the 486 metabolites are shown in Table S2. After Bonferroni correction (P < 1e-4), none of the metabolites was statistically significant. When the significance threshold was adjusted to 1e-4 < P < 0.05, the IVW analysis identified 15 circulating metabolites that may be causally related to COPD risk, comprising 10 known and 5 unknown compounds. The relationship between these 15 metabolites and COPD is shown in the forest plot (Fig. 2). The metabolites were lactate (OR = 0.224, 95% CI: 0.083–0.600, P = 0.002), fructose (OR = 5.374, 95% CI: 1.540–18.755, P = 0.008), margarate (17:0) (OR = 2.837, 95% CI: 1.049–7.870, P = 0.039), 5-oxoproline (OR = 0.320, 95% CI: 0.152–0.675, P = 0.002), guanosine (OR = 1.347, 95% CI: 1.030–1.761, P = 0.029), paraxanthine (OR = 0.718, 95% CI: 0.551–0.935, P = 0.013), phenyllactate (PLA) (OR = 0.534, 95% CI: 0.296–0.962, P = 0.036), N-acetylglycine (OR = 0.725, 95% CI: 0.547–0.962, P = 0.025), X-10810 (OR = 0.611, 95% CI: 0.386–0.965, P = 0.034), X-02269 (OR = 1.674, 95% CI: 1.062–2.638, P = 0.026), X-09789 (OR = 1.478, 95% CI: 1.022–2.138, P = 0.037), X-11552 (OR = 0.664, 95% CI: 0.443–0.995, P = 0.047), 2-stearoylglycerophosphocholine (OR = 1.749, 95% CI: 1.023–2.990, P = 0.040), hexadecanedioate (OR = 1.340, 95% CI: 1.067–1.683, P = 0.011), and X-14977 (OR = 0.460, 95% CI: 0.262–0.808, P = 0.006). Sensitivity analyses for these metabolites, including Cochran’s Q test and the MR-Egger intercept test, are reported in Table S3.
Cochran’s Q test showed no significant heterogeneity among the IVs for the metabolites studied, and MR-PRESSO detected no outliers. Horizontal pleiotropy (MR-Egger P < 0.05) was found for the unknown compound X-02269, although Cochran’s Q test showed no heterogeneity for this metabolite. This suggests that some confounding factors may still exist for this estimate. Moreover, the leave-one-out analysis, depicted in Figure S1, supports the robustness and consistency of the MR analysis: removing any single SNP did not substantially change the overall results. The slope of the fitted line in each scatterplot represents the magnitude of the estimated causal effect, as shown in Figs. 3 and 4.
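For readers who want to move between the reported odds ratios and the underlying log-odds estimates, the conversion can be sketched as follows. This assumes, and the text does not state it, that the 95% confidence intervals are Wald intervals built on the log-odds scale with a normal quantile of 1.96; the function names are hypothetical, and the lactate figures are the rounded values quoted above.

```python
import math


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96):
    """Convert a log-odds estimate and its standard error
    into (OR, lower 95% limit, upper 95% limit)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def log_scale_from_or(odds_ratio: float, lower: float, upper: float,
                      z: float = 1.96):
    """Approximate the log-odds estimate and standard error
    from a reported OR and its 95% confidence interval."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Reported IVW result for lactate (rounded to three decimals in the text).
beta_lactate, se_lactate = log_scale_from_or(0.224, 0.083, 0.600)

# Round trip back to the odds-ratio scale; small differences from the
# reported limits are expected because the inputs are rounded.
or_back, lower_back, upper_back = odds_ratio_with_ci(beta_lactate, se_lactate)
```

An odds ratio below 1 corresponds to a negative log-odds estimate, so a higher genetically predicted metabolite level goes with lower odds of COPD, and the reverse holds for odds ratios above 1.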

Fig. 2

Forest plot of MR-analyzed blood metabolites on COPD causality. CI, confidence interval; OR, odds ratio; SNPs, single nucleotide polymorphisms.

Fig. 3

Scatterplot of MR analysis of metabolites negatively associated with COPD risk. The slope of each fitted line indicates the magnitude of the estimated causal effect. The x-axes represent genetic instrument-metabolite associations, and the y-axes represent genetic instrument-COPD associations. Black dots denote the genetic instruments included in the MR analysis.

Fig. 4

Scatterplot of MR analysis of metabolites positively associated with COPD risk. The slope of each fitted line indicates the magnitude of the estimated causal effect. The x-axes represent genetic instrument-metabolite associations, and the y-axes represent genetic instrument-COPD associations. Black dots denote the genetic instruments included in the MR analysis.

Discussion

Our study aimed to investigate the possible causal effects of 486 blood metabolites on the development of COPD using genetic proxies. By integrating two large GWAS datasets in a rigorous MR design, we identified genetically predicted metabolites linked to a decreased risk of COPD: lactate, 5-oxoproline, paraxanthine, PLA, N-acetylglycine, X-10810, X-11552, and X-14977. Higher levels of these metabolites were associated with lower odds of developing COPD. Conversely, individuals genetically predisposed to higher levels of fructose, margarate (17:0), guanosine, X-02269, X-09789, 2-stearoylglycerophosphocholine, and hexadecanedioate had a higher likelihood of developing COPD.

Early intervention is crucial in preventing and treating COPD, a chronic condition that has a lengthy progression. However early vague clinical symptoms make the diagnosis of COPD difficult. Hence, investigating the metabolites linked to the progression of COPD will not just aid in the initial detection and prevention of COPD, but also enhance comprehension of the biological mechanisms involved in treating COPD. Although there are associations between genetic, smoking and environmental factors and different clinical manifestations of COPD, the exact pathogenesis remains unclear. Timely detection is crucial in order to avoid permanent harm to the organs, highlighting the importance of dependable and pertinent biomarkers for enhancing the treatment of COPD. The use of metabolomics technology has generated curiosity in investigating the possible importance of metabolites in COPD. Metabolomics analysis is the systematic examination of small-sized biochemicals in biological samples, encompassing sugars, amino acids (AA), organic acids, nucleotides, and lipids36. According to the ECLIPSE cohort37, a recent study using plasma metabolic profiling (PMP) revealed the correlation between cachexia and emphysema with various amino acids. Another study examined the correlation between 34 specific amino acids and dipeptides among various subcategories of individuals with COPD (including emphysema, airway disease, and cachexia)21. Bowler and his team discovered a connection between sphingolipids and ceramides with airflow blockage and emphysema in the group of patients from the COPDGene study. They found that five sphingolipids were linked to emphysema, while seven ceramides were associated with worsening of COPD symptoms38. This implies that plasma metabolomics may have a role in the clinical treatment of COPD, potentially aiding in outcome prediction, serving as an intervention tool, and indicating treatment response. 
Therefore, we carried out a significant MR investigation with the objective of clarifying the causal connection between blood metabolites and COPD, as well as the related metabolic routes. This study was designed to provide guidance for screening and management of COPD.

Our results suggest that elevated levels of fructose, margarate (17:0), guanosine, 2-stearoylglycerophosphocholine, and hexadecanedioate are associated with increased susceptibility to COPD. COPD is often accompanied by insulin resistance or abnormal glucose metabolism, which has also been confirmed by metabolomics techniques. COPD patients have been found to have higher plasma levels of certain glucose metabolites, suggesting possible alterations in energy metabolism39. Prolonged fructose intake has been reported to cause deterioration and remodeling of lung tissue and to impair respiratory function. Long-term fructose exposure can lead to harmful effects such as airway hyperresponsiveness, chronic bronchitis, alveolar remodeling, and emphysema. Fructose ingestion increases the concentrations of cytokines such as IL-10, IL-6, IL-1β, and TNFα, inducing chronic inflammatory responses that spread to the lung parenchyma and lead to the accumulation of monocytes in lung tissue40. Neopurine (one of the metabolites of guanosine) may dose-dependently increase oxidative stress in endothelial cells under hypoxic conditions, leading to endothelial dysfunction41. Several studies have found high levels of lipids such as sphingomyelin and eicosadienoic acid in the alveoli and blood of COPD patients, suggesting that lipid metabolism may play an important role in chronic inflammation and airway remodeling in COPD42. Margarate is a long-chain saturated fatty acid and an important component of cell membrane phospholipids. Excessive intake of saturated fatty acids (e.g., heptadecanoic acid) may exacerbate COPD by activating inflammatory signaling pathways and promoting the release of inflammatory factors. In a high-fat diet mouse model, saturated fatty acids increased the number of lung macrophages and enhanced high-fat diet-induced neutrophilic airway inflammation43.
2-Stearoylglycerophosphocholine is a phospholipid, a class of molecules that are important components of biological membranes, especially lipid bilayers. Hexadecanedioate, a dicarboxylic fatty acid derivative, may also play a role in lipid metabolism. Our findings that 2-stearoylglycerophosphocholine and hexadecanedioate may exacerbate chronic inflammation and airway remodeling in the lungs through their involvement in lipid metabolism are consistent with previous studies. Moreover, lipids have a significant impact on lung function. Recent studies have demonstrated a close link between lipids and inflammation. Both exogenous and endogenous reactive oxygen species, produced by inflammation and mitochondrial dysfunction, may oxidize biomolecules such as lipids, resulting in epithelial cell damage and death, which is a crucial factor in the development of COPD44. Apoptosis of endothelial cells is a crucial event in the progression of emphysema. Research indicates that these mechanisms could be linked to increased ceramide concentrations in the respiratory system of individuals with COPD, which are regarded as an indicator of the condition. Ceramide may be produced through multiple routes, which can involve acid sphingomyelinase activity and palmitate synthesis upon exposure to cigarette smoke45. This suggests that this class of lipids could potentially serve as a novel therapeutic target. Nevertheless, additional investigation is required to clarify its specific function.

Besides the unidentified blood metabolites, this MR investigation identified five blood metabolites (lactate, 5-oxoproline, paraxanthine, PLA, and N-acetylglycine) that may be protective against COPD. The specific mechanisms of their protective effects need further in-depth study. Lactate is an important product of glycolysis and plays a key role in energy metabolism, inflammation regulation, and signaling. Lactic acid regulates the function of immune cells and influences the inflammatory response. For example, lactic acid can inhibit macrophage and T-cell activity, thereby reducing inflammation. An experimental study found that lactic acid produced by alveolar type II cells shifted alveolar macrophages toward an anti-inflammatory phenotype and inhibited excessive inflammation in mice with lung injury46. 5-Oxoproline is an important intermediate in glutathione metabolism and plays a key role in oxidative stress and inflammation regulation. Some animal studies have found that glutathione supplementation improves oxidative stress and reduces lung damage47. As an amino acid, proline plays a crucial part in controlling the synthesis and metabolism of arginine, as well as the immune response of the body. The nitric oxide synthase (NOS) family produces nitric oxide (NO) using L-arginine as a substrate. NO produced in the lungs and airways can play a variety of roles in lung development, regulating airway and vascular smooth muscle tone and participating in inflammatory processes and host defense. It has been shown that exacerbation of COPD can be inhibited by inhibiting arginase or by supplementing arginine to increase substrate availability for NOS48. In addition, arginine is involved in wound healing, asthma control, and certain cardiovascular diseases49. Amino acids are present in large quantities in certain fish and marine oils. 
An analysis of data from the National Health and Nutrition Examination Survey (NHANES) showed that adults with high fish consumption had a higher forced expiratory volume in 1 second (FEV1) than those with low fish consumption50. A metabolomics study identified 23 differentially expressed serum metabolite biomarkers between healthy smokers and smokers with COPD, revealing biomarkers of early COPD that are primarily associated with inflammatory responses and caffeine metabolism51. Paraxanthine is a metabolite of caffeine and belongs to the methylxanthine class of compounds. It is produced through caffeine metabolism in the body and has antioxidant and anti-inflammatory properties52. PLA is an intermediate of phenylalanine metabolism and plays an important role in amino acid metabolism and inflammation regulation. In an infected mouse model, subinhibitory concentrations of PLA were effective in clearing pathogenic bacteria from the lungs, suggesting that PLA may modulate the inflammatory response by influencing immune cell function53.

However, our findings are not entirely consistent with those of previous studies, possibly because the small sample sizes of earlier studies limit the broad applicability of their findings. The clinical presentation of COPD varies widely, and different subtypes may affect changes in metabolic markers; certain metabolic markers may therefore be more specific to particular patient groups. In a metabolomic analysis of current and former smokers in a white cohort, lower serum amino acid concentrations were associated with a higher incidence of respiratory deterioration. In particular, there was a consistent and substantial association between reduced tryptophan concentrations and worse lung function, extent of emphysema on chest CT, and work capacity54. In addition, a study in an Asian population found that elevated serum uric acid levels measured at two time points were consistently and significantly associated with an accelerated decline in FEV1 and in percent predicted FEV1 in nonsmoking individuals55.

This MR analysis has several significant advantages. Our study uses MR to assess causality and relies on publicly available summary data, thereby eliminating the need for individual-level datasets and circumventing limitations of traditional observational studies, such as insufficient sample size and potential confounders. In addition, we conducted multiple sensitivity analyses to check for horizontal pleiotropy and heterogeneity, found no evidence of violation of the instrumental variable independence assumption, and ruled out potential outliers using MR-PRESSO, which further supports the robustness and consistency of our findings.

Nevertheless, our study has limitations. First, while the F-statistics of the chosen SNPs indicated adequate instrument strength, only a restricted number of SNPs were available for a thorough genome-wide analysis of each exposure. Expanding the sample size in future studies could improve the reliability of the results and the accuracy of the causal effect estimates. In addition, the GWAS data were drawn from European cohorts rather than the entire population, which may limit the generalizability of the results to other ethnic groups. We suggest that future multi-ethnic MR analyses be conducted to validate our findings. Second, given the limitations of the database information, we were unable to assess the potential impact of disease-specific classifications, such as severity or acute exacerbations of COPD, on study outcomes. In addition, the impact of clinical interventions could not be determined by MR analysis. While MR analysis provides valuable insights into etiology, our findings should be validated by targeted metabolomics in a COPD cohort before being applied in the clinic.

Conclusion

This MR study used genetic proxies to investigate the causal relationship between blood metabolites and COPD. The research identified five distinct blood metabolites that could potentially be linked to the progression of COPD. These findings offer valuable perspectives on possible approaches to the early detection, prevention, and management of COPD. By combining genomics and metabolomics, this analysis provides a reference point for studying the causes and development of COPD.