Introduction
Chronic obstructive pulmonary disease (COPD) is a chronic condition characterized by persistent respiratory symptoms and airflow limitation that is not fully reversible, often accompanied by an enhanced inflammatory response in the airways.1 Globally, COPD affects over 250 million individuals, with a prevalence of approximately 12% among adults aged 40 and above.2,3 According to the World Health Organization (WHO), COPD is the third leading cause of death worldwide and poses a significant threat to public health.4 The economic burden of the disease is substantial, placing immense pressure on healthcare systems and society. In the United States, data from the American Lung Association indicate that annual COPD-related medical costs for adults aged 45 and above amount to $24 billion, with an average annual cost of $4322 per patient.5 In India, economic losses exceed 1 trillion rupees ($13.4 billion) annually, with 45–70% of total costs attributed to the management of acute exacerbations. In Europe, direct annual costs per patient range from €1963 to €10,701.6 Furthermore, COPD is associated with progressive decline in lung function and an increased risk of comorbidities such as cardiovascular disease, lung cancer, and depression, substantially impairing patients’ quality of life.7 Although major risk factors—such as cigarette smoking, occupational exposure, and air pollution—have been identified, prevention of COPD remains challenging.8–10 Currently, there is no cure for COPD. Its underlying pathogenesis is not fully understood, and the disease is often asymptomatic in its early stages, hindering timely prevention and early intervention. Therefore, elucidating the potential pathogenic mechanisms and identifying risk factors influencing the development and progression of COPD are critical priorities in ongoing research.
In recent years, increasing attention has been paid to the potential link between lipid metabolism and COPD. COPD is closely associated with metabolic syndrome and cardiovascular diseases, and lipid biomarkers such as low-density lipoprotein cholesterol (LDL-C), triglycerides (TG), total cholesterol (TC), and high-density lipoprotein cholesterol (HDL-C) are crucial indicators for cardiovascular risk assessment. However, numerous studies have reported inconsistent and complex associations between these lipid markers and COPD. A meta-analysis comprising 11 studies found no significant differences in LDL-C, TG, TC, and HDL-C levels between COPD patients and healthy controls overall. Nonetheless, in a subgroup not receiving lipid-lowering therapy, TG levels were significantly higher in COPD patients (MD = 16.35, 95% CI 5.90, 26.80).11 In contrast, another study reported significantly higher LDL-C levels in the healthy group compared to COPD patients (P = 0.001), possibly due to chronic inflammation in the latter.12 Additionally, some studies have noted that plasma TC and TG levels tend to increase with the progression of COPD, although these differences were not statistically significant.13 Other research has suggested that lipid alterations primarily occur in the late stages of COPD, with increased HDL-C levels and reduced TC and LDL-C levels observed in severe cases.14 Traditional observational studies are limited by inherent confounding factors and the possibility of reverse causality, where COPD or its treatment may influence lipid levels. Therefore, more robust methodological approaches are urgently needed to elucidate the true causal relationship between lipid metabolism and COPD.
Randomized controlled trials (RCTs) are widely regarded as the gold standard for establishing causal relationships. However, they are not always feasible due to ethical concerns and the substantial human and financial resources they require. In recent years, the advancement of genetic technologies and the increasing availability of genome-wide association study (GWAS) summary data have ushered in the era of Mendelian randomization (MR) analysis. MR leverages genetic variants as instrumental variables (IVs) to proxy modifiable exposures. Since alleles are randomly allocated from parents to offspring at conception, this process mimics the randomization in RCTs and effectively minimizes confounding and reverse causality. Moreover, as a secondary analysis of publicly available GWAS data, MR is cost-effective, efficient, and relatively easy to implement. Recently, a Mendelian randomization study by Huang et al investigated the causal association between lipid levels and COPD but was limited to individuals of European (EUR) ancestry,15 thus constraining the generalizability of the findings. It is important to note that the prevalence of COPD varies by race and geographic region. For instance, in the United States, the highest prevalence is observed among non-Hispanic White individuals, at approximately 5.6%.16 In the United Kingdom, COPD prevalence is lower among Black and Asian populations compared to White individuals, possibly due to differences in smoking behaviors or genetic susceptibility.17 Globally, the highest prevalence is reported in the Americas (14.53%), while the lowest is in Southeast Asia and the Western Pacific region (8.80%).16 Therefore, to address these limitations, the present study conducted a new MR analysis using data from populations of East Asian (EAS), African (AFR), and Hispanic or Latin American (HIS) ancestries.
Methods
Study Design
The study design is illustrated in detail in Figure 1. Summary-level GWAS data for three ancestral groups—EAS, AFR, and HIS—were obtained from publicly available large-scale databases. Blood lipid traits, including HDL-C, LDL-C, TG, and TC, were used as exposures, and COPD was defined as the outcome. A two-sample MR analysis was conducted. The outcome GWAS included both a discovery dataset and a replication dataset. Causal estimates from these two datasets were then meta-analyzed to determine overall causal evidence. Multiple sensitivity analyses were performed to assess the robustness of the results and validate the final causal inference. Genetic variants were selected as IVs based on standard quality control procedures and were required to meet the following three core assumptions of MR: (1) the genetic variants must be strongly associated with the exposure traits; (2) the genetic variants must be independent of confounding factors; and (3) the genetic variants must influence the outcome exclusively through the exposure, not through alternative pathways.18
Figure 1 Study design. This figure illustrates the overall study framework. (A) compares the current study with previous Mendelian randomization (MR) studies. (B) outlines the GWAS data sources used for exposures, and (C) presents the sources of the two outcome datasets. (D) summarizes the analytical workflow, including MR methods and sensitivity analyses.
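As an illustration of how causal estimates from a discovery and a replication dataset can be combined, the following minimal Python sketch pools two log-odds-ratio estimates with inverse-variance weights. It assumes a fixed-effect model, which the study design does not specify, and the input values are hypothetical, not results from this study. The analyses themselves were run in R.

```python
import math

def fixed_effect_meta(estimates):
    """Pool (log_or, se) pairs using inverse-variance weights (fixed-effect model)."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

def two_sided_p(z):
    """Two-sided P value from a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical log-OR estimates and standard errors (not taken from the study)
discovery = (-0.20, 0.06)
replication = (-0.15, 0.05)

log_or, se = fixed_effect_meta([discovery, replication])
lower, upper = log_or - 1.96 * se, log_or + 1.96 * se
print(f"pooled OR = {math.exp(log_or):.3f} "
      f"(95% CI {math.exp(lower):.3f}-{math.exp(upper):.3f}), "
      f"P = {two_sided_p(log_or / se):.2e}")
```

The pooled estimate always lies between the two inputs and has a smaller standard error than either, which is why agreement across datasets strengthens the overall evidence.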
This study adhered strictly to the STROBE-MR guidelines.19 As the analysis was based entirely on publicly available GWAS summary data, with all original studies having received the necessary ethical approvals, no additional ethical approval, clinical trial registration, or informed consent was required for this secondary analysis. According to Items 1 and 2 of Article 32 of the Measures for Ethical Review of Life Science and Medical Research Involving Human Subjects (effective February 18, 2023, China), research projects that meet either of the following conditions may be exempt from ethical review: (1) the research uses legally obtained public data or data generated through observation without interfering with public behavior; or (2) the research uses anonymized information or data. As this study complies with both conditions, ethical review was not required.
Instrumental Variable Selection
Initially, this study selected genetic variants at the genome-wide significance threshold (P < 5×10−8) and applied stringent linkage disequilibrium clumping (r² < 0.001), using reference data from Phase 3 of the 1000 Genomes Project. Subsequently, the identified single nucleotide polymorphisms (SNPs) were matched with the outcome GWAS data. If certain SNPs were unmatched due to sequencing depth limitations in the outcome data, proxy SNPs in high linkage disequilibrium (r² > 0.8) were automatically substituted using the “TwoSampleMR” package. Allele harmonization was then conducted, excluding palindromic SNPs with intermediate allele frequencies and SNPs with mismatched alleles (eg, T/A vs T/G). Additionally, a Steiger test was implemented by comparing R² values, representing the proportion of variance explained, to further exclude SNPs that demonstrated stronger associations with the outcome than with the exposure, thus satisfying the third assumption of IVs. Furthermore, F-statistics were calculated for each SNP to exclude weak genetic variants (F < 10), ensuring the robustness of the results; formulas for calculating R² and F-statistics have been extensively detailed in previous studies.20,21 Finally, sensitivity analysis was performed by excluding SNPs identified as outliers to mitigate potential horizontal pleiotropy bias.
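The instrument-strength and Steiger filters described above can be sketched as follows. The text defers the exact formulas to earlier publications, so the approximations here, R² = 2·EAF·(1−EAF)·β² for a standardized exposure and F = R²(N−2)/(1−R²), are assumptions based on commonly used forms. The function names and SNP values are hypothetical, and the outcome R² is passed in directly rather than derived on the liability scale.

```python
def variance_explained(beta, eaf):
    """Approximate R^2 for one SNP (assumes beta is in SD units of the exposure)."""
    return 2.0 * eaf * (1.0 - eaf) * beta ** 2

def f_statistic(r2, n):
    """Per-SNP F-statistic from R^2 and the exposure GWAS sample size."""
    return r2 * (n - 2) / (1.0 - r2)

def keep_instrument(beta_exp, eaf, n_exp, r2_outcome, f_min=10.0):
    """Keep a SNP only if it is strong (F >= f_min) and passes the Steiger comparison."""
    r2_exp = variance_explained(beta_exp, eaf)
    strong = f_statistic(r2_exp, n_exp) >= f_min
    steiger_ok = r2_exp > r2_outcome  # explains more variance in exposure than outcome
    return strong and steiger_ok

N_EAS_LIPIDS = 146_492  # EAS exposure sample size reported in Data Sources

# Hypothetical SNPs: (beta on exposure, effect allele frequency, R^2 in outcome)
print(keep_instrument(0.05, 0.30, N_EAS_LIPIDS, 1e-4))  # strong, correct direction
print(keep_instrument(0.01, 0.30, N_EAS_LIPIDS, 1e-6))  # weak instrument
print(keep_instrument(0.05, 0.30, N_EAS_LIPIDS, 1e-2))  # fails Steiger comparison
```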
Data Sources
In this study, GWAS summary statistics for four lipid phenotypes were acquired from the Global Lipids Genetics Consortium (GLGC), encompassing EAS (n=146,492), AFR (n=99,432), and HIS (n=48,057) ancestry populations for exposure analyses.22 For the outcome phenotype, COPD, discovery datasets were obtained from the Global Biobank Meta-analysis Initiative (GBMI), comprising EAS (n=329,733), AFR (n=29,682), and HIS (n=15,086) populations.23 Replication datasets were retrieved from the Million Veteran Program (MVP) for AFR (n=112,492) and HIS (n=56,391) populations,24 as well as from BioBank Japan (BBJ) for the EAS population (n=166,670).25 We also conducted a validation analysis using data from individuals of EUR ancestry: lipid phenotypes were obtained from the GLGC (n=1,320,016), and COPD datasets were sourced from the GBMI (n=995,917) for discovery, with replication from the MVP (n=418,504). Detailed data sources, including accession IDs, are provided in Table 1. Because the datasets are large-scale GWAS meta-analyses incorporating multiple cohorts, detailed cohort information and diagnostic criteria are available in the original publications.
Table 1 Detailed Information of Data Sources
Statistical Analysis
Inverse variance weighted (IVW) regression was used as the primary analytical approach.26 Supplemental analyses included several complementary MR methods: MR-Egger, weighted median, penalized weighted median, contamination mixture (ConMix),27 robust adjusted profile score (RAPS),28 debiased inverse-variance weighted (dIVW),29 constrained maximum likelihood (cML),30 and Bayesian weighted Mendelian randomization (BWMR).31 Detailed descriptions of each method are provided in Table S1.
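For readers unfamiliar with the primary estimator, a minimal Python sketch of fixed-effect IVW regression from summary statistics is given below. It is equivalent to an inverse-variance weighted average of the per-SNP Wald ratios (β_outcome / β_exposure). The SNP effects are hypothetical, and the sketch is not the TwoSampleMR implementation used in the study.

```python
import math

def ivw(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate: weighted regression of outcome on exposure
    effects through the origin, with weights 1 / se_out^2."""
    num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_exp, beta_out, se_out))
    den = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_exp, se_out))
    return num / den, math.sqrt(1.0 / den)

# Hypothetical per-SNP effects on a lipid trait and on COPD (log-odds scale)
beta_exp = [0.10, 0.08, 0.12]
beta_out = [-0.020, -0.012, -0.030]
se_out = [0.010, 0.010, 0.012]

estimate, se = ivw(beta_exp, beta_out, se_out)
print(f"IVW log-OR = {estimate:.3f} (SE {se:.3f}), OR = {math.exp(estimate):.3f}")
```

A negative log-OR (OR < 1) corresponds to the inverse associations reported in this study, in which higher genetically predicted lipid levels accompany lower COPD risk.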
Sensitivity analyses included correction for potential false-positive findings resulting from sample overlap using MRLap.32 Cochran’s Q test was applied to detect heterogeneity,33 and the I² statistic was computed to select an appropriate IVW effect model. A leave-one-out analysis was conducted to evaluate whether individual SNPs disproportionately influenced results. MR-Egger regression34 and MR-PRESSO35 were used to assess horizontal pleiotropy and identify outlier SNPs. Notably, when MR-PRESSO failed to detect outliers even after increasing the NbDistribution threshold to 10,000, suggesting unavoidable horizontal pleiotropy, RadialMR analysis was implemented as a supplementary measure to comprehensively remove potential outliers.36 Lastly, Causal Analysis Using Summary Effect Estimates (CAUSE) was performed for further validation.37
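Two of the sensitivity checks named above, Cochran's Q with the I² statistic and leave-one-out analysis, can be sketched in a few lines of Python. The sketch assumes first-order weights (1 / SE² of the SNP-outcome effect) and uses hypothetical data in which the fourth SNP is a deliberate outlier. It does not reproduce MR-PRESSO, RadialMR, or MRlap.

```python
def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW slope through the origin."""
    num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_exp, beta_out, se_out))
    den = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_exp, se_out))
    return num / den

def cochran_q(beta_exp, beta_out, se_out):
    """Cochran's Q around the IVW estimate, its degrees of freedom, and I^2."""
    est = ivw_estimate(beta_exp, beta_out, se_out)
    q = sum(((by - est * bx) / s) ** 2
            for bx, by, s in zip(beta_exp, beta_out, se_out))
    df = len(beta_exp) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i2

def leave_one_out(beta_exp, beta_out, se_out):
    """IVW estimate recomputed with each SNP dropped in turn."""
    results = []
    for i in range(len(beta_exp)):
        keep = [j for j in range(len(beta_exp)) if j != i]
        results.append(ivw_estimate([beta_exp[j] for j in keep],
                                    [beta_out[j] for j in keep],
                                    [se_out[j] for j in keep]))
    return results

# Hypothetical data: three concordant SNPs (ratio 0.2) and one outlier
beta_exp = [0.10, 0.08, 0.12, 0.09]
beta_out = [0.020, 0.016, 0.024, -0.050]
se_out = [0.010, 0.010, 0.010, 0.010]

q, df, i2 = cochran_q(beta_exp, beta_out, se_out)
loo = leave_one_out(beta_exp, beta_out, se_out)
print(f"Q = {q:.1f} on {df} df, I2 = {i2:.2f}")
print("leave-one-out estimates:", [round(x, 3) for x in loo])
```

A large shift in the estimate when one SNP is dropped is the pattern that flags a single variant as driving an association.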
A Bonferroni-corrected significance threshold was applied, with P < 0.002 (0.05/24) indicating significant causal evidence, and 0.002 < P < 0.05 interpreted as suggestive evidence of causality. Statistical power was computed using the mRnd online platform.38 All analyses were performed using R software (4.2.3), employing the packages cause, RadialMR, TwoSampleMR, MRPRESSO, MRlap, MRcML, and meta.
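The evidence tiers defined above can be expressed as a short Python sketch. The divisor of 24 is taken from the text. Reading it as 4 lipid traits × 3 ancestries × 2 outcome datasets is an assumption, since the text does not break it down. The exact ratio 0.05/24 is about 0.00208, which the text rounds to 0.002. The two example P values are those reported in the Results for TC in the EAS discovery dataset and TG in the AFR discovery dataset.

```python
ALPHA = 0.05
N_TESTS = 24  # as stated in the text; assumed to be 4 traits x 3 ancestries x 2 datasets
THRESHOLD = ALPHA / N_TESTS  # about 0.00208, reported as 0.002

def classify(p_value):
    """Map a P value to the evidence tier used in this study."""
    if p_value < THRESHOLD:
        return "significant"
    if p_value < ALPHA:
        return "suggestive"
    return "not significant"

print(classify(1.29e-4))  # TC, EAS discovery dataset
print(classify(0.009))    # TG, AFR discovery dataset
```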
Results
Selection of Genetic Variants and Evaluation of MR Assumptions
This study strictly adhered to the three fundamental assumptions of MR, implementing a rigorous genetic variant selection protocol. Initially, one outlier SNP (rs7350481) was excluded using MR-PRESSO, yielding a total of 1213 SNPs for subsequent analyses. However, the MR-PRESSO global test indicated unavoidable horizontal pleiotropy in six analyses; thus, an additional 43 outlier SNPs were removed using RadialMR (Table S2 and Figure S1). Ultimately, 1170 SNPs were retained as IVs, comprising 392 in the EAS population, 619 in the AFR population, and 159 in the HIS population. All included SNPs exhibited strong instrumental strength (F-statistics > 10) and passed the Steiger directionality test, fulfilling the three core MR assumptions and ensuring robustness of the causal inference framework. Comprehensive SNP details are presented in Tables S3–S5, numeric results for all analytical methods are shown in Table S6, statistical power calculations are provided in Table S7, sensitivity analysis results are detailed in Table S8, and MR analyses correcting for overlapping samples (MRLap) are summarized in Table S9.
Causal Analysis of Lipid Traits on COPD Risk in the EAS Population
In the EAS ancestry analysis, a total of 392 SNPs were included: 175 from the discovery dataset and 217 from the replication dataset (Table S3). Figure 2 presents the findings from all two-sample MR analyses. In the discovery dataset, primary analyses revealed inverse causal associations between genetically predicted LDL-C, TG, and TC levels and COPD risk (OR < 1, P < 0.05). Among these, TC met the Bonferroni-corrected significance threshold (P = 1.29×10−4). In the replication dataset, these inverse associations remained consistent (OR < 1, P < 0.05), with LDL-C showing stronger statistical significance (P = 5.59×10−9) (Table S6 and Figure 2). For positive findings, sufficient statistical power (≥75%) was observed across both datasets based on the effect estimates from the primary analysis method (Table S7). However, the subsequent meta-analysis did not support LDL-C as a positive causal finding and provided robust support only for genetically predicted TG (OR = 0.891, 95% CI: 0.841–0.944, P = 9.91×10−5) and TC (OR = 0.827, 95% CI: 0.739–0.925, P = 9.04×10−4) being significantly associated with reduced COPD risk (Table 2).
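The relationship between a reported odds ratio, its 95% confidence interval, and the P value can be checked with a short Python sketch. It assumes a Wald-type interval that is symmetric on the log-odds scale, which is an assumption about how the intervals were derived. Because the reported OR and interval are rounded, the recomputed P values are expected to differ slightly from those in the text.

```python
import math

def p_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Recover the log-OR, its SE, the z statistic, and the two-sided P value
    from an OR and its 95% CI (assumes a symmetric Wald interval on the log scale)."""
    log_or = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return log_or, se, z, p

# Meta-analysis estimates reported above for the EAS population
for trait, odds_ratio, lower, upper in [("TG", 0.891, 0.841, 0.944),
                                        ("TC", 0.827, 0.739, 0.925)]:
    log_or, se, z, p = p_from_or_ci(odds_ratio, lower, upper)
    print(f"{trait}: log-OR = {log_or:.4f}, SE = {se:.4f}, z = {z:.2f}, P = {p:.2e}")
```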
Table 2 Summary of Meta-Analysis Results
Figure 2 Summary of Mendelian randomization analysis results. This figure presents the significance and effect estimates of causal associations. (A) shows the significance of causal evidence in the discovery dataset, while (B) displays the corresponding significance in the replication dataset. (C) summarizes the estimated causal effects across both datasets. *Indicates 0.002 < P < 0.05; **indicates P < 0.002.
Sensitivity analyses detected no significant horizontal pleiotropy or heterogeneity (Table S8). Nonetheless, MRLap analyses indicated that significant associations between LDL-C and COPD, observed in both discovery and replication datasets, represented false positives due to “winner’s curse” induced by sample overlap. Therefore, the LDL-C evidence was excluded, further supporting the meta-analysis results (Table S9). Subsequent leave-one-out analysis identified rs7350481 within the Mastermind-like transcriptional coactivator 2 (MAML2) gene as driving the entire causal association between TG and COPD risk in both discovery and replication datasets (Figure S2). After excluding rs7350481, the causal association of TG with COPD lost statistical significance, rendering this finding suggestive evidence only (Table S10). Finally, CAUSE analysis confirmed a negative causal effect of genetically predicted TC on COPD risk (OR = 0.889, 95% CI: 0.823–0.960, P = 0.024), indicating superior fit of the causal model over the shared-effect model (Table 3 and Figure 3).
Table 3 MR-CAUSE Analysis, Linking Genetic Liability to TC with COPD
Figure 3 CAUSE analysis of the causal effect of TC on COPD in the EAS population. (A) shows the posterior distributions from the CAUSE model. The posterior density of gamma, representing the causal effect, is centered below zero, suggesting a negative causal impact of TC on COPD. The etas, representing shared factor effects, are near zero, and the q values, representing the proportion of variants affected by the shared factor, are concentrated near zero—indicating limited influence through a shared factor and favoring the causal model. (B) compares SNP effect estimates under different models. In the Sharing Model, the x-axis represents the effect of TC on SNPs (β_M) and the y-axis represents the effect of SNPs on COPD (β_Y). The weak linear trend and poor fit of the solid line (shared factor eta) suggest limited support for the sharing model. In the Causal Model, the dashed line represents the causal effect (gamma), and the overall fit better aligns with the data points, favoring a causal explanation. The ELPD Contribution plot illustrates the contribution of each SNP to model comparison—blue indicates support for the causal model and red for the sharing model, with deeper color and larger size indicating stronger influence. The predominance of blue points further supports the presence of a causal relationship.
In summary, strong evidence in the EAS population supports a protective effect of genetically predicted TC against COPD risk, corroborated by approximately ten sensitivity analyses and substantial methodological agreement (supported by 5 out of 8 complementary methods in the discovery dataset and all 8 complementary methods in the replication dataset) (Table S6 and Figure 2). The causal effect of TG on COPD was driven primarily by rs7350481 and should be considered suggestive evidence only. LDL-C associations were excluded due to bias from “winner’s curse.”
Causal Analysis of Lipid Traits on COPD Risk in AFR Population
In analyses involving the AFR ancestry population, a total of 619 genetic variants were included: 265 from the discovery dataset and 354 from the replication dataset (Table S4). In the discovery dataset, the primary analytical method indicated a causal association between genetically predicted TG levels and increased COPD risk (OR = 1.438, 95% CI: 1.091–1.896, P = 0.009), supported by all eight complementary methods (8/8) and demonstrating sufficient statistical power (Power = 97%). However, this association lost statistical significance in subsequent replication and meta-analyses (P > 0.05) (Table S6 and Figure 2). Although sensitivity analyses identified no heterogeneity or pleiotropy, and MRLap and leave-one-out analyses confirmed the stability of the estimate, the lack of replication means that the TG-COPD association is considered suggestive evidence only. No causal relationships between other lipid phenotypes and COPD risk were observed in this population.
Causal Analysis of Lipid Traits on COPD Risk in HIS Population
In the HIS ancestry analysis, 159 genetic variants were analyzed: 67 from the discovery dataset and 92 from the replication dataset (Table S5). No significant causal associations between lipid traits and COPD risk were observed in the discovery dataset, the replication dataset, or the subsequent meta-analysis (P > 0.05) (Table S6 and Figure 2). These findings were consistently supported by robust sensitivity analyses, which showed no evidence of heterogeneity or horizontal pleiotropy (Table S8).
Validation Analysis: Causal Analysis of Lipid Traits on COPD Risk in EUR Population
For comparative validation, this study applied the same analytical framework to populations of EUR ancestry. After genetic instrument screening and outlier removal, 1221 SNPs were retained for the discovery dataset (GBMI), and 1385 SNPs for the replication dataset (MVP) (Table S11A). Meta-analysis combining both datasets indicated that genetically predicted LDL-C and TC were causally associated with a reduced risk of COPD in the EUR population (OR < 1, P < 0.05) (Table S11B). However, no significant causal associations were observed between HDL-C or TG and COPD risk. Sensitivity analyses detected no evidence of heterogeneity or pleiotropy, confirming the robustness of these findings (Table S8).
Discussion
This MR study conducted a causal analysis of lipid traits and COPD within a multi-ancestry framework. The findings provide strong evidence for a causal relationship between genetically predicted TC levels and a reduced risk of COPD in individuals of EAS ancestry. Additionally, the observed inverse association between TG and COPD risk in the EAS population appeared to be primarily driven by the single nucleotide polymorphism rs7350481 located within the MAML2 gene. Given the potential influence of a single variant, this result should be interpreted with caution. Lastly, in the AFR ancestry group, suggestive evidence was observed supporting a positive causal association between genetically predicted TG levels and an increased risk of COPD.
Compared to previous MR studies conducted in EUR populations,15 our findings revealed consistent trends—namely, the association between higher LDL-C levels and decreased COPD risk in EAS, and between higher TG levels and increased COPD risk in AFR. However, in our study, the LDL-C findings were excluded due to sample overlap, and the evidence for TG was only suggestive. It is worth noting that previous studies also faced similar issues of sample overlap but did not apply corresponding adjustments, which may partially explain the discrepancies in results. Additionally, we identified a protective effect of higher TC levels on COPD risk in the EAS population, a finding supported by several epidemiological observations. For example, the Copenhagen General Population Study reported that lower cholesterol levels were associated with more severe COPD and worse prognosis.39 Similarly, the COSYCONET cohort found that COPD patients with coexisting hyperlipidemia had better lung function, suggesting that cholesterol may exert a protective role in pulmonary disease progression.40 An analysis of hospitalized patients further showed that each standard deviation increase in serum cholesterol levels among men was associated with a 25% reduction in mortality risk from respiratory diseases such as pneumonia or influenza.41 These observational findings support an apparent association between elevated lipid levels and improved pulmonary health. One plausible explanation is that higher cholesterol levels may reflect better nutritional and metabolic status, which could help buffer the catabolic burden of chronic illness, prevent respiratory muscle wasting, and mitigate weight loss. 
Additionally, cholesterol, as a structural component of cell membranes and pulmonary surfactant, plays a critical role in maintaining alveolar stability and reducing surface tension.42 Plasma lipoproteins also have immunomodulatory and anti-inflammatory properties, such as neutralizing bacterial lipopolysaccharides.43 Therefore, within a certain physiological range, elevated TC levels may contribute to COPD protection through multiple pathways. Nevertheless, given that excessive cholesterol can also lead to metabolic disturbances and cardiovascular comorbidities, such protective effects should be interpreted as context-dependent and reflective of a metabolic balance that warrants further investigation.
This study further incorporated a validation analysis using data from individuals of EUR ancestry. Findings indicated that genetically predicted levels of LDL-C and TC were significantly associated with a reduced risk of COPD, supporting the protective effects observed in the EAS population (although the LDL-C results in the EAS analysis were cautiously excluded due to sample overlap). Notably, we did not replicate the previously reported causal association from EUR populations15 that elevated TG levels increase COPD risk. Instead, we provide novel evidence of the protective effect of TC in the EUR population. Consistent with previous findings, both studies suggested a protective trend for LDL-C (OR < 1) and found no significant causal effect of HDL-C. The discrepancies observed could be attributed to methodological differences; previous analyses utilized smaller-scale GWAS datasets published prior to 2013, and relied on single-cohort COPD outcomes, limiting statistical power and representativeness. In contrast, this study leveraged the most recent large-scale multi-cohort GWAS from the GLGC (over 1.3 million individuals), combining data from multiple sources such as GBMI and MVP, substantially enhancing both sample size and external validity. As the current study was not specifically designed as a commentary on existing literature, future research, including systematic reviews and meta-MR analyses, is warranted to comprehensively evaluate the impact of ancestral diversity, data versions, and methodological heterogeneity on lipid-COPD causal relationships, thereby providing more robust evidence.
Interestingly, this study identified a negative causal association between TG and COPD in the EAS population, primarily driven by the SNP rs7350481 located within the MAML2 gene. Given that this association was almost entirely attributable to a single genetic variant, the result should be interpreted with caution. The rs7350481 SNP is positioned at 116,715,567 on chromosome 11 and lies within the intronic region of MAML2. This gene encodes a critical transcriptional co-activator in the Notch signaling pathway, which is broadly involved in cell fate determination, differentiation, immune regulation, and pulmonary tissue repair.44 The Notch pathway has been implicated in COPD pathogenesis, particularly in the regulation of inflammatory responses and airway remodeling. Notch1 and Notch3, in particular, are activated in smoking- and PM2.5-induced lung injury, contributing to immune dysregulation and disease progression.45 As a core component of the Notch pathway, MAML2 dysregulation may impair alveolar epithelial repair and disrupt immune homeostasis, thereby potentially contributing to COPD development. Although there is currently no direct evidence linking rs7350481 to COPD, this variant has been consistently associated with elevated TG levels and increased susceptibility to metabolic syndrome across several Asian populations. Notably, GWAS in Japanese and Indian cohorts have reported extremely strong associations between rs7350481 and TG levels (P = 7.52×10−26).46,47 However, these metabolic associations likely reflect systemic inflammatory or metabolic states rather than a direct mechanistic link to pulmonary pathology. Further analysis suggests that rs7350481 may influence COPD risk indirectly by modulating MAML2 expression and subsequently altering Notch pathway activity, thus contributing to pulmonary immune regulation and repair mechanisms. 
Therefore, we propose that the observed inverse causal association between TG and COPD is more likely attributable to the functional role of rs7350481 within the MAML2/Notch signaling axis, rather than a direct protective effect of TG metabolism on lung disease. It is important to emphasize that functional validation of rs7350481 in lung tissue is currently lacking, and this variant has not been identified as a significant locus in existing COPD GWAS. As such, the present finding should be regarded as exploratory, warranting further investigation through molecular functional assays and expression studies in pulmonary tissues.
In the AFR population, we observed a suggestive positive causal association between elevated TG levels and increased risk of COPD. This finding implies that the role of TG in COPD may exhibit ancestry-specific variation and, under certain genetic backgrounds, act as a risk factor. Epidemiological studies have also reported a close association between elevated TG levels and systemic inflammation in COPD, particularly among individuals with metabolic syndrome. In such populations, increased TG levels are often accompanied by low HDL cholesterol, insulin resistance, and chronic low-grade inflammation, all of which may contribute to the onset and progression of COPD.48 Although genetic and environmental exposures differ across populations, our findings suggest that elevated TG may indeed represent a risk factor for COPD within specific genetic contexts. This observation further underscores the need for future COPD prevention and management strategies to consider individualized metabolic profiles and ancestry-related genetic differences.
In MR analysis, the control of horizontal pleiotropy remains one of the central methodological challenges. Horizontal pleiotropy can be classified as either correlated or uncorrelated. The identification and control of uncorrelated pleiotropy are relatively straightforward and well-established. In this study, we employed a comprehensive suite of methods to assess and mitigate uncorrelated pleiotropy, including MR-Egger regression, MR-PRESSO combined with RadialMR, and the BWMR approach. In contrast, addressing correlated pleiotropy is considerably more complex. Previous MR studies have often relied on online tools such as Phenoscanner to manually screen phenotypic associations for each SNP and exclude those with potential pleiotropic effects based on traits. However, this approach has notable limitations. First, it lacks standardized, objective criteria and often relies on ambiguous or indirect evidence. Second, it may inadvertently remove SNPs with biologically meaningful associations, leading to excessive noise reduction, which in turn compromises statistical power and the interpretability of results.49 To more robustly address both forms of pleiotropy, our study incorporated the CAUSE method, which is based on a Bayesian inference framework. CAUSE enables simultaneous modeling and correction of both correlated and uncorrelated pleiotropy, thereby enhancing the credibility and stability of causal inference.
This study also identified a noteworthy phenomenon: certain positive associations observed in the discovery dataset exhibited even greater statistical significance in the replication dataset. For example, in the EAS population, the association between TC and COPD had a P of 1.29×10−4 in the discovery set, which improved to 3.95×10−5 in the replication set. Additionally, the number of IVs used in the replication analyses was generally higher than that in the discovery analyses. This discrepancy is primarily attributable to differences in sequencing depth across the GWAS datasets. Regarding dataset selection, the GBMI integrated four independent cohorts for each of the EAS and HIS populations and six for the AFR population, offering broader population coverage and more complex genetic architectures.23 These features enhance the generalizability and external validity of the findings. Therefore, GBMI was selected as the discovery dataset in this study, while the single-cohort datasets from the MVP and BBJ were used as replication datasets to reinforce the robustness of the results.
Additionally, this study observed highly significant causal associations between lipid phenotypes and COPD risk in the EAS population, whereas evidence from the AFR and HIS populations was comparatively limited. Several factors may explain these discrepancies. First, MR analyses are highly sensitive to sample size and the number of disease cases compared with traditional observational studies.38,50 In the GBMI dataset, the EAS population comprised 329,733 individuals, with COPD cases several times greater than the combined number of cases from the AFR and HIS groups (Table 1), thereby conferring greater statistical power to detect true effects. Conversely, the AFR (n = 29,682) and HIS (n = 15,086) populations were limited by smaller sample sizes and fewer cases, resulting in reduced statistical power. Second, differences in allele frequency, linkage disequilibrium structures, and gene-environment interactions across ancestries could cause attenuation or directional inconsistency of lipid-related SNP effects observed in AFR or HIS populations. Lastly, it is also plausible that a genuine causal relationship does not exist in these ancestry groups, and previously reported associations from observational studies may have been influenced by confounding factors or reverse causation. Therefore, future studies should conduct triangulation analyses in larger cohorts of AFR, HIS, and other underrepresented ancestries to further validate these findings.
Strengths, Limitations, and Future Directions
This MR analysis addresses a limitation common not only to previous studies on the relationship between lipid levels and COPD,15 but also to the broader body of MR literature: the overwhelming focus on individuals of EUR ancestry. As Europeans constitute less than 10% of the global population and occupy only about 5% of the Earth’s land surface, findings derived primarily from EUR-based datasets may have limited generalizability and representativeness on a global scale. MR analyses rely on the secondary use of GWAS data, and most publicly available GWAS datasets to date have been generated from EUR populations. This pattern is partly attributable to the technological leadership and extensive scientific infrastructure of European countries in genomics research, which has laid a foundation for understanding the genetic architecture of many diseases and provided a valuable reference for studies in other ancestral groups. However, with the advancement of genomic technologies and the global expansion of research initiatives, large-scale genetic databases for non-EUR populations have become increasingly accessible in recent years, including resources such as BBJ, the China Kadoorie Biobank (CKB), and the MVP. These databases offer critical opportunities to address the longstanding issue of limited external validity in MR studies. While the present study includes data from EAS, AFR, and HIS populations to examine the potential causal relationships between lipid phenotypes and COPD across multiple ancestries, it does not yet encompass all major ancestral groups, such as South Asians (SAS) and individuals from the Greater Middle East (GME). According to the GWAS Catalog (accessed April 2, 2025), lipid-related GWAS data are now available for SAS22 and GME51 populations; however, no COPD-related GWAS datasets have been identified for these groups to date.
Notably, although the GBMI includes 21,948 SAS samples, a dedicated COPD GWAS dataset for this ancestry is not yet available, limiting further analysis in this population. Future research should prioritize replication efforts in SAS, GME, and other underrepresented populations—when data become available—to enhance the robustness and generalizability of MR findings across diverse ancestries. Moreover, the ancestry-specific signals observed here remain exploratory; their biological relevance requires follow-up functional studies in cellular and animal models, targeted lipidomic profiling, and validation in prospective nutritional or epidemiologic cohorts.
Conclusion
In summary, this multi-ancestry MR study identified a significant causal association between genetically predicted higher TC levels and reduced risk of COPD in the EAS population. The observed protective effect of TG in EAS was primarily driven by the rs7350481 variant within the MAML2 gene, suggesting that the association may reflect genetic pleiotropy rather than a direct effect of TG metabolism. In contrast, elevated TG levels were associated with increased COPD risk in the AFR population. These findings underscore the ancestry-specific nature of genetic associations between lipid traits and COPD and suggest that some causal signals may be driven by the biological function of specific variants. Cautious interpretation is warranted, and future studies integrating functional validation are needed to further elucidate the underlying mechanisms. Accordingly, the present findings should be considered hypothesis-generating, pending confirmation and refinement through combined experimental and longitudinal clinical research to unravel ancestry-dependent mechanisms and potential therapeutic targets.
Abbreviations
AFR, African ancestry; BBJ, Biobank Japan; BWMR, Bayesian Weighted Mendelian Randomization; CAUSE, Causal Analysis Using Summary Effect Estimates; CI, Confidence Interval; CKB, China Kadoorie Biobank; cML, Constrained Maximum Likelihood; COPD, Chronic Obstructive Pulmonary Disease; ConMix, Contamination Mixture Method; dIVW, Debiased Inverse Variance Weighted; EAS, East Asian ancestry; EUR, European ancestry; GBMI, Global Biobank Meta-analysis Initiative; GLGC, Global Lipids Genetics Consortium; GWAS, Genome-Wide Association Study; HDL-C, High-Density Lipoprotein Cholesterol; HIS, Hispanic ancestry; IV, Instrumental Variable; IVW, Inverse Variance Weighted; LDL-C, Low-Density Lipoprotein Cholesterol; MAML2, Mastermind-like Transcriptional Coactivator 2; MR, Mendelian Randomization; MR-PRESSO, Mendelian Randomization Pleiotropy RESidual Sum and Outlier; MVP, Million Veteran Program; OR, Odds Ratio; RCT, Randomized Controlled Trial; SNP, Single Nucleotide Polymorphism; STROBE-MR, Strengthening the Reporting of Observational Studies in Epidemiology Using Mendelian Randomization; TC, Total Cholesterol; TG, Triglycerides; WHO, World Health Organization.
Data Sharing Statement
The data used in this study are publicly available GWAS summary statistics that can be downloaded from the sources listed in Table 1.
Ethics Approval and Informed Consent
The study was a secondary analysis of publicly available data and therefore did not require ethical approval or clinical registration.
Acknowledgments
We thank all GWAS participants and investigators for making the summary statistics publicly available.
Author Contributions
All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.
Funding
No funds were received for this study.
Disclosure
The authors report no conflicts of interest in this work.
References
1. Celli B, Fabbri L, Criner G, et al. Definition and nomenclature of chronic obstructive pulmonary disease: time for its revision. Am J Respir Crit Care Med. 2022;206(11):1317–1325. doi:10.1164/rccm.202204-0671PP
2. Al Wachami N, Guennouni M, Iderdar Y, et al. Estimating the global prevalence of chronic obstructive pulmonary disease (COPD): a systematic review and meta-analysis. BMC Public Health. 2024;24(1):297. doi:10.1186/s12889-024-17686-9
3. Alqahtani JS, Aquilina J, Bafadhel M, et al. Research priorities for exacerbations of COPD. Lancet Respir Med. 2021;9(8):824–826. doi:10.1016/S2213-2600(21)00227-7
4. WHO EMRO. Chronic obstructive pulmonary disease (COPD) | health topics. World Health Organization – Regional Office for the Eastern Mediterranean.
5. American Lung Association. COPD Trends Brief – Burden.
6. Agarwal D. COPD generates substantial cost for health systems. Lancet Glob Health. 2023;11(8):e1138–e1139. doi:10.1016/S2214-109X(23)00304-2
7. May SM, Li JTC. Burden of chronic obstructive pulmonary disease: healthcare costs and beyond. Allergy Asthma Proc. 2015;36(1):4–10. doi:10.2500/aap.2015.36.3812
8. Hassan MM, Tahir MH, Ameeq M, Jamal F, Mendy JT, Chesneau C. Risk factors identification of COVID-19 patients with chronic obstructive pulmonary disease: a retrospective study in Punjab-Pakistan. Immun Inflamm Dis. 2023;11(8):e981. doi:10.1002/iid3.981
9. Hassan MM, Sikandar SM, Jamal F, Ameeq M, Kargbo A. The complex relationship between chronic obstructive pulmonary disease with cardiovascular disease and their interactions with COVID-19 vaccination: a retrospective study. Immun Inflamm Dis. 2024;12(11):e70068. doi:10.1002/iid3.70068
10. Hassan MM, Sikandar SM, Jamal F, Ameeq M, Kargbo A. Chronic obstructive pulmonary disease patients with community-acquired pneumonia on inhaled corticosteroid therapy: a comprehensive analysis of risk factors, disease burden, and prevention strategies. Health Sci Rep. 2025;8(1):e70395. doi:10.1002/hsr2.70395
11. Xuan L, Han F, Gong L, et al. Association between chronic obstructive pulmonary disease and serum lipid levels: a meta-analysis. Lipids Health Dis. 2018;17(1):263. doi:10.1186/s12944-018-0904-4
12. Huang Y, Ding K, Dai Z, et al. The relationship of low-density-lipoprotein to lymphocyte ratio with chronic obstructive pulmonary disease. Int J Chron Obstruct Pulmon Dis. 2022;17:2175–2185. doi:10.2147/COPD.S369161
13. Paul S, Chakrabortty R, Islam S, Paul S, Choudhury A, Rahman M. Lipid profile patterns in chronic obstructive pulmonary disease and its correlation with the severity of disease. Bangabandhu Sheikh Mujib Med Univ J. 2023;15:37–41. doi:10.3329/bsmmuj.v15i4.64151
14. Markelić I, Hlapčić I, Rogić D, et al. Lipid profile and atherogenic indices in patients with stable chronic obstructive pulmonary disease. Nutr Metab Cardiovasc Dis. 2021;31(1):153–161. doi:10.1016/j.numecd.2020.07.039
15. Huang P, Zhao Y, Wei H, et al. Causal relationships between blood lipid levels and chronic obstructive pulmonary disease: a Mendelian randomization analysis. Int J Chron Obstruct Pulmon Dis. 2025;20:83–93. doi:10.2147/COPD.S476833
16. Varmaghani M, Dehghani M, Heidari E, Sharifi F, Moghaddam SS, Farzadfar F. Global prevalence of chronic obstructive pulmonary disease: systematic review and meta-analysis. East Mediterr Health J. 2019;25(1):47–57. doi:10.26719/emhj.18.014
17. Gilkes A, Ashworth M, Schofield P, et al. Does COPD risk vary by ethnicity? A retrospective cross-sectional study. Int J Chron Obstruct Pulmon Dis. 2016;11:739–746. doi:10.2147/COPD.S96391
18. Lawlor DA, Harbord RM, Sterne JAC, Timpson N, Davey Smith G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med. 2008;27(8):1133–1163. doi:10.1002/sim.3034
19. Skrivankova VW, Richmond RC, Woolf BAR, et al. Strengthening the reporting of observational studies in epidemiology using Mendelian randomization: the STROBE-MR Statement. JAMA. 2021;326(16):1614–1621. doi:10.1001/jama.2021.18236
20. Zhang Y, Su Y, Tang Z, Li L. The impact of cannabis use on erectile dysfunction and sex hormones: a Mendelian randomization analysis. Int J Impot Res. 2024;2024:1–8.
21. Zhang Y, Ni Y, Li L. Genetic insights into the causal relationship between cannabis use and diabetic phenotypes: a genetic correlation and Mendelian randomization study. Drug Alcohol Depend. 2024;254:111037. doi:10.1016/j.drugalcdep.2023.111037
22. Graham SE, Clarke SL, Wu KHH, et al. The power of genetic diversity in genome-wide association studies of lipids. Nature. 2021;600(7890):675–679. doi:10.1038/s41586-021-04064-3
23. Zhou W, Kanai M, Wu KHH, et al. Global Biobank meta-analysis initiative: powering genetic discovery across human disease. Cell Genom. 2022;2(10):100192. doi:10.1016/j.xgen.2022.100192
24. Verma A, Huffman JE, Rodriguez A, et al. Diversity and scale: genetic architecture of 2068 traits in the VA Million Veteran Program. Science. 2024;385(6706):eadj1182. doi:10.1126/science.adj1182
25. Sakaue S, Kanai M, Tanigawa Y, et al. A cross-population atlas of genetic associations for 220 human phenotypes. Nat Genet. 2021;53(10):1415–1424. doi:10.1038/s41588-021-00931-x
26. Burgess S, Butterworth A, Thompson SG. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol. 2013;37(7):658–665. doi:10.1002/gepi.21758
27. Burgess S, Foley CN, Allara E, Staley JR, Howson JMM. A robust and efficient method for Mendelian randomization with hundreds of genetic variants. Nat Commun. 2020;11(1):376. doi:10.1038/s41467-019-14156-4
28. Zhao Q, Wang J, Hemani G, Bowden J, Small DS. Statistical inference in two-sample summary-data Mendelian randomization using robust adjusted profile score. Ann Stat. 2020;48(3):1742–1769. doi:10.1214/19-AOS1866
29. Ye T, Shao J, Kang H. Debiased inverse-variance weighted estimator in two-sample summary-data Mendelian randomization. Ann Stat. 2021;49(4):2079–2100. doi:10.1214/20-AOS2027
30. Xue H, Shen X, Pan W. Constrained maximum likelihood-based Mendelian randomization robust to both correlated and uncorrelated pleiotropic effects. Am J Hum Genet. 2021;108(7):1251–1269. doi:10.1016/j.ajhg.2021.05.014
31. Zhao J, Ming J, Hu X, Chen G, Liu J, Yang C. Bayesian weighted Mendelian randomization for causal inference based on summary statistics. Bioinformatics. 2020;36(5):1501–1508. doi:10.1093/bioinformatics/btz749
32. Mounier N, Kutalik Z. Bias correction for inverse variance weighting Mendelian randomization. Genet Epidemiol. 2023;47(4):314–331. doi:10.1002/gepi.22522
33. Kulinskaya E, Dollinger MB, Bjørkestøl K. On the moments of Cochran’s Q statistic under the null hypothesis, with application to the meta-analysis of risk difference. Res Synth Methods. 2020;11(6):920. doi:10.1002/jrsm.1446
34. Burgess S, Thompson SG. Interpreting findings from Mendelian randomization using the MR-Egger method. Eur J Epidemiol. 2017;32(5):377–389. doi:10.1007/s10654-017-0255-x
35. Verbanck M, Chen CY, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet. 2018;50(5):693–698. doi:10.1038/s41588-018-0099-7
36. Bowden J, Spiller W, Del Greco MF, et al. Improving the visualization, interpretation and analysis of two-sample summary data Mendelian randomization via the Radial plot and Radial regression. Int J Epidemiol. 2018;47(4):1264–1278. doi:10.1093/ije/dyy101
37. Morrison J, Knoblauch N, Marcus JH, Stephens M, He X. Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat Genet. 2020;52(7):740–747. doi:10.1038/s41588-020-0631-4
38. Brion MJA, Shakhbazov K, Visscher PM. Calculating statistical power in Mendelian randomization studies. Int J Epidemiol. 2013;42(5):1497–1501. doi:10.1093/ije/dyt179
39. Freyberg J, Landt EM, Afzal S, Nordestgaard BG, Dahl M. Low-density lipoprotein cholesterol and risk of COPD: Copenhagen General Population Study. ERJ Open Res. 2023;9(2):00496–02022. doi:10.1183/23120541.00496-2022
40. Kahnert K, Lucke T, Huber RM, et al. Relationship of hyperlipidemia to comorbidities and lung function in COPD: results of the COSYCONET cohort. PLoS One. 2017;12(5):e0177501. doi:10.1371/journal.pone.0177501
41. Iribarren C, Jacobs DR, Sidney S, et al. Serum total cholesterol and risk of hospitalization, and death from respiratory disease. Int J Epidemiol. 1997;26(6):1191–1202. doi:10.1093/ije/26.6.1191
42. Keating E, Rahman L, Francis J, et al. Effect of cholesterol on the biophysical and physiological properties of a clinical pulmonary surfactant. Biophys J. 2007;93(4):1391–1401. doi:10.1529/biophysj.106.099762
43. Gowdy KM, Fessler MB. Emerging roles for cholesterol and lipoproteins in lung disease. Pulm Pharmacol Ther. 2013;26(4):430–437. doi:10.1016/j.pupt.2012.06.002
44. Kitagawa M. Notch signalling in the nucleus: roles of Mastermind-like (MAML) transcriptional coactivators. J Biochem. 2016;159(3):287–294. doi:10.1093/jb/mvv123
45. Zong D, Ouyang R, Li J, Chen Y, Chen P. Notch signaling in lung diseases: focus on Notch1 and Notch3. Ther Adv Respir Dis. 2016;10(5):468–484. doi:10.1177/1753465816654873
46. Braun TR, Been LF, Singhal A, et al. A replication study of GWAS-derived lipid genes in Asian Indians: the chromosomal region 11q23.3 harbors loci contributing to triglycerides. PLoS One. 2012;7(5):e37056. doi:10.1371/journal.pone.0037056
47. Yamada Y, Sakuma J, Takeuchi I, et al. Identification of rs7350481 at chromosome 11q23.3 as a novel susceptibility locus for metabolic syndrome in Japanese individuals by an exome-wide association study. Oncotarget. 2017;8(24):39296–39308. doi:10.18632/oncotarget.16945
48. Yang HY, Hu LY, Chen HJ, Chen RY, Hu CK, Shen CC. Increased risk of chronic obstructive pulmonary disease in patients with hyperlipidemia: a nationwide population-based cohort study. Int J Environ Res Public Health. 2022;19(19):12331. doi:10.3390/ijerph191912331
49. Cho Y, Haycock PC, Sanderson E, et al. Exploiting horizontal pleiotropy to search for causal pathways within a Mendelian randomization framework. Nat Commun. 2020;11(1):1010. doi:10.1038/s41467-020-14452-4
50. Burgess S. Sample size and power calculations in Mendelian randomization with a single instrumental variable and a binary outcome. Int J Epidemiol. 2014;43(3):922–929. doi:10.1093/ije/dyu005
51. Thareja G, Al-Sarraj Y, Belkadi A, et al. Whole genome sequencing in the Middle Eastern Qatari population identifies genetic associations with 45 clinically relevant traits. Nat Commun. 2021;12(1):1250. doi:10.1038/s41467-021-21381-3