Study design
MR is a specialized analytical method that employs genetic instrumental variables (IVs), specifically single nucleotide polymorphisms (SNPs), to assess the effects of risk factors on various outcomes, including diseases [11]. This MR study consists of two analytical phases (Fig. 1). In the first phase, the causal impacts of nine adverse events during pregnancy on six congenital malformations in offspring were investigated using a Two-Sample MR approach. In the second phase, the mediating role of circulating metabolites in the causal pathways between adverse events during pregnancy and these congenital outcomes was evaluated using a Two-Step Two-Sample MR approach.
Fig. 1 Overview of the study design. Flow diagram of the two-step MR analysis involving adverse events during pregnancy, circulating metabolites, and congenital malformations
The study employs a two-sample MR design. For the causal estimates to be valid, the instrumental variables must satisfy three core assumptions [11]: (1) the IVs are strongly associated with the exposure; (2) the IVs are independent of confounders of the exposure-outcome relationship; (3) the IVs influence the outcome exclusively through the exposure. This study adheres to the guidelines of the Strengthening the Reporting of Observational Studies in Epidemiology Using Mendelian Randomization (STROBE-MR) [12].
All MR analyses were conducted with the packages “TwoSampleMR”, “MendelianRandomization”, “MRPRESSO”, “MRInstruments”, and “ieugwasr” in R software (version 4.3.0).
Data source
Genome-wide association study (GWAS) summary statistics supplied the genetic instruments for the MR analyses. Summary statistics for adverse events during pregnancy were obtained from the UK Biobank and accessed through the Pan UKBB portal [13] (Supplementary Table 1). Summary statistics on congenital malformations were sourced from the Finn Biobank (Supplementary Table 2).
For this study, data on 249 nuclear magnetic resonance circulating metabolites from 121,000 participants of European ancestry were utilized. These included absolute concentrations of 168 biomarkers and 81 biomarker ratios, predominantly encompassing lipids and lipoprotein particle subfractions (accounting for 81% of the data). Additional measured biomarkers included cholesterol, amino acids, esterified cholesterol, apolipoproteins, fatty acids, free cholesterol, lipoprotein particle size, ketone bodies, choline, glycolysis-related compounds, phospholipids, and triglycerides. These metabolite profiles were generated by Nightingale Health [14]. The full GWAS summary statistics for these biomarkers are publicly available in the IEU Open-GWAS Project database under the GWAS identifier “met-d” (Supplementary Table 3).
Instrumental variables selection
Initially, this study employed a significance threshold of P < 5e-8 to identify SNPs strongly associated with the exposure factors. However, because too few SNPs associated with adverse events during pregnancy met this threshold, which compromised the reliability of the results, a more lenient cutoff of P < 1e-5 was adopted for these exposures. For the selection of SNPs related to circulating metabolites, the stricter significance threshold of P < 5e-8 was maintained. Independent SNPs were retained through a clumping procedure with a window size of 10,000 kb and a linkage disequilibrium (LD) threshold of r² < 0.01. Palindromic SNPs were excluded because their strand orientation cannot be aligned reliably between the exposure and outcome datasets. The F-statistic, which integrates the magnitude and precision of the genetic effect on the trait, was calculated as:
$$F=\frac{R^{2}(N-2)}{1-R^{2}}$$
where R² signifies the proportion of the trait’s variance explained by the SNP, and N denotes the sample size of the GWAS of the trait [15]. The R² values were estimated using the formula:
$$R^{2}=2\times \mathrm{EAF}\times \left(1-\mathrm{EAF}\right)\times \beta^{2}$$
Here, EAF is the effect allele frequency of the SNP, and β represents the estimated effect of the SNP on the trait. SNPs with an F-statistic less than 10 were excluded as weak instruments; an F-statistic greater than 10 was taken to indicate adequate instrument strength.
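The instrument-strength filter described above can be sketched in a few lines. This is only an illustrative Python sketch of the two formulas, not the code used in the study (which relied on R packages); the function names, the SNP identifiers, and the example numbers are all hypothetical.

```python
def snp_r2(eaf: float, beta: float) -> float:
    """Variance explained by one SNP: R^2 = 2 * EAF * (1 - EAF) * beta^2."""
    return 2.0 * eaf * (1.0 - eaf) * beta ** 2


def f_statistic(r2: float, n: int) -> float:
    """Instrument strength: F = R^2 * (N - 2) / (1 - R^2)."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return r2 * (n - 2) / (1.0 - r2)


def keep_strong_instruments(snps, n, threshold=10.0):
    """Drop SNPs whose F-statistic is below the threshold (10 in the text)."""
    kept = []
    for snp in snps:
        f_value = f_statistic(snp_r2(snp["eaf"], snp["beta"]), n)
        if f_value >= threshold:
            kept.append({**snp, "F": f_value})
    return kept


# Hypothetical SNPs and sample size, for illustration only.
example_snps = [
    {"rsid": "rs_example_1", "eaf": 0.30, "beta": 0.05},
    {"rsid": "rs_example_2", "eaf": 0.10, "beta": 0.004},
]
for snp in keep_strong_instruments(example_snps, n=100_000):
    print(snp["rsid"], round(snp["F"], 1))
```

With these made-up inputs only the first SNP passes the filter; the second explains too little variance to reach F = 10.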
Statistical analyses
MR analysis to estimate the effects of adverse events during pregnancy on congenital malformations
Two-Sample MR was utilized to estimate the effect of adverse events during pregnancy on congenital malformations. The Inverse Variance Weighted (IVW) method served as the primary analysis technique, offering the most precise and powerful estimates assuming all genetic variants are valid instruments. To comprehensively evaluate potential relationships, additional robust methods were employed, including MR-Egger, weighted median, weighted mode, and simple mode.
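To make the IVW idea concrete, the sketch below combines per-SNP Wald ratios using inverse-variance weights under a fixed-effect model with first-order standard errors. This is an assumption made for illustration: the R packages used in the study may apply a different variant (for example, multiplicative random effects), and the function name and input values here are hypothetical.

```python
import math


def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    Each SNP gives a Wald ratio beta_out / beta_exp with approximate
    standard error se_out / |beta_exp|, so its inverse-variance weight
    is beta_exp^2 / se_out^2.
    """
    if not beta_exp or not (len(beta_exp) == len(beta_out) == len(se_out)):
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_exp, se_out)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error


# Hypothetical summary statistics for three SNPs.
estimate, standard_error = ivw_estimate(
    beta_exp=[0.10, 0.20, 0.30],
    beta_out=[0.05, 0.10, 0.15],
    se_out=[0.01, 0.02, 0.01],
)
print(estimate, standard_error)
```

SNPs with larger exposure effects or more precisely measured outcome effects receive more weight, which is why the method is efficient when every instrument is valid and biased when some are not.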
MR-Egger regression, a method originally developed to detect publication bias in meta-analyses, was also applied to assess directional pleiotropy across the genetic variants [16]. Specifically, the MR-Egger intercept was used to test for horizontal pleiotropy [17]. When horizontal pleiotropy was detected, outlying SNPs were removed and the IVW method was reapplied to combine the effect sizes of the remaining SNPs.
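The MR-Egger intercept can be illustrated as a weighted linear regression of SNP-outcome effects on SNP-exposure effects in which the intercept is not forced through zero. The sketch below returns point estimates only; the significance test on the intercept, which the study's R packages report, is omitted. The function name and data are hypothetical.

```python
def egger_intercept(beta_exp, beta_out, se_out):
    """Weighted regression of beta_out on beta_exp with a free intercept.

    SNPs are first oriented so that every exposure effect is positive.
    Weights are 1 / se_out^2. Returns (intercept, slope); an intercept
    far from zero suggests directional (horizontal) pleiotropy.
    """
    n_snps = len(beta_exp)
    if n_snps < 3 or not (n_snps == len(beta_out) == len(se_out)):
        raise ValueError("need at least three SNPs with matching inputs")
    xs, ys = [], []
    for bx, by in zip(beta_exp, beta_out):
        sign = -1.0 if bx < 0 else 1.0
        xs.append(sign * bx)
        ys.append(sign * by)
    ws = [1.0 / s ** 2 for s in se_out]
    total_weight = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total_weight
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total_weight
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    if sxx == 0.0:
        raise ValueError("exposure effects must not all be identical")
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data constructed with a pleiotropic offset of 0.02.
intercept, slope = egger_intercept(
    beta_exp=[0.10, -0.20, 0.30, 0.40],
    beta_out=[0.07, -0.12, 0.17, 0.22],
    se_out=[0.01, 0.02, 0.015, 0.01],
)
print(intercept, slope)
```

In this constructed example every SNP carries the same extra effect on the outcome, so the regression recovers a non-zero intercept while the slope still reflects the causal effect.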
Mediation MR analysis linking adverse events during pregnancy with congenital malformations via circulating metabolites
Two-step MR was utilized to estimate the mediation effect of circulating metabolites on the relationship between adverse events during pregnancy and congenital malformations. Initially, the impact of adverse events during pregnancy on 168 metabolites was quantified, denoted as (βexp-med). Subsequently, the analysis assessed the influence of circulating metabolites—those that exhibited statistically significant associations with adverse events during pregnancy—on the congenital malformations, represented as (βmed-out).
The indirect effect of the exposure (adverse events during pregnancy) on the outcome (congenital malformations) mediated through metabolites was calculated as the product of βexp-med and βmed-out:
$$\beta_{indirect}=\beta_{exp\text{-}med}\times \beta_{med\text{-}out}$$
Furthermore, the direct effect of the exposure on the outcome, that is, the component of the exposure’s effect that is independent of the proposed mediator (the circulating metabolites), was computed using the equation:
$$\beta_{direct}=\beta_{exp\text{-}out}-\beta_{indirect}$$
where βexp-out denotes the total effect of the exposure on the outcome estimated in the two-sample MR analysis.
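The two mediation formulas reduce to simple arithmetic, sketched below. The standard error of the indirect effect is computed with a first-order delta-method approximation; this is an assumption added for illustration, since the text does not state how uncertainty in the indirect effect was obtained. All names and numbers are hypothetical.

```python
import math


def mediation_effects(beta_exp_med, beta_med_out, beta_exp_out):
    """Product-of-coefficients mediation.

    indirect = beta_exp_med * beta_med_out
    direct   = beta_exp_out - indirect
    """
    indirect = beta_exp_med * beta_med_out
    direct = beta_exp_out - indirect
    return indirect, direct


def indirect_se_delta(beta_exp_med, se_exp_med, beta_med_out, se_med_out):
    """Approximate SE of the indirect effect (delta method; an assumption,
    not a procedure described in the text)."""
    return math.sqrt(
        beta_exp_med ** 2 * se_med_out ** 2
        + beta_med_out ** 2 * se_exp_med ** 2
    )


# Hypothetical effect estimates on a common scale.
indirect, direct = mediation_effects(
    beta_exp_med=0.4, beta_med_out=0.5, beta_exp_out=0.3
)
se_indirect = indirect_se_delta(0.4, 0.1, 0.5, 0.2)
print(indirect, direct, se_indirect)
```

In this made-up example, 0.2 of the total effect of 0.3 runs through the mediator and the remaining 0.1 is the direct effect.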
Sensitivity analysis
In the analysis assessing the impact of adverse events during pregnancy on congenital malformations, several sensitivity analyses were performed to assess the robustness of the findings. These included the MR-Egger, MR-PRESSO, weighted median, simple mode, and weighted mode methods. Additionally, MR-Egger regression was used to evaluate potential bias arising from horizontal pleiotropy, with the intercept serving as an indicator of such bias. To quantify heterogeneity among SNPs, Cochran’s Q statistic was calculated for both the IVW and MR-Egger methods.
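As an illustration of the heterogeneity check, the sketch below computes Cochran’s Q around a fixed-effect IVW estimate. It covers only the IVW version; the MR-Egger version and the chi-square p-value are left out to keep the example dependency-free. The function name and data are hypothetical.

```python
def cochran_q_ivw(beta_exp, beta_out, se_out):
    """Cochran's Q for the IVW model.

    Q = sum_j w_j * (ratio_j - ivw)^2, with w_j = beta_exp_j^2 / se_out_j^2
    and ratio_j = beta_out_j / beta_exp_j. Under homogeneity, Q is compared
    with a chi-square distribution on (number of SNPs - 1) degrees of freedom.
    """
    n_snps = len(beta_exp)
    if n_snps < 2 or not (n_snps == len(beta_out) == len(se_out)):
        raise ValueError("need at least two SNPs with matching inputs")
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_exp, se_out)]
    ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q_value = sum(w * (r - ivw) ** 2 for w, r in zip(weights, ratios))
    return q_value, n_snps - 1


# Hypothetical data: two SNPs whose ratio estimates disagree (0.4 vs 0.6).
q_value, degrees_of_freedom = cochran_q_ivw(
    beta_exp=[0.10, 0.10],
    beta_out=[0.04, 0.06],
    se_out=[0.01, 0.01],
)
print(q_value, degrees_of_freedom)
```

A Q value that is large relative to its degrees of freedom indicates that the SNP-specific estimates vary more than sampling error alone would explain.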