Causal effect of conventional anti-dementia drugs on economic burden: an orthogonal double/debiased machine learning approach | BMC Geriatrics

Study design and data source

This study used a pooled cross-sectional design with data from the Medicare Current Beneficiary Survey (MCBS) [13], developed by the Centers for Medicare & Medicaid Services (CMS). MCBS uses a stratified multistage sampling design and computer-assisted personal interviews, and is designed to be nationally representative. It also links each participant's survey responses to their Medicare claims. The claims include Part A inpatient, Part B outpatient, and Part D prescription records, with information on diagnoses, health care utilization, and health care costs. The linked data enabled this study to examine anti-dementia drug utilization and costs among patients with ADRD.

Patient selection

This study included Medicare beneficiaries who had a diagnosis of ADRD, were 65 years or older, and participated in the MCBS between 2015 and 2019. The sample was restricted to individuals enrolled in Medicare Parts A, B, and D.

Variable definitions

The presence of ADRD and anti-dementia drug use were determined through Medicare claims data. Patients with ADRD were identified using the International Classification of Diseases, Ninth and Tenth Revisions, Clinical Modification (ICD-9-CM, ICD-10-CM) diagnosis codes for dementia as defined by the Chronic Conditions Warehouse (CCW) [14] (Supplementary Table 1). ADRD in our study included Alzheimer’s disease, vascular dementia, frontotemporal dementia, unspecified dementias, and other neurodegenerative and cognitive disorders commonly associated with ADRD.

Health care costs were measured as total medical costs and categorized into Medicare costs, out-of-pocket (OOP) costs, inpatient costs, and outpatient costs from Medicare Part A (inpatient), Part B (outpatient/physician), and Part D (prescription drugs) claims. Conventional anti-dementia drugs in this study consisted of two classes: ChEIs and an NMDAR antagonist. Specifically, ChEIs included rivastigmine, donepezil, and galantamine, while memantine was classified as the NMDAR antagonist. Anti-dementia drug users were defined as individuals with at least one prescription fill for a ChEI or the NMDAR antagonist during the observation year, based on Medicare Part D claims. Those without any such fills were classified as non-users. Health care costs and drug use were measured over the same calendar year. All costs were converted to 2024 US dollars using the Consumer Price Index (CPI).
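
The user definition and the CPI adjustment described above can be sketched in a few lines of Python. This is only an illustration: the CPI index values are made-up placeholders rather than official figures, and the function names `to_2024_dollars` and `is_user` are hypothetical, not taken from the study.

```python
# Placeholder CPI index values for illustration only (NOT official figures).
CPI = {2015: 100.0, 2019: 108.0, 2024: 130.0}

# Drug names as listed in the text.
CHEIS = {"rivastigmine", "donepezil", "galantamine"}
NMDAR_ANTAGONISTS = {"memantine"}


def to_2024_dollars(cost, year, cpi=CPI, base_year=2024):
    """Rescale a nominal cost to base-year dollars using a CPI ratio."""
    return cost * cpi[base_year] / cpi[year]


def is_user(part_d_fills):
    """Return True if at least one fill in the year is a ChEI or memantine."""
    drugs = {name.lower() for name in part_d_fills}
    return bool(drugs & (CHEIS | NMDAR_ANTAGONISTS))


if __name__ == "__main__":
    print(to_2024_dollars(1000.0, 2015))
    print(is_user(["Donepezil", "metformin"]))
```

A beneficiary with no qualifying fill in the calendar year is treated as a non-user, which matches the binary definition in the text.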

Based on the newly released National Institute on Aging (NIA) Health Disparities Research Framework [15], a total of 56 covariates were used in our study, including biological factors (e.g., age, sex, race), environmental factors (e.g., residence, cost-related medication nonadherence), sociocultural factors (e.g., education level, income), and behavioral factors (e.g., activities of daily living, instrumental activities of daily living) (Supplementary Table 2).

Statistical analysis

Traditional regression models are often limited by assumptions of linearity and may struggle to account for high-dimensional confounding, leading to biased estimates in observational studies. To overcome these challenges and strengthen causal inference, we employed Double/Debiased Machine Learning (DML) [16] to obtain more accurate and unbiased estimates of the causal effects of anti-dementia drug use on healthcare costs.

DML flexibly models both treatment assignment and outcome using machine learning algorithms, while applying sample splitting and Neyman orthogonality to minimize bias [16]. Orthogonalization isolates the causal effect of interest from the influence of high-dimensional covariates and reduces the bias introduced by regularization. Cross-fitting, in which the nuisance models are fitted on one part of the sample and evaluated on the held-out part, with the roles then rotated, limits bias from overfitting. Given the complex, nonlinear relationships and large number of covariates in real-world healthcare data, DML is well suited to this setting.
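
For reference, one common DML formulation is the partially linear regression model with the "partialling-out" score. The source does not state which model class was used, so the equations below are an assumed illustration rather than the study's specification. Here Y is the cost outcome, D the drug-use indicator, X the covariates, \ell_0(X) = E[Y | X], m_0(X) = E[D | X], and I_k is the k-th of K folds, with nuisance estimates indexed by -k fitted on all other folds.

```latex
\begin{aligned}
Y &= \theta_0 D + g_0(X) + \varepsilon, & \mathbb{E}[\varepsilon \mid D, X] &= 0,\\
D &= m_0(X) + V, & \mathbb{E}[V \mid X] &= 0,\\
\psi(W;\theta,\eta) &= \bigl(Y - \ell(X) - \theta\,(D - m(X))\bigr)\,\bigl(D - m(X)\bigr), & \eta &= (\ell, m),\\
\hat{\theta} &= \frac{\sum_{k=1}^{K}\sum_{i \in I_k} \bigl(D_i - \hat{m}_{-k}(X_i)\bigr)\bigl(Y_i - \hat{\ell}_{-k}(X_i)\bigr)}{\sum_{k=1}^{K}\sum_{i \in I_k} \bigl(D_i - \hat{m}_{-k}(X_i)\bigr)^{2}}.
\end{aligned}
```

The estimator solves the empirical moment condition for the score, so small errors in the estimated nuisance functions affect the estimate of \theta_0 only to second order.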

In our study, we used LASSO (Least Absolute Shrinkage and Selection Operator) to predict both the outcome and the treatment variable. We chose LASSO regression for two reasons. First, it is effective at preventing overfitting and handling multicollinearity, which matters here because many covariates in this study are similar. Second, although it is a machine learning technique that tunes parameters on the training data, LASSO regression remains highly interpretable. This interpretability is particularly valuable for public health practice [17], as it allows us to examine and understand the coefficients derived from the model.
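
To show how the L1 penalty shrinks weak coefficients exactly to zero, here is a minimal coordinate-descent sketch in plain Python. It is a teaching illustration, not the implementation used in the study (which relied on R packages). The function names and the toy data are hypothetical, and the sketch assumes centred columns with no intercept.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * sum((y - X b)^2) + lam * sum(|b|).

    Plain coordinate descent; assumes centred columns and no intercept.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            b[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return b


if __name__ == "__main__":
    # Toy data: strong signal on feature 0, very weak signal on feature 1.
    X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
    y = [2.05, -1.95, 1.95, -2.05]
    print(lasso_cd(X, y, lam=0.1))
```

In the toy data the second coefficient falls below the penalty threshold and is set to zero, while the first is shrunk by the penalty but retained.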

To estimate the effect of anti-dementia drug use on cost change, we first used machine learning models to estimate, from patients’ characteristics, each individual’s probability of receiving the drug (treatment model) and their expected change in costs regardless of treatment (outcome model). DML then estimated the average treatment effect by comparing the adjusted cost changes between drug users and non-users, using a cross-fitting procedure to improve estimation accuracy and reduce bias.
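
The two-step procedure with cross-fitting can be sketched as follows. This is a minimal standard-library illustration under stated assumptions, not the study's pipeline: a simple least-squares line on a single covariate stands in for LASSO, the data are simulated, the partialling-out estimator is one of several DML variants, and all names (`fit_line`, `simulate`, `dml_plr`) are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Fit y = a + b*x by least squares and return a prediction function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def simulate(n, true_effect, seed):
    """Simulated data: x confounds both treatment d and outcome y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    d = [1.0 if 0.5 * xi + rng.gauss(0, 1) > 0 else 0.0 for xi in x]
    y = [true_effect * di + 2.0 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
    return x, d, y


def dml_plr(x, d, y, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of the treatment effect."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    d_res, y_res = [0.0] * n, [0.0] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Nuisance models are fitted only on the other folds.
        m_hat = fit_line([x[i] for i in train], [d[i] for i in train])
        l_hat = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            d_res[i] = d[i] - m_hat(x[i])  # treatment residual
            y_res[i] = y[i] - l_hat(x[i])  # outcome residual
    # Regress outcome residuals on treatment residuals.
    return sum(a * b for a, b in zip(d_res, y_res)) / sum(a * a for a in d_res)


if __name__ == "__main__":
    x, d, y = simulate(2000, true_effect=3.0, seed=1)
    print(dml_plr(x, d, y))
```

Each observation's residuals come from models that never saw that observation, which is the point of cross-fitting. On the simulated data the estimate should land close to the true effect of 3.0.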

In this study, individuals with missing values for costs and anti-dementia drug use information were excluded. For covariates with missing values, the missing values were treated as a distinct category, and participants with missing data for these variables were retained in the analysis. In addition, we conducted sensitivity analyses using multiple imputation of missing values for each anti-dementia drug. Specifically, we generated five imputed datasets, using logistic regression (“logreg”) for binary variables and multinomial regression (“polyreg”) for multi-categorical variables.
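
The two ways of handling missing covariates can be illustrated with a short Python sketch. The first function recodes missing values as their own category. The second combines estimates across imputed datasets with Rubin's rules, which is the usual pooling approach but is an assumption here, since the text does not describe how results from the five datasets were combined. All names and numbers are hypothetical.

```python
from statistics import mean, variance


def missing_as_category(values, label="Missing"):
    """Recode missing entries (None) as a distinct category level."""
    return [label if v is None else v for v in values]


def pool_rubin(estimates, variances):
    """Pool per-dataset estimates and variances using Rubin's rules.

    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


if __name__ == "__main__":
    print(missing_as_category(["High school", None, "College"]))
    # Made-up effect estimates from five imputed datasets.
    print(pool_rubin([1.0, 2.0, 3.0, 4.0, 5.0], [0.5] * 5))
```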

Differences in patient characteristics between users and non-users of anti-dementia drugs were compared using Chi-square tests. Survey sampling weights were applied to generate national estimates. A P-value less than 0.05 was considered statistically significant. The R package “DoubleML” was used to conduct DML, and the R package “mice” was used to perform multiple imputation.
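
As a small illustration of how sampling weights turn sample values into population-level estimates, the sketch below computes a weighted proportion and a weighted mean. The weights and values are made up, the function names are hypothetical, and the sketch does not cover design-based variance estimation, which a full survey analysis would also need.

```python
def weighted_proportion(flags, weights):
    """Share of the weighted population for which the flag is true."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)


def weighted_mean(values, weights):
    """Weighted mean, e.g. of annual costs across respondents."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


if __name__ == "__main__":
    # Three hypothetical respondents representing 100, 300, and 100 people.
    weights = [100.0, 300.0, 100.0]
    print(weighted_proportion([True, False, True], weights))
    print(weighted_mean([10000.0, 20000.0, 30000.0], weights))
```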
