The current education research-to-policy pipeline is too slow to keep pace with the urgent needs of districts and states. Researchers face steep barriers to accessing high-quality, multimodal data, while existing R&D infrastructures remain siloed and under-resourced. Without scalable, trusted systems that enable timely and secure data use, the U.S. risks falling behind in generating actionable, evidence-based insights to guide policy and practice. In this memo, we discuss how privacy-preserving research models can strengthen education R&D capacity.
Challenge and Opportunity
Learning is a lifelong and multidimensional process, yet data about learning has historically been difficult to obtain. The shift to digital learning platforms (DLPs), accelerated by COVID-19, has created a wealth of data, but accessing it remains complex and slow – especially for researchers with fewer institutional resources.
Additionally, complex privacy laws, such as the Children’s Online Privacy Protection Act (COPPA) and Family Educational Rights and Privacy Act (FERPA), alongside state-specific regulations and institutional risk aversion, create substantial barriers. These laws were not designed to accommodate privacy scenarios within the current environment of pervasive data collection and rapidly advancing AI.
As such, trusted mechanisms for safe data access that remove barriers to critical R&D, bolster global competitiveness, and leverage innovation to cultivate a skilled STEM workforce are more important than ever. Without such mechanisms, essential R&D and educational innovation stall, and U.S. global competitiveness suffers.
Flipping the traditional research model
The landscape of educational R&D is rapidly evolving as DLPs capture increasingly rich streams of data about how students learn. These multimodal data streams provide unprecedented opportunities to accelerate insights into how learning happens, for whom, and in what contexts – as well as how these processes, in turn, affect learning outcomes, engagement, and persistence. Yet, despite this potential, access to platform-generated learning data remains highly constrained – particularly for early-career researchers with minimal institutional resources and organizations outside elite academic settings.
Current challenges to accessing DLP data include privacy risks (e.g., data leaks), opaque legal environments, institutional risk aversion, and the lack of trusted third-party intermediaries to balance privacy with data utility. As a result, promising research is delayed and the research-to-policy pipeline slows further – leaving decision-makers without timely evidence to address urgent needs such as learning recovery, responsible AI integration, or workforce readiness.
Privacy-preserving models offer transformative opportunities to address these barriers. Across sectors, the field is converging on trusted research environments – secure enclaves that keep data in situ and move analysis to the data. SafeInsights, the U.S. Census Bureau’s Federal Statistical Research Data Centers (FSRDCs), and the North Carolina Education Research Data Center (NCERDC) are examples of such systems, complemented by privacy-preserving methods.
Privacy-preserving research models, such as SafeInsights, flip the traditional research model: instead of giving data to researchers, they bring researchers’ questions and analyses, encoded as software, to the data. At no point in the research process does the researcher have direct access to raw data, minimizing the risk of data leaks.
Researchers instead craft their analyses using sample or synthetic data. Once the analysis code is submitted to the data owner, experts review it for approval before it runs. This model minimizes risk, reduces delays in the research-to-policy pipeline, and unlocks data that would otherwise remain inaccessible.
Think of it as a secure research zone: a trusted third-party intermediary where researchers can run analyses using specific tools and applications, but cannot access data directly, ensuring strict security.
Rather than extracting and sharing sensitive data with researchers, privacy-preserving research models bring researchers’ analytic tools to secure data enclaves – preserving privacy while enabling rigorous, scalable inquiry into DLP data. Through secure enclaves, transparent governance, and standardized compliance frameworks, a durable, large-scale research infrastructure can be created.
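In Python-flavored pseudocode, the flipped workflow described above can be sketched as follows. All function names, the data schema, and the disclosure threshold are illustrative assumptions, not SafeInsights’ actual API:

```python
# Minimal sketch of the "code-to-data" pattern (hypothetical API).
# The researcher writes an analysis against a documented schema, tests it on
# synthetic data, and only aggregate results ever leave the enclave.

import statistics

def analysis(records):
    """Researcher-authored code: returns only aggregate statistics."""
    times = [r["response_time"] for r in records]
    return {"n": len(times), "mean_response_time": statistics.mean(times)}

def enclave_run(analysis_fn, protected_data, min_cell_size=10):
    """Enclave-side harness: runs reviewed code in place and applies a
    simple disclosure check before releasing any output."""
    result = analysis_fn(protected_data)
    if result["n"] < min_cell_size:
        raise ValueError("Output suppressed: cell size below disclosure threshold")
    return result  # aggregates only; raw records never leave the enclave

# Researchers develop against synthetic data shaped like the real schema.
synthetic = [{"response_time": t} for t in (4.2, 5.1, 3.8, 6.0, 4.9,
                                            5.5, 4.1, 3.9, 5.2, 4.7)]
result = enclave_run(analysis, synthetic)
```

The key property is that the researcher’s code travels to the data, is reviewed before execution, and can only return screened aggregates.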
Benefits of privacy-preserving research models
- Accelerate time to insight for policy and decision-makers who need rapid, evidence-based guidance. Standardized governance reduces delays arising from fragmented compliance and legal processes. For federal, state, and local level policy and decision-makers, this means actionable insights can be delivered in months rather than years, potentially informing legislative decisions and programs with greater speed.
- Safely join data across platforms, enabling richer analyses of student learning. Shared infrastructure maximizes the return on critical research infrastructure investments and spreads costs across funders. Secure, trusted, interoperable research environments protect privacy while enabling cumulative evidence. This aligns with federal agency priorities to modernize research infrastructure and ensure taxpayer investments translate into impact.
- Democratize access and participation in complex research by lowering barriers for early-career researchers with minimal institutional resources and organizations outside elite academic settings. Lowering barriers to entry broadens the reach of federal R&D investments and supports state leaders and research organizations seeking to participate in research.
By securing cross-sector investment to embed scalable privacy-preserving models into R&D ecosystems and infrastructures, we can expand access to high-value data while supporting long-term research scalability, security, and trust.
Such models can fill a critical gap in the R&D ecosystem by establishing secure, sustainable research infrastructure that extends well beyond initial NSF funding and is ideally suited to broker access between DLP developers, school districts, and researchers.
Plan of Action
Promote R&D Infrastructure Development and Sustainability
Privacy-preserving research models have the potential to offer researchers safer, faster, and more reliable analyses of high-value, de-identified data – while simultaneously saving DLPs and school districts time and resources on compliance reviews and privacy audits. They also create opportunities for funders to support a sustainable research infrastructure that multiplies the impact of each dollar invested.
To move from promise to practice, interested stakeholders, including research institutions, school districts, and funders, should consider the following actions:
Recommendation 1. Lay the Foundation for Sustainable Large-Scale R&D Infrastructure
- Conduct policy landscape scans, including review of state student privacy laws, to identify commonalities, constraints, and pathways for district participation.
- Interview stakeholders, including district data leads, state education agencies, and platform providers, to understand pain points and demand for trusted intermediaries.
- Review existing research infrastructures and operational frameworks, including research data hub governance, fee structures, data-sharing agreements, and IRB support services, adapting effective practices to the privacy-preserving context.
Recommendation 2. Embed Infrastructure Costs into Research Contracts and Budgets
- Require researchers to include service fees for privacy-preserving infrastructure directly in grant applications, with templates to simplify proposal preparation.
- Embed privacy-preserving infrastructure costs in contracting and budgeting to support scalability, drive down the marginal cost of data access across the field, and make rigorous educational research more accessible and sustainable beyond single grants.
Recommendation 3. Catalyze Scaling through Foundation and Philanthropic Support
Recommendation 4. Develop Large Scale R&D Infrastructure across Sectors
- Extend privacy-preserving models across sectors, such as education, health, workforce, housing, and finance, to capture increasingly rich streams of data about how people live, learn, work, and access services.
- Enable secure, interoperable, cross-sector research on questions such as how early education experiences impact long-term workforce outcomes or how neighborhood-level educational access connects to public health disparities.
- Align with federal agency efforts, such as the Federal Data Strategy, to support the linking of data ecosystems across sectors.
Conclusion
Privacy-preserving research models offer standardized, secure, and privacy-conscious ways to analyze data – helping researchers at the local, state, and federal levels understand long-term educational trends, policy impacts, and demographic disparities with unprecedented clarity.
By accelerating time-to-insight, investing in critical R&D infrastructure, and expanding participation in complex research, privacy-preserving research models offer possibilities for delivering on urgent policy priorities – building towards a modern, responsive, trustworthy education R&D ecosystem.
What kinds of research topics can be explored using privacy-preserving research models?
Privacy-preserving research models could connect researchers with DLP data representing different learning contexts. DLP data is often rich and versatile, enabling the exploration of multiple research topics, including:
- Learning Behaviors: Analyze patterns of engagement, tool usage (e.g., text-to-speech, digital pencil), or response time.
- Personalized Learning: Investigate how adaptive experiences influence outcomes.
- Achievement Gaps: Study differences across subgroups (e.g., students with disabilities, English Language Learners).
- Intervention Effectiveness: Test how interventions or instructional strategies impact student performance.
- Learning Trajectories: Examine longitudinal progress and identify barriers to success.
What kinds of data could be made available through privacy-preserving research models?
Privacy-preserving research models could facilitate connections among various types of educational data from DLP developers, each representing different aspects of K16+ teaching and learning, including administrative records, learning management systems, and curricular resource usage data.
Examples of DLP data categories include digital curricula, university data systems, and student information systems for K-12 institutions.
What are some examples of privacy-preserving research models utilizing secure enclaves across different sectors?
Across sectors, the field is converging on privacy-preserving research models that utilize secure enclaves to keep data in situ and move analysis to the data. Examples include:
- Federal statistical system: the FSRDC network provides secure facilities (now including some remote access) where qualified researchers run analyses on restricted microdata under rigorous review.
- Cross-agency administrative data: the Coleridge Initiative’s Administrative Data Research Facility (ADRF) is a FedRAMP-certified, cloud-based platform that supports inter-state and inter-agency linkages under shared governance.
- State education data enclaves: NCERDC at Duke University and the Texas Education Research Center (ERC) support secure access to longitudinal education/workforce data with well-defined agreements and masking rules.
- Health: OpenSAFELY operationalizes a strict “code-to-data” model – researchers develop code against dummy data, submit jobs to run on in-place EHR data, and only aggregate outputs leave the enclave. NIH’s N3C and the All of Us Researcher Workbench similarly provide secure, cloud-based research environments where individual-level data never leave the enclave.
These approaches are complemented by privacy-preserving release methods (e.g., differential privacy), used by the U.S. Census Bureau and supported by open-source toolkits like OpenDP/SmartNoise.
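As a toy illustration of these release methods, the Laplace mechanism adds calibrated noise to an aggregate before release, so no single student’s record meaningfully changes the output. This sketch shows only the core idea; real deployments should use vetted toolkits such as OpenDP rather than hand-rolled noise:

```python
# Toy differentially private count via the Laplace mechanism.
# Illustration only; production systems should use vetted libraries (e.g., OpenDP).

import math
import random

def laplace_noise(scale):
    """Sample Laplace(0, scale) noise via the inverse-CDF method."""
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(true_count, epsilon=1.0, sensitivity=1.0):
    """Release a count with noise scaled to sensitivity/epsilon: smaller
    epsilon means stronger privacy and noisier answers."""
    return true_count + laplace_noise(sensitivity / epsilon)

random.seed(0)  # deterministic for illustration
noisy = dp_count(1000, epsilon=1.0)
```

The privacy guarantee comes from calibrating the noise scale to the query’s sensitivity (here, 1, since adding or removing one student changes a count by at most 1) divided by the privacy budget epsilon.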
How might privacy-preserving research models support research and researchers?
At the center of privacy-preserving research models is a privacy-by-design approach that enables secure research with protected information – while alleviating technical, logistical, and collaborative challenges for researchers.
Technical
Privacy-preserving research models can offer technical components that support large-scale digital learning research such as:
- Analysis options, which enable large-scale analysis of single platform data
- Intervention options, which enable researchers —under appropriate agreements—to introduce different kinds of interactive activities (including surveys, assessments, and learning activities) within a partner platform’s student experience
- Enclave fusion, which in some designs can enable researchers to leverage multi-platform data
Logistical
- Shared data-sharing agreement templates
- Streamlined IRB and data-sharing processes
- Consent management across different populations
- Regulatory compliance with the changing data protection landscape
Community and Collaboration
- Surface researchers and the research they are conducting
- Bridge connections among platforms, researchers, and educational institutions to support meaningful research to inform practice
- Connect researchers at different levels of their careers and different domains to support mentorship and collaboration
Case Study: Turning Student Assessment into Actionable Insights
If assessment results are the scoreboard that reveals what students are learning, user data is the game film that reveals how students learn: time on task, requesting support, revising, using resources.
Using SafeInsights’ privacy-preserving tools, researchers can securely analyze real-time digital learning platform data to better understand how students engage with digital learning. Consider two students with the same score:
Student A works steadily, using hints to revise answers. This pattern suggests a need for additional content support, scaffolding, and practice.
Student B races through with rapid guessing and skipped items. This pattern suggests a need to adjust prompts, pacing, and support.
By distinguishing between these pathways, researchers, educators, and policymakers can target digital learning platform interventions more precisely—whether that means redesigning practice problems, adjusting instructional supports, or tailoring engagement strategies.
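The contrast between the two students above can be sketched as a simple heuristic over item-level event logs. The field names and thresholds (e.g., a 5-second rapid-guess cutoff) are illustrative assumptions, not SafeInsights’ actual analytics:

```python
# Illustrative heuristic for distinguishing the engagement patterns behind
# identical scores. Thresholds and labels are assumptions for this sketch.

def engagement_pattern(events):
    """events: list of (response_time_sec, used_hint, skipped) per item."""
    rapid = sum(1 for t, _, _ in events if t < 5)  # rapid-guess cutoff: 5s
    skips = sum(1 for _, _, s in events if s)
    hints = sum(1 for _, h, _ in events if h)
    if rapid > len(events) / 2 or skips > 0:
        return "rushing"              # like Student B: adjust prompts and pacing
    if hints > 0:
        return "productive struggle"  # like Student A: add scaffolding and practice
    return "fluent"

# Two students, same score, different processes:
student_a = [(45, True, False), (60, True, False), (50, False, False)]
student_b = [(3, False, True), (2, False, False), (4, False, True)]
```

Even this crude rule separates the two profiles that a score alone conflates, which is the kind of signal the case study describes.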
Bottom line: SafeInsights securely transforms raw data into actionable evidence, helping policymakers and practitioners invest in solutions that boost learning outcomes at scale.
Education & Workforce
day one project
Privacy-Preserving Research Models Essential for Large Scale Education R&D Infrastructure
12.02.25