To develop effective interventions that enhance human health and wellness, it is crucial to engage human participants in research to identify a target population’s needs and behaviors, as well as to test the efficacy and implementation of developed interventions []. The validity of data collected from human participants relies on their membership within the population of interest.
Unfortunately, the issue of fraudulent participants—human participants who falsely claim eligibility for a study or participate multiple times to gain financial incentives []—is a documented and growing problem, particularly in studies where all activities are conducted remotely (eg, online surveys and virtual interviews) [-]. Pozzar et al [] have highlighted the potential significance of this issue: their online survey targeting American adults with ovarian cancer received 576 responses within just 7 hours of promotion on social media, with 271 responders self-reporting as meeting the eligibility criteria. Upon investigation, the research team found that 100% of these responses were either fraudulent (94.5%) or suspicious (5.5%).
Including data from fraudulent participants can have significant negative impacts on research, particularly when such responses are unknowingly included in analyses and presented as valid. As described in the study by Chandler et al [], participant fraud will make it harder to accurately estimate treatment effects in experimental studies, having potential to significantly impact observed relationships within and across groups. By definition, fraudulent responses also do not reflect the values and viewpoints of the population of interest—at best, they reflect external beliefs about this group, which are then erroneously attributed to the target population []. These effects become even more pronounced when studies target rare or underrepresented participant groups. Finally, fraudulent responses may also include significant amounts of noise, for example, when surveys are completed by simply filling random responses []. In such cases, true experimental effects may be masked, and analyses may be erroneously reported as statistically nonsignificant [,].
If an intervention is developed based on the expressed needs or desires of fraudulent participants, it is less likely to be acceptable, useful, or effective for the population for which it was originally intended. Other negative outcomes include the waste of research funding and researcher time [,]. Simply being aware of the risk of fraud in a study impacts researcher time because it necessitates the review and verification of all study responses and participants.
As more studies leverage virtual methodologies [], this problem is only expected to grow, and it is reasonable for researchers to be cautious about fraud when using methodologies that require minimal verification of participants’ identities. It is essential that fraudulent participants are identified and excluded from studies.
There are various types of fraudulent participants in research, and strategies to identify and exclude them differ by type. One type is automated software agents, commonly called “bots,” which imitate human behavior to perform specific tasks independently []. Bots are more prevalent in large-scale survey studies that involve many participants and few, if any, direct interactions between researchers and study participants. Bots have been a concern in internet security for many years, and several prevention measures have been developed. These include using a completely automated public Turing test to tell computers and humans apart (CAPTCHA) to weed out automated responses; honeypot questions (questions not visible to human respondents but seen and answered by bots); and identity verification tools, as well as reviewing survey completion time (bots finish a survey much faster than human respondents), identifying identical responses from multiple respondents, performing geolocation checks, avoiding survey promotion on public forums (eg, social media), and embedding specific validation steps [,-,,,].
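As an illustration of how some of these automated checks could be operationalized, the following minimal sketch flags survey responses that answer a honeypot item, finish implausibly quickly, or duplicate another submission. The field names and thresholds are assumptions for illustration and are not drawn from any specific survey platform or from the studies described in this paper.

```python
# Minimal sketch of automated bot screening for survey responses.
# Field names ("honeypot", "duration_seconds") and the completion-time
# threshold are illustrative assumptions, not values from these studies.
from collections import Counter

MIN_PLAUSIBLE_SECONDS = 120  # assumed floor for a human to finish the survey


def flag_bot_like_responses(responses):
    """Return a mapping of response ID -> list of bot-related flags.

    Each response is expected to be a dict with keys such as 'id',
    'honeypot', 'duration_seconds', and 'answers' (a tuple of item
    responses); adapt to your survey platform's export format.
    """
    # Identical answer patterns submitted by multiple respondents are suspect.
    answer_counts = Counter(r["answers"] for r in responses)

    flags = {}
    for r in responses:
        reasons = []
        if r["honeypot"]:  # hidden field a human respondent should never fill in
            reasons.append("honeypot answered")
        if r["duration_seconds"] < MIN_PLAUSIBLE_SECONDS:
            reasons.append("implausibly fast completion")
        if answer_counts[r["answers"]] > 1:
            reasons.append("identical to another submission")
        if reasons:
            flags[r["id"]] = reasons
    return flags


if __name__ == "__main__":
    demo = [
        {"id": "A1", "honeypot": "", "duration_seconds": 640, "answers": (3, 2, 4, 1)},
        {"id": "A2", "honeypot": "x", "duration_seconds": 35, "answers": (1, 1, 1, 1)},
        {"id": "A3", "honeypot": "", "duration_seconds": 41, "answers": (1, 1, 1, 1)},
    ]
    print(flag_bot_like_responses(demo))
```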
Another type of fraudulent participant includes individuals who falsely claim eligibility for a study and complete study procedures to gain participation incentives. While this form of fraud is less “efficient” than a bot, it can be more challenging for researchers to identify and prevent these individuals from participating. As these participants are human, tools such as CAPTCHAs and honeypot questions do not filter them out, and direct interactions with study personnel may not reveal their fraudulent status, as they can engage convincingly in real time with the research team. Moreover, widely available AI tools and large language models (eg, ChatGPT) can generate text or responses that mimic those of other individuals []; for example, someone wishing to commit fraud by posing as a member of a different demographic group (eg, pretending to be a health care professional to meet eligibility criteria) can use these tools to more convincingly persuade researchers of their legitimacy.
Smaller-scale studies addressing the issue of fraudulent participants have offered recommendations such as interacting with participants nonanonymously (eg, over the telephone), sending financial incentives via traditional mail, and reviewing participant responses and interactions for inconsistencies [,,,,].
In conducting 3 studies of our own, we encountered fraudulent participants and found not only that they were difficult to recognize but also that guidance in the literature on identification strategies was limited. Our objective is to address this gap in the fraud identification literature by describing “red” and “yellow” flags in participant behavior and interaction that research teams can watch for when screening participants for inclusion in research studies.
Case Studies
Overview
In this section, we present 3 research studies in which participant fraud was identified. Each case study outlines how the research design was vulnerable to participant fraud, how the fraud was discovered, and how the team responded. Table 1 provides a summary of the study designs. All 3 studies were conducted entirely online.
Table 1. Summary of the case studies in which participant fraud was identified.

| Study | Population of interest | Sample size, n | Study activities | Compensation | Recruitment sources |
| --- | --- | --- | --- | --- | --- |
| ADRDa Systematic Hospital Inclusion Family Toolkit (A-SHIFT) | Hospital clinicians and primary caregivers of individuals living with ADRD | 27 clinicians and 15 caregivers | Virtual interview and 2 surveys | US $100 total in the form of electronic gift cards | Postings to caregiver and clinician email registries, ADRD community organization promotion (websites and social media posts), and physical flyers posted in community and hospital locations |
| Resource connection project | Rural primary caregivers of individuals living with ADRD | 15 | Web-based intervention pilot test, 4 surveys, and virtual interview | US $100 total in the form of electronic gift cards | Postings to caregiver email registries and ADRD community organization promotion (websites and social media posts) |
| Legal and financial planner | Primary caregivers of individuals living with ADRD | 99 | Web-based intervention pilot test, 3 surveys, and virtual interview | US $150 total in the form of electronic gift cards | Postings to caregiver email registries and ADRD community organization promotion (websites and social media posts) |

aADRD: Alzheimer disease and related dementias.
Case Study 1: Alzheimer Disease and Related Dementias Systematic Hospital Inclusion Family Toolkit
Overview
The objective of the Alzheimer disease and related dementias (ADRD) Systematic Hospital Inclusion Family Toolkit study was to develop a toolkit to support the inclusion of dementia caregivers in hospital care []. In the study, we conducted virtual interviews with dementia caregivers, persons living with dementia, and hospital clinicians. After the interviews, we asked participants to complete online surveys ranking the significance and modifiability of factors influencing dementia caregiver inclusion as identified from the interviews.
Recruitment
Recruitment materials contained information about study activities, eligibility criteria, compensation, and research team contact information. Recruitment methods included distributing materials through email lists of professional organizations and societies, hospital systems, and ADRD caregiver research registries; displaying study flyers in physical locations; posting on public social media; and sharing recruitment materials with clinicians and community organizations such as older adult centers, local chapters of the Alzheimer’s Association, and Alzheimer’s Disease Research Centers.
Screening and Enrollment
Interested individuals were invited to complete a telephone screening or, on request, a screening via a Zoom (Zoom Video Communications, Inc) audio call.
Caregivers were eligible to participate if they were (1) a current or former primary caregiver to a person living with dementia who had experienced at least 1 hospitalization, (2) aged ≥18 years, and (3) able to speak English. Persons living with dementia were eligible to participate if they were (1) aged ≥65 years and (2) able to speak English. Clinicians were eligible to participate if they (1) had at least 5 years of experience working in a hospital setting and (2) were able to speak English. After screening, eligible individuals reviewed the study information sheet over the telephone (or by Zoom, if requested) with a member of the research team and provided verbal consent.
Study Activities
Study activities included completing a virtual 1-hour interview and 2 online surveys.
Ethical Considerations
Participants received US $100 in the form of electronic gift cards for completing all study activities (US $50 for the interview and US $25 for each survey). This study was approved by the Institutional Review Board (IRB) at the University of Wisconsin-Madison (approval no. 2022-0024).
Cases of Fraudulent Participants
No fraud was suspected or identified among caregiver participants. However, in the clinician group, among those screened, deemed eligible, and enrolled, 3 participants were found to be fraudulent. Fraud 1 stated that they learned about the study on social media. They completed a telephone screening and claimed to be a neurologist with 20 years of experience. Fraud 2 initially contacted the research team via telephone and claimed during the telephone screening to be a junior administrator at a hospital. Fraud 3 requested to complete screening via Zoom and during this session claimed to be a psychiatrist at a teaching hospital with 5 years of experience. All 3 individuals had their cameras switched off during their virtual interviews, making it difficult for the interviewers to verify their identity. Suspicion of fraud arose after these participants completed the first online survey and provided demographic data that conflicted with the information shared in baseline surveys and during interviews. On reviewing the interview transcripts, the lead investigators confirmed that these individuals were not clinicians: their responses were either too vague or included claims about roles and organizations that were not consistent with how care or clinical roles are structured in the health care system. Consequently, all 3 individuals were withdrawn from the study, with the research team citing discrepancies in responses and a subsequent threat to data quality and credibility as the reason for withdrawal.
Case Study 2: CareVirtue Resource Connection Pilot Study
Overview
The CareVirtue resource connection project aimed to develop a web-based intervention to support rural caregivers of persons with ADRD. After co-designing the intervention with caregivers and representatives of community organizations serving rural ADRD caregivers, the study sought to pilot-test a prototype to obtain user feedback.
Recruitment
The two primary recruitment methods were (1) sharing recruitment materials with community organizations that serve ADRD caregivers and (2) distributing recruitment materials via email to members of a caregiver registry. Recruitment materials included contact information for a member of the research team, and interested care partners were instructed to self-identify by contacting that team member. At least 1 public social media post was published on Facebook by a community organization.
Screening and Enrollment
Interested individuals were invited to schedule a telephone screening with a member of the research team. To meet inclusion criteria for the study, individuals had to (1) self-identify as a primary caregiver for someone living with ADRD, (2) self-identify as living in a rural area, (3) be aged ≥18 years, and (4) have access to the internet. Individuals who met the eligibility criteria reviewed the study information sheet over the telephone immediately after screening and gave verbal consent to participate.
Study Activities
All study activities were conducted virtually. Participants completed an onboarding meeting over Zoom or over the telephone, at which they were given access to the study intervention and completed the registration process and a needs assessment. Next, they were instructed to engage with the intervention for 30 days. During this time, they received recommendations for local and online resources once a week, and after day 7 of the study, they were asked to complete an online feedback survey once per week. At the end of the 30-day use period, participants completed a postuse interview over Zoom or over the telephone about their experiences and opinions of using the intervention.
Ethical Considerations
Participants received a US $25 electronic gift card for completing the first visit. They also received a US $75 electronic gift card for completing the 30 days of the study and the postuse interview. This study was approved by the Indiana University IRB (IRB #20189).
Cases of Fraudulent Participants
Of the emails we received, 75 were from individuals whom we determined to be fraudulent. Emails were delivered en masse, and the time of delivery coincided with study recruitment materials being shared in a public Facebook post by an organization that served dementia caregivers. In addition, these emails differed from those sent by previous (genuine) participants in that they were short, strangely worded, vague, and had significant grammatical and spelling errors. Of these 75 individuals, 40 (53%) ceased communication with the research team after being informed that they would be required to complete a telephone screening to determine their eligibility for participation. The remaining individuals (35/75, 47%) were flagged as fraudulent based on several factors, including the timing of their initial emails to the research team, the type and format of their email addresses, the content of their emails, and their inability to provide a landline or mobile phone number.
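The triage described above was performed manually by the research team. As a hedged illustration only, the sketch below shows how some of the same signals (burst arrivals coinciding with a public post, name-plus-digits Gmail addresses, and very short, low-detail bodies) could be screened programmatically. The regex, word-count cutoff, and burst window are assumptions, not rules used in these studies.

```python
# Illustrative triage of interest emails for the patterns described above.
# The address regex, the 15-word cutoff, and the 30-minute burst window are
# assumptions for illustration, not criteria used by the study team.
import re
from datetime import datetime, timedelta

NAME_DIGIT_GMAIL = re.compile(r"^[a-z]+\d+@gmail\.com$", re.IGNORECASE)
BURST_WINDOW = timedelta(minutes=30)  # assumed window for "delivered en masse"


def triage_interest_email(email, all_timestamps):
    """Return a list of flags for one interest email.

    `email` is a dict with 'address', 'body', and 'received_at' (datetime);
    `all_timestamps` lists the arrival times of every interest email received.
    """
    flags = []
    if NAME_DIGIT_GMAIL.match(email["address"]):
        flags.append("yellow: name-plus-digits Gmail address")
    if len(email["body"].split()) < 15:
        flags.append("red: very short, low-detail body")
    near = [t for t in all_timestamps
            if abs(t - email["received_at"]) <= BURST_WINDOW]
    if len(near) >= 5:  # arrived as part of a burst of similar emails
        flags.append("red: arrived in a burst of interest emails")
    return flags


if __name__ == "__main__":
    now = datetime(2024, 3, 11, 9, 0)
    times = [now + timedelta(minutes=i) for i in range(6)]
    msg = {"address": "johnsmith1@gmail.com",  # illustrative address from Table 2
           "body": "I am interested to partake in the survey",
           "received_at": now}
    print(triage_interest_email(msg, times))
```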
Case Study 3: CareVirtue Legal and Financial Planner Pilot Study
Overview
The purpose of this pilot study was to confirm the feasibility, usability, and acceptability of a legal planning and financial management training tool for ADRD caregivers.
Recruitment
Study information was shared through several channels, including online caregiver registries (eg, Research Inclusion Supports Equity, funded by National Institute on Aging grant R24AG066599) and community organizations (eg, newsletters as well as public postings on websites and Facebook groups).
Screening and Enrollment
Participants contacted the research team by email. Researchers then scheduled a screening call via telephone or Zoom and confirmed that participants (1) were a caregiver of a person with ADRD, (2) were aged ≥18 years, (3) had access to the internet, and (4) had access to a desktop or laptop computer. Individuals who met the eligibility criteria reviewed the study information sheet over the telephone immediately after screening and signed an electronic form to document their consent to participate.
Study Activities
Participants completed a 30-minute virtual onboarding meeting at the beginning of the study. At this meeting, they completed a baseline survey that included questions about demographic information and financial and legal planning topics. A researcher then guided them through the process of creating a planner account and gave them a tour of the planner website. After a week of using the planner, participants met with a researcher over Zoom for a 15-minute check-in to make sure that they were not experiencing any technical issues. On day 45 of the study, participants received an invitation to complete a midpoint questionnaire about their experiences using the planner. The 3-month study period ended with a semistructured postuse interview over Zoom in which participants were asked about their experiences and opinions of using the planner. Finally, they were asked to complete a postuse survey immediately after the interview.
Ethical Considerations
Participants were paid US $40 for completing onboarding. They also received US $20 for completing the midpoint questionnaire. In addition, they were compensated US $50 for completing the postuse interview and survey and received a US $40 bonus if they completed all study activities (maximum possible compensation=US $150). This study was approved by the Indiana University IRB (IRB #16242).
Cases of Fraudulent Participants
A total of 318 potential participants initially expressed interest in the study, of whom 216 (67.9%) were not enrolled (n=63, 29.2% due to suspected fraud; n=153, 70.8% for reasons such as ineligibility, lack of interest in the study, or lack of communication with the study team), while 102 (32.1%) were deemed eligible and enrolled. However, of these 102 participants, 3 (2.9%) were later withdrawn after they were suspected of being fraudulent. After cases of fraud in the previous 2 studies were brought to the research team’s attention, additional detection and verification procedures were put in place. Several individuals requested to complete screening via Zoom, which our team did not allow because fraudulent participants had made the same request in previous studies. Some suspected fraudulent individuals did reply to the study team when told that they must be screened over the telephone, but the majority ceased contact once informed of this requirement. Those who did agree to be screened over the telephone provided numbers that were determined to be voice over IP (VoIP) numbers, which can display a US area code regardless of the user’s actual location. Three individuals were consented before the enhanced screening procedures were implemented and were later identified as fraudulent; they were withdrawn before study onboarding. One was identified due to their inability to access the intervention (their browser time zone was not set within the United States), while the others were not able to provide a valid telephone number to continue participation.
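Two of the verification steps mentioned above, checking whether a provided phone number is a VoIP line and checking a browser-reported time zone, can be partly automated. The following is a minimal sketch assuming the open-source phonenumbers package (a Python port of Google’s libphonenumber); its metadata often cannot distinguish VoIP from other US line types, so a commercial line-type lookup service may still be needed in practice. The US time zone allow-list is an assumption for illustration.

```python
# Sketch of two verification checks: does a phone number parse as VoIP, and
# is a browser-reported IANA time zone a common US zone? The `phonenumbers`
# package is assumed; for many US numbers it cannot conclusively identify
# VoIP lines, so treat this as a first-pass check only.
import phonenumbers
from phonenumbers import PhoneNumberType

US_TIMEZONES = {
    "America/New_York", "America/Chicago", "America/Denver",
    "America/Phoenix", "America/Los_Angeles", "America/Anchorage",
    "Pacific/Honolulu",
}  # assumed allow-list; extend for US territories as needed


def looks_like_voip(raw_number: str, region: str = "US") -> bool:
    """Return True if the number is invalid, unparseable, or typed as VoIP."""
    try:
        parsed = phonenumbers.parse(raw_number, region)
    except phonenumbers.NumberParseException:
        return True  # unparseable numbers also warrant follow-up
    if not phonenumbers.is_valid_number(parsed):
        return True
    return phonenumbers.number_type(parsed) == PhoneNumberType.VOIP


def timezone_in_us(browser_timezone: str) -> bool:
    """Check a browser-reported IANA time zone against the US allow-list."""
    return browser_timezone in US_TIMEZONES


if __name__ == "__main__":
    print(looks_like_voip("+1 555 010 4477"))   # invalid area code -> True
    print(timezone_in_us("Africa/Lagos"))       # False -> follow up with participant
```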
Red and Yellow Flags for Suspected Fraud
Throughout these 3 studies, we observed patterns of behavior that frequently preceded the identification of a fraudulent participant, some of which raised more suspicion than others. Table 2 summarizes our team’s system for identifying these behavioral patterns and categorizes them as either “red flags” (behavioral patterns that should raise the research team’s suspicions and prompt further investigation of the potential participant) or “yellow flags” (patterns common among frauds but also present in some genuine participants). Where relevant, we also provide verbatim examples of interactions with fraudulent participants and compare them with communications sent by nonsuspicious participants.
Table 2. Red and yellow “flags” to identify fraudulent participants.

Red flags

- Interest email
  - Short; not signed with the person’s name; no salutation; significant grammatical errors; no subject line, or the subject line is a strange study name; email text identical to text sent by another interested person (likely also fraudulent); generic and does not provide detail
  - Examples (fraudulent): No subject; body: “I am interested to partake in the he survey” / Subject: “Interested I take caregiver for mother with dementia”; no body
  - Example (real): Subject: “Dementia Caregiver Study”; body: “Hello, I received the email from the FTDa Disorders Registry about your study. My husband has bvFTDb and was diagnosed in 2019. I’d like to know if I may participate, and of course, first know more about it. I did read the summary of the criteria and think I meet it. Thank you, [NAME]”
- Compensation
  - Overly concerned with study compensation; sends multiple emails in a short period about getting compensated; may become aggressive about compensation and demand it (eg, using guilt-inducing language and capitalizing the subject line)
  - Example (fraudulent): Body: “After completing this study you promised to provide and we will receive a compensation but till date I haven’t seen anything, receiving and email that I have been removed from the study but haven’t been compensate So all this thing aren’t Good I haven’t received my promised compensation and you removed me from the study because of am the last or what please explain And I completed this study long ago that is last year December which is unfair Take it to be like you how will you feel about this? Thank you for your response”
  - Example (real): Body (sent 8 days after completing a survey): “Hi [researcher], I completed this survey on 3/11. I wanted to be sure this was reflected in your records. Thank you! [name]”
- Telephone number
  - Person refuses to provide a telephone number; insists on communicating through videoconferencing rather than over the telephone; provided telephone numbers are VoIPc numbers (vs landline or mobile phone numbers)
  - Examples: —
- Address and zip code confusion
  - Zip code provided does not match the address given; takes the person a long time to provide their zip code when asked for it (as if they have to look it up); address does not exist or is for a nonresidential building (eg, a church)
  - Examples: —
- Diagnosis and condition details
  - Provides incorrect details or cannot provide specific details on the condition of interest to the study when asked, or it takes a long time for them to answer (eg, care recipient dementia type)
  - Example (fraudulent): Interviewer: “In the inpatient hospital setting, what would your role be in the admission process?” Respondent: “Yeah. In the admission process, I’m a neurologist, yeah, and also I’m an administrator as well, yeah, regarding my field of specialty because we’re not really much here, so I have a lot of function in there, so.”
  - Example (real): Respondent: “I am a caregiver to my spouse, who was diagnosed with frontotemporal dementia with PPAe (logopenic variant) two years ago. He has been living with symptoms for at least six years. I am very interested in participating in this particular program/study.”
- ID check
  - When asked to provide ID on camera, the person refuses or cannot comply
  - Examples: —

Yellow flags

- Email address
  - Uses a Gmail account with the following structure: Firstnamelastname#@gmail.com; email address differs from the name provided in the study
  - Examples (illustrative examples created by the authors; not used by any actual participants, genuine or fraudulent): Johnsmith1@gmail.com; Doejane46@gmail.com
- “How did you hear about the study?”
  - Answer is vague (eg, Facebook, newsletter, or a friend); the friend who referred them to the study is not enrolled in the study or is also a suspected fraud
  - Examples (fraudulent): “I heard about it on social media” / “A colleague referred me to study”
  - Example (real): “Found your study on the AFTDf website. I am interested in your study. My husband was diagnosed with BVFTD in April 2021. I am his full time caregiver, he lives at home.”
- Delay in answering questions
  - For example, date of birth, telephone number, and address; uses a large number of filler words before giving an answer
  - Examples: —
- Videoconferencing behavior
  - Does not turn on their camera; provides short answers to questions and does not elaborate
  - Examples: —
- Address
  - Major metropolitan locations (eg, Los Angeles, Miami, or New York City); generic addresses (eg, [##] W [#]th St; [###] Main St); incomplete addresses provided; when reading their address aloud, they pronounce it strangely
  - Examples: —

fAFTD: Association for Frontotemporal Degeneration.
We determined that it was necessary to separate behaviors into 2 categories: behaviors that strongly indicated fraud (eg, inability to provide a valid telephone number or providing an incorrect zip code); and behaviors that, while nearly ubiquitous among fraudulent participants, were also exhibited by genuine participants (eg, having a Gmail account or keeping the camera switched off during videoconferencing). We thought that it was important to include the “yellow flags” for additional context for researchers. While the presence of a single flag, particularly a yellow one, does not necessarily mean that a participant is inauthentic (for instance, many legitimate participants use Gmail accounts, live in major metropolitan areas, or exhibit disengaged behaviors), recognizing these behavior patterns can alert the research team to potential fraud and prompt further investigation.
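Research teams that track these flags systematically may find it helpful to record them per potential participant and apply a consistent triage rule. The sketch below is one possible, hedged operationalization of the framework in Table 2: any red flag, or an accumulation of yellow flags, prompts additional verification before enrollment. The specific rule and threshold are assumptions for illustration; as noted above, no single flag should by itself exclude a potential participant.

```python
# Minimal sketch of operationalizing the red/yellow flag framework from
# Table 2. The triage rule (any red flag, or two or more yellow flags,
# prompts further verification before enrollment) is an assumption for
# illustration, not a rule prescribed by the authors.
from dataclasses import dataclass, field
from typing import List


@dataclass
class FlagRecord:
    participant_id: str
    red: List[str] = field(default_factory=list)
    yellow: List[str] = field(default_factory=list)

    def needs_review(self, yellow_threshold: int = 2) -> bool:
        """Any red flag, or several yellow flags, prompts further verification."""
        return bool(self.red) or len(self.yellow) >= yellow_threshold


if __name__ == "__main__":
    record = FlagRecord(
        participant_id="P-014",
        red=["zip code does not match stated address"],
        yellow=["camera off during videoconference"],
    )
    print(record.needs_review())  # True -> schedule a follow-up verification call
```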
Strategies to Prevent Fraudulent Participation
Participant fraud is a risk that must be considered and accounted for in the design and conduct of research studies. The simplest way to reduce this risk may be to conduct study activities in person, thereby requiring face-to-face interaction with members of the research team. However, virtual research activities offer significant benefits to both researchers and participants. Remote research facilitates the recruitment of participants who might otherwise be unable to participate by overcoming geographic constraints, minimizing the burden of participation (eg, no need to travel), and enabling individuals who are homebound to take part [-]. In the earlier phase of the study, presented as the second case study in this paper, remote study operations enabled the recruitment of rural ADRD caregivers from throughout the United States (none of whom were in the researchers’ home state), which would not have been possible with traditional in-person methods []. Decentralization can also lead to cost savings for the researcher [].
To continue leveraging the opportunities created by virtual research, we as researchers must recognize the elements of study design that increase vulnerability to fraud; implement prevention strategies; and continuously monitor for fraud during recruitment, enrollment, and data collection. The strategies presented in Table 3 stem from the authors’ experiences with participant fraud across the 3 previously described studies and are reinforced by other fraud prevention publications.
Table 3. Suggested strategies according to the study phases to prevent and identify fraudulent participants.

Recruitment
- Use targeted recruitment approaches such as emailing private registries, engaging a liaison with access to the population of interest, posting in newsletters circulated to a limited audience, or contacting potential participants directly
- When possible, post physical flyers where potential participants congregate
- Avoid public websites or social media (eg, Facebook) posts
- Avoid specifying compensation amounts in public postings
- Ask participants where they heard about the study; press for details if they provide vague responses

Screening and enrollment
- Verify information provided by participants: telephone number (use a phone number validation tool to determine whether the participant is using a voice over IP number); mailing address (check online to determine whether the address is valid); personal data (have participants reconfirm personal data, eg, date of birth, and look for discrepancies)
- Perform screening and enrollment over the telephone instead of using videoconferencing software
- Require an ID check from the participant through the webcam (no identifiable information from the ID check needs to be recorded; simply verify the individual’s name) if videoconferencing is necessary
- Order questions to “catch” frauds (eg, ask where they live, then ask for the zip code, and note whether they have difficulty answering or whether the zip code matches the address)
- Ask for details about the condition of interest when applicable and note whether they have difficulties answering, for example, the type of dementia or specific duties of a clinician in a clinical role

Data collection
- Implement location restrictions, if possible, when testing a digital intervention (eg, through IP address and browser time zone)
- Note whether the participant’s camera is switched off during study interactions over videoconferencing
- Note erroneous statements that a legitimate participant would not make (eg, a clinician expecting their patient with dementia to “get better”)
- Ask participants to resubmit demographic data in multiple surveys and check for discrepancies
Discussion
Summary
On the basis of 3 case studies, this paper presents a “red” and “yellow” flag framework for identifying individuals falsely claiming to be eligible for a study. We noticed suspicious behaviors among suspected frauds that have also been reported in other papers, such as keeping cameras switched off during videoconferencing [,,], not providing a valid US telephone number [,,], using a Gmail account or an email address with a specific structure [,], sending short interest emails lacking contextual information [], providing nonresidential addresses as home addresses [], being unable to provide specific details on the condition of interest or providing details that do not align with the stated condition [,], giving vague responses when asked how they were referred to the study [], and pausing or delaying unusually when answering questions []. However, we believe that some of the behaviors identified in these 3 case studies have not yet been documented in the literature, such as the prevalence of VoIP numbers, strange pronunciations of addresses, and the prevalence of addresses in large metropolitan areas (eg, New York City). Our study also categorized these behaviors by severity, indicating whether a behavior strongly suggested that a participant was fraudulent (a “red flag”) or was exhibited by genuine participants as well but was particularly common among frauds (a “yellow flag”); this categorization is novel and expands on previous work.
Similarly, we provide a series of recommendations to prevent fraudulent participation in online qualitative research studies such as our 3 case studies. Some of these—such as implementing location restrictions using IP addresses [,,,], limiting compensation details in public postings [,], avoiding public study advertising (eg, on social media) [], and verifying participant identity (eg, demographic, ID, or address checks) []—have already been documented in the literature while others (eg, ordering questions to “catch” frauds) have not been as prominently discussed.
Fraudulent research participants create multiple issues beyond compromising data quality and rigor. Research staff in studies with smaller samples now spend extra time and resources investigating every potential participant, a task that may be infeasible in studies with large samples. If a large number of frauds are enrolled, recruitment may need to be paused or delayed to allow research staff to identify and exclude them. Incentives for participation may be awarded to frauds instead of genuine participants, detracting from fixed project budgets allocated for participant compensation and wasting resources often funded by the government and taxpayers.
Perhaps the greatest disservice caused by frauds is to the genuine individuals who volunteer their time, accept the risks of participating in a research study, and provide data that enable scientific advancement. Additional steps to verify an individual’s identity increase the burden on genuine research participants. Many signs of participant fraud are “yellow flags” with multiple interpretations, creating the risk that genuine individuals will be turned away or withdrawn from research for appearing too similar to frauds; for instance, a VoIP number may be used by a fraudulent individual or by a caregiver because it is less expensive than a mobile phone contract. Likewise, free email accounts such as Gmail are widely used and convenient, making them accessible to both fraudsters and genuine participants (including those who may not have paid or institution-based email accounts). We want to clarify that we do not recommend excluding someone from study participation based on a single “flag” but rather advocate for considering all contextual information to make the best decision. The need for research staff to make a judgment call on including a participant raises ethical concerns regarding biases. For this reason, we suggest incorporating fraud prevention measures (eg, requiring study screening to be conducted over the telephone, not over VoIP numbers) into the study’s standard operating procedures to ensure that all study participants are held to the same standard. While this may inadvertently exclude some authentic study participants, this is a risk the study team must evaluate against the need to recruit a legitimate study sample.
Conversely, there are also ethical issues associated with failing to take appropriate steps to remove fraudulent participants from research samples. Individuals who assert membership in the study population may misrepresent the experiences, perspectives, and opinions of actual members of this group; for instance, a fraudulent participant in the ADRD Systematic Hospital Inclusion Family Toolkit study (case study 1) claimed that their hospital had an abundance of resources for family caregivers. If undetected, they might have created the false impression that clinicians possess the resources needed to adequately involve caregivers in hospital care. In our experience, many of these fraudulent participants identify as members of sociodemographic groups that are underrepresented in research (eg, Black or African American), which could lead to misleading assumptions about the needs, beliefs, or characteristics of these groups and further undermine their trust in research.
Balancing the competing risks of being too “strict” and excluding genuine participants with being too “lax” and failing to catch fraudulent ones is a challenge without a clear solution that perfectly fits every research team, study, or participant population. Research teams must work to understand the risks associated with fraud, consider strategies for mitigation and their associated costs, and then make an appropriate determination for their specific circumstances. This may mean adjusting analyses, allocating more funding in budgets to recruiting and screening efforts, and acknowledging the risk of fraudulent participation in limitations sections in publications.
It is important to note that the strategies recommended in this paper and the conclusions drawn are based solely on the authors’ experiences with the 3 aforementioned studies. The studies had differing designs and used various recruitment strategies but all had relatively small sample sizes and involved in-depth interactions with the study team (eg, virtual interviews). We believe that our flag framework and mitigation strategies will be most useful to researchers conducting studies with similar characteristics: (1) sample sizes small enough to allow enhanced screening by the research team, (2) real-time interactions between researchers and participants to judge authenticity, or (3) data collection at multiple time points in the study to compare answers to check for discrepancies.
Studies that rely on large sample sizes (eg, hundreds to thousands of participants) and collect data exclusively through online surveys may encounter unique challenges related to participant fraud. Some strategies suggested in this paper may not be applicable to large samples; for example, online surveys may be more susceptible to bots than to fraudulent human participants, necessitating different strategies to address these issues (eg, CAPTCHAs and honeypot questions).
Finally, there is no definitive way to ascertain whether we detected all instances of fraud in the 3 aforementioned studies or whether we mistakenly identified any genuine individuals as fraudulent and erroneously declared them ineligible. Our inability to objectively determine whether a participant truly was fraudulent means that we cannot assess the effectiveness of our strategies.
Conclusions
Fraudulent participants pose an increasingly prevalent challenge in virtual research. It is essential to remove such individuals from study samples to uphold the integrity of the research and avoid harming relationships with genuine participants. In this paper, we have presented behavioral patterns and strategies to identify and prevent fraudulent research participation based on our experiences with 3 different studies. We believe that these recommendations are valuable to others conducting similar research, while acknowledging that there is no one-size-fits-all solution to addressing this issue. Research teams must balance the competing interests of eliminating frauds and ensuring that a diverse array of individuals can participate in research with minimal burden.
The authors would like to acknowledge Madeleine Moureau for her help conceptualizing this paper. The studies described in this paper were supported by the National Institute on Aging of the National Institutes of Health (grants R44AG074128, R43AG080849, and R21AG077439). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Data sharing is not applicable to this paper as no datasets were generated or analyzed during this study.
CE is the chief executive officer and cofounder of CareVirtue Technologies and developed CareVirtue. MZ is the caregiver support officer and cofounder of CareVirtue Technologies, and developed CareVirtue; is an associate for Hilarity for Charity (a 501c3 nonprofit with a mission of bringing light to Alzheimer disease); and is a member of the Alzheimer’s Impact Movement, which is the advocacy affiliate of the Alzheimer’s Association. All other authors declare no conflicts of interest.
Edited by S Brini, A Mavragani; submitted 04.Jun.2025; peer-reviewed by T Dickinson, KW Tay; comments to author 01.Sep.2025; revised version received 09.Sep.2025; accepted 11.Sep.2025; published 10.Oct.2025.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
Advancements in smartphone technology have revolutionized data collection methodologies in health research, offering new opportunities for frequent and real-time assessments of participants’ health status []. Smartphone-administered surveys have emerged as powerful tools for collecting self-reported data on various health outcomes, behaviors, and clinical experiences []. Digital surveys administered through mobile devices or web platforms enable researchers to gather information more frequently and in real time, potentially reducing recall bias and offering a more convenient and cost-effective approach to capturing participants’ health trajectories []. While digital technologies empower personal monitoring of health, promote disease self-management, and maintain patient-provider communication, older adults are often unable to fully access these important benefits [-].
A key challenge in longitudinal studies using smartphone apps is maintaining engagement over extended periods, which is crucial for ensuring data quality and study validity []. A systematic review and meta-analysis revealed high dropout rates in mobile health interventions, with an average dropout rate of 43% across various study durations []. Survey fatigue, in which participants become overwhelmed by too many surveys or repetitive survey questions, can contribute to decreased response rates and potentially compromise data quality [,]. These challenges are even more severe among older adults and individuals with chronic health conditions, who may face greater barriers to using digital health technology [-]. For instance, older adults may experience age-related declines in cognitive processing speed and vision, making it more difficult to complete app-based surveys consistently [,]. Additionally, those with chronic illnesses often deal with symptoms and treatment burden, which can further reduce their capacity or motivation to engage with study tasks [,].
Engagement in digital health is multifaceted and can be assessed through a range of metrics, where response rate serves as an important indicator [-]. The timing and frequency of longitudinal survey deployment may directly impact participants’ response rate []. The optimal timing of survey requests is unclear, as too frequent requests may feel burdensome, while long gaps between scheduled surveys or light touches may lead to participant loss [-]. In addition, more frequent deployments enable smaller survey batches, which may reduce participant burden []. Previous research has examined various strategies to improve survey response rates in digital health studies, including optimizing survey length, personalized reminders, and incentive structures [,]. Some studies have consistently shown that shorter surveys achieve higher response rates [,]. Trinquart et al [] found that personalized notifications are beneficial for device data transmission but not survey completion. However, limited evidence exists on the impact of survey deployment patterns (frequency and number of surveys) on participants’ longitudinal survey response, especially among older adults, who are often underrepresented in digital health research []. Understanding the influence of deployment patterns is crucial for designing effective longitudinal digital health research protocols that can sustain participants’ response rate over time.
To address this knowledge gap, we conducted a randomized controlled trial (RCT) to evaluate whether distributing smartphone-administered surveys into smaller, more frequent batches would improve response rates among older adults. Participants were randomized to receive half of the surveys every 2 weeks (experimental group) or all surveys every 4 weeks (control group) across four time periods: baseline to week 8, weeks 8-16, weeks 16-24, and weeks 24-32.
Methods
Trial Design
We conducted a 2-arm RCT embedded in the electronic Framingham Heart Study (eFHS) to evaluate the effect of a more frequent survey deployment approach with smaller survey batches on response rates. The trial was registered at ClinicalTrials.gov (NCT04752657). The CONSORT-EHEALTH (Consolidated Standards of Reporting Trials of Electronic and Mobile Health Applications and Online Telehealth) checklist is provided in .
Ethical Considerations
This study was reviewed and approved by the Boston University Medical Campus Institutional Review Board (H-36586, H-40737, and H-32132). All participants provided electronic informed consent for their participation in the eFHS and electronic informed consent for the RCT within the eFHS app. Data were deidentified for analysis. Participants were not compensated for participating in the study.
Study Setting and Eligibility Criteria
The eFHS is an e-cohort designed to integrate surveys and commercial wearables to capture health data []. The eFHS is embedded in the FHS, a long-term, community-based cohort study that began recruiting its original cohort of participants in 1948 in Framingham, Massachusetts, to investigate factors and determinants of cardiovascular disease [-]. FHS recruited the children of the original participants and their spouses (Offspring cohort; n=5124) from 1971 to 1975 and a cohort of multiple ancestries (Omni 1; n=506) from 1994 to 1998. The Offspring and Omni cohorts have undergone examinations every 4 to 8 years, with demographic and clinical data collected at each health examination []. At the time of examination 10 (Offspring) and examination 5 (Omni), from 2021 to 2023, participants were invited to the eFHS during the in-person examination at the FHS Research Center in Framingham or enrolled remotely from home []. Eligibility criteria for the eFHS included being an English-speaking FHS participant residing in the United States; owning either a smartphone (iPhone with at least iOS 10 or Android phone with at least version 7) or a computer at home; granting permission for study notifications and data sharing with the research center; and providing signed consent. Participants with iPhones were offered a smartwatch to record step count and heart rate data for 1 year following enrollment, while those who already owned a smartwatch were allowed to use their own watch. Participants who enrolled in the eFHS were invited to participate in an RCT to test whether delivering half of the surveys every 2 weeks versus all surveys every 4 weeks, administered through smartphones, would improve response rates. Trial participants signed and dated a second consent form. While participants who used either smartphones or computers were eligible for eFHS enrollment, this trial specifically required smartphone ownership: 9 eFHS participants with only computer access were excluded.
The study research technician assisted participants with downloading the eFHS app from the App Store or the Android Google Play Store. Participants received a request for permission to enable notifications and share data with the research center, along with step-by-step instructions for enrollment through the eFHS app. After completing the enrollment and randomization process, participants received their initial smartphone app surveys.
Among 620 eFHS participants, 492 participants were enrolled in the current RCT between June 21, 2021, and June 28, 2023. We conducted the trial over 26 weeks and completed the follow-up of all enrolled participants by December 28, 2023, for both primary and secondary outcomes. Longitudinal surveys were sent out from enrollment (week 0) through week 24 for the control group and through week 26 for the experimental group. Participants in the control group were allowed until week 32 to complete and return their responses to the last batch of surveys, while those in the experimental group were allowed until week 34. Baseline characteristics of participants were obtained using standard protocols and definitions as part of the routine research examination.
Periodic notifications were sent through the eFHS app to notify participants when new surveys became available, to remind participants that surveys were due, and to send a personalized “thank you” greeting when all surveys were completed. Based on our previous work, we sent personalized notifications that included the name of the participant, adaptive messaging based on survey completion status, and the eFHS study technician’s name to improve survey response rate []. We selected the notification message randomly from a message bank (). For example, we sent a personalized notification when a survey was due in the next 7 days, such as “Mr. (S) [FHS participant last name], we missed your survey responses last week. We would really appreciate it if you could complete your surveys.” If all surveys were completed, participants received a thank you message selected at random, such as “Mr. (S) [FHS participant last name], way to go! We received your surveys. Thank you!” To ensure that message variation did not confound our results, both experimental and control groups received the same types of messages selected from the same message bank, with random message selection occurring independently of group assignment. Since the experimental group received survey batches more frequently, they consequently received notification messages more often as well.
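A minimal sketch of this notification scheme is shown below: a message is drawn at random from a bank keyed by survey status and personalized with the participant’s name. Two of the templates are quoted from the examples above; the bank structure, function name, and remaining templates are illustrative assumptions rather than the actual eFHS message bank.

```python
# Sketch of personalized notification selection as described above: a
# template is chosen at random from a message bank keyed by survey status
# and filled in with participant and technician names. The bank structure
# and extra templates are assumptions for illustration.
import random

MESSAGE_BANK = {
    "due_soon": [
        "{salutation} {last_name}, we missed your survey responses last week. "
        "We would really appreciate it if you could complete your surveys. - {technician}",
        "{salutation} {last_name}, your surveys are waiting for you in the app. - {technician}",
    ],
    "all_complete": [
        "{salutation} {last_name}, way to go! We received your surveys. Thank you! - {technician}",
        "{salutation} {last_name}, thank you for completing your surveys! - {technician}",
    ],
}


def build_notification(status, salutation, last_name, technician, rng=random):
    """Pick a template at random for the given status and fill in the names."""
    template = rng.choice(MESSAGE_BANK[status])
    return template.format(salutation=salutation, last_name=last_name,
                           technician=technician)


if __name__ == "__main__":
    print(build_notification("due_soon", "Mr.", "Smith", "the study technician"))
```

Because both groups drew from the same bank with selection independent of group assignment, message content itself should not confound the comparison, as noted above.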
Additionally, 1 week after randomization, we contacted all participants by telephone to inquire about any technical issues with the eFHS app. Throughout the study, support staff provided assistance to participants. When interacting with participants to address technical questions, the support staff adhered to standardized scripts to ensure consistency and minimize potential bias.
Surveys and Randomized Deployment Approaches
The MyDataHelps Designer platform (CareEvolution) was used to create the eFHS smartphone app surveys within the MyDataHelps mobile app container. The MyDataHelps app hosts a variety of health surveys that are sent to participants at enrollment and at various intervals afterward. A previous eFHS study reported that 469 participants who returned the Mobile App Rating Scale survey rated app functionality and aesthetics highly (total Mobile App Rating Scale score=8.6 on a 1-10 scale), and those who completed the System Usability Scale survey indicated high satisfaction with the app’s design and ease of use []. In addition, the app included the participant’s account, where the signed consent form could be viewed, and a dashboard providing participants with a thank you message and their survey completion status.
Surveys were chosen for relevance to the health of older adults (). The short form of the Patient-Reported Outcomes Measurement Information System [] allows self-reported assessments of anxiety, depression [], fatigue [], sleep quality [], physical function, pain, cognitive abilities, and cognitive function []. Most surveys typically consisted of 4-item multiple-choice questions with responses ranging from “never” to “always” or “not at all” to “very much.” Additionally, a modified version of the Michigan Body Pain Map [] was used to collect information on chronic pain through a pictorial representation of the human body. We also included surveys of physical activity level [], mobility outside the home [], and occurrences of falls and hospitalizations.
Task-based surveys included smartphone-operated cognitive and physical assessments: a Trail Making Test requiring participants to tap a series of alternating numbers and letters [,]; a Victoria Stroop Test with four subtests requiring participants to select the color of a matching color block, the color of a word printed in black font, the color of words printed in incongruent colors, and the word printed in incongruent colors unless the word was underlined []; a gait task measuring walking time []; and a two-finger tapping task assessing tapping speed and accuracy []. Because these task-based surveys were unavailable on Android phones, Android users contributed data only to the other question-based surveys ().
The surveys were scheduled to be sent out at enrollment (baseline, week 0) and at various intervals afterward through week 26 ( and ). Participants were randomized to receive surveys as per the two different deployment patterns to compare response rates. At each scheduled time point, participants received a batch containing multiple surveys of different types. The control group received all surveys every 4 weeks, while the experimental group received half of the surveys every 2 weeks. Both groups received the same number of surveys within each time period. The deployment pattern was consistent across both groups for the cognition, pain, mood, and psychosocial modules. However, the deployment of the physical function, physical activity, and events surveys was shifted for the experimental group as compared to the control group, resulting in smaller but more frequent survey batches. Participants were allowed to complete a survey batch from its deployment until the next wave, with time intervals varying across modules based on their deployment schedule. For the trial, we followed participants through week 32 for the control group and week 34 for the experimental group to allow for survey responses to the last batch of surveys.
Figure 1. Timeline and pattern of survey module administration in the experimental (half batch every 2 wk) and control (full batch every 4 wk) groups. Both groups received the same number of surveys within each of the 4 time periods. The following smartphone-administered survey tasks were only available on the iPhone: trail making, Stroop, two-finger tapping in the cognition module, and gait in the physical function module. Surveys were administered from week 0 through week 24 for the control group and through week 26 for the experimental group, as shown. Participants in the control group were allowed until week 32 to complete and return their responses to the last batch of surveys, while those in the experimental group were allowed until week 34.
Outcomes
We prespecified 4 survey periods. Both groups received the same number of surveys within each time period ( and ). Period 1 encompassed all survey modules deployed from baseline up to week 8 (22 surveys for iPhone users; 18 for Android users), period 2 from week 8 up to week 16 (19 for iPhone users; 15 for Android users), period 3 from week 16 up to week 24 (18 for iPhone users; 14 for Android users), and period 4 from week 24 to week 32 (15 for iPhone users; 11 for Android users).
We quantified survey response rates for two outcomes. The primary outcome of the participant’s survey response rate was calculated as the proportion of surveys returned per participant within each time period, including both partially and fully completed surveys. For question-based surveys with 4 or more questions, a completion rate of at least 75% was considered partially completed (). For the pain survey with 3 questions and the event survey with 2 questions, we used 66% and 50% as thresholds, respectively (). The secondary outcome was the proportion of questions or tasks completed per participant in each time period, calculated as the number of completed questions or tasks divided by the total number of questions or tasks across all surveys combined. While the primary outcome captures overall survey participation (whether surveys were returned), the secondary outcome provides a more granular measure of participant engagement by quantifying the completeness of responses within returned surveys. Participants may return surveys but leave some questions unanswered due to survey fatigue, difficulty with specific questions, or technical issues. The secondary outcome thus helps differentiate between participation and the thoroughness of survey completion. We conducted a longitudinal analysis to compare the mean proportions of the primary and secondary outcomes across the 4 time periods.
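The following sketch illustrates how the primary and secondary outcomes described above could be computed for one participant within one time period, using the stated partial-completion thresholds (75% for question-based surveys with 4 or more items, 66% for the 3-item pain survey, and 50% for the 2-item event survey). The input format is an assumption about the analysis data set, not the study’s actual code.

```python
# Sketch of the primary and secondary outcome calculations for one
# participant-period. A survey counts as returned if the fraction of its
# items answered meets its partial-completion threshold. The input format
# is an assumption for illustration.


def completion_threshold(survey_name: str) -> float:
    """Per-survey threshold for counting a survey as (partially) returned."""
    if survey_name == "pain":
        return 0.66
    if survey_name == "events":
        return 0.50
    return 0.75  # question-based surveys with 4 or more items


def period_outcomes(surveys):
    """Compute (primary, secondary) outcomes for one participant-period.

    `surveys` maps survey name -> (items_answered, items_total).
    Primary: proportion of deployed surveys returned.
    Secondary: proportion of all items answered across all deployed surveys.
    """
    returned = 0
    answered = total = 0
    for name, (done, n_items) in surveys.items():
        if n_items and done / n_items >= completion_threshold(name):
            returned += 1
        answered += done
        total += n_items
    return returned / len(surveys), answered / total


if __name__ == "__main__":
    example = {"anxiety": (4, 4), "sleep": (3, 4), "pain": (2, 3), "events": (0, 2)}
    print(period_outcomes(example))  # (0.75, 0.692...)
```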
Random Allocation
Participants in the current trial were randomly allocated to one of the two groups using randomization lists generated by a statistician with randomly permuted blocks of varying sizes. The FHS Offspring cohort includes spouses. Therefore, for married couples participating in eFHS and enrolled in the trial, a simple blocked randomization approach was used to ensure that the spouses were always assigned to the same group, as it was assumed that they would be likely to complete the surveys together. For other participants, randomization was stratified according to participants’ age (≤75 y vs >75 y) and type of phone (Android vs iPhone). The stratification approach was informed by a previous study using the FHS Generation 3 cohort, which found that both age and phone type were strongly associated with survey return and that age modified the effect of time on survey completion []. While the couple randomization was not stratified by age or phone type, it was expected that the random allocation would still achieve a balance of these factors. The randomization process was implemented centrally through the eFHS app. eFHS researchers and statisticians were blinded to the group assignment.
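The sketch below illustrates the allocation scheme described above under stated assumptions: randomly permuted blocks of varying sizes within strata defined by age group and phone type, with married couples randomized separately so that both spouses receive the same assignment. Block sizes, variable names, and input format are assumptions for illustration; the actual randomization lists were generated centrally by a statistician.

```python
# Illustrative stratified, permuted-block randomization with couples
# assigned to the same group. Block sizes and input format are assumptions.
import random


def permuted_block_list(n_needed, block_sizes=(2, 4, 6), rng=random):
    """Generate at least n_needed assignments using permuted blocks."""
    assignments = []
    while len(assignments) < n_needed:
        size = rng.choice(block_sizes)        # varying block size
        block = ["experimental", "control"] * (size // 2)
        rng.shuffle(block)                    # random order within the block
        assignments.extend(block)
    return assignments


def allocate(participants, rng=random):
    """Assign each participant (or couple) to a group.

    `participants` is a list of dicts with 'id', 'age_gt_75' (bool),
    'iphone' (bool), and an optional 'couple_id'. Couples share one
    assignment; all others are randomized within (age group x phone type)
    strata.
    """
    allocation = {}

    # Couples: one simple blocked list, both spouses get the same assignment.
    couples = sorted({p["couple_id"] for p in participants if p.get("couple_id")})
    for couple_id, group in zip(couples, permuted_block_list(len(couples), rng=rng)):
        for p in participants:
            if p.get("couple_id") == couple_id:
                allocation[p["id"]] = group

    # Everyone else: stratify by age group and phone type.
    strata = {}
    for p in participants:
        if not p.get("couple_id"):
            strata.setdefault((p["age_gt_75"], p["iphone"]), []).append(p)
    for members in strata.values():
        for p, group in zip(members, permuted_block_list(len(members), rng=rng)):
            allocation[p["id"]] = group
    return allocation


if __name__ == "__main__":
    rng = random.Random(2021)
    demo = [
        {"id": "S1", "age_gt_75": False, "iphone": True},
        {"id": "S2", "age_gt_75": True, "iphone": False},
        {"id": "S3", "age_gt_75": False, "iphone": True, "couple_id": "C1"},
        {"id": "S4", "age_gt_75": True, "iphone": True, "couple_id": "C1"},
    ]
    print(allocate(demo, rng=rng))
```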
Sample Size
We calculated the sample size needed to detect differences in survey response rates between randomization groups over time. We assumed a correlation of 0.6 between repeated measures and an SD of 20% for the proportion of surveys returned. A sample size of 240 participants per group would provide 83% power to detect a between-group difference in slopes of at least 2% for the proportion of returned surveys over time, using a 2-sided test with an α of .05. With 480 participants in total, the study had limited power (31%) to detect a three-way interaction effect size of 2% between subgroups across the 4 time points.
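The reported calculation was analytic; as a hedged illustration, the simulation sketch below shows one way to check such assumptions (repeated-measures correlation of 0.6, SD of 20%, 240 participants per group, a 2% per-period difference in slopes, two-sided α of .05) using the lme4 and lmerTest packages. The baseline level, common decline, and number of replicates are illustrative choices, not values taken from the study.

```r
# Simulation-based check of the power calculation's assumptions (illustrative,
# not the authors' method). Correlation between repeated measures is induced by
# a random intercept whose variance is rho * sd_total^2.
library(lme4)
library(lmerTest)  # Satterthwaite p-values for fixed effects

simulate_once <- function(n_per_group = 240, sd_total = 0.20, rho = 0.6,
                          slope_diff = 0.02) {
  n     <- 2 * n_per_group
  id    <- rep(seq_len(n), each = 4)
  time  <- rep(0:3, times = n)
  group <- rep(c(0, 1), each = 4 * n_per_group)       # 0 = control, 1 = experimental
  b_i   <- rep(rnorm(n, 0, sd_total * sqrt(rho)), each = 4)   # between-person part
  e     <- rnorm(4 * n, 0, sd_total * sqrt(1 - rho))          # residual part
  # Baseline of 75% and a 5% decline per period are illustrative values only.
  y   <- 0.75 - 0.05 * time + slope_diff * group * time + b_i + e
  fit <- lmerTest::lmer(y ~ group * time + (1 | id))
  summary(fit)$coefficients["group:time", "Pr(>|t|)"] < 0.05
}

# Estimated power = rejection rate across simulated trials (few replicates here
# to keep the example quick).
mean(replicate(200, simulate_once()))
```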
Collection of Demographic Data
Participants’ demographic data were obtained from comprehensive data collection during routine FHS research examinations. The demographic variables analyzed in this study included age, sex, race or ethnicity, education level, marital status, employment status (retired vs not retired), annual household income, and self-rated health status. They were collected at examination 10 for the Offspring cohort and examination 5 for the Omni 1 cohort. Phone type (iPhone vs Android) and smartwatch use were documented during eFHS enrollment.
Statistical Methods
Baseline characteristics of participants were presented as mean (SD) for continuous variables or n (%) for categorical variables. We fitted a mixed-effects regression model with random intercepts to compare the mean primary and secondary outcomes between randomization groups across time periods. Analyses followed the intention-to-treat principle: all participants were analyzed in the group to which they were randomized. The model included fixed effects for group assignment (experimental vs control group), time period, age group (>75 y vs ≤75 y), phone type (iPhone vs Android), and a multiplicative interaction of group by time period. Random intercepts were included to account for both within-individual correlation over time and intraclass correlation within spouses. An unstructured covariance structure was used to model the correlation between repeated measurements. For comparisons at any specific time period, we estimated the difference in outcomes between groups using mixed-effects regression. In primary analyses, we used a linear mixed-effects regression model to analyze the outcomes, which were continuous proportions. In sensitivity analyses, we defined a binary outcome by categorizing each survey as returned or not returned and modeled this repeated binary outcome with a logistic mixed-effects model.
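A minimal sketch of how such models could be specified in R is shown below, assuming a hypothetical period-level data set `dat` (one row per participant per time period) and a survey-level data set `surveys`; all variable names are illustrative, and the unstructured covariance for repeated measurements described above is omitted for simplicity.

```r
# Illustrative model specifications (not the authors' exact code). Assumed
# columns: prop_returned, returned (binary, survey level), group, period (factor
# with 4 levels), age_gt75, iphone, couple_id, participant_id. Participants not
# in a couple are assumed to have their own couple_id.
library(lme4)
library(lmerTest)

# Primary analysis: linear mixed-effects model for the continuous proportion,
# with random intercepts for couple and participant and a group-by-time interaction.
fit_primary <- lmerTest::lmer(
  prop_returned ~ group * period + age_gt75 + iphone +
    (1 | couple_id) + (1 | participant_id),
  data = dat
)
summary(fit_primary)

# Sensitivity analysis: logistic mixed-effects model treating each deployed
# survey as returned / not returned.
fit_binary <- lme4::glmer(
  returned ~ group * period + age_gt75 + iphone +
    (1 | couple_id) + (1 | participant_id),
  data = surveys, family = binomial
)
```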
Furthermore, a priori stratified analyses were performed to assess and compare the primary and secondary outcomes in subgroups defined by age group (>75 y vs ≤75 y) and sex (women vs men). We also applied three-way interaction analyses to examine the effect of survey deployment pattern over time across the subgroups.
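Continuing the hypothetical setup from the previous sketch, the stratified and three-way interaction analyses could be specified as follows; the data set and variable names remain assumptions.

```r
# Illustrative sketch of the subgroup analyses, reusing the hypothetical `dat`.
library(lmerTest)

# Stratified analysis: refit the primary model within a subgroup, for example,
# participants aged <= 75 years.
fit_young <- lmerTest::lmer(
  prop_returned ~ group * period + iphone +
    (1 | couple_id) + (1 | participant_id),
  data = subset(dat, age_gt75 == 0)
)

# Three-way interaction: does the group-by-time effect differ by age group?
fit_3way <- lmerTest::lmer(
  prop_returned ~ group * period * age_gt75 + iphone +
    (1 | couple_id) + (1 | participant_id),
  data = dat
)
anova(fit_3way)  # F test table includes the group:period:age_gt75 term
```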
All analyses were conducted in R (version 4.3.0; R Core Team). Two-sided tests were performed for all analyses; we considered P<.05 as statistically significant.
Results
Participant Characteristics
A total of 492 participants were enrolled in the trial, with 248 in the experimental group and 244 in the control group; 20% (n=98) participated as part of a couple (). Table 1 summarizes the characteristics of the participants. The mean age of participants was 74 (SD 6.3; range 55-92) years, 58% (284/492) were women, 84% (413/492) were non-Hispanic White, 67% (327/492) had a bachelor’s degree or higher, and 69% (337/492) considered their health to be very good or excellent. Following randomization, the experimental and control groups had similar baseline characteristics (P>.05 for all comparisons). In addition, characteristics of participants in the trial were similar to those of other eFHS participants (n=128) who chose to receive surveys or smartwatches but were not part of the trial (Table S1 in ).
Figure 2. Flow diagram of participant enrollment and randomization in the trial. eFHS: electronic Framingham Heart Study.
Table 1. Baseline characteristics of participants enrolled in the trial (n=492). Values are presented as overall (n=492); experimental group (n=248); control group (n=244); P valuea.
Age (years), mean (SD): 73.9 (6.3); 74.0 (6.2); 73.9 (6.3); P=.82
Women, n (%): 284 (57.7); 148 (59.7); 136 (55.7); P=.43
Race or ethnicity, n (%): P=.74
- Non-Hispanic White: 414 (84.1); 209 (84.3); 205 (84)
- Black: 30 (6.1); 17 (6.9); 13 (5.3)
- Asian: 35 (7.1); 17 (7.9); 18 (7.4)
- Hispanic: 13 (2.6); 5 (2.0); 8 (3.3)
Bachelor’s degree or higher, n (%): 327 (66.5); 166 (66.9); 161 (66); P=.90
Marital status, married, n (%): 340 (69.4); 162 (65.6); 178 (73.3); P=.08
Participating as a couple, n (%): 98 (19.9); 49 (19.8); 49 (20.1); P≥.99
Phone type, iPhone, n (%): 382 (77.6); 190 (76.6); 192 (78.7); P=.66
Smartwatch user, n (%): 287 (60.2); 151 (62.9); 136 (57.4); P=.25
Retired, n (%): 337 (68.6); 170 (68.5); 167 (68.7); P≥.99
Annual income, n (%): P=.83
- <US $35,000: 42 (10.7); 23 (11.7); 19 (9.8)
- US $35,000-$74,999: 110 (28.1); 55 (27.9); 55 (28.4)
- ≥US $75,000: 239 (61.1); 119 (60.4); 120 (61.9)
Subjective health, very good or excellent, n (%): 337 (68.6); 165 (66.5); 172 (70.8); P=.36
aP compares the experimental and control groups. Continuous variables were compared using the 2-tailed t test, and categorical variables were compared using the chi-square test.
Effect of Survey Deployment Patterns on the Survey Response Rate
We found that the experimental group (more frequent deployment with fewer surveys per batch) had superior response rates over time compared with the control group (less frequent deployment with more surveys per batch) for both the primary and secondary outcomes (; P=.003 and P=.002 for the interaction between deployment pattern and time, respectively). For the primary outcome (proportion of surveys returned), the two groups were comparable in weeks 0-8 (75% vs 76%) and then showed progressively larger differences of 3%, 5%, and 8% in weeks 8-16 (70% vs 67%), weeks 16-24 (64% vs 59%), and weeks 24-32 (58% vs 50%). The secondary outcome (proportion of questions or tasks completed) was highly correlated with the primary outcome (Figure S1 in ; R=0.99). A similar trend was therefore observed for the secondary outcome, with similar response rates during weeks 0-8 (77% vs 78%), followed by increasing differences across weeks 8-16 (71% vs 69%), weeks 16-24 (65% vs 61%), and weeks 24-32 (59% vs 52%). In addition, the proportion of participants not returning any surveys increased from 2% in weeks 0-8 to 33% in weeks 24-32, with a consistently higher dropout rate in the control group (3% to 38%) than in the experimental group (1% to 28%; Figure S2 in ). These results indicate that, although both groups showed declining survey response rates over time, the experimental group (more frequent deployment of smaller survey batches) displayed a slower decline over the 4 periods than the control group.
Table 2. Longitudinal comparison of survey response rates between experimental (half batch of surveys every 2 weeks) and control (full batch of surveys every 4 weeks) groups across four time periods (n=492). Values are presented as experimental group (n=248); control group (n=244); difference in proportionsb.
Primary outcomec (%): proportion of surveys returned (95% CI); P value of group-by-time interaction=.003
- Time perioda 1: 75.4 (70.2 to 80.6); 75.6 (70.3 to 80.9); –0.2 (–7.0 to 6.5)
- Time period 2: 69.5 (64.5 to 74.4); 67.2 (62.2 to 72.2); 2.3 (–4.0 to 8.6)
- Time period 3: 63.5 (58.6 to 68.5); 58.7 (53.7 to 63.8); 4.8 (–1.5 to 11.1)
- Time period 4: 57.6 (52.4 to 62.8); 50.3 (45.0 to 55.6); 7.3 (0.5 to 14.0)
Secondary outcomec (%): proportion of questions or tasks completed (95% CI); P value of group-by-time interaction=.002
aTime periods: 1 = weeks 0-8; 2 = weeks 8-16; 3 = weeks 16-24; 4 = weeks 24-32.
bDifferences estimated as outcomes of the experimental group minus outcomes of the control group.
cEstimated outcomes were derived from a linear mixed-effects model analyzing repeated measurements of primary or secondary outcomes across four time points per participant. The model included fixed effects for randomization group, time point (time period 1-4), age group (≤75 y vs >75 y), phone type (iPhone vs Android), and a randomization group-by-time interaction. Random intercepts were included for each couple and individual to account for correlation within couples and repeated measures of individuals.
In a sensitivity analysis, we coded the outcome of each survey as a binary variable (returned or not returned) and analyzed these binary outcomes instead of the proportions. Because only approximately 1% of returned surveys were partially completed, we did not distinguish partially from fully completed surveys and categorized both as “returned.” The significant interaction between time and group (Table S2 in ; P<.001) demonstrated that the experimental group had a slower decline in survey response rate over time than the control group. The odds ratios (experimental vs control group) increased progressively over time, from 1.00 (95% CI 0.93-1.07) in the first time period to 1.08 (95% CI 1.01-1.15) in the final period (Table S2 in ), indicating higher odds of survey completion in the experimental group after the first period.
The effects of survey deployment pattern over time were explored across subgroups defined by age and sex through three-way interaction analyses. None of these interactions was significant (P=.95 for age group; P=.65 for sex), indicating that the deployment pattern effect over time did not differ significantly across subgroups by age group (≤75 y [n=288] vs >75 y [n=204]) or sex (women [n=284] vs men [n=208]). However, with only 31% power to detect a three-way interaction effect, the study was underpowered to formally test whether the intervention effect differed between strata. In the age-stratified analyses, we observed that, among younger participants (≤75 years), the experimental group showed higher survey return rates over time than the control group (12% higher in the final period), whereas, among older participants (>75 years), there were minimal differences between groups (Table S3 in ). In both women and men, the experimental group maintained higher response rates than the control group in later time periods: among women, there was a 6% between-group difference in the final period, and among men, an 8% difference (Table S4 in ).
Discussion
Principal Findings
This RCT, embedded within the eFHS Offspring and Omni Study, provides insight into the impact of survey deployment patterns on longitudinal response rates among older adults completing smartphone-administered health surveys. Our findings demonstrate that administering half of the surveys every 2 weeks (experimental group) was associated with better-maintained response rates over time compared with administering all surveys every 4 weeks (control group). These findings suggest that survey deployment patterns can significantly influence participants’ response rates over time, with potential implications for the design of digital health research protocols.
The trial design maintained the total survey burden constant between groups, isolating the effect of deployment pattern from overall workload. Several mechanisms may explain the superior response rates observed in the experimental group. First, the biweekly deployment pattern likely reduced cognitive burden by presenting fewer surveys per session, making each interaction less likely to be overwhelming [,,]. Additionally, the more frequent touchpoints may have helped maintain participant engagement through regular reminders of their study participation, creating a routine that integrated survey completion into participants’ schedules. The shorter time interval between deployments may have also reduced the likelihood of participants losing motivation between survey waves and reinforced the connection between participants and the research through consistent study contact [,].
Our results contribute to the growing body of literature on optimizing digital health research methods, particularly for older adult populations. To date, few studies have investigated the impact of survey deployment patterns on response rates. A systematic review of RCT participation burden found that more frequent questionnaires increased patient workload in a comparison of daily versus weekly assessments []. However, some prior studies suggested that frequent follow-up combined with other engagement strategies can increase survey response rates [,]. These mixed findings highlight the complexity of optimizing deployment frequency: overly frequent deployment may increase burden, while overly infrequent deployment may lead to loss to follow-up. The optimal deployment frequency likely varies by survey type and length and may require further exploration across different study contexts. This study provides insight into this topic by specifically examining the impact of survey deployment patterns (half batch of surveys every 2 weeks vs full batch of surveys every 4 weeks) on response rates among older adults (mean age 74, SD 6.3 y). The observed decline in survey response rates over time is consistent with findings from other longitudinal health studies [,]. However, our results suggest that more frequent but smaller survey batches may mitigate this decline to some extent, potentially improving adherence and reducing survey fatigue []. This approach appears to be less burdensome than less frequent, larger survey batches. In addition, these more frequent touchpoints with participants may help maintain their connection to the study [,].
In this trial, while the experimental group showed less dropout than the control group, both groups experienced increasing dropout over time, with cumulative dropout reaching 28% and 38%, respectively, by weeks 24-32. However, these dropout rates are lower than those reported in many other digital health studies. For instance, Meyerowitz-Katz et al [] analyzed 17 mobile health studies and found an average dropout rate of 43%; the studies varied widely in duration, ranging from 2 weeks to 1 year, with nearly one-third lasting 1 month or less. Daniore et al [] reported a median study completion rate of 48% among 19 remote digital health studies lasting 12 weeks or less, highlighting that retention rates as low as 10% are common in digital health research []. The persistence of dropout even with our deployment strategy reflects the inherent challenges of maintaining long-term retention in remote digital health research [].
Older adults do not adopt new and emerging technologies as readily as younger people [-]. However, while younger adults often demonstrate higher initial engagement with technology, their retention and adherence tend to fall off more steeply over time than those of adults aged 60 years and older []. Older age was associated with improved retention in 8 remote digital health studies, although adults aged 60 years and older were the least represented age group in those studies, and participant retention times (medians ranging from 2 to 26 days across the 8 studies) were far shorter than the follow-up investigated in this trial []. A similar pattern was also observed in a previous study of middle-aged participants in FHS []. Understanding the predictors of older adults’ internet use and digital competence has become increasingly important as people live longer and the age profile of societies rapidly changes []. While many previous studies broadly classify “older adults” as those older than 60 years, our study population (mean age 74, SD 6.3 y) provides evidence from a substantially older demographic [,]. Old-age diversity is underrepresented in digital health research, and older individuals were more likely to be nonscreened, nonrecruited, or decliners in these studies []. This study revealed differences in response rates between older age groups (higher response rates among those ≤75 y), which deepens our understanding of strategies to maintain survey response rates among older adults in digital health research. National survey data and data from large hospital centers identify older age as a factor associated with declines in wearable device use, consistent with less interest or comfort with technology []. These findings underscore the need for age-specific approaches in digital health research. When promoting technology adoption among older adults, both the technology’s features and the capabilities of older users should be considered, along with the provision of appropriate technical support []. Personalized learning approaches and aids that respect older adults’ pace and connect to their real-life needs may improve their engagement with digital technologies [].
Strengths and Limitations
This study benefits from several key strengths. The eFHS is embedded within the long-standing FHS, whose participants have been involved for decades. This long-term engagement has fostered trust between participants and researchers, likely contributing to higher retention rates than typically seen in e-cohorts. For example, among our sample with a mean age of 74 (SD 6.3) years, more than 65% (320/492) of participants continued to return surveys in the last period. Additionally, this study’s 26-week duration provides valuable insights into extended engagement patterns, offering a more comprehensive view of participant behavior compared to shorter-term digital health studies. Furthermore, the nested design of eFHS within FHS allows participants to be well characterized at in-person research center examinations.
Several limitations should be considered when interpreting the results of this study. First, the generalizability of our findings may be limited due to the characteristics of our study cohort. The eFHS participants were predominantly older, non-Hispanic White, English-speaking, well-educated, and resided primarily in the New England region of the United States. Moreover, eFHS participants were generally healthier than the broader FHS cohorts. The response rate patterns observed in this trial may be influenced by the participant characteristics. For example, higher education levels are associated with greater health literacy and technology adoption, potentially facilitating survey completion and reducing barriers to technology access and use [,]. The long-standing relationship between FHS participants and researchers likely enhanced trust and commitment, contributing to the relatively high response rates. Therefore, our findings may not be fully applicable to more diverse populations, younger participants, or those with different socioeconomic backgrounds or specific health conditions. Second, while the more frequent survey deployment pattern with smaller batches (administering half of the surveys every 2 weeks) was associated with an increased response rate, it introduced a temporal misalignment in end point assessments. This resulted in some instruments being assessed up to 2 weeks earlier or later compared to others, potentially introducing variability in the timing of data collection. This temporal discrepancy should be considered when interpreting the results and may have implications for studies where precise timing of assessments is critical. Third, our subgroup analyses by age and sex had limited statistical power to detect the three-way interaction and should be viewed as exploratory findings. Additionally, unmeasured factors such as technology literacy may have influenced response rates within subgroups, potentially masking interaction effects. Fourth, this trial focused on longitudinal response rates as main indicators of participant engagement, but other potentially informative engagement metrics, such as retention rates, time to survey completion, or time spent on each survey, were not assessed. While notification messages were randomly selected from the same message bank, we did not examine potential interactions between message type and deployment pattern. Furthermore, this study was limited to smartphone users and may not be generalized to digital health research using other mobile or electronic devices, such as tablets or desktop computers, which may have different use patterns and user interfaces.
Future Work
Future studies should investigate the impact of survey deployment patterns across diverse populations, including those with varying demographic backgrounds and specific health conditions. Additionally, research is needed to explore the optimal balance between survey frequency and quantity. While this study compared two specific patterns, future investigations could examine a broader spectrum of deployment patterns.
Larger studies with adequate statistical power are needed to confirm our exploratory subgroup findings and should collect data on additional factors, such as technology literacy, to better understand differential effects across participant characteristics. Future research should also incorporate other complementary engagement metrics beyond response rates, including retention rates, time to survey completion, and time spent on surveys, to develop a more comprehensive understanding of how deployment patterns affect various dimensions of participant engagement in digital health research.
Our findings underscore the potential of digital technology to engage older adults in health research. The high response rates, even among older adults, demonstrate the feasibility of remotely collecting important health data on mood, cognition, and physical function from older populations. This is particularly significant as technology increasingly empowers older individuals to self-monitor chronic conditions at home []. It also encourages implementation of decentralized RCTs that leverage remote participation and data collection, potentially expanding research access to traditionally underrepresented populations while reducing participant burden [].
Digital health studies offer a promising avenue for research participation, particularly for those who may struggle with in-person examinations, and they can extend outreach to improve access and representation in health research and clinical trials [-]. As digital methods become increasingly prevalent in health research, optimizing strategies for sustained participant engagement is crucial. Identifying effective strategies to maintain high response rates can improve the quality and representativeness of data collected through digital devices, ultimately enhancing the understanding of health trajectories in populations [].
Conclusions
In conclusion, this RCT provides evidence that survey deployment patterns influence longitudinal response rate in smartphone-based health research among older adults. The strategy of administering smaller, more frequent survey batches shows promise in maintaining higher response rates over time compared to larger, less frequent batches.
Acknowledgments
The electronic Framingham Heart Study (FHS) is supported by R01AG076757 and R01-HL141434. The FHS is supported by the National Heart, Lung, and Blood Institute at the National Institutes of Health (contracts N01-HC-25195, HHSN268201500001I, 75N92019D00031, and 75N92025D00012). EJB is partially supported by R01HL092577 and 1R01AG066010. The investigators thank the FHS participants. Generative artificial intelligence was not used in any portion of the manuscript writing.
Data Availability
The datasets generated and analyzed during this study are available from the corresponding author on reasonable request and will be available at public repositories (DbGaP and BioLINCC).
Authors’ Contributions
JMM and EJB obtained funding and supervised the collection of electronic Framingham Heart Study (eFHS) data; CL, JR, and YZ created analysis-ready datasets of the eFHS data; LT and JMM designed the trial; YZ conducted the statistical analyses with CL for this study; YZ drafted the original manuscript; all authors reviewed and edited the final manuscript.
Conflicts of Interest
ES is an employee of CareEvolution.
Edited by A Coristine; submitted 14.Mar.2025; peer-reviewed by AY Chiang, M Uchida, R Moser; comments to author 18.Jul.2025; revised version received 07.Aug.2025; accepted 19.Sep.2025; published 10.Oct.2025.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.