  • Journal of Medical Internet Research

    Digital health technologies (DHTs), encompassing a wide array of tools from mHealth apps and telemedicine to artificial intelligence, hold transformative potential for health care worldwide [,]. By expanding access to care, enhancing patient engagement, and improving the efficiency of diagnostic and treatment pathways, these technologies offer significant opportunities to build more accessible, affordable, and equitable health systems [-]. The World Health Organization defines digital health as “the field of knowledge and practice associated with the development and use of digital technologies to improve health.” [] This broad concept includes not only established eHealth domains but also emerging areas such as big data analytics and the Internet of Things, reflecting its integral role in modern health care.

    However, the benefits of digital health are not universally realized and are not distributed equally []. Factors such as digital literacy, access to devices and internet, socioeconomic status, cultural relevance, and community context influence who benefits from digital health solutions [,]. In fact, digital literacy and internet connectivity have been termed “super social determinants of health” because of their foundational influence on all other determinants of health in the digital age []. The rapid digitization of health care may widen health disparities if solutions are not developed with these determinants in mind []. Growing evidence suggests that the digital transformation in health care may exacerbate existing health inequities, creating new barriers for marginalized populations including persons with disabilities, patients of racial or ethnic minority groups, those with limited language proficiency, and people with low socioeconomic status [-].

    Previous research on the digital divide and health care access for vulnerable groups has illuminated various forms of exclusion, such as the inaccessibility of health websites and mobile apps, often due to a lack of distinguishable button features, inaccessible content, or the absence of assistive technology integration [,]. For individuals with blindness specifically, existing literature frequently points to significant challenges in interacting with visually oriented digital environments []. Crucially, much of this prior research tends to homogenize the experiences of vulnerable populations [], overlooking the nuanced realities and varying adaptive capacities within specific subgroups. This oversight means that while broad challenges are identified, the potential for certain segments of vulnerable communities to navigate and even leverage digital tools remains underexplored.

    Within this context, educated and digitally literate young adults with blindness represent a critically overlooked and underexplored subgroup. For the purpose of this study, we define our participant cohort as follows: “young adults” refers to individuals aged 18 to 30 years [], a generation broadly considered digitally native; “educated” refers to individuals who have received or are currently pursuing higher education (including associate’s, Bachelor’s, Master’s, or doctoral degrees); and “blindness” is defined according to the World Health Organization criteria of a presenting visual acuity of less than 3/60 in the better eye []. This cohort embodies a central paradox: they are a digitally native generation, often exhibiting a greater willingness and capacity to adopt new technologies and engage in digital transformation through exploratory learning. The proliferation of smartphones equipped with assistive features like screen readers and voice assistants theoretically holds significant promise for enhancing their independence. However, their entire digital experience is mediated by these assistive technologies, rendering them uniquely vulnerable to design and usability flaws in mainstream applications. The existing literature, by not adequately differentiating within the community with blindness, fails to capture the unique dynamic of empowerment and exclusion experienced by this specific subgroup. This study addresses the critical gap by proposing that a segment of high-literacy individuals with blindness, through personal effort and adaptive strategies, can indeed mitigate some impacts of the digital divide, a nuanced perspective often underestimated in studies that generalize vulnerabilities. Understanding this internal heterogeneity is paramount for developing genuinely effective and equitable digital health solutions.

    China offers a uniquely relevant context for exploring these complex issues. It boasts one of the world’s largest internet user bases, exceeding 1.1 billion individuals as of 2024 [], with extensive access to a variety of internet-based services, including health care []. Concurrently, China is home to one of the largest populations with disabilities in the world, including nearly 10 million who are blind [], a significant proportion of whom are young. While some studies in China have identified health care barriers for visually impaired individuals, such as difficulties with registration, navigation, and understanding treatment processes [], the majority of empirical studies on digital health access have tended to focus on older adults or persons with disabilities in general. These studies offer valuable broad overviews but often do not provide in-depth insights into the specific experiences of educated young adults with blindness navigating both empowerment and exclusion in a rapidly digitizing health care system. The unique combination of a highly developed digital infrastructure and a large young population with blindness in China provides invaluable insights into how accessibility challenges persist and manifest even amid advanced technological environments, underscoring the urgency for inclusive design.

    To address this gap, this qualitative study aims to comprehensively explore the lived experiences of educated and digitally literate young adults with blindness in China as they access health care services in the digital age. A qualitative methodology is uniquely suited to capture the rich, in-depth narratives of these interactions, uncovering the nuanced facilitators and barriers that quantitative methods might miss. This nuanced understanding of their lived experiences with the digital health ecosystem can inform policy developments and improve clinical practices in promoting digital health equity.

    Study Design

    We used a qualitative design to gain an in-depth understanding of how educated and digitally literate young adults with blindness navigate health care access with the assistance of DHTs. This approach was chosen to capture the rich, subjective lived experiences and perceptions of participants, offering deep insights into how they interpret their personal encounters, construct their realities, and attribute meaning to their experiences within a rapidly digitizing health care ecosystem []. A qualitative methodology is particularly appropriate for exploring complex social phenomena where individual perspectives are central to uncovering the underlying dynamics of empowerment and exclusion. This study adheres to the Consolidated Criteria for Reporting Qualitative Research (COREQ) guidelines () [].

    Participants and Recruitment

    Participants

    This study focused on educated young adults with blindness who actively use smartphones and digital platforms to access health care services. Participants were selected based on the following inclusion and exclusion criteria ().

    Textbox 1. Inclusion and exclusion criteria.

    Inclusion criteria

    • Citizens and residents of China
    • Mandarin speakers
    • Young adults aged 16 to 36 years
    • Higher education (associate’s degree, Bachelor’s degree, or higher)
    • Capable of independently operating at least 1 digital device (eg, smartphone or computer)
    • Individuals with blindness (presenting visual acuity worse than 3/60 in the better eye, based on World Health Organization standards [])

    Exclusion criteria

    • No experience seeking health care services within the past 2 years
    • Unwilling to participate or unable to clearly articulate their experiences
    • Failure to meet any of the defined inclusion criteria

    Recruitment and Sample Size

    A purposive snowball sampling approach was used to recruit participants. Initially, participants were selected from online communities and social media platforms serving people with blindness in China. Initial recruitment was facilitated by author CC (who is also a highly educated adult with blindness), who posted the study invitation in several WeChat groups dedicated to information exchange and community building among the population with blindness. Interested and eligible individuals were then contacted directly by the author for screening. Following their interview, initial participants were asked to refer peers in their network who also met the study criteria, thus generating the subsequent snowball sample.

    A total of 12 participants were recruited for this study. In qualitative research, sample size is determined by the principle of data saturation, not statistical generalizability. This approach is supported by findings from Guest et al [], which indicate that 12 interviews are often sufficient to reach thematic saturation in a relatively homogeneous sample. Our analysis showed a similar pattern: over 70% of themes were identified within the first 6 interviews, and the primary core themes were established by the 10th interview. To confirm saturation, 2 additional interviews were conducted, which yielded no new core themes. Therefore, the final sample of 12 participants was considered sufficient for a comprehensive analysis.
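    To illustrate how thematic saturation of this kind can be tracked, the sketch below tallies the cumulative count of unique themes interview by interview. All theme labels and per-interview sets are hypothetical; only the overall pattern (most themes emerging within the first 6 interviews, no new themes in the final interviews) mirrors the figures reported above.

    ```python
    # Minimal sketch: tracking cumulative theme discovery across interviews
    # to judge thematic saturation. The per-interview theme sets are
    # hypothetical; only the overall shape of the curve mirrors the study.

    def saturation_curve(themes_per_interview):
        """Yield (interview_number, cumulative_unique_theme_count)."""
        seen = set()
        for i, themes in enumerate(themes_per_interview, start=1):
            seen.update(themes)
            yield i, len(seen)

    interviews = [
        {"booking", "info_seeking"}, {"booking", "companions"},
        {"privacy"}, {"provider_bias", "info_seeking"},
        {"inaccessible_ui"}, {"hospital_environment"},
        {"companion_quality"}, {"privacy"}, {"provider_bias"},
        {"booking"}, set(), set(),   # final interviews: no new themes
    ]

    for n, total in saturation_curve(interviews):
        print(f"after interview {n}: {total} unique themes")
    ```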

    Data Collection

    Semistructured interviews were conducted in Mandarin Chinese during September 2024 by 1 author (JZ), a female PhD student trained in qualitative research methods. The interviewer had no prior relationship with the participants, which helped minimize biases and address potential ethical concerns. All interviews were carried out remotely using the Tencent Meeting (Tencent Technology Co Ltd) videoconferencing platform. Tencent Meeting was selected due to the necessity for remote data collection during the COVID-19 outbreak and its status as a mainstream, accessible, and free online conferencing tool widely used in mainland China [,]. This ensured both the safety of participants and researchers and provided a familiar and convenient platform for our digitally literate participants with blindness. Before the interviews, participants were provided with detailed information about the study’s purpose, procedures, and the expected time commitment.

    A topic guide with open-ended questions () was used during interviews to ensure comprehensive coverage of relevant topics and allow participants to freely elaborate on their experiences. Follow-up questions were posed as needed to clarify responses and gather more detailed information on participants’ perspectives. The interview questions were developed by reviewing the existing literature and incorporating expert opinions. To ensure validity, the guide was pretested by 2 educated adults with blindness (who were not included in the final sample) and revised based on their feedback. Participants were initially asked to share their personal background and how they became blind. Subsequently, they were prompted to describe their past and present experiences in accessing health care services. The third part focused on the perceived benefits and challenges of using digital tools, including how such technologies empowered or hindered their access to health care. Follow-up questions were tailored to participants’ responses to encourage deeper elaboration. Finally, participants were invited to share any additional thoughts or address overlooked aspects before concluding the interview. The interviewer took field notes during the interviews to supplement the data and highlight key moments []. A total of 12 interviews were conducted, with durations ranging from 35 to 90 minutes (mean 55.0, SD 18.5). All interviews were audio-recorded with participants’ permission, transcribed verbatim, and checked by participants.

    Ethical Considerations

    This study received ethical approval from the Peking University Institutional Review Board (IRB00001052-22097). Due to the participants’ blindness, a verbal informed consent process was meticulously followed. Before each interview, participants were thoroughly informed about the study’s purpose, procedures, their right to withdraw at any time without penalty, the voluntary nature of their participation, and the measures taken to ensure confidentiality. Oral consent was obtained after ensuring that participants fully understood all aspects of the study, and this consent was audio-recorded as part of the interview. To protect the privacy and confidentiality of participants, strict measures were implemented. All data collected, including interview transcripts and audio recordings, were anonymized immediately upon transcription by removing direct identifiers such as names, specific locations, or any other potentially identifying information. Pseudonyms were assigned to participants to ensure their anonymity in all research outputs. All data were stored securely on password-protected university servers accessible only to the research team. Participants received compensation ranging from 60 to 100 RMB (US $8.40 to $14.00) for their time and participation. We confirm that no images or other materials that could identify individual participants are included in this paper or any supplementary materials. All procedures involving human subjects were conducted in accordance with the ethical standards of the institutional and national research committee and with the Helsinki Declaration.

    Data Analysis

    This study used thematic analysis, a flexible and powerful method for systematically generating robust findings by “identifying, analyzing, and reporting patterns (themes) within data” []. Following the inductive qualitative thematic analysis approach outlined by Braun and Clarke [,], our data analysis encompassed 3 phases: reading, coding, and theming, informed by practical thematic analysis guidelines [].

    The reading phase commenced with the transcription of recorded interviews by 1 author (JZ), which were subsequently verified by the participants. The translated interview transcripts were then imported into the qualitative data analysis software MAXQDA 24 (VERBI GmbH) to facilitate the analytical process. During this phase, the researchers achieved extensive familiarization with the data through repeated readings.

    The coding phase began with initial code development and involved a systematic and iterative process. One researcher (JZ) initiated the process by assigning descriptive codes line-by-line to segments of the interview transcripts using MAXQDA 24. These codes were generated inductively, emerging organically from a close reading of the text. They represented specific concepts, ideas, or experiences directly relevant to the study’s objectives, aiming to capture the richness of participants’ perspectives in their own words. To ensure academic rigor and reliability, a second researcher (CS) independently analyzed 30% of the uncoded interview transcripts, generating her own set of initial codes without any influence from JZ. After this blind coding process, the codes were discussed and compared among all authors. Code definitions were refined, and a shared codebook was developed. This iterative process involved reviewing and revising codes, merging similar concepts, and resolving discrepancies, ultimately ensuring a comprehensive and aligned approach to the remaining data [].
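    To make the double-coding step concrete, the sketch below compares two coders’ segment-level code assignments and flags disagreements for the reconciliation discussion described above. The percent-agreement statistic and all segment data are illustrative additions; the study itself relied on discussion-based comparison rather than a formal agreement statistic.

    ```python
    # Illustrative sketch of comparing two coders' segment-level codes
    # before codebook reconciliation. All data here are hypothetical.

    coder_jz = {"seg1": "inaccessible_ui", "seg2": "privacy", "seg3": "booking"}
    coder_cs = {"seg1": "inaccessible_ui", "seg2": "privacy", "seg3": "companions"}

    shared = set(coder_jz) & set(coder_cs)
    agreements = sum(coder_jz[s] == coder_cs[s] for s in shared)
    print(f"raw agreement: {agreements}/{len(shared)} "
          f"({100 * agreements / len(shared):.0f}%)")

    # Disagreeing segments (here seg3) would be flagged for the group
    # discussion in which code definitions are refined and merged.
    disagreements = [s for s in shared if coder_jz[s] != coder_cs[s]]
    print("to discuss:", disagreements)
    ```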

    The theming phase involved synthesizing these refined codes into broader, overarching themes that addressed the research questions [,]. Throughout the entire data analysis process, particular attention was paid to the concept of data saturation. Discussions regarding saturation began during the initial reading phase and continued iteratively throughout coding and theming to ensure that no new information was emerging and that the themes were well-developed and grounded in the data.

    Participants’ Characteristics

    A total of 12 educated and digitally literate young participants with blindness were included in this qualitative study (). The average age was 25.4 (SD 2.2) years, more than half were female (7/12, 58%), and most experienced blindness from an early age (9/12, 75%; aged <6 y). Reflecting the inclusion criteria, all participants were currently pursuing or had completed higher education: 17% (2/12) held junior college degrees, while 83% (10/12) had completed or were pursuing Bachelor’s degrees or higher. In terms of occupation, 58% (7/12) were employed, with the remaining 42% (5/12) being students or unemployed. Half of the participants (6/12, 50%) resided in first-tier cities, with the remaining half evenly distributed between new first-tier or second-tier and third-tier or below cities (3/12, 25% each). The most common reasons for seeking health care were acute conditions and injury treatment (6/12, 50%), followed by chronic and skin conditions (4/12, 33%).

    Table 1. Demographic information of educated young adults with blindness (N=12).
    Characteristics Value
    Age (y), mean (SD) 25.4 (2.2)
    Sex, n (%)
     Male 5 (42)
     Female 7 (58)
    Age of blindness onset (y), n (%)
     Congenital or early onset (0‐5) 9 (75)
     Acquired (>6) 3 (25)
    Education, n (%)
     Junior college 2 (17)
     Bachelor’s degree or higher (completed or in-progress) 10 (83)
    Occupation, n (%)
     Employed 7 (58)
     Students or unemployed 5 (42)
    Residence (city tier), n (%)
     First-tier 6 (50)
     New first-tier or second-tier 3 (25)
     Third-tier and below 3 (25)
    Primary health care visits, n (%)
     Acute conditions and injury treatment 6 (50)
     Chronic and skin conditions 4 (33)
     General check-ups 1 (8)
     Gynecological care 1 (8)

    Overarching Category: Experiences of Empowerment but Exclusion in Digital Health Care

    Participants’ experiences navigating health care in the digital age were rich and multifaceted, consistently revealing a complex dynamic of both empowerment and exclusion. Our thematic analysis yielded 7 key themes, which are presented under 2 overarching categories: empowerment (reflecting how digital technologies enhance autonomy and access), and exclusion (highlighting persistent barriers and unmet potentials in digital health care; ).

    Table 2. Overview of themes.
    Overarching category and theme Summary of key points identified
    Empowerment

     Digital platforms empowering self-management and health care access
     DHTsᵃ enabled participants to independently book appointments, reducing wait times and enhancing efficiency. These platforms also provided diverse and comprehensive health information, fostering self-advocacy and proactive health management.

     Digital platforms empowering for finding medical visit companions
     DHTs facilitated the discovery of medical companions, improving access to services and fostering a sense of independent navigation. This assistance provided both physical navigation and emotional support during hospital visits.

    Exclusion

     Inaccessible online appointment systems
     Online appointment systems often lacked inclusive booking options and featured cluttered interfaces not optimized for screen readers, limiting access for individuals with blindness despite the general shift to digital platforms.

     Inaccessible health care environments and information formats
     The absence of accessible interfaces on self-service machines (eg, for check-in, payment, and prescription pickup) and the lack of accessible formats for written materials (eg, laboratory reports) created significant barriers within hospital environments.

     Lack of provider competencies in respecting patient autonomy
     Provider assumptions of digital incompetence led to communication being directed at sighted companions, undermining patient autonomy and reinforcing stereotypes, despite patients’ digital literacy.

     Data privacy and security concerns
     The increased digitalization of health services heightened concerns over data breaches, making privacy harder to maintain. Complex interfaces and the use of voice-based assistive tools in public settings further complicated privacy management.

     Challenges related to the quality and consistency of online companion support
     While enabling, reliance on online platforms for companions introduced specific challenges related to the inconsistent quality and limited capabilities of support, often lacking emotional connection and accountability.

    ᵃDHT: digital health technology.

    Empowerment: Digital Technologies Fostering Access

    Digital Platforms Empowering Self-Management and Health Care Access

    All 12 participants in this study demonstrated a high level of digital engagement, routinely using smartphones and screen reader technology to overcome accessibility challenges in daily life, extending their digital practices into areas such as information seeking, learning, and social interaction. The most frequently used applications included WeChat, Rednote (xiaohongshu in Chinese), TikTok (Douyin in Chinese), Bilibili, and Xianyu, which are popular platforms in China for social networking, content sharing, and e-commerce. This digital proficiency directly translated into enhanced health care engagement.

    Participants reported using digital platforms to access health care services and information, including managing appointments and consulting health-related content online. In participants’ views, digital platforms offer 2 key advantages: (1) they provide a wealth of diverse and comprehensive information, surpassing traditional word-of-mouth referrals; and (2) they enable users to access this information with temporal and spatial flexibility, offering greater convenience compared to time- and location-bound methods. This enhanced access to information did more than improve convenience; it facilitated a fundamental shift from passive reliance on others to proactive self-advocacy. Participants perceived this newfound ability to independently seek out and act on information as a powerful form of self-expression and a significant gain in personal freedom. For instance, one participant described how digital platforms enabled her to proactively seek mental health support tailored to her needs:

    I posted on Rednote saying that I am blind and looking for a psychiatrist who does not discriminate against me, and I received several responses from supportive individuals. This made me feel that I no longer need constant attention from my parents or those around me, as I can proactively seek information and help online.
    [Participant ZX, female, 22 years]

    For those with acquired vision loss (ie, vision loss that occurs after birth due to accidents, disease, or other environmental influences), the internet served as a crucial lifeline for rebuilding their life trajectories. As formal medical guidance on rehabilitation was often lacking, online patient communities and peer networks frequently became their main source of practical guidance and comfort:

    Doctors usually just said, ‘there’s no treatment,’ and offered little else. It was other patients—people I met online or in hospitals—who told me about schools for the blind, massage training, or what assistive devices to get.
    [Participant CT, male, 30 years]

    Compared with traditional appointment scheduling, which required in-person visits, online appointment scheduling systems have greatly improved health care access by allowing patients to register remotely via hospital WeChat Official Accounts. Real-time updates offer patients more control over scheduling, allowing them to easily find alternative hospitals with available appointments.

    Now, all tertiary hospitals have fully implemented online appointment systems, which is more convenient for blind people like us who use smartphones frequently. I always make appointments through the hospital’s WeChat Official Account before seeing a doctor.
    [Participant ML, female, 27 years]

    Digital Platforms Empowering for Finding Medical Visit Companions

    Hospital visits without assistance posed significant challenges for individuals with blindness, sometimes leading to delays in seeking necessary health care. For people with blindness without family or friends nearby, digital platforms offer a potential solution by connecting them with volunteer networks or organizations providing paid medical visit companions (MVCs). All participants reported benefits when receiving assistance from MVCs, as the presence of a companion alleviated anxieties and provided a sense of security throughout their hospital journey.

    In the past, I would often delay medical visits because I felt overwhelmed by the hospital environment and often leave the hospital feeling that I had not addressed all my concerns, simply because I was too anxious to ask questions. When I was in Hangzhou, I began using Xianyu around one year ago to find companions. Over the past year, I have used this service a few times to arrange for someone to accompany me during medical appointments. I searched for keywords like ‘medical visit companions services’ and found options where individuals offered accompaniment services. They took me from home to the hospital and back, with charges from 30 to 80 yuan per hour. Having someone with me allows me to ask the right questions and make sure my issues are resolved.
    [Participant RL, male, 26 years]

    These insights highlight the empowering role that MVCs play in providing both physical navigation and emotional support, making it easier for individuals with blindness to take charge of their health care. Through the combination of technological access and personal support, participants could engage more fully with their health care providers, which improved their health care–seeking experiences and, potentially, their health outcomes.

    Exclusion: Persistent Barriers and Unmet Potentials in Digital Health Care

    Inaccessible Online Appointment Systems

    A significant challenge reported by participants was how the shift to digital platforms, while offering convenience, simultaneously erected new and formidable barriers. This dual reality was aptly summarized by a participant who noted:

    Online registration/payment has made things more convenient, but there’s still a lot that’s not working.
    [Participant HY, female, 24 years]

    This gap was particularly evident where digital platforms, despite offering convenience, featured designs that created new exclusionary hurdles. For instance, many hospital WeChat Official Accounts, while the primary channel for online appointments, presented cluttered interfaces with complex layouts and images not optimized for screen readers. This poor usability hindered navigation and undermined informed decision-making, as 1 participant explained:

    Each hospital has its own WeChat Official Account, and they differ from one another. The interface is complex, and the buttons are not designed with focus settings. This inaccessibility prevents me from accessing relevant information, thereby impacting my healthcare decision-making.
    [Participant CY, female, 24 years]

    Inaccessible Health Care Environments and Information Formats

    Participants reported that complex hospital environments remained highly challenging to navigate. Standard accessibility features were commonly absent or inadequate: Braille indicators in elevators were missing, tactile paths were poorly designed, and key areas lacked auditory cues. More critically, the increasing digitalization within hospitals often introduced new barriers or failed to mitigate existing physical ones.

    For example, written materials such as laboratory reports, discharge records, and prescriptions were printed on paper without accessible formats like Braille or large print, making them difficult to read independently and hindering patients with blindness from accessing vital information about their diagnosis and treatment. One participant expressed frustration:

    Even when I get my laboratory report and discharge record, they’re just regular paper printouts with no way for me to read them independently. I feel like I’m missing out on important information, and it’s frustrating.
    [Participant NX, female, 24 years]

    Moreover, hospitals are increasingly relying on touchscreen-based self-service machines for tasks like registration, payment, and report retrieval, which are often inaccessible to people with blindness due to the lack of screen reader compatibility. A participant reflected on this challenge:

    These machines have no screen reader compatibility, so I always need someone to briefly help me retrieve my reports.
    [Participant ZY, male, 26 years]

    Lack of Provider Competencies in Respecting Patients’ Autonomy

    Many participants described a lack of provider competencies in respecting their autonomy, a challenge that gained particular salience within the increasingly digitized health care landscape. Specifically, a pervasive issue identified was the default assumption among many health care providers that patients with blindness lack digital literacy or the ability to independently engage with digital platforms. In an age where digital tools are designed to empower patients with greater access to information and enhanced self-management capabilities, the lack of corresponding adaptation or improvement in provider communication creates a jarring and disempowering contrast. Consequently, while most health care providers displayed positive attitudes, they often lacked the necessary skills to effectively engage with patients who are blind. This knowledge gap can lead to communication barriers, undermining the autonomy that technology aims to support. In extreme cases, some health care staff seemed to view patients with blindness as objects of curiosity rather than patients in need of medical care. A participant summed up such experiences:

    Sometimes doctors ask irrelevant questions, like ‘Can you talk?’ or ‘Can you hear?’ as if they are observing an unfamiliar species instead of treating a patient. These kinds of questions only reinforce the communication barriers and make me feel like I’m not being taken seriously as a person in need of medical care, but rather as an object of curiosity.
    [Participant RL, male, 26 years]

    This lack of provider competencies is also reflected in the fact that health care providers often address the sighted companion instead of the patient with blindness during visits, despite the patient’s digital literacy and capacity for self-advocacy. Participants reported frequent occurrences where providers directed questions and communication to the companion, assuming the patient was unable to independently communicate or make decisions. One participant noted:

    Whenever I have a companion, the doctor naturally chooses to speak to them instead of me. Even after repeatedly reminding the doctors that I am the patient and should be the one answering questions, they still act as if I am incapable of engaging in a normal conversation. It’s frustrating and undermines my autonomy.
    [Participant ML, female, 27 years]

    Data Privacy and Security Concerns

    In the digital age, concerns regarding data privacy and information security are exacerbated for individuals with blindness, who often rely on assistive technologies and other forms of support in accessing health care. These vulnerabilities are not limited to physical interactions with medical staff but extend to broader digital infrastructures, including health platforms, mobile apps, and the public environments where these technologies are used.

    Participants consistently expressed difficulties in independently navigating privacy settings or understanding consent-related information embedded within digital health applications. Complex interfaces, inaccessible terms of service, and a lack of screen reader-compatible designs hinder the ability of these individuals to make informed choices. As a participant noted:

    Sometimes I just agree to everything because I can’t really read the privacy policy with the screen reader. The text layout is all over the place, and I’m not even sure what I’m consenting to.
    [Participant WQ, female, 27 years]

    Moreover, the use of voice-based assistive tools in public or semipublic settings presents distinct privacy risks. Given that these tools often verbalize sensitive health information, individuals in proximity may inadvertently overhear confidential data. This issue is further complicated by the involvement of MVCs, who assist with tasks such as navigating digital platforms, completing forms, or managing payments. While such assistance is often essential, it can inadvertently compromise the individuals’ sense of privacy and control. As 1 participant expressed:

    Having a companion can be helpful, but sometimes I still prefer to visit alone because there are certain things I don’t want others to know. Even if I ask the volunteer to keep the information confidential and not disclose it, I still don’t feel comfortable because they have to help with payments and other tasks, and I end up feeling like I have no privacy.
    [Participant CT, male, 30 years]

    Challenges Related to the Quality and Consistency of Online Companion Support

    While digital platforms offered new avenues for finding companions, this also introduced specific challenges related to the quality and consistency of support. Participants expressed concerns about the inconsistent experience and limited capabilities among MVCs, particularly regarding mobility assistance and understanding patient needs. Digital platforms often facilitated one-time interactions that lacked emotional connection and accountability, leading to varied and sometimes unreliable support:

    That volunteer is in such a rush to finish his task and go home that he barely listens to what I need. There were times when I had to repeat myself multiple times just to get basic assistance.
    [Participant YN, female, 27 years]

    In summary, the findings highlight a complex and often contradictory landscape for educated and digitally literate young people with blindness accessing health care in the digital age. While digital platforms offer significant opportunities for empowerment in areas like appointment booking and companion support, these benefits are consistently counterbalanced by pervasive challenges such as inaccessible interfaces, systemic gaps in provider competence, and exacerbated privacy concerns. This dual reality of simultaneous empowerment and exclusion underscores the heterogeneous nature of the digital divide within vulnerable populations.

    Principal Findings

    To the best of our knowledge, this qualitative study is the first to specifically explore the health care experiences of educated and digitally literate young people with blindness in China within the context of the rapidly evolving digital health landscape. Our findings reveal an “empowered but excluded” dynamic, a paradox that vividly illustrates the lived reality of young people with blindness as a digitally native yet vulnerable generation. On one hand, participants demonstrated that DHTs and online platforms served as valuable tools, empowering them in self-managing their health conditions, proactively accessing health care information, and efficiently finding MVCs. On the other hand, this potential for digital empowerment and enhanced independence was significantly undermined by persistent and systemic barriers. These included reduced offline access to essential services, inaccessible digital and physical health care interfaces, a pervasive lack of provider competencies in respecting patients’ autonomy within a digital context, and heightened concerns regarding data privacy and security exacerbated by digital interactions.

    Comparison With Prior Work: Empowerment

    Our findings corroborate existing literature on the empowering potential of DHTs for individuals with visual impairments. Participants’ ability to effectively use online platforms for appointment booking and to access a wealth of diverse and comprehensive health information aligns with previous research highlighting improved self-management and enhanced health literacy through digital tools [-]. The increased autonomy and freedom participants reported, stemming from their capacity to proactively seek information and support, resonates with the broader discourse on patient empowerment in the digital age [-]. This study extends these insights by specifically demonstrating how educated individuals with blindness, through their active engagement with screen reader technology and other digital tools, convert these opportunities into tangible benefits, challenging simplistic narratives of universal exclusion. The use of online patient communities and peer networks to fill gaps in formal medical guidance, particularly for those with acquired vision loss, further underscores the internet’s role as a crucial lifeline and a source of social support.

    A distinctive contribution of this study is the exploration of digital platforms for finding MVCs. While the importance of companions for individuals with blindness in navigating health care is well-documented [], the use of online platforms (such as Xianyu in China) to locate and coordinate such support represents an innovative, user-driven adaptation. This strategy allows for greater independence in arranging assistance and improves the overall health care–seeking experience, an area previously underexplored in the digital health literature.

    Comparison With Prior Work: Exclusion

    Despite the empowering potential of DHTs, our participants’ experiences reveal a profound exclusion shaped by persistent technological disaffordances, provider interactions that often disregard patient autonomy, and digital privacy concerns, which collectively hinder their independent and equitable health care engagement. Our findings align with prior research showing that many digital health platforms remain largely inaccessible to users with blindness and low vision [,]. This inaccessibility manifests in specific barriers, including websites that fail to meet accessibility standards, visual-centric data displays, and complex interfaces that do not accommodate screen readers or alternative input methods [-]. These limitations are not just technical oversights but reflect a broader systemic neglect of the needs of people with disabilities in the design and development process.

    Furthermore, our study highlights how the interaction between digital and nondigital environments can amplify existing inequalities. Beyond technological inaccessibility, participants frequently encountered health care providers who failed to recognize and respect their autonomy. This finding is consistent with previous research which shows that health care providers may hold stereotypes or paternalistic assumptions about persons who are blind, leading to exclusionary communication practices and undermining patient-centered care [,]. Our study adds to this discourse through the lens of relational autonomy, a framework that emphasizes direct, respectful communication and the clinician-patient relationship as central to supporting patients’ identities and capabilities [-]. When providers fail to engage patients with blindness as active participants in their care, it not only erodes trust but also reinforces structural inequities [,].

    Finally, our findings align with previous research showing that digital privacy poses unique challenges for users with blindness, extending beyond standard concerns about data breaches. When using visual assistance technologies or sharing sensitive data, they may be unable to independently verify what information is being disclosed [,]. This complexity of privacy for users with blindness is tightly interwoven with issues of accessibility, autonomy, and trust. Our study’s contribution lies in showing the compounded effect of these factors on young individuals with blindness in China, revealing that digital empowerment is fragile and easily overridden by systemic barriers within the health care environment.

    Implications for Practice and Policy

    To address the systemic barriers identified in this study and improve the health care experiences of young people with blindness, we propose the following feasible policy and practical implications.

    First, developers and policymakers must enforce adherence to established accessibility standards. For web-based platforms, this includes the Web Content Accessibility Guidelines []. However, as health care services increasingly migrate to mobile apps, it is equally critical to incorporate mobile-specific accessibility guidelines, such as Apple’s Human Interface Guidelines for accessibility []. Research shows that compliance is often partial; therefore, involving users with disabilities directly in a co-design process is critical for identifying specific needs, such as intuitive navigation, accessible onboarding, and the use of clear language [].

    Second, medical education and professional training must be enhanced. Evidence shows that structured communication skills training improves health care professionals’ self-efficacy and performance, leading to more effective and empathetic patient interactions. To address the biases reported by our participants, these training programs must include strategies to help providers recognize and mitigate unconscious bias related to disability, incorporating the perspectives of marginalized patient groups into the training design [-].

    Third, robust and accessible privacy controls are needed. Individuals with blindness require privacy information and controls that are both accessible and understandable, emphasizing the need for clear, multimodal communication and cross-platform compatibility in privacy tools. The development and implementation of accessible authentication methods, such as Braille passwords or universally usable verification tools, should be prioritized.

    Finally, it is crucial to empower young individuals with blindness by building their capacity for self-determination. Organizations led by and for individuals with blindness play a pivotal role in this process by equipping them with self-advocacy and daily living skills []. In the Chinese context, while organizations like the China Disabled Persons’ Federation provide foundational services [], nongovernmental organizations such as the Golden Cane, the Beijing Hongdandan Cultural Service Center, and the One Plus One Disability Charity Group are vital in promoting rights advocacy and independent living skills [-]. A notable gap remains in dedicated health care navigation training programs that integrate digital literacy for e-health services. Closing this gap is essential to ensure that young individuals with blindness in China can fully leverage digital health advancements.

    Strengths, Limitations, and Future Research

    This study’s primary strength lies in its novel contribution to understanding the health care experiences of a previously overlooked subgroup: young, educated individuals with blindness in China. By focusing on this specific demographic, our research offers 3 key contributions. First, it challenges the homogeneous view of vulnerable groups by demonstrating that high-literacy individuals possess unique capabilities and face distinct challenges within the digital ecosystem. Second, it introduces and evidences the “empowered but excluded” paradox, providing a nuanced theoretical framework that moves beyond a simple narrative of digital exclusion. It shows that empowerment and exclusion are not mutually exclusive but coexist, shaped by the interplay between individual agency and systemic barriers. Third, this framework helps distinguish which challenges can be mitigated through individual effort and digital literacy versus those that require fundamental changes in policy, technology design, and clinical practice. The qualitative depth provides rich, contextualized insights that explain how and why these dynamics manifest, laying the groundwork for tailored interventions.

    This study has several limitations. As a qualitative study, the findings are based on a small sample of 12 educated young individuals with blindness in China and may not be generalizable to other age groups, cultural contexts, or countries with different health care and digital infrastructures. The recruitment strategy may have introduced selection bias, potentially attracting participants with more pronounced positive or negative experiences with DHTs. Furthermore, participant recall bias might have influenced their accounts of past health care experiences. Despite these limitations, this study offers rich, contextualized insights into the lived experiences of a typically underrepresented group in digital health research. Future research should explore these issues with larger, more diverse samples, potentially using quantitative or mixed-methods approaches to assess the prevalence of the themes identified and to evaluate the effectiveness of interventions aimed at improving health care accessibility and autonomy for people with blindness in the digital age. Comparative studies across different socioeconomic and cultural settings would also be beneficial.

    Conclusions

    This study explored how educated young adults with blindness in China navigate health care in the digital age, revealing an “empowered but excluded” dynamic. The potential for digital empowerment and enhanced independence, though present, is consistently curtailed by systemic barriers including inaccessible technologies, provider practices that limit patient autonomy, and privacy vulnerabilities. To bridge this gap, our findings underscore the necessity of a multifaceted approach: enhancing technological accessibility through robust standards adherence and inclusive co-design processes; improving health care provider competencies in patient-centered care via targeted training; and empowering young individuals with blindness by building their capacity for self-determination. Implementing these integrated strategies is vital for realizing equitable health care access and true independence for this digitally native yet vulnerable generation.

    The authors would like to express their sincere gratitude to all the participants for their courage and openness in sharing their experiences, and to the key informants for contributing their valuable perspectives.

    This work was supported by the National Natural Science Foundation of China (grant number 72442021) and the University of Chinese Academy of Social Sciences Innovation Fund (grant number 2025-KY-077). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    The data supporting this study are available upon reasonable request from the corresponding author.

    All authors contributed to the paper and approved the final submitted version.

    None declared.

    Edited by Alicia Stone, Amaryllis Mavragani; submitted 24.Jun.2025; peer-reviewed by Kabelo Leonard Mauco, Soyoung Choi; final revised version received 22.Oct.2025; accepted 23.Oct.2025; published 21.Nov.2025.

    © Junling Zhao, Can Su, Xiji Zhu, Cong Cai, Wei Liu, Xiaochen Ma. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 21.Nov.2025.

    This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.

  • Journal of Medical Internet Research

    Age-related macular degeneration (AMD) is a progressive retinal disorder affecting millions of people worldwide []. In its advanced stages, characterized by neovascularization and geographic atrophy (GA), it can lead to significant vision loss, although symptoms may be subtle during the early and intermediate phases []. Based on imaging, the Classification of Atrophy Meetings group has defined atrophy lesion development as incomplete retinal pigment epithelium (RPE) and outer retinal atrophy (iRORA) and complete RPE and outer retinal atrophy (cRORA) []. GA, also known as cRORA, is the endpoint of dry AMD and is characterized by the loss of photoreceptors, RPE, and choriocapillaris [,]. The approval in 2023 of 2 therapies for GA secondary to AMD, pegcetacoplan (Syfovre) [] and avacincaptad pegol [], represents a significant breakthrough in GA treatment. However, the effectiveness of these therapies relies heavily on early detection and the ability to monitor treatment response—a significant unmet need in current clinical practice. The recent approval of complement inhibitors underscores the necessity for precise, reproducible, and practical tools to not only identify GA at its earliest stages but also to objectively track morphological changes over time, thereby evaluating therapeutic efficacy [,]. Artificial intelligence (AI) is uniquely positioned to address this gap by enabling precise, reproducible, and automated quantification of GA progression and treatment response using noninvasive imaging modalities []. Unlike conventional methods that rely on subjective and time-consuming manual assessments, AI algorithms can detect subtle subclinical changes in retinal structures—such as photoreceptor integrity loss, RPE atrophy, and hyperreflective foci—long before they become clinically apparent. Thus, AI-based retinal imaging offers a critical foundation for early detection and timely intervention in GA.

    Various imaging techniques, both invasive and noninvasive, can directly visualize GA lesions. Invasive methods, such as fluorescein angiography, often result in a poor patient experience and entail high costs due to pupil dilation and sodium fluorescein injection. While fluorescein angiography remains the gold standard for assessing neovascular AMD and offers significant diagnostic insights for retinal vascular diseases, in most cases, noninvasive fundus images are used for GA diagnosis and management []. Color fundus photography (CFP), fundus autofluorescence (FAF), and near-infrared reflectance (NIR) are based on 2D images, which can generally quantify the atrophic area but fail to resolve the retinal structure axially []. Compared with fundus imaging, optical coherence tomography (OCT) provides high-resolution, noninvasive 3D images of retinal structures for macular assessment. In addition, conventional B-scan (axial direction) OCT images can be integrated with en-face scans, facilitating the identification of atrophy borders similar to FAF [,]. Nonetheless, manual labeling is tedious, time-consuming, and impractical in a clinical setting []. There is an urgent and unmet need for early detection and management of GA using retinal image modalities. Recent advancements in AI, especially deep learning (DL), present a promising opportunity for enhancing GA detection, classification, segmentation, quantification, and prediction.

    Coined in the 1950s, the term AI originally referred to computer systems capable of performing complex tasks that historically only a human could do. Today, AI refers to the theory and development of computer systems capable of performing tasks that historically required human intelligence, such as recognizing speech, making decisions, and identifying patterns. AI is an umbrella term that encompasses a wide variety of technologies, including machine learning (ML) and DL []. ML is a subfield of AI that uses algorithms trained on datasets to create self-learning models capable of predicting outcomes and classifying information without human intervention []. DL, in turn, is a subset of ML that layers algorithms into “neural networks” with 3 or more layers, somewhat resembling the human brain and enabling machines to perform increasingly complex tasks []. DL algorithms generally have high and clinically acceptable diagnostic accuracy across different areas (ophthalmology, respiratory, breast cancer, etc) in radiology []. Within ophthalmology, DL algorithms have shown reliable performance for detecting multiple findings in macular-centered retinal fundus images []. Automatic GA segmentation therefore plays a vital role in the diagnosis and management of advanced AMD in the clinical setting.
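    To make the “3 or more layers” definition concrete, the sketch below assembles a minimal deep network: a tiny convolutional classifier mapping a single-channel fundus-style image to a binary GA/no-GA score. The architecture is purely illustrative and is not drawn from any reviewed study.

    ```python
    # Minimal sketch (assumed architecture) of a deep network in the sense
    # used above: >= 3 stacked layers, here for a binary GA / no-GA score.
    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3, padding=1),   # layer 1: low-level features
        nn.ReLU(),
        nn.Conv2d(8, 16, kernel_size=3, padding=1),  # layer 2: composite features
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),                     # pool to one value per channel
        nn.Flatten(),
        nn.Linear(16, 1),                            # layer 3: decision layer
    )

    x = torch.randn(1, 1, 224, 224)   # one fake 224x224 grayscale image
    logit = model(x)
    print(torch.sigmoid(logit))       # probability-like GA score
    ```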

    Given the rapid evolution of AI applications in ophthalmology and the growing clinical importance of GA, this study aimed to systematically review the current evidence on AI-based approaches for the detection and management of GA secondary to dry AMD using noninvasive imaging modalities. We aimed to evaluate diagnostic accuracy relative to reference standards and examine methodological challenges to inform the design of future research and clinical implementation.

    Protocol and Registration

    Before starting this systematic review and meta-analysis, we registered a protocol on the PROSPERO website. This review adhered to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) and PRISMA-DTA (PRISMA extension for Diagnostic Test Accuracy) checklists [,].

    Eligibility Criteria

    We included studies using AI algorithms to detect, classify, identify, segment, quantify, or predict GA secondary to AMD from CFP, OCT, OCT angiography, FAF, or NIR. The data were from participants, with or without symptoms, who were diagnosed with GA (or cRORA) secondary to nonexudative AMD. Study designs were not restricted; multicenter or single-center, prospective or retrospective, post hoc analysis, clinical study, or model development studies were all accepted. Eyes with neovascular complications or macular atrophy from causes other than AMD, any previous anti-vascular endothelial growth factor treatment, any confounding retinopathy, or poor image quality were excluded.

    Electronic Search Strategy

    Two consecutive searches were conducted on PubMed, Embase, Web of Science, Scopus, Cochrane Library, and CINAHL. Because this review required the extraction of baseline data and items, and considering the completeness of the data, we did not conduct any in-press or print source searches and excluded conference proceedings and similar materials. The initial search covered each database from inception to December 1, 2024; the updated search covered December 1, 2024, to October 5, 2025. We used a search strategy for the patient (GA) and index tests (AI and retinal images) that had been used in a previous Cochrane Review, without an additional search peer review process []. There were no restrictions on the date of publication. The language was limited to English. In , detailed search strategies for each database are provided. No filters were used. Throughout the search process, we adhered to the PRISMA-S (Preferred Reporting Items for Systematic reviews and Meta-Analyses literature search extension) reporting guidelines [].
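    For illustration only, a Boolean query of the kind described (a patient block combined with index-test blocks) might be structured as below. The actual per-database strategies are those provided in the supplementary file; the specific terms here are our assumptions.

    ```python
    # Illustrative Boolean structure only: a GA (patient) block combined
    # with AI and imaging (index-test) blocks. Terms are assumptions, not
    # the study's actual strategy (see the supplementary search file).
    ga_block = '("geographic atrophy" OR cRORA OR "atrophic age-related macular degeneration")'
    ai_block = '("artificial intelligence" OR "deep learning" OR "machine learning")'
    imaging_block = ('("optical coherence tomography" OR "fundus autofluorescence" '
                     'OR "fundus photography" OR "near-infrared reflectance")')

    query = f"{ga_block} AND {ai_block} AND {imaging_block}"
    print(query)
    ```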

    Selection Process

    All relevant literature was imported into EndNote (version 20; Clarivate Analytics) software, and literature screening was conducted independently by 2 researchers (NS and JL) who specialize in ophthalmology. Duplicates were removed using the software, and the titles and abstracts of the literature were reviewed to identify those relevant to the topic. Finally, the full texts were downloaded and examined, leading to the selection of literature that met the inclusion criteria. In cases of inconsistencies in the final inclusion decisions made by the 2 researchers, a third professional (LL) was consulted to resolve the dispute.

    Data Collection Process

    Using standardized data items, 2 researchers (NS and JL) independently extracted data from the included studies. A third review author (LL) confirmed or adjudicated any discrepancies through group discussion. We retrieved the following data items: (1) study characteristics (author, year, study design, region, and theme); (2) dataset characteristics (databases, source of databases, training/validation/testing ratio, patient number, number of images or volumes, scan number, mean age, clinical registration number, and model evaluation method); (3) image and algorithm characteristics (devices, metrics, image modality, image resolution, and AI algorithms); (4) performance metrics (outcomes, performance of models, ground truth, and performance of the ophthalmologists); and (5) main results. All information was retrieved from the main text and the tables provided in ; we therefore did not seek additional data by contacting the authors or experts. Some studies reported multiple sets of performance data based on subsets of a single dataset, for example, sensitivity, specificity, and accuracy computed on the cross-validation set, the test set, or the development set. In such cases, we referred to the relevant literature to select the optimal set of test performance results []. When a primary study trained the AI model on a development dataset and used an external validation set to determine the performance of the optimal model, we extracted the external validation set performance data [].
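
    As an illustration only, the 5 groups of data items above could be operationalized as a single extraction record like the sketch below (Python; the field names are our shorthand, not the authors’ actual extraction form):

```python
# Illustrative extraction record mirroring the 5 groups of data items
# described above (field names are shorthand, not the authors' form).
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ExtractionRecord:
    # (1) study characteristics
    author: str
    year: int
    study_design: str
    region: str
    theme: str
    # (2) dataset characteristics
    data_sources: list[str] = field(default_factory=list)
    split_ratio: Optional[str] = None        # training/validation/testing
    n_patients: Optional[int] = None
    n_images: Optional[int] = None
    model_evaluation: Optional[str] = None   # eg, "5-fold cross-validation"
    # (3) image and algorithm characteristics
    image_modality: Optional[str] = None     # eg, "SD-OCT"
    image_resolution: Optional[str] = None
    algorithm: Optional[str] = None          # eg, "CNN: U-Net"
    # (4) performance metrics (external validation set preferred)
    metrics: dict[str, float] = field(default_factory=dict)
    ground_truth: Optional[str] = None
    # (5) main results
    main_results: Optional[str] = None
```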

    Risk of Bias and Application

    Working in pairs, we assessed the risk of bias and applicability of the included studies. Studies involving detection, classification, identification, segmentation, or quantification were appraised using the Quality Assessment of Diagnostic Accuracy Studies–Artificial Intelligence (QUADAS-AI) tool [] and the modified QUADAS-2 tool [], while prediction studies were appraised using the Prediction Model Risk of Bias Assessment Tool (PROBAST) [].

    At the time of this review, QUADAS-AI had not yet established a complete specification of items. We therefore referenced the examples provided by QUADAS-AI and the published literature to compile revised QUADAS-AI items comprising 4 domains and 9 leading questions (Table S4 in ). The PROBAST tool covers participants, predictors, outcomes, and analysis, containing 20 signaling questions across 4 domains (Table S5 in ). We also evaluated the applicability of each study based on the leading or signaling questions in the first 3 domains. A study answering “yes” to all signaling questions was considered to have a low risk of bias; if the answer to any signaling question was “no,” there was a potential for bias, and the risk of bias was rated as high. “Indeterminate” grades were applied only when the literature did not report enough detail for the evaluator to reach a judgment. Throughout the process, disagreements between the 2 reviewers (NS and JL) were resolved by consulting the senior reviewer (LL).
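
    The domain-level rating rule described above reduces to a few lines of code; the sketch below summarizes it (Python; a paraphrase of the stated logic, not an official QUADAS-AI or PROBAST implementation):

```python
# Sketch of the domain-level rating rule described above
# (not an official QUADAS-AI or PROBAST implementation).
def rate_domain(answers: list[str]) -> str:
    """answers: "yes", "no", or "indeterminate" per signaling question."""
    if any(a == "no" for a in answers):
        return "high risk"        # any "no" -> potential for bias
    if all(a == "yes" for a in answers):
        return "low risk"         # all "yes" -> low risk of bias
    return "indeterminate"        # insufficient reporting to judge

print(rate_domain(["yes", "yes", "indeterminate"]))  # -> indeterminate
```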

    Data Synthesis

    As very few studies reported the numbers of true positives, true negatives, false positives, and false negatives, we had planned to restrict quantitative analysis to the diagnostic accuracy of AI as a triaging tool for GA secondary to nonexudative AMD. However, a meta-analysis was not performed because of significant methodological heterogeneity across studies, arising from diverse AI architectures, imaging modalities, outcome metrics, and validation protocols. Instead, a systematic review was performed to qualitatively summarize performance trends. This approach allowed a comprehensive evaluation of AI capabilities in the detection and management of GA via noninvasive images.

    Study Selection

    A total of 979 records related to the topic of this systematic review were retrieved across 6 databases using a combination of subject terms and free-text terms. After removing duplicates, 335 records remained, and their titles and abstracts were screened. Excluding records not relevant to the research topic left 200 reports, whose full texts were then downloaded and reviewed in detail against the eligibility criteria. In the final qualitative analysis, 41 studies were included: 10 focused on GA diagnosis, 20 on GA assessment and progression, and 11 on GA prediction. Figure 1 presents the detailed flow diagram of the literature selection.

    Figure 1. PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram for literature selection. GA: geographic atrophy.

    AI in Detecting the Presence of GA

    Ten of the 41 included studies focused on AI-based detection of GA using noninvasive retinal images (Table S1 in ). As listed in , the studies were published from 2018 to 2025. Four of the studies [-] focused on model development, 3 [-] were retrospective studies, and 3 [-] were prospective studies (1 multicenter cohort study, 1 multicenter and low-interventional clinical study, and 1 clinical study). Geographically, half were from the United States, with others from Israel, Italy, Switzerland, Germany, and a multicenter European collaboration. The studies addressed several detection-related tasks: 5 focused solely on GA detection [-,,], 2 covered detection and classification [,], and others integrated detection with quantification or segmentation [,,].

    Table 1. Characteristics of studies evaluating artificial intelligence (AI) models for geographic atrophy (GA) detection using noninvasive retinal imaging.
    Author Study design Region Purpose of the study Source of datasets Number of patients Number of images or scans Model evaluation method Image modality (image resolution) AI algorithms Outcomes Performance of models
    Fineberg et al [] Retrospective cohort study Israel (Petah Tikva) Detection and classification (GA) Rabin Medical Center 113 659 10-fold cross-validation NIR (640*640 pixels) CNNs: ResNet50, EfficientNetB0, ViT_B_16, and YOLOv8 variants. ACC, P, SEN, SPE, F1, IoU, and DSC
    • GA classification:
      EfficientNetB0: ACC=0.9148; P=0.9204; SEN=0.9233; SPE=1.0; F1=0.9147.
    • ResNet50: ACC=0.8815; P=0.8933; SEN=0.8917; SPE=0.9833; F1=0.8812.
    • ViT_B_16: ACC=0.963; P=0.9632; SEN=0.9667; SPE=1.0; F1=0.9629.
    • GA detection: YOLOv8-Large: SEN=0.91; P=0.91; IoU=0.84; DSC=0.88.
    Kalra et al [] Retrospective clinical study United States (Cleveland) Detection, quantification, and segmentation (presence of GA and pixel-wise GA area measurement) the Cole Eye Institute of the Cleveland Clinic 341 900 triple-fold cross-validation SD-OCT (256*256 pixels) CNN: U-Net F1, ACC, P, R, SEN, and SPE
    • GA detection: ACC=0.91, SEN=0.86, SPE=0.94, F1=0.87.
    • GA segmentation: ACC=0.96, SEN=0.95, SPE=0.93, F1=0.82.
    Derradji et al [] Retrospective clinical study Switzerland (Lausanne) Detection and segmentation (RORA) An existing image database of the Medical Retina Department at Jules-Gonin Eye Hospital 57 62 5-fold cross-validation SD-OCT (NR) CNN: U-Net SEN, DSC, P, and Kappa
    • Grader 1: DSC: mean 0.881 (SD 0.074); Precision: mean 0.928 (SD 0.054); SEN: mean 0.850 (SD 0.119); Kappa: mean 0.846 (SD 0.072).
    • Grader 2: DSC: mean 0.844 (SD 0.076); Precision: mean 0.799 (SD 0.133); SEN: mean 0.915 (SD 0.064); Kappa: mean 0.800 (SD 0.082).
    de Vente et al [] Prospective multicenter and low-interventional clinical study (including cross-sectional and longitudinal study part) 20 sites in 7 European countries Detection and quantification (cRORA) The MACUSTAR Study Cohort 168 143 (ZEISS); 167 (Spectralis) NR SD-OCT (512*650 pixels) CNN: U-Net SEN, SPE, PPV, NPV, and Kappa
    • ZEISS: SEN=0.6; SPE=0.964; PPV=0.375; NPV=0.985.
    • Spectralis: SEN=0.625; SPE=0.974; PPV=0.714; NPV=0.961.
    Sarao et al [] Prospective clinical study Italy (Udine) Detection (presence of GA) the Istituto Europeo di Microchirurgia Oculare (IEMO) study 180 540 NR CFP (NR) CNN: Efficientnet_b2 SEN, SPE, ACC, F1, R, AUROC, and AUPRC
    • SEN=100% (95% CI 83.2%-100%); SPE=97.5% (95% CI 86.8%-99.9%); ACC=98.4%; F1=0.976; R=1; AUROC=0.988 (95% CI 0.918-1); AUPRC=0.952 (95% CI 0.719-0.994).
    Keenan et al [] Multicenter and prospective cohort study United States (Maryland) Detection (presence of GA) Age-Related Eye Disease Study (AREDS) dataset 4582 59,812 5-fold cross-validation CFP (512 pixels) CNN: inception v3 ACC, SEN, SPE, P, AUC, and Kappa
    • ACC=0.965 (95% CI 0.959-0.971); Kappa=0.611 (95% CI 0.533-0.689); SEN=0.692 (95% CI 0.560-0.825); SPE=0.978 (95% CI 0.970-0.985); Precision=0.584 (95% CI 0.491-0.676).
    Yao et al [] Model development and evaluation United States (California) Detection (presence of nGA) the Early Stages of AMD (LEAD) study 140 1884 5-fold cross-validation SD-OCT (512*496 pixels) CNN: ResNet18 SEN, SPE, ACC, P, and F1
    • SEN=0.76 (95% CI 0.67-0.84); SPE=0.98 (95% CI 0.96-0.99); PRE=0.73 (95% CI 0.54-0.89); ACC=0.97 (95% CI 0.95-0.98); F1=0.74 (95% CI 0.61-0.84).
    Chiang et al [] Model development United States (California) Detection (complete retinal pigment epithelial and outer retinal atrophy (cRORA) in eyes with AMD) (1) University of Pennsylvania, University of Miami, and Case Western Reserve University; (2) Doheny Image Reading Research Laboratory, Doheny-UCLA (University of California Los Angeles Eye Centers) 71 (training); 649 (testing #1); 60 (testing #2) 188 (training); 1117 (testing #1) 5-fold cross-validation SD-OCT (256*256 pixels) CNN: ResNet18 SEN, SPE, PPV, NPV, AUROC, and AUPRC
    • SEN=0.909 (95% CI 0.778-1.000); SPE=0.553 (95% CI 0.394-0.703); PPV=0.541 (95% CI 0.375-0.707); NPV=0.913 (95% CI 0.778-1.000); AUROC=0.84 (95% CI 0.75-0.94); AUPRC=0.82 (95% CI 0.70-0.93).
    Elsawy et al [] Model development United States (Maryland) Detection (explain decision making and compare methods) The Age-Related Eye Disease Study 2 (AREDS2) Ancillary SD-OCT study from Devers Eye Institute, Emory Eye Center, Duke Eye Center, and the National Eye Institute 311 1284 scans 10-fold cross-validation SD-OCT (128*128 or 224* pixels) 3D CNN: deep-GA-Net ACC, P, R, F1, Kappa, AUROC, and AUPRC
    • ACC=0.93 (95% CI 0.92-0.94); Precision=0.90 (95% CI 0.88-0.91); Recall=0.90 (95% CI 0.89-0.92); F1 score=0.90 (95% CI 0.89-0.91); Kappa=0.80 (95% CI 0.77-0.83); AUROC=0.94 (95% CI 0.93-0.95); AUPRC=0.91 (95% CI 0.90-0.93).
    Treder et al [] Model development Germany (Muenster) Detection and classification (GA) Public database: ImageNet 400 (training); 60 (test set) 400 (training); 60 (test set) NR FAF (NR) Deep CNN: self-learning algorithm SEN, SPE, and ACC
    • Probability score: mean 0.981 (SD 0.048); SEN=100%; SPE=100%; ACC=100%.

    aAI: artificial intelligence.

    bACC: accuracy.

    cAUPRC: area under the precision-recall curve.

    dCNN: convolutional neural network.

    eCFP: color fundus photography.

    fcRORA: complete retinal pigment epithelium and outer retinal atrophy.

    gDSC: dice similarity coefficient.

    hFAF: fundus autofluorescence.

    iIoU: intersection over union.

    jNR: not reported.

    kOCT: optical coherence tomography.

    lPPV: positive predictive value.

    mP: precision.

    nR: recall.

    oSD-OCT: spectral domain OCT.

    pSEN: sensitivity.

    qSPE: specificity.

    rAUROC: area under the receiver operating characteristic curve.

    sAMD: age-related macular degeneration.

    tNPV: negative predictive value.

    Dataset configurations varied: 6 studies used training, validation, and test sets [-,,]; 3 used only training and test sets [,,]; and 1 included a tuning set []. Collectively, these studies involved at least 7132 participants, with ages ranging from 50 to 85 years. Three studies were registered with ClinicalTrials.gov (NCT00734487, NCT01790802, and NCT03349801) [,,]. Cross-validation methods included 5-fold (40% of studies) [,,,], 10-fold (20%) [,], and triple-fold (10%) []; 30% did not report validation details.
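
    For readers less familiar with these validation schemes, the sketch below shows how a 5-fold cross-validation estimate is obtained (Python with scikit-learn; synthetic feature vectors and a generic classifier stand in for the retinal images and DL models of the primary studies):

```python
# 5-fold cross-validation sketch (synthetic data; a generic classifier
# stands in for the DL models of the primary studies).
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))     # 200 "images" as feature vectors
y = rng.integers(0, 2, size=200)   # GA present (1) vs absent (0)

scores = []
for train_idx, test_idx in StratifiedKFold(n_splits=5, shuffle=True,
                                           random_state=0).split(X, y):
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[test_idx], model.predict(X[test_idx])))

print(f"mean accuracy over 5 folds: {np.mean(scores):.3f}")
```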

    Spectral-domain (SD)–OCT was the most frequently used imaging modality (6/10 studies) [-,,,], followed by CFP (2/10) [,] and FAF and NIR (1/10 each) [,]. Most studies applied image preprocessing techniques, such as size standardization, orientation adjustment, intensity normalization, and noise reduction, to improve model performance. DL-based algorithms for GA detection have been developed for multiple image modalities. For example, Derradji et al [] trained a convolutional neural network (CNN), a U-Net based on the EfficientNet-b3 architecture, to predict atrophic signs in the retina. Kalra et al [] and de Vente et al [] also trained DL models based on U-Net. Yao et al [] applied 3D OCT scans with ResNet18 pretrained on the ImageNet dataset, and Chiang et al [] developed a CNN (ResNet18) to improve computational efficiency. Elsawy et al [] proposed Deep-GA-Net, a 3D backbone CNN with a 3D loss-based attention layer, and evaluated the effectiveness of using attention layers. Sarao et al [] used a deep CNN, the EfficientNet_b2 model, which was pretrained on the ImageNet dataset and is well known for its high efficiency and small size. Keenan et al [] established their model using Inception v3, while Treder et al [] applied a deep CNN, a self-learning algorithm, to FAF input images.
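
    A typical preprocessing pipeline of the kind these studies describe might look like the following sketch (Python; the resampling, filtering, and normalization steps and their parameters are illustrative, as the exact steps vary by study):

```python
# Sketch of common preprocessing steps reported by the studies:
# size standardization, noise reduction, and intensity normalization.
import numpy as np
from scipy.ndimage import zoom, median_filter

def preprocess(scan: np.ndarray, target: int = 256) -> np.ndarray:
    # Size standardization: rescale to target x target pixels.
    h, w = scan.shape
    scan = zoom(scan, (target / h, target / w), order=1)
    # Noise reduction: simple median filter (eg, speckle suppression).
    scan = median_filter(scan, size=3)
    # Intensity normalization: zero mean, unit variance.
    return (scan - scan.mean()) / (scan.std() + 1e-8)

normalized = preprocess(np.random.rand(496, 512))  # eg, an SD-OCT B-scan
```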

    A total of 14 performance sets were extracted from the 10 studies. Key metrics included sensitivity, specificity, accuracy, positive predictive value, negative predictive value, intersection over union, area under the receiver operating characteristic curve, area under the precision-recall curve, F1-score, precision, recall, Kappa, and dice similarity coefficient. Six OCT-based studies showed that DL models could detect GA with high accuracy, comparable to human graders [-,,,]. Two studies using CFP also reported strong performance [,], while FAF- and NIR-based approaches demonstrated excellent repeatability and reliability [,].
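
    Most of these detection metrics are simple functions of the 2×2 confusion counts, while the dice similarity coefficient and intersection over union measure mask overlap for segmentation; the sketch below gives both families (Python; formulas only, with a made-up confusion table rather than data from the reviewed studies):

```python
# Common metrics from 2x2 confusion counts (tp, fp, tn, fn) and from
# segmentation masks; formulas only, no study data.
import numpy as np

def detection_metrics(tp, fp, tn, fn):
    return {
        "sensitivity": tp / (tp + fn),   # recall
        "specificity": tn / (tn + fp),
        "precision":   tp / (tp + fp),   # PPV
        "npv":         tn / (tn + fn),
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),
        "f1":          2 * tp / (2 * tp + fp + fn),
    }

def overlap_metrics(pred: np.ndarray, truth: np.ndarray):
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return {"dice": 2 * inter / (pred.sum() + truth.sum()),
            "iou":  inter / union}

print(detection_metrics(tp=86, fp=6, tn=94, fn=14)["sensitivity"])  # 0.86
pred = np.zeros((8, 8), dtype=bool); pred[2:6, 2:6] = True
truth = np.zeros((8, 8), dtype=bool); truth[3:7, 3:7] = True
print(overlap_metrics(pred, truth))  # dice 0.5625, iou ~0.391
```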

    We conducted a thorough evaluation of the methodological quality of the 10 diagnostic studies for the “participant selection,” “index test,” “reference standard,” and “flow and timing” domains at the study level (). None of the studies had an overall low or unclear risk of bias; every study had a high risk of bias in at least 1 of the 4 domains. Regarding “patient selection,” only 4 studies [,,,] described the eligibility criteria; the rest did not report them. One study [] used an open dataset (ImageNet) and did not include a test set. The small sample sizes of 4 studies [,,,] may have resulted in overfitting. In addition, 3 studies [,,] did not report image formats and resolutions. Five studies [,,-] had a high risk of bias in participant selection because the included participants had not only GA secondary to dry AMD but also other unrelated diseases. Regarding the “index test,” only 1 algorithm was externally validated using a different dataset []; all other items were evaluated as low risk.

    Table 2. Methodological quality and applicability assessment for studies on geographic atrophy (GA) detection using the revised Quality Assessment of Diagnostic Accuracy Studies–Artificial Intelligence (QUADAS-AI).
    Study Risk of bias Concerns regarding applicability
    Patient selection Index test Reference standard Flow and timing Patient selection Index test Reference standard
    Chiang et al [] High risk Low risk Low risk Low risk Low risk Low risk Low risk
    Elsawy et al [] High risk High risk Low risk Low risk Low risk Low risk Low risk
    Kalra et al [] High risk High risk Low risk Low risk High risk Low risk Low risk
    Keenan et al [] High risk High risk Low risk Low risk High risk Low risk Low risk
    Sarao et al [] High risk High risk Low risk Low risk High risk Low risk Low risk
    Yao et al [] High risk High risk Low risk Low risk Low risk Low risk Low risk
    Treder et al [] High risk High risk Low risk Low risk High risk Low risk Low risk
    de Vente et al [] High risk High risk Low risk Low risk High risk Low risk Low risk
    Derradji et al [] High risk High risk Low risk Low risk Low risk Low risk Low risk
    Fineberg et al [] High risk High risk Low risk Low risk Low risk Low risk Low risk

    AI in GA Assessment and Progression

    Twenty studies explored AI for GA assessment and progression using noninvasive imaging, published between 2019 and 2025 (Table S2 in ). As shown in , these studies comprised 11 on segmentation [,,-], 2 on algorithm optimization [,], 3 on AMD progression classification [-], and 3 on combined tasks such as identification, segmentation, and quantification [-]; 1 study focused solely on GA quantification []. Retrospective analyses accounted for 9 studies [,,,,,,,,], 7 were model development studies [,-,,,], and the remainder were prospective [,], comparative [], or cross-sectional [] studies. Geographically, contributions came from China (6/20), the United States (7/20), the United Kingdom (2/20), Australia (2/20), France (1/20), Israel (1/20), and Austria (1/20).

    Table 3. Characteristics of studies evaluating artificial intelligence (AI) models for geographic atrophy (GA) assessment and progression using noninvasive retinal imaging.
    Author Study design Region Purpose of the study Source of datasets Number of patients Number of images or scans Model evaluation method Image modality (Image resolution) AI algorithms Outcomes Performance of models
    Pramil et al [] Retrospective review of images United States (Boston) Segmentation (GA lesions) The “SWAGGER” cohort of the non-Exudative Age-Related Macular Degeneration (from New England Eye Center at Tufts Medical Center) 90 126 5-fold cross-validation SS-OCT (500*500 pixels) CNN: U-Net SEN, SPE, and DICE
    • SEN=0.95; SPE=0.91; DSC (vs G1): mean 0.92 (SD 0.11); DSC (vs G2): mean 0.91 (SD 0.12).
    Siraz et al [] Retrospective comparative study United States (North Carolina) Classification (central and noncentral GA) Atrium Health Wake Forest Baptist 104 355 NR SD-OCT (224*224 pixels) CNNs: ResNet50, MobileNetV2, and ViT-B/16 AUROC, F1, and ACC
    • ResNet50: AUROC: mean 0.545 (SD 0.004), F1: mean 0.431 (SD 0.00); ACC: mean 0.756 (SD 0.00).
    • MobileNetV2: AUROC: mean 0.521 (SD 0.016), F1: mean 0.432 (SD 0.002); ACC: mean 0.756 (SD 0.00).
    • ViT-B/16: AUROC: mean 0.718 (SD 0.002), F1: mean 0.602 (SD 0.004); ACC: mean 0.780 (SD 0.005).
    Arslan et al [] Retrospective cohort clinical study Australia (Victoria) Segmentation (GA lesion area) The Center for Eye Research Australia or a private ophthalmology practice diagnosed with GA 51 702 5-fold cross-validation FAF (768*768 or 1536*1536 pixels) CNN: U-Net DSC, DSC loss, SEN, SPE, MAE, ACC, R, and P
    • DSC: mean 0.9780 (SD 0.0124); DSC loss: mean 0.0220 (SD 0.0041); SEN: mean 0.9903 (SD 0.0041); SPE: mean 0.7498 (SD 0.0955); MAE: mean 0.0376 (SD 0.0184); ACC: mean 0.9774 (SD 0.0090); P: mean 0.9837 (SD 0.0116).
    Hu et al [] Retrospective clinical study China (Shenyang) Classification (dry AMD progression phases) Shenyang Aier Eye Hospital 338 3401 5-fold cross-validation SD-OCT (NR) CNNs: EfficientNetV2, DenseNet169, Xception, and ResNet50NF ACC, SEN, SPE, F1, Macro-f1, and Kappa
    • ACC=97.31%; SEN=89.25%; SPE=98.80%; F1=91.21%; Macro-f1=92.08%; Kappa=95.45%.
    Spaide et al [] Retrospective analysis and model comparison United States (Washington) Segmentation (GA lesion area) The SWAGGER cohort from the New England Eye Center at Tufts Medical Center 87 126 scans 5-fold cross-validation SS-OCT (NR) CNN: U-Net DSC
    • UNet-1: 0.82 (95% CI 0.78-0.86).
    • UNet-Avg: 0.88 (95% CI 0.85-0.91).
    • UNet-Drop: 0.90 (95% CI 0.87-0.93).
    Vogl et al [] Retrospective analysis Austria (Vienna) Identification (GA progression after pegcetacoplan treatment) The FILLY trial 156 NR NR SD-OCT (512*512 pixels) CNN: U-Net LPR
    • Compared with sham treatment, monthly: −28% (−42.8 to −9.4).
    • Every other month: −23.9% (−40.2 to −3.0).
    Szeskin et al [] Retrospective analysis Israel (Jerusalem) Identification, quantification (GA lesion) Datasets D1 and D2 from the Hadassah University Medical Center D1: 18; D2: 16 NR 4-fold cross-validation SD-OCT (496*1024 pixels and 496*1536 pixels) CNN: the custom column classification CNN AUROC, P, R, and F1
    • AUROC=0.970; (Segment) P: mean 0.84 (SD 0.11); R: mean 0.94 (SD 0.03); (Lesion) P: mean 0.72 (SD 0.03); R: mean 0.91 (SD 0.18).
    Spaide et al [] Retrospective analysis United States (California) Segmentation (GA lesion area) Proxima A and B Proxima A: 154; Proxima B: 183 Proxima A: 497; Proxima B: 940 NR FAF, NIR (768*768 pixels) Multimodal DL: U-Net; YNet DSC and r2
    • (G1-Ynet)DSC: mean 0.92 (SD 0.09).
    • (G1-Unet)DSC: mean 0.90 (SD 0.09).
    • (G2-Ynet)DSC: mean 0.91 (SD 0.09).
    • (G2-Unet)DSC: mean 0.90 (SD 0.09).
    • (Ynet) r2: 0.981.
    • (Unet) r2: 0.959.
    Al-khersan et al [] Retrospective analysis United States (Texas) Segmentation (GA) The Retina Consultants of Texas and Retina Vitreous Associates 33; 326 367; 348 5-fold cross-validation SD-OCT (512*496 pixels; 200*1024 pixels) CNN: 3D-to-2D U-Net DSC and r2
    • For Spectralis data, DSC=0.826; r2=0.906.
    • For Cirrus data, DSC=0.824; r2=0.883.
    Chu et al [] Prospective study United States (Washington) Identification, segmentation, and quantification (GA) The University of Miami 70; 20; 25 NR NR SS-OCT (512*512 pixels) CNN: U-Net DSC, SEN, and SPE
    • DSC: mean 0.940 (SD 0.032); SEN=100%; SPE=100%.
    Merle et al [] Prospective observational study Australia (Victoria) Quantification (GA) The Center for Eye Research Australia 50 NR NR SD-OCT; FAF (NR) CNN: U-Net Spearman correlation coefficient and SEN
    • (OCT-automatically) Spearman correlation coefficient=0.85 (95% CI 0.71-0.91); SEN=0.59.
    Yang et al [] Model development China (Shenyang) Classification (stage of dry AMD progression) Shenyang Aier Excellence Eye Hospital 1310 16,384 3-fold cross-validation SD-OCT (NR) CNNs: ResNet50, EfficientNetB4, MobileNetV3, Xception ACC, SEN, SPE, and F1
    • ACC(GA): ResNet50=92.35%; EfficientNetB4=93.85%; MobileNetV3=89.64%; Xception=91.16%.
    • ACC (nascent GA): ResNet50=91.56%; EfficientNetB4=89.66%; MobileNetV3=89.43%; Xception=85.22%.
    Ji et al [] Model development China (Nanjing) Segmentation (GA lesion area) Dataset1 and dataset2 8; 54 NR NR SD-OCT (224*224 pixels) Weakly supervised multitask learning: Mirrored X-Net DSC, IoU, AAD, and CC
    • DSC: mean 0.862 (SD 0.080); IoU: mean 0.765 (SD 0.119); AAD: mean 0.090 (SD 0.090); CC: 0.992.
    Ma et al [] Model development China (Jinan) Segmentation (GA lesion area) Dataset1 and dataset2 62 NR 5-fold cross-validation SD-OCT (224*224 pixels) Weakly supervised model: VGG16 DSC, OR, AAD, CC, and AUROC
    • DSC: mean 0.847 (SD 0.087); OR: mean 0.744 (SD 0.126); AAD: mean 0.150 (SD 0.149); CC: 0.969; AUROC: 0.933.
    Royer et al [] Model development France (Issy-Les-Moulineaux) Segmentation (GA lesion area) the Clinical Imaging Center of the Quinze-Vingts Hospital 18 328 8 different random combinations of 12 series to train the model and 6 for the tests NIR (256*256 pixels) Unsupervised neural networks: W-net F1, P, and R
    • F1: mean 0.87 (SD 0.07); P: mean 0.90 (SD 0.07); R: mean 0.85 (SD 0.11).
    Xu et al [] Model development China (Jinan) Segmentation (GA lesion area) dataset1 and dataset2 8 (test I); 56 (test II) 55 (dataset1); 56 (dataset2) NR SD-OCT (1024*512*128 pixels; 1024*200*200 pixels) Self-learning algorithm OR, AAD, and CC
    • OR: mean 84.48% (SD 11.98%); AAD: mean 11.09% (SD 13.61%); CC: 0.9948.
    Zhang et al [] Model development United Kingdom (London) Segmentation and quantification (GA) The FILLY study 200 984 NR SD-OCT (NR) CNN: U-Net DSC, ICC, ACC, SEN, SPE, and F1
    • Approach 1: ACC=0.91 (95% CI 0.89-0.93); F1=0.94 (95% CI 0.92-0.96); SEN=0.99 (95% CI 0.97-1.00); SPE=0.54 (95% CI 0.47-0.61); DSC: mean 0.92 (SD 0.14); ICC=0.94.
    • Approach 2: ACC=0.94 (95% CI 0.92-0.96); F1=0.96 (95% CI 0.94-0.98); SEN=0.98 (95% CI 0.96-1.00); SPE=0.76 (95% CI 0.70-0.82); DSC: mean 0.89 (SD 0.18); ICC: 0.91.
    Liu et al [] Model development China (Wuhan) Segmentation (GA) Wuhan Aier Eye Hospital; the public dataset OCTA500 300 2923 5-fold cross-validation SD-OCT (512*512 pixels) Self-learning algorithm (dual-branch image projection network) Jaccard index, DSC, ACC, P, and R
    • DSC: mean 7.03 (SD 2.73); Jaccard index: mean 80.96 (SD 4.29); ACC: mean 91.84 (SD 2.13); P: mean 87.12 (SD 2.34); R: mean 86.56 (SD 2.92).
    Williamson et al [] Cross-sectional study United Kingdom (London) Segmentation (GA lesion area) INSIGHT Health Data Research Hub at Moorfields Eye Hospital 9875 (OCT); 81 (FAF) NR NR 3D-OCT; FAF (512*512 pixels) Self-learning algorithm PPV
    Safai et al [] Comparative analysis United States (Wisconsin) Identification (the best AI framework for segmentation of GA) AREDS2 study; the GlaxoSmithKline (GSK) study 271(AREDS2); 100(GSK) 601 (AREDS2); 156 (GSK) 5-fold cross-validation FAF (512*512 pixels) CNNs: UNet, FPN, PSPNet, EfficientNet, ResNet, VGG, mViT CC and DSC
    • FPN_EfficientNet: CC=0.98, DSC=0.931.
    • FPN_ResNet: CC=0.98, DSC=0.902.
    • FPN_VGG: CC=0.98, DSC=0.934.
    • FPN_mViT: CC=0.99, DSC=0.939.
    • UNet_EfficientNet: CC=0.98, DSC=0.924.
    • UNet_ResNet: CC=0.97, DSC=0.930.
    • UNet_VGG: CC=0.97, DSC=0.896; UNet_mViT: CC=0.99, DSC=0.938.
    • PSPNet_EfficientNet: CC=0.93, DSC=0.890.
    • PSPNet_ResNet: CC=0.87, DSC=0.877.
    • PSPNet_VGG: CC=0.95, DSC=0.900.
    • PSPNet_mViT: CC=0.98, DSC=0.889.

    aSS-OCT: swept-source OCT.

    bCNN: convolutional neural network.

    cSEN: sensitivity.

    dSPE: specificity.

    eDSC: dice similarity coefficient.

    fNR: not reported.

    gSD-OCT: spectral domain OCT.

    hAUROC: area under the receiver operating characteristic curve.

    iACC: accuracy.

    jCGA: central geographic atrophy.

    kNCGA: noncentral geographic atrophy.

    lFAF: fundus autofluorescence.

    mMAE: mean absolute error.

    nR: recall.

    oP: precision.

    pAMD: age-related macular degeneration.

    qLPR: local progression rate.

    rNIR: near-infrared reflectance.

    sDL: deep learning.

    tr2: Pearson correlation coefficient.

    uOCT: optical coherence tomography.

    vIoU: intersection over union.

    wAAD: absolute area difference.

    xCC: correlation coefficient.

    yOR: overlap ratio.

    zICC: intraclass coefficient.

    aaPPV: positive predictive value.

    abAREDS2: Age-Related Eye Disease Study 2.

    acFPN: Feature Pyramid Network.

    adVGG: Visual Geometry Group.

    aemViT: Mix Vision Transformer.

    Dataset configurations varied: 9 out of 20 studies used training, validation, and test sets [,,-,-]; 11 studies used training and test sets [,,-,]; 2 studies used training and validation sets [,]; 1 study comprised training, tuning, and internal validation sets []; and 2 studies did not specify [,]. Across studies, at least 14,064 participants provided image data for analysis. Less than half of the studies (9/20, 45%) provided demographic information, with the average age of participants ranging from 55 to 94 years. Six studies were registered with ClinicalTrials.gov (NCT01342926, NCT02503332, NCT02479386, NCT02399072, and NCT04469140 [,,,,,]). To assess the generalization ability of the DL model, cross-validation methods included 5-fold (8/20 studies [,,,-,]), 4-fold (1/20 study []), 3-fold (1/20 study []), and other approaches (1/20 study []). Nine studies did not report validation specifics.

    Multiple imaging modalities supported GA assessment: spectral domain optical coherence tomography (SD-OCT) was most common, followed by swept-source OCT (SS-OCT), 3D-OCT, FAF, and NIR. Preprocessing techniques were widely applied to standardize images and improve model performance. Algorithm architectures varied, with U-Net the most frequently used; other approaches included custom CNNs, self-learning algorithms, weakly supervised models, and multimodal networks. For example, Hu et al [] trained DL models (ResNet-50, Xception, DenseNet169, and EfficientNetV2), evaluating them on a single fold of the validation dataset, with all F1-scores exceeding 90%. Yang [] proposed an ensemble DL architecture that integrated 4 different CNNs (ResNet50, EfficientNetB4, MobileNetV3, and Xception) to classify dry AMD progression stages. GA lesions on FAF were automatically segmented using multimodal DL networks (U-Net and Y-Net) fed with FAF and NIR images []. In contrast to these multimodal algorithms, Safai [] investigated 3 distinct segmentation architectures paired with 4 commonly used encoders, yielding 12 different AI model combinations, to determine the optimal AI framework for GA segmentation on FAF images.
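
    Because U-Net recurs throughout these studies, a compressed sketch of its defining encoder-decoder-with-skip-connection pattern may help orient readers (PyTorch; a single down/up level instead of the usual 4 or 5, with illustrative channel sizes, not the configuration of any reviewed study):

```python
# Compressed U-Net sketch: one encoder level, one decoder level, and the
# skip connection that defines the architecture (illustrative sizes only).
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU())

class MiniUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = block(1, 16)                     # encoder features
        self.down = nn.MaxPool2d(2)
        self.bottleneck = block(16, 32)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = block(32, 16)                    # 32 = 16 up + 16 skip
        self.head = nn.Conv2d(16, 1, 1)             # per-pixel GA logit

    def forward(self, x):
        e = self.enc(x)
        b = self.bottleneck(self.down(e))
        d = self.dec(torch.cat([self.up(b), e], dim=1))  # skip connection
        return self.head(d)                              # segmentation map

mask_logits = MiniUNet()(torch.randn(1, 1, 256, 256))    # (1, 1, 256, 256)
```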

    From the 20 studies, 42 performance sets were collected. Common metrics included the correlation coefficient, mean absolute error, Spearman correlation coefficient, intraclass coefficient, overlap ratio, Pearson correlation coefficient (r2), Kappa, specificity (SPE), sensitivity (SEN), accuracy, positive predictive value (PPV), F1-score, precision, recall, intersection over union, and dice similarity coefficient (DSC). Regarding the segmentation, classification, identification, and quantification of GA in SD-OCT, 12 studies demonstrated performance comparable to that of clinical experts [,,,,,-,,]. AI was also capable of efficiently detecting, segmenting, and measuring GA in SS-OCT, 3D-OCT, and FAF images, according to 4 studies [,,,]. AI likewise achieved good segmentation performance for GA in FAF and NIR images on clinical data [,,].

    We performed a comprehensive assessment of the methodological quality of the 20 GA assessment and progression studies across 4 domains (). Only 8 studies detailed the eligibility criteria in the “patient selection” category, while the others did not report them. Three of the studies [-] lacked complete datasets, and 3 others [,,] had small datasets or limited volumes of data. In addition, 3 studies [,,] failed to provide information on image formats or resolutions. Two studies [,] were rated as high risk regarding patient selection because the participants included other types of dry AMD (drusen, nascent GA). In terms of applicability, 18 studies were classified as low risk, while 2 were deemed high risk concerning patient selection. Concerning the “index test,” only 3 algorithms underwent external validation with a different dataset [,,]; all other items were evaluated as low risk.

    Table 4. Methodological quality and applicability summary of geographic atrophy (GA) assessment and progression studies using the revised Quality Assessment of Diagnostic Accuracy Studies–Artificial Intelligence (QUADAS-AI).
    Study Risk of bias Concerns regarding applicability
    Patient selection Index test Reference standard Flow and timing Patient selection Index test Reference standard
    M Hu [] High risk High risk Low risk Low risk High risk Low risk Low risk
    JK Yang [] High risk High risk Low risk Low risk High risk Low risk Low risk
    A Safai [] Low risk Low risk Low risk Low risk Low risk Low risk Low risk
    WD Vogl [] High risk High risk Low risk Low risk Low risk Low risk Low risk
    A Szeskin [] High risk High risk Low risk Low risk Low risk Low risk Low risk
    ZD Chu [] High risk High risk Low risk Low risk Low risk Low risk Low risk
    ZX Ji [] High risk High risk Low risk Low risk Low risk Low risk Low risk
    X Ma [] High risk High risk Low risk Low risk Low risk Low risk Low risk
    C Royer [] High risk High risk Low risk Low risk Low risk Low risk Low risk
    T Spaide [] High risk Low risk Low risk Low risk Low risk Low risk Low risk
    T Spaide [] High risk High risk Low risk Low risk Low risk Low risk Low risk
    DJ Williamson [] Low risk High risk Low risk Low risk Low risk Low risk Low risk
    RB Xu [] High risk High risk Low risk Low risk Low risk Low risk Low risk
    J Arslan [] Low risk High risk Low risk Low risk Low risk Low risk Low risk
    V Pramil [] Low risk High risk Low risk Low risk Low risk Low risk Low risk
    GY Zhang [] High risk Low risk Low risk Low risk Low risk Low risk Low risk
    DA Merle [] High risk High risk Low risk Low risk Low risk Low risk Low risk
    H Al-khersan [] Low risk High risk Low risk Low risk Low risk Low risk Low risk
    S Siraz [] Low risk High risk Low risk Low risk Low risk Low risk Low risk
    XM Liu [] High risk High risk Low risk Low risk Low risk Low risk Low risk

    AI in Predicting GA Lesion Area and Progression

    Eleven studies applied AI to predict GA lesion growth and progression from noninvasive imaging (Table S3 in ). These studies were published between 2021 and 2025, with some information provided in . The study designs consisted of 6 retrospective studies [-], 2 model development studies [,], 2 post hoc analyses [,], and 1 clinical evaluation of a DL algorithm []. Participants or images came from various regions: 6 studies were based in the United States [,-,], 3 in Austria [-], 1 in Switzerland [], and 1 involved multiple centers in China and the United States []. Research aims focused on GA growth prediction [,,-,,], combined prediction and evaluation of lesion features [], treatment response assessment [], and integrated segmentation-prediction tasks [,].

    Table 5. Characteristics of studies evaluating artificial intelligence (AI) models for geographic atrophy (GA) prediction using noninvasive retinal imaging.
    Author Study design Region Purpose of the study Source of datasets Number of patients Number of images or scans or cubes Model evaluation method Image modality (resolution) AI algorithms Outcomes Performance of models
    Gigon et al [] Retrospective monocentric study Switzerland (Lausanne) Prediction (RORA progression) Jules Gonin Eye Hospital 119 NR NR SD-OCT (384*384 pixels) CNN: EfficientNet-b3 DSC
    • 0-6 months: 0.84
    • 6-12 months: 0.84
    • >12 months: 0.89
    Dow et al [] Retrospective cohort study United States (Atlanta, Georgia, Portland, Oregon, North Carolina; Maryland, Raleigh, Morrisville, Cary); United Kingdom (Durham, South Durham) Prediction (iAMD to GA within 1 year) 3 independent datasets from AREDS2 and a tertiary referral center and associated satellites 316; 53; 48 1085; 53; 48 5-fold cross-validation SD-OCT (512 *1000 pixels) CNN: Inception v3 SEN, SPE, PPV, NPV, ACC
    • SEN: 0.91 (95% CI 0.74-0.98); SPE: 0.80 (95% CI 0.63-0.91); PPV: 0.78 (95% CI 0.70-0.85); NPV: 0.92 (95% CI 0.90-0.95); ACC: 0.85 (95% CI 0.87-0.91)
    Cluceru et al [] Retrospective clinical study; observation study United States (California) Prediction and evaluation (GA growth rate and GA features related to shape and size) The lampalizumab phase 3 clinical trials and an accompanying observational study 1041; 255 NR 5-fold cross-validation FAF (384 * 384 pixels) CNN: VGG16 r2
    • Full FAF images: 0.44 (95% CI 0.36-0.49)
    • Rim only: 0.37 (95% CI 0.35-0.4)
    • Lesion only: 0.34 (95% CI 0.31-0.36)
    • Background only: 0.3 (95% CI 0.27-0.33)
    • Mask only: 0.27 (95% CI 0.24-0.29)
    Anegondi et al [] Retrospective clinical study; observation study United States (California) Prediction and prognosis (GA lesion area and GA growth rate after lampalizumab treatment) The lampalizumab phase 3 clinical trials and an accompanying observational study 1279; 443; 106; 169 NR 5-fold cross-validation SD-OCT, FAF (512*512 pixels) CNN: Inception v3 r2 GA prediction:

    • FAF-only: 0.98 (95% CI 0.97‐0.99)
    • OCT-only: 0.91 (95% CI 0.87‐0.95),
    • Multimodal: 0.94 (95% CI 0.92‐0.96).

    GA growth rate:

    • FAF-only: 0.65 (95% CI 0.52‐0.75),
    • OCT-only: 0.36 (95% CI 0.29‐0.43),
    • Multimodal: 0.47 (95% CI 0.40‐0.54)
    Salvi et al [] Retrospective analysis United States (California) Prediction (the 1 year region of growth of GA lesions) The following lampalizumab clinical trials and prospective observational studies 597 NR NR FAF (768*768 pixels or 1536*1536 pixels) CNN: U-Net P, R, DSC, r2 Whole lesion:

    • P: mean 0.70 (SD 0.12); R: mean 0.73 (SD 0.12); DSC: mean 0.70 (SD 0.09); r2: 0.79
    Yoshida [] Retrospective analysis United States (California) Prediction (GA progression) Three prospective clinical trials 1219; 442 NR 5-fold cross-validation 3D OCT (496*1024*49 voxels) CNNs: (1) en-face intensity maps; (2) SLIVER-net; (3) a 3D CNN; and (4) en-face layer thickness and between-layer intensity maps from a segmentation model r2
    • GA lesion area: En-face intensity map: 0.91; SLIVER-net: 0.83; 3D DenseNet: 0.90; OCT EZ and RPE thickness map: 0.90;
    • GA growth rate: En-face intensity map: 0.33; SLIVER-net: 0.33; 3D DenseNet: 0.35; OCT EZ and RPE thickness map: 0.35.
    GS Reiter [] Post hoc analysis Austria (Vienna) Prediction (GA lesions progression) the phase II randomized controlled trial FILLY 134 268 scans 5-fold cross-validation FAF, NIR, SD-OCT (NR) CNN: PSC-UNet ACC, Kappa, concordance index
    • ACC: 0.48; Kappa: 0.23; concordance index: 0.69
    J Mai [] Post hoc analysis Austria (Vienna) Segmentation, quantification, and prediction (GA lesion and progression) The phase 2 FILLY clinical trial and the Medical University of Vienna (MUV) 113; 100 226; 967 5-fold cross-validation SD-OCT, FAF (768*768 and 1536*1536 pixels) CNN: U-Net DSC, Hausdorff distance, ICC
    • MUV: DSC: mean 0.86 (SD 0.12); Hausdorff distance: mean 0.54 (SD 0.45);
    • FILLY: DSC: mean 0.91 (SD 0.05); Hausdorff distance: mean 0.38 (SD 0.40)
    YH Zhang [] Model development China (Nanjing); United States (California) Prediction (GA growth) The Byers Eye Institute of Stanford University; the Jiangsu Provincial People’s Hospital 22; 3 86 cubes; 33 cubes Leave-one-out cross-validation SD-OCT (178*270 pixels) Recurrent neural network: the bi-directional long-short term memory network; CNN: 3D-UNet DSC, CC
    • Scenario I: DSC: 0.86; CC: 0.83;
    • Scenario II: DSC: 0.89; CC: 0.84;
    • Scenario III: DSC: 0.89; CC: 0.86;
    • Scenario IV: DSC: 0.92; CC: 0.88;
    • Scenario V: DSC: 0.88; CC: 0.85;
    • Scenario VI: DSC: 0.90; CC: 0.86
    SX Wang [] Model development United States (California) Segmentation and prediction (GA lesion area and GA progression) The University of California—Los Angeles 147 NR 8-fold cross-validation SD-OCT, FAF (512*512 pixels) CNN: U-Net SEN, SPE, ACC, OR
    • ACC: 0.95; SEN: 0.60; SPE: 0.96; OR: 0.65
    J Mai [] Clinical evaluation of a DL-based algorithm Austria (Vienna) Prediction (GA lesions progression) The Medical University of Vienna 100 967 5-fold cross-validation SD-OCT, FAF (NR) CNN: PSC-UNet DSC, MAE, and r2
    • 0-1 year: DSC: mean 0.25 (SD 0.16); MAE: mean 0.13 (SD 0.11)
    • 1-2 years: DSC: mean 0.38 (SD 0.20); MAE: mean 0.25 (SD 0.24);
    • 2-3 years: DSC: mean 0.38 (SD 0.21); MAE: mean 0.35 (SD 0.34);
    • >3 years: DSC: mean 0.37 (SD 0.23); MAE: mean 0.72 (SD 0.48)

    aRORA: retinal pigment epithelial and outer retinal atrophy.

    bNR: not reported.

    cOCT: optical coherence tomography.

    dCNN: convolutional neural network.

    eDSC: dice similarity coefficient.

    fAMD: age-related macular degeneration.

    gAREDS2: Age-Related Eye Disease Study 2.

    hSEN: sensitivity.

    iSPE: specificity.

    jPPV: positive predictive value.

    kNPV: negative predictive value.

    lACC: accuracy.

    mFAF: fundus autofluorescence.

    nr2: Pearson correlation coefficient.

    oP: precision.

    pR: recall.

    qEZ: ellipsoid zone.

    rRPE: retinal pigment epithelium.

    sNIR: near-infrared reflectance.

    tICC: intraclass coefficient.

    uCC: correlation coefficient.

    vOR: overlap ratio.

    wMAE: mean absolute error.

    Dataset structures varied: 3 out of 11 studies used training-validation-test splits [,,]; 2 out of 11 used training-test sets [,]; 3 out of 11 used training-validation sets [,,]; and the rest adopted development-holdout [,] or development-holdout-independent test configurations []. In total, 6706 participants were included across studies. Fewer than half of the studies (4/11, 36.4%) reported demographic information, with mean ages spanning from 74 to 83 years [,,,]. Six studies [-,,] were ethically approved and registered on ClinicalTrials.gov under the following identifiers: NCT02503332, NCT02247479, NCT02247531, NCT02479386, NCT01229215, and NCT02399072. The DL models’ generalizability was assessed using leave-one-out cross-validation in 1 study [], 5-fold cross-validation in 7 studies [,,,,-], and 8-fold cross-validation in 1 study []. The remaining 2 studies [,] did not specify the cross-validation methodology.

    Studies used 3D-OCT, SD-OCT, NIR, and FAF images, primarily sourced from Heidelberg, Zeiss, and Bioptigen devices. While most studies reported image parameters, 2 did not specify resolution details [,]. Commonly used DL architectures included Inception v3 [,], PSC-UNet [,], U-Net [,,], EfficientNet-b3 [], and VGG16 []. In addition, some studies introduced novel approaches, such as en-face intensity maps, SLIVER-net, a 3D CNN, and a recurrent neural network, to improve GA progression forecasting.
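
    As an illustration of how a recurrent network can consume longitudinal imaging for growth forecasting, the sketch below feeds per-visit feature vectors (as a CNN encoder might produce) into a bidirectional long short-term memory network with a regression head (PyTorch; all dimensions and the single-value growth-rate output are our assumptions, not the architecture of any reviewed study):

```python
# Sketch: per-visit image features -> bidirectional LSTM -> growth-rate
# regression (dimensions are assumptions, not any study's architecture).
import torch
import torch.nn as nn

class GrowthForecaster(nn.Module):
    def __init__(self, feat_dim=64, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)  # predicted growth rate (mm2/y)

    def forward(self, visits):                # (batch, n_visits, feat_dim)
        out, _ = self.lstm(visits)
        return self.head(out[:, -1, :])       # regress from final time step

# Three follow-up visits, each already encoded by a CNN into 64 features.
pred = GrowthForecaster()(torch.randn(8, 3, 64))  # -> (8, 1)
```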

    Across the various image modalities, datasets, and follow-up durations, we gathered 31 sets of performance data from the 11 studies. The performance metrics included the Hausdorff distance, concordance index, overlap ratio, SEN, SPE, accuracy, mean absolute error, Kappa, DSC, precision, PPV, recall, r2, and negative predictive value. For single image modalities (3D-OCT, SD-OCT, or FAF), DL algorithms predicted GA growth rate and progression with excellent performance characteristics, comparable to trained experts [-,-]. Multimodal approaches combining FAF, NIR, and SD-OCT further showed feasibility for individualized lesion growth prediction and localization [,-].
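
    Of these metrics, the Hausdorff distance is perhaps the least familiar: it captures the worst-case boundary disagreement between a predicted and a reference lesion outline. A minimal sketch of its symmetric form, assuming boundaries represented as 2D point sets, follows (Python with SciPy):

```python
# Symmetric Hausdorff distance between two lesion boundaries
# (synthetic point sets; SciPy provides the directed variant).
import numpy as np
from scipy.spatial.distance import directed_hausdorff

pred_boundary = np.random.rand(50, 2)  # (x, y) points on predicted outline
true_boundary = np.random.rand(60, 2)  # points on reference outline

d = max(directed_hausdorff(pred_boundary, true_boundary)[0],
        directed_hausdorff(true_boundary, pred_boundary)[0])
print(f"symmetric Hausdorff distance: {d:.3f}")
```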

    In this systematic review, we used the PROBAST tool to rigorously evaluate prediction models across 4 domains, addressing 20 signaling questions for each paper reviewed. Within the “participants” domain, all studies used appropriate data sources; however, only 6 studies [-,,] clearly outlined their inclusion and exclusion criteria, leaving the others unclear. In terms of “predictors,” these were defined and evaluated similarly for all participants, had no connection to outcome data, and were available at baseline. All studies were rated “yes” on the questions about outcome measurement methods, definitions, interference factors, and measurement time intervals. Concerning “analysis,” Dow [] and Zhang [] used small datasets with insufficient numbers of participants. Although Zhang performed internal validation of the model, which was constructed with bidirectional long short-term memory network and CNN frameworks, the lack of external validation notably limits its generalizability. Two studies, by Salvi [] and Yoshida [], lacked independent and external validation. Gigon [] did not explicitly address the handling of missing data, data complexities, or model overfitting. All other items were evaluated as low risk, and the applicability of the studies was universally ranked as low risk (Table S1 in ).

    Principal Findings

    This systematic review evaluated the performance of AI, particularly DL algorithms, in detecting and managing GA secondary to dry AMD using noninvasive imaging modalities. Our findings demonstrate that AI models exhibit strong capabilities in accurately detecting, segmenting, quantifying, and predicting GA progression from OCT, FAF, CFP, and NIR imaging, achieving diagnostic accuracy comparable to that of human experts. However, this review also identified several methodological challenges, such as limited sample sizes, inconsistent annotation standards, and a general lack of external validation, which may hinder the clinical generalizability and practical application of these models. Despite these limitations, AI-based tools show significant potential for future use by both specialists and nonspecialists in primary and specialty care settings.

    AI in Detecting GA With OCT, FAF, NIR, and CFP Images

    Ten studies published between 2018 and 2025 were included, involving at least 7132 participants aged 50 to 85 years. Half of the studies were conducted in the United States, while the others originated from Israel and European countries. SD-OCT was the most frequently used imaging modality (6/10 studies), followed by CFP (2/10 studies), NIR (1/10 studies), and FAF (1/10 studies). Image preprocessing techniques, such as standardization of size, orientation, and intensity, as well as noise reduction, were consistently applied to enhance model stability and training efficiency. However, 3 studies did not report critical image parameters, such as resolution, potentially limiting reproducibility. DL-based algorithms, particularly CNNs, were the primary methodologies used for GA detection. Cross-validation techniques, such as 5-fold and 10-fold methods, were used in most studies to assess model robustness, though 3 studies did not report validation strategies. AI, particularly DL algorithms, holds significant promise for the detection of GA using noninvasive imaging modalities: OCT, CFP, NIR, and FAF each demonstrated robust diagnostic potential, with performance metrics rivaling or exceeding human expertise.

    AI for GA Management With OCT, FAF, and NIR Images

    A total of 20 studies (14,064 participants) were published between 2019 and 2025, focusing on themes such as GA segmentation, classification, quantification, and progression prediction. The research designs and geographic regions were diverse: the studies included retrospective analyses (9/20), model development studies (7/20), and prospective, comparative, or cross-sectional studies (4/20). Significant contributions came from China (6/20) and the United States (7/20), with additional studies from the United Kingdom (2/20), Australia (2/20), France (1/20), Israel (1/20), and Austria (1/20). The studies used a variety of imaging modalities to assess GA, including SD-OCT, FAF, NIR, SS-OCT, and 3D-OCT. DL algorithms demonstrated remarkable performance in GA management tasks, with U-Net the most commonly used architecture. Multimodal approaches combined FAF and NIR images with DL networks to improve segmentation accuracy. Performance metrics, such as DSC, Kappa, SEN, SPE, and accuracy, consistently showed strong diagnostic accuracy, with several studies achieving performance comparable to clinical experts.

    Eleven studies with 6706 participants, published between 2021 and 2025, concentrated on the application of AI for predicting and segmenting GA lesions, as well as their growth and progression. The methodologies were diverse, including retrospective studies, model development studies, post hoc analyses, and clinical algorithm evaluation. Participants or images were gathered from the United States, Austria, Switzerland, and multiple centers in China and the United States, providing broad geographic representation. Demographic information was reported in fewer than half of the studies, with mean ages ranging from 74 to 83 years. Imaging modalities, such as 3D-OCT, SD-OCT, NIR, and FAF, were obtained from devices including Bioptigen, Heidelberg Spectralis HRA+OCT, and Cirrus OCT. While image preprocessing parameters were consistent across most studies, some did not specify image resolution. Various CNN architectures and advanced frameworks, such as bidirectional long short-term memory networks, were used. DL algorithms exhibited excellent predictive capabilities, with multimodal approaches enabling individualized GA lesion growth prediction.

    Noninvasive Image Analysis Techniques for GA

    GA, a late-stage form of dry AMD, is marked by the irreversible loss of photoreceptors, RPE, and choriocapillaris [,]. The application of noninvasive imaging modalities has revolutionized the detection and management of GA. A comparative summary of AI performance across these modalities is provided in Table S2 in . CFP serves as a standard initial assessment tool, useful for screening and early detection; it identifies GA lesions as well-defined regions of RPE hypopigmentation with visible underlying choroidal vessels []. FAF imaging using a blue excitation wavelength (488 nm) visualizes metabolic changes at the level of the photoreceptor-RPE complex and is practical for assessing GA lesion size and progression via hypo-autofluorescence []. GA lesions on NIR (787-820 nm, a longer wavelength than FAF and less harmful to the eye) typically appear brighter than nonatrophic areas []. In addition, NIR can help detect the boundaries of foveal lesions, where image contrast is lower on FAF []. Recently, the Classification of Atrophy Meeting group recommended that atrophy in both patients with and those without neovascular AMD be defined based on specific drusen characteristics and other anatomical features, noting that it is most easily characterized by OCT [,]. OCT stands out as the gold standard for GA detection and classification, providing high-resolution, cross-sectional, and en face images of the retina and choroid. SD-OCT is widely used in research and clinical trials, offering precise measurement of GA area and growth rates, while SS-OCT and 3D-OCT offer superior structural insights and potential for AI-driven automation [,,]. Despite the higher cost and technical complexity of advanced OCT technologies, their detailed GA assessment capabilities make them indispensable tools in both clinical practice and research. Furthermore, OCT provides volumetric (3D) structural data, unlike the 2D en face projections of FAF, CFP, and NIR; this allows AI to learn not just the surface appearance of atrophy but also the cross-sectional structural alterations that define and precede GA []. As technology advances, the integration of AI and further developments in imaging techniques are expected to enhance the utility of these modalities, overcoming current limitations and expanding their applications in ophthalmology.
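
    The volumetric point can be made concrete with a toy example: a 3D OCT cube can be collapsed into a 2D en face projection by averaging along depth, discarding exactly the cross-sectional information that OCT uniquely provides (NumPy; the synthetic cube and its axis ordering are assumptions):

```python
# Sketch: collapsing a 3D OCT cube into a 2D en face projection
# (synthetic volume; axis order is an assumption).
import numpy as np

volume = np.random.rand(128, 496, 512)  # (B-scans, depth, A-scans per B-scan)
en_face = volume.mean(axis=1)           # average over depth -> 2D map
print(en_face.shape)                    # (128, 512): depth detail is lost
```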

    Advantages and Challenges of AI Architectures in Clinical Workflow

    AI addresses critical limitations of traditional GA monitoring, such as labor-intensive manual grading and intergrader variability []. Automated algorithms enable rapid, standardized analysis of large fundus image datasets, reducing clinician workload and enhancing reproducibility []. Furthermore, our review revealed a clear trend in the choice of model architectures tailored to specific clinical tasks; a critical analysis of these architectures is provided in Table S3 in . With the advancement of AI algorithm architectures, numerous studies have emerged that use these technologies to identify atrophy caused by various retinal diseases and to evaluate treatment outcomes through image analysis. Miere et al [] trained a DL-based classifier to automatically distinguish GA from atrophy secondary to inherited retinal diseases on FAF according to etiology, using 2 approaches (a trained-and-validated method and a 10-fold cross-validation method), achieving good accuracy and excellent area under the receiver operating characteristic curve (AUROC) values. In addition, a post hoc analysis examined the association between treatment and changes in photoreceptor lamina thickness in patients with GA secondary to AMD, supporting an effect of pegcetacoplan on photoreceptors on OCT by demonstrating that treatment was linked with reduced outer retinal thinning []. Similarly, DL-based OCT image analysis assessed the therapeutic effectiveness of complement component 3 inhibition in delaying GA progression, with findings indicating decreased photoreceptor thinning and loss []. Recent studies demonstrating the application of AI algorithms in imaging further validate their potential as reliable supplements to human expertise in the diagnosis and management of GA.

    Technical Challenges and Limitations

    Despite the promising advancements in AI for GA detection and management, several technical challenges and limitations persist. A significant limitation of OCT-based AI models is their difficulty in distinguishing GA secondary to AMD from other forms of retinal atrophy; thus, the findings may not generalize to broader AMD cases or other retinal diseases, which limits their clinical applicability. In addition, images from different OCT devices show significant variability and imprecision, compromising consistent data acquisition []. Another major challenge is the variability in algorithm performance caused by differences in training data, image acquisition protocols, and disease definitions; these differences reduce reproducibility and limit practical deployment. For instance, the absence of standardized reporting in AI studies can result in discrepancies when interpreting results and hinder comparisons between different models. Moreover, despite the high performance metrics (eg, SEN, SPE, DSC>0.85, and AUROC>0.95) reported by many studies, methodological limitations remain. All diagnostic studies included in this review were assessed as high risk in at least 1 domain (10/10), only 1 GA assessment study (1/20) was evaluated as low risk across all domains, and several prediction studies (7/11) were ranked as high or unclear risk in at least 1 domain, primarily due to small or nonrepresentative datasets and a lack of detailed reporting on image preprocessing and external validation. These methodological shortcomings may lead to an overestimation of AI model performance and reduced robustness, thereby decreasing the generalizability of the findings and limiting confidence in their real-world applicability. Future studies should prioritize larger, more diverse datasets; implement rigorous validation frameworks covering detection, segmentation, quantification, and prediction accuracy; and conduct prospective, multicenter validation studies to improve clinical applicability and generalizability. Furthermore, adherence to established reporting guidelines for AI studies (such as the Standards for Reporting Diagnostic Accuracy-AI and the Checklist for Artificial Intelligence in Medical Imaging [,]) would improve comprehension and transparency, allow more meaningful comparisons between systems, and facilitate meta-analyses.

    Real-World Implications and Research Contributions

    Overall, despite some limitations, AI is constantly evolving and holds great potential for transformation in the health care sector []. AI can accelerate existing forms of medical analysis; however, its algorithms require further testing to be fully trusted. Clinically, AI-based automated tools show strong potential to facilitate early detection, precise quantification, and prediction of GA progression, thereby reducing the burden on retinal specialists and improving diagnostic consistency. Furthermore, DL algorithms have demonstrated effectiveness in identifying retinal image features associated with cognitive decline, dementia, Parkinson disease, and cardiovascular risk factors []. These findings indicate that AI-based retinal image analysis holds promise for transforming primary care and systemic disease management. Although most AI applications remain in the validation phase, the integration of AI with multimodal imaging, novel biomarkers, and emerging therapeutics holds promise for transforming clinical management paradigms in GA and advancing personalized medicine. Future efforts should focus on developing standardized datasets, improving algorithmic generalizability, and conducting real-world validation studies to fully integrate AI into routine ophthalmic practice.

    Conclusion

    AI, especially DL-based algorithms, holds considerable promise for the detection and management of GA secondary to dry AMD, with performance comparable to that of trained experts. This systematic review synthesizes and critically appraises the current evidence, highlighting that AI’s capabilities extend across GA management, from initial detection and precise segmentation to the forecasting of lesion progression, and informs future research directions. Meanwhile, with the development of C5 inhibitors, AI-based noninvasive fundus image analysis is expected to detect, identify, and monitor GA at an early stage, thereby widening the window of opportunity for treatment. AI has strong potential to augment and streamline clinical workflows by offering automated, reproducible analysis that can assist clinicians in managing large volumes of imaging data; however, more studies are needed to further validate its effectiveness, repeatability, and accuracy.

    The authors declared that artificial intelligence (AI) or AI-assisted technologies were not used in the writing process of this manuscript.

    This research was funded by the Central High-Level Traditional Chinese Medicine Hospital Project of the Eye Hospital, China Academy of Chinese Medical Sciences (grant no GSP5-82); the National Natural Science Foundation of China (grant no 82274589); the Science and Technology Innovation Project, China Academy of Chinese Medical Sciences (grant no CI2023C008YG); the Institute-level Research Launch Fund of the Eye Hospital, China Academy of Chinese Medical Sciences (grant no kxy-202402); and the Special Project for the Director of the Business Research Office (grant no 2020YJSZX-2).

    All data generated or analyzed during this study are included in this published article and its multimedia appendix files.

    None declared.

    Edited by Amaryllis Mavragani, Stefano Brini; submitted 26.Jul.2025; peer-reviewed by Jiale Zhang, Xiaolong Liang; final revised version received 11.Oct.2025; accepted 11.Oct.2025; published 21.Nov.2025.

    © Nannan Shi, Jiaxian Li, Mengqiu Shang, Weidao Zhang, Kai Xu, Yamin Li, Lina Liang. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 21.Nov.2025.

    This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.


  • Why trouble for the biggest foreign buyer of U.S. debt could ripple through America’s bond market


    By Vivien Lou Chen

    Developments in Japan are creating a risk that investors in the U.S. Treasury market may one day pull the rug out by keeping more of their savings at home

    Why turmoil around Japan’s new government could wash up in U.S. financial markets.

    Recent developments overseas have the potential to complicate the White House’s agenda to bring down borrowing costs, while heightening competition for investors in the U.S. and Japanese bond markets.

    Aggressive fiscal-stimulus efforts by the cabinet of Japan’s first female prime minister, Sanae Takaichi, have triggered a spike in long-dated Japanese government-bond yields and further weakness in the yen (USDJPY) in the past few weeks. The situation is being likened to the September-October 2022 episode in the U.K., which stemmed from a collapse in confidence over a package of unfunded tax cuts proposed by then-Prime Minister Liz Truss’s government.

    Read: Liz Truss redux? Simultaneous drop for Japanese currency and bonds draws eerie parallels

    The U.S. needs to manage the cost of interest payments given a more than $38 trillion national debt, and this is a primary motivation for why the Trump administration wants to bring down long-term Treasury yields. Last week, Treasury Secretary Scott Bessent said in a speech in New York that the U.S. is making substantial progress in keeping most market-based rates down. He also said the 10-year “term premium,” or additional compensation demanded by investors to hold the long-dated maturity, is basically unchanged. Longer-duration yields matter because they provide a peg for borrowing rates used by U.S. households, businesses and the government.

    Developments in Japan are now creating the risk that U.S. yields could rise alongside Japan’s yields. This week, Japanese government-bond yields hit their highest levels in almost two decades, with the country’s 10-year rate BX:TMBMKJP-10Y spiking above 1.78% to its highest level in more than 17 years. The 40-year yield BX:TMBMKJP-40Y climbed to an all-time high just above 3.7%.

    In the U.S., 2-year BX:TMUBMUSD02Y and 10-year yields BX:TMUBMUSD10Y finished Friday’s session at the lowest levels of the past three weeks, at 3.51% and almost 4.06%, respectively. The 30-year U.S. yield BX:TMUBMUSD30Y fell to 4.71%, its lowest level since Nov. 13.

    There’s a risk now that U.S. yields may not fall as much as they otherwise might after factoring in market-implied expectations for a series of interest-rate cuts by the Federal Reserve into 2026.

    Japan’s large U.S. footprint

    Treasury yields are not going to necessarily follow rates on Japanese government bonds higher “on a one-for-one basis,” but there might be a limit on how low they can go, said Adam Turnquist, chief technical strategist at LPL Financial. He added that the impact of Japanese developments on the U.S. bond market could take years to play out, but “we care now because of the direction Japan’s policy is going in” and the possibility that this impact might occur even sooner.

    Some of the catalysts that usually tend to push Treasury yields lower, such as any commentary from U.S. monetary policymakers that suggests the Fed might be inclined to cut rates, “might be muted because of the increased value of foreign debt,” Turnquist added.

    U.S. government debt rallied for a second day on Friday, pushing yields lower, after New York Fed President John Williams said there is room to cut interest rates in the near term.

    All three major U.S. stock indexes DJIA SPX COMP closed higher Friday, but notched sharp weekly losses, as investors tried to shake off doubts over the artificial-intelligence trade.

    The troubling spike in yields on Japanese government bonds hasn’t fully spilled over into the U.S. bond market yet, but it remains a risk. “A repeat of the Truss episode is what people are afraid of,” said Marc Chandler, chief market strategist and managing director at Bannockburn Capital Markets.

    Concerns about Japan gained added significance on Friday, when Takaichi’s cabinet approved a 21.3 trillion yen (or roughly $140 billion) economic stimulus package, which Reuters described as lavish. The amount of new spending being injected into the country’s economy from a supplementary budget, much of which is not repurposed from existing funds, is 17.7 trillion yen ($112 billion).

    Anxiety over Takaichi’s stimulus efforts has resulted in a Japanese yen that has weakened against its major peers and fallen to a 10-month low ahead of Friday’s session, and in a spike in the country’s long-dated yields. Yields on 30-year BX:TMBMKJP-30Y Japanese government debt have risen this month to 3.33%.

    Japan is the biggest foreign holder of Treasurys, with a roughly 13% share, according to the most recent data from the U.S. Treasury Department, and the concern is that the country’s investors might one day pull the rug by keeping more of their savings at home.

    Bond-auction anxiety

    Earlier in the week, a weak 20-year auction in Japan was cited as one reason why U.S. Treasury yields were a touch lower in early New York trading, suggesting that demand for U.S. government paper remained in place. Global investors are often incentivized to move their money based on which country offers the highest yields and best overall value.

    “The conventional wisdom is that as yields rise in Japan, the Japanese are more likely to keep their savings at home rather than export it,” Chandler said. “The Japanese have been buyers of Treasurys and U.S. stocks, and if they decide to keep their money at home, those U.S. markets could lose a bid.”

    For now, Japanese investors, a group that includes insurers and pension funds, appear to be continuing to export their savings by buying more foreign government debt like Treasurys. Data from the U.S. Treasury Department shows that as of September, Japanese investors held just under $1.19 trillion in Treasurys, a figure that has climbed every month this year and is up from about $1.06 trillion last December.

    One reason for this is the exchange rate. The yen has depreciated against almost every major currency this year. Japanese investors have been buying U.S. Treasurys because they can diversify against the yen, which is the weakest of the G-10 currencies on an unhedged basis, according to Chandler.

    If concerns about the Takaichi government’s stimulus efforts translate into even higher yields in Japan, this could incentivize local investors to keep more of their savings at home, but might also mean rising yields for countries like the U.S.

    -Vivien Lou Chen

    This content was created by MarketWatch, which is operated by Dow Jones & Co. MarketWatch is published independently from Dow Jones Newswires and The Wall Street Journal.

    (END) Dow Jones Newswires

    11-21-25 1609ET

    Copyright (c) 2025 Dow Jones & Company, Inc.


  • Zoomer: Powering AI Performance at Meta’s Scale Through Intelligent Debugging and Optimization


    • We’re introducing Zoomer, Meta’s comprehensive, automated debugging and optimization platform for AI. 
    • Zoomer works across all of our training and inference workloads at Meta and provides deep performance insights that enable energy savings, workflow acceleration, and efficiency gains in our AI infrastructure. 
    • Zoomer has delivered training-time reductions and significant QPS improvements, making it the de facto tool for AI performance optimization across Meta’s entire AI infrastructure.

    At the scale that Meta’s AI infrastructure operates, poor performance debugging can lead to massive energy inefficiency, increased operational costs, and suboptimal hardware utilization across hundreds of thousands of GPUs. The fundamental challenge is achieving maximum computational efficiency while minimizing waste. Every percentage point of utilization improvement translates to significant capacity gains that can be redirected to innovation and growth.

    Zoomer is Meta’s automated, one-stop-shop platform for performance profiling, debugging, analysis, and optimization of AI training and inference workloads. Since its inception, Zoomer has become the de facto tool across Meta for GPU workload optimization, generating tens of thousands of profiling reports daily for teams across all of our apps.

    Why Debugging Performance Matters

    Our AI infrastructure supports large-scale and advanced workloads across a global fleet of GPU clusters, continually evolving to meet the growing scale and complexity of generative AI.

    At the training level, it supports a diverse range of workloads, including the models that power ads ranking, content recommendations, and GenAI features.

    At the inference level, we serve hundreds of trillions of AI model executions per day.

    Operating at this scale means putting a high priority on eliminating GPU underutilization. Training inefficiencies delay model iterations and product launches, while inference bottlenecks limit our ability to serve user requests at scale. Removing resource waste and accelerating workflows helps us train larger models more efficiently, serve more users, and reduce our environmental footprint.

    AI Performance Optimization Using Zoomer

    Zoomer is an automated debugging and optimization platform that works across all of our AI model types (ads recommendations, GenAI, computer vision, etc.) and both training and inference paradigms, providing deep performance insights that enable energy savings, workflow acceleration, and efficiency gains.  

    Zoomer’s architecture consists of three essential layers that work together to deliver comprehensive AI performance insights: 

    Infrastructure and Platform Layer

    The foundation provides the enterprise-grade scalability and reliability needed to profile workloads across Meta’s massive infrastructure. This includes distributed storage systems using Manifold (Meta’s blob storage platform) for trace data, fault-tolerant processing pipelines that handle huge trace files, and low-latency data collection with automatic profiling triggers across thousands of hosts simultaneously. The platform maintains high availability and scale through redundant processing workers and can handle huge numbers of profiling requests during peak usage periods.

    Analytics and Insights Engine

    The core intelligence layer delivers deep analytical capabilities through multiple specialized analyzers. This includes: GPU trace analysis via Kineto integration and NVIDIA DCGM, CPU profiling through StrobeLight integration, host-level metrics analysis via dyno telemetry, communication pattern analysis for distributed training, straggler detection across distributed ranks, memory allocation profiling (including GPU memory snooping), request/response profiling for inference workloads, and much more. The engine automatically detects performance anti-patterns and also provides actionable recommendations.

    Visualization and User Interface Layer

    The presentation layer transforms complex performance data into intuitive, actionable insights. This includes interactive timeline visualizations showing GPU activity across thousands of ranks, multi-iteration analysis for long-running training workloads, drill-down dashboards with percentile analysis across devices, trace data visualization integrated with Perfetto for kernel-level inspection, heat map visualizations for identifying outliers across GPU deployments, and automated insight summaries that highlight critical bottlenecks and optimization opportunities.

    The three essential layers of Zoomer’s architecture.

    How Zoomer Profiling Works: From Trigger to Insights

    Understanding how Zoomer conducts a complete performance analysis provides insight into its sophisticated approach to AI workload optimization.

    Profiling Trigger Mechanisms

    Zoomer operates through both automatic and on-demand profiling strategies tailored to different workload types. For training workloads, which involve multiple iterations and can run for days or weeks, Zoomer automatically triggers profiling around iteration 550-555 to capture stable-state performance while avoiding startup noise. For inference workloads, profiling can be triggered on-demand for immediate debugging or through integration with automated load testing and benchmarking systems for continuous monitoring.
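
    As a rough illustration of iteration-windowed profiling (not Zoomer’s internal implementation), the sketch below uses the public PyTorch profiler to skip startup noise and record a short stable-state window, mirroring the iteration 550-555 behavior described above. The tiny model and synthetic batches are placeholder stand-ins.

```python
# Stable-state profiling window in the spirit of Zoomer's training trigger.
import torch
from torch.profiler import profile, schedule, ProfilerActivity

model = torch.nn.Linear(64, 8)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

def train_step(batch):
    opt.zero_grad()
    loss = model(batch).sum()
    loss.backward()
    opt.step()

activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(ProfilerActivity.CUDA)

prof_schedule = schedule(
    wait=550,   # skip startup noise for the first 550 iterations
    warmup=1,   # one warmup step so profiler overhead settles
    active=5,   # record 5 steady-state iterations
    repeat=1,   # one profiling window per job
)

with profile(
    activities=activities,
    schedule=prof_schedule,
    on_trace_ready=lambda p: p.export_chrome_trace("trace.json"),
) as prof:
    for _ in range(600):
        train_step(torch.randn(32, 64))
        prof.step()  # advance the schedule once per training iteration
```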

    Comprehensive Data Capture

    During each profiling session, Zoomer simultaneously collects multiple data streams to build a holistic performance picture (a minimal telemetry-sampling sketch follows this list):

    • GPU Performance Metrics: SM utilization, GPU memory utilization, GPU busy time, memory bandwidth, Tensor Core utilization, power consumption, and clock frequencies via DCGM integration.
    • Detailed Execution Traces: Kernel-level GPU operations, memory transfers, CUDA API calls, and communication collectives via PyTorch Profiler and Kineto.
    • Host-Level Performance Data: CPU utilization, memory usage, network I/O, storage access patterns, and system-level bottlenecks via dyno telemetry.
    • Application-Level Annotations: Training iterations, forward/backward passes, optimizer steps, data loading phases, and custom user annotations.
    • Inference-Specific Data: Rate of inference requests, server latency, active requests, GPU memory allocation patterns, request latency breakdowns via Strobelight’s Crochet profiler, serving parameter analysis, and thrift request-level profiling.
    • Communication Analysis: NCCL collective operations, inter-node communication patterns, and network utilization for distributed workloads.
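
    As a stand-in for the DCGM-based collection described above (Zoomer’s actual pipeline is internal to Meta), similar GPU counters can be sampled with the public NVML bindings. The loop below assumes an NVIDIA GPU is visible and the nvidia-ml-py package is installed.

```python
# Samples a subset of the GPU counters listed above via public NVML bindings.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

for _ in range(5):  # five one-second samples
    util = pynvml.nvmlDeviceGetUtilizationRates(handle)      # SM + memory util
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)             # bytes used/total
    power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000  # milliwatts -> W
    sm_clock = pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_SM)
    print(f"sm={util.gpu}% mem={mem.used / 2**30:.1f} GiB "
          f"power={power_w:.0f} W sm_clock={sm_clock} MHz")
    time.sleep(1.0)

pynvml.nvmlShutdown()
```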

    Distributed Analysis Pipeline

    Raw profiling data flows through sophisticated processing systems that deliver multiple types of automated analysis (a minimal straggler heuristic is sketched after this list), including:

    • Straggler Detection: Identifies slow ranks in distributed training through comparative analysis of execution timelines and communication patterns.
    • Bottleneck Analysis: Automatically detects CPU-bound, GPU-bound, memory-bound, or communication-bound performance issues.
    • Critical Path Analysis: Systematically identifies the longest execution paths to focus optimization efforts on highest-impact opportunities.
    • Anti-Pattern Detection: Rule-based systems that identify common efficiency issues and generate specific recommendations.
    • Parallelism Analysis: Deep understanding of tensor, pipeline, data, and expert parallelism interactions for large-scale distributed training.
    • Memory Analysis: Comprehensive analysis of GPU memory usage patterns, allocation tracking, and leak detection.
    • Load Imbalance Analysis: Detects workload distribution issues across distributed ranks and provides recommendations for optimization.
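
    The straggler heuristic below is illustrative only, not Zoomer’s actual detector, which also weighs communication patterns and execution timelines: it simply flags ranks whose median step time sits far above the fleet median, using synthetic per-rank timings.

```python
# Illustrative straggler flag over synthetic per-rank iteration times.
import numpy as np

rng = np.random.default_rng(0)
step_times = rng.normal(1.0, 0.02, size=(64, 100))  # [rank, iteration] seconds
step_times[13] += 0.3                               # inject one slow rank

per_rank = np.median(step_times, axis=1)            # robust per-rank summary
fleet_median = np.median(per_rank)
mad = np.median(np.abs(per_rank - fleet_median))    # robust spread estimate

# Flag ranks far above the fleet median; 10x MAD is an arbitrary cutoff.
stragglers = np.where(per_rank > fleet_median + 10 * mad)[0]
print("straggler candidates:", stragglers.tolist())  # expected: [13]
```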

    Multi-Format Output Generation

    Results are presented through multiple interfaces tailored to different user needs: interactive timeline visualizations showing activity across all ranks and hosts, comprehensive metrics dashboards with drill-down capabilities and percentile analysis, trace viewers integrated with Perfetto for detailed kernel inspection, automated insights summaries highlighting key bottlenecks and recommendations, and actionable notebooks that users can clone to rerun jobs with suggested optimizations.

    Specialized Workload Support

    For massive distributed training of specialized workloads such as GenAI, Zoomer includes a purpose-built platform for LLM workloads, offering specialized capabilities including GPU efficiency heat maps and N-dimensional parallelism visualization. For inference, specialized analysis covers single-GPU models today and will soon expand to massive distributed inference across thousands of servers.

    A Glimpse Into Advanced Zoomer Capabilities

    Zoomer offers an extensive suite of advanced capabilities designed for different AI workload types and scales. While a comprehensive overview of all features would require multiple blog posts, here’s a glimpse at some of the most compelling capabilities that demonstrate Zoomer’s depth:

    Training Powerhouse Features:

    • Straggler Analysis: Helps identify ranks in distributed training jobs that are significantly slower than others, causing overall job delays due to synchronization bottlenecks. Zoomer provides information that helps diagnose root causes like sharding imbalance or hardware issues.
    • Critical Path Analysis: Identification of the longest execution paths in PyTorch applications, enabling accurate performance-improvement projections.
    • Advanced Trace Manipulation: Sophisticated tools for compression, filtering, combination, and segmentation of massive trace files (2GB+ per rank), enabling analysis of previously impossible-to-process large-scale training jobs.

    Inference Excellence Features:

    • Single-Click QPS Optimization: A workflow that identifies bottlenecks and triggers automated load tests with one click, reducing optimization time while delivering QPS improvements of +2% to +50% across different models, depending on model characteristics. 
    • Request-Level Deep Dive: Integration with Crochet profiler provides Thrift request-level analysis, enabling identification of queue time bottlenecks and serving inefficiencies that traditional metrics miss.
    • Realtime Memory Profiling: GPU memory allocation tracking, providing live insights into memory leaks, allocation patterns, and optimization opportunities (a minimal sketch follows this list).
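
    As a minimal sketch of the kind of live memory tracking described in the last bullet (Zoomer’s profiler is far richer), the public PyTorch allocator counters can be sampled around a placeholder inference step. This assumes a CUDA device; run_inference is a hypothetical stand-in for a real serving call.

```python
# Samples public PyTorch allocator counters around a placeholder inference
# step; a steadily climbing "allocated" reading is a leak signature.
import torch

model = torch.nn.Linear(1024, 1024).cuda()
x = torch.randn(64, 1024, device="cuda")

def run_inference(batch):  # hypothetical stand-in for a serving call
    with torch.no_grad():
        return model(batch)

torch.cuda.reset_peak_memory_stats()
for step in range(10):
    run_inference(x)
    alloc_mib = torch.cuda.memory_allocated() / 2**20
    peak_mib = torch.cuda.max_memory_allocated() / 2**20
    print(f"step {step}: allocated={alloc_mib:.1f} MiB peak={peak_mib:.1f} MiB")
```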

    GenAI Specialized Features:

    • LLM Zoomer for Scale: A purpose-built platform supporting 100k+ GPU workloads with N-dimensional parallelism visualization, GPU efficiency heat maps across thousands of devices, and specialized analysis for tensor, pipeline, data, and expert parallelism interactions.
    • Post-Training Workflow Support: Enhanced capabilities for GenAI post-training tasks including SFT, DPO, and ARPG workflows with generator and trainer profiling separation.

    Universal Intelligence Features:

    • Holistic Trace Analysis (HTA): Advanced framework for diagnosing distributed training bottlenecks across communication overhead, workload imbalance, and kernel inefficiencies, with automatic load balancing recommendations.
    • Zoomer Actionable Recommendations Engine (Zoomer AR): Automated detection of efficiency anti-patterns with machine learning-driven recommendation systems that generate auto-fix diffs, optimization notebooks, and one-click job re-launches with suggested improvements.
    • Multi-Hardware Profiling: Native support across NVIDIA GPUs, AMD MI300X, MTIA, and CPU-only workloads with consistent analysis and optimization recommendations regardless of hardware platform.

    Zoomer’s Optimization Impact: From Debugging to Energy Efficiency

    Performance debugging with Zoomer creates a cascading effect that transforms low-level optimizations into massive efficiency gains. 

    The optimization pathway flows from: identifying bottlenecks → improving key metrics → accelerating workflows → reducing resource consumption → saving energy and costs.

    Zoomer’s Training Optimization Pipeline

    Zoomer’s training analysis identifies bottlenecks in GPU utilization, memory bandwidth, and communication patterns. 

    Examples of Training Efficiency Wins:

    • Algorithmic Optimizations: We delivered power savings through systematic efficiency improvements across the training fleet by fixing reliability issues for low-efficiency jobs.
    • Training Time Reduction Success: In 2024, we observed a 75% training-time reduction for Ads relevance models, leading to a 78% reduction in power consumption.
    • Memory Optimizations: One-line code changes fixing inefficient memory copies identified by Zoomer delivered 20% QPS improvements with minimal engineering effort.

    Zoomer’s Inference Optimization Pipeline

    Inference debugging focuses on latency reduction, throughput optimization, and serving efficiency. Zoomer identifies opportunities in kernel execution, memory access patterns, and serving parameter tuning to maximize requests per GPU.

    Inference Efficiency Wins:

    • GPU and CPU Serving-Parameter Improvements: Automated GPU and CPU bottleneck identification and parameter tuning, leading to a 10% to 45% reduction in power consumption.
    • QPS Optimization: GPU trace analysis used to boost serving QPS and optimize serving capacity.

    Zoomer’s GenAI and Large-Scale Impact

    For massive distributed workloads, even small optimizations compound dramatically. 32k GPU benchmark optimizations achieved 30% speedups through broadcast issue resolution, while 64k GPU configurations delivered 25% speedups in just one day of optimization.

    The Future of AI Performance Debugging

    As AI workloads expand in size and complexity, Zoomer is advancing to meet new challenges on several innovation fronts: broadening unified performance insights across heterogeneous hardware (including MTIA and next-gen accelerators), building advanced analyzers for proactive optimization, enabling inference performance tuning through serving-parameter optimization, and democratizing optimization with automated, intuitive tools for all engineers. As Meta’s AI infrastructure continues its rapid growth, Zoomer plays an important role in helping us innovate efficiently and sustainably.



  • Here are real AI stocks to invest in and speculative ones to avoid



  • Analog Devices to Participate in the UBS Global Technology Conference


    WILMINGTON, Mass., Nov. 21, 2025 /PRNewswire/ — Analog Devices, Inc. (Nasdaq: ADI) today announced that the Company’s Executive Vice President & Chief Financial Officer, Richard Puccio, will discuss business topics and trends at the UBS Global Technology Conference, taking place at the Phoenician Hotel, located in Scottsdale, Arizona on Tuesday, December 2, 2025, at 10:15 a.m. MST.

    The webcast for the conference may be accessed live via the Investor Relations section of Analog Devices’ website at investor.analog.com. An archived replay will also be available following the webcast for at least 30 days.

    About Analog Devices, Inc.

    Analog Devices, Inc. (NASDAQ: ADI) is a global semiconductor leader that bridges the physical and digital worlds to enable breakthroughs at the Intelligent Edge. ADI combines analog, digital, and software technologies into solutions that help drive advancements in digitized factories, mobility, and digital healthcare, combat climate change, and reliably connect humans and the world. With revenue of more than $9 billion in FY24 and approximately 24,000 people globally, ADI ensures today’s innovators stay Ahead of What’s Possible. Learn more at www.analog.com and on LinkedIn and Twitter (X).

    For more information, please contact:
    Jeff Ambrosi
    Senior Director of Investor Relations
    Analog Devices, Inc.
    781-461-3282
    [email protected]

    SOURCE Analog Devices, Inc.



  • The first-ever 3x levered bitcoin funds are launching in Europe next week. The timing couldn’t be worse.


    By Gordon Gottsegen

    Bitcoin has dropped as much as 35% from its October high

    Triple leverage and elevated volatility around bitcoin could be a dangerous combination.

    Bitcoin is having a bad week on top of a rough month: The benchmark cryptocurrency is currently down more than 33% from its October all-time high of $126,272.76, wiping out more than $1.2 trillion in market cap.

    If you somehow foresaw this huge drop in the price of bitcoin (BTCUSD) and took out a bearish position, you could’ve made a lot of money on your trade. If you guessed incorrectly, you would’ve lost a lot. However, thanks to a new leveraged financial product, traders will be able to up the ante on bitcoin’s swings, either multiplying their money or potentially losing everything.

    Exchange-traded-fund analyst Eric Balchunas posted on X that he spotted listings for new triple-leveraged bitcoin and ether (ETHUSD) ETFs, meaning that traders would be able to triple their exposure to the upside or downside for both cryptocurrencies, depending on which product they trade.

    According to Balchunas, these are the first triple-leveraged bitcoin and ether ETFs to launch, and they are coming to European markets next week.

    Leveraged ETFs have been growing in popularity recently, with ETF providers filing for triple- and even quintuple-leveraged ETFs in the U.S. These products offer traders the ability to multiply earnings on daily price swings, but they also risk taking heavy losses if the underlying asset swings too far in the wrong direction.

    “These leveraged products, not only in crypto but stocks also, are nothing more than gambling instruments. The SEC needs to fulfill their duty of investor protection and limit them. What’s to stop issuers from going 5x, 10x or 100x?” Joe Saluzzi, co-head of equity trading at Themis Trading, told MarketWatch.

    Leveraged ETFs bring even more risk to volatile assets like crypto. There have been days in the past where specific crypto tokens have more than doubled, or bear markets where they’ve dropped by half. All it takes is a 33% move in the wrong direction for someone trading a triple-leveraged ETF to lose their money.

    “If it’s tracking correctly, you’d be down 99%. … You will basically be wiped out,” Todd Sohn, ETF strategist at Strategas Asset Management, told MarketWatch. “The odds of that happening when you are playing with more volatile instruments has clearly increased.”

    Some investors in Europe learned this lesson the hard way last month. On Oct. 6, the GraniteShares 3x Short AMD Daily exchange-traded product was terminated by its issuer after shares of Advanced Micro Devices Inc. (AMD) rose by more than 33% intraday. This product was an inverse fund; such funds climb when the price of the underlying asset falls, and decline when the price of the underlying asset rises. So, the 33% intraday swing in AMD shares drove the value of the exchange-traded product to zero.

    Although the triple-leveraged bitcoin and ether ETFs would be the first in the market, investors in the U.S. already have access to double-leveraged ETFs like 2x Bitcoin Strategy ETF BITX, ProShares Ultra Bitcoin ETF BITU and T-Rex 2X Long Bitcoin Daily Target ETF BTCL. And on a week like this one, when bitcoin is falling, those ETFs fall even more: All three of them have lost more than 20% this week.

    Most of these leveraged financial products are intended to be held for short durations. According to filings for the products, the advertised leverage only applies to daily moves, meaning holding them for longer periods can cause returns to deviate meaningfully from what one might expect. Over the long term, some of these funds have even underperformed their underlying stocks or assets.
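
    A toy calculation makes the daily-reset mechanics concrete. The return paths below are invented for illustration and do not model any listed product.

```python
# Daily-reset leverage: each day the fund returns 3x the underlying's daily
# move, so a single -33.4% day wipes it out, and choppy paths erode value
# even when the underlying finishes flat. Return paths are invented.
daily_returns = [0.10, -0.0909, 0.05, -0.0476]  # underlying round-trips to ~1.0

underlying = fund_3x = 1.0
for r in daily_returns:
    underlying *= 1 + r
    fund_3x *= max(0.0, 1 + 3 * r)  # fund value cannot go below zero

print(f"underlying: {underlying:.4f}")  # ~1.0000: flat over the stretch
print(f"3x fund:    {fund_3x:.4f}")     # ~0.93: volatility decay on a flat path

# Wipeout case: a single one-day drop of 33.4% in the underlying
print(f"after a -33.4% day: {max(0.0, 1 + 3 * -0.334):.4f}")  # 0.0000
```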

    Leveraged ETFs appeal to a specific kind of trader with a very high tolerance for risk. But not everyone who comes across these products may know that. That’s why Sohn said it’s important for traders to “know the ingredients” of leveraged ETFs before trading them, adding that investors should only trade what they’re willing to lose.

    “If you are betting – trading, whatever you want to call it – on a levered product and you get wiped out, make sure it’s not going to ruin your life,” Sohn said.

    -Gordon Gottsegen

    This content was created by MarketWatch, which is operated by Dow Jones & Co. MarketWatch is published independently from Dow Jones Newswires and The Wall Street Journal.

    (END) Dow Jones Newswires

    11-21-25 1559ET

    Copyright (c) 2025 Dow Jones & Company, Inc.


  • Dollar slips against yen but heads for broad weekly rise – Reuters

    1. Dollar slips against yen but heads for broad weekly rise  Reuters
    2. Dollar weakens against yen  Business Recorder
    3. Policy Showdown: Yen’s Decline Challenges Japan’s Economic Unity  Bitget
    4. Japan finance minister signals chance of currency intervention  TradingView
    5. Japanese Yen struggles to lure buyers; seems vulnerable amid BoJ uncertainty  FXStreet



  • Oil Falls on Ukraine Peace Plan as Russia Sanctions Set to Start – Bloomberg.com

    1. Oil Falls on Ukraine Peace Plan as Russia Sanctions Set to Start  Bloomberg.com
    2. Oil prices settle down at lowest in a month as US seeks Russia-Ukraine peace deal  Reuters
    3. Crude oil price today: WTI price bearish at European opening  FXStreet
    4. Oil Prices Have Fallen Sharply  Rigzone
    5. Bearish Momentum Builds in Oil Markets as China Stockpiles Crude  Crude Oil Prices Today | OilPrice.com
