    Lessons from AI for data integration in neuroscience

    In the previous article, I argued that advancing data integration in neuroscience requires incorporating resting-state spontaneous activity into each experiment, framing it as ‘adhesive dots.’ Here, I extend that discussion by drawing strategic lessons from the success of large language models (LLMs) and by making the earlier claims concrete from the perspective of data.

    What can LLMs teach us about data integration?

    The worldwide construction of data centers illustrates how AI development has advanced through scaling – expanding data volume, model size, and computational resources. LLM performance improves according to power-law scaling when all three expand together. (1) Furthermore, scaling model size and dataset size in tandem has been shown to be near-optimal. (2)
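
    As a rough illustration of what power-law scaling means in practice, the sketch below fits a curve of the form loss ≈ a · N^(−α) by linear regression in log-log space, where a power law appears as a straight line. The model sizes, the coefficient 5.0, and the exponent 0.1 are made-up values chosen only for illustration; they are not taken from the cited studies.

```python
import numpy as np


def fit_power_law(x, y):
    """Fit y ≈ a * x**(-alpha) by least squares in log-log space."""
    slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
    return float(np.exp(intercept)), float(-slope)


# Hypothetical, noise-free example: loss falling as 5.0 * N**(-0.1).
model_sizes = np.array([1e6, 1e7, 1e8, 1e9])
losses = 5.0 * model_sizes ** (-0.1)

a, alpha = fit_power_law(model_sizes, losses)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```

    With real measurements the points would scatter around the fitted line, and the exponent would depend on which quantity (data, parameters, or compute) is being scaled.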

    Yet progress has required more than scale: data cleaning and curation have been equally crucial. GPT-3 demonstrated the power of large-scale training on filtered Common Crawl data, (3) and T5 achieved major improvements by building the C4 dataset after aggressively removing duplicates and low-quality text. (4) PaLM 2 also reported a significant impact of data quality. (5) At the same time, concerns have been raised about the potential exhaustion of high-quality web text. (6) Efficiency efforts such as Mixture-of-Experts (MoE) architectures and compact models are being pursued in parallel, motivated in part by applications such as edge AI (running AI directly on devices or chips). (7,8) In short, AI has advanced through scaling and curation, while efficiency has evolved along a complementary path.

    Current landscape and challenges in neuroscience

    By contrast, neuroscience has yet to fully address the ‘limits of data accumulation’ or the ‘optimization of modeling.’ Advances in optical methods, such as two-photon calcium imaging, now enable large-scale simultaneous measurements at single-neuron resolution. Recently, functional data spanning multiple fields of view have been integrated with EM connectomics, yielding analyses on the order of 75,000 neurons in total – marking major progress. (9)

    The greater challenge, however, lies in behavioral and environmental diversity. Laboratory experiments still focus largely on ‘screen-based stimuli’ and ‘controlled tasks.’ Although natural scene stimuli are increasingly employed, (10) real-world contexts are far harder to reproduce. Consider sudden crowd surges in a train station, unexpected issues at immigration control, or nighttime evacuation after a major earthquake with power outages and aftershocks. Such scenarios are common in life, but even if reproduced and recorded, the resulting datasets would be rare and highly specialized. Thus, it becomes essential to examine how such data – naturally incorporating individual differences – can be meaningfully connected to others.

    This contextual diversity makes integration particularly difficult. Unlike web text, which is relatively static and independent at scale, neural time-series data are strongly influenced by arousal, attention, individuality, apparatus, and the surrounding environment. Therefore, standardized data formats and shared archives (NWB, BIDS, DANDI, OpenNeuro), (11) together with detailed metadata such as illumination, arousal state, and behavioral logs, are indispensable.
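
    To make the idea of detailed metadata more tangible, here is a minimal sketch of a record that could accompany a recording. The class and its field names are hypothetical and chosen for illustration; they do not follow the actual NWB or BIDS schemas, which define their own fields.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RecordingMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    subject_id: str
    species: str
    recording_method: str
    condition: str                             # e.g. "head-fixed, task-free"
    duration_s: float
    illumination_lux: Optional[float] = None
    arousal_measure: Optional[str] = None      # e.g. "pupil diameter"
    behavior_log: Optional[str] = None         # path or identifier of the log

    def to_json(self) -> str:
        """Serialize the record so it can be stored next to the data."""
        return json.dumps(asdict(self), indent=2)


meta = RecordingMetadata(
    subject_id="mouse-01",
    species="Mus musculus",
    recording_method="two-photon calcium imaging",
    condition="head-fixed, task-free",
    duration_s=600.0,
    illumination_lux=5.0,
    arousal_measure="pupil diameter",
)
record = meta.to_json()
print(record)
```

    Fields left unset are written out as null, which keeps missing context visible instead of silently dropping it.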

    Figure 1. Relationship between spontaneous activity and task-related activity
    Spontaneous activity states (Spon.1, Spon.2) represent the baseline states before the task. For simplicity, they are depicted as points in this figure, but in reality they are temporally fluctuating dynamics. Conventionally, analyses have been limited to quantifying the changes Δ1 and Δ2 in post-task activity (Aft.Task1, Aft.Task2) relative to each spontaneous state, without considering the relationship between Spon.1 and Spon.2. If the two differ substantially, comparing only Δ1 and Δ2 is insufficient to properly discuss task effects. Therefore, understanding the relative relationship between spontaneous states is essential, and this figure illustrates the necessity of comparing baseline states in addition to observing differences.
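
    The comparison described in the caption can be sketched numerically. The example below assumes, purely for illustration, that each state is summarized as a mean-activity vector and that both sessions share the same feature space (for example, the same neurons or a common low-dimensional representation). The function names and toy values are hypothetical.

```python
import numpy as np


def state_vector(activity):
    """Collapse a (time, features) array into a mean-activity vector."""
    return np.asarray(activity, dtype=float).mean(axis=0)


def compare_sessions(spon1, task1, spon2, task2):
    """Return the sizes of Δ1 and Δ2 and the distance between baselines."""
    s1, s2 = state_vector(spon1), state_vector(spon2)
    delta1 = state_vector(task1) - s1
    delta2 = state_vector(task2) - s2
    return {
        "delta1_norm": float(np.linalg.norm(delta1)),
        "delta2_norm": float(np.linalg.norm(delta2)),
        "baseline_distance": float(np.linalg.norm(s1 - s2)),
    }


# Toy data: identical task-related changes on top of different baselines.
spon1, task1 = np.zeros((4, 3)), np.ones((4, 3))
spon2, task2 = 2 * np.ones((4, 3)), 3 * np.ones((4, 3))

result = compare_sessions(spon1, task1, spon2, task2)
print(result)
```

    In this toy case Δ1 and Δ2 have the same magnitude even though the two baselines are far apart, which is exactly the situation where comparing only the differences would be misleading. Real spontaneous activity fluctuates over time, so a mean vector is a deliberate simplification.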

    The idea of a ‘ten-minute spontaneous activity’ baseline

    As a realistic step, I have proposed adding a ‘ten-minute spontaneous activity’ segment to each experiment. Spontaneous activity provides a statistical foundation less constrained by specific tasks or environments, reflecting arousal, attention, and individuality while serving as the substrate for task-evoked activity. This has been supported by findings from both human fMRI and mouse research. (12)

    Moreover, resting brain activity exhibits scale-free long-range correlations lasting minutes to tens of minutes. (13) A ten-minute window thus captures the key temporal scales while remaining feasible as a unifying standard across laboratories. Longer recordings are, of course, preferable, but a two-step strategy – first establishing a ten-minute baseline and then extending it for refinement – is the most pragmatic approach.
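
    One way to see why window length matters: the lowest nonzero frequency that a Fourier analysis can resolve is the reciprocal of the recording duration. The sketch below computes that limit for a ten-minute and a one-minute window. The 10 Hz sampling rate is an arbitrary assumption, and the calculation shows only which timescales a window can reach, not that ten minutes is sufficient for any particular analysis.

```python
import numpy as np


def lowest_resolvable_frequency(duration_s, sampling_rate_hz):
    """Lowest nonzero frequency bin of a recording of the given length."""
    n_samples = int(round(duration_s * sampling_rate_hz))
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sampling_rate_hz)
    return float(freqs[1])


ten_min = lowest_resolvable_frequency(600.0, 10.0)  # assumed 10 Hz sampling
one_min = lowest_resolvable_frequency(60.0, 10.0)

print(f"10 min window: {ten_min:.5f} Hz (period {1 / ten_min:.0f} s)")
print(f" 1 min window: {one_min:.5f} Hz (period {1 / one_min:.0f} s)")
```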

    Attaching this “ten-minute spontaneous activity” baseline to each dataset forms the “adhesive dots” described in the previous article, enabling cross-comparison across studies (Fig. 1).

    This is not an abstract ideal: the role of resting-state structure as a foundation for interpreting and predicting task responses has been empirically demonstrated. (12)

    Definitions and non-stationarity of spontaneous activity

    The definition of spontaneous activity differs across species and paradigms. In humans, it is typically defined as an ‘eyes-open, fixation-rest state,’ whereas in animals it is categorized as ‘head-fixed, task-free’ or ‘freely moving without tasks.’ Importantly, spontaneous activity is not a static point, but a fluctuating dynamic influenced by arousal and microenvironmental factors. Thus, detailed metadata are indispensable. Notably, the diversity within spontaneous activity is far smaller than the vast diversity of tasks and environments.

    This contrast reflects the “Principle of External Complexity”: in situations such as crowded trains or large gatherings, where one brain is surrounded by dozens or even thousands of other brains, environmental complexity can easily exceed an individual’s internal complexity, which makes neural data integration difficult. Focusing first on the limited variability of spontaneous activity gives AI a practical intermediate target for translation and alignment.

    A bridge to the next article

    The key lesson from LLMs is that breakthroughs emerged not from scaling alone but at the convergence of scaling, curation, and efficiency. In neuroscience, progress likewise requires addressing not only the expansion of neuron counts but also the challenge of behavioral and environmental diversity. As a preparatory step, standardizing the inclusion of a “ten-minute spontaneous activity” segment in each experiment – curated and shared as adhesive dots – would provide a common foundation for integration. This article has emphasized data-side strategies; the next will examine how AI can serve as the glue, through representational mapping and transformation learning, to connect fragmented datasets into a coherent understanding.

    CLICK HERE for references
