Newswise — North America has lost an estimated 3 billion birds since 1970—a nearly 30% drop across species—mostly due to habitat loss and degradation. So when a team of researchers repeated a bird population study they did 30 years earlier in a very large commercial forest landscape in Maine, they were stunned to find more birds than before.
“When we started this project, we expected to add to the pile of bad news,” says Michael Reed, a professor of biology in the School of Arts and Sciences at Tufts University and co-author of the study. “So we were very pleasantly surprised to find that, for most of the bird species in our study, things were actually looking up.”
The research team wanted to see if bird populations and habitat use had changed over the decades, particularly given a shifting forest landscape. “Forest management practices in Maine have changed significantly since the early 1990s,” says Reed. Due to social pressure, clearcutting has become much less common in Maine. Today, most logging operations remove fewer trees per acre—returning to spots every decade or so—and spread their activity across a broader area.
The study, published in Biological Conservation, found that 26—or more than half—of 47 species counted by the researchers had significantly increased in numbers since the early 1990s, while populations for 13 species (or 28%) had remained stable. That’s contrary to what happened across much of the continent, with the North American Breeding Bird Survey showing that 35—or 75%—of the same species analyzed had seen their numbers decline, both regionally and continentally, for the same timeframe.
But what makes the commercial forests of Maine so different from other forests in the northern Atlantic states and North America? And can we learn anything from them to bolster bird populations regionally and nationally?
“Numerous factors are likely behind the abundance of bird populations we see in northcentral Maine,” says Reed. “We can’t know what they all are, but we know at least one: It’s all forest up there.”
Both the original and the current study took place in a 588,000-acre commercial forest nestled within 10 million acres of commercial, public, and protected forest landscape. Together, these woodlands create the largest contiguous tract of non-developed forest east of the Mississippi. The habitat is recognized worldwide as an Important Bird Area, a designation applied by a coalition of international bird conservation groups to sites critical for protecting bird species and biodiversity.
The remote forestland is also one of the darkest places left on the East Coast. “Most bird species migrate at night, orienting by the stars,” says John Hagan, the study’s senior author and the founder and president of Our Climate Common. “So it may be when birds flying at night get tired, they look down, spot a vast patch of darkness, and decide it’s a good place to land and raise young.”
Many of the bird species observed seemed more flexible in their habitat use than previously thought; the researchers have another paper on these findings in the works. This suggests that high-quality forest next to more average forest may still be appealing enough to attract and support more birds over time. Individual birds often return to breed where they were hatched, and many migratory species are drawn to areas where others of their kind are already present. As Reed puts it, this may mean that “the rich get richer” when it comes to birds in Maine’s commercial forests.
Much of the population growth came from more birds per acre, not more habitat. “You’d expect bird numbers to go up if there’s more habitat,” says Reed. “But we actually saw increased numbers for some species whose main habitat acreage stayed the same size or even decreased from the 1990s. For example, in places where we previously counted two ovenbirds singing, we now counted four.”
Data from one species—the golden-crowned kinglet—suggests that how forests are managed may affect species’ ability to thrive. These tiny, round songbirds are declining sharply across much of North America, including in the commercial forests of New Brunswick, Canada. But just over the border in Maine, their numbers are rising.
In Canada, commercial forests are commonly replanted in neat plantation-style rows, creating simpler forests, with less understory and trees that are all the same age. In contrast, Maine’s commercial forests rely on natural regrowth, creating denser forests with a broader mix of tree species and ages. Reed and Hagan hypothesize this more natural approach may offer better shelter or support more of the insects that kinglets need to raise their young.
Despite the widespread increases, bird numbers for 14 species—about 30%—still declined in the study area. The researchers hope to examine more closely the pressures faced by these declining species, including the winter wren and the Canada warbler, to see whether commercial forestry could do something differently to better support them. The threat could also lie on the birds’ migration or overwintering grounds, in which case little can be done in Maine to improve their numbers. The team is particularly worried about steep declines in mature trees—some more than 200 years old—and their impact on bird populations; Hagan is now leading additional research to assess this conservation threat.
Even though about two-thirds of U.S. forestland is available or used to produce industrial wood products, the research team believes theirs is the first bird survey to compare species population numbers in a commercial forest over a long period of time. And given the 521 million acres of commercial U.S. forestland, they hope it will not be the last.
In the face of ongoing human habitat expansion and continental declines in bird populations, the team says it’s important to understand how all types of forest ownership may help create important sanctuaries for birds. “Birds also may be thriving in commercial forests in Michigan, Wisconsin, and Minnesota, which are managed similarly to those in Maine,” Hagan says. “Hopefully, someone will look to see.”
“Many people don’t expect places where you harvest wood to serve as valuable habitat because they are cutting trees down,” adds Reed. “But nobody thinks the goal of a farm is to cut down corn—it’s to grow it. And commercial forests grow trees.”
_______
How to Help the Birds at Home
You may not own an acre of land—never mind hundreds of acres—but you can still bring some qualities of Maine forestland to your own yard.
Plant a tree that is native to your area.
Add native shrubs, which provide vital food and shelter for birds while offering multi-season visual appeal for you.
“Leave the leaves” in the parts of your yard where you are creating understory. Nature abhors a leaf blower—and it means less yardwork and free mulch for you.
Fight light pollution by reducing outdoor lighting to what’s truly necessary.
Where safe to do so, leave dead trees standing. “Snags” provide valuable habitat for owls, woodpeckers, and cavity-nesting birds.
Replace part of your lawn with native plants. Check out Douglas Tallamy’s book Nature’s Best Hope for ideas.
About 56 million years ago, Earth slipped into the Paleocene-Eocene Thermal Maximum (PETM), a sudden global warming pulse that pushed temperatures up by at least 5°C (9°F). In the badlands of what is now Wyoming, one mid-sized predator, Dissacus praenuntius, found itself confronting a landscape in flux.
A new study led by Rutgers University suggests the animal’s solution was unexpected: when prey became scarce, it started crunching bones.
“What happened during the PETM very much mirrors what’s happening today and what will happen in the future,” said Andrew Schwartz, the Rutgers anthropology doctoral student who led the research.
The work combines field excavations in the Bighorn Basin with a forensic technique called dental microwear texture analysis (DMTA).
By reading microscopic pits and scratches on fossil teeth, the team reconstructed what the animal was chewing in the weeks before it died; tougher, deeper marks usually point to hard items such as bone.
Dissacus praenuntius ate bones
Dissacus praenuntius was roughly the size of a modern coyote and, at first glance, looked like a lanky wolf. But it walked on tiny hooves and belonged to an archaic lineage known as mesonychids.
“They looked superficially like wolves with oversized heads,” Schwartz said, describing them as “super weird mammals.” “Their teeth were kind of like hyenas. But they had little tiny hooves on each of their toes.”
Before the planet heated up, DMTA shows the predator had a diet which resembled that of today’s cheetahs: mostly soft but sinewy meat. Once the PETM began, everything changed.
“We found that their dental microwear looked more like that of lions and hyenas,” Schwartz said. “That suggests they were eating more brittle food, which were probably bones, because their usual prey was smaller or less available.”
The microwear evidence dovetails with other signs of environmental stress. Earlier work has shown many mammal species shrank during the PETM, and Dissacus appears to have followed that trend – smaller bodies need less food.
While scientists once blamed this shrinkage solely on heat, the new findings strengthen the case that limited nutrition was a major driver.
Adaptability drove survival success
“One of the best ways to know what’s going to happen in the future is to look back at the past,” Schwartz said. “How did animals change? How did ecosystems respond?”
For Schwartz, the PETM offers a natural experiment on how warming reshapes food webs. The lesson, he argues, is that dietary flexibility can make or break a species.
“In the short term, it’s great to be the best at what you do,” he said. “But in the long term, it’s risky. Generalists, meaning animals that are good at a lot of things, are more likely to survive when the environment changes.”
That principle is visible today. “We already see this happening,” Schwartz said. “In my earlier research, jackals in Africa started eating more bones and insects over time, probably because of habitat loss and climate stress.”
Pandas, koalas, and other extreme specialists lack that wiggle room and could therefore be more vulnerable as habitats fragment or shift.
Wyoming’s ancient badlands
To trace Dissacus praenuntius through time, the team zeroed in on the Bighorn Basin, one of the rare places where sedimentary layers preserve an almost year-by-year record across the PETM.
Field crews collected dozens of jaws and isolated teeth from successive strata, then scanned their chewing surfaces under high-resolution confocal microscopes.
Sophisticated software converted the textures into numerical values capturing roughness, complexity, and orientation – clues to the animal’s final meals.
Professor Robert Scott, a co-author on the paper, notes that DMTA is revolutionizing paleontology because it captures dietary snapshots just weeks or months before death, unlike isotope analyses that average years.
The method revealed a clear pattern: before the thermal maximum, tooth surfaces carried long, shallow scratches characteristic of slicing flesh; during and after, they showed deeper pits and gouges typical of bone fragmentation.
Dissacus praenuntius faded away
Despite this adaptive flair for bones, Dissacus praenuntius did not make it through the next 15 million years. Bigger, better-equipped carnivores eventually displaced it, and the shifting vegetation of post-PETM North America favored more agile hunters.
Still, its bone-crunching episode underscores how rapidly predators can recalibrate their behavior when climate jolts prey supply.
Schwartz hopes insights like these will aid modern conservation biology by spotlighting which species could be climate winners or losers.
The study also signals the importance of preserving continuous fossil sites: without the Bighorn Basin’s layered rocks, the dietary flip would have remained invisible.
Local fossils, global lessons
Schwartz traces his fascination with fossils to childhood trips along New Jersey’s waterways with his father, an amateur collector. Now close to completing his Ph.D., he is eager to show the broader public how deep time research can illuminate future challenges.
“If I see a kid in a museum looking at a dinosaur, I say, ‘Hey, I’m a paleontologist. You can do this, too.’”
That encouragement matters because, as Earth barrels toward PETM-like CO₂ levels, society will need scientists who can decode past crises to steer us through the next one.
The tale of a jackal-sized mammal gnawing harder fare in a hothouse world is more than a curiosity; it is a mirror held up to our own warming century.
The study is published in the journal Palaeogeography, Palaeoclimatology, Palaeoecology.
Image Credit: ДиБгд, CC BY 4.0, via Wikimedia Commons
—–
Companion cockatoos are renowned for their problem-solving and intriguing characters. It’s no surprise these large, long-lived and intelligent parrots are known to display complex behaviour.
Owners often film their birds dancing to music and post the videos to social media. Snowball, a famous dancing cockatoo, has been shown to have 14 different dance moves.
We wanted to find out more about the dance repertoire of cockatoos and why they might be doing this. In our new research, we examined videos of dance behaviour and played dance music to six cockatoos at an Australian zoo.
These birds weren’t just doing a side step or bobbing up and down. Between them, they had a rich repertoire of at least 30 distinct moves. Some birds coordinated their head bobbing with foot movements, while others undertook body rolls. Our research shows at least 10 of the 21 cockatoo species dance.
If we saw this behaviour in humans, we would draw a clear link between music and dancing and interpret the behaviour as enjoyable. After watching cockatoos voluntarily begin dancing for reasonable lengths of time, it was difficult to reach any conclusion other than cockatoos most likely dance because it’s fun.
A Goffin’s cockatoo dancing while a Guns N’ Roses song plays.
How many moves does a cockatoo have?
Dancing is complicated. To dance to music, animals need to be able to learn from others, imitate movements and synchronise their movements. These complex cognitive processes were long thought to exist only in humans – but evidence is emerging for their presence in chimpanzees and parrots such as cockatoos.
To catalogue the dance moves of cockatoos, we began by studying videos of the behaviour. We analysed 45 dancing videos and recorded all distinct moves.
The five species in these videos were the familiar sulfur-crested cockatoos and little corellas, as well as Indonesian species such as Goffin’s cockatoos, white cockatoos and Moluccan cockatoos.
Across the videos, we spotted 30 movements, including 17 that hadn’t been described scientifically. We also observed 17 other movements, which we classified as “rare” because they were only seen in a single bird.
Head movements were the most common dance move, especially the downward bobbing motion. Half of all videoed cockatoos performed this move.
The ten most common dance moves across all five species include bobbing up and down, headbanging and going side to side. Zenna Lugosi/Author provided, CC BY-NC-ND
Dancing – but not to music
Once we catalogued the moves, we then tested whether music could elicit this behaviour in captive cockatoos who weren’t kept as companions.
We undertook a playback experiment with six adult cockatoos at Wagga Wagga Zoo in New South Wales, comprising two sulfur-crested cockatoos, two pink cockatoos and two galahs.
Over three sessions, we played a piece of electronic dance music on repeat for 20 minutes and recorded any responses on video. We repeated our experiment with no music and again with a podcast featuring people talking.
All six cockatoos we studied showed some dancing behaviour at least once over the three sessions. But the rates of dancing weren’t any higher during the playing of music – it was similar to dancing during silence and the podcast.
We don’t fully know why this is. One possibility could be because we played music to existing male-female pairs, and the social environment alone was sufficient to trigger dance behaviour.
Why do cockatoos dance at all?
To find out whether the cockatoo species most prone to dancing were those most closely related, we analysed similarities across species. Goffin’s cockatoos and white cockatoos had the most similar moves, while Goffin’s cockatoo and little corella were the furthest apart.
But this clashed with genetics, as Goffin’s cockatoos are most closely related to little corellas. This suggests dancing behaviour may not be connected to genetic links.
Interestingly, these behaviours are mainly recorded in companion birds. Music playback in the online videos does seem to encourage the birds to keep dancing for longer than would likely be seen in zoo or wild birds. These dance moves might represent an adaptation of courtship display movements as a way to connect with their human owners.
Other researchers report being able to trigger dancing behaviour in an African grey parrot and a sulfur-crested cockatoo with music. But the zoo cockatoos in our playback study didn’t respond the same way. This suggests there may be an element of learning to respond to humans.
A galah bobs and side steps while a song plays. But it’s not clear the movements are a response to the music.
It’s usually easy to tell if a human behaviour is play or not. But in animals, it can be much more difficult. Researchers define a behaviour as play if it meets four criteria: it occurs while animals are relaxed, it’s begun voluntarily, has no obvious function and appears rewarding. Cockatoo dancing would meet all four of these criteria.
By contrast, repetitive behaviours such as pacing seen in animals kept alone in small cages would not be play – it’s not rewarding and the animals don’t seem relaxed. Parrots kept in poor conditions exhibit self-harming behaviours such as constant screeching and feather pulling.
Captive parrots have complex needs and can experience welfare problems in captivity. Playing music may help enrich their lives.
For cockatoo owners, this suggests that if their birds are dancing, they’re feeling good. And if they’re busting out many different moves in response to music, even better – they might be showing creativity and a willingness to interact.
Acknowledgement: Honours student Natasha Lubke is the lead author of the research on which this article is based.
The thyroid, a vital endocrine organ in vertebrates, plays a key role in regulating metabolism and supporting growth. The first gland of both the nervous system and endocrine system to mature during an embryo’s development, it initially evolved more than 500 million years ago out of a “primitive” precursor organ in chordates known as the endostyle. Now, using lamprey as a model organism, Caltech researchers have discovered how the evolutionary acquisition of a certain kind of stem cell, called a neural crest cell, facilitated the evolution of the endostyle into the thyroid.
The research is described in a paper appearing in the journal Science Advances on August 6. The work was conducted primarily in the laboratory of Marianne Bronner, the Edward B. Lewis Professor of Biology and director of Caltech’s Beckman Institute.
Bronner’s lab has long focused on neural crest cells and their role in vertebrate development and evolution. For example, the team previously examined the role of neural crest cells in forming the bony scales that protect sturgeon and other primitive fish, heart tissue in zebrafish and chickens, and neurons of the peripheral nervous system in lamprey.
“Neural crest cells seem to promote evolution,” Bronner says. “When Darwin first proposed the theory of evolution, he was looking at the different shapes of beaks of finches on the Galapagos Islands. Beaks, in addition to other parts of the facial skeleton, happen to arise from neural crest cells. These cells seem to be able to change faster in evolutionary time than cells that are more ancient.”
Vertebrates have neural crest cells, while invertebrates do not, further suggesting that these cells contribute to the evolution of complex body forms. The Bronner lab uses lamprey, slimy parasitic eel-like fish, as a model organism, because modern lamprey share some characteristics with the earliest vertebrates.
The new work, led by Senior Postdoctoral Scholar Research Associate Jan Stundl, examines how neural crest cells contribute to the development of the endostyle in the lamprey. The endostyle is an evolutionary novelty of chordates (animals in the phylum Chordata, which includes vertebrates), and lampreys are the only vertebrates that retain this organ, whose primary function is associated with filter feeding. In lampreys, the larval endostyle, composed of two lobes in a butterfly-like shape, transforms into thyroid follicles during metamorphosis. Stundl and the team traced how neural crest cells give rise to the five different cell types of the endostyle, two of which give rise to the thyroid follicles. Using the gene-editing technology CRISPR, they then genetically deleted genes associated with the neural crest developmental program in lamprey embryos. These modified lampreys failed to develop a fully formed endostyle, exhibiting instead only a primitive lobe resembling the simplified endostyle of invertebrate chordates. The findings suggested that neural crest cells are essential for driving the evolutionary transition from the chordate endostyle to the vertebrate thyroid gland.
“Mother Nature is ‘smart,’” Stundl says. “Instead of evolving something new, you can rebuild from something already present, like the endostyle. Neural crest cells seem to play an important role in enabling this transition to happen. Without the neural crest, we might still be filter feeders.”
The paper is titled “Acquisition of neural crest promoted thyroid evolution from chordate endostyle.” In addition to Stundl and Bronner, Caltech co-authors are postdoctoral scholars Ayyappa Raja Desingu Rajan and Tatiana Solovieva; graduate student Hugo Urrutia; research associate Jana Stundlova; and former postdoctoral scholar Megan Martik, now of UC Berkeley. Additional co-authors are Jake Leyhr, Tatjana Haitina, and Sophie Sanchez of Uppsala University; and Zuzana Musilova of Charles University in Prague, Czech Republic. Funding was provided by the National Institutes of Health, the European Union, Alex’s Lemonade Stand Foundation, the American Heart Association, the Helen Hay Whitney Foundation, Swedish Research Council Vetenskapsrådet, and the European Synchrotron Radiation Facility. Marianne Bronner is an affiliated faculty member with the Tianqiao and Chrissy Chen Institute for Neuroscience at Caltech.
A picturesque Scottish peninsula immortalized in a hit Paul McCartney song from the 1970s will host a new U.K. rocket development hub as the country works toward its goal of becoming a major player in European space launch.
The Mull of Kintyre peninsula in southwestern Scotland once offered refuge to the famous ex-Beatle, who lived there on a farm in the aftermath of the legendary band’s acrimonious split. The peninsula’s misty coastline and rolling hills inspired the namesake Wings tune that became the U.K.’s best-selling hit of the 1970s. Now, that wild landscape will become the backdrop for a different kind of history-making.
A new rocket-testing facility, dubbed the MachLab, has just opened its doors near the tiny town of Campbeltown, hoping to speed up the development of innovative engines for small rockets. For years now, the U.K. has been working toward establishing itself as Europe’s gateway to space.
A January 2023 attempt to fly an air-launched Virgin Orbit rocket from a site in Cornwall failed, helping lead to the company’s collapse. But several companies, including the U.K.’s homegrown Orbex and Germany’s Rocket Factory Augsburg, are now positioned to begin launching vertical rockets from the SaxaVord site in the Shetland Islands, off the northern coast of Scotland, within a year.
The MachLab, overseen by the University of Glasgow, received around £500,000 (about $670,000 U.S.) in funding from the U.K. government and industry to help aid this endeavor.
“MachLab is ready to play a key role in the U.K.’s strategy to return to vertical launch, ensuring that students and researchers can access hotfire facilities in a safe and controlled environment,” Professor Patrick Harkness, of the University of Glasgow’s James Watt School of Engineering, said in a statement.
MachLab – Scotland’s Propulsion Test Facility (video).
“MachLab will allow us to cooperate with other countries establishing or reestablishing their access to space,” Harkness added. “We have already had visitors from South Africa, and we expect to welcome partners from Australia in the near future.”
The facility is located on the site of the former RAF Machrihanish airbase, which housed U.S. nuclear weapons during the Cold War era.
MachLab has already hosted early-stage hotfire tests of a new kind of 3D-printed rocket engine with an advanced cooling system, a project supported by the U.K. Space Agency.
The facility’s equipment can support tests of rocket engines using solid, liquid and cryogenic propellants.
“MachLab has been two years in the making, with all the systems required to operate a liquid bipropellant rocket engine being created from the ground up,” Krzysztof Bzdyk, a research associate at the University of Glasgow’s James Watt School of Engineering, said in the statement. “We’re excited to be ready to start making our mark in rocket research, development and teaching in Scotland.”
About five years ago NASA awarded initial space station development contracts to four different companies: Northrop Grumman, Blue Origin, Axiom Space, and Voyager Space. Since then Northrop has dropped its effort and joined Voyager’s team. There has also been some interest from other companies, most notably Vast, which is working with SpaceX to develop its initial space station.
Probably the most striking thing about the new directive is that it seems to favor Vast over NASA’s original contractors. Specifically, Vast’s Haven-1 module is designed for four astronauts to spend two weeks in orbit, and the company has a more straightforward pathway to building a station that would meet NASA’s minimum requirements.
The other companies had been planning larger stations that would have more permanence in orbit, which matched NASA’s original desires for a successor to the International Space Station. The new directive favors a company building up capabilities through interim steps, including stations with a limited lifespan on orbit.
“All the current players are going to have to do some kind of pivot, at least revisit their current configuration,” McAlister said. “Certain players are going to have to do a harder pivot.”
One industry official, speaking anonymously, put it more bluntly: “Only Haven-1 can succeed in this environment. That is our read.”
The chief executive officer of Vast, Max Haot, told Ars that the company bet that starting with a minimum viable product was the best business strategy and fully funded that approach without government money.
“It seems like NASA is now leaning into an approach for the future of CLDs that is led by what industry believes it can achieve technically and build a credible business model around,” Haot told Ars. “Seeing that information from contractors before committing to buying services can help increase long-term risk reduction. This seems similar to the successful approaches used by NASA for cargo and crew.”
Vast has worked closely with SpaceX in the development of its station, to the extent that Haven-1 will largely rely on the Dragon spacecraft for life support and propulsion. Future stations, such as Haven-2, will have more independent capabilities.
“To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.” – Ronald A. Fisher
The modern biology toolbox is larger than ever, as a widening array of cutting-edge molecular techniques supplements the classic approaches that still drive the field forward. Teams of researchers with complementary expertise may combine these tools in countless creative ways to produce novel research. Regardless of which methods are chosen for data collection, all empirical research projects share a common foundation: the principles of good experimental design. Far from eliminating the need for statistical literacy, -omics technologies make careful, sound experimental design more important than ever1,2.
This Perspective was motivated by the observation that many biology projects are doomed to fail by experimental design errors that make rigorous inference impossible. The results of such errors can range from waste to, in the worst case, the introduction of misleading conclusions into the scientific literature, including those with clinical consequences. Our goal is to highlight common experimental design errors and how to avoid them. Although methods for analyzing challenging datasets are improving3,4,5,6, even advanced statistical techniques cannot rescue a poorly designed experiment. For this reason, we focus on choices that must be made before an experiment or study is conducted, rather than on data analysis choices (however, if you are working on microbiomes, we highly recommend that you read this article in conjunction with Willis and Clausen7, who highlight important points for planning and reporting on statistical analyses). In particular, we address four key elements of a well-designed experiment: adequate replication, inclusion of appropriate controls, noise reduction, and randomization.
First, we discuss how in -omics research, many errors arise because of the misconception that a large quantity of data (e.g., deep sequencing or the measurement of thousands of genes, molecules, or microbes) ensures precision and statistical validity. In reality, it is the number of biological replicates that matters. We also explain how to recognize and avoid the problem of pseudoreplication, and we introduce power analysis as a useful method for optimizing sample size. Second, we introduce several strategies (blocking, pooling, and covariates) for minimizing noise, or variation in the data caused by randomness or other unplanned factors. Third, we briefly review how missing positive and negative controls can compromise experimental results, and provide examples from both -omics and non-omics research. Finally, we describe two critical functions of experimental randomization: preventing the influence of confounding factors, and empowering researchers to rigorously test for interactions between two variables.
While these practices are covered in undergraduate and graduate level classes and in textbooks, as reviewers we—along with editors and some colleagues—have observed basic experimental design errors in submitted manuscripts. Indeed, the authors of this Perspective have also made and learned from some of these errors the hard way. This fact inspired this Perspective to provide a succinct overview of important experimental design principles that can be used in training of early-career scientists and as a refresher for seasoned biologists. We present examples from projects that use high-throughput DNA sequencing; however, unlike best methods for statistical analysis, these experimental design principles apply equally to any experiment regardless of the type of data being collected, including proteomics, metabolomics, and non-omics data. We also provide a list of additional resources on the topics discussed (Table 1) and practical steps for designing a rigorous experiment (Box 1).
Table 1 Additional resources for improving experimental design strategies
Box 1. Putting it into practice
Thinking about statistics early in project development will minimize your need to use advanced statistical techniques later and will maximize your work’s scientific value. It will certainly prevent you from spending time and money on an experiment that is doomed to fail. The steps outlined below describe how to work backwards from your planned analysis to ensure a properly designed experiment.
1. Think about your experimental goals. If you are doing hypothesis-based research, what kinds of observations would support or refute your hypothesis? Does your goal relate to an interaction between two or more independent (explanatory) variables? With that in mind, sketch a plan to generate the data you need to test your hypothesis; then, write out the statistical test you will use and explicitly define the units of replication for that test.
2. Create a mock dataset for your planned experiment: a spreadsheet with a row for each replicate, and a column for each independent variable (any variables that may influence the outcome). For now, include just 3 replicates per experimental group. If there are multiple independent variables, ensure that all necessary combinations of those variables are represented. Several mock dataset templates are provided in the Supplementary Information.
3. Add a column to hold randomly-generated numbers, which you can use later to randomize experimental units. Add columns to hold any covariates and nuisance variables (e.g., batches) that you will want to control for later.
4. Add columns to hold measurements (your response or dependent variables). For -omics projects these are often features such as ASVs, transcripts, or metabolites; they may also be higher-level properties of those features such as microbial phyla or gene ontology categories. Consider what form your measurements will take; for example, many types of -omics data are counts ranging from zero to large integers.
5. Run a practice statistical test using your mock dataset (a code sketch of Steps 2 through 5 follows this list). This will ensure that all the variables you need to test your hypothesis are included in your dataset. As a bonus, you will be well prepared to quickly analyze your real data when you get it.
6. Anticipate the possible outcomes of this statistical test. What are alternative explanations for each of those possibilities? If there are changes you could make to the experiment or analysis to remove the ambiguity, return to Step 1 and revise accordingly.
7. Optimize your sample size. Use your domain expertise to define the minimum effect size that you would consider biologically valuable, and conduct a power analysis for your chosen statistical test. Then, revisit your mock dataset and add more rows as needed to increase the sample size.
8. Use the literature and your domain expertise to determine what positive and negative controls are necessary for correct interpretation of your expected results.
9. Share your completed plan with colleagues and ask for their feedback. If they were reviewing this work, would they be convinced by the results? Do they think the plan is feasible? If not, what improvements would they suggest?
10. Consider submitting your finalized plan for peer review at a journal that publishes Registered Reports (Box 3).
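As an illustration of Steps 2 through 5, the short Python sketch below builds a mock dataset, adds a randomization column and a nuisance variable, fills in a placeholder response, and runs a practice test. It assumes pandas, NumPy, and SciPy are available; all column names, group labels, and simulated values are hypothetical, and a real RNA-seq analysis would use a count-based model rather than a t-test.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(seed=1)

# Step 2: mock dataset -- one row per replicate, one column per independent variable.
# Hypothetical two-treatment design with 3 replicates per group.
mock = pd.DataFrame({
    "replicate_id": [f"r{i}" for i in range(6)],
    "treatment":    ["control"] * 3 + ["drought"] * 3,
})

# Step 3: a random number for later randomization, plus a nuisance variable (batch).
mock["random_order"] = rng.permutation(len(mock))
mock["batch"] = ["A", "B", "A", "B", "A", "B"]

# Step 4: a placeholder response column (e.g., a transcript count), filled with
# simulated values here so the practice test in Step 5 can actually run.
mock["transcript_count"] = rng.poisson(lam=[20, 20, 20, 40, 40, 40])

# Step 5: practice statistical test on the mock data (here, a simple t-test on
# log-transformed counts, purely to check that the needed variables exist).
ctrl = np.log1p(mock.loc[mock.treatment == "control", "transcript_count"])
trt  = np.log1p(mock.loc[mock.treatment == "drought", "transcript_count"])
print(stats.ttest_ind(ctrl, trt))
```

Running the practice test on fabricated numbers costs nothing, and it forces every variable, unit of replication, and nuisance factor to be written down before any samples are collected.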
Empowerment through replication
Many researchers intuitively understand that any individual data point might be an outlier or a fluke. As a result, we have very low confidence in any conclusion that was reached based on one isolated observation. With each additional data point that we collect, however, we get a better sense of how representative that first data point really was, and our confidence in our conclusion grows. Statistics empower us to measure and communicate that confidence in a way that other researchers will understand.
Why biological replication is more important than sequencing depth
Most biologists have heard that having more data will empower them to test their hypotheses. But what does it really mean to have more data? High-throughput technologies that generate millions to billions of DNA sequence reads, and counts of thousands of different genes or microbes, can create the illusion of a big dataset even if the number of replicates, or sample size, remains small. Although deeper sequencing per replicate can improve power in some cases, it is primarily the number of biological replicates that enables researchers to obtain clear answers to their questions.
To illustrate why this is, consider the hypothesis that two species of plants host different ratios of two microbial taxa in their roots. We can estimate this ratio for the two groups or populations of interest (i.e., all existing individuals of the two species) by collecting random samples from those populations. A sample size of 1 plant per species would be essentially useless, because we would have no way of knowing whether that plant is representative of the rest of its population, or instead is an anomaly. This is true regardless of the amount of data we have for that plant; whether it is based on 10³ sequence reads or 10⁷ sequence reads, it is still an observation of a single individual and cannot be extrapolated to make inferences about the population as a whole. Similarly, if we measure the abundances of thousands of microbes per plant, this same problem would apply to our estimates for each of those microbes. In contrast, measuring more plants per species will provide an increasingly better sense of how variable the trait of interest is in each population.
To what extent does the amount of data per replicate matter? Deeper sequencing can modestly increase power to detect differential abundance or expression, but those gains quickly plateau after a moderate sequencing depth is achieved3,8. Extra sequencing is most beneficial for the detection of less-abundant features, such as rare microbes or low-expression transcripts, and features with high variance8. Projects that focus specifically on such features will require deeper sequencing than those that do not, or else may benefit from a more targeted approach. Finally, it is worth highlighting the related problem of treating -omics features (e.g., genes or microbial taxa) as the units of replication, as is common in gene set enrichment and pathway analyses. Such analyses describe a pattern within the existing dataset, but they are entirely uninformative about whether that pattern would hold in another group of replicates9. Instead, they only allow inference about whether that pattern would hold for a newly-observed feature in the already-measured group of replicates, which is often not the researcher’s intended purpose.
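The plant example can be made concrete with a small simulation, sketched below in Python with made-up numbers (a true taxon proportion of 0.30 and a between-plant standard deviation of 0.10; none of these values come from a real study). The spread of the resulting estimates shrinks as plants are added, but barely changes when a single plant is sequenced 10,000 times more deeply.

```python
import numpy as np

rng = np.random.default_rng(0)

def spread_of_estimates(n_plants, reads_per_plant, n_sims=2000):
    """Simulate estimating a population-mean taxon proportion.
    Plants vary around a true mean of 0.30 (between-plant SD 0.10);
    sequencing adds only binomial counting noise on top of that."""
    true_props = np.clip(rng.normal(0.30, 0.10, size=(n_sims, n_plants)), 0.01, 0.99)
    counts = rng.binomial(reads_per_plant, true_props)      # reads assigned to the taxon
    estimates = (counts / reads_per_plant).mean(axis=1)     # one estimate per simulated study
    return estimates.std()                                  # spread across simulated studies

# Sequencing one plant 10,000x more deeply barely helps ...
print(spread_of_estimates(n_plants=1, reads_per_plant=10**3))
print(spread_of_estimates(n_plants=1, reads_per_plant=10**7))
# ... whereas adding plants shrinks the uncertainty substantially.
print(spread_of_estimates(n_plants=10, reads_per_plant=10**3))
```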
Replication at the right level
Biological replicates are crucial to statistical inference precisely because they are randomly and independently selected to be representatives of their larger population. The failure to maintain independence among replicates is a common experimental error known as pseudoreplication10 (Box 2). When experimental units are truly independent, no two of them are expected to be more similar to each other than any other two. Pseudoreplication becomes a problem when the incorrect unit of replication is used for a given statistical inference, which artificially inflates the sample size and leads to false positives and invalid conclusions (Fig. 1). In other words, not all data points are necessarily true replicates.
Fig. 1: Valid experimentation requires independence of all replicates.
A To test whether carbon use efficiency differs between freshwater and marine microbial communities, a researcher collects three vials from Lake Tahoe and three from the Sea of Japan. This design is pseudoreplicated because the vials from the same location are not independent of each other; they are expected to be more similar to each other than they would be to other vials randomly sampled from the same population (i.e., freshwater or marine). However, they could be used to test the narrower question of whether carbon use efficiency differs between Lake Tahoe and the Sea of Japan, assuming that the vials were randomly drawn from each body of water. B In contrast to the design in panel A, collection of one vial from each of three randomly-selected freshwater bodies and three randomly-selected saltwater bodies enables a valid test of the original hypothesis. Alternatively, each replicate could be the composite or average of three sub-samples per location; this would be an example of pooling to improve the signal:noise ratio. C Pseudoreplication during experimental evolution can lead to false conclusions. Each time that the flasks are pooled within each treatment prior to passaging, independence among replicates is eliminated. As a result, a stochastic event that arises in a single replicate lineage (symbolized by the blue die) can spread to influence the other lineages, so that the stochastic event is confounded with the treatment itself. D In contrast, by maintaining strict independence of the replicate lineages, the influence of the stochastic event is isolated to a single replicate. The researcher can confidently rule out the possibility that a stochastic event has systematically influenced one of the treatment groups.
Although pseudoreplication is occasionally unpreventable (particularly in large-scale field studies), it can and should be anticipated and avoided whenever possible. The correct units of replication are those that can be randomly assigned to receive one of the treatment conditions that the experiment aims to compare. In experimental evolution, for instance, the replicates are random subsets of the starting population, assumed to be identical, each of which may be assigned to a different selective environment11,12. Failure to include enough independent sub-populations, or to keep them independent throughout the experiment (e.g., by pooling replicates; Fig. 1C–D), will cause pseudoreplication of the evolutionary process of interest13. In some cases, however, mixed-effects modeling techniques can adequately account for the non-independence of replicates10.
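The following is a minimal sketch of using the correct unit of replication, based on the freshwater/marine example of Fig. 1 with entirely hypothetical measurements: sub-samples (vials) are first collapsed to one value per independently sampled site, and only then are the habitats compared.

```python
import pandas as pd
from scipy import stats

# Hypothetical carbon-use-efficiency (cue) measurements: three vials from each of
# three independently sampled water bodies per habitat type (cf. Fig. 1B).
df = pd.DataFrame({
    "habitat": ["fresh"] * 9 + ["marine"] * 9,
    "site":    ["F1"] * 3 + ["F2"] * 3 + ["F3"] * 3 +
               ["M1"] * 3 + ["M2"] * 3 + ["M3"] * 3,
    "cue":     [0.31, 0.29, 0.33, 0.41, 0.38, 0.40, 0.36, 0.35, 0.37,
                0.22, 0.24, 0.21, 0.28, 0.27, 0.30, 0.25, 0.26, 0.24],
})

# Wrong: treating all 18 vials as replicates inflates the sample size (pseudoreplication).
# Right: collapse sub-samples to one value per independent site, then compare habitats.
per_site = df.groupby(["habitat", "site"], as_index=False)["cue"].mean()
fresh  = per_site.loc[per_site.habitat == "fresh",  "cue"]
marine = per_site.loc[per_site.habitat == "marine", "cue"]
print(stats.ttest_ind(fresh, marine))   # n = 3 independent sites per habitat
```

Alternatively, as noted above, a mixed-effects model with site as a random effect could use every vial while still respecting the non-independence of sub-samples.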
Box 2. Examples of inconclusive and/or incorrect conclusions due to experimental design errors
Pseudoreplication. Lack of clarity during experimental design about the actual experimental unit can lead to incorrect assumptions about the number of replicates available for statistical testing. A highly-cited 2012 study72, for example, concluded that the transfer of intestinal microbiota from pregnant women in their third trimester into germ-free mice induces greater adiposity and inflammation as compared to microbiota from women in the first trimester. To properly support this conclusion, the necessary experimental unit is the microbiota from a pregnant woman in her first or third trimester, requiring the microbiota of several women in each group to be tested separately for proper replication. The authors, however, pooled the microbiota of five women per condition and used this as the inoculum for six germ-free mice, resulting in N = 1 per condition (only one third-trimester inoculum and one first-trimester inoculum). The mice are in this case observational units, but not experimental units. Therefore, the broad conclusions drawn about effects of first versus third trimester microbiomes on mouse phenotypes are not statistically valid. This incorrect use of the observational unit (mouse) as a replicate instead of the experimental unit (individual human-derived inocula) has been widespread in experiments transferring human microbiomes into germ-free mice73.
Lack of appropriate controls. While it is difficult to show post hoc that conclusions of a study are actually false, the absence of appropriate controls can put results and conclusions into serious doubt. (1) A 2019 study74 compared microbiome composition of rats fed two different sources of dietary protein (casein versus chicken) using metagenomics. The authors found that Lactococcus lactis was significantly higher in rats fed the casein diet, however, the experiment did not have controls to test for potential contamination of food sources with microbial DNA. It has been shown previously that casein contains high amounts of L. lactis DNA and protein75,76, and while we cannot be certain that this applies to the rat diet study as well, due to the absence of controls it is likely that the enrichment of L. lactis is a false positive result. (2) A 2016 study77 used 16S rRNA gene sequencing to investigate the bacterial communities associated with the roots of maize planted in sterile sand and two soils. The plants grown in sterile sand shared >20 OTUs (microbial taxa) with the plants grown in natural soils, and from this the authors concluded that transmission via the seed must be a major source of bacteria found in maize plants, as the bacteria in the sterile sand could not have come from the soil. However, no negative controls were sequenced, raising the possibility that contamination introduced during sample processing such as the “kit-ome”78 was the true source of the shared OTUs. This puts the conclusions relating to the presence of microbial taxa in sterile plants into question. (3) A major controversy in the microbiome field in the last decade was the question of whether the placentas of healthy pregnant women contain a microbiome. Initially several studies presented evidence for a placental microbiome based on amplicon sequencing of microbial marker genes. However, all of these studies lacked appropriate positive and negative controls and did not account for potential contamination during sample processing. Ultimately, upon inclusion of proper controls, the presence of a placental microbiome was solidly refuted by many studies79. The controversy led to many improvements in microbiome sequencing procedures.
Optimizing sample size
If most measurements in a dataset are similar to each other, this indicates that we are measuring individuals from a low-variance population with respect to the dependent variable. In contrast, a wide range of trait values signals a high-variance population. This within-group variance (i.e., the variance within one population) is central to determining how many biological replicates are necessary to achieve a clear answer to a hypothesis test: when within-group variance is high relative to the between-group variance, more replicates are required to achieve a given level of confidence (Fig. 2A). However, when the budget for sequencing is fixed, then increasing the sample size is costly—not only because wet lab costs increase, but also because the amount of data per observation decreases. Too many replicates can waste time and money, while too few can waste an entire experiment. How can a biologist know ahead of time how many replicates are enough?
Fig. 2: Experimental design strategies for optimizing sample size and improving signal:noise ratio.
Statistical power depends on both the between-group variance (the signal or effect size) and the within-group variance (the noise). Smaller effect sizes require larger sample sizes to detect, especially when noise is high. A Points show trait values of individuals comprising two populations (used in the statistical sense of the word); horizontal lines indicate the true mean trait value for each population. Thus, the distance between horizontal lines is the effect size. The populations could be two different species of yeast; the same species of yeast growing in two experimental conditions; the wild type and mutant genotypes; etc. To estimate the difference between populations, the researcher can only feasibly measure a subset (i.e., sample) of individuals from each population. The yellow boxes report the minimum sample size per group needed to provide an 80% chance of detecting the difference using a t-test, as determined using power analysis. B Blocking reduces the noise contributed by unmeasured, external factors (e.g., soil quality in a field experiment with the goal of comparing fungal colonization in the roots of two plant species). Soil quality is represented by the background color in each panel. Top: without blocking, soil quality influences the dependent variable for each replicate in an unpredictable way, creating high within-group variance. Middle: spatial arrangement of replicates into two blocks allows estimation of the difference between species while accounting for the fact that trait values are expected to differ between the blocks on average. Bottom: in a paired design, each block contains one replicate from each group, allowing the difference in fungal colonization to be calculated directly for each block and tested through a powerful one-sample or paired t-test. C Three statistical models that could be used in an ANOVA framework to test for the difference in fungal density between plant species, as illustrated in panel B. Relative to Model 1, the within-group variance for each plant species will be reduced in both Model 2 and Model 3. For Model 2, this is accomplished using blocking; a portion of the variance in fungal density can be attributed to environmental differences between the blocks (although these may be unknown and/or unmeasured) and therefore removed from the within-group variances. For Model 3, it is accomplished by including covariates; a portion of the variance in fungal density can be attributed to the concentrations of N and P in the soil near each plant and therefore removed from the within-group variances. Note that for Model 3, the covariates need to be measured for each experimental unit (i.e., plant), rather than each block; in fact, it is most useful when blocking is not an option.
A flexible but underused solution to this problem—power analysis—has existed for nearly a century14,15. Power analysis is a method to calculate how many biological replicates are needed to detect a certain effect with a certain probability, if the effect exists (Fig. 2A). It has five components: (1) sample size, (2) the expected effect size, (3) the within-group variance, (4) false discovery rate, and (5) statistical power, or the probability that a false null hypothesis will be successfully rejected. By defining four of these, a researcher can calculate the fifth.
Usually, both the effect size and within-group variance are unknown because the data have not yet been collected. The choice of effect size for a power analysis is not always obvious: researchers must decide a priori what magnitude of effect should be considered biologically important. Acceptable solutions to this problem include expectations based on the results of small pilot experiments, using values from comparable published studies or meta-analyses, and reasoning from first principles. For example, a biologist planning to test for differential gene transcription may define the minimum interesting effect size as a 2-fold change in transcript abundance, based on a published study showing that transcripts stochastically fluctuate up to 1.5-fold in a similar system. In this scenario, the stochastic 1.5-fold fluctuations in transcript abundance suggest a reasonable within-group variance for the power analysis. As another example, a bioengineer may only be interested in mutations that increase cellulolytic enzyme activity by at least 0.3 IU/mL relative to the wild type, because that is known to be the minimum necessary for a newly designed bioreactor. To determine the within-group variance for a power analysis, the bioengineer could conduct a small pilot study to measure enzyme activity in wild-type colonies. As a final example, if a preliminary PERMANOVA for a very small amount of metabolomics data (say, N = 2 per group) from a pilot study estimated that R² = 0.41, then a researcher could use that value as the target effect size for which he/she aims to obtain statistical support.
Freely available software facilitates power analysis for basic statistical tests of normally-distributed variables, including correlation, regression, t-tests, chi-squared tests, and ANOVA16,17,18,19. Power analysis for -omics studies, however, is more complex for several reasons. -Omics data comprise counts that are not normally distributed and may contain many zeroes, and some features may be inherently correlated with each other; these properties must be modeled appropriately for a power analysis to be useful20. The large number of features often requires adjustment of P-values to correct for inflated Type I error rates, which in turn decreases power. Furthermore, statistical power varies among features because they differ not only in within-group variance, but also in their overall abundance. In general, more replicates are required to detect changes in low-abundance features than in high-abundance ones8. Statistical power also varies among different alpha- and beta-diversity metrics21 that are commonly used in amplicon-sequencing analysis of microbiomes. For power analysis of multivariate tests (e.g., PERMANOVA), which incorporate information from all features simultaneously to make broad comparisons between groups, simulations are necessary to account for patterns of co-variation among features. In such cases, pilot data or other comparable data are crucial for accurate power analysis.
Fortunately, tools are available to estimate power for common analyses used in proteomics22, RNA-seq3,20,23,24, and microbiome studies21,25,26. Recent reviews27,28 demonstrate this process for various forms of amplicon-sequencing data, including taxon abundances, alpha- and beta-diversity metrics, and taxon presence/absence. They also consider several types of hypothesis tests, including comparisons between groups and correlations with continuous predictors. Although power analysis may seem daunting at first, it is an investment with excellent returns. This skill empowers biologists to budget more accurately, write more convincing grant proposals, and minimize their risk of wasting effort on experiments that cannot generate conclusive results.
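For the simple case of a two-group comparison of a normally distributed trait, the calculation can be sketched with the statsmodels power routines, as below; the effect size and within-group standard deviation are hypothetical stand-ins for values that would come from pilot data or the literature, and count-based -omics data would instead need the specialized tools cited above.

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical pilot values: a minimum biologically interesting difference of 1.0 units
# with a within-group SD of 1.5 gives a standardized effect size (Cohen's d) of ~0.67.
effect_size = 1.0 / 1.5

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=effect_size,
                                   alpha=0.05,      # acceptable Type I error rate
                                   power=0.80,      # 80% chance of detecting the effect
                                   alternative="two-sided")
print(f"Replicates needed per group: {n_per_group:.1f}")

# The same call can be inverted: fix the affordable sample size and ask what power it buys.
achievable = analysis.solve_power(effect_size=effect_size, nobs1=12, alpha=0.05)
print(f"Power with 12 replicates per group: {achievable:.2f}")
```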
Empowerment through noise reduction
As explained above, statistical power is positively related to the sample size and negatively related to the within-group variance. Thus, we can increase power either by including more biological replicates or by reducing within-group variance. Because our budgets do not allow us to increase replication indefinitely, it is useful to consider: What methods exist to increase power by decreasing within-group variance?
A classic way to minimize within-group variance is to remove as many variables as possible29. Common examples include using a single strain instead of multiple; strictly controlling the lab environment; and using only male host animals30. Such practices minimize noise, or variance contributed by unplanned, unmeasured variables. However, they come with a drawback: the loss of generalizability30,31,32. If a mutation’s phenotype cannot be detected in the face of minor environmental variation, is it likely to be relevant in nature? If an interesting function occurs only in a lab strain, is it important for the species in general? Such limitations should always be considered when interpreting results.
Another technique to reduce within-group variance is blocking, the physical and/or temporal arrangement of replicates into sets based on a known or suspected source of noise (Fig. 2B). For instance, clinical trials may block by sex, so that results will be based on comparisons of males to males and females to females. Crucially, all experimental treatments must be randomly assigned within each block. Blocking is also useful for large experiments with more replicates than can be measured at once; as long as the replicates within each block are treated identically, sources of noise such as multiple observers can be controlled. The most powerful form of blocking is paired design, in which experimental units are kept in pairs from the beginning. One unit per pair is randomly assigned to the treatment group, the other to the control; the difference between units is then calculated directly, automatically accounting for sources of noise that are shared within the pair (Fig. 2B, bottom panel). A possible downside of highly-structured blocking designs is that they can complicate the re-use of the data to answer questions other than the one for which the design was optimized, whereas a simple randomized design is highly flexible but not optimized for any particular purpose.
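The payoff of a paired design can be seen in a small simulated sketch (hypothetical numbers: a species difference of 1 unit buried under block-to-block noise three times larger). The unpaired test lumps the block noise into the within-group variance, while the paired test lets it cancel.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical paired design: 8 blocks, each containing one replicate of each plant species.
# Blocks differ strongly in overall quality (shared noise); species differ by a small amount.
block_effect = rng.normal(0, 3.0, size=8)                     # e.g., soil quality per block
species_a = 10 + block_effect + rng.normal(0, 0.5, size=8)
species_b = 11 + block_effect + rng.normal(0, 0.5, size=8)    # true difference of 1

# Unpaired test: block-to-block noise inflates the within-group variance.
print(stats.ttest_ind(species_a, species_b))
# Paired test: each block serves as its own control, so the shared noise cancels.
print(stats.ttest_rel(species_a, species_b))
```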
When sources of noise are known beforehand, they should ideally be measured during the experiment so that they can be used as covariates. A covariate is any independent variable that is not the focus of an analysis but might influence the dependent variable. When using regression, ANOVA, or similar methods, one or more covariates may be included in the statistical model to control for their effects on the variable of interest (Fig. 2C). For instance, controlling for replicates’ spatial locations can dramatically reduce noise in field studies33,34. Similarly, dozens of covariates related to lifestyle and medical history contribute to the variation in human fecal microbiome composition35. The authors of that study showed that by controlling for just three of these covariates, the minimum sample size needed to detect a microbiome contrast between lean and obese individuals would decrease from 865 to 535 individuals per group.
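As a schematic example (simulated data with invented variable names and effect sizes, not the cited study’s analysis), fitting the same group comparison with and without a covariate in an ordinary least-squares model shows how accounting for a strong covariate tightens the estimate of the group effect.

```python
# Hypothetical example: adding a measured covariate ("bmi") to the model
# absorbs variance in the response, shrinking the standard error of the
# group effect. All names and numbers are made up for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 40
df = pd.DataFrame({
    "group": np.repeat(["lean", "obese"], n // 2),
    "bmi":   rng.normal(25, 4, n),
})
# Response depends on the group and, more strongly, on the covariate.
df["diversity"] = (np.where(df["group"] == "obese", -1.0, 0.0)
                   + 0.5 * df["bmi"] + rng.normal(0, 1, n))

without_cov = smf.ols("diversity ~ group", data=df).fit()
with_cov    = smf.ols("diversity ~ group + bmi", data=df).fit()
# Standard error of the group effect, without and with the covariate:
print(without_cov.bse["group[T.obese]"], with_cov.bse["group[T.obese]"])
```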
Finally, pooling (combining two or more replicates from the same group prior to measurement) can reduce within-group variance and the influence of outliers36. For example, pooling RNA can empower biologists to detect differential gene expression from fewer replicates, reducing costs of library preparation and sequencing37. This approach is especially helpful for detecting features that are low-abundance and/or have large within-group variance. Its main drawbacks are that it reduces sample size and results in the loss of information about specific individuals, which may be necessary for unambiguously connecting one dataset to another or linking the response variable to covariates. In fact, excessive or unnecessary pooling is a common experimental design error that can eliminate replication (Fig. 1C–D; Box 2), but is easily avoided by remembering that the pools themselves are the biological replicates.
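The trade-off can be seen in a toy sketch (arbitrary values): averaging individuals into pools shrinks the within-group variance, but the pools, now fewer in number, are the biological replicates for any downstream test.

```python
# Toy illustration: pooling reduces within-group variance but also reduces
# the number of independent replicates available for analysis.
import numpy as np

rng = np.random.default_rng(3)
individuals = rng.lognormal(mean=2.0, sigma=1.0, size=24)   # 24 individuals
pools = individuals.reshape(8, 3).mean(axis=1)              # 8 pools of 3 each

print("n =", individuals.size, "variance =", round(individuals.var(ddof=1), 1))
print("n =", pools.size,       "variance =", round(pools.var(ddof=1), 1))
```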
Empowerment through inclusion of appropriate controls
Key to any experimental design is the inclusion of appropriate positive and negative controls to strengthen conclusions and enable correct interpretation of the experimental results (Box 2). Positive controls can confirm that experimental treatments and measurement approaches work as expected. Positive controls can, for example, be samples with known properties that are carried through a procedure from start to end alongside the experimental samples. For microbiome-related protocols, mock community samples can often be a good positive control38. In addition to serving as positive controls, spike-ins can serve as internal calibration and normalization standards39,40. Negative controls allow detection of artifacts introduced during the experiment or measurement. Negative controls can be samples without the addition of the organism/analyte/treatment of interest that are carried alongside the experimental samples through all experimental steps. They often reveal artifacts caused by the matrix/medium in which samples are embedded; for example, to analyze secreted proteins in the culture supernatant of a bacterium, measuring the proteins in non-inoculated culture medium would be a critical negative control. In microbiome and metagenomics studies, negative controls are particularly crucial when working with low-biomass samples because contaminants from reagents can become prominent features in the resulting datasets41.
Empowerment through randomized design
The random arrangement of biological replicates with regard to space, time, and experimental treatments is a crucial research practice for several reasons.
Randomization protects against confounding variables
For a given level of replication, the researcher must decide how to distribute those replicates in time and space. The importance of randomization—the arrangement of biological replicates in a random manner with respect to the variables being tested—has long been appreciated in ecology, clinical research, and other fields where external influences are impossible to control. Even in relatively homogeneous lab settings, failure to randomize can lead to ambiguous or misleading results. This is because randomization minimizes the possibility that unplanned, unmeasured variables will be confounded with the treatment groups42. Unlike experimental noise—random deviations that decrease power by increasing variance—confounding variables cause a subset of replicates to deviate systematically from the others in a way that is unrelated to the intended experimental design. Thus, they cause biased results. Fortunately, confounding variables that are structured in time and/or space can be controlled through randomization (Fig. 3). Even for complex experimental designs, randomization can be easily achieved by sorting a spreadsheet based on a randomly generated number to assign the position of each replicate. In a fully randomized design, all replicates in an experiment are randomized as a single group. In structured designs such as an experiment that employs blocking (see “Empowerment through noise reduction”), however, the best approach is to randomize the replicates within each block, independently of the other blocks.
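In practice, this takes only a few lines. The sketch below (illustrative column names and block labels) assigns treatments either fully at random or at random within each block of a spreadsheet-like layout table.

```python
# Minimal sketch of randomizing treatment assignments, overall and within
# blocks, using a spreadsheet-like table. Column names are illustrative.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2025)
layout = pd.DataFrame({
    "replicate": range(1, 25),
    "block":     np.repeat(["day1", "day2", "day3"], 8),
    "treatment": np.tile(["control", "mutant"], 12),
})

# Fully randomized design: shuffle all treatment labels as one group.
fully_random = layout.assign(
    treatment=rng.permutation(layout["treatment"].to_numpy()))

# Blocked design: shuffle treatment labels independently within each block,
# so each block retains a balanced set of treatments.
blocked = layout.copy()
blocked["treatment"] = (blocked.groupby("block")["treatment"]
                        .transform(lambda t: rng.permutation(t.to_numpy())))
print(blocked.head(8))
```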
Fig. 3: Randomization of biological replicates across space, time, and batches can reduce experimental bias and reveal interactions between variables.
A Top: An undetected temperature gradient within a lab causes a false positive result. The mutant strain grows more slowly than the wild-type on the cooler side of the lab. Bottom: After randomizing the flasks in space, temperature is no longer confounded with genotype and the mutation is revealed to have no effect on growth. B Top: A chronological confounding factor causes a false negative result. When cells are counted in all of the rich-media replicates first, the poor-media replicates systematically have more time to grow, masking the treatment effect. Other external variables that can change over time include the lab’s temperature or humidity, the organism’s age or circadian status, and researcher fatigue. Bottom: Randomizing the order of measurement eliminates the confounding factor, revealing the treatment effect. C Left: Batch effects exaggerate the similarity between the yellow and green groups and between the purple and blue groups. Right: Randomization of replicates from all four groups across batches leads to a more accurate measurement of the similarities and differences among the groups. Inclusion of positive and negative controls (black and white) can help to detect batch effects. D Randomization is necessary to test for interactions. Left: In hypothetical Experiments 1 and 2, one variable (genotype) is randomized but the other (ampicillin) is not. These observations, separated in time and/or space, cannot be used to conclude that ampicillin influences the effect of the mutation. Right: Both variables (genotype, ampicillin) are randomized and compared within a single experiment. A 2-way ANOVA confirms the interaction and the conclusion that ampicillin influences the mutation’s effect on growth is valid. The two plots displaying the interaction are equivalent and interchangeable; the first highlights that the effect of the mutation is apparent only in the presence of ampicillin, while the second highlights that ampicillin inhibits growth only for the wild-type strain. E–H illustrate other patterns that can only be revealed in a properly randomized experiment. E A low-protein diet reduces respiration rate overall, but that effect is stronger for female than male animals. F Two plant genotypes show a rank change in relative fitness depending on the environment in which they are grown. G Two genes have epistatic effects on a phenotypic trait: a mutation in Gene 1 can either increase or decrease trait values depending on the allele at Gene 2. H This plot shows a lack of interaction between the pathogen strain and the host immunotype, as indicated by the (imaginary) parallel line segments that connect pairs of points. In contrast, the line segments that would connect pairs of points would not be parallel if an interaction were present (see E–G). In H, host immunotype A is more susceptible to both pathogen strains than host immunotype B, and pathogen strain 1 causes more severe disease than strain 2 regardless of host. Image credits: vector art courtesy of NIAID (orbital shaker) and Servier Medical Art (reservoir).
Projects using -omics techniques are particularly vulnerable to batch effects, a common type of confounding variable with the potential to invalidate an experiment1,43,44,45,46 (Fig. 3C). When not all replicates can be processed simultaneously, they must be divided into batches that often differ not only in chronological factors but also in reagent lots, sequencing runs, and other technical variables. While batch effects can be minimized through careful control of experimental conditions, they are difficult to avoid entirely. Although some tools are available to cleanse datasets of batch-related patterns47, the biological effect cannot be disentangled from the batch effect if the two are severely confounded. Therefore, it is always wise to randomize replicates among analytical batches (Fig. 3C).
Randomization allows testing for interactions
Finally, randomization is necessary to rigorously test for interactions between experimental factors (Fig. 3D–H). An interaction is present if one independent variable moderates the effect of another independent variable on the dependent variable. Examples of interactions include epistasis between genes, temperature influencing the phenotypic effect of a mutation, and host genotypes differing in susceptibility to a pathogen.
Many biologists are interested in such context-dependency, but proper experimental design is crucial for testing interactions rigorously. It can be tempting to conduct a series of simple trials that alter one variable at a time and then compare the results of those trials; however, this approach is invalid for testing interactions48. Instead, when multiple independent variables are of interest, the biological replicates must be randomized with respect to all of them, simultaneously (Fig. 3D).
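As a minimal sketch of such a factorial analysis (simulated data loosely modeled on the genotype-by-ampicillin example in Fig. 3D; all numbers are invented), both factors are randomized within one experiment and the interaction is tested with a two-way ANOVA.

```python
# Two-way ANOVA with an interaction term, on simulated data in which the
# mutation affects growth only in the presence of ampicillin.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
design = pd.DataFrame(
    [(g, a) for g in ("wildtype", "mutant") for a in ("absent", "present")
     for _ in range(6)],
    columns=["genotype", "ampicillin"])

# Only the wild-type is inhibited by ampicillin (an interaction-only effect).
susceptible = ((design["genotype"] == "wildtype")
               & (design["ampicillin"] == "present"))
design["growth"] = 1.0 - 0.5 * susceptible + rng.normal(0, 0.1, len(design))

model = smf.ols("growth ~ genotype * ampicillin", data=design).fit()
# The genotype:ampicillin row tests whether the drug modifies the mutation's effect.
print(sm.stats.anova_lm(model, typ=2))
```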
Occasionally full randomization is impossible and the experiment may require a split-plot design, where one variable is randomly assigned to groups of replicates rather than individual replicates. For instance, tubes cannot be randomized with respect to temperature within a single incubator; therefore, they must be distributed across multiple incubators, each of which is randomly assigned to a temperature treatment. The main challenge of split-plot designs is that two plots can differ from each other in ways other than the applied treatment, leading to uncertainty about any observed effects of the treatment. In the above scenario, to minimize the possibility that unintentional differences between incubators are the true cause of any observed differences, ideally at least 2 incubators would be used per temperature (either in parallel or in sequential replications of the experiment). As long as the tubes are randomized within incubators with respect to other variables, tests for interactions are valid. However, split-plot designs have less statistical power than fully-randomized designs.
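One common way (though not the only one) to analyze such a design is a linear mixed model that treats the whole-plot unit, here the incubator, as a random effect. The sketch below uses simulated data and hypothetical variable names; with so few incubators the random-effect variance is estimated crudely, so it should be read only as an illustration of the model structure.

```python
# Split-plot-style analysis: temperature is assigned to whole incubators,
# strains vary among tubes within each incubator. Data are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
rows = []
for incubator, temp in [("inc1", 25), ("inc2", 25), ("inc3", 37), ("inc4", 37)]:
    incubator_noise = rng.normal(0, 0.2)           # whole-plot (incubator-level) noise
    for strain in ("wildtype", "mutant") * 4:       # 8 tubes per incubator
        growth = (1.0 + 0.02 * (temp - 25) + 0.3 * (strain == "mutant")
                  + incubator_noise + rng.normal(0, 0.1))
        rows.append((incubator, temp, strain, growth))
df = pd.DataFrame(rows, columns=["incubator", "temp", "strain", "growth"])

# Mixed model: fixed effects for temperature, strain, and their interaction;
# a random intercept for each incubator absorbs whole-plot differences.
fit = smf.mixedlm("growth ~ temp * strain", df, groups=df["incubator"]).fit()
print(fit.summary())
```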
Simulating “biology, chemistry, and physics coming together” in even one pocket along Greenland’s 27,000 miles (43,000 kilometers) of coastline is a massive math problem, noted lead author Michael Wood, a computational oceanographer at San José State University. To break it down, he said the team built a “model within a model within a model” to zoom in on the details of the fjord at the foot of the glacier.
Using supercomputers at NASA’s Ames Research Center in Silicon Valley, they calculated that deepwater nutrients buoyed upward by glacial runoff would be sufficient to boost summertime phytoplankton growth by 15 to 40% in the study area.
More Changes in Store
Could increased phytoplankton be a boon for Greenland’s marine animals and fisheries? Carroll said that untangling impacts to the ecosystem will take time. Melt on the Greenland ice sheet is projected to accelerate in coming decades, affecting everything from sea level and land vegetation to the saltiness of coastal waters.
“We reconstructed what’s happening in one key system, but there’s more than 250 such glaciers around Greenland,” Carroll said. He noted that the team plans to extend their simulations to the whole Greenland coast and beyond.
Some changes affect the carbon cycle in offsetting ways: The team calculated how runoff from the glacier alters the temperature and chemistry of seawater in the fjord, making it less able to dissolve carbon dioxide. That loss is canceled out, however, by the bigger blooms of phytoplankton taking up more carbon dioxide from the air as they photosynthesize.
Wood added: “We didn’t build these tools for one specific application. Our approach is applicable to any region, from the Texas Gulf to Alaska. Like a Swiss Army knife, we can apply it to lots of different scenarios.”
United Launch Alliance’s (ULA) new Vulcan Centaur rocket will conduct its first-ever national security launch next week, if all goes according to plan.
ULA announced on Tuesday (Aug. 5) that it’s targeting Aug. 12 for USSF-106, a U.S. Space Force mission that will lift off from Cape Canaveral Space Force Station in Florida.
“This is the first national security space launch aboard the certified Vulcan rocket. The Vulcan rocket will deploy the USSF-106 mission directly to geosynchronous (GEO) orbit using the high-performance Centaur V upper stage,” ULA said via X on Tuesday.
Vulcan Centaur — the replacement for ULA’s venerable Atlas V rocket — has two flights under its belt to date, both of which have been successful.
The first one, which flew in January 2024, sent Astrobotic’s robotic Peregrine moon lander to Earth orbit. (Peregrine suffered a crippling anomaly shortly after it deployed from the rocket’s Centaur upper stage and ended up crashing back to Earth.)
Vulcan’s second flight, in October 2024, was a test mission that flew with an inert mass simulator as a payload. The mass simulator took the place of Sierra Space’s Dream Chaser space plane, the originally planned payload, which wasn’t ready in time for the launch.
Vulcan powered through a problem on that second flight — the failure of an engine nozzle on one of its two solid rocket boosters (SRBs). Its performance on those two missions impressed the Space Force enough to certify Vulcan Centaur for national security missions, a huge milestone for ULA that was announced in March.
“The launch of a United Launch Alliance Vulcan rocket carrying the U.S. Space Force (USSF)-106 mission for the United States Space Force’s Space Systems Command (SSC) is planned for Tuesday, Aug. 12, 2025, from Space Launch Complex (SLC) 41 at Cape Canaveral Space Force Station, …” (ULA via X, Aug. 5, 2025)
The decision doubled the number of currently certified U.S. national security launch providers; SpaceX had been the only company that could loft such payloads. (ULA’s Atlas V launched many national security missions over the years, but no such payloads are on its docket ahead of its retirement in 2030 or so.)
“Assured access to space is a core function of the Space Force and a critical element of national security,” Brig. Gen. Kristin Panzenhagen, the Space Force’s program executive officer for assured access to space, said in a statement in late March. “Vulcan certification adds launch capacity, resiliency and flexibility needed by our nation’s most critical space-based systems.”