Baseline characteristics of patients with colorectal cancer
Based on the inclusion and exclusion criteria, we initially enrolled a total of 256 patients in the study. Following a rigorous screening process, the final study cohort consisted of 44 patients in the colorectal cancer with liver metastasis (metastasis) group and 85 patients in the colorectal cancer without liver metastasis (non-liver metastasis) group. We retrospectively collected data on tumor markers, immunohistochemical indicators, maximum tumor diameter, number of metastatic lymph nodes, and lymphocytes in serum, as well as follow-up data. We analyzed the colorectal cancer tissue samples from these patients for intratumoral microbiota (Table 1). Comparative analysis of the clinical characteristics between the two groups revealed significantly higher levels of carcinoembryonic antigen (CEA) and cancer antigen 19-9 (CA19-9) in the metastasis group compared with the non-metastasis group. The metastasis group also exhibited enhanced proliferation, invasion, and malignancy, as evidenced by the significantly increased expression of Ki-67 and p53 in the tumor tissue (p < 0.05). Analysis of the serum metabolic markers related to liver and renal function showed that the metastasis group had significantly higher levels of fasting blood glucose, albumin, triglyceride, and creatinine compared with the non-metastasis group, suggesting active metabolism in colorectal cancer with liver metastasis. Additionally, immune cell profiling in patient sera showed increased proportions of T cells and natural killer (NK) cells, indicating co-activation of the metabolic and immune systems in colorectal cancer with liver metastasis.
Because microorganisms play a crucial role in the onset and progression of colorectal cancer and are closely associated with host immunity and metabolism, we profiled the intratumoral microbiota of the patients. Alpha diversity analysis indicated that colorectal cancer patients with and without liver metastasis differed significantly in intratumoral microbial diversity, with significantly different Chao1 and Shannon indices (p < 0.05; Fig. 1A–C). Beta diversity analysis using nonmetric multidimensional scaling (NMDS) and principal coordinate analysis (PCoA) did not reveal any significant differences between samples (Fig. 1D, E), likely because the intratumoral microbiota is broadly similar across colorectal cancer tissue samples. These results suggest that the sources of intratumoral microbiota are generally consistent among patients with colorectal cancer. Nevertheless, the significant differences in intratumoral microbiota and survival outcomes between the metastasis and non-metastasis groups indicate that variations in intratumoral microbiota may play an important role and warrant further investigation. Despite these differences, the small number of microbial communities shared among the samples can be analyzed further.
A–C Analysis of α-diversity between colorectal cancer liver metastasis and non-liver metastasis groups (Chao1, Shannon, and Simpson indices). D, E Analysis of β-diversity between colorectal cancer liver metastasis and non-liver metastasis groups (NMDS and PCoA analysis). (*p < 0.05, **p ≤ 0.01, ***p < 0.001, ****p ≤ 0.0001, indicating statistical significance).
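The three α-diversity indices named above can be illustrated with a minimal sketch. It assumes a per-sample vector of taxon read counts. The natural logarithm in the Shannon index, the 1 − Σp² form of the Simpson index, and the bias-corrected Chao1 formula are assumptions made for illustration, since the source does not state which variants were used. The function names and `toy_counts` are hypothetical.

```python
import math

def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson(counts):
    """Simpson diversity reported as 1 - sum(p_i^2) (assumed variant)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1*(F1-1) / (2*(F2+1)),
    where F1 and F2 are the numbers of singleton and doubleton taxa."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Hypothetical read counts for one tumor sample (not data from this study).
toy_counts = [10, 5, 2, 1, 1, 0]
indices = {
    "chao1": chao1(toy_counts),
    "shannon": shannon(toy_counts),
    "simpson": simpson(toy_counts),
}
```

Computing the indices per sample and then comparing their distributions between the two groups mirrors the kind of comparison shown in Fig. 1A–C, although the statistical test used for that comparison is not specified here.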
Intratumoral microbiota composition in colorectal cancer patients with liver metastasis and non-liver metastasis
To analyze the differences in intratumoral microbiota composition between colorectal cancer patients with and without liver metastasis, we performed differential abundance analysis of microbial taxa. We focused on the ten taxa with the highest relative abundance (Fig. 2A–F). As expected, we observed significant changes in phylum-level abundance between the two groups. Specifically, in the liver metastasis group, the relative abundance of Actinobacteria, Thermi, and Firmicutes increased, while that of Fusobacteria, Proteobacteria, and Bacteroidetes decreased. At the genus level, Deinococcus, Bacillus, and Corynebacterium showed increased abundance in the metastasis group, whereas Pseudomonas, Burkholderia, and Enterobacter were less abundant. At the species level, the relative abundance of Isoptericola variabilis, Bacillus flexus, Pseudomonas aeruginosa, and Enterobacter cancerogenus was higher in the non-liver metastasis group, whereas Enterococcus casseliflavus, Bacillus niabensis, and Deinococcus murrayi were more abundant in the liver metastasis group, indicating some structural differences. We further explored the bacteria that differed between the liver metastasis and non-liver metastasis groups. Using library size-normalized counts and non-parametric significance tests, we identified bacterial taxa that differed significantly between groups. At the genus level, Odoribacter, Leptothrix, Clavibacter, and Caulobacter were significantly more abundant in the liver metastasis group (p < 0.05), with Odoribacter and Leptothrix showing the most significant differences. In contrast, Agrobacterium, Fusobacterium, Methylobacterium, and Faecalibacterium were more abundant in the non-liver metastasis group (p < 0.05), with Fusobacterium showing the most notable decrease in the liver metastasis group (Fig. 2G).
At the species level, common bacteria such as Faecalibacterium prausnitzii were significantly more abundant in the non-liver metastasis group and significantly reduced in the liver metastasis group (p < 0.05) (Fig. 2H). These differences provide important evidence for further investigation into the role of intratumoral microbiota in the development of liver metastasis in colorectal cancer.

A–F The proportion of dominant intratumoral microbiota communities at the phylum, class, order, family, genus, and species levels in the colorectal cancer liver metastasis and non-liver metastasis groups. G, H Differentially abundant genera and species of intratumoral microbiota between the colorectal cancer liver metastasis and non-liver metastasis groups. (*p < 0.05, **p ≤ 0.01, ***p < 0.001, ****p ≤ 0.0001, indicating statistical significance).
Functional prediction analysis of intratumoral microbiota in colorectal cancer patients with liver metastasis and non-liver metastasis
We used PICRUSt2 (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States 2) to further analyze the sequencing data and predict the functional potential of the microbiota. Associating the microbiota with functional information helps characterize the functional composition of the microbiome and elucidate potential mechanisms of interaction between the microbiota, host, and environment, providing a basis for subsequent, deeper investigations. Using gene function annotations from databases such as COG, KO, PFAM, and TIGRFAM, we conducted differential analysis with the STAMP (STatistical Analysis of Metagenomic Profiles) software to identify the top 10 significantly different gene functions between the liver metastasis and non-liver metastasis groups. KO analysis revealed statistically significant differences between the two groups in pathways related to fatty acid biosynthesis, secondary bile acid biosynthesis, and other carbohydrate synthesis and metabolism pathways (Fig. 3A). In the COG database analysis, we identified significant differences between the two groups in enzymes related to pyruvate kinase, alpha-amylase/alpha-mannosidase, and amino acid synthesis (Fig. 3B). Additionally, we predicted protein families and domains in the liver metastasis group (Fig. 3C, D). In the PFAM database, Oxaloacetate Decarboxylase and Immunity Protein 17 were significantly more abundant in the liver metastasis group than in the non-metastasis group, suggesting that they are functionally important proteins. These proteins are potentially linked to the tricarboxylic acid cycle and immune responses. In the TIGRFAM database, we observed an increase in Pyruvate Kinase in the liver metastasis group, which may be related to carbohydrate metabolism.
In summary, functional prediction for the intratumoral microbiota in the liver metastasis and non-liver metastasis colorectal cancer groups showed that the microbiota is associated with the host’s metabolic and immune systems, with statistically significant differences between the groups. These findings provide direction and groundwork for future studies.

A Differential analysis of functional prediction between the colorectal cancer liver metastasis and non-liver metastasis groups using the KEGG (Kyoto Encyclopedia of Genes and Genomes) database. B Protein functions predicted with the COG (Clusters of Orthologous Groups of proteins) database in the two groups. C Protein families predicted with the PFAM (Protein Families Database) database in the two groups. D Protein families predicted with the TIGRFAM (TIGR defined protein families) database in the two groups. (*p < 0.05, **p ≤ 0.01, ***p < 0.001, ****p ≤ 0.0001, indicating statistical significance).
Identification of intratumoral microbiota with predictive value in colorectal cancer patients with liver metastasis and non-liver metastasis
We further explored the intratumoral microbiota in colorectal cancer patients with and without liver metastasis to identify microbiota with predictive potential. Through clustering and correlation analysis of the microbiota in the two groups (Fig. 4A), we identified several genera with significant correlations. Methylobacterium, Agrobacterium, Faecalibacterium, and Fusobacterium were more abundant in the non-metastasis group and less abundant in the metastasis group. Conversely, Caulobacter, Odoribacter, Leptothrix, and Clavibacter exhibited the opposite pattern, with higher abundance in the metastasis group and lower abundance in the non-metastasis group. At the species level (Fig. 4B), the clustering of microbial populations was not as pronounced. The Faecalibacterium prausnitzii cluster was less prominent, and the 16S rRNA analysis was more precise at the genus level. Therefore, we chose to focus on genus-level analysis for further investigation. We then analyzed the diagnostic potential of the two intratumoral microbiota clusters using ROC (Receiver Operating Characteristic) curve analysis. For the four genera that were more abundant in the metastasis group (Caulobacter, Odoribacter, Leptothrix, and Clavibacter), individual ROC analysis showed poor ability to distinguish liver metastasis from non-liver metastasis (p > 0.05) (Fig. 4C and Table 2), with all AUC values below 0.7. Moreover, a combined diagnostic ROC analysis of these four clustered genera (Fig. 4D) yielded an AUC of 0.67 (p = 0.05263), with a sensitivity of 52.27% and a specificity of 83.53%. Thus, this combination of microbiota was not an ideal diagnostic model for distinguishing between liver metastasis and non-liver metastasis.

A, B Cluster analysis of intratumoral microbiota in the two groups at the genus and species levels. C Individual ROC curve analysis of Caulobacter, Odoribacter, Leptothrix, and Clavibacter in the colorectal cancer liver metastasis and non-liver metastasis groups. D Combined ROC curve analysis of Caulobacter, Odoribacter, Leptothrix, and Clavibacter between the two groups. E Individual ROC curve analysis of Agrobacterium, Fusobacterium, Methylobacterium, and Faecalibacterium in the colorectal cancer liver metastasis and non-liver metastasis groups. F Combined ROC curve analysis of Agrobacterium, Fusobacterium, Methylobacterium, and Faecalibacterium between the two groups. G Correlation analysis between Agrobacterium, Fusobacterium, Methylobacterium, and Faecalibacterium. (*p < 0.05, **p ≤ 0.01, ***p < 0.001, ****p ≤ 0.0001, indicating statistical significance).
In contrast, Methylobacterium, Agrobacterium, Faecalibacterium, and Fusobacterium were significantly correlated with one another and were less abundant in the liver metastasis group. In individual ROC analysis, each of these genera showed a statistically significant ability to distinguish the liver metastasis group from the non-metastasis group (p < 0.05): Methylobacterium (AUC = 0.63, sensitivity 81.82%, specificity 44.71%), Agrobacterium (AUC = 0.63, sensitivity 72.73%, specificity 55.29%), Faecalibacterium (AUC = 0.61, sensitivity 52.27%, specificity 64.71%), and Fusobacterium (AUC = 0.62, sensitivity 93.18%, specificity 31.76%) (Fig. 4E and Table 3). Furthermore, in a combined diagnostic ROC analysis of these four genera, the AUC increased to 0.78 (p < 0.0001), with a sensitivity of 68.18% and a specificity of 74.12% (Fig. 4F). Additionally, correlation analysis of Methylobacterium, Agrobacterium, Faecalibacterium, and Fusobacterium (Fig. 4G) revealed significant positive correlations among these genera (p < 0.05). Together, these results suggest that the four genera could be used as a panel for predicting liver metastasis in colorectal cancer.
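How an AUC, sensitivity, and specificity relate to one another can be sketched as follows. This is an illustrative sketch, not the analysis pipeline used in the study: it assumes each patient has a single numeric score (for example, a model output that is higher for liver metastasis), computes the AUC as the probability that a positive case outranks a negative one, and picks the cutoff by Youden's J statistic. The cutoff rule, function names, and toy scores are all assumptions.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def best_youden_cutoff(scores_pos, scores_neg):
    """Return (cutoff, sensitivity, specificity) maximizing
    J = sensitivity + specificity - 1, calling score >= cutoff positive."""
    best = None
    for cut in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(s >= cut for s in scores_pos) / len(scores_pos)
        spec = sum(s < cut for s in scores_neg) / len(scores_neg)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cut, sens, spec)
    return best[1:]

# Hypothetical scores (not data from this study).
metastasis_scores = [0.9, 0.8, 0.6, 0.5]
non_metastasis_scores = [0.7, 0.4, 0.3, 0.2]
auc = roc_auc(metastasis_scores, non_metastasis_scores)
cutoff, sensitivity, specificity = best_youden_cutoff(
    metastasis_scores, non_metastasis_scores
)
```

For genera that are less abundant in the metastasis group, the raw abundance would need to be negated, or replaced by a fitted model score, before being passed in, so that higher scores still indicate metastasis.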
Identification of microbial community subtypes in colorectal cancer patients with liver metastasis
Next, we investigated the intratumoral microbiota of colorectal cancer patients with liver metastasis in more detail. By combining host and tumor microbiota data, we asked whether the intratumoral microbiota composition in liver metastasis patients could be classified into distinct community subtypes. Based on the previous analyses, we identified the four genera most likely associated with liver metastasis: Methylobacterium, Agrobacterium, Faecalibacterium, and Fusobacterium. We analyzed the intratumoral microbiota profiles of each liver metastasis patient and performed dimensionality reduction and clustering via Principal Component Analysis (PCA). As a result, colorectal cancer liver metastasis samples were broadly classified into three intratumoral microbial community subtypes (IMCSs), namely IMCS1, IMCS2, and IMCS3 (Fig. 5A, B).

A Principal Component Analysis among IMCS1, IMCS2, and IMCS3. B Classification features of IMCS1, IMCS2, and IMCS3.
IMCS1 subtype accounted for 57.0% of the total tumor samples. This subtype is characterized by either the absence of the four identified pathogens or the predominance of a single intratumoral pathogen. It includes samples where none of the four bacteria were detectable or where only Methylobacterium was present, labeled as “Methylobacterium+” or “All negative.”
IMCS2 subtype represented 32.0% of the tumor samples. This subtype is defined by the presence of two intratumoral pathogens. It includes profiles such as Methylobacterium+ + Agrobacterium+, Methylobacterium+ + Fusobacterium+, and Methylobacterium+ + Faecalibacterium+.
IMCS3 subtype comprised 11.0% of the tumor samples. This subtype is characterized by the presence of three or all four of the identified intratumoral pathogens, including Methylobacterium+ + Fusobacterium+ + Faecalibacterium+, Methylobacterium+ + Agrobacterium+ + Fusobacterium+, and “All positive” (all four pathogens present).
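The subtype descriptions above can be read as a counting rule over the four genera. The sketch below encodes that reading and is a simplification: in the study the subtypes came from PCA-based clustering, not from an explicit rule. It also treats any single detected genus as IMCS1, whereas the text lists only Methylobacterium as the single-genus profile. The function name and the input format (a set of detected genus names) are hypothetical.

```python
GENERA = ("Methylobacterium", "Agrobacterium", "Faecalibacterium", "Fusobacterium")

def assign_imcs(detected):
    """Map a set of detected genera to a subtype label by counting how many
    of the four panel genera are present (assumed simplification)."""
    n_positive = sum(1 for genus in GENERA if genus in detected)
    if n_positive <= 1:
        return "IMCS1"  # "All negative" or a single genus
    if n_positive == 2:
        return "IMCS2"  # two genera present
    return "IMCS3"      # three genera or "All positive"

# Hypothetical patient profiles (not data from this study).
profiles = {
    "patient_a": set(),
    "patient_b": {"Methylobacterium", "Fusobacterium"},
    "patient_c": set(GENERA),
}
labels = {pid: assign_imcs(found) for pid, found in profiles.items()}
```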
Correlations between IMCSs and clinical characteristics in patients with colorectal cancer liver metastasis
We conducted a comprehensive retrospective analysis to further elucidate the associations between intratumoral microbiota and the clinicomolecular characteristics of patients with colorectal cancer liver metastasis. Building upon the previous functional enrichment analysis, which suggested the involvement of intratumoral microbiota in colorectal cancer liver metastasis, we collected the tumor markers, immunohistochemical indicators, and lymphocytes in serum as well as follow-up information from patients with IMCS1, IMCS2, and IMCS3 (Table 4). We focused on clinical indicators reflecting the metabolism of three major nutrients: sugars (fasting blood glucose), lipids (triglyceride, cholesterol, low-density lipoprotein, high-density lipoprotein), and proteins (albumin and globulin). Additionally, we examined the immunohistochemical markers Ki-67 and p53 associated with tumor proliferation and invasion and the relationships between intratumoral microbiota and host immune cells, including T cells, B cells, T helper cells, T suppressor cells, and NK cells.
To examine the relationships between the three IMCSs and metabolic indicators, we performed correlation analysis (Fig. 6A). After excluding patients with diabetes and potential measurement errors in the metastasis group, IMCS1 exhibited a strong positive correlation with fasting blood glucose (r = 0.85, p < 0.0001) and a negative correlation with albumin (r = −0.39, p < 0.05). IMCS2 demonstrated positive correlations with albumin (r = 0.70, p < 0.0001) and globulin (r = 0.61, p < 0.0001) but a negative correlation with fasting blood glucose (r = −0.64, p < 0.0001). IMCS3 showed positive correlations with triglyceride (r = 0.36, p < 0.05), cholesterol (r = 0.37, p < 0.05), and low-density lipoprotein (r = 0.52, p < 0.001).

A Related characteristics of metabolic subtypes IMCS1, IMCS2, and IMCS3. B Redundancy analysis (RDA) among IMCS1, IMCS2, and IMCS3. C–G Differences in clinical metabolic characteristics among IMCS1, IMCS2, and IMCS3. H Related immune infiltration characteristics of IMCS1, IMCS2, and IMCS3. I, J Quantification of differential expression of p53 and Ki-67 in IMCS1, IMCS2, and IMCS3. K Immunohistochemical staining of differential expression of p53 and Ki-67 in IMCS1, IMCS2, and IMCS3. L Survival curve analysis between IMCS1, IMCS2, and IMCS3. (*p < 0.05, **p ≤ 0.01, ***p < 0.001, ****p ≤ 0.0001, indicating statistical significance).
We also performed RDA based on a linear model to investigate the associations of IMCSs with environmental and metabolic factors and to further examine the relationships among environmental/metabolic factors, samples, and intratumoral microbiota, as well as their pairwise interactions. This analysis demonstrated that IMCS1, IMCS2, and IMCS3 were related to sugar, protein, and lipid metabolism, respectively, and correlated with the corresponding environmental factors. The three subtypes were effectively distinguishable in the dimensionality reduction analysis (Fig. 6B). When we next compared the metabolic profiles of patients with different IMCSs, we observed a progressive decrease in fasting blood glucose levels from IMCS1 to IMCS2 to IMCS3. Moreover, IMCS2 showed significantly elevated albumin and globulin levels, while IMCS3 exhibited markedly increased triglyceride and cholesterol concentrations (Fig. 6C–G). These findings indicate that the three IMCSs in patients with colorectal cancer liver metastasis are associated with distinct host metabolic states.
Regarding the host immune responses (Fig. 6H), IMCS1 was positively correlated with T cell activation (r = 0.19, p < 0.05) and was thus designated the T cell-activated immunity subtype. IMCS2 was positively correlated with NK cell activation (r = 0.33, p < 0.01) and was thus designated the NK cell-activated immunity subtype. In contrast, IMCS3 was neither positively nor negatively correlated with T cells, B cells, T helper cells, T suppressor cells, or NK cells and was therefore classified as the pauci-immune subtype. To investigate the proliferation and invasion of colorectal cancer, we compared the immunohistochemical markers p53 and Ki-67 across the three IMCSs (Fig. 6I–K). Their expression increased progressively from IMCS1 to IMCS2 to IMCS3. This finding indicates that IMCS3 has higher invasion and proliferation rates than IMCS1 and IMCS2 and may be associated with high malignancy and poor prognosis. In contrast, IMCS1 appears to be associated with a more favorable prognosis.
Moreover, we tracked the occurrence of liver metastasis in colorectal cancer patients, including those without initial liver metastasis who achieved clinical remission after initial treatment. We continued to follow up with the patients without liver metastasis to record any development of liver metastasis using pathological analysis, computed tomography, or positron emission tomography-computed tomography, the standard methods of metastasis monitoring. When we collected follow-up data and performed a stratified log-rank test to evaluate disease-free survival (DFS), we found significant differences in DFS among IMCS1, IMCS2, and IMCS3 (log-rank P = 0.0176). The median DFS (mDFS) was 22 months for IMCS1, 12 months for IMCS2, and 10 months for IMCS3. These findings suggest a progressive decline in prognosis from IMCS1 to IMCS3, as indicated by decreasing survival durations. IMCS3 exhibited a higher degree of malignancy in colorectal cancer liver metastasis compared with IMCS1. The mDFS between IMCS2 and IMCS3 differed by only two months, indicating a similarly poor prognosis for both subtypes. In contrast, IMCS1 demonstrated the most favorable prognosis in terms of malignancy and survival, indicating greater therapeutic benefits. Building on these findings, we propose a clinicomolecular and prognostic subtyping system based on intratumoral microbiota for patients with colorectal cancer liver metastasis.
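The median DFS values reported above are the kind of quantity a Kaplan-Meier estimate produces. The following sketch shows how such a median can be derived from follow-up times and event indicators. It is an assumed illustration of the general technique, not the software or exact procedure used in the study, and the function name and toy data are hypothetical.

```python
def km_median(times, events):
    """Kaplan-Meier median: the first event time at which the estimated
    survival probability falls to 0.5 or below. `events` holds 1 for an
    observed event (e.g. liver metastasis) and 0 for censoring.
    Returns None if survival never reaches 0.5."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            if survival <= 0.5:
                return t
        at_risk -= n_removed
    return None

# Hypothetical follow-up in months (not data from this study).
toy_times = [5, 8, 10, 12, 20, 22]
toy_events = [1, 0, 1, 1, 1, 0]
toy_median = km_median(toy_times, toy_events)
```

Comparing the curves of the three subtypes would additionally require a log-rank test, which is omitted here to keep the sketch short.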