Diversity of K. pneumoniae isolates from various host sources worldwide
This study encompasses 2809 strains of K. pneumoniae isolated from 8 different hosts across 57 countries and regions (Fig. 1a), spanning from the 1900s to the 2020s (Fig. 1b). Their host origins included 2369 human strains, 135 pig strains, 126 poultry strains, 54 cattle strains, 48 dog strains, 23 cat strains, and 21 blowfly strains, with an additional 33 samples collected from the environment (Fig. 1c). Detailed sample information is available in Supplementary Data 1 and 2.
a Distribution of samples across 57 countries globally. b Temporal distributions of the samples. c Host distributions of the samples. d Proportions of the top 20 predominant sequence types (STs). Note: The names and materials used in this map do not represent any opinion of Research Square on the legal status of any country, region, city, or territory, nor do they involve the delineation of its borders or boundaries. This map is provided by the authors.
Although all 2809 isolates were originally labeled as K. pneumoniae based on their metadata, species re-identification using Kleborate revealed that 2788 strains (99.4%) were K. pneumoniae, while 8 strains (0.3%) were K. quasipneumoniae and another 8 strains (0.3%) were K. variicola (Supplementary Data 3). These findings indicate that a small number of isolates may have been inaccurately annotated in the NCBI database. Therefore, we did not restrict our analysis strictly to K. pneumoniae sensu stricto but included all isolates belonging to the K. pneumoniae complex.
MLST analysis of the 2809 strains revealed that, despite their classification into over 500 STs, there were no significant distinctions between strains of human and non-human origin. Notably, known multidrug-resistant (MDR) STs and hypervirulent STs were prevalent among both human and non-human isolates. Specifically, the 2809 isolates were distributed across 500 known STs, with ST11, ST258, ST15, and ST16 emerging as the predominant types, constituting 16.1%, 8.3%, 6.4%, and 4.1% of the total, respectively (Fig. 1d and Supplementary Data 3). The primary STs in human isolates were ST11 (16.1%) and ST258 (8.3%), while those in non-human isolates were ST11 (7.3%) and ST37 (7.3%). Noteworthy MDR STs of global public health concern, including ST11, ST258, ST37, ST15, and ST307, ranked among the top three STs across all animal sources, indicating that these lineages are widespread beyond human hosts (Supplementary Fig. 1). Moreover, hypervirulent STs that have recently garnered attention, including ST65 and ST86, were identified in both human and non-human isolates, including those from cats and cattle, whereas hypervirulent ST23 was found exclusively in human isolates (Supplementary Data 3).
K. pneumoniae strains exhibit the potential for cross-species infection
To investigate the cross-species infection potential of K. pneumoniae, we selected three representative strains from different hosts, KP9 (ST11, human origin), KP61 (ST258, canine origin), and KP214 (ST25, porcine origin), based on their MLST types and epidemiological relevance. ST11 is a globally disseminated lineage frequently associated with hypervirulence and multidrug resistance in human clinical infections, particularly in Asia, and has also been reported in animals, suggesting potential for cross-host transmission. ST258 is another globally prevalent lineage, primarily found in humans and associated with carbapenem resistance; its detection in a canine host in this study suggests that it can cross host barriers and adapt. ST25, represented by KP214, has been linked to septicemia in piglets, making it a relevant pig-associated pathogenic strain. We evaluated the pathogenicity of these three strains in both human and pig cell lines as well as in a piglet infection model. Additionally, we investigated the genomic differences among K. pneumoniae strains from different host sources through population structure and phylogenetic analyses. First, adhesion and invasion assays were conducted on five human and pig cell lines, comprising human lung adenocarcinoma epithelial cells (A549), bronchial epithelial cells (BEAS-2B), colon adenocarcinoma cells (Caco-2), swine tracheal cells (NPTr), and intestinal porcine epithelial cells (IPEC). The three K. pneumoniae strains from diverse origins displayed no host preference in adhesion and invasion capabilities. Notably, there were no significant differences in the adhesion abilities of KP9, KP61, and KP214 to A549 and NPTr, or to IPEC and Caco-2. However, KP9 showed stronger adhesion to A549 compared to KP214 (Fig. 2a). Furthermore, the invasion abilities did not correlate with host origin. 
KP61 demonstrated higher invasion abilities towards A549 and BEAS-2B cells compared to KP9, while KP214 exhibited stronger invasion of Caco-2. KP61, in turn, exhibited higher invasion abilities against IPEC cells than KP214 (Fig. 2b). Because none of these differences tracked the host of origin, the results suggest the potential for cross-species transmission of K. pneumoniae strains.

a Bacterial adherence to respiratory and intestinal epithelial cells derived from humans and/or pigs. b Bacterial invasion of respiratory and intestinal epithelial cells derived from humans and/or pigs. Each experiment was performed with four technical replicates. c Clinical sign scores of pigs challenged with K. pneumoniae strains originating from humans, dogs, and pigs. d The number of K. pneumoniae strains recovered from different organs of challenged pigs at 48 hours post-challenge. e The number of K. pneumoniae strains recovered from different organs of challenged pigs at 7 days post-challenge. The error bar represents the standard deviation. The significance level was set at p < 0.05 (*) or p < 0.01 (**); non-significant comparisons (p > 0.05) are not labeled. N = 5 biologically independent animals per group. f Histological examinations of different organs from challenged pigs are shown. (Scale bars = 50 μm) The lungs of challenged pigs exhibited extensive thickening of the alveolar walls, with unclear alveolar wall structure. A small amount of inflammatory cell infiltration was observed on the alveolar walls. The bronchial structures appeared normal without any evident abnormalities. There were no significant interstitial proliferations or other notable abnormalities observed in the interstitium. The livers showed hepatocellular focal necrosis with hemorrhage, accompanied by a small amount of inflammatory cell infiltration within the necrotic lesion. Additionally, there was a small amount of hepatocellular hydropic degeneration, cellular swelling, and pale staining of the cytoplasm. The spleens of challenged pigs exhibited a decrease in white pulp volume, occasionally accompanied by mild hemorrhage within the white pulp, along with a small amount of extravasated red blood cells. 
There was no significant abnormality in the number of parenchymal cells within the red pulp, and the splenic sinuses did not show significant dilation. However, there was notable infiltration of granulocytes. The kidneys showed widespread tubular necrosis, with necrosis and dissolution of the renal tubular epithelial cells, resulting in an unclear tubular structure. Acidophilic material was seen within the glomerular capillaries. Focal hemorrhage was observed in multiple areas of the interstitium, along with scattered inflammatory cell infiltration. The lymph nodes exhibited abundant and well-defined lymph follicles within the cortex, without any evident abnormalities. Focal hemorrhage was observed in multiple areas of the medulla.
Following the cell experiments, pathogenicity assessments were performed using a pig model. When inoculated at equal doses, the three K. pneumoniae strains infiltrated multiple organs in pigs and induced similar clinical symptoms, including elevated body temperature (Supplementary Fig. 2a), depression, reduced appetite and activity, coughing, difficulty breathing, and diarrhea. In terms of clinical symptom scores, no statistically significant differences were observed between KP9 and KP214, while KP61 exhibited higher scores one day after inoculation (Fig. 2c). Additionally, except at 6 and 24 hours post-inoculation, the three K. pneumoniae strains displayed no significant differences in bacterial loads in the blood (Supplementary Fig. 2b). Further analysis of bacterial loads at different time points and in various organs revealed significant differences in the invasion abilities of the three strains across diverse organs, with no host dependency observed (Fig. 2d, e). Histological examinations confirmed that these strains inflicted notable damage to the lungs, liver, kidneys, spleen, and lymph nodes (Fig. 2f).
Population structure analysis based on Bayesian models also revealed similar population structures between human and non-human strains. The analysis categorized the strains into 23 BAPS groups (Fig. 3a, b), of which only BAPS groups 1, 3, and 5 contained exclusively human strains, while the remaining 20 groups comprised a mix of human and non-human strains. Additionally, phylogenetic analysis using whole-genome sequences indicated no significant correlation between the phylogenetic relationships of the strains and host preferences (Fig. 3a). These results suggest that there is no distinct genetic boundary separating human and non-human strains of K. pneumoniae. Taken together, the cell and animal experiments showed no host specificity or dependency in the pathogenicity and infectivity of the K. pneumoniae strains, and the genomic analysis showed no significant host preference at the genetic level. These results support the potential of K. pneumoniae for cross-species infection.

a Phylogenetic tree generated from whole-genome single nucleotide polymorphisms. From the inner to the outer circles are 1. Sample source, 2. BAPS group, 3. Hypervirulent K. pneumoniae (hvKP), 4. Carbapenem-resistant K. pneumoniae (CRKP), 5. Hypervirulent carbapenem-resistant K. pneumoniae (hv-CRKP), 6. Top 10 predominant sequence types (STs). b The number of K. pneumoniae strains from different host species belonging to different BAPS groups.
Temporal trends in K. pneumoniae antimicrobial resistance
To investigate the trends in AMR in K. pneumoniae over time and the impact of policy implementation, we analyzed ARGs in 2809 K. pneumoniae strains spanning from the 1900s to the 2020s using whole genome sequencing data. A total of 414 AMR genotypes associated with 15 antimicrobial agents, including quaternary ammonium salts (QAS), were identified (Supplementary Data 4). Overall, the resistance burden in K. pneumoniae exhibited an upward trend over this period, influenced by both the expansion of MDR clones and the antimicrobial regulation policies adopted in multiple countries.
In terms of ARG carriage, the average number of ARGs carried by K. pneumoniae strains peaked in the 2000s, subsequently declined, and then stabilized throughout the 2010s and 2020s (Fig. 4a). This trend coincided with the introduction of major antimicrobial restriction policies, such as the European Union’s 2006 ban on antibiotic growth promoters in animal feed, the 2015 Global Action Plan on AMR issued by the WHO, and national-level guidelines implemented in the United States (2007) and China (2019) (Supplementary Fig. 3a). Statistical analyses indicated a significant reduction in the average number of ARGs per strain across different regions following the introduction of these policies (p ≤ 0.05; Supplementary Fig. 3b–d). Although the direct impact of these interventions remains to be determined, these findings suggest that policy-driven reductions in antibiotic usage may help curb the horizontal spread of ARGs and potentially delay the emergence of novel resistance mechanisms. However, this ostensibly “positive trend” requires careful interpretation. Although the number of ARGs per strain has decreased, the overall level of AMR has continued to escalate, most notably through high rates of extended-spectrum β-lactamase (ESBL) production and carbapenem resistance.

a Distribution of the number of ARGs per genome in different sampling periods. b AMR scores in different sampling periods, calculated using Kleborate. Before 2000 (n = 121), 2000s (n = 393), 2010s (n = 1959), and 2020s (n = 336). The error bar represents the standard deviation. The significance level was set at p < 0.05 (*), p < 0.01 (**), or p < 0.001 (***). c Proportions of hypervirulent sequence types (hvSTs) and MDR STs in each sampling period (n = 2809). d Temporal changes in AMR genotypes. Different colors represent the different antimicrobial agents (n = 2809). AMR genotypes were annotated based on the NCBI Bacterial Antimicrobial Resistance Reference Gene Database (BioProject ID: PRJNA313047). The values are presented in Supplementary Data 5. e Temporal variation in the prevalence and abundance of ARGs (n = 2809). The heatmap displays ARG prevalence (left, blue) and ARG abundance within categories (right, green), categorized by sampling period. Only ARGs with a prevalence exceeding 5% in any sampling group are shown. Colored bars indicate the predicted resistance phenotypes associated with the ARGs.
To assess resistance levels more comprehensively, we reanalyzed all isolates using Kleborate. The calculated resistance scores increased consistently over time (Fig. 4b). This upward trend aligns with the proliferation of specific highly antibiotic-resistant clones (Fig. 4c; Supplementary Fig. 4). Additionally, temporal patterns of resistance differed significantly across antibiotic classes. For instance, resistance to commonly used antibiotics, including aminoglycosides, macrolides, tetracyclines, sulfonamides, and trimethoprim, remained relatively low (<20%) prior to the 2000s (Supplementary Data 5). However, it increased sharply thereafter and stabilized at elevated levels (25–80%) in subsequent decades. ESBL-mediated resistance and carbapenem resistance have risen continuously since the 2010s, reaching 60.4% and 50.6%, respectively. In contrast, resistance to β-lactams, quinolones, and fosfomycin remained consistently high (80.7–100%) throughout the entire period. Notably, resistance to polymyxins, which are classified as last-resort antibiotics, has persisted at low levels (0–3.4%) across all examined timeframes (Fig. 4d).
Furthermore, we examined the variations in the prevalence of specific ARGs among strains over this period and their relative proportions within the same ARG category (Fig. 4e). Notably, the aminoglycoside resistance gene aac(6’)-Ib-cr showed a continuous increase, whereas aadA1 exhibited a decline in the 2010s and 2020s. The predominant aminoglycoside resistance genes transitioned from aadA1 (conferring streptomycin resistance) to aac(6’)-Ib-cr (conferring amikacin resistance). Chloramphenicol has been prohibited in developed nations due to severe adverse effects, except for its topical or ophthalmic applications, leading to a reduction in its clinical use in human medicine in recent years [32]. However, its derivative florfenicol remains extensively utilized in veterinary medicine [33]. The prevailing chloramphenicol resistance genes shifted from catA1 and catB4 to floR, which confers resistance to both chloramphenicol and florfenicol. The primary resistance genes associated with fosfomycin shifted from fosA to fosA6.
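The per-period comparisons in this section rest on assigning each isolate to one of four sampling periods and summarizing ARG counts within each. A minimal sketch of that bookkeeping is given below; the input layout (tuples of isolation year and ARG count), the function names, and the toy records are assumptions made for illustration and are not taken from the study's pipeline or data.

```python
from collections import defaultdict
from statistics import mean


def sampling_period(year):
    """Map an isolation year to one of the four sampling periods."""
    if year < 2000:
        return "Before 2000"
    if year < 2010:
        return "2000s"
    if year < 2020:
        return "2010s"
    return "2020s"


def mean_args_per_period(isolates):
    """Average ARG count per sampling period.

    isolates: iterable of (isolation_year, arg_count) tuples
    (a hypothetical input layout).
    """
    counts = defaultdict(list)
    for year, arg_count in isolates:
        counts[sampling_period(year)].append(arg_count)
    return {period: mean(values) for period, values in counts.items()}


# Toy records for illustration only, not data from the study.
toy_isolates = [(1995, 2), (2004, 10), (2008, 14), (2015, 9), (2016, 7), (2021, 8)]
summary = mean_args_per_period(toy_isolates)
```

With real metadata, the same grouping would feed the per-period distributions and the significance tests reported in the figure.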
Temporal dynamics of virulence in K. pneumoniae
By identifying virulence-associated genes in K. pneumoniae (Supplementary Data 6), we observed a progressive increase in the overall virulence potential of the K. pneumoniae population over time. This trend was primarily driven by the expansion of MDR clones (e.g., ST11) that have acquired key virulence factors, rather than by the spread of classical hypervirulent clones. To investigate the basis of this temporal shift, we systematically evaluated virulence levels across time periods. Both the virulence gene burden (Supplementary Fig. 5a) and Kleborate-assigned virulence scores (Supplementary Fig. 5b) showed a marked upward trend from the 2000s to the 2020s. Notably, a subset of isolates collected before 2000 displayed unexpectedly high virulence scores, potentially due to sampling bias in earlier studies. We further examined the temporal dynamics of key virulence determinants, including siderophore systems (yersiniabactin, aerobactin, enterobactin, salmochelin), the “regulators of mucoid phenotype” (rmpA and rmpA2), and the genotoxin colibactin. With the exception of colibactin, carriage rates of these factors steadily increased over time (Fig. 5a), in parallel with a rise in predicted siderophore gene counts (Fig. 5b).

a Frequency of virulence gene clusters across different sampling periods. b Distribution of the number of siderophore gene clusters per isolate across sampling periods. The siderophore gene clusters include yersiniabactin, aerobactin, salmochelin, and enterobactin. Before 2000 (n = 121), 2000s (n = 393), 2010s (n = 1959), and 2020s (n = 336). c Temporal dynamics of virulence scores predicted by Kleborate for strains belonging to major K. pneumoniae sequence types. Only the top 24 most prevalent STs are shown (n = 2809). d Temporal prevalence of the top 24 most common STs across the four sampling periods. VFG categories were annotated based on the Virulence Factor Database (VFDB), n = 2809.
Despite the rising virulence trend, the prevalence of traditional hypervirulent clones (e.g., ST23) remained relatively stable (Fig. 4c), suggesting that the observed increase in population-level virulence may be attributable to the stepwise acquisition of virulence genes within certain STs. Supporting this, several MDR clones (e.g., ST11, ST15, ST16, ST101, ST147) demonstrated significant concurrent increases in Kleborate virulence scores (Fig. 5c) and in carriage of key virulence genes (aerobactin, yersiniabactin, salmochelin, and rmpA/rmpA2; Supplementary Fig. 6), together with a progressive rise in population-wide relative abundance (Fig. 5d). These findings suggest that population-wide virulence enhancement is driven not by the expansion of traditional hypervirulent clones but by the expansion of MDR clones that have acquired virulence traits, reflecting convergent evolution toward dual-risk clones with both resistance and virulence.
To further evaluate this convergence and its epidemiological implications, we focused on a representative convergent phenotype: carbapenem-resistant hypervirulent K. pneumoniae (CR-hvKP). Based on previously established criteria involving the co-presence of rmpA/rmpA2 and iucABCD-iutA virulence markers [34], we identified 415 hvKP strains across the dataset. To validate this classification, we assessed their virulence scores, siderophore gene content, capsule serotypes (notably K1 and K2), and the representation of hypervirulent STs (e.g., ST23 and ST65) among the identified hvKP strains. These strains consistently exhibited classical hypervirulence features, confirming the robustness of our definition (Supplementary Fig. 7). Among these, 193 strains were further classified as CR-hvKP. The number of CR-hvKP isolates showed a clear upward trend from the 2000s to the 2020s, despite a minor drop in 2021, likely due to sampling bias. Clonal distribution analysis revealed striking geographic differences in the clonal structure of CR-hvKP: ST11 dominated in China (78.6%), while ST23 was more prevalent outside China (25%) (Supplementary Fig. 8).
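The marker-based definition used above can be written as a simple rule over each isolate's detected genes. The sketch below is one possible reading, under two stated assumptions: the aerobactin locus is counted as present only when iucA, iucB, iucC, iucD, and iutA are all detected, and carbapenem resistance is called from the presence of a carbapenemase gene. The text does not specify either detail, and the carbapenemase list contains only genes named elsewhere in this section, so it is illustrative rather than exhaustive.

```python
RMP_GENES = frozenset({"rmpA", "rmpA2"})
# Assumption: all five genes must be detected for the locus to count as present.
AEROBACTIN_LOCUS = frozenset({"iucA", "iucB", "iucC", "iucD", "iutA"})
# Assumption: carbapenem resistance is inferred from carbapenemase gene presence.
# Illustrative, non-exhaustive list.
CARBAPENEMASE_GENES = frozenset({"blaKPC-2", "blaNDM-5", "blaNDM-7", "blaOXA-48"})


def is_hvkp(genes):
    """True if rmpA or rmpA2 co-occurs with the complete aerobactin locus."""
    genes = set(genes)
    return bool(RMP_GENES & genes) and AEROBACTIN_LOCUS <= genes


def classify(genes):
    """Label an isolate as 'CR-hvKP', 'hvKP', or 'other' from its gene set."""
    genes = set(genes)
    if not is_hvkp(genes):
        return "other"
    if CARBAPENEMASE_GENES & genes:
        return "CR-hvKP"
    return "hvKP"


# Hypothetical gene sets for illustration only.
example_hv = ["rmpA", "iucA", "iucB", "iucC", "iucD", "iutA"]
example_cr_hv = example_hv + ["blaKPC-2"]
example_partial = ["rmpA2", "iucA", "iucB"]  # incomplete aerobactin locus

labels = [classify(g) for g in (example_hv, example_cr_hv, example_partial)]
```

Requiring the complete locus is the stricter choice; a looser rule that accepts any iuc gene would classify more isolates as hvKP.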
Human-derived strains exhibit higher antimicrobial resistance and virulence than non-human strains
To examine the distribution patterns of ARGs, VFGs, and plasmids in K. pneumoniae strains, we compared WGS data from strains originating from eight distinct sources: humans, cats, dogs, poultry, pigs, cattle, blowflies, and the environment. Overall, non-human isolates exhibited lower levels of antibiotic resistance and virulence than human-derived strains. In terms of ARG counts, isolates from poultry and cats harbored significantly more ARGs than those from humans, while cattle-derived strains exhibited the lowest ARG counts (Fig. 6a). Isolates from dogs, pigs, blowflies, and the environment showed no significant differences compared to human strains. However, AMR scores predicted from WGS data using Kleborate revealed that, with the exception of blowfly isolates, all non-human strains exhibited significantly lower resistance scores than human isolates (Fig. 6c). This discrepancy between ARG count and AMR score arises largely because the AMR score reflects only ESBL, carbapenemase, and colistin resistance determinants. Consistent with this, non-human isolates showed a higher prevalence of resistance to tetracyclines, macrolides, and rifampicin, while human-derived strains exhibited markedly higher resistance to carbapenems and markedly higher ESBL carriage (Fig. 6e). Epidemiological trends of specific ARGs also varied by host source, with blaKPC-2 being predominant in human isolates, while blaNDM-5 (in pigs, poultry, and the environment), blaNDM-7 (in blowflies), and blaOXA-48 (in dogs and cats) were prevalent in non-human isolates. Colistin ARGs predominantly included mcr-1 in human strains and mcr-8 in poultry and blowfly strains. Chloramphenicol ARGs circulated as floR in farm animals and the environment, with catA/B predominantly found in human and pet isolates (Supplementary Fig. 9).
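The gap between ARG count and AMR score is easier to see once the score is written out as a tiered rule. The sketch below follows the 0 to 3 tiering described in the Kleborate documentation as we understand it (0: no ESBL or carbapenemase; 1: ESBL without carbapenemase; 2: carbapenemase without colistin resistance; 3: carbapenemase with colistin resistance) and should be checked against the Kleborate version actually used. The two isolates and their gene lists are hypothetical and serve only to show how an isolate with many ARGs can score lower than one with few.

```python
def resistance_score(has_esbl, has_carbapenemase, has_colistin_resistance):
    """Tiered 0-3 score that consults only three classes of determinants."""
    if has_carbapenemase:
        return 3 if has_colistin_resistance else 2
    return 1 if has_esbl else 0


def summarize(isolate):
    """Return (ARG count, resistance score) for one isolate record."""
    score = resistance_score(
        isolate["esbl"], isolate["carbapenemase"], isolate["colistin"]
    )
    return len(isolate["args"]), score


# Hypothetical isolates for illustration only, not records from the study.
farm_isolate = {
    "args": ["tet(A)", "mph(A)", "arr-3", "floR", "sul1", "aadA1"],
    "esbl": False,
    "carbapenemase": False,
    "colistin": False,
}
clinical_isolate = {
    "args": ["blaKPC-2", "fosA"],
    "esbl": False,
    "carbapenemase": True,
    "colistin": False,
}

farm_summary = summarize(farm_isolate)          # many ARGs, low score
clinical_summary = summarize(clinical_isolate)  # few ARGs, higher score
```

Because the score ignores tetracycline, macrolide, and rifampicin resistance genes entirely, host groups enriched for those genes can carry more ARGs while still scoring lower.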

a The distribution of the number of ARGs in different sample hosts. b The distribution of the number of VFGs in different sample hosts. c AMR scores predicted by Kleborate for isolates from different host sources. d Virulence scores predicted by Kleborate for isolates from different host sources. The error bar represents the standard deviation. The significance level was set at p < 0.05 (*), p < 0.01 (**), or p < 0.001 (***). e Distribution of AMR and hypervirulence-associated genes among K. pneumoniae. fim type I fimbriae, mrk type 3 fimbriae, ent enterobactin, K-locus capsule, O-locus lipopolysaccharide (LPS), clb colibactin locus, iro salmochelin locus, iuc aerobactin locus, rmpA/A2 regulators of mucoid phenotype genes, ybt yersiniabactin locus. The numbers of strains isolated from different hosts are: humans (n = 2369), pigs (n = 135), poultry (n = 126), cattle (n = 54), dogs (n = 48), environment (n = 33), cats (n = 23), blowflies (n = 21).
The distribution of VFGs also differed markedly between human and non-human sources. Both the number of VFGs and virulence scores predicted by Kleborate indicated that human-derived strains exhibited higher virulence than those from non-human hosts (Fig. 6b, d). This difference was particularly evident in accessory virulence-associated genes such as clb, ybt, iro, rmpA/A2, and iuc (Fig. 6e). The clb gene was exclusively detected in human isolates. In addition, ybt, iro, and rmpA/A2 were significantly more prevalent in human strains than in non-human strains. As for hvKP strains, the vast majority were identified in human-derived strains (408 strains), with only a few detected in strains from pigs (5 strains), cats (1 strain), and cattle (1 strain). Notably, CR-hvKP strains were exclusively found in human-derived isolates.
Transfer of antimicrobial resistance genes and virulence factor genes
We investigated the correlations among plasmids, ARGs, and VFGs in isolates sourced from various hosts. A total of 9621 plasmid replicons were identified and classified into 111 distinct types. Notably, IncFIB(K), IncR, ColRNAI, IncHI1B(pNDM-MAR), and IncFII(pHN7A8) emerged as the most prevalent, detected in 40.62%, 35.39%, 34.46%, 20.68%, and 20.08% of the 2809 isolates, respectively (Supplementary Data 7). To explore potential associations among plasmid replicons, ARGs, and VFGs, we conducted a Pearson correlation analysis. Although short-read sequencing data are insufficient for inferring direct physical linkage or horizontal transmission events, the co-occurrence patterns provide preliminary indications of potential relationships between certain plasmid types and ARGs or virulence genes. Notably, several plasmid replicons exhibited concurrent associations with both ARGs and major VFGs. For instance, IncFIB plasmids were correlated with the high virulence marker gene iroE and ARGs such as blaACT-16, catB4, and fosA, whereas IncHI1 plasmids were significantly associated with the high virulence marker gene rmpA and ARGs such as tet(X4), lnu(G), and rmtB2 (Fig. 7). When strains were grouped by human and non-human origin, the relationships among ARGs, VFGs, and plasmids in human isolates reflected the broader trends (Supplementary Fig. 10). Conversely, non-human isolates displayed a higher number of plasmid incompatibility groups associated with the colistin resistance genes mcr-1 or mcr-8, including ColRNAI, IncFIB, and IncX4 (Supplementary Fig. 10).

Pairwise values with Pearson correlation coefficients greater than 0.5 (red) or less than −0.5 (blue) are displayed. The blue box labeled “ARG-ARG” illustrates the pairwise co-occurrence and mutual exclusivity among ARGs, while the green box labeled “VFG-VFG” showcases the same for VFGs. The green box marked “Plasmid-Plasmid” represents the pairwise co-occurrence and mutual exclusivity among plasmid replicons. Gold boxes with numerical labels denote the co-occurrence and mutual exclusivity between ARGs/VFGs and plasmid replicons, with annotations provided in the bottom left corner.
Overall, a moderate positive correlation was evident between the count of ARGs and the number of plasmid replicons (rPearson = 0.39, p < 0.0001, Supplementary Fig. 11a). In contrast, the correlation between the count of VFGs and the number of plasmid replicons was negligible in magnitude despite reaching statistical significance (rPearson = 0.07, p < 0.0001, Supplementary Fig. 11b), as was the correlation between ARGs and VFGs (rPearson = −0.07, p < 0.0001, Supplementary Fig. 11c).
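The co-occurrence analysis in this section amounts to computing Pearson correlations between per-isolate vectors (presence/absence of a replicon or gene, or counts) and retaining pairs whose coefficient exceeds 0.5 in absolute value. A minimal, dependency-free sketch is shown below. The feature names and vectors are hypothetical placeholders, and the helper functions are illustrative rather than the code used in the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("vectors must be non-empty and of equal length")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a feature never varies
    return cov / sqrt(var_x * var_y)


def strong_pairs(features, threshold=0.5):
    """Return (name_a, name_b, r) for pairs with |r| above the threshold."""
    hits = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(features.items(), 2):
        r = pearson(vec_a, vec_b)
        if abs(r) > threshold:
            hits.append((name_a, name_b, r))
    return hits


# Hypothetical presence/absence vectors across six isolates (1 = detected).
toy_features = {
    "replicon_X": [1, 1, 1, 0, 0, 0],
    "arg_Y": [1, 1, 1, 0, 0, 0],
    "vfg_Z": [1, 0, 1, 0, 1, 0],
}
hits = strong_pairs(toy_features)
```

For binary vectors the Pearson coefficient coincides with the phi coefficient, so a value above 0.5 indicates frequent co-occurrence and a value below −0.5 indicates near mutual exclusivity; neither establishes physical linkage on the same plasmid.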