Molecular characterization of bla NDM -harboring plasmids reveal its rapid adaptation and evolution in the Enterobacteriaceae

Carbapenem is one of the few available drugs to treat multidrug-resistance Gram-negative bacteria infections. Recently, the plasmid-mediated spread of the carbapenem resistance gene bla NDM poses a significant threat to public health, requiring global monitoring and surveillance. Here, we used both short-read ( n = 2461) and long-read ( n = 546) sequencing data to characterize the global distribution of bla NDM . We analyzed the replicon type of bla NDM - positive plasmids and found that the dominant plasmid type was different in diverse geographical locations. Although bla NDM gene has been transferred across diverse countries, its genetic backgrounds are highly conserved, and the mobile genetic element ISA ba125 , IS 5 , and IS 26 may play an important role in the mobilization of bla NDM .


Introduction
Carbapenem resistance has become increasingly prevalent worldwide in the last two decades, resulting in limited therapeutic options for severe clinical infections caused by Gram-negative bacteria [1][2][3].New Delhi Metallo-β-lactamase (NDM), a carbapenemase first described in 2009, can hydrolyze nearly all β-lactams, including carbapenems.NDM-encoding gene bla NDM is of particular concern because of its alarming global spread features, including (1) a broad host range and frequent acquisition among Escherichia coli and Klebsiella pneumoniae; (2) contributing greatly to the prevalence of Carbapenem-resistant Enterobacteriaceae in diverse host and environment, such as livestock, pets, food animal products, and other natural environment; (3) widely transmission in Indian subcontinent, Southeast and East Asia; and (4) co-occurrence with other resistance genes on plasmids, particularly with genes of great public health concern (e.g., mcr-1) [4][5][6][7].Recently, many studies have described the above features, however, few researches have focused on the transmission mode and underlying characteristics of bla NDM at the genomic level [8][9][10].
The vast majority of bla NDM are located on plasmids, which play a key role in the transmission of bla NDM [7].Three genetic tiers should be considered in understanding plasmid-borne gene molecular epidemiology, including (1) bacterial spread among diverse hosts; (2) inter-bacterial plasmid conjugation; (3) inter-plasmid gene module transposition.Previous researches were usually restricted to a specific region, bacteria, or host, and the number of plasmids involved was usually less than 10 [6, 11, 12].However, with the accumulation of plasmid sequencing data, there is an urgent need for more comprehensive and intensive studies on the global dissemination of bla NDM .
With the improvements in sequencing technology and reduction in cost, the number of available plasmid sequences in public databases has been increasing.However, numerous repetitive elements complicate the plasmid reconstruction from short-read sequencing data which take up a large portion of databases [13].To obtain the complete sequences of plasmids, long-read sequencing data (~ 10 kb in one single read) can be a reliable solution [14].With the faster accumulation of long-read sequencing data, recently, some researchers have built the databases of complete plasmids sequences, such as pATLAS [15], PLSDB [16], COMPASS [17].This advance contributes significantly to large-scale plasmids comparison.
Previous studies have mostly focused on bla NDM -harboring contigs, which have led to a better understanding of evolution of its genetic contexts.However, a current problem is lacking comprehensive and large-scale analysis based on complete bla NDM -harboring plasmids.Here, we collected all long-reads and short-reads sequencing data of bla NDM -harboring contigs from 2011 to 2021 to understand the global dissemination and genetic characteristics of bla NDM .In this study, we investigated the geographical dominance and source specificity of bla NDM -harboring plasmids, and evaluated their transfer features between different clonal lineages to further understand its evolution and adaptation.

bla NDM -harboring plasmids showed extensive modularity and geographical dominance
In total, 546 complete plasmids (long-read sequencing data) carrying bla NDM retrieved from NCBI database as of 2021 were included in the subsequent analysis, as the remaining were judged as incomplete, mislabeled, repeated or bla NDM -negative (Table 1).These complete bla NDM -harboring plasmids were found to have a diverse length ranging from 46 kb to 131.67 kb (Table 1).Besides, we combined all the bla NDM -carrying contigs (shortread sequencing data) in the NCBI database to make our results more comprehensive and reliable.All long-read and short-read sequencing data collected in this study covered 71 countries (Supplementary Table 1).In total, 1,586 contigs from E. coli and 875 contigs from K. pneumoniae carrying bla NDM gene were downloaded from the NCBI website (Supplementary Table 2).1, 225 (77.24%) contigs from E. coli were predicted to be located in plasmids and 839 (79.11%) contigs from K. pneumoniae were assigned to plasmids.Among all the bla NDM -harboring plasmids (complete and contigs), a total of 15 Inc types have been identified, suggesting NDM could be easily adapted to multiple plasmid genetic environment, without raising the fitness cost.In total, IncFI (n = 903, 35.98%) was the most dominant maintaining a high proportion during the 11 years (Fig. 1B), followed by IncX3 (n = 749, 29.84%) and IncA/C (n = 225, 8.96%).IncX3 shows an upward trend from 2011 to 2021, while IncA/C shows a downward trend (Fig. 1B).IncX3 is the dominant Inc in countries such as China, India and South Korea, while IncFI has a dominant position in countries such as Canada and the United States (Fig. 1A).2).These findings suggested that most transmission of NDM has been limited within the continent region.
We further predicted the oriT sites of the complete plasmids.Mobility (MOB) types and Mating pair formation (MPF) types of the complete plasmids were also analyzed, which were based on relaxase proteins and T4SS systems, respectively.39.9% of the complete plasmids (n = 218) predicted to harbor oriT site, MOB and MPF simultaneously, which can be classified as conjugative.Although the typical oriT site were not predicted on the IncX3 plasmids (n = 179, 32.8%), its high conjugation transfer frequency has been confirmed in previous studies [18,19].30 different Inc-MOB-MPF combinations were observed and the top 3 combinations in quantity were highlighted and displayed with date in Supplementary Fig. 1.IncX3-MOB P -MPF T was found to be the most popular group (n = 171, 31.3%).Previous study has shown that MOB P is the most represented of the six relaxases families (MOB F , MOB H , MOB Q , MOB C , MOB P , and MOB V ) and usually associated with MPF T [20,21].However, the reasons for this preference are still unclear, possibly because of interactions of the two conjugative modules.

Genetic characteristics of bla NDM gene contexts
We recruited 546 complete and putative plasmid sequences and identified the flanking genes of bla NDM .In total, 149 clusters of the flanking of bla NDM in plasmids were identified, among which the largest cluster contained 119 (21.8%) sequences, followed by 13 other clusters contained 10 or more sequences (Fig. 2).Multiple clusters of bla NDM gene contexts have been wide spread across different plasmids, bacteria, host, and geographical regions.IncX3 type plasmids showed significant dominance in the largest two clusters (cluster 1 and 2), and displayed a preference for cluster 1, 2, and 5 (Fig. 2).IncFI plasmids showed the most diverse genetic background (11 of the 14 clusters), followed by IncA/C (n = 8) (Fig. 2).Plasmids collected from Escherichia and Klebsiella were widely assigned to almost all clusters, indicating the importance of the two genera in the bla NDM transposition.All 14 clusters contained plasmids of human origins and half clusters contained plasmids of animal origins.Besides, 6 clusters covered plasmids from 4 continents (Fig. 2).Plasmids from Acinetobacter (serving as the intermediate source for the mobilization of bla NDM into the Enterobacteriaceae [7]), were mainly classified into cluster 6, which also contained plasmids from Escherichia, Klebsiella, and Providencia.This cluster was predominately composed of IncHI1 plasmids and involved multiple host sources and regions.Multiple clusters of bla NDM gene contexts have been wide spread across different host, environments, and even five continents, suggesting its high adaptability to diverse stress or environmental conditions.
The upstream of bla NDM gene is always insertion sequence, while the downstream is always a complete or remnant form of a gene cluster, including ble MBL , trpF, dsbD, cutA, groES-groEL, and insertion sequence, which has confirmed in previous study [7].In this study, gene cluster "ble MBL , trpF, dsbD" were identified in 13 clusters (92.9%).Six of 14 clusters harbored ISAba125 and an insert element was observed between ISAba125 and bla NDM in three clusters (Fig. 2).For putative plasmid contigs assembled from short-read sequencing data, ~ 50% (n = 584) included Tn3 family transposase, suggesting Tn3 as an important transposon of bla NDM genes among diverse plasmids.ISAba125 (371 contigs), IS5 (338 contigs), and IS26 (75 contigs) are the three most common mobile genetic elements identified (Fig. 2, Supplementary Table 3), indicating their important role in the mobilization of bla NDM .The genetic environment of bla NDM gene is highly conserved which can be transferred in different host bacteria through different mobile genetic elements.
Few studies have been conducted to reveal the variation of bla NDM -harboring plasmid, and its evolution during the transmission of bla NDM among different host.IncX3 type plasmids occupied a large fraction in recent years, even up to 64.36% in 2021, exceeding IncFI (Fig. 1B).In this study, we extracted the corresponding nucleotide sequences of the coding gene regions where core plasmid-genome SNPs located in (complete IncX3 plasmids).Most of the bla NDM -IncX3 plasmids were highly similar, and the overall identity was over 99%, which has been confirmed by several epidemiological studies [22][23][24].Only a few point mutations were observed in adjacent insertion elements.Specially, two adjacent base substitutions were observed in the IS26 transposase Tnp26 in 3 complete plasmids (Accession Number: NZ_CP048028, NZ_MK033579, NZ_ MK033583), which resulted in G184N substitutions, as previously reported [25].The G184N substitutions can generate IS26**, the third IS26 variant, which showed a 10-fold increase in the cointegration frequency in previous study [25], suggesting that point mutations may be important in the adaption and persistence of plasmids in different host.

Dissemination characteristics of bla NDM -harboring plasmids across diverse host lineages and environment
We focused on complete plasmids from Enterobacteriacea species, which make up over 50% of the complete plasmids.66 distinct sequence types (STs) and 25 STs were identified in E. coli (n = 192) and K. pneumoniae (n = 55) (Fig. 3).19 shared core plasmid-genome SNPs were identified from E. coli-harboring IncX3 plasmids (core plasmid-genome: 16,600 bp, 36%) and 72 shared SNPs were detected among K. pneumoniae-harboring IncX3 plasmids (core plasmid-genome: 18,967bp, 41.1%) (Fig. 3), which further confirmed the high conservation Fig. 2 Clusters of diverse bla NDM genetic contexts on plasmids.The left listed the representative genetic contexts of every cluster (n ≥ 10).ORFs are color-coded and the direction of transcription indicated by arrowheads.The right pie graph showed the distribution of genus, Inc type, host source, and location situation of plasmids in the corresponding cluster.The missing part of circles represented the loss of information of bla NDM -positive plasmids.bla NDM -harboring IncX3 plasmids from K. pneumoniae had higher SNPs (n = 72) than those from E. coli (n = 19) (Fig. 3).
For E. coli, many IncX3 plasmids were associated with clone expansions of isolates, which were described as two or more isolates of the same ST clustered in the phylogenetic tree based on core plasmid-genome SNPs.In particular, 105 (54.7%) bla NDM -harboring IncX3 plasmids belonged to E. coli isolates of 11 clonal expansions and isolates belonged to ST167, ST410, and ST10 accounted for the majority (19.8, 5.2, and 4.7%, respectively).For K. pneumoniae, ST20 (16.4%),ST1 (5.4%), ST11 (5.4%) were the most common lineages.IncX3 plasmids from E. coli ST167 showed little variation with other 27 STs (G ST = 6.6×10 −4 ) and large amount of variation was found among 414 ST pairs (G ST = 1).The pairwise SNP differences of core-genome of IncX3 plasmids from different E. coli STs were typically low, from 0 to 3. The IncX3 plasmids from ST16 Klebsiella pneumoniae was extremely similar to IncX3 plasmids from ST147 Klebsiella pneumoniae (G ST = 0) and large amount of variation was found among 134 ST pairs (G ST = 1).The maximum pairwise SNP difference of IncX3 plasmids from Klebsiella pneumoniae was 6 Fig. 3 Tanglegram links phylogenetic trees constructed using core genome SNPs and IncX3 plasmid.A A tanglegram of IncX3 plasmid conserved sequence (16,600bp, 36.0%)phylogeny with the core genome phylogeny of E. coli (n = 192).Lines have been drawn between tips in the trees representing the same isolate.B A tanglegram of IncX3 plasmid conserved sequence (18,967bp, 41.1.0%)phylogeny with the core genome phylogeny of K. pneumoniae (n = 55).The color of tree nodes and links represented the ST type of the isolates.The inner bar showed the host sources of the isolates and the outer bar showed the continent they were collected from (ST20), 2 (ST1), and 5 (ST11).Our results indicated that plasmid-mediated bla NDM disseminated by a multiple plasmids/multiple host lineages pattern.
Extensive differences in plasmid length were observed among different bacteria hosts 4A), suggesting correlations between the host and plasmid types.bla NDM -harboring plasmid has been identified in a variety of bacteria host, except Enterobacteriaceae in most complex environment, further indicating the capability of adaptation to multiple environments (Fig. 4A).Further, we built Discriminant Analysis Components (DAPC) models based on single-nucleotide polymorphism (SNPs) in conserved regions and genes presence/absence variations, respectively, to investigate the genomic characterization of different host origins.No significant association were observed between core-genome SNPs and host origins for IncX3 plasmids (Fig. 4B).Only complete IncX3 plasmids were recruited for genes presence/absence variation analysis to avoid the errors introduced by draft contigs.DAPC suggested that the genes content variation in plasmids could be used to identify its original host source (Fig. 4C).Gain and loss of genetic material has long been recognized to be an important process in bacterial evolution [26], our results indicates that gain or loss of genes in bla NDM -positive IncX3 plasmids may be a key factor in adaption improvement to different environments.
More Inc types were identified in the bla NDM-1 involved clusters than bla NDM-5 involved clusters.All bla NDM-1 involved clusters were related to more than seven Inc types, except for the combination of bla NDM- 1 -sul1 with aph(3)-VI, rmtC or aac(6)-lb, which was only occurred in four Inc types.Among other bla NDM-1 involved clusters, IncA/C and IncX3 type plasmids account for over 60%, especially, 88.2% of plasmids harboring bla NDM-1 -sul1-aac(6)-lb belonged to IncA/C.For bla NDM-5 involved networks, IncA/C (>85%) was the predominant Inc type, followed by IncX3 (~5%) and IncFI.The co-existence of bla NDM with multiple ARGs and plasmid Inc type in the same host indicated its strong ability of adaption to the multiple drug resistance environment, without significantly rising the fitness cost.

Discussion
Plasmids are the primary carriers of ARGs which usually are significant threats to global public health.Horizontal gene transfer via plasmids is widely recognized as one of the most important ways for the transmission of ARGs, such as bla NDM .Recently, investigation on ARGs is mainly based on genetic backgrounds and clonally evolving lineages, and analysis of plasmids is usually excluded or only evaluated by low-resolution techniques (such as Inc typing).In this study, we collected 546 bla NDMharboring plasmids with complete sequence and 2,352 bla NDM -harboring CRE draft genomes (putative plasmid contigs, n = 2,064; putative chromosome contigs, n = 397), to investigate the diversity, distribution, and transmission of bla NDM -harboring plasmids from 2011 to 2021 in a global perspective.Our study highlighted the importance of analyzing bla NDM -harboring plasmids to understand the evolution and adaptation of bla NDMharboring plasmids and its coevolution of with bacteria genome (resistome).
First, plasmid classification enables better understanding of the characterization and transmission mode of diverse global plasmids.Transferable plasmids can be spread between stains and even species, thus, understanding global geographic distribution of bla NDMharboring plasmids is pivotal to study the mobilization events [28].Inc type was still selected as the classification criteria for bla NDM -harboring plasmids.In total, several Inc types were found to be significantly associated with specific continents.bla NDM -harboring IncFI-type plasmid was first identified in an E. coli strain isolated from a patient from India in 2009, and its transmission was highly identical with population migration, which was primary dominant type in south, southeast, west Asia, and America (Fig. 1A).The discovery of the dominant plasmids (Inc type) in different countries can help to trace the emergence and global spread of the bla NDM gene.
Multiple studies had described bla NDM genetic contexts, while mostly aimed to characterize the transmission of bla NDM in limited areas [29][30][31].As we know, investigation of the association between plasmid backbones and antimicrobial resistance gene modules is vital important.In our study, we identified 14 dominant clusters of bla NDM genetic contexts.The upstream of bla NDM contained a high number of ISs (insertion sequences), while the downstream presents a high structural diversity (Fig. 2).The constitution of genetic environment of ARGs may be the result of pressure from local ecological and evolutionary pressures [32].Here, we further explored the geographical dominance, host source specificity, bacterial genera, and dominant Inc type of plasmid for a better understanding of complex transfer modes of specific gene background.According to the observations of currently available data, IncX3 plasmid occupied a large fraction of the dominant cluster, and was identified in diverse geographical locations, host sources, and bacterial genera suggesting the wide spread and low fitness cost of IncX3 plasmid, as previous reported [7,18,19,33,34].We hypothesized that IncX3 could be the dominant bla NDM -harboring plasmids in the next a few years.Among 14 dominant clusters, clusters 4, 7, 10, 11, and 14 were only found in clinical origin, illustrating host sources are also key factors restricting the transfer of ARGs.Cluster 4 and cluster 5 were discovered only in Asia and cluster 14 was discovered only in Europe, indicating that country boundaries may limit the co-transfer of ARGs and genetic background in flanking regions.Besides, multiple plasmid fusion events were observed in this study, especially in IncFI-type plasmids.An important evolutionary feature of plasmids is that they contain multiple transposable elements and undergo frequent genetic transposition, leading to plasmid fusions and possibly better adaptation to diverse stress (such as, antibiotics) and bacterial host [35].
We found that plasmid-mediated bla NDM disseminated by a multiple plasmids/multiple lineages pattern.NDMharboring plasmids are acquired by diverse lineages, such as the IncX3 plasmid that was found in 66 E. coli STs and 25 K. pneumoniae STs.Few SNP differences were identified among the conserved region of IncX3, IncFI, and IncA/C plasmids (three dominant types), respectively, suggesting the fast horizontal transfer of bla NDM -harboring plasmids, as a previous Europen-wide study showed [36].IncX3 plasmids from ST167 E. coli had the minimum SNP differences with those from the other 27 STs, which may suggest its expediated dissemination among different host lineages.Previous studies had reported that the E. coli ST167 type was emerging all around the world as a high-risk clone [37,3839].Our results also suggested the important role of ST167 E. coli in the global transmission of bla NDM -harboring plasmids.It seemed that ST167 E. coli acts as a powerful mediator to promote the horizontal transfer of plasmids carrying bla NDM among diverse clones.
bla NDM had been widely reported in clinical, animal, food, and environmental samples, and the Inc types of bla NDM -harboring plasmids showed no obvious bias to diverse samples types [404142].These findings indicated the broad transfer of bla NDM -harboring plasmids across humans, animals and environments, which is an alarming public health concern.According to our results, IncFI was the most dominant Inc type of all bla NDM -harboring plasmid, while most of them were incomplete contig sequences.The second dominant type was IncX3, which occupied a large fraction in recent years, even up to 64.36% in 2021, exceeding IncFI (Fig. 1B).Here, we chose the widely concerned IncX3 plasmids to investigate the genetic characterization which might contribute to host source specificity.Although we recruited data as much as possible, the SNPs identified were very few and could not be used to distinguish host sources.Notably, we found that differences in gene content across plasmids appear to make a higher contribution to host-source specificity than SNPs in conserved regions.These results suggested that gain or loss of genes in bla NDM -harboring IncX3 plasmids may play a key role in adaption improvement when transferred across different hosts.Acquisition and loss of genetic material has been recognized as an important process in bacterial evolution [43].We speculate that some dominant plasmids may show the popularity advantage by acquiring some key components through evolution, such as some conjugation components and Type IV secretion system, which could help them fit it the certain bacteria host.Liu et al. showed that outermembrane core complex (OMCCF) of a T4SS plays an important role in IncF plasmid dissemination and F fimbrial biogenesis [44].H-NS-like Protein on bla NDMpositive IncX3 plasmid can also affect the transmission of plasmid among different bacterial hosts [34].Study on the gene content of transferable plasmid enables better understanding of the fitness and evolution process of antimicrobial resistance.
The co-existence of multiple ARGs and bla NDM enables bacteria to adapt to different stresses or niches, and is an important driver of co-evolution of microbial populations.In theory, the co-existence of multiple ARGs might increase the fitness cost of bacteria.However, bacteria might take compensatory measures to reduce this cost, allowing the antimicrobial resistance trait to be maintained.In previous studies, low fitness cost was observed in carriage of bla NDM [18,19,34,35], indicating that bla NDM might be a dominant gene in bacterial adaptive evolution.The low fitness cost of bla NDM provide potential opportunities for its co-occurrence with other ARGs.In this study, we found diverse co-existence situations of other ARGs with bla NDM in strains carrying bla NDM -harboring plasmids, and identified the most common co-occurrence pattern, which had never been explored from a worldwide perspective before.The coexistence of bla NDM-1 and other ARGs were occurred in multiple genera and the network combinations were different from bla NDM-5 , which was mainly identified in E. coli and K. pneumoniae.IncA/C was the predominant type in the co-occurrence networks we identified, especially for bla NDM-5 involved one, indicating its important role in the appearance and development of MDR bla NDM -harboring strains.IncA/C belonged to broadhost-range (BHR) plasmid families [46], and monitoring of MDR IncA/C plasmids needs to be strengthened in the future studies.Exploration of other ARGs co-existed with bla NDM provided a clearer insight into the risk of multidrug resistance (MDR) in bla NDM positive isolates.
We acknowledge several limitations of this study.Firstly, to ensure the accuracy, most analyses were based on complete plasmid sequences and short-read assemblies were added as supplementary.Although the data was far beyond other similar studies, the bias was still inevitable.Secondly, we tried to investigate the transmission characterization of bla NDM based on the metadata (sample source, location, collection date, etc.) we collected.However, some information was missed and relevant papers or records were unavailable.Besides, although we had selected statistical models or methods which could reduce the influence from other factors, the bias would not be eliminated.Finally, as we focused on the most popular types whether about other ARGs or genetic contexts associated with bla NDM , other situations (even though partly listed and submitted) were not detailed depicted.
In conclusion, we have highlighted major characterizations of bla NDM -harboring plasmids, which will continue to interact and co-evolve with strains and other genetic elements.Our findings advanced the understanding of the global spread of bla NDM .Strains, plasmids, and other key mobile genetic elements should be included in the continuously monitoring of carbapenem resistance gene bla NDM in the future.

Materials and methods
Complete bla NDM -harboring plasmids bla NDM -harboring plasmid records were searched in the NCBI nucleotide database by typing the query "NDM plasmid", and then filtered by the PLSDB database [16] to remove incomplete or mislabeled chromosomal records.The relevant gb and fasta files were downloaded based on the filtered accession numbers and parsed using the module Biopython (v1.78) in Python (v3.8.5).All sequences were searched by BLASTn (v2.9.0+) to ensure the carry of bla NDM gene.Metadata including length, topology, organism, strain, isolate, country, host, isolation source, journal, etc. were extracted from the gb file.The corresponding paper title and abstract were obtained for missing entries based on the PMID numbers, and information was extracted and supplemented using custom python scripts.

Geographical distribution of the bla NDM -harboring plasmids
The Geographical distribution of complete bla NDM -harboring plasmids were visualized using ggplot (v3.3.6)package in R (v3.6.3)(https:// www.r-proje ct.org/).Constructed contingency tables were constructed using both complete plasmids and contigs to display the geographical distribution frequency of each Inc type.We performed the Fisher exact test (alternative = "greater") and adjusted naive p values using the Bonferroni correction which can avoid Type I errors (false positives) and chosen the adjusted p-value threshold of 0.001 to determine enrichment of Inc types for five continents.

Characterization of bla NDM -harboring sequences
Replicon and relaxase typing of complete bla NDM -harboring plasmids and mobility prediction was performed using MOB-suite (v3.0.0) [20].For putative bla NDM -harboring plasmid contigs, incompatibility (Inc) groups were first identified using Abricate (v1.0.1) with the Plasmid-Finder database [48].The contigs without outcomes were then mapped to every complete bla NDM -harboring plasmid using BLASTn and the genome coverage for each reference plasmid was calculated.The reference plasmid with the highest genome coverage (>90%) was defined as the most similar plasmid and its Inc type was identified as the Inc type of the contig.The combinations of different features were visualized using the UpSetR (v1.4.0) package in R (v3.6.3).

Phylogenetic analyses
The core genome-based phylogenetic of bla NDM -harboring isolates from E. coli and K. pneumoniae was constructed using Parsnp (v1.5.3) [51].In silico multilocus sequence typing (MLST) of E. coli and K. pneumoniae genomes was performed with the tool mlst (https:// github.com/ tseem ann/ mlst).Tanglegrams linking the core genome and plasmid phylogenies were generated using the dendextend package (v2.1.3)in R. Adobe Illustrator CC (v22.1) was used to add additional annotations and merge different parts of the figures.

Discriminant analysis of principal components
The .vcf file of 192 IncX3 plasmids (including complete plasmids and predicted contigs) from E. coli produced by Parsnp and HarvestTools [51] was imported into R using the package vcfR (v1.12.0).Core genome SNPs were counted.The package adegenet (v2.1.3)in R was used to implement discriminant analysis of principal components (DAPC) analysis based on the core genome SNPs.The .gff files of 98 complete IncX3 plasmids from E. coli produced by Prokka were used as input for Roary (v3.13.0) [52] with default settings.DAPC was done using a gene presence/absence matrix from Roary output.

Construction of ARGs co-occurrence network
ARGs on all bla NDM -harboring strains (complete plasmids) were detected using ResFinder (v4.0) [3] with minimal identity and coverage of 95%.The results were transformed into a binary matrix in Python to indicate presence/absence.A pairwise co-occurrence matrix of ARGs was constructed from the binary ARG presence/ absence matrix.The co-occurrence relationships between all pairs of ARGs were visualized using the heatmap function from the seaborn package (v0.11.0) in Python.The co-occurrence networks, in which the nodes represent ARGs and edges represent a frequency of pairwise co-occurrence (the threshold was set at ≥40, ≥50, ≥60, ≥70, ≥100 plasmids), were constructed by the networkx package (v2.4) in Python.For every network subgraph, the co-existence of multiple ARGs (≥ 3) was explored using the find_clusters function from the networkx package and the corresponding frequency of occurrence was calculated.

Fig. 1
Fig. 1 Distribution of bla NDM -harboring plasmids worldwide from 2011 to 2021.A Geographical distribution of complete bla NDM -harboring plasmids worldwide (n = 546).The size of circles represented the number of plasmids and the details were annotated in the legend.B Temporal changes of all bla NDM -harboring plasmids (complete plasmids and contigs, n = 2510)

Fig. 4
Fig. 4 Source specificity of bla NDM -harboring plasmids.A "violin + boxplot + jittered" composite graph showed the probability distribution, medians and confidence intervals of plasmid length derived per isolation source and bacteria genus.B DAPC (Discriminant analysis of principal components) based on the conserved region SNPs among all IncX3 plasmids (n = 192, both complete plasmids and putative plasmid contigs from E. coli).Inset shows the histogram of discriminant analysis eigenvalues.C DAPC based on the genetic content matrix (n = 98, complete plasmids from E. coli) constructed by the gene annotation of whole IncX3 plasmids

Table 1
The basic information of complete bla NDM -harboring plasmid sequences a Mode length refers to the most common length in each subtype b Not applicable Table 2 shows the dominant Inc types of different continents and the number of corresponding plasmids.Significant associations have been observed between several Inc types of bla NDM -harboring plasmids with geographic regions, including America with IncFI, Europe with IncA/C, IncR, and IncX1/X4/X6, East Asia with IncHI2, IncX3, and IncY, South Asia with IncFI, Southeast Asia with IncN and IncFI, and West Asia with IncFI (Table

Table 2
The significant associations between continents and plasmid Inc types (n = 882) a Plasmids that without continent information were excluded.The values in brackets represent the number of plasmids in corresponding continent b Plasmids that could not classified by Inc typing were excluded.The values in brackets represent the number of corresponding Inc type c This number represents the number of plasmids of this Inc type in the continent d Only data with p-value < 0.001 are shown in the table.The p-values were adjusted by Bonferroni correction