Blommaert J. Genome size evolution: towards new model systems for old questions. Proc Biol Sci. 2020;287(1933): 20201441.
Elliott TA, Gregory TR. What’s in a genome? The c-value enigma and the evolution of eukaryotic genome content. Philos Trans R Soc Lond B Biol Sci. 2015;370(1678): 20140331.
Lefebure T, Morvan C, Malard F, Francois C, Konecny-Dupre L, Gueguen L, Weiss-Gayet M, Seguin-Orlando A, Ermini L, Sarkissian C, et al. Less effective selection leads to larger genomes. Genome Res. 2017;27(6):1016–28.
Gregory TR. Synergy between sequence and size in large-scale genomics. Nat Rev Genet. 2005;6(9):699–708.
Pellicer J, Hidalgo O, Dodsworth S, Leitch IJ. Genome size diversity and its impact on the evolution of land plants. Genes. 2018;9(2):88. https://doi.org/10.3390/genes9020088
Hidalgo O, Pellicer J, Christenhusz M, Schneider H, Leitch AR, Leitch IJ. Is there an upper limit to genome size? Trends Plant Sci. 2017;22(7):567–73.
Yin D, Schwarz EM, Thomas CG, Felde RL, Korf IF, Cutter AD, Schartner CM, Ralston EJ, Meyer BJ, Haag ES. Rapid genome shrinkage in a self-fertile nematode reveals sperm competition proteins. Science. 2018;359(6371):55–61.
Adams PE, Eggers VK, Millwood JD, Sutton JM, Pienaar J, Fierst JL. Genome size changes by duplication, divergence, and insertion in Caenorhabditis worms. Mol Biol Evol. 2023;40(3):msad039. https://doi.org/10.1093/molbev/msad039
Vitales D, Álvarez I, Garcia S, Hidalgo O, Nieto Feliner G, Pellicer J, Vallès J, Garnatje T. Genome size variation at constant chromosome number is not correlated with repetitive DNA dynamism in Anacyclus (Asteraceae). Ann Bot. 2019;125(4):611–23.
Agudo AB, Torices R, Loureiro J, Castro S, Castro M, Alvarez I. Genome size variation in a hybridizing diploid species complex in Anacyclus (Asteraceae: Anthemideae). Int J Plant Sci. 2019;180(5):374–85.
Stein JC, Yu Y, Copetti D, Zwickl DJ, Zhang L, Zhang C, Chougule K, Gao D, Iwata A, Goicoechea JL, et al. Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza. Nat Genet. 2018;50(2):285–96.
Bozan I, Achakkagari SR, Anglin NL, Ellis D, Tai HH, Stromvik MV. Pangenome analyses reveal impact of transposable elements and ploidy on the evolution of potato species. Proc Natl Acad Sci U S A. 2023;120(31): e2211117120.
Kress WJ, Soltis DE, Kersey PJ, Wegrzyn JL, Leebens-Mack JH, Gostel MR, Liu X, Soltis PS. Green plant genomes: what we know in an era of rapidly expanding opportunities. Proc Natl Acad Sci U S A. 2022;119(4): e2115640118. https://doi.org/10.1073/pnas.2115640118
He W, Li X, Qian Q, Shang L. The developments and prospects of plant super pangenomes: demands, approaches and applications. Plant Commun. 2024;6(2):101230.
Gregory TR, Nicol JA, Tamm H, Kullman B, Kullman K, Leitch IJ, Murray BG, Kapraun DF, Greilhuber J, Bennett MD. Eukaryotic genome size databases. Nucleic Acids Res. 2007;35(Database issue):D332-338.
Pflug JM, Holmes VR, Burrus C, Johnston JS, Maddison DR. Measuring genome sizes using read-depth, k-mers, and flow cytometry: methodological comparisons in beetles (Coleoptera). G3: Genes|Genomes|Genetics. 2020;10(9):3047–60.
Pfenninger M, Schonnenbeck P, Schell T. ModEst: accurate estimation of genome size from next generation sequencing data. Mol Ecol Resour. 2022;22(4):1454–64.
Guenzi-Tiberi P, Istace B, Alsos IG, Coissac E, Lavergne S, Aury JM, Denoeud F. LocoGSE, a sequence-based genome size estimator for plants. Front Plant Sci. 2024;15: 1328966.
Natarajan S, Gehrke J, Pucker B. Mapping-based genome size estimation. BMC Genomics. 2025;26(1): 482.
Moeckel C, Mareboina M, Konnaris MA, Chan CSY, Mouratidis I, Montgomery A, Chantzi N, Pavlopoulos GA, Georgakopoulos-Soares I. A survey of k-mer methods and applications in bioinformatics. Comput Struct Biotechnol J. 2024;23:2289–303.
Hesse U. K-mer-based genome size estimation in theory and practice. Methods Mol Biol. 2023;2672:79–113.
Hao F, Liu X, Zhou BT, Tian ZZ, Zhou LN, Zong H, Qi JY, He J, Zhang YT, Zeng P, et al. Chromosome-level genomes of three key Allium crops and their trait evolution. Nat Genet. 2023;55:1976-1986. https://doi.org/10.1038/s41588-023-01546-0
Goodwin S, McPherson JD, McCombie WR. Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet. 2016;17(6):333–51.
Wenger AM, Peluso P, Rowell WJ, Chang PC, Hall RJ, Concepcion GT, Ebler J, Fungtammasan A, Kolesnikov A, Olson ND, et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol. 2019;37(10):1155–62.
Ranallo-Benavidez TR, Jaron KS, Schatz MC. GenomeScope 2.0 and smudgeplot for reference-free profiling of polyploid genomes. Nat Commun. 2020;11:1432. https://doi.org/10.1038/s41467-020-14998-3
Scarano C, Veneruso I, De Simone RR, Di Bonito G, Secondino A, D’Argenio V. The third-generation sequencing challenge: novel insights for the omic sciences. Biomolecules. 2024;14(5): 568. https://doi.org/10.3390/biom14050568
Espinosa E, Bautista R, Larrosa R, Plata O. Advancements in long-read genome sequencing technologies and algorithms. Genomics. 2024;116(3): 110842.
Zhao Z, Ng YK, Fang X, Li S. Eliminating heterozygosity from reads through coverage normalization. 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2016:174–177. https://doi.org/10.1109/BIBM.2016.7822514
Sun J, Zhang YF, Wang MH, Guan Q, Yang XJ, Ou JX, Yan MC, Wang CR, Zhang Y, Li ZH, et al. The biological significance of multi-copy regions and their impact on variant discovery. Genomics Proteomics Bioinformatics. 2020;18(5):516–24.
Makino T, McLysaght A. Ohnologs in the human genome are dosage balanced and frequently associated with disease. Proc Natl Acad Sci U S A. 2010;107(20):9270–4.
Nakatani Y, Takeda H, Kohara Y, Morishita S. Reconstruction of the vertebrate ancestral genome reveals dynamic genome reorganization in early vertebrates. Genome Res. 2007;17(9):1254–65.
Dehal P, Boore JL. Two rounds of whole genome duplication in the ancestral vertebrate. PLoS Biol. 2005;3(10): e314.
McLysaght A, Hokamp K, Wolfe KH. Extensive genomic duplication during early chordate evolution. Nat Genet. 2002;31(2):200–4.
Bowers JE, Chapman BA, Rong J, Paterson AH. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature. 2003;422(6930):433–8.
Qiao X, Zhang SL, Paterson AH. Pervasive genome duplications across the plant tree of life and their links to major evolutionary innovations and transitions. Comput Struct Biotechnol J. 2022;20:3248–56.
Li FW, Nishiyama T, Waller M, Frangedakis E, Keller J, Li Z, Fernandez-Pozo N, Barker MS, Bennett T, Blazquez MA, et al. Anthoceros genomes illuminate the origin of land plants and the unique biology of hornworts. Nat Plants. 2020;6(3):259–72.
Lemieux C, Turmel M, Otis C, Pombert JF. A streamlined and predominantly diploid genome in the tiny marine green alga Chloropicon primus. Nat Commun. 2019;10(1):4061. https://doi.org/10.1038/s41467-019-12014-x
Sun P, Jiao B, Yang Y, Shan L, Li T, Li X, Xi Z, Wang X, Liu J. WGDI: a user-friendly toolkit for evolutionary analyses of whole-genome duplications and ancestral karyotypes. Mol Plant. 2022;15(12):1841–51.
Rabier CE, Ta T, Ane C. Detecting and locating whole genome duplications on a phylogeny: a probabilistic approach. Mol Biol Evol. 2014;31(3):750–62.
Kellis M, Birren BW, Lander ES. Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature. 2004;428(6983):617–24.
Vollger MR, Guitart X, Dishuck PC, Mercuri L, Harvey WT, Gershman A, Diekhans M, Sulovari A, Munson KM, Lewis AP, et al. Segmental duplications and their variation in a complete human genome. Science. 2022;376(6588):55–.
Cheng H, Jarvis ED, Fedrigo O, Koepfli KP, Urban L, Gemmell NJ, Li H. Haplotype-resolved assembly of diploid genomes without parental data. Nat Biotechnol. 2022;40(9):1332–5.
Cheng H, Asri M, Lucas J, Koren S, Li H. Scalable telomere-to-telomere assembly for diploid and polyploid genomes with double graph. Nat Methods. 2024;21(6):967–70.
Chor B, Horn D, Goldman N, Levy Y, Massingham T. Genomic DNA k-mer spectra: models and modalities. Genome Biol. 2009;10(10):R108.
Cheng L, Wang N, Bao Z, Zhou Q, Guarracino A, Yang Y, Wang P, Zhang Z, Tang D, Zhang P, et al. Leveraging a phased pangenome for haplotype design of hybrid potato. Nature. 2025;640:408-417. https://doi.org/10.1038/s41586-024-08476-9
Hardigan MA, Laimbeer FPE, Newton L, Crisovan E, Hamilton JP, Vaillancourt B, Wiegert-Rininger K, Wood JC, Douches DS, Farre EM, et al. Genome diversity of tuber-bearing Solanum uncovers complex evolutionary history and targets of domestication in the cultivated potato. Proc Natl Acad Sci U S A. 2017;114(46):E9999-10008.
Nurk S, Koren S, Rhie A, Rautiainen M, Bzikadze AV, Mikheenko A, Vollger MR, Altemose N, Uralsky L, Gershman A, et al. The complete sequence of a human genome. Science. 2022;376(6588):44-53. https://doi.org/10.1126/science.abj6987
Wang B, Yang X, Jia Y, Xu Y, Jia P, Dang N, Wang S, Xu T, Zhao X, Gao S, et al. High-quality Arabidopsis thaliana genome assembly with nanopore and HiFi long reads. Genomics Proteomics Bioinformatics. 2022;20(1):4–13.
Shang LG, He WC, Wang TY, Yang YX, Xu Q, Zhao XJ, Yang LB, Zhang H, Li XX, Lv Y, et al. A complete assembly of the rice Nipponbare reference genome. Mol Plant. 2023;16(8):1232–6.
Xu S, Chen R, Zhang X, Wu Y, Yang L, Sun Z, Zhu Z, Song A, Wu Z, Li T, et al. The evolutionary tale of lilies: giant genomes derived from transposon insertions and polyploidization. Innovation (Camb). 2024;5(6): 100726.
Healey AL, Garsmeur O, Lovell JT, Shengquiang S, Sreedasyam A, Jenkins J, Plott CB, Piperidis N, Pompidor N, Llaca V, et al. The complex polyploid genome architecture of sugarcane. Nature. 2024;628(8009):804–10.
Schartl M, Woltering JM, Irisarri I, Du K, Kneitz S, Pippel M, Brown T, Franchini P, Li J, Li M, et al. The genomes of all lungfish inform on genome expansion and tetrapod evolution. Nature. 2024;624(8032):96-103. https://doi.org/10.1038/s41586-024-07830-1
Shao C, Sun S, Liu K, Wang J, Li S, Liu Q, Deagle BE, Seim I, Biscontin A, Wang Q, et al. The enormous repetitive Antarctic Krill genome reveals environmental adaptations and population insights. Cell. 2023;186(6):1279–94. e1219. https://doi.org/10.1016/j.cell.2023.02.005
Peng Y, Yan H, Guo L, Deng C, Wang C, Wang Y, Kang L, Zhou P, Yu K, Dong X, et al. Reference genome assemblies reveal the origin and evolution of allohexaploid oat. Nat Genet. 2022;54(8):1248–58.
Chen W, Yan M, Chen S, Sun J, Wang J, Meng D, Li J, Zhang L, Guo L. The complete genome assembly of Nicotiana benthamiana reveals the genetic and epigenetic landscape of centromeres. Nat Plants. 2024;10(12):1928–43.
Zhang J, Qi Y, Hua X, Wang Y, Wang B, Qi Y, Huang Y, Yu Z, Gao R, Zhang Y, et al. The highly allo-autopolyploid modern sugarcane genome and very recent allopolyploidization in Saccharum. Nat Genet. 2025;57:242-253. https://doi.org/10.1038/s41588-024-02033-w
Huang HR, Liu X, Arshad R, Wang X, Li WM, Zhou Y, Ge XJ. Telomere-to-telomere haplotype-resolved reference genome reveals subgenome divergence and disease resistance in triploid Cavendish banana. Hortic Res. 2023;10(9): uhad153.
Bao Z, Li C, Li G, Wang P, Peng Z, Cheng L, Li H, Zhang Z, Li Y, Huang W, et al. Genome architecture and tetrasomic inheritance of autotetraploid potato. Mol Plant. 2022;15(7):1211–26.
Fernandez P, Amice R, Bruy D, Christenhusz MJM, Leitch IJ, Leitch AL, Pokorny L, Hidalgo O, Pellicer J. A 160 Gbp fork fern genome shatters size record for eukaryotes. iScience. 2024;27(6): 109889.
Meyers LA, Levin DA. On the abundance of polyploids in flowering plants. Evolution. 2006;60(6):1198–206.
Reis AC, Franco AL, Campos VR, Souza FR, Zorzatto C, Viccini LF, Sousa SM. rDNA mapping, heterochromatin characterization and AT/GC content of Agapanthus africanus (L.) Hoffmanns (Agapanthaceae). An Acad Bras Cienc. 2016;88(3 Suppl):1727–34.
Ohri D, Fritsch RM, Hanelt P. Evolution of genome size in Allium (Alliaceae). Plant Syst Evol. 1998;210(1):57–86.
Ricroch A, Yockteng R, Brown SC, Nadot S. Evolution of genome size across some cultivated Allium species. Genome. 2005;48(3):511–20.
Greilhuber J, Dolezel J, Lysak MA, Bennett MD. The origin, evolution and proposed stabilization of the terms ‘genome size’ and ‘C-value’ to describe nuclear DNA contents. Ann Bot. 2005;95(1):255–260. https://doi.org/10.1093/aob/mci019
Koren S, Rhie A, Walenz BP, Dilthey AT, Bickhart DM, Kingan SB, Hiendleder S, Williams JL, Smith TPL, Phillippy AM. De novo assembly of haplotype-resolved genomes with trio binning. Nat Biotechnol. 2018;36:1174-1182. https://doi.org/10.1038/nbt.4277
Cheng H, Concepcion GT, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods. 2021;18(2):170–5.
Jia KH, Wang ZX, Wang LX, Li GY, Zhang W, Wang XL, Xu FJ, Jiao SQ, Zhou SS, Liu H, et al. SubPhaser: a robust allopolyploid subgenome phasing method based on subgenome-specific k-mers. New Phytol. 2022;235(2):801–9.
Wendel JF. Genome evolution in polyploids. Plant Mol Biol. 2000;42(1):225–49.
Otto SP, Whitton J. Polyploid incidence and evolution. Annu Rev Genet. 2000;34(1):401–37.
Soltis PS, Soltis DE. The role of hybridization in plant speciation. Annu Rev Plant Biol. 2009;60:561–88.
del Pozo JC, Ramirez-Parra E. Whole genome duplications in plants: an overview from Arabidopsis. J Exp Bot. 2015;66(22):6991–7003.
Eckardt NA. Two genomes are better than one: widespread paleopolyploidy in plants and evolutionary effects. Plant Cell. 2004;16(7):1647–1649. https://doi.org/10.1105/tpc.160710
Li H, Durbin R. Genome assembly in the telomere-to-telomere era. Nat Rev Genet. 2024;25(9):658-670. https://doi.org/10.1038/s41576-024-00718-w
Alser M, Rotman J, Deshpande D, Taraszka K, Shi H, Baykal PI, Yang HT, Xue V, Knyazev S, Singer BD, et al. Technology dictates algorithms: recent developments in read alignment. Genome Biol. 2021. https://doi.org/10.1186/s13059-021-02443-7.
Bates S, Dessimoz C, Nevers Y. OMAnnotator: a novel approach to building an annotated consensus genome sequence. bioRxiv. 2024;626846.
Zeng XF, Yi ZL, Zhang XT, Du YH, Li Y, Zhou ZQ, Chen SJ, Zhao HJ, Yang S, Wang YB, et al. Chromosome-level scaffolding of haplotype-resolved assemblies using Hi-C data without reference genomes. Nat Plants. 2024;10:1184-1200. https://doi.org/10.1038/s41477-024-01755-3
Liu G, Chen L, Wu Y, Han Y, Bao Y, Zhang T. PDLLMs: a group of tailored DNA large language models for analyzing plant genomes. Mol Plant. 2024;18(2):175-178. https://doi.org/10.1016/j.molp.2024.12.006
Behera S, Catreux S, Rossi M, Truong S, Huang ZY, Ruehle M, Visvanath A, Parnaby G, Roddey C, Onuchic V, et al. Comprehensive genome analysis and variant detection at scale using DRAGEN. Nat Biotechnol. 2024. https://doi.org/10.1038/s41587-024-02382-1.
Chen Y, Huang JH, Sun Y, Zhang Y, Li Y, Xu X. Haplotype-resolved assembly of diploid and polyploid genomes using quantum computing. Cell Rep Methods. 2024;4(5): 100754.
Nurk S, Walenz BP, Rhie A, Vollger MR, Logsdon GA, Grothe R, Miga KH, Eichler EE, Phillippy AM, Koren S. HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads. Genome Res. 2020;30(9):1291–305.
Nie F, Ni P, Huang N, Zhang J, Wang Z, Xiao C, Luo F, Wang J. De novo diploid genome assembly using long noisy reads. Nat Commun. 2024;15(1):2964. https://doi.org/10.1038/s41467-024-47349-7
Song L, Florea L, Langmead B. Lighter: fast and memory-efficient sequencing error correction without counting. Genome Biol. 2014;15:509. https://doi.org/10.1186/s13059-014-0509-9
Marcais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27(6):764–70.
Kokot M, Dlugosz M, Deorowicz S. KMC 3: counting and manipulating k-mer statistics. Bioinformatics. 2017;33(17):2759–2761. https://doi.org/10.1093/bioinformatics/btx304
Martayan I, Robidou L, Shibuya Y, Limasset A. Hyper-k-mers: efficient streaming k-mers representation. bioRxiv. 2024:2024.11.06.620789. https://doi.org/10.1101/2024.11.06.620789
Chikhi R, Medvedev P. Informed and automated k-mer size selection for genome assembly. Bioinformatics. 2014;30(1):31–7.
Sun H, Ding J, Piednoel M, Schneeberger K. FindGSE: estimating genome size variation within human and Arabidopsis using k-mer frequencies. Bioinformatics. 2018;34(4):550–7.
Sarmashghi S, Balaban M, Rachtman E, Touri B, Mirarab S, Bafna V. Estimating repeat spectra and genome length from low-coverage genome skims with RESPECT. PLoS Comput Biol. 2021;17(11): e1009449.
Chen S, Zhou Y, Chen Y, Gu J. Fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34(17):i884-90.
Said SE, Dickey DA. Testing for unit roots in autoregressive-moving average models of unknown order. Biometrika. 1984;71(3):599–607. https://doi.org/10.1093/biomet/71.3.599
Banerjee A, Dolado JJ, Galbraith JW, Hendry D. Co-integration, error correction, and the econometric analysis of non-stationary data. Oxford University Press; 1993.
Trapletti A, Hornik K. tseries: time series analysis and computational finance. R package version 0.10-58. 2024. https://CRAN.R-project.org/package=tseries
Tang H, Krishnakumar V, Zeng X, Xu Z, Taranto A, Lomas JS, Zhang Y, Huang Y, Wang Y, Yim WC, et al. JCVI: a versatile toolkit for comparative genomics analysis. Imeta. 2024;3(4): e211.
Kielbasa SM, Wan R, Sato K, Horton P, Frith MC. Adaptive seeds tame genomic sequence comparison. Genome Res. 2011;21(3):487–93.