Blackwell GA, Hunt M, Malone KM, Lima L, Horesh G, Alako BTF, et al. Exploring bacterial diversity via a curated and searchable snapshot of archived DNA sequences. Hanage WP, editor. PLoS Biol. 2021;19:e3001421.
Wong ZSY, Zhou J, Zhang Q. Artificial intelligence for infectious disease big data analytics. Infect Dis Health. 2019;24:44–8.
Ow GS, Tang Z, Kuznetsov VA. Big data and computational biology strategy for personalized prognosis. Oncotarget. 2016;7:40200–20.
Bommasani R, Hudson DA, Adeli E, Altman R, Arora S, von Arx S, et al. On the Opportunities and Risks of Foundation Models. arXiv; 2021. Available from: https://arxiv.org/abs/2108.07258. [cited 2025 Sept 2].
Abramson J, Adler J, Dunger J, Evans R, Green T, Pritzel A, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024;630:493–500.
Pagès-Gallego M, De Ridder J. Comprehensive benchmark and architectural analysis of deep learning models for nanopore sequencing basecalling. Genome Biol. 2023;24:71.
Torres MDT, Brooks EF, Cesaro A, Sberro H, Gill MO, Nicolaou C, et al. Mining human microbiomes reveals an untapped source of peptide antibiotics. Cell. 2024;187:5453-5467.e15.
Wan F, Torres MDT, Peng J, De La Fuente-Nunez C. Deep-learning-enabled antibiotic discovery through molecular de-extinction. Nat Biomed Eng. 2024;8:854–71.
Iwashyna TJ, Liu V. What’s So Different about Big Data? A Primer for Clinicians Trained to Think Epidemiologically. Annals ATS. 2014;11:1130–5.
Murphy KP. Probabilistic machine learning: an introduction. Cambridge, Massachusetts: The MIT Press; 2022.
Murphy KP. Probabilistic machine learning: advanced topics. Cambridge, Massachusetts: The MIT Press; 2023.
Breiman L. Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author). Statist Sci. 2001;16. Available from: https://projecteuclid.org/journals/statistical-science/volume-16/issue-3/Statistical-Modeling–The-Two-Cultures-with-comments-and-a/10.1214/ss/1009213726.full. [cited 2025 Sept 2].
Bzdok D, Altman N, Krzywinski M. Statistics versus machine learning. Nat Methods. 2018;15:233–4.
Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw. 2015;61:85–117.
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine learning in Python. J Mach Learn Res. 2011;12:2825–30.
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Advances in Neural Information Processing Systems 32. Curran Associates, Inc; 2019;8024–35.
TensorFlow Developers. TensorFlow. Zenodo; 2024. Available from: https://zenodo.org/doi/10.5281/zenodo.12726004. [cited 2025 Sept 2].
Greene AC, Giffin KA, Greene CS, Moore JH. Adapting bioinformatics curricula for big data. Brief Bioinform. 2016;17:43–50.
Wiemken TL, Kelley RR. Machine learning in epidemiology and health outcomes research. Annu Rev Public Health. 2020;41:21–36.
Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, et al. Language models are few-shot learners. Adv Neural Inf Process Syst. 2020;33:1877–901.
Falush D, Wirth T, Linz B, Pritchard JK, Stephens M, Kidd M, et al. Traces of human migrations in Helicobacter pylori populations. Science. 2003;299:1582–5.
Corander J, Marttinen P. Bayesian identification of admixture events using multilocus molecular markers. Mol Ecol. 2006;15:2833–43.
Tonkin-Hill G, Lees JA, Bentley SD, Frost SDW, Corander J. Fast hierarchical Bayesian analysis of population structure. Nucleic Acids Res. 2019;47:5539–49.
Lees JA, Tonkin-Hill G, Yang Z, Corander J. Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation. Phil Trans R Soc B. 2022;377:20210237.
Jaillard M, Lima L, Tournoud M, Mahé P, Van Belkum A, Lacroix V, et al. A fast and agnostic method for bacterial genome-wide association studies: Bridging the gap between k-mers and genetic events. Didelot X, editor. PLoS Genet. 2018;14:e1007758.
Hoffman S, Podgurski A. Big bad data: law, public health, and biomedical databases. J Law Med Ethics. 2013;41:56–60.
Wang Q, Ma Y, Zhao K, Tian Y. A comprehensive survey of loss functions in machine learning. Ann Data Sci. 2022;9:187–212.
Stone M. Cross-Validatory Choice and Assessment of Statistical Predictions. J Royal Statistic Soc Series B (Methodological). 1974;36:111–47.
Bzdok D, Krzywinski M, Altman N. Machine learning: a primer. Nat Methods. 2017;14:1119–20.
Bashir D, Montañez GD, Sehra S, Segura PS, Lauw J. An Information-T. Cham: Springer International Publishing; 2020; 347–58. Available from: https://link.springer.com/10.1007/978-3-030-64984-5_27. [cited 2025 Sept 2].
Fix E, Hodges JL. Discriminatory analysis: Nonparametric discrimination: Consistency properties: (471672008–001). 1951. Available from: https://doi.apa.org/doi/10.1037/e471672008-001. [cited 2025 Sept 2].
Cover T, Hart P. Nearest neighbor pattern classification. IEEE Trans Inform Theory. 1967;13:21–7.
Yao Z, Ruzzo WL. A regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data. BMC Bioinformatics. 2006;7:S11.
Mihelčić M, Šmuc T, Supek F. Patterns of diverse gene functions in genomic neighborhoods predict gene function and phenotype. Sci Rep. 2019;9:19537.
Xu S. Bayesian naïve Bayes classifiers to text classification. J Inf Sci. 2018;44:48–59.
John GH, Langley P. Estimating Continuous Distributions in Bayesian Classifiers. arXiv; 2013. Available from: https://arxiv.org/abs/1302.4964. [cited 2025 Sept 2].
Webb GI. Naïve Bayes. In: Sammut C, Webb GI, editors. Encyclopedia of Machine Learning. Boston, MA: Springer US; 2011;713–4. Available from: https://link.springer.com/10.1007/978-0-387-30164-8_576. [cited 2025 Sept 2].
Li F, Shen Y, Lv D, Lin J, Liu B, He F, et al. A Bayesian classification model for discriminating common infectious diseases in Zhejiang province, China. Medicine. 2020;99:e19218.
Zhao Z, Cristian A, Rosen G. Keeping up with the genomes: efficient learning of our increasing knowledge of the tree of life. BMC Bioinformatics. 2020;21:412.
Sandberg R, Winberg G, Bränden C-I, Kaske A, Ernberg I, Cöster J. Capturing whole-genome characteristics in short sequences using a naïve Bayesian classifier. Genome Res. 2001;11:1404–9.
Ben-Hur A, Ong CS, Sonnenburg S, Schölkopf B, Rätsch G. Support vector machines and kernels for computational biology. Lewitter F, editor. PLoS Comput Biol. 2008;4:e1000173.
McIntyre ABR, Ounit R, Afshinnekoo E, Prill RJ, Hénaff E, Alexander N, et al. Comprehensive benchmarking and ensemble approaches for metagenomic classifiers. Genome Biol. 2017;18:182.
Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20:273–97.
Tsirigos A. A sensitive, support-vector-machine method for the detection of horizontal gene transfers in viral, archaeal and bacterial genomes. Nucleic Acids Res. 2005;33:3699–707.
Weimann A, Mooren K, Frank J, Pope PB, Bremges A, McHardy AC. From Genomes to Phenotypes: Traitar, the Microbial Trait Analyzer. Segata N, editor. mSystems. 2016;1:e00101–16.
Belman S, Pesonen H, Croucher NJ, Bentley SD, Corander J. Estimating Between Country Migration in Pneumococcal Populations. Epidemiology; 2023. Available from: http://medrxiv.org/lookup/doi/10.1101/2023.11.15.23298520. [cited 2025 Sept 2].
Lupolova N, Dallman TJ, Holden NJ, Gally DL. Patchy promiscuity: machine learning applied to predict the host specificity of Salmonella enterica and Escherichia coli. Microbial Genomics. 2017;3. Available from: https://www.microbiologyresearch.org/content/journal/mgen/10.1099/mgen.0.000135. [cited 2025 Sept 2].
Quinlan JR. Induction of decision trees. Mach Learn. 1986;1:81–106.
Li M, Xu H, Deng Y. Evidential decision tree based on belief entropy. Entropy. 2019;21:897.
Schrider DR, Kern AD. Supervised machine learning for population genetics: a new paradigm. Trends Genet. 2018;34:301–12.
Breiman L. Random forests. Mach Learn. 2001;45:5–32.
Statnikov A, Henaff M, Narendra V, Konganti K, Li Z, Yang L, et al. A comprehensive evaluation of multicategory classification methods for microbiomic data. Microbiome. 2013;1:11.
Deneke C, Rentzsch R, Renard BY. PaPrBaG: a machine learning approach for the detection of novel pathogens from NGS data. Sci Rep. 2017;7:39194.
Méric G, Mageiros L, Pensar J, Laabei M, Yahara K, Pascoe B, et al. Disease-associated genotypes of the commensal skin bacterium Staphylococcus epidermidis. Nat Commun. 2018;9:5034.
Mageiros L, Méric G, Bayliss SC, Pensar J, Pascoe B, Mourkas E, et al. Genome evolution and the emergence of pathogenicity in avian Escherichia coli. Nat Commun. 2021;12:765.
Chen ML, Doddi A, Royer J, Freschi L, Schito M, Ezewudo M, et al. Beyond multidrug resistance: leveraging rare variants with machine and statistical learning models in Mycobacterium tuberculosis resistance prediction. EBioMedicine. 2019;43:356–69.
Li Y, Metcalf BJ, Chochua S, Li Z, Gertz RE, Walker H, et al. Validation of β-lactam minimum inhibitory concentration predictions for pneumococcal isolates with newly encountered penicillin binding protein (PBP) sequences. BMC Genomics. 2017;18:621.
Arning N, Sheppard SK, Bayliss S, Clifton DA, Wilson DJ. Machine learning to predict the source of campylobacteriosis using whole genome data. Hughes D, editor. PLoS Genet. 2021;17:e1009436.
Pascoe B, Futcher G, Pensar J, Bayliss SC, Mourkas E, Calland JK, et al. Machine learning to attribute the source of Campylobacter infections in the United States: a retrospective analysis of national surveillance data. J Infect. 2024;89:106265.
Wheeler NE, Gardner PP, Barquist L. Machine learning identifies signatures of host adaptation in the bacterial pathogen Salmonella enterica. Didelot X, editor. PLoS Genet. 2018;14:e1007333.
Zhang S, Li S, Gu W, Den Bakker H, Boxrud D, Taylor A, et al. Zoonotic Source Attribution of Salmonella enterica Serotype Typhimurium Using Genomic Surveillance Data, United States. Emerg Infect Dis. 2019;25. Available from: http://wwwnc.cdc.gov/eid/article/25/1/18-0835_article.htm. [cited 2025 Sept 2].
Beavan AJS, Domingo-Sananes MR, McInerney JO. Contingency, repeatability, and predictability in the evolution of a prokaryotic pangenome. Proc Natl Acad Sci USA. 2024;121:e2304934120.
Mason L, Baxter J, Bartlett P, Frean M. Boosting Algorithms as Gradient Descent. Advances in Neural Information Processing Systems. MIT Press; 1999. Available from: https://proceedings.neurips.cc/paper/1999/hash/96a93ba89a5b5c6c226e49b88973f46e-Abstract.html.
Friedman JH. Greedy function approximation: A gradient boosting machine. Ann Statist. 2001;29. Available from: https://projecteuclid.org/journals/annals-of-statistics/volume-29/issue-5/Greedy-function-approximation-A-gradient-boosting-machine/10.1214/aos/1013203451.full. [cited 2025 Sept 2].
Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, et al. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY, USA: Curran Associates Inc; 2017;3149–57.
Anahtar MN, Yang JH, Kanjilal S. Applications of Machine Learning to the Problem of Antimicrobial Resistance: an Emerging Model for Translational Research. McAdam AJ, editor. J Clin Microbiol. 2021;59:e01260–20.
Ramoneda J, Stallard-Olivera E, Hoffert M, Winfrey CC, Stadler M, Niño-García JP, et al. Building a genome-based understanding of bacterial pH preferences. Sci Adv. 2023;9:eadf8998.
Hopfield JJ. Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci U S A. 1982;79:2554–8.
Sheehan S, Song YS. Deep Learning for Population Genetic Inference. Chen K, editor. PLoS Comput Biol. 2016;12:e1004845.
Li Y, Huang C, Ding L, Li Z, Pan Y, Gao X. Deep learning in bioinformatics: introduction, application, and perspective in the big data era. Methods. 2019;166:4–21.
Sejnowski TJ. The Deep Learning Revolution. The MIT Press; 2018. Available from: https://direct.mit.edu/books/book/4111/The-Deep-Learning-Revolution. [cited 2025 Sept 2].
Lugo L, Hernández EB. A recurrent neural network approach for whole genome bacteria identification. Appl Artif Intell. 2021;35:642–56.
Hasan MA, Lonardi S. DeeplyEssential: a deep neural network for predicting essential genes in microbes. BMC Bioinformatics. 2020;21:367.
Assaf R, Xia F, Stevens R. Detecting operons in bacterial genomes via visual representation learning. Sci Rep. 2021;11:2124.
Wiatrak M, Weimann A, Dinan A, Brbić M, Floto RA. Sequence-based modelling of bacterial genomes enables accurate antibiotic resistance prediction. Microbiology; 2024. Available from: http://biorxiv.org/lookup/doi/10.1101/2024.01.03.574022. [cited 2025 Sept 2].
Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Netw. 1989;2:359–66.
Zhang C, Bengio S, Hardt M, Recht B, Vinyals O. Understanding deep learning requires rethinking generalization. arXiv; 2016. Available from: https://arxiv.org/abs/1611.03530. [cited 2025 Sept 2].
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Advances in Neural Information Processing Systems. 2017;30.
Ouyang L, Wu J, Jiang X, Almeida D, Wainwright C, Mishkin P, et al. Training language models to follow instructions with human feedback. Adv Neural Inf Process Syst. 2022;35:27730–44.
Holz HJ, Loew MH. Relative feature importance: A classifier-independent approach to feature selection. Machine Intelligence and Pattern Recognition. Elsevier; 1994;473–87. Available from: https://linkinghub.elsevier.com/retrieve/pii/B9780444818928500468. [cited 2025 Sept 2].
Murdoch WJ, Singh C, Kumbier K, Abbasi-Asl R, Yu B. Definitions, methods, and applications in interpretable machine learning. Proc Natl Acad Sci USA. 2019;116:22071–80.
House of Commons Science, Innovation and Technology Committee. 2023. The governance of artificial intelligence: interim report. Ninth Report of Session 2022–23. HC1769. https://committees.parliament.uk/publications/41130/documents/205611/default/
Nielsen EM, Fussing V, Engberg J, Nielsen NL, Neimann J. Most Campylobacter subtypes from sporadic infections can be found in retail poultry products and food animals. Epidemiol Infect. 2006;134:758–67.
Garrett N, Devane ML, Hudson JA, Nicol C, Ball A, Klena JD, et al. Statistical comparison of Campylobacter jejuni subtypes from human cases and environmental sources: comparison of Campylobacter subtypes. J Appl Microbiol. 2007;103:2113–21.
Wilson DJ, Gabriel E, Leatherbarrow AJH, Cheesbrough J, Gee S, Bolton E, et al. Tracing the Source of Campylobacteriosis. Guttman DS, editor. PLoS Genet. 2008;4:e1000203.
Sheppard SK, Dallas JF, Strachan NJC, MacRae M, McCarthy ND, Wilson DJ, et al. Campylobacter genotyping to determine the source of human infection. Clin Infect Dis. 2009;48:1072–8.
Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, California, USA: ACM; 2016;785–94. Available from: https://dl.acm.org/doi/10.1145/2939672.2939785. [cited 2025 Sept 2].
Mackay TFC. The genetic architecture of quantitative traits. Annu Rev Genet. 2001;35:303–39.
Peacock SJ, Moore CE, Justice A, Kantzanou M, Story L, Mackie K, et al. Virulent combinations of adhesin and toxin genes in natural populations of Staphylococcus aureus. Infect Immun. 2002;70:4987–96.
Astle W, Balding DJ. Population Structure and Cryptic Relatedness in Genetic Association Studies. Statist Sci. 2009;24. Available from: https://projecteuclid.org/journals/statistical-science/volume-24/issue-4/Population-Structure-and-Cryptic-Relatedness-in-Genetic-Association-Studies/10.1214/09-STS307.full. [cited 2025 Sept 2].
Price AL, Zaitlen NA, Reich D, Patterson N. New approaches to population stratification in genome-wide association studies. Nat Rev Genet. 2010;11:459–63.
Sheppard SK. Strain wars and the evolution of opportunistic pathogens. Curr Opin Microbiol. 2022;67:102138.
Pearl J. Causal inference in statistics: An overview. Statist Surv. 2009;3. Available from: https://projecteuclid.org/journals/statistics-surveys/volume-3/issue-none/Causal-inference-in-statistics-An-overview/10.1214/09-SS057.full. [cited 2025 Sept 2].
Zhu Z, Zheng Z, Zhang F, Wu Y, Trzaskowski M, Maier R, et al. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat Commun. 2018;9:224.
Sheppard SK, Didelot X, Meric G, Torralbo A, Jolley KA, Kelly DJ, et al. Genome-wide association study identifies vitamin B5 biosynthesis as a host specificity factor in Campylobacter. Proc Natl Acad Sci USA. 2013;110:11923–7.
Earle SG, Wu C-H, Charlesworth J, Stoesser N, Gordon NC, Walker TM, et al. Identifying lineage effects when controlling for population structure improves power in bacterial association studies. Nat Microbiol. 2016;1:16041.
Lees JA, Galardini M, Bentley SD, Weiser JN, Corander J. pyseer: a comprehensive tool for microbial pangenome-wide association studies. Stegle O, editor. Bioinformatics. 2018;34:4310–2.
Young BC, Earle SG, Soeng S, Sar P, Kumar V, Hor S, et al. Panton-Valentine leucocidin is the key determinant of Staphylococcus aureus pyomyositis in a bacterial GWAS. Elife. 2019;8:e42486.
Earle SG, Lobanovska M, Lavender H, Tang C, Exley RM, Ramos-Sevillano E, et al. Genome-wide association studies reveal the role of polymorphisms affecting factor H binding protein expression in host invasion by Neisseria meningitidis. Nassif X, editor. PLoS Pathog. 2021;17:e1009992.
Green AG, Yoon CH, Chen ML, Ektefaie Y, Fina M, Freschi L, et al. A convolutional neural network highlights mutations relevant to antimicrobial resistance in Mycobacterium tuberculosis. Nat Commun. 2022;13:3817.
The CRyPTIC Consortium. Genome-wide association studies of global Mycobacterium tuberculosis resistance to 13 antimicrobials in 10,228 genomes identify new resistance mechanisms. Ladner J, editor. PLoS Biol. 2022;20:e3001755.
Mosquera-Rendón J, Moreno-Herrera CX, Robledo J, Hurtado-Páez U. Genome-wide association studies (GWAS) approaches for the detection of genetic variants associated with antibiotic resistance: a systematic review. Microorganisms. 2023;11:2866.
Didelot X, Bowden R, Wilson DJ, Peto TEA, Crook DW. Transforming clinical microbiology with bacterial genome sequencing. Nat Rev Genet. 2012;13:601–12.
Walker TM, Cruz ALG, Peto TE, Smith EG, Esmail H, Crook DW. Tuberculosis is changing. Lancet Infect Dis. 2017;17:359–61.
Satta G, Lipman M, Smith GP, Arnold C, Kon OM, McHugh TD. Mycobacterium tuberculosis and whole-genome sequencing: how close are we to unleashing its full potential? Clin Microbiol Infect. 2018;24:604–9.
Jakobsdottir J, Gorin MB, Conley YP, Ferrell RE, Weeks DE. Interpretation of Genetic Association Studies: Markers with Replicated Highly Significant Odds Ratios May Be Poor Classifiers. Abecasis GR, editor. PLoS Genet. 2009;5:e1000337.
Yang Y, Niehaus KE, Walker TM, Iqbal Z, Walker AS, Wilson DJ, et al. Machine learning for classifying tuberculosis drug-resistance from DNA sequencing data. Birol I, editor. Bioinformatics. 2018;34:1666–71.
Kouchaki S, Yang Y, Walker TM, Walker AS, Wilson DJ, Peto TEA, et al. Application of machine learning techniques to tuberculosis drug resistance analysis. Wren J, editor. Bioinformatics. 2019;35:2276–82.
Yang Y, Walker TM, Walker AS, Wilson DJ, Peto TEA, Crook DW, et al. DeepAMR for predicting co-occurrent resistance of Mycobacterium tuberculosis. Hancock J, editor. Bioinformatics. 2019;35:3240–9.
Gröschel MI, Owens M, Freschi L, Vargas R, Marin MG, Phelan J, et al. GenTB: A user-friendly genome-based predictor for tuberculosis resistance powered by machine learning. Genome Med. 2021;13:138.
The CRyPTIC Consortium and the 100,000 Genomes Project. Prediction of Susceptibility to First-Line Tuberculosis Drugs by DNA Sequencing. N Engl J Med. 2018;379:1403–15.
He G, Zheng Q, Shi J, Wu L, Huang B, Yang Y. Evaluation of WHO catalog of mutations and five WGS analysis tools for drug resistance prediction of Mycobacterium tuberculosis isolates from China. Georghiou SB, editor. Microbiol Spectr. 2024;12:e03341–23.
Ferrari E, Retico A, Bacciu D. Measuring the effects of confounders in medical supervised classification problems: the confounding index (CI). Artif Intell Med. 2020;103:101804.
Ribeiro MT, Singh S, Guestrin C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, California, USA: ACM; 2016;1135–44. Available from: https://dl.acm.org/doi/10.1145/2939672.2939778. [cited 2025 Sept 2].
Lundberg S, Lee S-I. A Unified Approach to Interpreting Model Predictions. arXiv; 2017. Available from: https://arxiv.org/abs/1705.07874. [cited 2025 Sept 2].
Meyes R, Lu M, Waubert de Puiseau C, Meisen T. Ablation studies to uncover structure of learned representations in artificial neural networks. Proceedings of the International Conference on Artificial Intelligence (ICAI). Athens, Greece: CSREA Press; 2019. Available from: https://www.researchgate.net/publication/334871296_Ablation_Studies_to_Uncover_Structure_of_Learned_Representations_in_Artificial_Neural_Networks. [cited 2025 Sept 2].
Callaway E. How generative AI is building better antibodies. Nature. 2023;d41586-023-01516-w.
Callaway E. ‘ChatGPT for CRISPR’ creates new gene-editing tools. Nature. 2024;629:272.
Tang X, Dai H, Knight E, Wu F, Li Y, Li T, et al. A survey of generative AI for de novo drug design: new frontiers in molecule and protein generation. Brief Bioinform. 2024;25:bbae338.
Winnifrith A, Outeiral C, Hie BL. Generative artificial intelligence for de novo protein design. Curr Opin Struct Biol. 2024;86:102794.