Fleming N. How artificial intelligence is changing drug discovery. Nature. 2018;557(7706):S55–7.
Zeng X, Wang F, Luo Y, Kang SG, Tang J, Lightstone FC, et al. Deep generative molecular design reshapes drug discovery. Cell Rep Med. 2022;3:100794. https://doi.org/10.1016/j.xcrm.2022.100794.
Vert JP. How will generative AI disrupt data science in drug discovery? Nat Biotechnol. 2023;41:750–1. https://doi.org/10.1038/s41587-023-01789-6.
Diao Y, Liu D, Ge H, Zhang R, Jiang K, Bao R, et al. Macrocyclization of linear molecules by deep learning to facilitate macrocyclic drug candidates discovery. Nat Commun. 2023;14(1):4552.
Flam-Shepherd D, Zhu K, Aspuru-Guzik A. Language models can learn complex molecular distributions. Nat Commun. 2022;13(1):3293.
Mahmood O, Mansimov E, Bonneau R, Cho K. Masked graph modeling for molecule generation. Nat Commun. 2021;12(1):3156.
Yang X, Fu L, Deng Y, Liu Y, Cao D, Zeng X. GPMO: Gradient Perturbation-Based Contrastive Learning for Molecule Optimization. In: IJCAI. 2023. pp. 4940–8.
Jin W, Barzilay R, Jaakkola T. Junction tree variational autoencoder for molecular graph generation. In: International conference on machine learning. PMLR; 2018. pp. 2323–32.
Jin W, Barzilay R, Jaakkola T. Hierarchical generation of molecular graphs using structural motifs. In: International conference on machine learning. Online: PMLR; 2020. pp. 4839–48.
Xue D, Zhang H, Chen X, Xiao D, Gong Y, Chuai G, et al. X-MOL: large-scale pre-training for molecular understanding and diverse molecular analysis. Sci Bull. 2022;67(9):899–902.
Zhang Z, Liu Q, Wang H, Lu C, Lee CK. Motif-based graph self-supervised learning for molecular property prediction. Adv Neural Inf Process Syst. 2021;34:15870–82.
You Y, Chen T, Sui Y, Chen T, Wang Z, Shen Y. Graph contrastive learning with augmentations. Adv Neural Inf Process Syst. 2020;33:5812–23.
Xiang H, Jin S, Xia J, Zhou M, Wang J, Zeng L, et al. An image-enhanced molecular graph representation learning framework. In: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence. Jeju, Korea: IJCAI; 2024. pp. 6107–15.
Luo S, Chen T, Xu Y, Zheng S, Liu TY, Wang L, et al. One Transformer Can Understand Both 2D & 3D Molecular Data. In: The Eleventh International Conference on Learning Representations. Kigali, Rwanda: ICLR; 2023.
Guo Z, Sharma P, Martinez A, Du L, Abraham R. Multilingual Molecular Representation Learning via Contrastive Pre-training. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Dublin, Ireland: ACL; 2022. pp. 3441–53.
Li H, Zhang R, Min Y, Ma D, Zhao D, Zeng J. A knowledge-guided pre-training framework for improving molecular representation learning. Nat Commun. 2023;14(1):7568.
Xiang H, Zeng L, Hou L, Li K, Fu Z, Qiu Y, et al. A molecular video-derived foundation model for scientific drug discovery. Nat Commun. 2024;15(1):9696.
Hou L, Xiang H, Zeng X, Cao D, Zeng L, Song B. Attribute-guided prototype network for few-shot molecular property prediction. Brief Bioinform. 2024;25(5):bbae394.
Zhang X, Xiang H, Yang X, Dong J, Fu X, Zeng X, et al. Dual-view learning based on images and sequences for molecular property prediction. IEEE J Biomed Health Inform. 2023;28(3):1564–74.
Xia J, Zhao C, Hu B, Gao Z, Tan C, Liu Y, et al. Mole-BERT: Rethinking pre-training graph neural networks for molecules. In: The Eleventh International Conference on Learning Representations. Kigali, Rwanda: ICLR; 2023.
Zeng X, Xiang H, Yu L, Wang J, Li K, Nussinov R, et al. Accurate prediction of molecular properties and drug targets using a self-supervised image representation learning framework. Nat Mach Intell. 2022;4(11):1004–16.
Hendrickson JB. Concepts and applications of molecular similarity. Science. 1991;252(5009):1189–90.
Stumpfe D, Bajorath J. Exploring activity cliffs in medicinal chemistry: miniperspective. J Med Chem. 2012;55(7):2932–42.
Wedlake AJ, Folia M, Piechota S, Allen TE, Goodman JM, Gutsell S, et al. Structural alerts and random forest models in a consensus approach for receptor binding molecular initiating events. Chem Res Toxicol. 2019;33(2):388–401.
van Tilborg D, Alenicheva A, Grisoni F. Exposing the limitations of molecular machine learning with activity cliffs. J Chem Inf Model. 2022;62(23):5938–51.
Deng J, Yang Z, Wang H, Ojima I, Samaras D, Wang F. A systematic study of key elements underlying molecular property prediction. Nat Commun. 2023;14(1):6395.
Xia J, Zhang L, Zhu X, Liu Y, Gao Z, Hu B, et al. Understanding the limitations of deep models for molecular property prediction: Insights and solutions. Adv Neural Inf Process Syst. 2023;36:64774–92.
Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations. France: ICLR; 2017.
Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y. Graph Attention Networks. In: International Conference on Learning Representations. Vancouver, Canada: ICLR; 2018.
Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE. Neural message passing for quantum chemistry. In: International conference on machine learning. Sydney, NSW, Australia: PMLR; 2017. pp. 1263–72.
Li Q, Han Z, Wu XM. Deeper insights into graph convolutional networks for semi-supervised learning. In: Proceedings of the AAAI conference on artificial intelligence. Louisiana, USA: AAAI; 2018. vol. 32.
He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. Las Vegas, USA: IEEE; 2016. pp. 770–8.
Rogers D, Hahn M. Extended-connectivity fingerprints. J Chem Inf Model. 2010;50(5):742–54.
Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers). Minneapolis, Minnesota: ACL; 2019. pp. 4171–86.
He K, Chen X, Xie S, Li Y, Dollár P, Girshick R. Masked autoencoders are scalable vision learners. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. LA, USA: IEEE; 2022. pp. 16000–9.
Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale. In: International Conference on Learning Representations. Virtual Event: ICLR; 2021.
Kim W, Son B, Kim I. ViLT: Vision-and-language transformer without convolution or region supervision. In: International Conference on Machine Learning. Online: PMLR; 2021. pp. 5583–94.
Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, et al. Language models are few-shot learners. Adv Neural Inf Process Syst. 2020;33:1877–901.
Radford A, Narasimhan K, Salimans T, Sutskever I, et al. Improving language understanding by generative pre-training. OpenAI blog. 2018. https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf.
Hu Y, Bajorath J. Extending the activity cliff concept: structural categorization of activity cliffs and systematic identification of different types of cliffs in the ChEMBL database. J Chem Inf Model. 2012;52(7):1806–11.
Stumpfe D, Hu H, Bajorath J. Advances in exploring activity cliffs. J Comput Aided Mol Des. 2020;34:929–42.
Chithrananda S, Grand G, Ramsundar B. ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction. 2020. Preprint at https://doi.org/10.48550/arXiv.2010.09885.
Rong Y, Bian Y, Xu T, Xie W, Wei Y, Huang W, et al. Self-supervised graph transformer on large-scale molecular data. Adv Neural Inf Process Syst. 2020;33:12559–71.
Wang Y, Wang J, Cao Z, Barati Farimani A. Molecular contrastive learning of representations via graph neural networks. Nat Mach Intell. 2022;4(3):279–87.
Hu W, Liu B, Gomes J, Zitnik M, Liang P, Pande V, et al. Strategies for Pre-training Graph Neural Networks. In: International Conference on Learning Representations. Virtual Event: ICLR; 2020.
Wu F, Qin H, Gao W, Li S, Coley CW, Li SZ, et al. InstructBio: A Large-scale Semi-supervised Learning Paradigm for Biochemical Problems. 2023. arXiv preprint arXiv:2304.03906.
Fang X, Liu L, Lei J, He D, Zhang S, Zhou J, et al. Geometry-enhanced molecular representation learning for property prediction. Nat Mach Intell. 2022;4(2):127–34.
Stärk H, Beaini D, Corso G, Tossou P, Dallago C, Günnemann S, et al. 3D Infomax improves GNNs for molecular property prediction. In: International Conference on Machine Learning. Baltimore, Maryland, USA: PMLR; 2022. pp. 20479–502.
Liu S, Wang H, Liu W, Lasenby J, Guo H, Tang J. Pre-training Molecular Graph Representation with 3D Geometry. In: International Conference on Learning Representations. Virtual Event: ICLR; 2022.
Xiang H, Jin S, Liu X, Zeng X, Zeng L. Chemical structure-aware molecular image representation learning. Brief Bioinform. 2023;24(6):bbad404.
Zhang T. An introduction to support vector machines and other kernel-based learning methods. AI Mag. 2001;22(2):103.
Breiman L. Bagging predictors. Mach Learn. 1996;24:123–40.
Fix E, Hodges JL. Discriminatory analysis: nonparametric discrimination, consistency properties. Int Stat Rev. 1989;57(3):238–47.
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44.
Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat. 2001;29(5):1189–232. https://doi.org/10.1214/aos/1013203451.
Kullback S, Leibler RA. On information and sufficiency. Ann Math Statist. 1951;22(1):79–86.
Lee DH, et al. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In: Workshop on Challenges in Representation Learning. PMLR; 2013. vol. 3, no. 2.
Torng W, Altman RB. Graph convolutional neural networks for predicting drug-target interactions. J Chem Inf Model. 2019;59(10):4131–49.
Sakai M, Nagayasu K, Shibui N, Andoh C, Takayama K, Shirakawa H, et al. Prediction of pharmacological activities from chemical structures with graph convolutional neural networks. Sci Rep. 2021;11(1):525.
Li P, Wang J, Qiao Y, Chen H, Yu Y, Yao X, et al. An effective self-supervised framework for learning expressive molecular global representations to drug discovery. Brief Bioinform. 2021;22(6):bbab109.
Fay MP, Proschan MA. Wilcoxon-Mann-Whitney or t-test? On assumptions for hypothesis tests and multiple interpretations of decision rules. Stat Surv. 2010;4:1.
Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision. Venice, Italy: IEEE; 2017. pp. 618–26.
Luo D, Cheng W, Xu D, Yu W, Zong B, Chen H, et al. Parameterized explainer for graph neural network. Adv Neural Inf Process Syst. 2020;33:19620–31.
Wu Z, Wang J, Du H, Jiang D, Kang Y, Li D, et al. Chemistry-intuitive explanation of graph neural networks for molecular property prediction with substructure masking. Nat Commun. 2023;14(1):2585.
Wu Z, Jiang D, Wang J, Hsieh CY, Cao D, Hou T. Mining toxicity information from large amounts of toxicity data. J Med Chem. 2021;64(10):6924–36.
Xu C, Cheng F, Chen L, Du Z, Li W, Liu G, et al. In silico prediction of chemical Ames mutagenicity. J Chem Inf Model. 2012;52(11):2840–7.
Polishchuk PG, Kuz’min VE, Artemenko AG, Muratov EN. Universal approach for structural interpretation of QSAR/QSPR models. Mol Inform. 2013;32(9–10):843–53.
Peng S, Hu P, Xiao YT, Lu W, Guo D, Hu S, et al. Single-cell analysis reveals EP4 as a target for restoring T-cell infiltration and sensitizing prostate cancer to immunotherapy. Clin Cancer Res. 2022;28(3):552–67.
Liu T, Lin Y, Wen X, Jorissen RN, Gilson MK. BindingDB: a web-accessible database of experimentally determined protein–ligand binding affinities. Nucleic Acids Res. 2007;35(Suppl_1):D198–201.
Nakao K, Murase A, Ohshiro H, Okumura T, Taniguchi K, Murata Y, et al. CJ-023,423, a novel, potent and selective prostaglandin EP4 receptor antagonist with antihyperalgesic properties. J Pharmacol Exp Ther. 2007;322(2):686–94.
He J, Lin X, Meng F, Zhao Y, Wang W, Zhang Y, et al. A novel small molecular prostaglandin receptor EP4 antagonist, L001, suppresses pancreatic cancer metastasis. Molecules. 2022;27(4):1209.
Murase A, Okumura T, Sakakibara A, Tonai-Kachi H, Nakao K, Takada J. Effect of prostanoid EP4 receptor antagonist, CJ-042,794, in rat models of pain and inflammation. Eur J Pharmacol. 2008;580(1–2):116–21.
Blouin M, Han Y, Burch J, Farand J, Mellon C, Gaudreault M, et al. The discovery of 4-{1-[({2,5-dimethyl-4-[4-(trifluoromethyl)benzyl]-3-thienyl}carbonyl)amino]cyclopropyl}benzoic acid (MK-2894), a potent and selective prostaglandin E2 subtype 4 receptor antagonist. J Med Chem. 2010;53(5):2227–38.
Caselli G, Bonazzi A, Lanza M, Ferrari F, Maggioni D, Ferioli C, et al. Pharmacological characterisation of CR6086, a potent prostaglandin E2 receptor 4 antagonist, as a new potential disease-modifying anti-rheumatic drug. Arthritis Res Ther. 2018;20:1–19.
Kotani T, Takano H, Yoshida T, Hamasaki R, Kohanbash G, Takeda K, et al. Inhibition of PGE2/EP4 pathway by ONO-4578/BMS-986310, a novel EP4 antagonist, promotes T cell activation and myeloid cell differentiation to dendritic cells. Cancer Res. 2020;80(16_Supplement):4443.
Albu DI, Wang Z, Huang KC, Wu J, Twine N, Leacu S, et al. EP4 antagonism by E7046 diminishes myeloid immunosuppression and synergizes with Treg-reducing IL-2-diphtheria toxin fusion protein in restoring anti-tumor immunity. Oncoimmunology. 2017;6(8):e1338239.
Jin Y, Liu Q, Chen P, Zhao S, Jiang W, Wang F, et al. A novel prostaglandin E receptor 4 (EP4) small molecule antagonist induces articular cartilage regeneration. Cell Discov. 2022;8(1):24.
Das D, Qiao D, Liu Z, Xie L, Li Y, Wang J, et al. Discovery of novel, selective prostaglandin EP4 receptor antagonists with efficacy in cancer models. ACS Med Chem Lett. 2023;14(6):727–36.
Degen J, Wegscheid-Gerlach C, Zaliani A, Rarey M. On the Art of Compiling and Using ‘Drug-Like’ Chemical Fragment Spaces. ChemMedChem. 2008;3(10):1503–7.
Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, et al. PubChem 2019 update: improved access to chemical data. Nucleic Acids Res. 2019;47(D1):D1102–9.
Chen C, Ye W, Zuo Y, Zheng C, Ong SP. Graph networks as a universal machine learning framework for molecules and crystals. Chem Mater. 2019;31(9):3564–72.
Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst. 2012;25:1097–105.
Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80.
Cheng Z, Xiang H, Ma P, Zeng L, Jin X, Yang X, et al. MaskMol: knowledge-guided molecular image pre-training framework for activity cliffs with Pixel Masking. Zenodo. 2025. https://doi.org/10.5281/zenodo.15834481.