Over the past two decades, Bayer Pharma has developed an in silico absorption, distribution, metabolism, excretion, and toxicity (ADMET) platform that provides models for various pharmacokinetic and physicochemical endpoints in early drug discovery. The platform is accessible to all scientists within the company and has proven valuable in selecting and designing novel leads and in lead optimization.
A recent research paper discusses the development of the platform's machine-learning (ML) approaches with a focus on data, descriptors, and algorithms. The authors emphasize that high-quality data, tailored descriptors, and a thorough understanding of the experimental endpoints are what make their models useful. To assess model quality, they use leave-one-cluster-out cross-validation. Classification models should reach a Matthews correlation coefficient (MCC) above 0.4, and regression models a Pearson R² above 0.3 and a Spearman R² above 0.6.
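As an illustration of the validation scheme described above, the following is a minimal sketch of leave-one-cluster-out cross-validation scored with the MCC. It is not the authors' implementation. The cluster labels are assumed to be given (for example, from a chemical-series clustering), and the one-descriptor `threshold_model`, the toy data, and all function names are hypothetical stand-ins chosen only to keep the example self-contained.

```python
import math
from collections import defaultdict


def matthews_corrcoef(y_true, y_pred):
    """Binary MCC from confusion-matrix counts; 0.0 when the denominator is zero."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def leave_one_cluster_out(cluster_ids):
    """Yield (held_out_cluster, train_idx, test_idx) once per distinct cluster."""
    by_cluster = defaultdict(list)
    for i, cluster in enumerate(cluster_ids):
        by_cluster[cluster].append(i)
    for cluster, test_idx in by_cluster.items():
        train_idx = [i for i, c in enumerate(cluster_ids) if c != cluster]
        yield cluster, train_idx, test_idx


def cross_validated_mcc(X, y, cluster_ids, fit_predict):
    """Pool the held-out predictions of every fold, then compute a single MCC."""
    y_pred = [None] * len(y)
    for _, train_idx, test_idx in leave_one_cluster_out(cluster_ids):
        preds = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        for i, p in zip(test_idx, preds):
            y_pred[i] = p
    return matthews_corrcoef(y, y_pred)


def threshold_model(X_train, y_train, X_test):
    """Hypothetical toy classifier: cut halfway between the two class means."""
    pos = [x[0] for x, label in zip(X_train, y_train) if label == 1]
    neg = [x[0] for x, label in zip(X_train, y_train) if label == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return [1 if x[0] > cut else 0 for x in X_test]


# Synthetic example data: one descriptor per compound, three clusters.
X = [[0.1], [0.2], [0.9], [0.8], [0.3], [0.7]]
y = [0, 0, 1, 1, 0, 1]
clusters = ["a", "a", "b", "b", "c", "c"]

mcc = cross_validated_mcc(X, y, clusters, threshold_model)
print(f"MCC = {mcc:.2f}; meets the >0.4 criterion: {mcc > 0.4}")
```

Holding out whole clusters means the model is always tested on compounds from a series it has not seen, which is a stricter check than a random split. In practice, scikit-learn's `LeaveOneGroupOut` and `matthews_corrcoef` cover the same two pieces.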
Regular model updates are essential to keep the models useful for current compounds, and the improvement achievable from regular retraining depends on the model building technique and the property being studied. The authors have already implemented a weekly automated data download and filtering process, as well as automated model retraining for various endpoints. They acknowledge that modeling has its limitations and cannot fully replace the need for experimental data, particularly in biological experiments.
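The weekly update cycle can be pictured roughly as follows. This is a sketch under stated assumptions, not a description of the authors' pipeline: the source does not say which filters are applied, so the two rules used here (drop rows without a numeric value, collapse replicate measurements to their median) are illustrative guesses, and `Measurement`, `filter_measurements`, `weekly_update`, and `train_stub` are hypothetical names.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    compound_id: str
    value: Optional[float]  # None when the assay gave no usable number


def filter_measurements(rows):
    """Assumed filter: skip rows without a value, keep the median per compound."""
    by_compound = {}
    for row in rows:
        if row.value is None:
            continue
        by_compound.setdefault(row.compound_id, []).append(row.value)
    return {cid: median(values) for cid, values in by_compound.items()}


def weekly_update(previous_training_set, downloaded_rows, train):
    """Retrain only when the filtered download differs from the last training set."""
    current = filter_measurements(downloaded_rows)
    if current == previous_training_set:
        return current, None  # nothing changed: keep the deployed model
    return current, train(current)


def train_stub(training_set):
    """Placeholder for a real model-building step."""
    return {"n_compounds": len(training_set)}


# Synthetic download: one replicate pair, one unusable row, one single value.
rows = [
    Measurement("cpd-1", 1.0),
    Measurement("cpd-1", 3.0),
    Measurement("cpd-2", None),
    Measurement("cpd-3", 5.0),
]

training_set, model = weekly_update({}, rows, train_stub)
unchanged_set, unchanged_model = weekly_update(training_set, rows, train_stub)
```

The first call retrains because new data arrived; the second call sees the same filtered data and leaves the existing model in place.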
To achieve robust models, a large amount of homogeneous data and descriptors tailored to the underlying experimental endpoint are necessary. Automatically generating many candidate models across different data splits, descriptors, and ML algorithms can help in selecting the most accurate ones. The article also highlights the need for better 3D-based molecular descriptors that account for intramolecular hydrogen bonds and tautomers. Challenges for the future include embedding in vivo ADMET models into holistic artificial intelligence approaches, estimating binding affinity and compound synthesizability, and improving in silico estimates of the applicability domain.
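The idea of generating many candidate models and keeping the most accurate can be sketched as a small grid search. The example below is an assumption-laden illustration rather than the platform's actual procedure: it scores every descriptor-set and algorithm combination by squared Pearson correlation averaged over a few held-out splits. The descriptor names (`logp`, `noise`), the two toy algorithms, and the synthetic data are all hypothetical.

```python
import itertools
import math


def pearson_r2(y_true, y_pred):
    """Squared Pearson correlation; 0.0 if either series is constant."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    vt = sum((t - mt) ** 2 for t in y_true)
    vp = sum((p - mp) ** 2 for p in y_pred)
    return cov * cov / (vt * vp) if vt and vp else 0.0


def ols_first_feature(X_train, y_train, X_test):
    """Least-squares line fitted on the first descriptor only."""
    xs = [x[0] for x in X_train]
    mx, my = sum(xs) / len(xs), sum(y_train) / len(y_train)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, y_train))
    slope = sxy / sxx if sxx else 0.0
    return [my + slope * (x[0] - mx) for x in X_test]


def nearest_neighbor(X_train, y_train, X_test):
    """Predict the target of the closest training compound (1-NN)."""
    preds = []
    for query in X_test:
        best = min(range(len(X_train)), key=lambda i: math.dist(query, X_train[i]))
        preds.append(y_train[best])
    return preds


def rank_candidates(records, targets, descriptor_sets, algorithms, splits):
    """Score each (descriptor set, algorithm) pair by mean R^2 over the splits."""
    results = []
    for (d_name, describe), (a_name, fit_predict) in itertools.product(
        descriptor_sets.items(), algorithms.items()
    ):
        X = [describe(r) for r in records]
        scores = []
        for test_idx in splits:
            held_out = set(test_idx)
            train_idx = [i for i in range(len(records)) if i not in held_out]
            preds = fit_predict(
                [X[i] for i in train_idx],
                [targets[i] for i in train_idx],
                [X[i] for i in test_idx],
            )
            scores.append(pearson_r2([targets[i] for i in test_idx], preds))
        results.append((sum(scores) / len(scores), d_name, a_name))
    return sorted(results, reverse=True)


# Synthetic data: the target depends linearly on "logp" and not on "noise".
logp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [2.0, 7.0, 1.0, 8.0, 2.0, 8.0]
records = [{"logp": a, "noise": b} for a, b in zip(logp, noise)]
targets = [2 * a + 1 for a in logp]

descriptor_sets = {
    "informative": lambda r: [r["logp"]],
    "uninformative": lambda r: [r["noise"]],
}
algorithms = {"ols": ols_first_feature, "nearest_neighbor": nearest_neighbor}
splits = [[0, 2, 4], [1, 3, 5]]  # indices held out in each split

ranking = rank_candidates(records, targets, descriptor_sets, algorithms, splits)
for score, d_name, a_name in ranking:
    print(f"{d_name:>13} + {a_name:<16} mean R^2 = {score:.3f}")
```

Averaging over several splits, instead of trusting a single one, reduces the chance that a candidate is selected because of one lucky partition. A real setup would also swap the index-based splits for cluster-based ones.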
The authors conclude that the successful application of in silico ADMET models depends on model quality, relevance for research processes, and easy access and interpretability of results. Data, algorithms, and descriptors all contribute to model quality, and continuous advancements in this field are crucial for enhancing drug discovery and development processes.
Copyright © DrugPatentWatch. Originally published at