Projects

SAnDReS

Highlights
SAnDReS 2.0 (Statistical Analysis of Docking Results and Scoring functions) brings advanced computational tools for protein-ligand docking simulation and machine-learning modeling. It uses AutoDock Vina (version 1.2.3) (Eberhardt et al., 2021) as its docking engine and implements 54 regression methods through Scikit-Learn (Pedregosa et al., 2011), which allows us to explore the Scoring Function Space (SFS) concept. Exploring the SFS lets us select a machine-learning model suited to a targeted protein system. This approach creates computational models with superior predictive performance compared with classical scoring functions (also known as universal scoring functions). SAnDReS aims to merge the holistic view of systems biology with machine-learning methods to contribute to drug discovery projects. Evaluation of the predictive performance of 107 scoring functions against the CASF-2016 benchmark (Su et al., 2019) indicates that a machine-learning model developed with SAnDReS 2.0 outperformed classical and machine-learning scoring functions such as KDEEP (Jiménez et al., 2018), CSM-lig (Pires & Ascher, 2016), and ΔVinaRF20 (Wang & Zhang, 2017) (plots below). Dr. Walter F. de Azevedo Jr. proposed the initial idea of SAnDReS in 2016, and an international team of scientists now participates in its development and testing.

A) Predictive performance using the DOME (Walsh et al., 2021) strategy (r2, rho, and EDOME). B) Scatter plot (predicted pKi versus experimental pKi) for all docked structures in the CASF-2016 Ki test set.
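As a rough illustration of two of the metrics named in the caption, the sketch below computes r2 and rho for paired experimental and predicted pKi values using only the Python standard library. It assumes that r2 is the squared Pearson correlation coefficient and that rho is Spearman's rank correlation; the function names and the sample values are hypothetical and are not taken from SAnDReS. EDOME is left out because its definition is not given here.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def r_squared(x, y):
    """Assumption: r2 is taken as the squared Pearson correlation."""
    return pearson_r(x, y) ** 2


# Hypothetical values, for illustration only.
experimental_pki = [5.0, 6.2, 7.1, 8.4]
predicted_pki = [5.3, 6.0, 7.5, 8.0]
print(r_squared(experimental_pki, predicted_pki))
print(spearman_rho(experimental_pki, predicted_pki))
```

Because rho depends only on the ordering of the values, a model can rank ligands perfectly (rho equal to 1) while still missing the absolute pKi values, which is one reason to report it alongside r2.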

Funding
The Brazilian National Council for Scientific and Technological Development (CNPq) (Process 306298/2022-8) supports this research project. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brazil (CAPES) – Finance Code 001. MVA acknowledges the Diether Haenicke Scholarship from Western Michigan University. OT, NB, and VP thank the Program for Basic Research in the Russian Federation for a long-term period 2021–2030 (project No. 122030100170-5). R.Q. and M.A.V. thank Secyt-UNC for their financial support.

SFSXplorer


Highlights

SFSXplorer (Scoring Function Space Explorer) is a Python package to explore the concept of Scoring Function Space (SFS). We can explore the SFS to build a computational model targeted to a specific protein system (a targeted-scoring function). SFSXplorer employs binding affinity data and protein-ligand structures (docked or crystallographic) to train machine-learning models that predict binding affinity. We base this SFS exploration on a scoring function with variable energy terms. This scoring function is a polynomial equation with terms accounting for van der Waals, hydrogen-bond, electrostatic, desolvation entropy, and torsional contributions. For the hydrogen-bond energy term, we do not restrict ourselves to the 12/10 potential: SFSXplorer implements a general n/m potential equation and calculates the energy terms while varying the exponents n and m. The van der Waals terms have the same flexibility. For the electrostatic potential, we modify the permittivity function. We also have a flexible expression for the desolvation entropy term. We account for the torsional energy by employing the standard potential based on the number of torsion angles. We may then choose the set of energy terms with the best predictive performance. This flexibility in the energy terms gives access to previously unexplored regions of the SFS. Dr. Walter F. de Azevedo Jr. proposed the initial idea of SFSXplorer, and an international team of scientists now participates in its development and testing.
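To make the idea of an n/m potential concrete, the following minimal sketch evaluates a pair potential of the form E(r) = Cn/r^n - Cm/r^m for several exponent pairs. It assumes the common parameterization in which the minimum of depth epsilon sits at the equilibrium distance r_eq, so that Cn = m/(n - m) * epsilon * r_eq^n and Cm = n/(n - m) * epsilon * r_eq^m. The function names, parameter names, and numerical values are hypothetical and do not reproduce the SFSXplorer code.

```python
def nm_potential(r, r_eq, epsilon, n=12, m=10):
    """Generic n/m pair potential with a minimum of -epsilon at r = r_eq.

    Assumed parameterization (not taken from SFSXplorer):
        E(r) = c_n / r**n - c_m / r**m
        c_n = m / (n - m) * epsilon * r_eq**n
        c_m = n / (n - m) * epsilon * r_eq**m
    """
    if not n > m > 0:
        raise ValueError("exponents must satisfy n > m > 0")
    if r <= 0:
        raise ValueError("distance must be positive")
    c_n = m / (n - m) * epsilon * r_eq ** n
    c_m = n / (n - m) * epsilon * r_eq ** m
    return c_n / r ** n - c_m / r ** m


def scan_exponents(r, r_eq, epsilon, exponent_pairs):
    """Evaluate the same pair interaction for several (n, m) choices."""
    return {
        (n, m): nm_potential(r, r_eq, epsilon, n, m)
        for n, m in exponent_pairs
    }


# Hypothetical hydrogen-bond-like pair: r_eq = 1.9 A, well depth = 5.0 kcal/mol.
energies = scan_exponents(2.1, 1.9, 5.0, [(12, 10), (12, 6), (9, 6)])
for (n, m), energy in energies.items():
    print(f"{n}/{m} potential at 2.1 A: {energy:.3f} kcal/mol")
```

Each (n, m) pair yields a different value for the same geometry, so each pair gives a distinct candidate energy term whose predictive performance can then be compared.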


Funding
The Brazilian National Council for Scientific and Technological Development (CNPq) (Process 306298/2022-8) supports this research project.

Taba


Highlights

Taba (Tool to Analyze the Binding Affinity) generates scoring functions to predict binding affinity based on the atomic coordinates of a protein-ligand complex. It employs a polynomial equation where the terms are mass-spring potentials. Taba calculates average interatomic distances and takes them as equilibrium distances for a mass-spring potential. The mass-spring potential equation and the equilibrium constants for each pair of atoms compose the Taba Force Field (TFF). We are currently updating Taba and expect to release a new version by August 2024. The flowchart below highlights the main steps of Taba. Please cite the following reference (da Silva et al., 2020) if you use the Taba program.

This flowchart shows the main steps used to generate targeted-scoring functions with Taba (da Silva et al., 2020).
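The mass-spring idea behind Taba can be sketched in a few lines of Python. The example assumes that each atom-pair type (for instance, protein carbon and ligand oxygen) contributes a term equal to the sum of squared deviations from its equilibrium distance, and that the score is a weighted sum of these terms plus an intercept, with weights obtained from a regression step that is not shown. All names, distances, and weights are hypothetical and are not the Taba implementation.

```python
from statistics import mean


def equilibrium_distances(training_distances):
    """Average observed distance for each atom-pair type, e.g. ("C", "O")."""
    return {pair: mean(ds) for pair, ds in training_distances.items()}


def spring_terms(complex_distances, d0):
    """Sum of squared deviations from equilibrium for each atom-pair type."""
    terms = {}
    for pair, distances in complex_distances.items():
        if pair in d0:  # skip pair types without an equilibrium distance
            terms[pair] = sum((d - d0[pair]) ** 2 for d in distances)
    return terms


def score(terms, weights, intercept=0.0):
    """Weighted sum of spring terms (assumed linear form of the polynomial)."""
    return intercept + sum(
        weights.get(pair, 0.0) * value for pair, value in terms.items()
    )


# Hypothetical interatomic distances (in angstroms) pooled from a training set.
training = {("C", "O"): [3.0, 4.0], ("N", "O"): [2.8, 3.2]}
d0 = equilibrium_distances(training)

# Hypothetical distances measured in one protein-ligand complex.
one_complex = {("C", "O"): [3.5, 4.5], ("N", "O"): [3.0]}
terms = spring_terms(one_complex, d0)

# Hypothetical regression weights and intercept.
predicted = score(terms, {("C", "O"): -0.5, ("N", "O"): -0.2}, intercept=6.0)
print(d0, terms, predicted)
```

In this form, a complex whose interatomic distances sit close to the training-set averages has small spring terms, and the regression weights decide how much each deviation changes the predicted affinity.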

TFF outperforms classical and machine-learning scoring functions such as KDEEP (Jiménez et al., 2018) and CSM-lig (Pires & Ascher, 2016) (plot below).

Predictive performance with DOME (Walsh et al., 2021) using r2, rho, and log(EDOME). We employed EDOME to generate this plot. The statistical analysis draws on previously published data (da Silva et al., 2020), except for KDEEP (Jiménez et al., 2018) and CSM-lig (Pires & Ascher, 2016), which used the atomic coordinates of the structures in the test set to predict the affinity (pKi). The plot highlights the performance of seven scoring functions: T (Taba), K (KDEEP), C (CSM-lig), A (AutoDock4) (Morris et al., 2009), V (AutoDock Vina) (Eberhardt et al., 2021), M (MolDock Score), and P (Plants Score) (Thomsen & Christensen, 2006).


Funding

The Brazilian National Council for Scientific and Technological Development (CNPq) (Process 306298/2022-8) supports this research project. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brazil (CAPES) – Finance Code 001. MVA acknowledges the Diether Haenicke Scholarship from Western Michigan University. OT, NB, and VP thank the Program for Basic Research in the Russian Federation for a long-term period 2021–2030 (project No. 122030100170-5). R.Q. and M.A.V. thank Secyt-UNC for their financial support.