Current Drug Targets

Frontiers Section Editor (Bioinformatics and Biophysics) for Current Drug Targets. ISSN: 1873-5592

Bentham Link

Current Medicinal Chemistry

Section Editor (Bioinformatics in Drug Design and Discovery) for Current Medicinal Chemistry. ISSN: 1875-533X

Bentham Link

Combinatorial Chemistry and High Throughput Screening

Section Editor (Combinatorial/Medicinal Chemistry) for Combinatorial Chemistry and High Throughput Screening. ISSN: 1875-5402

Bentham Link

Current Bioinformatics

Member of the Editorial Board for Current Bioinformatics. ISSN: 2212-392X (Online), ISSN: 1574-8936 (Print)

Bentham Link

Organic and Medicinal Chemistry International Journal

Member of the Editorial Board for the Organic and Medicinal Chemistry International Journal. ISSN: 2474-7610

Bentham Link

Bioengineering International

Section Editor in Chief (Bioinformatics) for Bioengineering International. ISSN: 2668-7119

Bioengineering International


Scoring Function Space eXplorer 

How to Install

You need Python 3 installed on your computer to run SFSXplorer. You also need NumPy (1.14.5 or higher), Matplotlib, scikit-learn (0.19.1 or higher), and SciPy (1.1.0 or higher).
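Before installing, it can help to confirm that the required packages are present. The sketch below is not part of SFSXplorer; the names `REQUIRED`, `version_tuple`, and `check_requirements` are hypothetical helpers introduced here for illustration. It only uses the standard library (`importlib.metadata`, available from Python 3.8) and compares installed versions against the minimum versions listed above.

```python
import re
from importlib import metadata

# Minimum versions quoted in the installation notes.
# Matplotlib has no stated minimum, so None means "any version".
REQUIRED = {
    "numpy": "1.14.5",
    "matplotlib": None,
    "scikit-learn": "0.19.1",
    "scipy": "1.1.0",
}


def version_tuple(version):
    """Turn '1.14.5' into (1, 14, 5); stop at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_requirements(required=REQUIRED):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        ok = minimum is None or version_tuple(installed) >= version_tuple(minimum)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements().items():
        status = "OK" if ok else "MISSING or too old"
        print(f"{name}: {installed} -> {status}")
```

Running the file as a script prints one line per package; any package flagged as missing or too old can then be installed or upgraded with your usual package manager.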

On Windows:

  • Step 1. Download SFSXplorer (available here)
  • Step 2. Unzip the zipped file
  • Step 3. Copy the SFSXplorer directory to c:\
  • Step 4. Open a command prompt window and type: cd c:\SFSXplorer
  • Step 5. Then type: python

On Linux or macOS:

  • Step 1. Download SFSXplorer (available here)
  • Step 2. Unzip the zipped file
  • Step 3. Copy the SFSXplorer directory to the directory of your choice
  • Step 4. Open a terminal and type: cd /your personal directory/SFSXplorer
  • Step 5. Then type: python

That's it. Enjoy exploring the scoring function space.


Using SFSXplorer

SFSXplorer allows us to explore the scoring function space. All files needed to run SFSXplorer are in the zipped folder on GitHub. The input file is shown below:

n_e0,5 # Number of elements of epsilon0 (e0) (5) (permittivity constant of bulk water at 25˚C e0 = 78.4) (float)
# For solvation potential (set up parameters for arrays)
m_sol_i,1 # Initial value of exponent m (1) (integer)
m_sol_f,4 # Final value of exponent m (4) (integer)
n_m_sol,4 # Number of elements of exponent m (4) (integer)
n_sol_i,1 # Initial value of exponent n (1) (integer)
n_sol_f,4 # Final value of exponent n (4) (integer)
n_n_sol,4 # Number of elements of exponent n (4) (integer)
sigma_sol_i,2.5 # Initial value of sigma used in desolvation potential (2.5 Angstrom)
sigma_sol_f,5.5 # Final value of sigma used in desolvation potential (5.5 Angstrom)
n_sigma_sol,4 # Number of elements of sigma used in desolvation potential (5)

The first line gives the name of the chklig file, which lists the PDB files present in the dataset. The second line gives the directory where the PDB files are stored. The next two lines name the output files, and the last line of this header gives the type of binding affinity. The lines that follow carry specific information about each energy term calculated by SFSXplorer.
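To make the `keyword,value # comment` layout concrete, here is a minimal sketch of how such lines could be read and turned into parameter grids. It is not SFSXplorer's own parser: `parse_parameters` and `grid` are hypothetical helpers, and the assumption that the initial value, final value, and number of elements define an evenly spaced grid is an interpretation of the comments in the input file, not something the file states.

```python
def parse_parameters(text):
    """Parse 'keyword,value # comment' lines into a dict of strings."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue  # blank line or comment-only line
        key, _, value = line.partition(",")
        params[key.strip()] = value.strip()
    return params


def grid(first, last, count):
    """Evenly spaced values from first to last, inclusive (assumed spacing)."""
    if count == 1:
        return [float(first)]
    step = (last - first) / (count - 1)
    return [first + i * step for i in range(count)]


# A few lines copied from the input file shown above.
EXAMPLE = """\
# For solvation potential (set up parameters for arrays)
m_sol_i,1 # Initial value of exponent m
m_sol_f,4 # Final value of exponent m
n_m_sol,4 # Number of elements of exponent m
sigma_sol_i,2.5 # Initial value of sigma
sigma_sol_f,5.5 # Final value of sigma
n_sigma_sol,4 # Number of elements of sigma
"""

params = parse_parameters(EXAMPLE)
m_values = grid(int(params["m_sol_i"]), int(params["m_sol_f"]),
                int(params["n_m_sol"]))
sigma_values = grid(float(params["sigma_sol_i"]), float(params["sigma_sol_f"]),
                    int(params["n_sigma_sol"]))

print(m_values)      # exponents m to be tried
print(sigma_values)  # sigma values (Angstrom) to be tried
```

Under the evenly-spaced assumption, each triple of lines (initial, final, count) expands into one axis of the parameter grid that the program scans.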

The chklig file is as follows:

/# Type of binding information: ki

CHKLIG,1DWB,BEN,H, 1, 2.92

CHKLIG,1ETR,MIT,H, 1, 7.40


The first column is a keyword indicating that the line carries ligand data. The second column gives the PDB access code, followed by the ligand ID, ligand chain, ligand number, and binding affinity. The first line of the file gives the type of binding affinity.
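The column layout described above can be read with a few lines of Python. This is an illustrative sketch rather than the reader used inside SFSXplorer; `LigandEntry` and `parse_chklig` are hypothetical names, and the code assumes the header line always contains the text "Type of binding information" followed by a colon.

```python
from typing import NamedTuple


class LigandEntry(NamedTuple):
    pdb_code: str     # PDB access code, e.g. 1DWB
    ligand_id: str    # ligand identifier, e.g. BEN
    chain: str        # ligand chain
    number: int       # ligand number
    affinity: float   # binding affinity, in the units implied by the header


def parse_chklig(text):
    """Return (affinity_type, [LigandEntry, ...]) from chklig file content."""
    affinity_type = None
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("CHKLIG"):
            fields = [field.strip() for field in line.split(",")]
            _, pdb_code, ligand_id, chain, number, affinity = fields
            entries.append(
                LigandEntry(pdb_code, ligand_id, chain, int(number), float(affinity))
            )
        elif "Type of binding information" in line:
            affinity_type = line.split(":", 1)[1].strip()
    return affinity_type, entries


EXAMPLE = """\
/# Type of binding information: ki

CHKLIG,1DWB,BEN,H, 1, 2.92

CHKLIG,1ETR,MIT,H, 1, 7.40
"""

affinity_type, entries = parse_chklig(EXAMPLE)
for entry in entries:
    print(affinity_type, entry.pdb_code, entry.ligand_id, entry.affinity)
```

Using a named tuple keeps each column accessible by name, which makes later steps (for example, pairing each PDB code with its affinity value) easier to read.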


In our research, we see protein-ligand interaction as a result of the relation between the protein space (Smith, 1970) and the chemical space (Bohacek et al., 1996; Dobson, 2004; Kirkpatrick & Ellis, 2004; Lipinski & Hopkins, 2004; Shoichet, 2004; Stockwell, 2004). We propose to represent these sets as a single complex system, in which computational methodologies may help generate models to predict protein-ligand binding affinities. Such approaches have the potential to create novel semi-empirical force fields that predict binding affinity with superior predictive power when compared with standard methodologies. SFSXplorer is an acronym for Scoring Function Space eXplorer. This computational tool explores the scoring function space with a hybrid algorithm, in which we vary energy terms and adjust their relative weights using machine learning algorithms.
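The hybrid idea (vary the parameters of the energy terms, then fit their relative weights) can be sketched on toy data. Everything in the block below is an assumption made for illustration: the two functional forms in `energy_terms` are not SFSXplorer's potentials, the distances and affinities are synthetic, and ordinary least squares from NumPy stands in for the scikit-learn regression methods the program relies on.

```python
import itertools

import numpy as np


def energy_terms(distances, m, sigma):
    """Two illustrative distance-based terms for one complex (hypothetical forms)."""
    inverse_power = np.sum(distances ** (-float(m)))
    gaussian = np.sum(np.exp(-distances**2 / (2.0 * sigma**2)))
    return inverse_power, gaussian


def fit_weights(features, affinities):
    """Least-squares fit of intercept plus one weight per energy term."""
    design = np.column_stack([np.ones(len(features)), features])
    weights, *_ = np.linalg.lstsq(design, affinities, rcond=None)
    residuals = affinities - design @ weights
    rmse = float(np.sqrt(np.mean(residuals**2)))
    return weights, rmse


def explore(complexes, affinities, m_values, sigma_values):
    """Scan the parameter grid; keep the model with the lowest RMSE."""
    best = None
    for m, sigma in itertools.product(m_values, sigma_values):
        features = np.array([energy_terms(d, m, sigma) for d in complexes])
        weights, rmse = fit_weights(features, affinities)
        if best is None or rmse < best["rmse"]:
            best = {"m": m, "sigma": sigma, "weights": weights, "rmse": rmse}
    return best


# Synthetic data: 40 "complexes", each a set of protein-ligand distances (Angstrom).
rng = np.random.default_rng(0)
complexes = [rng.uniform(2.5, 8.0, size=rng.integers(5, 30)) for _ in range(40)]

# Synthetic affinities generated from a known model (m = 2, sigma = 3.5).
true_terms = np.array([energy_terms(d, 2, 3.5) for d in complexes])
affinities = 4.0 + 1.5 * true_terms[:, 0] - 0.8 * true_terms[:, 1]

best = explore(complexes, affinities,
               m_values=[1, 2, 3, 4], sigma_values=[2.5, 3.5, 4.5, 5.5])
print(best["m"], best["sigma"], best["rmse"])
```

Because the toy affinities are generated without noise from one point of the grid, the scan should recover that point. With experimental data, a held-out test set or cross-validation would be needed to compare models fairly.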

Scoring Function Space 

We propose to use the abstraction of a mathematical space composed of infinitely many computational models to predict ligand-binding affinity. We call this space the scoring function space (Heck et al., 2017; Bitencourt-Ferreira & de Azevedo Jr., 2019). With supervised machine learning techniques, it is possible to explore this scoring function space and build a computational model targeted to a specific biological system. For instance, we created targeted scoring functions for coagulation factor Xa (Xavier et al., 2016), cyclin-dependent kinases (de Ávila et al., 2017; Levin et al., 2018), HIV-1 protease (Pintro & de Azevedo, 2017), estrogen receptor (Amaral et al., 2018), cannabinoid receptor 1 (Russo & de Azevedo, 2019; Russo & de Azevedo, 2020), and 3-dehydroquinate dehydratase (de Ávila & de Azevedo, 2018). We have also developed a scoring function to predict Gibbs free energy of binding for protein-ligand complexes (Bitencourt-Ferreira & de Azevedo, 2018). We developed the programs SAnDReS (Xavier et al., 2016), SFSXplorer, and Taba (da Silva et al., 2020) to generate computational models to predict ligand-binding affinity. SAnDReS, SFSXplorer, and Taba are integrated computational tools to explore the scoring function space.

First, let's consider the protein space composed of protein structures. This protein space can be represented by the protein structure space, as depicted by Hou et al. (2005) and shown in the figure below. In this figure, spheres represent elements of the protein structure space, and the sphere color indicates the superfamily of each protein structure. Analysis of the protein structure space indicated that proteins with similar structures cluster together in this space. The authors also pointed out that the distribution of structural classes in this space closely follows that of the protein fold space.

We take this finite protein space as a starting point for applying the concept of scoring function space. If we pick an element of the protein space, for instance, the cyclin-dependent kinase family, we may identify all ligands that bind to this protein. Now, let's consider the chemical space (Bohacek et al., 1996; Dobson, 2004; Kirkpatrick & Ellis, 2004; Lipinski & Hopkins, 2004; Shoichet, 2004; Stockwell, 2004), which is formed by small molecules that may or may not bind to an element of the protein space. If we take a subspace of the chemical space composed of structures that bind to the cyclin-dependent kinase family, it is easy to imagine an association between the cyclin-dependent kinase and this subspace of the chemical space. We represent this relationship as an arrow from the protein space to the chemical space, as indicated below.

Finally, we consider a mathematical space composed of infinitely many scoring functions. Each element of this space is a mathematical function that uses the atomic coordinates of protein-ligand complexes to predict the binding affinity. Below we have an idealization of the scoring function space.

Moving forward, we propose that there exists at least one scoring function capable of predicting the ligand-binding affinity of the elements of the chemical space for a component of the protein space. We indicate this relationship as an arrow from the scoring function space to the arrow that relates CDK to the chemical space, as shown below.

A view of the scoring function space as a way to develop a computational model to predict ligand-binding affinity. The protein structures shown are available under the following PDB access codes: 2OW4, 2OVU, 2IDZ, 2GSJ, 2G85, 2A4L, 1ZTB, 1Z99, 1WE2, 1M73, 1FLH, and 1FHJ.

So, the basic idea is quite simple: we intend to identify an element of the scoring function space that predicts the binding affinity of a component of the protein space for all elements of the subspace of the chemical space composed of its ligands.
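One way to write this search down, offered as a sketch and not as the authors' own formulation, is as an optimization over the scoring function space. The symbols are introduced here for illustration: \(\mathcal{S}\) is the scoring function space, \(P\) an element of the protein space, \(\mathcal{C}_P\) the subspace of the chemical space formed by ligands of \(P\), \(\mathcal{D}_P \subseteq \mathcal{C}_P\) the ligands with known structure and affinity, \(\mathbf{x}_{P\ell}\) the atomic coordinates of the complex between \(P\) and ligand \(\ell\), and \(y_\ell\) the experimental binding affinity. The squared-error criterion is an assumption; other measures of agreement (for example, a correlation coefficient) could take its place.

```latex
\[
f^{*} \;=\; \arg\min_{f \in \mathcal{S}} \;
\frac{1}{|\mathcal{D}_P|} \sum_{\ell \in \mathcal{D}_P}
\bigl( f(\mathbf{x}_{P\ell}) - y_{\ell} \bigr)^{2},
\qquad \mathcal{D}_P \subseteq \mathcal{C}_P .
\]
```

In practice only a finite, parameterized part of \(\mathcal{S}\) can be scanned, so \(f^{*}\) is the best function found within that part, and its quality on ligands outside \(\mathcal{D}_P\) has to be checked separately.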

The animation below summarizes the major aspects of the relationship between the protein and chemical spaces and the emergence of the scoring function space.


References

Amaral MEA, Nery LR, Leite CE, de Azevedo Junior WF, Campos MM. Pre-clinical effects of metformin and aspirin on the cell lines of different breast cancer subtypes. Invest New Drugs. 2018; 36(5): 782–796.

Bitencourt-Ferreira G, de Azevedo Jr. WF. Development of a machine-learning model to predict Gibbs free energy of binding for protein-ligand complexes. Biophys Chem. 2018; 240: 63–69.

Bitencourt-Ferreira G, de Azevedo WF Jr. Exploring the Scoring Function Space. Methods Mol Biol. 2019; 2053: 275–281. doi: 10.1007/978-1-4939-9752-7_17.

Bohacek RS, McMartin C, Guida WC. The art and practice of structure-based drug design: a molecular modeling perspective. Med Res Rev. 1996; 16(1): 3–50.

da Silva AD, Bitencourt-Ferreira G, de Azevedo WF Jr. Taba: A Tool to Analyze the Binding Affinity. J Comput Chem. 2020; 41(1): 69–73. doi: 10.1002/jcc.26048.

de Ávila MB, Xavier MM, Pintro VO, de Azevedo WF. Supervised machine learning techniques to predict binding affinity. A study for cyclin-dependent kinase 2. Biochem Biophys Res Commun. 2017; 494: 305–310.

de Ávila MB, de Azevedo WF Jr. Development of machine learning models to predict inhibition of 3-dehydroquinate dehydratase. Chem Biol Drug Des. 2018; 92: 1468–1474.

Dobson CM. Chemical space and biology. Nature. 2004; 432(7019): 824–828.

Heck GS, Pintro VO, Pereira RR, de Ávila MB, Levin NMB, de Azevedo WF. Supervised Machine Learning Methods Applied to Predict Ligand-Binding Affinity. Curr Med Chem. 2017; 24(23): 2459–2470.

Hou J, Jun SR, Zhang C, Kim SH. Global mapping of the protein structure space and application in structure-based inference of protein function. Proc Natl Acad Sci U S A. 2005; 102(10): 3651–3656.

Kirkpatrick P, Ellis C. Chemical Space. Nature. 2004; 432: 823.

Levin NMB, Pintro VO, Bitencourt-Ferreira G, Mattos BB, Silvério AC, de Azevedo Jr. WF. Development of CDK-targeted scoring functions for prediction of binding affinity. Biophys Chem. 2018; 235: 1–8.

Lipinski C, Hopkins A. Navigating chemical space for biology and medicine. Nature. 2004; 432(7019): 855–861.

Pintro VO, Azevedo WF. Optimized Virtual Screening Workflow. Towards Target-Based Polynomial Scoring Functions for HIV-1 Protease. Comb Chem High Throughput Screen. 2017; 20(9): 820–827.

Russo S, de Azevedo WF. Advances in the Understanding of the Cannabinoid Receptor 1 - Focusing on the Inverse Agonists Interactions. Curr Med Chem. 2018. doi: 10.2174/0929867325666180417165247.

Shoichet BK. Virtual screening of chemical libraries. Nature. 2004; 432(7019): 862–865.

Smith JM. Natural selection and the concept of a protein space. Nature. 1970; 225(5232): 563–564.

Stockwell BR. Exploring biology with small organic molecules. Nature. 2004; 432(7019): 846–854.

Xavier MM, Heck GS, de Avila MB, Levin NM, Pintro VO, Carvalho NL, Azevedo WF Jr. SAnDReS a Computational Tool for Statistical Analysis of Docking Results and Development of Scoring Functions. Comb Chem High Throughput Screen. 2016; 19(10): 801–812.