
Research   

Protein-Ligand Interactions

 

By studying intermolecular interactions between proteins and ligands, we expect to gain further insights into the structural basis for the specificity of small-molecule ligands for a given protein target (De Azevedo, 2008). This is a central problem in drug design. Knowledge of the key features responsible for the specificity of a ligand for a protein allows us to determine which physicochemical parameters could be changed to improve the protein-ligand interaction (De Azevedo & Dias, 2008a). Furthermore, the development of computational models to predict binding affinity from the atomic coordinates of a protein-ligand complex (De Azevedo & Dias, 2008b) makes it possible to apply virtual screening approaches to search small-molecule databases for drug candidates (De Azevedo, 2010a). To study protein-ligand interactions, we make use of protein crystallography (Canduri & De Azevedo, 2008), nuclear magnetic resonance spectroscopy (Fadel et al., 2005), molecular docking (De Azevedo, 2010b), and molecular dynamics (De Azevedo, 2011).

References   

Canduri F, de Azevedo WF. Protein crystallography in drug discovery. Curr Drug Targets. 2008; 9(12):1048-53.

De Azevedo WF Jr. Protein-drug interactions. Curr Drug Targets. 2008; 9(12):1030.

De Azevedo WF Jr, Dias R. Experimental approaches to evaluate the thermodynamics of protein-drug interactions. Curr Drug Targets. 2008a; 9(12):1071-6.

De Azevedo WF Jr, Dias R. Computational methods for calculation of ligand-binding affinity. Curr Drug Targets. 2008b; 9(12):1031-9.

De Azevedo WF Jr. Structure-based virtual screening. Curr Drug Targets. 2010a; 11(3):261-3.

De Azevedo WF Jr. MolDock applied to structure-based virtual screening. Curr Drug Targets. 2010b; 11(3):327-34.

De Azevedo WF Jr. Molecular dynamics simulations of protein targets identified in Mycobacterium tuberculosis. Curr Med Chem. 2011; 18(9):1353-66.

Fadel V, Bettendorff P, Herrmann T, de Azevedo WF Jr, Oliveira EB, Yamane T, Wüthrich K. Automated NMR structure determination and disulfide bond identification of the myotoxin crotamine from Crotalus durissus terrificus. Toxicon. 2005; 46(7):759-67.

Molecular Docking     

The computational prediction of the position of a given ligand in the binding pocket of a protein target is called protein-ligand molecular docking (De Azevedo, 2010). Our focus here is on the application of optimized molecular docking strategies to identify potential new inhibitors for enzymes that are targets for drug development. We are interested in the discovery of new inhibitors for cyclin-dependent kinases (de Ávila et al., 2017), HIV-1 protease (Pintro & Azevedo, 2017), and several other protein targets (Heck et al., 2017). We are also interested in the development of integrated strategies for molecular docking simulations (Xavier et al., 2016). Furthermore, we seek to develop scoring functions targeted to the biological systems we are interested in (Heck et al., 2017).

References

de Ávila MB, Xavier MM, Pintro VO, de Azevedo WF. Supervised machine learning techniques to predict binding affinity. A study for cyclin-dependent kinase 2. Biochem Biophys Res Commun. 2017; 494:305-10.

De Azevedo WF Jr. MolDock applied to structure-based virtual screening. Curr Drug Targets. 2010; 11(3):327-34.

Heck GS, Pintro VO, Pereira RR, de Ávila MB, Levin NMB, de Azevedo WF. Supervised Machine Learning Methods Applied to Predict Ligand-Binding Affinity. Curr Med Chem. 2017; 24(23):2459-70.

Pintro VO, Azevedo WF. Optimized Virtual Screening Workflow. Towards Target-Based Polynomial Scoring Functions for HIV-1 Protease. Comb Chem High Throughput Screen. 2017. doi: 10.2174/1386207320666171121110019.

Xavier MM, Heck GS, de Avila MB, Levin NM, Pintro VO, Carvalho NL, Azevedo WF Jr. SAnDReS a Computational Tool for Statistical Analysis of Docking Results and Development of Scoring Functions. Comb Chem High Throughput Screen. 2016; 19(10):801-12.


Bio-inspired Computing     

   

Nature has proven to be a fruitful source of inspiration for the development of new computational methodologies. Algorithms that mimic biological systems have created new paradigms for computation, such as neural networks, evolutionary computing, and swarm intelligence. Biologically inspired algorithms (BIA) comprise a class of stochastic optimization and adaptation methodologies designed for global optimization. One of the most promising biologically inspired algorithms is the evolutionary algorithm. The great majority of evolutionary algorithms draw inspiration from the process of genetic evolution. In Darwinian evolution, selection favors the individuals best fitted to survive in their ecosystem. From this view, classes of evolutionary algorithms known as genetic algorithms, genetic programming, and evolutionary programming have been developed. All classes of evolutionary algorithms share a large number of characteristics (Heberlé & De Azevedo, 2011). We have been working on the application and development of bio-inspired computation to address the problem of protein-ligand interactions (Xavier et al., 2016; Heck et al., 2017; de Ávila et al., 2017; Pintro & Azevedo, 2017).
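The characteristics that evolutionary algorithms share (a population of candidate solutions, selection of the fittest, recombination, and mutation) can be illustrated with a minimal sketch. The code below is a generic toy example, not one of the algorithms used in the cited works; the objective function `sphere`, the function name `evolve`, and all parameter values are assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Toy objective to minimize: sum of squares, with minimum 0 at the origin."""
    return sum(v * v for v in x)


def evolve(objective, dim=3, pop_size=30, generations=100,
           mutation_sigma=0.3, bounds=(-5.0, 5.0), seed=0):
    """Elitist loop: keep the fitter half, then recombine and mutate it."""
    rng = random.Random(seed)
    lo, hi = bounds
    population = [[rng.uniform(lo, hi) for _ in range(dim)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the fitter half survives (lower objective is better).
        population.sort(key=objective)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, clipped to the search bounds.
            child = [min(hi, max(lo, g + rng.gauss(0.0, mutation_sigma)))
                     for g in child]
            children.append(child)
        population = parents + children
    return min(population, key=objective)


best = evolve(sphere)
print(best, sphere(best))
```

In a docking setting the genes would encode the position, orientation, and torsion angles of a ligand, and the objective would be a scoring function; those pieces are left out here to keep the sketch self-contained.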

References 

de Ávila MB, Xavier MM, Pintro VO, de Azevedo WF. Supervised machine learning techniques to predict binding affinity. A study for cyclin-dependent kinase 2. Biochem Biophys Res Commun. 2017; 494:305-10.

Heberlé G, de Azevedo WF Jr. Bio-inspired algorithms applied to molecular docking simulations. Curr Med Chem. 2011; 18(9):1339-52.

Heck GS, Pintro VO, Pereira RR, de Ávila MB, Levin NMB, de Azevedo WF. Supervised Machine Learning Methods Applied to Predict Ligand-Binding Affinity. Curr Med Chem. 2017; 24(23):2459-70.

Pintro VO, Azevedo WF. Optimized Virtual Screening Workflow. Towards Target-Based Polynomial Scoring Functions for HIV-1 Protease. Comb Chem High Throughput Screen. 2017. doi: 10.2174/1386207320666171121110019.

Xavier MM, Heck GS, de Avila MB, Levin NM, Pintro VO, Carvalho NL, Azevedo WF Jr. SAnDReS a Computational Tool for Statistical Analysis of Docking Results and Development of Scoring Functions. Comb Chem High Throughput Screen. 2016; 19(10):801-12.

Computational Systems Biology         

We have been working on the development of computational models to unravel the molecular mechanisms underlying enzyme inhibition and protein-ligand interactions. These models can be used to predict the binding affinity of a potential inhibitor for an enzyme. Such knowledge has the potential to speed up drug discovery and decrease the cost of developing new drugs (de Ávila et al., 2017; Pintro & Azevedo, 2017). Furthermore, the availability of computational models to predict binding affinity based on the atomic coordinates of protein-ligand complexes adds flexibility to the process of drug discovery, since it allows us to computationally test different scenarios in which a potential new drug may interact with a protein target (Xavier et al., 2016; Heck et al., 2017).

References

de Ávila MB, Xavier MM, Pintro VO, de Azevedo WF. Supervised machine learning techniques to predict binding affinity. A study for cyclin-dependent kinase 2. Biochem Biophys Res Commun. 2017; 494:305-10.

Heck GS, Pintro VO, Pereira RR, de Ávila MB, Levin NMB, de Azevedo WF. Supervised Machine Learning Methods Applied to Predict Ligand-Binding Affinity. Curr Med Chem. 2017; 24(23):2459-70.

Pintro VO, Azevedo WF. Optimized Virtual Screening Workflow. Towards Target-Based Polynomial Scoring Functions for HIV-1 Protease. Comb Chem High Throughput Screen. 2017. doi: 10.2174/1386207320666171121110019.

Xavier MM, Heck GS, de Avila MB, Levin NM, Pintro VO, Carvalho NL, Azevedo WF Jr. SAnDReS a Computational Tool for Statistical Analysis of Docking Results and Development of Scoring Functions. Comb Chem High Throughput Screen. 2016; 19(10):801-12.

Research Projects

SAnDReS: Statistical Analysis of Docking Results and Scoring functions

SAnDReS draws inspiration from several protein-ligand projects that we have been working on over the last two decades. These projects began in the 1990s with pioneering studies focused on intermolecular interactions between cyclin-dependent kinase and inhibitors (De Azevedo et al., 1996; 1997). SAnDReS is a free and open-source (GNU General Public License) computational environment for the development of machine-learning models for prediction of ligand-binding affinity. SAnDReS is also a tool for statistical analysis of docking simulations and for evaluation of the predictive performance of computational models developed to calculate binding affinity. We have implemented machine learning techniques to generate regression models based on experimental binding affinity and scoring functions such as PLANTS and MolDock scores. The scikit-learn library offers a wide spectrum of supervised machine learning techniques for regression, such as Stochastic Gradient Descent and Support Vector Regression. SAnDReS was developed using the Python programming language and the SciPy, NumPy, scikit-learn, and Matplotlib libraries. Data obtained from any protein-ligand docking program can be analyzed by SAnDReS. The only requirement is to have protein structures in Protein Data Bank (PDB) format, ligands in Structure Data File (SDF) format, and docking and scoring function data in comma-separated values (CSV) format. This program has been applied to several datasets composed of crystallographic structures for which ligand-binding affinity information is available, in order to generate scoring functions tailored to the biological system of interest (Xavier et al., 2016).
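As a rough illustration of the kind of statistical analysis described above, the sketch below reads docking scores and experimental affinities from CSV text and computes correlation coefficients and a one-variable regression. It is not SAnDReS code and does not reproduce the SAnDReS file layout: the column names, the complex identifiers, and every numeric value are made-up assumptions, and only the Python standard library is used.

```python
import csv
import io
import math

# Hypothetical CSV layout with made-up toy values (not real measurements).
CSV_TEXT = """complex_id,docking_score,experimental_log_ki
complex_1,-110.2,-8.9
complex_2,-95.4,-7.1
complex_3,-130.8,-9.6
complex_4,-80.1,-6.0
complex_5,-101.7,-7.0
"""


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
scores = [float(row["docking_score"]) for row in rows]
affinities = [float(row["experimental_log_ki"]) for row in rows]

r = pearson(scores, affinities)
rho = spearman(scores, affinities)
slope, intercept = fit_line(scores, affinities)
residuals = [slope * s + intercept - a for s, a in zip(scores, affinities)]
rmse_value = math.sqrt(sum(e * e for e in residuals) / len(residuals))
print(f"r={r:.3f} rho={rho:.3f} slope={slope:.4f} rmse={rmse_value:.3f}")
```

A multi-term regression over several scoring-function components would normally be handled by a library such as scikit-learn; the single-variable fit here only shows the shape of the analysis.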


Gallery of Plots Generated by the Program SAnDReS     


Scatter plots and ROC Curve generated by SAnDReS 


The flowchart below illustrates the main steps to integrate a molecular docking program and SAnDReS.

Flowchart for application of SAnDReS to analyze docking results and develop scoring functions. Grey boxes indicate tasks carried out by SAnDReS.


The biological systems listed below are being investigated using SAnDReS.

-5-Enolpyruvylshikimate-3-Phosphate Synthase (EPSP Synthase) (EC 2.5.1.19)   PubMed
-11-Beta-Hydroxysteroid Dehydrogenase (EC 1.1.1.146)   PubMed      
-2014 Benchmark Exercise for Coagulation Factor Xa (EC 3.4.21.6)   PubMed

-Acetylcholinesterase (EC 3.1.1.7)   PubMed   MOTM   
-ADAM 17 Endopeptidase (EC 3.4.24.86)   PubMed         
-Adenosine A2a Receptor (EC 3.2.1.17)   PubMed          
-Adenosine Deaminase (EC 3.5.4.4)   PubMed          
-Adenosylhomocysteinase (EC 3.3.1.1)   PubMed         
-Aldehyde Reductase (EC 1.1.1.21)   PubMed              
-Angiotensin-Converting Enzyme (EC 3.4.15.1)   PubMed           

-Beta-2 Adrenergic Receptor (EC 3.2.1.17)   PubMed          
-Beta-Glucocerebrosidase (or Glucosylceramidase)  (EC 3.2.1.45)   PubMed      
-Beta-Lactamase (or Cephalosporinase) (EC 3.5.2.6)   PubMed   MOTM    
-Beta-Secretase 1 (or Memapsin 2) (EC 3.4.23.46)   PubMed   MOTM        

-Carbonic Anhydrase II (or Carbonic Dehydratase) (EC 4.2.1.1)   PubMed   MOTM       
-Caspase-3 (or Apopain) (EC 3.4.22.56)   PubMed     MOTM
-Catechol O-Methyltransferase (EC 2.1.1.6)   PubMed  
-Chorismate Synthase (EC 4.2.3.5)   PubMed   
-c-Jun N-Terminal Kinase 3 (or Mitogen-Activated Protein Kinase) (EC 2.7.11.24)   PubMed       
-Coagulation Factor X (or Prothrombase) (EC 3.4.21.6)   PubMed         
-Coagulation Factor VIIa (EC 3.4.21.21)   PubMed           
-C-X-C Chemokine Receptor Type 4 (EC 3.2.1.17)   PubMed         
-Cyclin-Dependent Kinase (EC 2.7.11.22)   PubMed         
-Cyclooxygenase-1 and 2 (EC 1.14.99.1)   PubMed   MOTM   
-Cytochrome P450 2C9 (EC 1.14.13.-)   PubMed   MOTM   

-Dihydrofolate Reductase (EC 1.5.1.3)   PubMed   MOTM
-Dihydroorotate Dehydrogenase (Quinone) (EC 1.3.5.2)   PubMed       
-Dipeptidyl Peptidase IV (EC 3.4.14.5)   PubMed       
-Dopamine D3 Receptor (EC 3.2.1.17)   PubMed         
-Dual Specificity Mitogen-Activated Protein Kinase Kinase 1 (EC 2.7.12.2)   PubMed           

-Enoyl-[Acyl-Carrier-Protein] Reductase (NADH) (EC 1.3.1.9)   PubMed          
-Epidermal Growth Factor Receptor ErbB1 (EC 2.7.10.1)   PubMed   MOTM  

-Farnesyl Diphosphate Synthase (EC 2.5.1.10)   PubMed             
-Fibroblast Growth Factor Receptor 1 (EC 2.7.10.1)   PubMed           
-FK506-Binding Protein 1 (EC 5.2.1.8)   PubMed           
-Focal Adhesion Kinase 1 (EC 2.7.10.2)   PubMed           

-GAR Transformylase (EC 2.1.2.2)   PubMed          
-Glucocorticoid Receptor (EC 2.3.1.48)   PubMed         

-Hepatocyte Growth Factor Receptor (or Receptor Protein-Tyrosine Kinase) (EC 2.7.10.1)   PubMed              
-Hexokinase Type IV (or Glucokinase) (EC 2.7.1.2)   PubMed           
-High-Resolution Crystallographic Structures with Delta G Information       
-High-Resolution Crystallographic Structures with Kd Information    
-High-Resolution Crystallographic Structures with Ki Information       
-High-Resolution Crystallographic Structures with IC50 Information    
-Histone Deacetylase 2 (or HDAC) (EC 3.5.1.98)   PubMed         
-Histone Deacetylase 8 (or HDAC) (EC 3.5.1.98)   PubMed        
-HMG-CoA Reductase (or Hydroxymethylglutaryl-CoA Reductase (NADPH) ) (EC 1.1.1.34)   PubMed         
-Human Immunodeficiency Virus Type 1 Integrase (EC 2.7.7.-)   PubMed   MOTM   
-Human Immunodeficiency Virus Type 1 Protease (EC 3.4.23.16)   PubMed   MOTM              
-Human Immunodeficiency Virus Type 1 Reverse Transcriptase (EC 2.7.7.49)   PubMed  MOTM         
-Hydrolases (EC 3.-.-.-)   PubMed        

-Insulin-Like Growth Factor I Receptor (or Receptor Protein-Tyrosine Kinase) (EC 2.7.10.1)   PubMed        
-Inhibitor of Apoptosis Protein (or Magnesium-Importing ATPase) (EC 3.6.3.2)   PubMed       
-Isomerases (EC 5.-.-.-)   PubMed          
 
-Kinases   PubMed         

-Leukotriene-A(4) Hydrolase (or LTA-4 Hydrolase) (EC 3.3.2.6)   PubMed        
-Ligases (EC 6.-.-.-)   PubMed   MOTM             
-Lyases (EC 4.-.-.-)   PubMed        
      
-Macrophage Colony Stimulating Factor Receptor (EC 2.7.10.1)   PubMed
-MAP Kinase-Activated Protein Kinase 2 (or Non-Specific Serine/Threonine Protein Kinase) (EC 2.7.11.1)   PubMed       
-MAP Kinase ERK2 (or Mitogen-Activated Protein Kinase) (EC 2.7.11.24)   PubMed        
-MAP Kinase p38 Alpha  (or Mitogen-Activated Protein Kinase) (EC 2.7.11.24)   PubMed      
-Matrix Metalloproteinase 13 (EC 3.4.24.-)   PubMed       
-Monoamine Oxidase B (or Monoamine Oxidase) (EC 1.4.3.4)   PubMed       
-Muscle Glycogen Phosphorylase (or Amylophosphorylase) (EC 2.4.1.1)   PubMed   MOTM         

-Nitric-Oxide Synthase (EC 1.14.13.39)   PubMed   MOTM         
-Neuraminidase (EC 3.2.1.18)   PubMed   MOTM   

-Oxidoreductases (EC 1.-.-.-)   PubMed         

-Peptide Deformylase (EC 3.5.1.88)   PubMed            
-Peroxisome Proliferator-Activated Receptor Alpha (EC 2.3.1.48)   PubMed     
-Peroxisome Proliferator-Activated Receptor Gamma (EC 2.3.1.48)   PubMed 
-Phosphodiesterase 5A (EC 3.1.4.35)   PubMed         
-Phospholipase A(2) Group IIA (EC 3.1.1.4)   PubMed        
-Poly [ADP-ribose] Polymerase-1 (or NAD(+) ADP-ribosyltransferase) (EC 2.4.2.30)   PubMed            
-Protein Farnesyltransferase/Geranylgeranyltransferase Type I Alpha Subunit (EC 2.5.1.58 or EC 2.5.1.59)   PubMed        
-Protein Kinase C Beta (EC 2.7.11.13)   PubMed            
-Purine Nucleoside Phosphorylase (EC 2.4.2.1)   PubMed

-Renin (or Angiotensin-Forming Enzyme) (EC 3.4.23.15)   PubMed
-Rho-Associated Protein Kinase 1 (EC 2.7.11.1)   PubMed    
-Serine/Threonine-Protein Kinase AKT (EC 2.7.11.1)   PubMed     
-Serine/Threonine-Protein Kinase AKT2 (EC 2.7.11.1)   PubMed    
-Serine/Threonine-Protein Kinase B-raf (EC 2.7.11.1)   PubMed   MOTM     
-Serine/Threonine-Protein Kinase PLK1 (or Polo Kinase) (EC 2.7.11.21)   PubMed     
-Serine/Threonine-Protein Kinase WEE1 (EC 2.7.10.2)   PubMed     
-Serotonin Receptor   PubMed   MOTM    

-Shikimate Kinase (EC 2.7.1.71)   PubMed   
-Stem Cell Growth Factor Receptor (or Receptor Protein-Tyrosine Kinase) (EC 2.7.10.1)   PubMed     

-TGF-Beta Receptor Type I (or Receptor Protein Serine/Threonine Kinase) (EC 2.7.11.30)   PubMed
-Thymidine Kinase (EC 2.7.1.21)   PubMed    
-Thymidylate Synthase (EC 2.1.1.45)   PubMed    
-Thrombin (or Fibrinogenase) (EC 3.4.21.5)   PubMed     
-Transferases (EC 2.-.-.-)   PubMed     
-Trypsin I (or Alpha-Trypsin or Beta-Trypsin) (EC 3.4.21.4)   PubMed      
-Tryptase Beta-1 (EC 3.4.21.59)   PubMed        
-Tyrosine-Protein Kinase ABL (or Non-Specific Protein-Tyrosine Kinase) (EC 2.7.10.2)   PubMed       
-Tyrosine-Protein Kinase JAK2 (or Non-Specific Protein-Tyrosine Kinase) (EC 2.7.10.2)   PubMed 
-Tyrosine-Protein Kinase LCK (or Non-Specific Protein-Tyrosine Kinase) (EC 2.7.10.2)   PubMed  
-Tyrosine-Protein Kinase SRC (or Non-Specific Protein-Tyrosine Kinase) (EC 2.7.10.2)   PubMed  MOTM  

-Urokinase-Type Plasminogen Activator (or U-plasminogen Activator) (EC 3.4.21.73)   PubMed    

-Vascular Endothelial Growth Factor Receptor 2 (or Receptor Protein-Tyrosine Kinase) (EC 2.7.10.1)   PubMed  


Related Links  

     -A Database of Useful Decoys: Enhanced (DUDE)     
     -Enzyme Nomenclature Database (Expasy)     
     -Scikit-learn Machine Learning Techniques for Regression   
     -Matplotlib     
     -NumPy     
     -Protein Data Bank (PDB)     
     -Python      
     -SAnDReS       
     -SciPy      
     -UCI Machine Learning Repository   
     -Wolfram Demonstration Projects for Machine Learning   
     -Wolfram Demonstration Projects for Regression     
     -Wolfram Demonstration Projects for Stochastic Gradient Descent          
 

Evolutionary Algorithms Applied to the Study of Intermolecular Interactions  

This research project aims to study protein-ligand interactions through the application of evolutionary algorithms and empirical scoring functions. We will use structural information available in the Protein Data Bank (PDB), together with published binding-affinity data, to build training sets for empirical scoring functions that predict binding affinity. These scoring functions will be tuned using available binding-affinity data, sorted by enzymatic class, which allows each function to be specific to the molecular system being simulated. In this way, we hope to contribute to the understanding of intermolecular interactions between proteins and ligands, a pivotal topic in computer-based drug design.
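The idea of tuning an empirical scoring function against binding-affinity data with an evolutionary algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the scoring function is assumed to be a bias plus a weighted sum of three unnamed energy terms, the training set is synthetic (generated from the hypothetical `TRUE_WEIGHTS`), and the optimizer is a plain DE/rand/1/bin differential evolution with illustrative parameter values.

```python
import random


def predict(weights, terms):
    """Linear empirical score: bias plus weighted sum of energy terms."""
    return weights[0] + sum(w * t for w, t in zip(weights[1:], terms))


def rmse(weights, data):
    """Root-mean-square error between predicted and target affinities."""
    errors = [(predict(weights, terms) - target) ** 2 for terms, target in data]
    return (sum(errors) / len(errors)) ** 0.5


def differential_evolution(cost, dim, pop_size=30, generations=300,
                           f=0.7, cr=0.9, bounds=(-5.0, 5.0), seed=1):
    """DE/rand/1/bin: mutate with a scaled difference vector, then crossover."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    costs = [cost(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for k in range(dim):
                if k == forced or rng.random() < cr:
                    gene = pop[a][k] + f * (pop[b][k] - pop[c][k])
                    gene = min(hi, max(lo, gene))  # clip to the search bounds
                else:
                    gene = pop[i][k]
                trial.append(gene)
            trial_cost = cost(trial)
            # Greedy selection: the trial replaces the target if it is no worse.
            if trial_cost <= costs[i]:
                pop[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best]


# Synthetic training set: hypothetical weights and random energy terms.
TRUE_WEIGHTS = [1.0, -0.5, 2.0, 0.3]
data_rng = random.Random(42)
training_set = []
for _ in range(20):
    terms = [data_rng.uniform(-3.0, 3.0) for _ in range(3)]
    training_set.append((terms, predict(TRUE_WEIGHTS, terms)))

weights, error = differential_evolution(lambda w: rmse(w, training_set), dim=4)
print(weights, error)
```

With real data, one training set would be assembled per enzymatic class so that each tuned weight vector is specific to that class; the synthetic, noise-free set above only checks that the optimizer can recover known weights.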


Keywords: Protein, binding affinity, drug design, evolutionary algorithms, differential evolution


Funding: R$ 120.000,00 (one hundred and twenty thousand reais)
Funding Agency: Conselho Nacional de Desenvolvimento Científico e Tecnológico - National Council for Scientific and Technological Development (www.cnpq.br)
Period: From March/2015 to February/2019.
Principal Investigator: Walter F. de Azevedo Jr., Ph.D.
Process Number: 308883/2014-4