Testing of a new docking scoring function on the example of inhibitors of protein tyrosine phosphatase 1B

A new scoring function, H1, recently developed for molecular docking has been tested on complexes of protein tyrosine phosphatase 1B (PTP1B) from the Protein Data Bank (PDB) and by docking a set of inhibitors from the NIH database. The function is based on the scoring functions of AutoDock and AutoDock Vina and is implemented in a modified version of AutoDock. The function performed well both on the complexes from the PDB and in a real docking process. Calculation of pKi for the complexes from the PDB was very accurate. Molecular docking was done with a modified version of AutoDock that uses spatial constraints and a new search engine. Energies of the complexes were minimized, and pKi values of the resulting complexes were estimated by the new scoring function. As shown previously, conformations of PTP1B in complexes with ligands can be divided into five clusters. All five typical conformations of the PTP1B binding pocket were used for docking. Better docking results were obtained on the clusters with the open WPD-loop, though some compounds could not be docked well to such conformations of the enzyme. The function has shown good "scoring power" (i.e. the ability to predict pKi values) and "screening power" (the ability to enrich the top 10% or 20% of predictions with real active compounds), thus proving to be suitable for the virtual screening of potential PTP1B inhibitors. The performance of the new scoring function H1 was much better than that of the original scoring function of AutoDock tested earlier.


V. Yu. Tanchuk, V. O. Tanin
Protein tyrosine phosphatases regulate many biochemical processes that depend on dephosphorylation of phosphotyrosine residues in proteins, including cell signaling and metabolic pathways [1][2][3]. The intracellular protein tyrosine phosphatase 1B (PTP1B) is known to be involved in insulin receptor dephosphorylation and is therefore considered a negative regulator of insulin signal transduction [4]. PTP1B is regarded as one of the most promising therapeutic targets for the potential treatment of type 2 diabetes and obesity [5]. The development of potent and selective inhibitors of this enzyme attracts constantly growing interest. Derivatives of carboxylic, phosphonic, and sulfonic acids, as well as heterocyclic and other compounds, have been tested as PTP1B inhibitors [6].
Computer simulations are known to play a considerable role in drug design studies, and such methods have already been applied to PTP1B. Specific active compounds have been studied using computer-based approaches, including molecular docking [7,8], an important tool for understanding the detailed mechanisms of inhibitor binding to an enzyme. PTP1B is one of the rare cases in which an enzyme is represented by a large amount of data in the PDB [9]. This is further evidence of the significance of the enzyme, but it is also a problem for the investigator. Computer simulations usually rely on multi-dimensional optimization, which makes them heavily dependent on starting conditions, since the energy surfaces are very complicated and have a great number of local minima. We have already studied the conformations of PTP1B and found that they can be divided into five clusters. Each cluster is a group of similar PTP1B conformations in protein-ligand complexes representing a typical way of ligand binding [10]. A cluster centroid is the most typical member of the cluster and may be used in computer simulations as a representative of its binding type. The conformations of two clusters have the so-called WPD-loop in an open position, and those of the other three have it in a closed position. The WPD-loop is an important mobile part of the enzyme at the entrance to the catalytic centre, with the WPD (tryptophan-proline-aspartic acid) sequence in the middle. It plays an important role in the functioning of the enzyme and interacts with the substrate during the catalysis of dephosphorylation. Many inhibitors bind to the enzyme with either the open or the closed WPD-loop. Using the centroids of all clusters, we have already performed molecular docking with a modified version of AutoDock [11]. Unfortunately, we did not find a cluster that provided the best docking results. Nor did we find a relationship between the chemical structure of a compound and the cluster that gives the best result for it. Nevertheless, some dockings gave results very close to the experimental ones.
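The idea of a cluster centroid as "the most typical representative" can be illustrated with a medoid: the member whose total distance to all other members is smallest. The sketch below is only an assumption about how such a representative might be picked; the clustering procedure of [10] is not reproduced here, and the function name `medoid_index` and the pairwise RMSD values are hypothetical.

```python
import numpy as np

def medoid_index(distances):
    """Return the index of the member with the smallest summed distance
    to all other members (the medoid) of a square distance matrix."""
    d = np.asarray(distances, dtype=float)
    if d.ndim != 2 or d.shape[0] != d.shape[1]:
        raise ValueError("distances must be a square matrix")
    return int(np.argmin(d.sum(axis=1)))

# Hypothetical pairwise RMSD values (in angstroms) between four
# conformations of one cluster; these numbers are illustrative only.
rmsd = np.array([
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 0.0, 1.0, 2.0],
    [2.0, 1.0, 0.0, 1.5],
    [3.0, 2.0, 1.5, 0.0],
])
best = medoid_index(rmsd)
print("most typical conformation:", best)
```

Choosing a real cluster member rather than an averaged structure keeps the representative physically meaningful, which matters when it is used as a docking target.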
The authors have recently developed a new scoring function for molecular docking [12]. The function is based on the scoring functions of the well-known docking packages AutoDock [13] and AutoDock Vina [14]. These scoring functions are very different in nature but share the same input and output formats, which makes their combination practical. The new scoring function H1 includes all terms of both scoring functions. New weights for the terms were fitted by multiple linear regression analysis (MLRA). The training set of protein-ligand complexes was obtained from the refined set of PDBbind (www.pdbbind.org.cn, version 2012) [15] and included 2,412 complexes (some complexes were excluded because of their incompatible format). The test set consisted of 313 complexes that first appeared in the 2013 edition of PDBbind. The new function H1 outperformed both original scoring functions on both sets.
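Fitting new weights for a set of scoring terms by multiple linear regression can be sketched as an ordinary least-squares problem. The example below is a minimal illustration under stated assumptions: the terms are random synthetic numbers rather than actual AutoDock or AutoDock Vina terms, the intercept is an assumption, and the names `fit_weights` and `predict_pki` are hypothetical. It is not the authors' implementation of H1.

```python
import numpy as np

def fit_weights(terms, pki):
    """Fit one weight per term plus an intercept by ordinary least squares.

    terms: (n_complexes, n_terms) array of per-complex scoring terms
    pki:   (n_complexes,) array of experimental pKi values
    """
    terms = np.asarray(terms, dtype=float)
    pki = np.asarray(pki, dtype=float)
    # A column of ones makes the last coefficient the intercept.
    design = np.column_stack([terms, np.ones(len(terms))])
    coeffs, *_ = np.linalg.lstsq(design, pki, rcond=None)
    return coeffs[:-1], coeffs[-1]

def predict_pki(terms, weights, intercept):
    """Predict pKi as a weighted sum of terms plus the intercept."""
    return np.asarray(terms, dtype=float) @ weights + intercept

# Synthetic illustration only: random "terms" and a known linear relation.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = np.array([0.8, -0.3, 0.5, 0.0, 1.2])
y = X @ true_w + 6.0 + rng.normal(scale=0.1, size=200)

w, b = fit_weights(X, y)
print("fitted weights:", np.round(w, 2), "intercept:", round(float(b), 2))
```

In practice the fitted weights would be validated on a separate test set, as was done with the 313 complexes mentioned above.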
The aim of the present work was to study the new scoring function on the example of PTP1B inhibitors.
First of all, we tested H1 on the known complexes of PTP1B with inhibitors. The refined set of the PDBbind database included 26 such complexes. The results are summarized in Table 1. On this set, AutoDock Vina performs slightly better than the H1 function. This can be explained by the fact that 20 of the complexes had been included in the core set of PDBbind as early as the 2007 version and were used to train the AutoDock Vina scoring function. As an additional validation, we added 28 complexes from other sources that we consider reliable enough to be used. The performance of our hybrid scoring function is much better on this additional test set, as well as on the combined set of 54 complexes. It is also interesting to note that AutoDock performs better than AutoDock Vina, which is not usually the case.
The result of the H1 scoring function seems remarkable, given that the experimental error of pKi determination is about 0.6 [16]. Unfortunately, accurate scoring of known complexes is not sufficient. The results of docking depend heavily on both the accuracy of the scoring function and the docking algorithm. The scoring function must be tested in real docking, and it should at least enrich the results with active compounds.
ISSN 2308-8303
Table 1. Comparison of the prediction power of the scoring functions on the known complexes of PTP1B with inhibitors (R is the Pearson correlation coefficient, RMSE is the root mean square error, and MAE is the mean absolute error).
That is why the new scoring function was tested under conditions similar to those of our previous work [11]. The same set of phosphorus-containing inhibitors from the NIH database [17] was docked to the same PDB structures representing the five typical conformations of PTP1B [11]. It is assumed that inhibitors with phosphonic groups are competitive inhibitors and bind near the catalytic Cys215. Docking was performed with a modified version of AutoDock using a new search algorithm [18]. The positions of the phosphorus atoms were restricted to the region around the position of phosphorus (or sometimes sulfur) in the original PDB files of the cluster centroids. There is also a substantial difference from the previous work. Previous versions of AutoDock use the same function for docking and scoring. In this version, docking is performed by minimizing the energy of the ligand-enzyme complex, and the scoring function is calculated later, at the postprocessing stage. This was done because there were some doubts about the suitability of H1 for docking. There were also some technical difficulties. Nevertheless, this combination of docking (energy minimization with a more or less standard force field) and scoring (estimation of the complex by the new scoring function H1) proved to be workable. As follows from Table 2, the new scoring function and the new docking scheme provide a substantial improvement. Previously, there was practically no correlation between the predicted and experimental binding constants [11]. Now the correlation is much better. Values of Pearson's correlation coefficient (R) of 0.5-0.6 are not bad for docking. For example, the authors of [19] compared more than 20 scoring functions from different sources, and their best function gave R of 0.61 and RMSE of 1.78 for a test set of 195 complexes. This is very similar to our results. They called this test of the ability to predict pKi values a "scoring power" test.
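The three statistics used in the "scoring power" comparison (R, RMSE, and MAE) can be computed from paired predicted and experimental pKi values as in the following minimal sketch. The function name `scoring_power` and the four data points are hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np

def scoring_power(predicted, experimental):
    """Return Pearson's R, RMSE, and MAE for paired pKi values."""
    p = np.asarray(predicted, dtype=float)
    e = np.asarray(experimental, dtype=float)
    r = np.corrcoef(p, e)[0, 1]               # Pearson correlation coefficient
    rmse = np.sqrt(np.mean((p - e) ** 2))     # root mean square error
    mae = np.mean(np.abs(p - e))              # mean absolute error
    return r, rmse, mae

# Illustrative values only, not data from this study.
pred = [5.0, 6.0, 7.0, 8.0]
expt = [5.5, 5.5, 7.5, 7.5]
r, rmse, mae = scoring_power(pred, expt)
print(f"R = {r:.2f}, RMSE = {rmse:.2f}, MAE = {mae:.2f}")
# prints "R = 0.89, RMSE = 0.50, MAE = 0.50"
```

R measures how well the ranking and trend are reproduced, whereas RMSE and MAE measure absolute deviations, so a function can have a good R and still show a systematic offset in predicted pKi.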
The clusters with the open WPD-loop seem to be preferable for docking phosphorus-containing inhibitors. Correlation coefficients are higher and errors are lower when such clusters are used. Nevertheless, it is hard to obtain good results for all ligands using only one cluster (see Fig.).
In addition to the "scoring power" test, we decided to test the "screening power". In this case, we selected the top 10% (21) and top 20% (42) of predictions for each cluster and checked how many true actives (experimental pKi > 6.0) were among these ligands (Table 3).
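The counting behind the "screening power" test can be sketched as follows: rank the ligands by predicted pKi, take the top fraction, and count those whose experimental pKi exceeds the activity threshold. The function name `actives_in_top` and the ten data points are hypothetical; only the threshold of 6.0 and the idea of a top fraction come from the text.

```python
def actives_in_top(predicted, experimental, fraction, threshold=6.0):
    """Count true actives among the top-ranked fraction of predictions.

    Returns (number of actives in the top group, size of the top group).
    """
    if len(predicted) != len(experimental):
        raise ValueError("predicted and experimental must have equal length")
    n_top = max(1, round(fraction * len(predicted)))
    # Rank ligand indices by predicted pKi, highest first.
    order = sorted(range(len(predicted)),
                   key=lambda i: predicted[i], reverse=True)
    hits = sum(1 for i in order[:n_top] if experimental[i] > threshold)
    return hits, n_top

# Illustrative values only, not data from this study.
predicted = [7.2, 6.8, 6.5, 6.1, 5.9, 5.5, 5.2, 5.0, 4.8, 4.1]
experimental = [6.9, 5.2, 6.4, 6.1, 4.9, 6.3, 5.0, 4.4, 5.1, 4.0]

hits, n_top = actives_in_top(predicted, experimental, 0.2)
print(f"{hits} of {n_top} top-ranked ligands are active")
# prints "1 of 2 top-ranked ligands are active"
```

Comparing the fraction of actives in the top group with the fraction of actives in the whole set shows whether the ranking enriches the selection beyond random picking.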
As can be seen from Table 3, the function H1 provides much better enrichment on all clusters. Moreover, the results are almost identical for all the clusters studied. Only 1NL9, which is the best in "scoring power", is the worst one here. 1PH0, another cluster with the open WPD-loop, seems to be the best in both tests (good correlation, R = 0.53, and the smallest RMSE = 1.56). Although the old AutoDock function is far from ideal, it gives better results for the clusters with the open WPD-loop.

Conclusions
The new scoring function for molecular docking and the new docking algorithm of the modified AutoDock have been tested on the complexes of PTP1B with inhibitors. The new approach has shown much better results than our previous attempts. The scoring function H1 and the new docking approach based on the modified search engine and optimization of the energy of the enzyme-ligand complex have proven to be suitable for the virtual screening of potential PTP1B inhibitors.