Example of one of our earlier in silico screenings (blood-brain barrier penetration) in Parkinson’s disease


To identify compounds with high blood-brain barrier (BBB) permeability, we performed an in silico screening using a classical computer-aided drug design (CADD) approach. A prediction model was used to score compounds by their ability to penetrate the BBB. The model was built using the XGBoost algorithm and implemented as a Python script. A comprehensive set of experimental data on the BBB permeation of 448 compounds (146 compounds with poor permeation, BBB(-), and 302 compounds with good permeation, BBB(+)) was collected, with the key assumption that only passive diffusion is involved in the BBB transport of these compounds. Statistical analysis enabled the selection of an optimal set of molecular descriptors for the effective prediction of BBB penetration.

Parameters that were used to build the prediction model:

  • train/test split ratio: 0.9/0.1
  • number of compounds: 448
  • class distribution: 302 BBB(+), 146 BBB(-)

Three independent randomizations of the 448-compound database were used for the training-testing experiments. The classification accuracy of the models ranged from 78% to 85%. The best model, with a classification accuracy of 85%, was used to identify BBB-permeable compounds in the ChemDiv and Enamine collections.
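The split setup described above can be sketched in plain Python. This is only an illustration of the bookkeeping, not the original script: the stratified sampling, the seeds and the helper name stratified_split are assumptions, and no model is trained here. It also computes the accuracy of always predicting BBB(+), which gives a reference point for the reported 78%-85%.

```python
import random

N_POS, N_NEG = 302, 146      # BBB(+) and BBB(-) counts from the text
TEST_FRACTION = 0.1          # 0.9/0.1 train/test split from the text


def stratified_split(labels, test_fraction, seed):
    """Return (train_idx, test_idx), sampling each class separately.

    Stratification is an assumption; the source only gives the split ratio.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


labels = [1] * N_POS + [0] * N_NEG

# Accuracy obtained by always predicting the majority class, BBB(+).
baseline_accuracy = max(N_POS, N_NEG) / len(labels)

# Three independent randomizations; the seed values are arbitrary.
splits = [stratified_split(labels, TEST_FRACTION, seed) for seed in (0, 1, 2)]

for train_idx, test_idx in splits:
    n_pos_test = sum(labels[i] for i in test_idx)
    print(len(train_idx), len(test_idx), n_pos_test)

print(f"majority-class baseline: {baseline_accuracy:.1%}")
```

Under these assumptions each test set holds 45 compounds, so a single misclassified compound shifts the accuracy by about 2.2 percentage points, and the majority-class baseline is about 67.4%.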


Results of the virtual screening:


900K compounds from the Enamine database and 500K compounds from the ChemDiv database were classified as BBB-permeable (Fig. 1).

Targeting PD-associated genes/proteins


LRRK2

Open-source and commercially available databases (SureChEMBL, ChEMBL, SciFinder, Thomson Reuters Integrity) were analysed to obtain a reference set of more than 900 LRRK2 inhibitors. The reference set was clustered to identify the main structural and property features, as well as other parameters describing the chemical space of LRRK2 inhibitors. The 2D-structure clustering procedure was performed in ChemoSoft software using the following parameters:

  • similarity threshold: 0.6
  • minimal number of compounds in cluster: 5
  • average similarity to the cluster

Of the 906 compounds in the initial reference set, 161 were identified as singletons (compounds that do not belong to the clustered space). The remaining structures were divided into 52 distinct clusters, with an average of 17 compounds per cluster.
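The clustering parameters above can be illustrated with a minimal sketch. The ChemoSoft algorithm is not described in the text, so this assumes a simple greedy scheme: each compound joins the cluster to which its average Tanimoto similarity is highest and at least 0.6, and groups with fewer than 5 members are reported as singletons. The fingerprints are toy bit sets, not real structures.

```python
SIMILARITY_THRESHOLD = 0.6   # from the parameter list
MIN_CLUSTER_SIZE = 5         # from the parameter list


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def cluster(fingerprints):
    """Greedy clustering by average similarity to existing cluster members.

    Returns (clusters, singletons) as lists of compound indices.
    This scheme is an assumption, not the ChemoSoft procedure.
    """
    groups = []
    for i, fp in enumerate(fingerprints):
        best_group, best_sim = None, SIMILARITY_THRESHOLD
        for group in groups:
            avg = sum(tanimoto(fp, fingerprints[j]) for j in group) / len(group)
            if avg >= best_sim:
                best_group, best_sim = group, avg
        if best_group is None:
            groups.append([i])       # start a new group with this compound
        else:
            best_group.append(i)
    clusters = [g for g in groups if len(g) >= MIN_CLUSTER_SIZE]
    singletons = [i for g in groups if len(g) < MIN_CLUSTER_SIZE for i in g]
    return clusters, singletons


# Toy data: six compounds sharing a 10-bit core, plus two unrelated compounds.
fingerprints = [set(range(10)) | {100 + k} for k in range(6)]
fingerprints += [{200, 201, 202}, {300, 301}]

clusters, singletons = cluster(fingerprints)
print(clusters)
print(singletons)
```

In the toy data the six related compounds form one cluster and the two unrelated compounds are reported as singletons.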

An in silico screening procedure was performed to identify compounds with potential inhibitory activity against LRRK2. More than 1.5M BBB(+) compounds from the ChemDiv and Enamine collections were scored against the reference set of known ligands, using a combination of 3D-shape, hot-spot and pharmacophore feature analysis.
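The text does not say how the three analyses were combined into one score. One common pattern, shown here purely as an assumption, is to take a weighted mean of the per-method similarities to each reference ligand and keep the best value over the reference set. The weights, the cutoff, the function names and all scores below are hypothetical.

```python
# Hypothetical equal weights; the source does not state how scores were combined.
WEIGHTS = {"shape": 1 / 3, "hot_spot": 1 / 3, "pharmacophore": 1 / 3}


def consensus_score(per_reference_scores):
    """Best weighted-mean similarity of one candidate over all reference ligands.

    per_reference_scores: list of dicts, one per reference ligand, mapping
    each method name in WEIGHTS to a similarity in [0, 1].
    """
    return max(
        sum(WEIGHTS[m] * scores[m] for m in WEIGHTS)
        for scores in per_reference_scores
    )


def rank_candidates(candidates, cutoff):
    """Return (id, score) pairs with score >= cutoff, best first."""
    scored = [(cid, consensus_score(refs)) for cid, refs in candidates.items()]
    hits = [(cid, s) for cid, s in scored if s >= cutoff]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Made-up scores for three hypothetical candidates against two reference ligands.
candidates = {
    "cand_B": [
        {"shape": 0.5, "hot_spot": 0.7, "pharmacophore": 0.6},
        {"shape": 0.4, "hot_spot": 0.4, "pharmacophore": 0.4},
    ],
    "cand_A": [
        {"shape": 0.9, "hot_spot": 0.6, "pharmacophore": 0.9},
        {"shape": 0.3, "hot_spot": 0.3, "pharmacophore": 0.3},
    ],
    "cand_C": [
        {"shape": 0.2, "hot_spot": 0.1, "pharmacophore": 0.3},
        {"shape": 0.1, "hot_spot": 0.1, "pharmacophore": 0.1},
    ],
}

hits = rank_candidates(candidates, cutoff=0.5)
for cid, score in hits:
    print(cid, round(score, 2))
```

Taking the maximum over reference ligands rewards a candidate for resembling any one known inhibitor closely, rather than the reference set on average.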

Docking studies were performed for the selected compounds in Schrödinger software to evaluate their affinity for the LRRK2 protein. The docking model was built based on the previously published structure of the LRRK2-MLi-2 complex (PDB: 5U6I, Fig. 2).


Examples of identified compounds with the corresponding statistics are shown below:

GBA1

A reference set of 1704 reported GBA1 inhibitors was collected from the ChEMBL and Integrity databases. The reference set was clustered to identify the main structural and property features, as well as other parameters describing the chemical space of GBA1 inhibitors. The 2D-structure clustering procedure was performed in ChemoSoft software using the following parameters:

  • similarity threshold: 0.6
  • minimal number of compounds in cluster: 5
  • average similarity to the cluster


In silico screening was performed to identify potential activators of GBA1. BBB-permeable compounds from the ChemDiv and Enamine collections were scored against the reference set of known GBA1 activators, using a combination of topological, 3D-shape and pharmacophore feature analysis. More than 800 compounds were identified as potential activators of GBA1:

Fig. below: Descriptor statistics for compounds from the reference set (green) and compounds from virtual screening (blue), with representative examples of clusters (result of virtual screening).


C-Abl

Long studied in chronic myeloid leukemia and acute lymphocytic leukemia, C-Abl has recently emerged as a promising therapeutic target in PD. It was shown that C-Abl activation is involved in neuronal death, and that C-Abl inhibition has neuroprotective effects and prevents the death of dopaminergic neurons. A reference set of more than 450 reported C-Abl inhibitors was collected from scientific articles, open-source databases and other sources. Virtual screening of commercially available compounds from the ChemDiv and Enamine stock collections yielded a set of compounds that was further evaluated in molecular docking studies. Docking studies were performed in Schrödinger software. The model of the active site of C-Abl was built based on the previously published structure of the C-Abl-PD180970 complex (PDB: 2HZI, Fig. below).

Example of a compound with a high docking score in the active site of C-Abl

The virtual screening procedure revealed more than 600 BBB-permeable compounds as potential inhibitors of C-Abl kinase. The distributions of key properties of the identified molecules do not closely match those of the reference set (Fig. 19). This is explained by the large number of C-Abl inhibitors developed as anticancer molecules, for which high BBB permeability is not a key property. As a result of the BBB-permeability filter, differences in several physicochemical properties (logP, number of rotatable bonds, molecular weight) are observed.
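One way to quantify such a shift between the reference set and the screening hits is a standardized mean difference per descriptor. The sketch below is an illustration only: the statistic is a choice made here, not the analysis behind Fig. 19, and all descriptor values are made up.

```python
from statistics import mean, stdev


def standardized_difference(reference, screened):
    """Difference of means divided by the pooled standard deviation (Cohen's d)."""
    n1, n2 = len(reference), len(screened)
    pooled_var = (
        (n1 - 1) * stdev(reference) ** 2 + (n2 - 1) * stdev(screened) ** 2
    ) / (n1 + n2 - 2)
    return (mean(screened) - mean(reference)) / pooled_var ** 0.5


# Made-up descriptor values for illustration only; not data from the screening.
reference_set = {
    "molecular_weight": [480.0, 500.0, 520.0, 540.0],
    "logP": [4.0, 4.5, 5.0, 5.5],
    "rotatable_bonds": [6, 7, 8, 9],
}
screened_set = {
    "molecular_weight": [340.0, 360.0, 380.0, 400.0],
    "logP": [2.5, 3.0, 3.5, 4.0],
    "rotatable_bonds": [3, 4, 5, 6],
}

shifts = {
    name: standardized_difference(reference_set[name], screened_set[name])
    for name in reference_set
}
for name, d in shifts.items():
    print(f"{name}: {d:+.2f}")
```

A negative value means the screened compounds score lower on that descriptor than the reference inhibitors; values far from zero indicate that the two distributions barely overlap.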