Covalent Modifiers: Graph neural networks for identifying protein-reactive compounds

Friday, August 2, 2024

Graph neural networks for identifying protein-reactive compounds

Victor Hugo Cano Gil and Christopher N. Rowley

Digital Discovery, 2024

https://pubs.rsc.org/en/content/articlelanding/2024/dd/d4dd00038b

The identification of protein-reactive electrophilic compounds is critical to the design of new covalent modifier drugs, screening for toxic compounds, and the exclusion of reactive compounds from high-throughput screening. In this work, we employ traditional and graph machine learning (ML) algorithms to classify molecules as reactive towards proteins or nonreactive. For training data, we built a new dataset, ProteinReactiveDB, composed primarily of covalent and noncovalent inhibitors from the DrugBank, BindingDB, and CovalentInDB databases. To assess the transferability of the trained models, we created a custom set of covalent and noncovalent inhibitors constructed from the recent literature. Baseline models were developed using Morgan fingerprints as training inputs, but they performed poorly when applied to compounds outside the training set. We then trained various Graph Neural Networks (GNNs), with the best GNN model achieving an Area Under the Receiver Operating Characteristic (AUROC) curve of 0.80, precision of 0.89, and recall of 0.72. We also explore the interpretability of these GNNs using Gradient-weighted Class Activation Mapping (GradCAM), which highlights the regions of a molecule that the GNNs deem most relevant when making a prediction. These maps indicated that our trained models can identify electrophilic functional groups in a molecule and classify molecules as protein-reactive based on their presence. We demonstrate the use of these models by comparing their performance against common chemical filters, identifying covalent modifiers in the ChEMBL database, and generating a putative covalent inhibitor based on an established noncovalent inhibitor.
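
The abstract reports three metrics for the binary reactive/nonreactive classifier: AUROC, precision, and recall. As a reminder of what those numbers measure, here is a minimal sketch in plain Python. The labels and scores are made-up toy values, not data from the paper, and the function names `precision_recall` and `auroc` are illustrative choices, not the authors' code. AUROC is computed here through its pairwise-ranking interpretation (the fraction of positive/negative pairs in which the positive is scored higher, with ties counted as half), which is fine for small examples but is not how one would compute it on a large dataset.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = protein-reactive, 0 = nonreactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    # Each (positive, negative) pair contributes 1 if ranked correctly, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four reactive and four nonreactive compounds.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

# Threshold the scores at 0.5 (an assumed cutoff) to get hard class predictions.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

toy_precision, toy_recall = precision_recall(toy_labels, toy_preds)
toy_auroc = auroc(toy_labels, toy_scores)
```

Precision and recall depend on the chosen threshold, while AUROC depends only on how the scores rank the compounds. A high precision paired with a lower recall, as in the abstract, means that compounds flagged as reactive are usually truly reactive, but some reactive compounds are missed.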
