August 2010


Correlating Drug Side Effects, Biochemical Pathways, and Diseases

Taking a drug for a medical condition carries with it the risk of side effects, sometimes deadly. Think of thalidomide, the popular sedative and morning sickness drug from the late 1950s and early 1960s, which causes birth defects when taken during pregnancy (by the way, the refusal to approve it in the United States is a Food and Drug Administration success story), and Vioxx, the once-popular treatment for degenerative arthritis, taken off the market a few years ago because of the risk of a heart attack or stroke.

It's very challenging to develop an effective drug in the first place, which happens to be the primary motivation for scientists applying old drugs for new purposes. It may be even more challenging to predict and minimize drug side effects.

This is because the majority of drug side effects are caused by unintended drug interactions with biochemical pathways other than the intended target. It's often very challenging to predict such interactions, and most of them are discovered and evaluated only at the clinical trial stage.

Predicting drug side effects: A new approach is needed.

Many studies aimed at predicting drug side effects are based on the presumption that chemically similar drugs or protein binding sites will behave similarly. Although valuable, such approaches often provide little if any clue to the underlying biochemical pathways involved.

Further, they may also require one to know exactly where on the protein the drug binds (this data is commonly unavailable). These two limitations hinder scientists' ability to predict the side effects of fundamentally novel and unique drugs, for which detailed biochemical data is limited.

Izhar Wallach, Navdeep Jaitly, and Ryan Lilien (University of Toronto, Canada) have worked towards addressing these limitations with a computational model linking drug side effects to the underlying biochemical pathway(s). Their approach is based on the presumption that drugs which act on the same biochemical pathway may lead to similar side effects.

Suppose a drug A causes a side effect B, found to be associated with a specific biochemical pathway C. If another drug D produces a similar side effect B, yet is predicted to not interfere with biochemical pathway C, it's unlikely that biochemical pathway C is the sole contributor to the side effect of the two drugs.

Furthermore, it's likely that the biochemical pathways manipulated by drugs A and D are functionally related (and may be related to seemingly unrelated diseases). The goal of these scientists' research is to combine drug side effect data with interlinking biochemical pathway data in a theoretical model that can (1) predict drug side effects for a given disease, and (2) predict how two different diseases may be functionally related.

Development of the computational model: An overview.

This section may provide "too much information" for the typical interested reader. If you want to skip it, just know that the scientists utilized multiple sets of public data to build a computational model linking drug side effects with biochemical pathways.

The scientists' computational model makes use of three public databases. These are the Protein Data Bank (PDB, protein structures), the Kyoto Encyclopedia of Genes and Genomes (KEGG, e.g. its database of biochemical pathways and their associated genetic regulation), and the Side Effect Resource (SIDER, drugs and their known side effects).

They studied 730 drug molecules approved for clinical use, and 830 protein targets. The drugs were within the molecular weight range of 100 to 800 Daltons, and each possessed fewer than 10 rotatable bonds (to facilitate the computations).
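To make the drug selection criteria concrete, here is a minimal sketch of such a filter. The `Drug` class, the `passes_filter` function, and the example molecules are all hypothetical, and I am assuming the weight limits are inclusive (the summary above does not say).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drug:
    """Hypothetical record; the values below are made up for illustration."""
    name: str
    mol_weight: float      # molecular weight in Daltons
    rotatable_bonds: int   # number of rotatable bonds


def passes_filter(drug, min_mw=100.0, max_mw=800.0, max_rot=10):
    """Keep drugs of 100 to 800 Da (assumed inclusive) with fewer than 10 rotatable bonds."""
    in_weight_range = min_mw <= drug.mol_weight <= max_mw
    return in_weight_range and drug.rotatable_bonds < max_rot


# Made-up candidates, not real drug data.
candidates = [
    Drug("small_fragment", 58.1, 0),       # too light
    Drug("typical_drug", 180.2, 3),        # passes both criteria
    Drug("flexible_drug", 430.5, 14),      # too many rotatable bonds
    Drug("large_macrocycle", 1202.6, 11),  # too heavy and too flexible
]

kept = [d.name for d in candidates if passes_filter(d)]
print(kept)  # ['typical_drug']
```

Limiting the number of rotatable bonds keeps the number of shapes each molecule can adopt manageable, which is presumably why it eases the computations.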

The proteins were of known structure (PDB database), determined via nuclear magnetic resonance or x-ray analysis (in the latter case, with a resolution of 3 angstroms or better). Furthermore, all of the proteins were from humans, possessed more than 50 amino acids (the building blocks of proteins), were annotated (e.g. were associated with a known gene and biochemical pathway), and were enzymes.

Proteins of greater than 90% amino acid sequence similarity, along more than 90% of their length, were removed from analysis, to prevent redundancy (bias) in model development. The protein targets were collectively associated with 176 known biochemical pathways (KEGG database).
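One simple way to apply a redundancy cutoff like this is a greedy pass over the protein list. The sketch below assumes the pairwise identity and coverage values have already been computed by some alignment tool. The function name, the data layout, and the choice of which member of a similar pair to keep are my assumptions, not details from the original article.

```python
def remove_redundant(proteins, pair_stats, identity_cutoff=0.90, coverage_cutoff=0.90):
    """Greedy filter: keep a protein unless it is a near-duplicate of one already kept.

    pair_stats maps frozenset({a, b}) -> (identity, coverage), both as fractions.
    Pairs missing from pair_stats are treated as dissimilar.
    """
    kept = []
    for protein in proteins:
        redundant = False
        for other in kept:
            identity, coverage = pair_stats.get(frozenset((protein, other)), (0.0, 0.0))
            if identity > identity_cutoff and coverage > coverage_cutoff:
                redundant = True
                break
        if not redundant:
            kept.append(protein)
    return kept


# Made-up identifiers and similarity values.
proteins = ["P1", "P2", "P3", "P4"]
pair_stats = {
    frozenset(("P1", "P2")): (0.97, 0.95),  # near-duplicate over almost the full length
    frozenset(("P1", "P3")): (0.95, 0.40),  # very similar, but only over a short stretch
    frozenset(("P3", "P4")): (0.35, 0.98),  # long overlap, low similarity
}

print(remove_redundant(proteins, pair_stats))  # ['P1', 'P3', 'P4']
```

Note that a greedy pass depends on the input order: the first protein seen in a cluster of near-duplicates is the one that survives.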

The protein targets were also collectively associated with 506 known side effects (SIDER database). A side effect was not included in the scientists' analysis if it was associated with fewer than 3 drugs or more than 5% of them, if it was reported after drug approval, or if its frequency was less than 1% after controlling for placebo effects; i.e., neither very rare nor very common side effects were included in the analysis.
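The first two exclusion rules (fewer than 3 drugs, or more than 5% of drugs) can be sketched as a simple counting filter. The function and the toy data are hypothetical; the post-approval and placebo-adjusted frequency rules are left out because they need information a plain drug-to-side-effect table does not contain.

```python
from collections import defaultdict


def filter_side_effects(drug_effects, min_drugs=3, max_fraction=0.05):
    """Return side effects seen in at least min_drugs drugs and at most max_fraction of all drugs."""
    n_drugs = len(drug_effects)
    drugs_per_effect = defaultdict(set)
    for drug, effects in drug_effects.items():
        for effect in effects:
            drugs_per_effect[effect].add(drug)
    max_drugs = max_fraction * n_drugs  # e.g. 5% of 730 drugs is 36.5
    return sorted(
        effect
        for effect, drugs in drugs_per_effect.items()
        if min_drugs <= len(drugs) <= max_drugs
    )


# Toy data set of 100 made-up drugs, so the 5% ceiling is 5 drugs.
drug_effects = {f"drug{i}": set() for i in range(100)}
for i in range(40):
    drug_effects[f"drug{i}"].add("very_common")   # 40 drugs: too common
for i in range(4):
    drug_effects[f"drug{i}"].add("intermediate")  # 4 drugs: kept
for i in range(2):
    drug_effects[f"drug{i}"].add("very_rare")     # 2 drugs: too rare

print(filter_side_effects(drug_effects))  # ['intermediate']
```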

Since the exact position on the protein to which the drug binds is generally unknown, the scientists' model presumes it to be in one of the two largest pockets, as determined via the LIGSITEcsc webserver. Previous research has shown that one of these two sites is often (nearly 93% of the time) the drug binding site.
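Picking the two largest pockets is just a sort-and-slice once each pocket has a size attached. In the sketch below the pocket names and sizes are made up, and the `(name, size)` layout is my own assumption; it is not the output format of LIGSITEcsc.

```python
def two_largest_pockets(pockets):
    """Return the two pockets with the largest size, biggest first.

    pockets is a list of (name, size) tuples; the size measure is an assumption.
    """
    return sorted(pockets, key=lambda pocket: pocket[1], reverse=True)[:2]


# Made-up pockets for a single hypothetical protein.
pockets = [
    ("pocket_a", 310.0),
    ("pocket_b", 905.5),
    ("pocket_c", 120.0),
    ("pocket_d", 642.0),
]

print(two_largest_pockets(pockets))  # [('pocket_b', 905.5), ('pocket_d', 642.0)]
```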

The scientists' model predicted 185 drug side effect/biochemical pathway associations, involving 121 side effects and 90 biochemical pathways, which warranted further investigation. These are discussed next.

Narrowing the list of associations.

As mentioned, the scientists' model predicted many associations between drug side effects and biochemical pathways. How many of these are real?

This is not an easy question to answer. The scientists decided to check the associations' validity against actual experiments backed up by strong evidence in the scientific literature (measured by their frequency of appearance in scientific publications), followed by a manual review. (As an aside, I've noted in previous blog posts that many enzymes in public databases are misannotated, and popular research is more likely to be erroneous.)

It's clear that many of the predicted associations may be valid, yet end up getting thrown out because relevant experiments have not been performed. However, this validation approach does add concrete, testable validity to the predicted associations.

The scientists found that 22 of their 185 computationally-predicted associations have strong experimental support, and a further 10 have weaker yet significant support. The scientists emphasize that lack of experimental support does not imply that the computational predictions are false; that a drug may be merely involved in, rather than the cause of, a biochemical pathway disruption; and that certain diseases may inherently favor the appearance of a specific side effect.
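For a sense of scale, the fractions implied by these counts are easy to work out. The variable names are mine; the three numbers come straight from the paragraph above.

```python
predicted = 185  # computationally predicted associations
strong = 22      # with strong experimental support
weaker = 10      # with weaker yet significant support

supported = strong + weaker

print(f"Strong support: {strong / predicted:.1%}")    # Strong support: 11.9%
print(f"Any support:    {supported / predicted:.1%}")  # Any support:    17.3%
```

In other words, roughly one in six predictions currently has some experimental backing, which by the scientists' own caveat is a lower bound on how many are real.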

Keeping these caveats in mind, which associations survived the final cut? This is discussed next.

Discussion of selected associations.

The scientists next discussed some of the associations predicted by their model that are also backed up by experimental evidence. I discuss some of them in this section (the original article, cited below, is open access, and discusses more than I present herein).

One association links several side effects to nicotinate and nicotinamide metabolism. The scientists' model (and experimental evidence) ties three drug side effects to the manipulation of this biochemical pathway: cirrhosis (liver destruction), fibrosis (organ scarring), and ascites (fluid buildup in the abdomen).

These three side effects are clearly related to one another. Drugs which target nicotinate and nicotinamide metabolism end up causing similar side effects.

Another association is between Parkinson's disease and the pyruvate metabolism pathway. The scientists' model suggests that 33 drugs used to treat Parkinson's disease also collectively interact with 15 proteins involved in pyruvate metabolism.

This adds to evidence that pyruvate deficiency is correlated with the progression of Parkinson's disease. Perhaps new drugs designed to treat Parkinson's disease should include a specific intent to modulate pyruvate metabolism.

Final comments.

The scientists performed multiple tests to demonstrate that the accurate associations predicted by their model were very unlikely to be the result of random chance. This suggests that their model has actual predictive power.
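I won't reproduce the scientists' actual tests here, but the general idea of ruling out random chance can be illustrated with a generic Monte Carlo randomization test: draw random sets of associations of the same size as the predicted set, and count how often they match the literature at least as well. Everything in this sketch (the function name, the toy data, the choice of test) is my own assumption and not the authors' procedure.

```python
import random


def randomization_p_value(predicted, supported, side_effects, pathways, n_draws=2000, seed=0):
    """Estimate how often random associations match the supported set as well as the predictions do."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed_hits = len(predicted & supported)
    all_pairs = [(s, p) for s in side_effects for p in pathways]
    as_good = 0
    for _ in range(n_draws):
        random_pairs = set(rng.sample(all_pairs, len(predicted)))
        if len(random_pairs & supported) >= observed_hits:
            as_good += 1
    # Add-one correction keeps the estimate above zero.
    return (as_good + 1) / (n_draws + 1)


# Toy universe: 20 side effects x 20 pathways = 400 possible associations.
side_effects = [f"s{i}" for i in range(20)]
pathways = [f"p{i}" for i in range(20)]

# Made-up "experimentally supported" associations (20 of the 400).
supported = {(f"s{i}", f"p{i}") for i in range(20)}

# Made-up predictions: 10 associations, 6 of which are supported.
predicted = {(f"s{i}", f"p{i}") for i in range(6)} | {(f"s{i}", f"p{i + 1}") for i in range(4)}

p_value = randomization_p_value(predicted, supported, side_effects, pathways)
print(p_value < 0.05)  # True
```

In this toy setup a random set of 10 associations is expected to contain only about 0.5 supported ones, so hitting 6 by luck is very unlikely.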

The practical, unambiguous utility of the model is limited by the availability (and accuracy) of experimental data. Nevertheless, their model can be utilized to choose specific experiments useful for drug discovery.

This research development could lower the cost of drug discovery, and lead to more effective treatments for a wide range of medical conditions. I hope relevant pharmaceutical scientists make note of this development, and use it as inspiration in their drug development efforts.

NOTE: The scientists' research was funded by the Bill and Melinda Gates Foundation and the Natural Sciences and Engineering Research Council.

For more information:
Wallach, I., Jaitly, N., & Lilien, R. (2010). A Structure-Based Approach for Mapping Adverse Drug Reactions to the Perturbation of Underlying Biological Pathways. PLoS ONE, 5 (8). DOI: 10.1371/journal.pone.0012063