Extend GIP to new drug compounds, that is, compounds for which no interaction is known

To this aim, we propose a simple weighted nearest neighbor (WNN) procedure. For a new drug compound, its chemical similarity to the known drug compounds and their corresponding interaction profiles are used to infer an interaction score profile for that compound.

To illustrate how this prior introduces a bias on the results, we consider the following simple procedure, called Const. Const constructs an all ‘1’s profile for the drug compounds or target proteins with only one interaction. We can incorporate Const into GIP in the same way as WNN, giving the Const-GIP method. With this method, all possible interactions for drugs/targets with only one interaction will be ranked before interactions of drugs/targets that also have other interactions. Essentially, for such interactions the method only has to do half the work, since the fact that the drug/target is correct can be known with certainty. In real-world situations there are also drug compounds that interact with none of the targets under consideration, and vice versa, which would invalidate the Const-GIP method.

This resulted in a similarity matrix for the drug compounds, denoted by Sc, which represents the chemical space. Amino acid sequences of the target proteins were obtained from the KEGG GENES database. Sequence similarity between proteins was computed using a normalized version of the Smith-Waterman score, resulting in a similarity matrix denoted by Sg, which represents the genomic space.

We could test the prediction capability of the proposed methods on unknown drug-target interactions of the given network using a previously adopted procedure. In that procedure, the complete interaction network for each dataset is used as training data, and the predictions on non-interacting pairs in the training set are ranked by their interaction scores. However, since each drug compound or target in the training set has at least one interaction, we do not need to use WNN, and the results are those of GIP.
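The ranking step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `top_candidates`, the toy matrices, and the score values are all hypothetical, and in practice the scores would come from the trained GIP model.

```python
def top_candidates(scores, known, k=5):
    """Rank drug-target pairs that are not known to interact by score.

    scores[d][t] is a predicted interaction score for drug d and target t
    (hypothetical values here); known[d][t] is 1 for a known interaction
    and 0 otherwise.
    """
    candidates = []
    for d, row in enumerate(scores):
        for t, score in enumerate(row):
            # Only pairs without a known interaction are candidates.
            if known[d][t] == 0:
                candidates.append((d, t, score))
    # Highest score first; ties keep their original order.
    candidates.sort(key=lambda c: c[2], reverse=True)
    return candidates[:k]


# Toy example: 2 drugs x 3 targets.
known = [[1, 0, 0],
         [0, 1, 0]]
scores = [[0.9, 0.4, 0.1],
          [0.7, 0.8, 0.3]]
top = top_candidates(scores, known, k=2)
```

Here `top` holds the two highest-scoring pairs among those with no known interaction, which is the kind of list one would then check against external databases.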
We report the top five predicted interactions for each dataset in Table 3.

In general, these results indicate that cross validation should be applied and interpreted with care. Note that the cross validation procedure used in the comparison with KBMF2K is also positively biased, since we know that each ‘new’ drug compound has at least one interaction, but there the bias is much smaller.

In this work, we proposed a simple yet effective procedure to predict interaction profiles for new drug compounds and showed how it can be directly integrated into a recent machine learning algorithm for the in silico prediction of drug-target interactions. The novelty of our approach lies in the use of a weighted nearest neighbor procedure that infers a profile for a drug compound from the interaction profiles of the compounds in the training data, where each profile is weighted using the chemical similarity between drug compounds combined with a simple decay scheme.
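A weighted nearest neighbor inference of this kind might look like the sketch below. The text does not give the exact weighting formula, so the scheme used here (known compounds ranked by descending similarity, with weight `decay ** rank`) is an assumption for illustration; the function name `wnn_profile` and the parameter `decay` are likewise hypothetical.

```python
def wnn_profile(similarities, profiles, decay=0.5):
    """Infer an interaction score profile for a new drug compound.

    similarities[i] is the chemical similarity between the new compound
    and known compound i; profiles[i] is the 0/1 interaction profile of
    known compound i over the targets.
    ASSUMPTION: the weight of the r-th most similar compound is
    decay ** r (r = 0 for the nearest neighbor).
    """
    if len(similarities) != len(profiles):
        raise ValueError("need one similarity per known profile")
    if not profiles:
        return []
    # Indices of known compounds, most similar first.
    order = sorted(range(len(similarities)),
                   key=lambda i: similarities[i], reverse=True)
    inferred = [0.0] * len(profiles[0])
    for rank, i in enumerate(order):
        weight = decay ** rank
        for t, y in enumerate(profiles[i]):
            inferred[t] += weight * y
    return inferred


# Toy example: 3 known compounds, 3 targets.
similarities = [0.2, 0.9, 0.5]
profiles = [[1, 0, 0],
            [0, 1, 0],
            [0, 1, 1]]
inferred = wnn_profile(similarities, profiles, decay=0.5)
```

With `decay=0.5`, the nearest compound contributes with weight 1, the next with 0.5, and the third with 0.25, so targets shared by the most similar compounds receive the highest inferred scores. The resulting profile can then stand in for the missing profile of the new compound.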
