Differences

This shows you the differences between two versions of the page.

--- subsets [2016/12/27 20:58] – rkiss
+++ subsets [2016/12/27 21:07] – rkiss
@@ Line 28: / Line 28: @@
 The [[https://mcule.com/database/|downloadable files]] list the compounds in diversity order, i.e. the first N compounds of a file are its N most dissimilar compounds. To narrow a set down further, keep only the first X compounds of the file; they will be the most dissimilar ones.
-Structural similarity was measured by the Tanimoto coefficient (TC) between FP2 linear fingerprints generated by OpenBabel. A combination of the following algorithms was applied to extract the most dissimilar subsets:
+Structural similarity was measured by the Tanimoto coefficient (TC) between FP2 linear fingerprints generated by [[http://jcheminf.springeropen.com/articles/10.1186/1758-2946-3-33|OpenBabel]]. A combination of the following algorithms was applied to extract the most dissimilar subsets:
   * where needed, sphere exclusion was used first to eliminate highly similar compounds and reduce the input size
   * then [[diversitysel|stepwise elimination]] was applied to obtain the most dissimilar compounds
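
The two steps above can be illustrated with a minimal sketch. It is not the code used to build the subsets: fingerprints are modelled as plain sets of on-bit indices rather than real FP2 fingerprints, the function names and the 0.7 threshold are hypothetical, and the elimination rule (drop one member of the most similar remaining pair) is only an assumption about how a stepwise elimination could work. The linked stepwise elimination page describes the actual procedure.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0  # assumed convention for two empty fingerprints
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def sphere_exclusion(fps, threshold):
    """Keep a compound only if its TC to every already kept compound is below threshold."""
    kept = []
    for i, fp in enumerate(fps):
        if all(tanimoto(fp, fps[j]) < threshold for j in kept):
            kept.append(i)
    return kept


def stepwise_elimination(fps, indices, n):
    """Assumed rule: drop one member of the most similar remaining pair until n are left."""
    remaining = list(indices)
    while len(remaining) > n:
        _, j = max(
            combinations(remaining, 2),
            key=lambda pair: tanimoto(fps[pair[0]], fps[pair[1]]),
        )
        remaining.remove(j)
    return remaining


# Toy fingerprints (hypothetical data, not real compounds).
fps = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 4, 5}),   # near-duplicate of compound 0 (TC = 0.8)
    frozenset({1, 2, 10, 11}),
    frozenset({20, 21, 22}),
    frozenset({1, 2, 10, 12}),
]

kept = sphere_exclusion(fps, threshold=0.7)   # removes the near-duplicate
diverse = stepwise_elimination(fps, kept, n=3)
print(kept, diverse)
```

In this toy run the sphere exclusion pass discards the near-duplicate compound 1, and the elimination pass then removes one compound of the most similar remaining pair (2 and 4). The pairwise search is quadratic per step, which is why a cheap pre-filter such as sphere exclusion helps on large inputs.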