subsets

Differences

This shows you the differences between two versions of the page.

subsets [2016/12/27 22:07]
rkiss
subsets [2016/12/27 22:19]
rkiss
Line 28:
 The [[https://mcule.com/database/|downloadable files]] contain the compounds in diversity order i.e. the first N compounds represent the most dissimilar N compounds. This means that if you want to further narrow down the number of compounds you can keep the first X compounds of the files and they will be the most dissimilar ones.
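Because the files are already in diversity order, narrowing a subset is just a matter of truncating it. A minimal sketch of that step, assuming a plain-text file with one compound record per line (the record layout and the ''mol-N'' identifiers below are hypothetical, not taken from the actual downloads):

```python
from itertools import islice


def keep_first(lines, n):
    """Return the first n non-empty records of a diversity-ordered file."""
    # Skip blank lines so that they do not count towards n.
    records = (line.rstrip("\n") for line in lines if line.strip())
    return list(islice(records, n))


# Hypothetical diversity-ordered records: "SMILES identifier" per line.
example = [
    "CCO mol-1\n",
    "c1ccccc1 mol-2\n",
    "\n",
    "CC(=O)O mol-3\n",
    "CCN mol-4\n",
]
subset = keep_first(example, 3)
print(subset)
```

Any iterable of lines works here, so an open file handle can be passed in place of the list; multi-line record formats would need a different reader.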
  
-Structural similarity was measured by Tanimoto coefficient (TC) between FP2 linear fingerprints generated by [[http://jcheminf.springeropen.com/articles/10.1186/1758-2946-3-33|OpenBabel]]. The combinations of the following algorithms were applied to extract the most dissimilar subsets
+Structural similarity was measured by Tanimoto coefficient (TC) between FP2 linear fingerprints generated by [[http://jcheminf.springeropen.com/articles/10.1186/1758-2946-3-33|OpenBabel]]. The combinations of the following algorithms were applied to extract the most dissimilar compounds
-  * we used sphere exclusion to eliminate highly similar compounds to reduce the input size where needed
+  * sphere exclusion: to quickly eliminate highly similar compounds to reduce the input collection to a manageable size for the subsequent [[diversitysel|stepwise elimination]] algorithm
-  * then [[diversitysel|stepwise elimination]] was applied to obtain the most dissimilar compounds
+  * [[diversitysel|stepwise elimination]]: a more thorough algorithm that eliminates one molecule of the most similar molecule pairs
  
-In sphere exclusion we used the stock compounds first as "centers" for the elimination of redundant compounds, and we retained the stock compounds during the stepwise elimination.
+In sphere exclusion we used the in-stock compounds first as "centers" and eliminated their most similar analogs, while during [[diversitysel|stepwise elimination]] we retained the in-stock compounds from the most similar molecule pairs.
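
The two-stage selection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used to build the subsets: fingerprints are modelled as plain Python sets of on-bit indices rather than real FP2 fingerprints, the similarity threshold, compound IDs and target size are made up, and the stepwise elimination shown is a naive pairwise search that may differ from the algorithm on the linked page.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def sphere_exclusion(fps, in_stock, threshold):
    """Pick centers, visiting in-stock compounds first, and skip every
    compound whose TC to an already chosen center reaches the threshold."""
    order = sorted(fps, key=lambda cid: (cid not in in_stock, cid))
    centers = []
    for cid in order:
        if all(tanimoto(fps[cid], fps[c]) < threshold for c in centers):
            centers.append(cid)
    return centers


def stepwise_elimination(fps, ids, in_stock, target_size):
    """Repeatedly find the most similar pair and drop one of its members,
    keeping the in-stock member when only one of the two is in stock."""
    kept = list(ids)
    while len(kept) > target_size:
        a, b = max(
            combinations(kept, 2),
            key=lambda pair: tanimoto(fps[pair[0]], fps[pair[1]]),
        )
        drop = a if (a not in in_stock and b in in_stock) else b
        kept.remove(drop)
    return kept


# Hypothetical toy data: compound ID -> set of on-bits.
fps = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},
    "C": {1, 2, 3, 4, 5},
    "D": {10, 11, 12},
    "E": {10, 11, 13},
    "F": {20, 21},
}
in_stock = {"B", "E"}

centers = sphere_exclusion(fps, in_stock, threshold=0.8)
selected = stepwise_elimination(fps, centers, in_stock, target_size=3)
print(centers)
print(selected)
```

In this toy run sphere exclusion removes only "C" (TC 0.8 to the in-stock center "B"), and stepwise elimination then drops "A" and "D", the not-in-stock members of the two most similar remaining pairs. The pairwise search is cubic in the number of compounds, which is why a cheap pre-filter such as sphere exclusion is useful before it.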
  
 ===== Subsets =====
subsets.txt · Last modified: 2016/12/27 22:23 by rkiss