Preselected subsets of the Mcule database
We provide Ro5 (rule-of-five) and Ro3 (rule-of-three) subsets that can serve as a starting point for your virtual screening projects if you don't want to screen the full Mcule database (currently 36M compounds). Structurally diverse subsets of the drug-like and fragment-like parts were generated to represent the same chemical space with a smaller number of compounds.
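As a rough illustration of what Ro5 and Ro3 membership means, the sketch below checks precomputed descriptors against the commonly quoted cutoffs. The `Descriptors` container, the function names and the exact thresholds are assumptions made for this example; this page does not state which descriptor calculator or cutoff values were used to build the Mcule subsets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties of one compound (hypothetical container)."""
    mol_weight: float   # molecular weight in Da
    logp: float         # calculated octanol-water partition coefficient
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def passes_ro5(d: Descriptors) -> bool:
    """Rule of five in its strict form: no violation is tolerated."""
    return (d.mol_weight <= 500 and d.logp <= 5
            and d.h_donors <= 5 and d.h_acceptors <= 10)


def passes_ro3(d: Descriptors) -> bool:
    """Rule of three for fragments, using the commonly quoted core cutoffs."""
    return (d.mol_weight <= 300 and d.logp <= 3
            and d.h_donors <= 3 and d.h_acceptors <= 3)


# Made-up descriptor values, for illustration only.
fragment_like = Descriptors(mol_weight=250.0, logp=2.1, h_donors=1, h_acceptors=3)
drug_like = Descriptors(mol_weight=420.0, logp=4.2, h_donors=2, h_acceptors=6)

print(passes_ro5(fragment_like), passes_ro3(fragment_like))  # True True
print(passes_ro5(drug_like), passes_ro3(drug_like))          # True False
```

With these cutoffs every compound that passes the Ro3 check also passes the Ro5 check, so the fragment-like part is a subset of the drug-like part.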
Availability
The subsets can be
- downloaded for free in SMILES and SDF file formats from our download page
- selected as the input collection for online screening on mcule.com if you have a free account
Diversity selection
The Mcule database contains ~5.7M stock compounds and ~30.3M virtual compounds. Diversity selection was carried out so as to prefer stock compounds over virtual ones. The aim is to use virtual compounds only to represent those parts of the chemical space that stock compounds do not cover.
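One simple way to get this stock-first behaviour is a greedy sphere-exclusion pass that visits all stock compounds before any virtual compound. The sketch below is not the Mcule method, which is not described on this page; it is a textbook technique shown under stated assumptions. Fingerprints are modelled as plain sets of on-bits, and the names `tanimoto`, `select_diverse` and the `max_similarity` value of 0.6 are hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def select_diverse(stock, virtual, max_similarity=0.6):
    """Greedy sphere-exclusion pick that visits stock compounds first.

    stock, virtual: lists of (compound_id, fingerprint) pairs.
    A compound is kept only if its similarity to every compound kept so
    far is below max_similarity. Because stock compounds are visited
    first, a virtual compound is kept only where nothing similar has
    already been selected.
    """
    selected = []
    for compound_id, fp in list(stock) + list(virtual):
        if all(tanimoto(fp, kept_fp) < max_similarity for _, kept_fp in selected):
            selected.append((compound_id, fp))
    return [compound_id for compound_id, _ in selected]


# Toy data: S2 and V1 are close analogues of S1, V2 is unrelated.
stock = [("S1", frozenset({1, 2, 3, 4})), ("S2", frozenset({1, 2, 3, 4, 5}))]
virtual = [("V1", frozenset({1, 2, 3, 4, 6})), ("V2", frozenset({10, 11, 12}))]

print(select_diverse(stock, virtual))  # ['S1', 'V2']
```

The virtual analogue V1 is dropped because the stock compound S1 already represents its neighbourhood, while V2 is kept because no stock compound is similar to it. The pairwise comparison is quadratic in the number of selected compounds, so this form only suits small collections; a database of this size would need an indexed or clustered variant.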
We've developed a method for large-scale diversity selection. The selection is carried out diverse subsets can be extracted while we