If you have access to a Unix-based system with the GNU split command, you can use the commands below to split large files into smaller chunks.
To split a smi.gz / smiles.gz file into multiple uncompressed chunks, use a command like this:
gzip -dc your.smi.gz | split --verbose --lines=<size> --numeric-suffixes --suffix-length=<suffix_length> --additional-suffix='.smi' - your__
For example, to split the Mcule Purchasable (Full) smi.gz file into uncompressed chunks of 1M lines each, use:
gzip -dc mcule_purchasable_full_180817.smi.gz | split --verbose --lines=1000000 --numeric-suffixes --suffix-length=10 --additional-suffix='.smi' - mcule_purchasable_full_180817__
To split a smi.gz / smiles.gz file into multiple gzip compressed chunks, use a command like this:
gzip -dc your.smi.gz | split --verbose --lines=<size> --numeric-suffixes --suffix-length=<suffix_length> --additional-suffix='.smi' --filter='gzip -9 > $FILE.gz' - your__
For example, to split the Mcule Purchasable (Full) smi.gz file into gzip compressed chunks of 1M lines each, use:
gzip -dc mcule_purchasable_full_180817.smi.gz | split --verbose --lines=1000000 --numeric-suffixes --suffix-length=10 --additional-suffix='.smi' --filter='gzip -9 > $FILE.gz' - mcule_purchasable_full_180817__
If you have pigz installed on your system, you can replace gzip with pigz in the commands above to speed up the process, especially when you want compressed chunks.
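The compressed-chunk workflow above can be tried end to end on a small file before running it on a full export. The following is a minimal sketch, assuming GNU split is installed. The file name sample.smi.gz, the sample__ prefix and the generated lines are hypothetical stand-ins. It uses pigz when available and gzip otherwise, writes each chunk to $FILE.gz so that chunk names end in .smi.gz, and checks that the chunks hold as many lines as the input.

```shell
#!/bin/sh
set -eu

# Use pigz when it is installed, otherwise fall back to gzip.
if command -v pigz >/dev/null 2>&1; then
    GZ=pigz
else
    GZ=gzip
fi

# Work in a scratch directory; it is left in place for inspection.
workdir=$(mktemp -d)
cd "$workdir"

# Build a small stand-in input: 25 lines of "SMILES<TAB>ID" (made-up data).
i=1
while [ "$i" -le 25 ]; do
    printf 'CCO\tID-%s\n' "$i"
    i=$((i + 1))
done | "$GZ" -c > sample.smi.gz

# Split into compressed chunks of 10 lines each.
# split sets $FILE to the chunk name (including the .smi suffix) for the filter.
"$GZ" -dc sample.smi.gz | split --lines=10 --numeric-suffixes --suffix-length=3 \
    --additional-suffix='.smi' --filter="$GZ -9 > \$FILE.gz" - sample__

# Compare the line counts of the input and of all chunks taken together.
original=$(( $("$GZ" -dc sample.smi.gz | wc -l) ))
chunked=$(( $(cat sample__*.smi.gz | "$GZ" -dc | wc -l) ))
chunks=$(( $(ls sample__*.smi.gz | wc -l) ))

echo "chunks=$chunks original=$original chunked=$chunked"
[ "$original" -eq "$chunked" ]
```

For a real file, only the input name, the prefix and the --lines value would need to change. The final check is a cheap way to confirm that no lines were lost during splitting.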
We won’t delete any of your search/screen results. However, you can delete them if you’d like by selecting the collection under the “Collections” tab and clicking on “Delete”.
Under the “Collections” tab. All your previous search/screen results are listed there, sorted by their date of creation.
We don’t limit the number of searches and screens you can submit. However, at most five of your submitted searches/screens will run in parallel.
The Free package limits each of your collections to a maximum of 10,000 entries. However, if you subscribe to any for-fee package, you will be able to create collections of any size. Note: when executing a screening workflow, the collection limit only affects the size of the final, saved collection, i.e. the number of molecules passing from one workflow step to the next is not limited.
You can check the available tools, features and limits of the Free package.
If you have already executed the search/screen and you are waiting for your results, your results and your query have already been saved at mcule.com. Data will not be lost if your browser crashes. You only need to open a new browser or fix the problem and go back to mcule.com. You will find all your previous searches/screens under the “Collections” tab.
Yes. After submitting any search/screen, you can close the browser and shut down your computer if you like. You can go back to mcule.com, log into your account, and browse the results or check the status of your searches/screens at any time. For longer calculations, you will also receive a pop-up (next to your user name in the upper right corner) and an email notification when the search/screen is finished.
Compounds, structures, conformers and products are the currently available entry types that a collection can contain. These entry types store different levels of information. Compounds are tautomer and protonation state independent. Structures represent molecules in a specific tautomer and protonation state. Conformers are structures with 3D coordinates. Products are compounds with associated chemical supplier data. If a workflow step runs on or outputs entries at a different entry level than those in the input collection, the entry level changes. For example, the input collection for Docking (Vina) might contain structures, but since the output is a set of conformers (binding poses), the output collection will contain conformers. To learn more about the different entry levels, click here (link to regsys, entry levels).
Under the “find chemicals” tab you can run simple searches (Exact, Similarity and Substructure) only. These are useful for users who are looking for specific compounds or who just want to run some basic searches very frequently. Under the “workflow builder” tab, more complex, multistep virtual screens can be built and run. Available tools under the workflow builder tab include Docking (Vina), Diversity selection, SMARTS query filter, FTrees Visual Similarities, etc.
By default, all your collections are private, which means that no one except you can access your search/screen queries and results. You can, however, change the “Privacy level” of any of your collections and share them with your colleagues or make them fully public. To change the “Privacy level” of a collection, click on the collection (under the “Collections” tab) and then click on the “Edit” button on the right. You can learn more about the Privacy level here. Note that your queries and workflows will always be private, independent of the privacy level of the result collection (they cannot be seen by other users).
All communication and data transfer take place over the secure HTTPS protocol.
Extra tools, like FTrees Visual Similarities or ChemAxon Properties, are provided on a subscription basis. To get access to these tools, go to our Pricing page. Subscriptions can be purchased with just a few clicks. To learn more about our subscription packages.
Yes. Typically, different prices apply to all subscription packages. To see the academic prices, go to our Pricing page and check the “Academic pricing” box at the top of the page. Note that academic subscriptions need to be verified and approved by mcule, which might take 1-2 extra business days before you can get access to the tools and features available in the subscription package.
Yes. Please contact us (email@example.com) and let us know more about your needs. We can prepare a customized solution for you.
This is a very rare event, but it is possible that we encounter an “Error” during your search/screen. We recommend rerunning the search/screen once. If your new search/screen also ends in an “Error” state, it might be a permanent error, and we have to check and fix the issue. We usually get notified about such errors automatically, but it is always a good idea to send a report to firstname.lastname@example.org about this event. We typically try to fix the issue as soon as possible and get back to you when the problem has been fixed.
The mcule database is a set of high quality molecules that have been processed by the rigorous molecule registration system of mcule. The Purchasable compounds collection contains more than 4 million compounds that are available for purchase. This is set by default as the input collection for your searches and screens. Other collections (made public by other mcule users) can also be set as the input collection. We also plan to introduce further collections in the near future (NCI and ChEMBL).
Click on “Input collection” at the top under the “find chemicals” or “workflow builder” tabs.
Small molecule upload will be introduced soon. Target molecules for docking can be uploaded by clicking on “Upload a file” in the “Docking (Vina)” workflow step.
Click on “Visualize pose” under the docking scores in Table view mode.
Collections are fundamental units of the mcule system. They can serve as input for searches and screens (e.g. the Purchasable compounds collection). The results of any searches or screens are collections themselves; users can use such result collections as input for further searches and screens. To learn more about collections, see Collection management.
This can happen for at least two reasons. Either you have reached your collection size limit (a maximum of 10,000 entries per collection is allowed in the Free package), or you are trying to add entries to a search/screen result collection. Entries can’t be manually added to search/screen result collections, to keep them consistent with the search/screen query.
Yes. Queries (search/screen inputs) are stored together with their results. If you click on one collection (under the “Collections” tab) containing the results of one of your previous searches/screens and click on the “Display query” button, your query will be displayed.
Any collection can be exported in multiple formats (e.g. SDF). Click on the collection you would like to export under the “Collections” tab, and click on the “Export” button in the upper right corner. To learn more about collection export, click here (link to collection export).
Collections can be exported as SDF, SMILES, InChI, InChIKey and mcule ID. Contact us if you need any other file type. To learn more about collection export.
To make use of mcule’s collection management system, it is necessary to register for an individual account. By registering for a free account you will not only get access to some searching and screening tools and the mcule database, but you will also be able to use our data management system, which stores all your molecule collections, search and screen results and queries, your available features, tools, limits and notifications.
You don’t need to pay anything for registration. It is free. After registration you will get immediate access to the Free package. Subscription packages are also available for extra tools and features.
We don’t plan to remove any of the currently available tools and features from the Free package. On the contrary, we plan to add new features continuously.
Yes, please contact us (email@example.com) and describe what you need in order to evaluate the tool you are interested in.
To order a single compound, go to its index page (click on its 2D representation or its mcule ID) and click on the orange “Quote” button on the right. To order a collection of compounds, open the collection and click on the orange “Quote” button on the right. For more information about the ordering process, click here.
Yes. Click on “Change table display options” in Table view and select those properties you would like to add.
In the Docking (Vina) workflow step box, you can click on the “Select target” button and select from the ~10,000 prepared target structures (integrated from scPDB(link)). If you can’t find your target structure there, you can check the Protein Data Bank website (http://www.rcsb.org) and download more target structures in PDB format. These PDB files can be uploaded by clicking on “Upload a file” in the Docking (Vina) workflow step box. It is recommended to run an automatic preparation on all unprepared PDB files, which can be selected in the “Upload docking target” dialog box.
There are several ways of doing this. Here is one general example. Go to “workflow builder”, set a “Property filter” (RO5 max 1). If there is a reference ligand available that is a known inhibitor of your protein, you can use it as a query molecule for “FTrees Visual Similarities” or “Similarity search”. If there is an available crystal structure for your protein, you can check if it can be selected from the ~10,000 prepared target structures in the Docking (Vina) workflow step. To make sure you don’t exceed your limits, use the “Sampler filter” or “Diversity selection” to reduce the number of molecules to the required number. Click on “Screen” to run your screening workflow. The results will be displayed as they are generated. To export the results, click on “Export”. To order the best hits, click on “Quote”. If you need help, contact us (firstname.lastname@example.org).
To check the in-stock amounts of the available products for a compound, you can go to the index page of the compound and click on the “Product availability” tab. All products will be listed together with the most up-to-date in-stock amount information. If you want to order the compound, click on the orange “Quote” button on the right.
Yes. Draw explicit hydrogens to satisfy all free valences at the positions where substitutions are not allowed.
It is possible that some input molecules contained unknown or undefined stereo centers. These cannot be directly processed by docking. Input molecules for Docking (Vina) should have 3D coordinates, which means that a single stereoisomer needs to be generated, and if no such stereoisomer is present in the mcule database, it needs to be registered into the mcule database on the fly. Note that the mcule IDs of these on-the-fly generated stereoisomers will differ from the mcule ID of the original input molecule. You can check the original molecule for each conformer when browsing the result collection in Table view. Click on “Change table display options” and select “Origin” from the search results. The “Origin” column will contain the mcule IDs of the original input molecules.
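If you prefer the command line to the Protein Data Bank website, a PDB file can also be fetched with curl before uploading it. The following is a sketch only: the files.rcsb.org download URL pattern is an assumption about the RCSB file service and is not taken from this page, the helper names pdb_url and fetch_pdb are hypothetical, and 1ABC is a placeholder entry ID.

```shell
#!/bin/sh
# Sketch only: the URL pattern is an assumption about the RCSB file
# download service, and the function names are made up for this example.

# Print the assumed download URL for a four-character PDB entry ID.
pdb_url() {
    printf 'https://files.rcsb.org/download/%s.pdb\n' "$1"
}

# Download one entry into the current directory as <ID>.pdb.
# --fail makes curl return a non-zero status on HTTP errors.
fetch_pdb() {
    curl --fail --silent --show-error --location \
        --output "$1.pdb" "$(pdb_url "$1")"
}

# Usage (requires network access; 1ABC is a placeholder ID):
#   fetch_pdb 1ABC
```

The downloaded file can then be uploaded through “Upload a file” as described above.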