faq
Differences
This shows you the differences between two versions of the page.
faq [2018/12/19 01:40] – flack | faq [2018/12/21 14:00] – [Can I get a Mcule database SDF file in smaller chunks?] flack | ||
---|---|---|---|
Line 18: | Line 18: | ||
To split a smi.gz / smiles.gz file into multiple **gzip compressed chunks** use a command like this: | To split a smi.gz / smiles.gz file into multiple **gzip compressed chunks** use a command like this: | ||
< | < | ||
- | gzip -dc your.smi.gz | split --verbose --lines=< | + | gzip -dc your.smi.gz | split --verbose --lines=< |
</ | </ | ||
For example, to split the Mcule Purchasable (Full) smi.gz file into **gzip compressed chunks** of 1M lines each, use: | For example, to split the Mcule Purchasable (Full) smi.gz file into **gzip compressed chunks** of 1M lines each, use: | ||
< | < | ||
- | gzip -dc mcule_purchasable_full_180817.smi.gz | split --verbose --lines=1000000 --numeric-suffixes --suffix-length=10 --additional-suffix=' | + | gzip -dc mcule_purchasable_full_180817.smi.gz | split --verbose --lines=1000000 --numeric-suffixes --suffix-length=10 --additional-suffix=' |
</ | </ | ||
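A minimal sketch of the same idea as a reusable shell function, assuming GNU coreutils `split`. The function name `split_smi_gz`, its argument order, and the use of `--filter` to compress each chunk are assumptions made for illustration and are not taken from the page.

```shell
# Hypothetical helper: split a gzipped SMILES file into gzip compressed chunks.
# Assumes GNU split (needed for --filter and --additional-suffix).
split_smi_gz() {
    input=$1    # gzipped SMILES file, e.g. your.smi.gz
    prefix=$2   # prefix for the chunk file names
    lines=$3    # number of lines (molecules) per chunk

    # Decompress to a stream, cut it every $lines lines, and let split pipe
    # each chunk through gzip. split sets $FILE to the chunk's output name.
    gzip -dc "$input" |
        split --lines="$lines" --numeric-suffixes --suffix-length=10 \
              --additional-suffix='.smi.gz' \
              --filter='gzip > "$FILE"' - "$prefix"
}
```

Calling `split_smi_gz your.smi.gz your__ 1000000` would write chunks named `your__0000000000.smi.gz`, `your__0000000001.smi.gz`, and so on, each holding at most 1,000,000 lines.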
Line 33: | Line 33: | ||
If you have access to a Unix-based system with awk, you can use the commands below to split large, gzipped SDF files into smaller chunks. | If you have access to a Unix-based system with awk, you can use the commands below to split large, gzipped SDF files into smaller chunks. | ||
- | To split an sdf.gz file into multiple uncompressed chunks, use a command like this: | + | To split an sdf.gz file into multiple **uncompressed chunks**, use a command like this: |
< | < | ||
gzip -dc your.sdf.gz | awk -v name=< | gzip -dc your.sdf.gz | awk -v name=< | ||
Line 40: | Line 40: | ||
Just replace your.sdf.gz with your filename, < | Just replace your.sdf.gz with your filename, < | ||
- | For example to split the Mcule Purchasable (Full) sdf.gz file into 1M uncompressed chunks use: | + | For example to split the Mcule Purchasable (Full) sdf.gz file into 1M **uncompressed chunks** use: |
< | < | ||
gzip -dc mcule_purchasable_full_180817.sdf.gz | awk -v name=mcule_purchasable_full_180817__ -v ext=sdf -v size=1000000 ' | gzip -dc mcule_purchasable_full_180817.sdf.gz | awk -v name=mcule_purchasable_full_180817__ -v ext=sdf -v size=1000000 ' | ||
Line 46: | Line 46: | ||
- | To split an sdf.gz file into multiple gzip compressed chunks, use a command like this: | + | To split an sdf.gz file into multiple **gzip compressed chunks**, use a command like this: |
< | < | ||
gzip -dc your.sdf.gz | awk -v name=< | gzip -dc your.sdf.gz | awk -v name=< | ||
Line 53: | Line 53: | ||
Just replace your.sdf.gz with your filename, < | Just replace your.sdf.gz with your filename, < | ||
- | For example to split the Mcule Purchasable (Full) sdf.gz file into 1M gzip compressed chunks use: | + | For example to split the Mcule Purchasable (Full) sdf.gz file into 1M **gzip compressed chunks** use: |
< | < | ||
gzip -dc mcule_purchasable_full_180817.sdf.gz | awk -v name=mcule_purchasable_full_180817__ -v ext=sdf.gz -v size=1000000 ' | gzip -dc mcule_purchasable_full_180817.sdf.gz | awk -v name=mcule_purchasable_full_180817__ -v ext=sdf.gz -v size=1000000 ' |
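The general approach of splitting an SDF stream on its `$$$$` record terminators can be sketched as follows. This is an illustrative sketch only: the function name `split_sdf_gz`, the six-digit chunk numbering, and the awk program itself are assumptions and do not reproduce the awk program used in the commands above.

```shell
# Hypothetical helper: split a gzipped SDF file into gzip compressed chunks.
# Each SDF record ends with a line starting with "$$$$".
split_sdf_gz() {
    input=$1   # gzipped SDF file, e.g. your.sdf.gz
    name=$2    # prefix for the chunk file names
    size=$3    # number of molecules (records) per chunk

    gzip -dc "$input" | awk -v name="$name" -v size="$size" '
        BEGIN { chunk = 0; count = 0; cmd = "" }
        {
            # Open a new gzip pipe when a chunk starts.
            if (cmd == "") {
                cmd = sprintf("gzip > \"%s%06d.sdf.gz\"", name, chunk)
            }
            print | cmd
        }
        /^\$\$\$\$/ {
            # End of one record: close the chunk once it holds size records.
            count++
            if (count == size) { close(cmd); cmd = ""; chunk++; count = 0 }
        }
        END { if (cmd != "") close(cmd) }
    '
}
```

For instance, `split_sdf_gz your.sdf.gz your__ 1000000` would write `your__000000.sdf.gz`, `your__000001.sdf.gz`, and so on. Closing each pipe before opening the next keeps only one gzip process and one output file open at a time.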
faq.txt · Last modified: 2024/04/09 08:33 by rkiss