BioSolveIT FTrees Visual Similarities

BioSolveIT's FTrees Visual Similarities is a highly efficient tool for scaffold hopping (i.e. identifying new molecular scaffolds while maintaining activity) and ligand-based virtual screening. Its underlying topological descriptor (the Feature Tree) captures the connectivity and physico-chemical properties of functional groups. This reduced graph representation preserves the pharmacophore characteristics of the ligands in a fuzzy way, which enables the identification of novel scaffolds. Moreover, the topological Feature Tree descriptor makes similarity assessment extremely fast compared to 3D approaches and avoids the uncertainties of 3D coordinate calculation. The optimum similarity of two descriptors is defined by an alignment, so an SAR (structure-activity relationship) may be readily detected. The output of FTrees Visual Similarities is not only a similarity score: the underlying substructure contributions can also be displayed with an intuitive visualization for each query-target pair.

You can find more information about FTrees here.

When to use

FTrees has been widely used in industrial and academic settings for many years. It has been shown to be highly successful in numerous projects by various customers in lead finding, high-throughput screening data analysis and general virtual screening applications.

How to use

It is important to emphasize that, because FTrees and traditional similarity searching with linear fingerprints calculate similarity in substantially different ways, the optimal similarity threshold for FTrees differs from that for the Similarity search filter. As a general guideline, we suggest focusing on molecules that score above the 0.85 threshold. It is also recommended to visually analyze the similarity between the query and the hits by clicking on the “Visualize similarity” link (under the FTrees score).

Options

Similarity threshold

The similarity threshold is the minimum FTrees similarity score between the query and target molecules (set to 0.85 by default).
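
As a small illustration of how the threshold acts on a result list, the sketch below keeps only the hits whose score reaches the threshold and sorts them best first. The function name `filter_hits`, the `(molecule_id, score)` tuple layout, and the inclusive comparison are assumptions made for this example; they are not part of the FTrees interface.

```python
from typing import Iterable, List, Tuple

DEFAULT_THRESHOLD = 0.85  # default value stated in the text above


def filter_hits(
    scored: Iterable[Tuple[str, float]],
    threshold: float = DEFAULT_THRESHOLD,
) -> List[Tuple[str, float]]:
    """Keep (molecule_id, score) pairs with score >= threshold, best first.

    Assumption: the threshold is inclusive, because the text calls it
    the *minimum* similarity score.
    """
    kept = [(mol_id, score) for mol_id, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    hits = [("mol-1", 0.91), ("mol-2", 0.62), ("mol-3", 0.85)]
    print(filter_hits(hits))
```

Lowering the threshold returns more, and more distant, hits, so visual inspection of each query-target pair becomes more important.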

Advanced options

Feature Tree generation limits and options:

Protonation by FTrees

Generate one standard protomer/tautomer of the query and target molecules by FTrees and use it for the calculations.

Exclude macrocycles

Exclude molecules containing macrocycles. FTrees similarity calculated for macrocycles is usually less informative, as large cycles might be converted into a single node.

Minimum macrocycle length (in bonds)

Only effective when “Exclude macrocycles” is selected (default: 10).

Exclude high node degree

Exclude Feature Trees containing nodes with a high degree of branching.

Maximum node degree

Only effective when “Exclude high node degree” is selected.

Minimum number of nodes

Maximum number of nodes

Volume mode

Defines the volume model for the “size” feature. Options: “Number of atoms”, “Van der Waals volume” (default).

Minimum subtree size

Minimum size of each subtree to initiate a recursive subdivision of the subtrees during tree comparison.

Shape descriptor

Controls the different shape descriptor components. Options (more than one may be selected):

Volume term (comparison of molecular volumes, estimated volume of a subtree) (default)
Maximum path length term (the length in bonds between the furthest exit and entry points to the subtree: gives a rough guide to the topological shape of a subtree)
Ring closure term (comparison of number of ring closures) (default)

Chemistry descriptor

Weights of the different chemistry descriptor components: H-bond donor, H-bond acceptor, ring closure, amide, aromaticity/delocalization, hydrophobic (default: 3:3:1:1:1:1)

Chemical similarity addon

Adjusts the chemical similarity between nodes where one node may have zero values in its interaction profile and the other not. If this parameter is set to zero, the chemical similarity between such nodes is always zero regardless of the non-empty profile. If this parameter is greater than zero the variation of the non-zero profile from a zero profile is taken into account (default: 0.1).

Minimum level 0 similarity

Level 0 (global) similarity threshold required to initiate a recursive subdivision of the subtrees during tree comparison. Note: during the level 0 similarity assessment, the query and the target molecules are each described by a single node, so this value is not equivalent to the FTrees score (level x similarity) (default: 0.1).

Van der Waals correction

Hydrogens at carbon atoms are not explicitly considered in some cases. Instead, the united atom radii model is used. For each hydrogen attached to a carbon, the van der Waals radius of the atom is incremented by the given value (default: 0.2)
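
The arithmetic of this correction can be sketched as follows. The function name `united_atom_radius` is hypothetical, and the base carbon radius of 1.7 used in the usage line is an illustrative value (the commonly quoted Bondi radius in ångström), not a value taken from FTrees. The correction is assumed to be expressed in the same length unit as the radius.

```python
def united_atom_radius(
    base_radius: float,
    attached_hydrogens: int,
    correction: float = 0.2,  # default increment stated in the text above
) -> float:
    """Return the radius of a carbon atom with its hydrogens folded in.

    Each attached hydrogen increments the radius by `correction`,
    as described for the united atom radii model.
    """
    if attached_hydrogens < 0:
        raise ValueError("attached_hydrogens must be non-negative")
    return base_radius + attached_hydrogens * correction


if __name__ == "__main__":
    # Methyl carbon: illustrative base radius 1.7, three hydrogens.
    print(united_atom_radius(1.7, 3))
```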

Minimum nodes per subtree

Minimum number of nodes of each subtree to initiate a recursive subdivision of the subtrees during tree comparison (default: 2)

Feature Tree matching:

Matching algorithm

Choose from three available Feature Tree matching algorithms:

Match search (heuristic algorithm, no inner NIL matches (subtree in one Feature Tree matches nothing in the other Feature Tree) are allowed, more suitable for virtual screening) (default)
Dynamic match search (systematic algorithm (best possible matching is always found), NIL matches (subtree in one Feature Tree matches nothing in the other Feature Tree) are allowed but penalized, more suitable for identifying subgraphs or for global alignments)
Split search (heuristic algorithm, allows unpenalized NIL matches (subtree in one Feature Tree matches nothing in the other Feature Tree) with no control over how large these can become)

Volume balance

The sizes of the two matched subtrees may not differ by more than the given factor (default: 2).
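
One way to read this criterion is as a ratio test on the two subtree sizes. The sketch below assumes that reading; the function name `within_volume_balance` is hypothetical, and the exact comparison used inside FTrees is not documented here.

```python
def within_volume_balance(size_a: float, size_b: float, factor: float = 2.0) -> bool:
    """Return True if the larger size is at most `factor` times the smaller.

    Assumption: "differ by a factor" is interpreted as the ratio
    larger / smaller, with the boundary value itself allowed.
    """
    if size_a <= 0 or size_b <= 0:
        raise ValueError("subtree sizes must be positive")
    larger, smaller = max(size_a, size_b), min(size_a, size_b)
    return larger / smaller <= factor


if __name__ == "__main__":
    print(within_volume_balance(10.0, 20.0))  # ratio 2.0, allowed
    print(within_volume_balance(10.0, 21.0))  # ratio 2.1, rejected
```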

Null match weighting factor

Parts of the molecules involved in null matches are only considered with the given weighting factor during normalization of the FTrees similarity (1: fully considered, 0: not considered, default: 0.3)

Shape weight

Weight factor of shape versus chemistry to compute overall similarity (default: 0.3)
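
To make the role of this weight concrete, the sketch below blends a shape similarity and a chemistry similarity linearly. The linear form is an assumption chosen for illustration; the text only states that the factor weights shape against chemistry, and the actual FTrees formula may differ. The function name `overall_similarity` is hypothetical.

```python
def overall_similarity(
    shape_similarity: float,
    chemistry_similarity: float,
    shape_weight: float = 0.3,  # default stated in the text above
) -> float:
    """Blend shape and chemistry similarity with a single weight.

    Assumption: overall = w * shape + (1 - w) * chemistry.
    """
    if not 0.0 <= shape_weight <= 1.0:
        raise ValueError("shape_weight must lie in [0, 1]")
    return shape_weight * shape_similarity + (1.0 - shape_weight) * chemistry_similarity


if __name__ == "__main__":
    print(overall_similarity(0.5, 1.0))
```

Under this reading, a weight of 0 scores on chemistry alone and a weight of 1 on shape alone.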

Maximum initial splits

Only effective when “Matching algorithm” is set to “Match search” (default: 10)

Maximum extension matches

Only effective when “Matching algorithm” is set to “Match search” (default: 3)

Extension match weighting factor

Weighting of extension match similarity against “rest” similarity in the scoring of extension matches (default: 0.8). Only effective when “Matching algorithm” is set to “Match search”.

Subgraph weight

Weighting of similarity scoring towards subgraph matchings or total Feature Tree matchings. If this parameter is zero, two trees of different sizes can never score a perfect match, i.e. the search targets complete Feature Tree matchings. If this parameter is one, a subtree perfectly matched onto the larger Feature Tree is allowed to score a perfect match, i.e. subgraph matching is allowed (default: 0). Only effective when “Matching algorithm” is set to “Dynamic match search”.

Gap penalty

Penalizes the insertion of a gap into a Feature Tree alignment. Only effective when “Matching algorithm” is set to “Dynamic match search”. Options: “No penalty”, “Penalize by length”, “Penalize by size” (default).

Inner gap penalty

Penalty for gaps in the interior of an alignment. Gap penalty will be multiplied by the given value (default: 0.1). Only effective when “Matching algorithm” is set to “Dynamic match search” and “Gap penalty” is set to “Penalize by length” or “Penalize by size”.

Outer gap penalty

Penalty for gaps at the edge of an alignment. Gap penalty will be multiplied by the given value (default: 0.01). Only effective when “Matching algorithm” is set to “Dynamic match search” and “Gap penalty” is set to “Penalize by length” or “Penalize by size”.
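
The inner and outer factors both scale a base gap penalty, which can be sketched as below. The function name `scaled_gap_penalty` and the idea of passing the base penalty in as a plain number are assumptions for this example; how FTrees derives the base penalty from gap length or size is not described here.

```python
def scaled_gap_penalty(
    base_penalty: float,
    interior: bool,
    inner_factor: float = 0.1,   # default "Inner gap penalty" from the text
    outer_factor: float = 0.01,  # default "Outer gap penalty" from the text
) -> float:
    """Multiply the base gap penalty by the inner or outer factor.

    `interior` is True for a gap inside the alignment and False for a
    gap at its edge.
    """
    factor = inner_factor if interior else outer_factor
    return base_penalty * factor


if __name__ == "__main__":
    print(scaled_gap_penalty(4.0, interior=True))
    print(scaled_gap_penalty(4.0, interior=False))
```

With the defaults, an interior gap costs ten times as much as an edge gap of the same base penalty.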

Match score

Adjust scoring scheme of matches. Only effective when “Matching algorithm” is set to “Dynamic match search”. Options: “Give all matches made a score of 1.0”, “Use the FTrees subtree similarity to score matches” (default)

Merge size

Adjust the number of Feature Tree nodes that are allowed to lie in one match (default: 4). Only effective when “Matching algorithm” is set to “Dynamic match search”.

Maximum number of splits

Maximum number of splits evaluated during one subdivision (default: 3). Only effective when “Matching algorithm” is set to “Split search”.

Split scoring

Balance criterion (default: 0.6). Only effective when “Matching algorithm” is set to “Split search”.
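
Several of the matching options above only take effect for one particular matching algorithm. The sketch below collects those dependencies, as stated in the option descriptions, into a lookup table. The names `DEPENDENT_OPTIONS` and `effective_options` are hypothetical helpers for this page, not part of FTrees.

```python
from typing import Dict, List

# Options that are only effective for a given matching algorithm,
# taken from the option descriptions above.
DEPENDENT_OPTIONS: Dict[str, List[str]] = {
    "Match search": [
        "Maximum initial splits",
        "Maximum extension matches",
        "Extension match weighting factor",
    ],
    "Dynamic match search": [
        "Subgraph weight",
        "Gap penalty",
        "Inner gap penalty",
        "Outer gap penalty",
        "Match score",
        "Merge size",
    ],
    "Split search": [
        "Maximum number of splits",
        "Split scoring",
    ],
}

# These two factors additionally require a gap penalty mode other than "No penalty".
GAP_FACTORS = {"Inner gap penalty", "Outer gap penalty"}


def effective_options(algorithm: str, gap_penalty: str = "Penalize by size") -> List[str]:
    """List the algorithm-specific options that are effective for a setup."""
    if algorithm not in DEPENDENT_OPTIONS:
        raise ValueError(f"unknown matching algorithm: {algorithm!r}")
    options = list(DEPENDENT_OPTIONS[algorithm])
    if gap_penalty == "No penalty":
        options = [name for name in options if name not in GAP_FACTORS]
    return options


if __name__ == "__main__":
    print(effective_options("Dynamic match search", gap_penalty="No penalty"))
```

Options not listed in the table, such as “Volume balance” or “Shape weight”, are described without such a restriction.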

Results

Molecules satisfying search criteria
FTrees score will be displayed in List and Table views
FTrees similarity can be visualized by clicking on the “Visualize similarity” link under the FTrees score

Access

FTrees Visual Similarities can be accessed by subscribing to the FTrees Visual Similarities package.
