antiSMASH Record Types

Catarina Loureiro edited this page Jan 21, 2025 · 2 revisions

BiG-SCAPE 2 can use any of the record types defined by antiSMASH (region, candidate cluster, protocluster and protocore) as the 'working' BGC record. For a full description of the region concept, see the antiSMASH Documentation.

When the protocluster or protocore record type is used, records within a 'chemical hybrid' or 'interleaved' candidate cluster are merged to avoid duplicating records. This behaviour can be adjusted through the MERGED_CAND_CLUSTER_TYPES setting in the config.yml file. Furthermore, in the BGC record visualization sections of the UI, the entire .gbk region is displayed: the domains belonging to the relevant record are shaded, while the remaining domains are semi-transparent (Fig. 2).

For every .gbk, BiG-SCAPE tries to extract the requested record type. If that type is not present, BiG-SCAPE falls back to the next higher-level record type: for example, if a proto_core feature is not present, BiG-SCAPE looks for a protocluster feature, and so on up the hierarchy. The record type hierarchy is: region > cand_cluster > protocluster > proto_core.
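
The fallback described above can be illustrated with a minimal Python sketch. This is not BiG-SCAPE's actual code: the function name `resolve_record_type` and the idea of passing in the set of feature types found in a .gbk are assumptions made for illustration. Only the hierarchy order is taken from the text.

```python
# Record types ordered from highest to lowest level, as listed in the text.
HIERARCHY = ["region", "cand_cluster", "protocluster", "proto_core"]


def resolve_record_type(requested, available):
    """Hypothetical helper: pick the record type to extract from one .gbk.

    requested: the record type asked for by the user.
    available: set of record feature types present in the .gbk.
    Returns the requested type if present, otherwise the nearest
    higher-level type that is present, or None if there is none.
    """
    start = HIERARCHY.index(requested)
    # Walk from the requested level upwards towards 'region'.
    for level in reversed(HIERARCHY[: start + 1]):
        if level in available:
            return level
    return None


# A .gbk that lacks proto_core features falls back to protocluster.
print(resolve_record_type("proto_core", {"region", "protocluster"}))
```

Note that in this sketch the search only moves upwards: a request for region never falls back to a lower-level type.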

Note: We advise new users to stick with either region or protocluster, as cand_cluster and proto_core require more careful consideration of input data features and interpretation of results. Please refer to the antiSMASH Documentation for a comprehensive description of how these records are generated and which features they include.

Sim-links and Topo-links

When using record types that can yield more than one record per region, two types of links can be made between records:

  • Sim-links, or similarity links, are the traditional BiG-SCAPE distances making up (solid-line) edges between nodes (records) in the sequence similarity network.
  • Topo-links, or topological links, are formed between two nodes (records) when both originate from the same antiSMASH region and are thus co-located on the chromosome. These links are represented by dashed lines in the BiG-SCAPE 2 sequence similarity network visualization (Fig. 2).
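
To make the topo-link idea concrete, here is a minimal Python sketch that groups records by their region of origin and links every pair within a region. It is an illustration only, not how BiG-SCAPE implements it: the function `topo_links`, the `(record_id, region_id)` input format and the identifiers in the example are hypothetical, and linking all pairs (rather than, say, only neighbouring records) is an assumption.

```python
from collections import defaultdict
from itertools import combinations


def topo_links(records):
    """Hypothetical helper: derive topo-links from record origins.

    records: iterable of (record_id, region_id) tuples.
    Returns a sorted list of (record_id, record_id) pairs, one for every
    two records that originate from the same region.
    """
    by_region = defaultdict(list)
    for record_id, region_id in records:
        by_region[region_id].append(record_id)

    links = []
    for members in by_region.values():
        # Assumption: every pair of records in a region gets a link.
        links.extend(combinations(sorted(members), 2))
    return sorted(links)


# Made-up identifiers: two protoclusters from region r1, one from r2.
example = [("r1.pc1", "r1"), ("r1.pc2", "r1"), ("r2.pc1", "r2")]
print(topo_links(example))
```

A region that yields a single record produces no topo-links, so such records are connected to the network through sim-links only.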