Relocations
Relocation - a group of intra-chromosomal structural rearrangements that occur when two regions located in different parts of the same reference sequence are placed near each other in the same query sequence.
Simple relocation - a relocation where two query fragments are placed adjacent to each other.
Figure 1: Simple relocation example. The reference coordinates End_r_1 and St_r_2, corresponding to the query breakpoint ends End_q_1 and St_q_2, respectively, coincide with the reference relocated block ends Rel_end_r_1 and Rel_st_r_2.
Figure 2: Simple relocation example. The reference coordinates End_r_1 and St_r_2, corresponding to the query breakpoint ends End_q_1 and St_q_2, respectively, do not coincide with the reference relocated block ends Rel_end_r_1 and Rel_st_r_2.
A relocation difference is reported in the query_struct.gff and ref_struct.gff files. Information about the relocated blocks is also written to the ref_blocks.gff and query_blocks.gff files; descriptions and examples of these two files can be found on their wiki pages.
An example of relocation entries in query_struct.gff:
##gff-version 3
##sequence-region query_5 1 1000
query_5 NucDiff_v2.0 SO:0001874 500 501 . . . ID=SV_1;Name=relocation;ref_sequence=ref_1;blk_1_query=1-500;blk_1_ref=46001-46500;blk_1_query_len=500;blk_1_ref_len=500;blk_1_st_query=1;blk_1_st_ref=46001;blk_1_end_query=500;blk_1_end_ref=46500;blk_2_query=501-1000;blk_2_ref=57001-57500;blk_2_query_len=500;blk_2_ref_len=500;blk_2_st_query=501;blk_2_st_ref=57001;blk_2_end_query=1000;blk_2_end_ref=57500;color=#990099
##sequence-region query_7 1 1000
query_7 NucDiff_v2.0 SO:0001874 500 501 . . . ID=SV_2;Name=relocation;ref_sequence=ref_1;blk_1_query=1-500;blk_1_ref=69001-69500;blk_1_query_len=500;blk_1_ref_len=500;blk_1_st_query=1;blk_1_st_ref=69001;blk_1_end_query=500;blk_1_end_ref=69500;blk_2_query=501-1000;blk_2_ref=80001-80500;blk_2_query_len=500;blk_2_ref_len=500;blk_2_st_query=501;blk_2_st_ref=80001;blk_2_end_query=1000;blk_2_end_ref=80500;color=#990099
The query_struct.gff file contains the following information (see Figure 1 for notations):
| GFF3 fields | Content | Notes |
|---|---|---|
| col 1 | Query_seq | |
| col 2 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0001874 | Sequence Ontology accession number corresponding to the "intrachromosomal_breakpoint" SO term |
| col 4 | End_q_1 | |
| col 5 | St_q_2 | |
| col 6/col 7/col 8 | . | score/strand/phase fields are not used |
| col 9, ID | "SV_1" | ID in query_struct.gff is related to ID in ref_struct.gff |
| col 9, Name | "relocation" | |
| col 9, ref_sequence | Ref_seq | |
| col 9, blk_1_query | St_q_1 - End_q_1 | |
| col 9, blk_1_ref | Rel_st_r_1 - Rel_end_r_1 | |
| col 9, blk_1_query_len | Length(A) | |
| col 9, blk_1_ref_len | Length(A*) | |
| col 9, blk_1_st_query | St_q_1 | |
| col 9, blk_1_st_ref | St_r_1 | |
| col 9, blk_1_end_query | End_q_1 | |
| col 9, blk_1_end_ref | End_r_1 | |
| col 9, blk_2_query | St_q_2 - End_q_2 | |
| col 9, blk_2_ref | Rel_st_r_2 - Rel_end_r_2 | |
| col 9, blk_2_query_len | Length(B) | |
| col 9, blk_2_ref_len | Length(B*) | |
| col 9, blk_2_st_query | St_q_2 | |
| col 9, blk_2_st_ref | St_r_2 | |
| col 9, blk_2_end_query | End_q_2 | |
| col 9, blk_2_end_ref | End_r_2 | |
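The field layout in the table above can be read programmatically. The following is a minimal sketch, not part of NucDiff itself: the helper names `parse_gff_line`, `parse_range` and `breakpoints_coincide` are hypothetical. It assumes that ranges are written as `start-end` with integer coordinates, as in the example entries, and it splits columns on any whitespace so that it works both on tab-delimited GFF3 files and on the example lines as displayed here. The last helper checks whether End_r_1 and St_r_2 coincide with Rel_end_r_1 and Rel_st_r_2, which is the situation shown in Figure 1.

```python
# One feature line copied from the query_struct.gff example above.
QUERY_LINE = (
    "query_5 NucDiff_v2.0 SO:0001874 500 501 . . . "
    "ID=SV_1;Name=relocation;ref_sequence=ref_1;"
    "blk_1_query=1-500;blk_1_ref=46001-46500;"
    "blk_1_query_len=500;blk_1_ref_len=500;"
    "blk_1_st_query=1;blk_1_st_ref=46001;"
    "blk_1_end_query=500;blk_1_end_ref=46500;"
    "blk_2_query=501-1000;blk_2_ref=57001-57500;"
    "blk_2_query_len=500;blk_2_ref_len=500;"
    "blk_2_st_query=501;blk_2_st_ref=57001;"
    "blk_2_end_query=1000;blk_2_end_ref=57500;color=#990099"
)


def parse_gff_line(line):
    """Return (first eight columns, attribute dict) for one GFF3 feature line."""
    # Assumption: no whitespace inside the attribute column (true for the examples).
    cols = line.strip().split(None, 8)
    if len(cols) != 9:
        raise ValueError("expected 9 GFF3 columns, got %d" % len(cols))
    attrs = dict(item.split("=", 1) for item in cols[8].split(";") if item)
    return cols[:8], attrs


def parse_range(text):
    """Convert a 'start-end' string such as '46001-46500' to a pair of ints."""
    start, end = text.split("-")
    return int(start), int(end)


def breakpoints_coincide(attrs):
    """True if End_r_1 == Rel_end_r_1 and St_r_2 == Rel_st_r_2 (the Figure 1 case)."""
    _, rel_end_r_1 = parse_range(attrs["blk_1_ref"])
    rel_st_r_2, _ = parse_range(attrs["blk_2_ref"])
    return (int(attrs["blk_1_end_ref"]) == rel_end_r_1
            and int(attrs["blk_2_st_ref"]) == rel_st_r_2)


cols, attrs = parse_gff_line(QUERY_LINE)
print(cols[0], cols[3], cols[4])       # query sequence, End_q_1, St_q_2
print(attrs["ID"], attrs["ref_sequence"])
print(breakpoints_coincide(attrs))
```

For the SV_1 entry the check returns `True`, because blk_1_end_ref (46500) equals the end of blk_1_ref and blk_2_st_ref (57001) equals the start of blk_2_ref.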
An example of relocation entries in ref_struct.gff:
##gff-version 3
##sequence-region ref_1 1 115000
ref_1 NucDiff_v2.0 SO:0001874 46500 46500 . . . ID=SV_1.1;Name=relocation;query_sequence=query_5;query_coord=500;breakpoint_query=500-501;blk_query=1-500;blk_ref=46001-46500;blk_query_len=500;blk_ref_len=500;color=#990099
ref_1 NucDiff_v2.0 SO:0001874 57001 57001 . . . ID=SV_1.2;Name=relocation;query_sequence=query_5;query_coord=501;breakpoint_query=500-501;blk_query=501-1000;blk_ref=57001-57500;blk_query_len=500;blk_ref_len=500;color=#990099
ref_1 NucDiff_v2.0 SO:0001874 69500 69500 . . . ID=SV_2.1;Name=relocation;query_sequence=query_7;query_coord=500;breakpoint_query=500-501;blk_query=1-500;blk_ref=69001-69500;blk_query_len=500;blk_ref_len=500;color=#990099
ref_1 NucDiff_v2.0 SO:0001874 80001 80001 . . . ID=SV_2.2;Name=relocation;query_sequence=query_7;query_coord=501;breakpoint_query=500-501;blk_query=501-1000;blk_ref=80001-80500;blk_query_len=500;blk_ref_len=500;color=#990099
The ref_struct.gff file contains the following information (see Figure 1 for notations):
| GFF3 fields | Content for Relocation block 1 | Content for Relocation block 2 | Notes |
|---|---|---|---|
| col 1 | Ref_seq | Ref_seq | |
| col 2 | NucDiff_v2.0 | NucDiff_v2.0 | name and current version of the tool |
| col 3 | SO:0001874 | SO:0001874 | Sequence Ontology accession number corresponding to the "intrachromosomal_breakpoint" SO term |
| col 4 | End_r_1 | St_r_2 | |
| col 5 | End_r_1 | St_r_2 | |
| col 6/col 7/col 8 | . | . | score/strand/phase fields are not used |
| col 9, ID | "SV_1.1" | "SV_1.2" | ID in ref_struct.gff is related to ID in query_struct.gff |
| col 9, Name | "relocation" | "relocation" | |
| col 9, query_sequence | Query_seq | Query_seq | |
| col 9, query_coord | End_q_1 | St_q_2 | a query_coord base corresponds to the reference base from col 4 |
| col 9, breakpoint_query | End_q_1 - St_q_2 | End_q_1 - St_q_2 | |
| col 9, blk_query | St_q_1 - End_q_1 | St_q_2 - End_q_2 | |
| col 9, blk_ref | Rel_st_r_1 - Rel_end_r_1 | Rel_st_r_2 - Rel_end_r_2 | |
| col 9, blk_query_len | Length(A) | Length(B) | |
| col 9, blk_ref_len | Length(A*) | Length(B*) | |
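Since each relocation yields one entry in query_struct.gff and two entries in ref_struct.gff, it can be useful to group the reference entries by the query-side ID. The sketch below is illustrative only: the function name `group_by_query_id` is hypothetical, and the rule that the query ID is obtained by dropping the final `.1` or `.2` suffix is an assumption inferred from the example entries (SV_1 versus SV_1.1 and SV_1.2), not a documented guarantee. Columns are split on any whitespace so that the example lines as displayed here can be used directly.

```python
from collections import defaultdict

# Header and feature lines copied from the ref_struct.gff example above.
REF_LINES = [
    "##gff-version 3",
    "##sequence-region ref_1 1 115000",
    "ref_1 NucDiff_v2.0 SO:0001874 46500 46500 . . . ID=SV_1.1;Name=relocation;"
    "query_sequence=query_5;query_coord=500;breakpoint_query=500-501;"
    "blk_query=1-500;blk_ref=46001-46500;blk_query_len=500;blk_ref_len=500;"
    "color=#990099",
    "ref_1 NucDiff_v2.0 SO:0001874 57001 57001 . . . ID=SV_1.2;Name=relocation;"
    "query_sequence=query_5;query_coord=501;breakpoint_query=500-501;"
    "blk_query=501-1000;blk_ref=57001-57500;blk_query_len=500;blk_ref_len=500;"
    "color=#990099",
    "ref_1 NucDiff_v2.0 SO:0001874 69500 69500 . . . ID=SV_2.1;Name=relocation;"
    "query_sequence=query_7;query_coord=500;breakpoint_query=500-501;"
    "blk_query=1-500;blk_ref=69001-69500;blk_query_len=500;blk_ref_len=500;"
    "color=#990099",
    "ref_1 NucDiff_v2.0 SO:0001874 80001 80001 . . . ID=SV_2.2;Name=relocation;"
    "query_sequence=query_7;query_coord=501;breakpoint_query=500-501;"
    "blk_query=501-1000;blk_ref=80001-80500;blk_query_len=500;blk_ref_len=500;"
    "color=#990099",
]


def group_by_query_id(lines):
    """Map each query-side ID to its reference breakpoints.

    Each value is a list of tuples:
    (reference sequence, reference coordinate, query sequence, query coordinate).
    """
    groups = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF3 directives
        cols = line.strip().split(None, 8)
        attrs = dict(item.split("=", 1) for item in cols[8].split(";") if item)
        # Assumption: "SV_1.2" belongs to the query_struct.gff entry "SV_1".
        query_id = attrs["ID"].rsplit(".", 1)[0]
        groups[query_id].append(
            (cols[0], int(cols[3]), attrs["query_sequence"], int(attrs["query_coord"]))
        )
    return dict(groups)


for query_id, breakpoints in group_by_query_id(REF_LINES).items():
    print(query_id, breakpoints)
```

Each group pairs a reference coordinate from col 4 with its query_coord, which mirrors the note in the table that a query_coord base corresponds to the reference base from col 4.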