  * This class models a segment of a reference sequence, which can represent a complete genome,
  * a plasmid, a single contig, or a scaffold. It extends the {@link Attributable} class to
  * inherit functionality for managing attributes associated with the contig.
- * </p>
  * <p>
  * Each instance of this class is uniquely identified by its {@code name} and contains
  * information about its nucleotide sequence, variants, and other relevant properties.
- * </p>
  */
 public class Contig extends Attributable {

@@ -28,7 +26,6 @@ public class Contig extends Attributable {
  * This field uniquely identifies the contig within the context of the application.
  * It is a final field, meaning its value is immutable once assigned during the
  * construction of the {@link Contig} instance.
- * </p>
  */
 public final String name;

@@ -38,10 +35,8 @@ public class Contig extends Attributable {
  * This field stores the nucleotide sequence of the contig. The sequence is expected to be
  * stored as a GZIP-compressed string to optimize storage. It may be empty or null if no
  * sequence is available for the contig.
- * </p>
  * <p>
  * <b>Note:</b> The sequence is not validated against the variants stored in the {@code variants} map.
- * </p>
  */
 protected final String sequence;

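The GZIP-compressed storage described for the `sequence` field can be illustrated with a minimal round-trip sketch. The diff does not show how the compressed bytes are turned into a `String`, so the Base64 encoding below is an assumption, and `GzipSequenceSketch`, `compress`, and `decompress` are hypothetical names that are not part of `Contig`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the Contig class shown in this diff.
final class GzipSequenceSketch {

    // GZIP-compresses a nucleotide sequence and Base64-encodes the bytes so
    // they fit into a String field. The Base64 step is an assumption.
    static String compress(String sequence) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(sequence.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the GZIP trailer has been written.
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    // Reverses compress(); returns an empty string when nothing is stored,
    // mirroring the behaviour documented for getSequence().
    static String decompress(String compressed) {
        if (compressed == null || compressed.isEmpty()) {
            return "";
        }
        byte[] bytes = Base64.getDecoder().decode(compressed);
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String stored = compress("ACGTACGTTTGA");
        System.out.println(decompress(stored)); // prints the original sequence
    }
}
```

The sketch wraps `IOException` in `UncheckedIOException` only to keep the example short; the documented `getSequence()` declares `IOException` instead.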
@@ -55,11 +50,9 @@ public class Contig extends Attributable {
  * <li>The third level (value: {@link VariantInformation}) contains additional information about the variant,
  * such as associations with {@link SequenceType}s or {@link Sample}s.</li>
  * </ul>
- * </p>
  * <p>
  * This structure allows storage and retrieval of variant data, enabling queries by position,
  * alternative content, and associated metadata.
- * </p>
  */
 protected final TreeMap<Integer, Map<String, VariantInformation>> variants;

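To make the nested layout of `variants` concrete, the following sketch builds a map of the same shape, counts entries by summing the inner map sizes, and runs a range query over an inclusive position window. It is only an illustration: `VariantIndexSketch` and its methods are hypothetical, a plain `String` stands in for `VariantInformation`, and the use of `TreeMap.subMap` for range queries is an assumption about how such a lookup could be done, not a statement about the actual `Contig` code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the variants field; String replaces VariantInformation.
final class VariantIndexSketch {

    // position -> (alternative bases -> variant information)
    final TreeMap<Integer, Map<String, String>> variants = new TreeMap<>();

    void add(int position, String alternativeBases, String info) {
        variants.computeIfAbsent(position, p -> new HashMap<>()).put(alternativeBases, info);
    }

    // Total number of variants: the sum of the sizes of all inner maps.
    int count() {
        return variants.values().stream().mapToInt(Map::size).sum();
    }

    // Variants whose position lies in [start, end], both bounds inclusive.
    // Alternative bases are sorted only to make the output deterministic.
    List<String> inRange(int start, int end) {
        List<String> hits = new ArrayList<>();
        variants.subMap(start, true, end, true).forEach((position, alternatives) ->
                alternatives.keySet().stream().sorted()
                        .forEach(alt -> hits.add(position + ":" + alt)));
        return hits;
    }

    public static void main(String[] args) {
        VariantIndexSketch index = new VariantIndexSketch();
        index.add(10, "A", "sample1");
        index.add(10, "T", "sample2");
        index.add(25, "G", "sample1");
        index.add(40, "C", "sample3");
        System.out.println(index.count());         // 4
        System.out.println(index.inRange(10, 25)); // [10:A, 10:T, 25:G]
    }
}
```

A sorted outer map keeps positions ordered, which is what makes the window lookup cheap compared with scanning every entry.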
@@ -69,11 +62,9 @@ public class Contig extends Attributable {
  * This field is a transient {@link HashMap} used to cache subsequences of the contig's sequence.
  * The keys in the map are {@link Tuple} objects representing the start and end positions of the subsequence,
  * and the values are the corresponding subsequences as {@link String}.
- * </p>
  * <p>
  * The cache is transient because it is not intended to be serialized, as it is dynamically populated
  * during runtime to optimize performance by avoiding redundant sequence decompression or retrieval.
- * </p>
  */
 protected transient HashMap<Tuple<Integer, Integer>, String> sequenceCache;

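The caching behaviour described for `sequenceCache` can be sketched as a map keyed by the requested range. This is a simplified illustration under stated assumptions: `SubsequenceCacheSketch` is a hypothetical class, `Map.entry(start, end)` stands in for the project's `Tuple`, and the sequence is held uncompressed. The index arithmetic follows the `getSubsequence` Javadoc in this diff (1-based positions, inclusive start, exclusive end).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a subsequence cache; not the Contig implementation.
final class SubsequenceCacheSketch {

    private final String sequence; // held uncompressed here for brevity
    private final Map<Map.Entry<Integer, Integer>, String> cache = new HashMap<>();
    int misses = 0; // counts how often a subsequence had to be computed

    SubsequenceCacheSketch(String sequence) {
        this.sequence = sequence;
    }

    // start is 1-based and inclusive, end is 1-based and exclusive.
    String subsequence(int start, int end) {
        if (sequence.isEmpty()) {
            return ""; // no sequence stored
        }
        return cache.computeIfAbsent(Map.entry(start, end), range -> {
            misses++;
            // Shift both bounds by one to convert to Java's 0-based substring.
            return sequence.substring(range.getKey() - 1, range.getValue() - 1);
        });
    }

    public static void main(String[] args) {
        SubsequenceCacheSketch contig = new SubsequenceCacheSketch("ACGTACGT");
        System.out.println(contig.subsequence(1, 4)); // ACG
        System.out.println(contig.subsequence(1, 4)); // ACG, served from the cache
        System.out.println(contig.misses);            // 1
    }
}
```

`Map.entry` keys compare by value, so two requests for the same range hit the same cache slot.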
@@ -84,7 +75,6 @@ public class Contig extends Attributable {
  * initializes the {@link #variants} map to store variant information and the {@link #sequenceCache}
  * map to cache subsequences for optimized retrieval. The sequence is expected to be stored
  * as a GZIP-compressed string to reduce storage requirements.
- * </p>
  *
  * @param name The name or identifier of the contig.
  * @param sequence The nucleotide sequence of the contig, stored as a GZIP-compressed string.
@@ -103,7 +93,6 @@ protected Contig(String name, String sequence) {
  * This method determines whether the contig has a stored sequence by checking
  * if the {@code sequence} field is not empty. A non-empty sequence indicates
  * that the contig has an associated nucleotide sequence.
- * </p>
  *
  * @return {@code true} if the contig has a sequence (i.e., the sequence length is not zero),
  * {@code false} otherwise.
@@ -117,7 +106,6 @@ public boolean hasSequence() {
  * <p>
  * This method decompresses the GZIP-compressed sequence stored in the {@code sequence} field
  * and returns it as a string. If no sequence is stored, it returns an empty string.
- * </p>
  *
  * @return The decompressed nucleotide sequence of this contig, or an empty string if no sequence is stored.
  * @throws IOException If an error occurs during the decompression of the sequence.
@@ -136,12 +124,9 @@ public String getSequence() throws IOException {
  * specified start and end positions. The subsequence is cached to avoid redundant decompression
  * and substring operations for the same range. If the subsequence is already cached, it is
  * retrieved directly from the cache. Otherwise, it is computed, stored in the cache, and returned.
- * </p>
- *
  * <p>
  * The start and end positions are 1-based indices, meaning the first nucleotide in the sequence
  * is at position 1. If no sequence is stored for the contig, the method returns an empty string.
- * </p>
  *
  * @param start The 1-based start position of the subsequence (inclusive).
  * @param end The 1-based end position of the subsequence (exclusive).
@@ -169,7 +154,6 @@ public String getSubsequence(int start, int end) throws IOException {
  * This method iterates through the hierarchical map of variants stored in the {@code variants} field.
  * It computes the total count by summing up the sizes of all inner maps, where each inner map represents
  * the alternative sequences for a specific position on the contig.
- * </p>
  *
  * @return The total number of variants located on this contig.
  */
@@ -184,7 +168,6 @@ public int getVariantsCount() {
  * for a variant located at the specified position with the given alternative bases. The returned
  * {@link VariantInformation} contains details about the variant, including its occurrences in
  * samples and features, as well as any associated attributes.
- * </p>
  *
  * @param position The 1-based position of the variant on the contig.
  * @param alternativeBases The alternative base sequence of the variant.
@@ -203,14 +186,11 @@ public VariantInformation getVariantInformation(int position, String alternative
  * <li>The position of the variant (field {@code a} of the tuple).</li>
  * <li>The alternate allele of the variant (field {@code b} of the tuple).</li>
  * </ul>
- * </p>
- *
  * <p>
  * For each variant, the method retrieves the associated {@link VariantInformation} using
  * the position and alternate allele. It then checks if the {@code Constants.EFFECTS} attribute
  * is present. If the attribute is found, its value (a comma-separated string of effects) is split
  * into individual effects, which are trimmed and aggregated into a {@link Set} to ensure uniqueness.
- * </p>
  *
  * @param variants A list of {@link Tuple} objects representing the variants. Each tuple contains:
  * <ul>
@@ -236,7 +216,6 @@ public Set<String> getVariantsEffects(List<Tuple<Integer, String>> variants) {
  * <li>The position of the variant on the contig.</li>
  * <li>The alternative base sequence of the variant.</li>
  * </ul>
- * </p>
  *
  * @return An {@link ArrayList} of {@link Tuple} objects, where each tuple contains
  * the position and alternative base sequence of a variant.
@@ -261,7 +240,6 @@ public ArrayList<Tuple<Integer, String>> getVariants() {
  * <li>The position of the variant on the contig.</li>
  * <li>The alternative base sequence of the variant.</li>
  * </ul>
- * </p>
  *
  * @param start The 1-based, inclusive start position of the range.
  * @param end The 1-based, inclusive end position of the range.
@@ -289,7 +267,6 @@ public ArrayList<Tuple<Integer, String>> getVariantsByLocation(int start, int en
  * <li>The position of the variant on the contig.</li>
  * <li>The alternative base sequence of the variant.</li>
  * </ul>
- * </p>
  *
  * @param feature The feature to filter variants by.
  * @param alleleUids A set of allele unique identifiers to filter variants by.
@@ -320,7 +297,6 @@ public ArrayList<Tuple<Integer, String>> getVariantsByAlleles(Feature feature, S
  * <li>The position of the variant on the contig.</li>
  * <li>The alternative base sequence of the variant.</li>
  * </ul>
- * </p>
  *
  * @param sampleName The name of the sample to filter variants by.
  * @return An {@link ArrayList} of {@link Tuple} objects, where each tuple contains
@@ -349,7 +325,6 @@ public ArrayList<Tuple<Integer, String>> getVariantsBySample(String sampleName)
  * <li>The position of the variant on the contig.</li>
  * <li>The alternative base sequence of the variant.</li>
  * </ul>
- * </p>
  *
  * @param sampleName The name of the sample to filter variants by.
  * @param start The 1-based, inclusive start position of the location range.
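The effect aggregation documented for `getVariantsEffects` (split a comma-separated attribute value, trim each entry, collect into a set) can be sketched in isolation. `EffectsSketch` and `collectEffects` are hypothetical names, the input list stands in for the `Constants.EFFECTS` attribute values of the selected variants, and a `null` entry models a variant without that attribute. Skipping empty tokens is an extra safeguard added here, not something the Javadoc states.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the effect aggregation; not the Contig implementation.
final class EffectsSketch {

    // Each list element is the raw effects attribute of one variant, or null
    // when the variant carries no such attribute.
    static Set<String> collectEffects(List<String> effectAttributes) {
        Set<String> effects = new TreeSet<>(); // a set removes duplicates
        for (String attribute : effectAttributes) {
            if (attribute == null) {
                continue; // attribute not present for this variant
            }
            for (String effect : attribute.split(",")) {
                String trimmed = effect.trim();
                if (!trimmed.isEmpty()) { // assumption: ignore empty tokens
                    effects.add(trimmed);
                }
            }
        }
        return effects;
    }

    public static void main(String[] args) {
        List<String> attributes =
                Arrays.asList("missense, synonymous", null, "synonymous ,stop_gained");
        System.out.println(collectEffects(attributes)); // [missense, stop_gained, synonymous]
    }
}
```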