balwierz
Large numbers are often divided into groups with a delimiter for ease of reading. English typically uses commas to form groups of three digits (thousands, millions, etc.). IUPAC and the International Bureau of Weights and Measures recommend using spaces.
Wikipedia on digit grouping

Some genomic software uses digit grouping. Notably, IGV and the UCSC Genome Browser use commas for digit grouping.

This pull request allows commas and spaces in coercion from character to IRanges by stripping all commas and spaces from the input strings. It should work with any currently legal format, and additionally accept spaces and commas.

> GRanges("chrX:12,345,678-12,345,999")
GRanges object with 1 range and 0 metadata columns:
      seqnames            ranges strand
         <Rle>         <IRanges>  <Rle>
  [1]     chrX 12345678-12345999      *
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths

> GRanges("chrX:12 345 678..12 345 999")
GRanges object with 1 range and 0 metadata columns:
      seqnames            ranges strand
         <Rle>         <IRanges>  <Rle>
  [1]     chrX 12345678-12345999      *
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
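
For illustration, the stripping step described above can be sketched in base R as follows. This is not the patch itself: the helper name `strip_digit_grouping` is hypothetical, and the sketch only shows the string cleanup that would happen before the existing parsing, not the parsing itself.

```r
# Hypothetical helper (not part of IRanges): remove every comma and
# space from each input string, leaving all other characters untouched.
strip_digit_grouping <- function(x) {
  gsub("[, ]", "", x)
}

# Both digit-grouping styles reduce to a string without separators.
strip_digit_grouping(c("chrX:12,345,678-12,345,999",
                       "chrX:12 345 678..12 345 999"))
```

Because the cleanup is unconditional, it would also remove commas and spaces that are not digit separators, such as a space inside a sequence name.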

PS. This code also extends GenomicRanges coercion from character. If this pull request is approved, the error message in .from_character_to_GRanges in GenomicRanges will need to be updated as well.
