Commit 26acc41

numpy 2 support; update docs
1 parent c1e71a7 commit 26acc41

4 files changed, +59 -50 lines changed

ChangeLog

+48-42
@@ -1,53 +1,59 @@
- 2025-02-14 Tao Liu <[email protected]>
+ 2025-02-19 Tao Liu <[email protected]>
  MACS 3.0.3

  * Features added

- 1) FRAG format introduced. FRAG format is for fragment files
- defined by 10x genomics to store the alignments from single-cell
- ATAC-seq experiment. It can be regarded as the BEDPE format with
- two extra columns -- the barcode information and the counts of the
- fragments aligned to the same location with the same
- barcode. Currently, `callpeak` and `pileup` both support the new
- format. We will add support for other functions such as `hmmratac`
- in the future.
-
- We implemented the IO module for reading the fragment files in
- `Parser.FragParser`. And we then implemented a new
- `PairedEndTrack.PETrackII` to store the data in fragment file,
- including the barcodes and counts information. In the `PETrackII`
- class, we are able to extract a subset using a list of barcodes,
- which enables us to call peaks only on a pool (pseudo-bulk) of
- cells.
-
- 2) We extensively rewrote the `pyx` codes into `py` codes. In
- another words, we now apply the 'pure python style' with PEP-484
- type annotations to our previous Cython style codes. So that, the
- source codes can be more compatible to Python programming tools
- such as `flake8`. During rewritting, we cleaned the source codes
- even more, and removed unnecessary dependencies during
- compilation. We will continue to do more code cleaning in the
- future.
-
- 3) We changed the behavior on the usage of 'blacklist' regions in
- `hmmratac`. We will remove the aligned fragments located in the
- 'blacklist' regions before the EM step to estimate fragment
- lengths distributions and HMM step to learn and predict nucleosome
- states. The reason is discussed in #680. To implement this
- feature, we added the `exclude` functions to PETrackI and
- PETrackII.
+ 1) Now support FRAG format for single-cell ATAC-seq in `callpeak`
+ and `pileup`. FRAG format is used by 10x Genomics to store
+ alignments from the single-cell ATAC-seq pipeline
+ `cellranger-atac` or the multiomics pipeline `cellranger-arc`. The
+ format is essentially BEDPE with two additional columns: the
+ barcode and the count of fragments aligned to the same location
+ with the same barcode. Support for FRAG in other tools is coming
+ soon, as well as for `hmmratac` calls.
+
+ If you specify `-f FRAG` as your input format:
+
+ - You can use a barcode list for a subset of cells with
+ `--barcodes`, then `callpeak` will identify peaks and `pileup`
+ will build pileup track for the fragments of this subset of cells.
+
+ - Duplicates will not get removed as we'll assume all fragments
+ are valid. Optionally, an option, `--max-count`, can be applied to
+ set the maximum count.
+
+ 2) We transitioned our `pyx` codes to `py` codes, adopting a 'pure
+ Python style' with PEP-484 type annotations. This change has made
+ our source codes more compatible with Python programming tools such
+ as `flake8`. During this process, we performed further code
+ cleaning and eliminated unnecessary dependencies. We intend to
+ continue improving our code quality in the future.
+
+ 3) We have modified the handling of 'blacklist' regions in the
+ `hmmratac` tool. This change impacts both the
+ Expectation-Maximization (EM) step that estimates fragment length
+ distributions, and the Hidden Markov Model (HMM) step that learns
+ and predicts nucleosome states. We now exclude aligned fragments
+ located in the 'blocklist' regions before both steps. We
+ implemented the `exclude` functions in both PETrackI and PETrackII
+ to support this feature. For more detailed information and the
+ reasoning behind it, refer to issue #680.
+
+ 4) We have tested Numpy>=2. Now MACS3 can be run on all Numpy >=
+ 1.25.

  * Bug fixed

- 1) `hmmratac` option `--keep-duplicate` had opposite effect
- previously as indicated by the name and description. It has been
- renamed as `--remove-dup` to reflect the actual
- behavior. `hmmratac` will not remove duplicated fragments unless
- this option is set.
+ 1) The `hmmratac` option `--keep-duplicate` previously had the
+ opposite effect of what its name and description
+ suggested. Therefore, it was renamed to `--remove-dup` to more
+ accurately describe the actual behavior. Duplicate fragments will
+ not be removed by `hmmratac` unless this option is explicitly set
+ up.

- 2) `hmmratac`: wrong class name used while saving digested signals
- in BedGraph files. Multiple other issues related to output
- filenames. #682
+ 2) `hmmratac`: wrong class name was used while saving digested
+ signals in BedGraph files. Fixed multiple other issues related to
+ output filenames. #682

  3) Fix issues in big-endian system in `Parser.py` codes. Enable
  big-endian support in `BAM.py` codes for accessing certain
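
The FRAG layout described in the ChangeLog entry above (BEDPE-style coordinates plus a barcode column and a count column) can be illustrated with a minimal Python sketch. This is not the MACS3 parser: `Fragment` and `parse_frag_lines` are hypothetical names, the tab-separated five-column layout is assumed from the description, and treating `--max-count` as a cap on the per-fragment count is an assumption about what "set the maximum count" means.

```python
from typing import Iterable, Iterator, NamedTuple, Optional, Set


class Fragment(NamedTuple):
    """One record of a fragment file (hypothetical container, not a MACS3 class)."""
    chrom: str
    start: int
    end: int
    barcode: str
    count: int


def parse_frag_lines(
    lines: Iterable[str],
    barcodes: Optional[Set[str]] = None,
    max_count: Optional[int] = None,
) -> Iterator[Fragment]:
    """Yield fragments, optionally restricted to a barcode subset.

    Assumes tab-separated columns: chrom, start, end, barcode, count.
    """
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        chrom, start, end, barcode, count = line.split("\t")[:5]
        # Keep only the requested cells (pseudo-bulk subset), if any.
        if barcodes is not None and barcode not in barcodes:
            continue
        n = int(count)
        # Assumed semantics: cap the count rather than drop the fragment.
        if max_count is not None:
            n = min(n, max_count)
        yield Fragment(chrom, int(start), int(end), barcode, n)
```

Passing a set of barcodes keeps only the fragments from those cells, which mirrors the idea of calling peaks on a pseudo-bulk subset; leaving both options at `None` yields every record unchanged.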

docs/source/docs/INSTALL.md

+1-1
@@ -26,7 +26,7 @@ reproducing your results, we also add them into the requirement list
  with specific version numbers. So here is the list of the required
  python libraries that will impact the numerical calculation in MACS3:

- - numpy>=1.25 <2.0.0
+ - numpy>=1.25
  - hmmlearn>=0.3.2
  - scikit-learn>=1.3
  - scipy>=1.12
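
To make the effect of dropping the upper bound concrete, here is a small Python sketch comparing the old constraint (`numpy>=1.25,<2`) with the new one (`numpy>=1.25`). It is a simplification for illustration: the helper names are hypothetical, only the major and minor components are compared, and full PEP 440 handling (pre-releases, epochs, local versions) is deliberately left out.

```python
import re
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.26.4'."""
    m = re.match(r"(\d+)\.(\d+)", version)
    if m is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(m.group(1)), int(m.group(2))


def allowed_before(version: str) -> bool:
    """Old constraint: numpy>=1.25,<2 (simplified comparison)."""
    return (1, 25) <= major_minor(version) < (2, 0)


def allowed_after(version: str) -> bool:
    """New constraint: numpy>=1.25 (simplified comparison)."""
    return major_minor(version) >= (1, 25)
```

Under this simplified check, a NumPy 2.x release is rejected by the old constraint and accepted by the new one, while anything below 1.25 is rejected by both.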

docs/source/index.md

+8-5
@@ -40,12 +40,13 @@ tools is coming soon, as well as for `hmmratac` calls.

  If you specify FRAG as your input format:

- - Use a barcode list for a subset of cells and `callpeak` will
- identify peaks using these fragments.
+ - You can use a barcode list for a subset of cells with `--barcodes`,
+ then `callpeak` will identify peaks and `pileup` will build pileup
+ track for the fragments of this subset of cells.
  - Duplicates will not get removed as we'll assume all fragments are
- valid. Optionally, an option can be applied to set the maximum
- count.
-
+ valid. Optionally, an option, `--max-count`, can be applied to set
+ the maximum count.
+
  2) We transitioned our `pyx` codes to `py` codes, adopting a 'pure
  Python style' with PEP-484 type annotations. This change has made our
  source codes more compatible with Python programming tools such as
@@ -62,6 +63,8 @@ before both steps. We implemented the `exclude` functions in both
  PETrackI and PETrackII to support this feature. For more detailed
  information and the reasoning behind it, refer to issue #680.

+ 4) We have tested Numpy>=2. Now MACS3 can be run on all Numpy >= 1.25.
+
  ### Bug fixed

  1) The `hmmratac` option `--keep-duplicate` previously had the

pyproject.toml

+2-2
@@ -1,5 +1,5 @@
  [build-system]
- requires=['setuptools>=68.0', 'numpy>=1.25,<2', 'scipy>=1.12', 'cykhash>=2.0', 'Cython>=3.0', 'scikit-learn>=1.3', 'hmmlearn>=0.3.2']
+ requires=['setuptools>=68.0', 'numpy>=1.25', 'scipy>=1.12', 'cykhash>=2.0', 'Cython>=3.0', 'scikit-learn>=1.3', 'hmmlearn>=0.3.2']
  build-backend = "setuptools.build_meta"

  [project]
@@ -25,7 +25,7 @@ classifiers =['Development Status :: 5 - Production/Stable',
  'Programming Language :: Python :: 3.12',
  'Programming Language :: Python :: 3.13',
  'Programming Language :: Cython']
- dependencies = ["numpy>=1.25,<2",
+ dependencies = ["numpy>=1.25",
  "scipy>=1.12",
  "hmmlearn>=0.3.2",
  "scikit-learn>=1.3",
