
Commit e723501

Initial commit
1 parent 26d3a64 commit e723501

File tree

11 files changed (+1884, −2 lines)


README.md

Lines changed: 106 additions & 2 deletions
# [MambaOut: Do We Really Need Mamba for Vision?](https://arxiv.org/abs/2405.xxxxx)

<p align="left">
<a href="https://arxiv.org/abs/2405.xxxxx" alt="arXiv">
    <img src="https://img.shields.io/badge/arXiv-2405.xxxxx-b31b1b.svg?style=flat" /></a>
<a href="https://colab.research.google.com/drive/" alt="Colab">
    <img src="https://colab.research.google.com/assets/colab-badge.svg" /></a>
</p>
<p align="center"><em>In memory of Kobe Bryant</em></p>

> "What can I say, Mamba out."
>
> *Kobe Bryant, NBA farewell speech, 2016*

This is a PyTorch implementation of MambaOut proposed by our paper "[MambaOut: Do We Really Need Mamba for Vision?](https://arxiv.org/abs/2405.xxxxx)".
## Requirements

PyTorch and timm 0.6.11 (`pip install timm==0.6.11`).

Data preparation: ImageNet with the following folder structure; you can extract ImageNet with this [script](https://gist.github.com/BIGBALLON/8a71d225eff18d88e469e6ea9b39cef4).

```
│imagenet/
├──train/
│  ├── n01440764
│  │   ├── n01440764_10026.JPEG
│  │   ├── n01440764_10027.JPEG
│  │   ├── ......
│  ├── ......
├──val/
│  ├── n01440764
│  │   ├── ILSVRC2012_val_00000293.JPEG
│  │   ├── ILSVRC2012_val_00002138.JPEG
│  │   ├── ......
│  ├── ......
```
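As a quick sanity check (an optional sketch, not part of the training code), the class-subfolder layout above is exactly what `torchvision.datasets.ImageFolder` expects; `/path/to/imagenet` below is a placeholder:

```python
# Optional sanity check of the ImageNet folder layout (illustrative sketch).
# Assumes the directory structure shown above; "/path/to/imagenet" is a placeholder.
from torchvision.datasets import ImageFolder

root = "/path/to/imagenet"
train_set = ImageFolder(f"{root}/train")
val_set = ImageFolder(f"{root}/val")

# ImageNet-1K should yield 1000 classes, ~1.28M train images and 50,000 val images.
print(len(train_set.classes), len(train_set.samples), len(val_set.samples))
```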
## Models

### MambaOut trained on ImageNet

| Model | Resolution | Params | MACs | Top-1 Acc (%) |
| :--- | :---: | :---: | :---: | :---: |
| [mambaout_femto](https://github.com/yuweihao/MambaOut/releases/download/model/mambaout_femto.pth) | 224 | 7.3M | 1.2G | 78.9 |
| [mambaout_tiny](https://github.com/yuweihao/MambaOut/releases/download/model/mambaout_tiny.pth) | 224 | 26.5M | 4.5G | 82.7 |
| [mambaout_small](https://github.com/yuweihao/MambaOut/releases/download/model/mambaout_small.pth) | 224 | 48.5M | 9.0G | 84.1 |
| [mambaout_base](https://github.com/yuweihao/MambaOut/releases/download/model/mambaout_base.pth) | 224 | 84.8M | 15.8G | 84.2 |
#### Usage

We also provide a Colab notebook that walks through inference with MambaOut: [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/)
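For a quick local test without the notebook, a minimal inference sketch might look like the following. It assumes the architectures in `models/mambaout.py` register themselves with timm's model registry (as the `validate.py` command below implies), that the release checkpoint is a plain state dict, and that `cat.jpg` is a placeholder image path:

```python
# Minimal inference sketch (not the official Colab notebook).
# Assumptions: importing `models` registers the mambaout_* architectures with timm,
# the release checkpoint is a plain state dict, and "cat.jpg" is a placeholder path.
import torch
import timm
from PIL import Image
from timm.data import resolve_data_config, create_transform

import models  # registers the MambaOut models with timm on import (assumption)

model = timm.create_model('mambaout_tiny', pretrained=False)
ckpt = torch.hub.load_state_dict_from_url(
    'https://github.com/yuweihao/MambaOut/releases/download/model/mambaout_tiny.pth',
    map_location='cpu')
model.load_state_dict(ckpt)
model.eval()

# Build the preprocessing pipeline (resize/crop/normalize) from the model's default config.
config = resolve_data_config({}, model=model)
transform = create_transform(**config)

img = Image.open('cat.jpg').convert('RGB')
x = transform(img).unsqueeze(0)  # (1, 3, 224, 224)

with torch.no_grad():
    probs = model(x).softmax(dim=-1)
top5_prob, top5_idx = probs.topk(5)
print(top5_idx, top5_prob)
```

The checkpoint URL comes from the table above; swap in any of the other releases to test a different model size.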
## Validation

To evaluate models, run:

```bash
MODEL=mambaout_tiny
python3 validate.py /path/to/imagenet --model $MODEL -b 128 \
  --pretrained
```
## Train

We use a batch size of 4096 by default and show how to train models with 8 GPUs; with the default 4 gradient-accumulation steps, this works out to 4096 / 8 / 4 = 128 images per GPU per step. For multi-node training, adjust `--grad-accum-steps` according to your situation.

```bash
DATA_PATH=/path/to/imagenet
CODE_PATH=/path/to/code/MambaOut # modify code path here

ALL_BATCH_SIZE=4096
NUM_GPU=8
GRAD_ACCUM_STEPS=4 # Adjust according to the number of GPUs and their memory size.
let BATCH_SIZE=ALL_BATCH_SIZE/NUM_GPU/GRAD_ACCUM_STEPS

MODEL=mambaout_tiny
DROP_PATH=0.2

cd $CODE_PATH && sh distributed_train.sh $NUM_GPU $DATA_PATH \
--model $MODEL --opt adamw --lr 4e-3 --warmup-epochs 20 \
-b $BATCH_SIZE --grad-accum-steps $GRAD_ACCUM_STEPS \
--drop-path $DROP_PATH
```

Training scripts of other models are shown in [scripts](/scripts/).
## Bibtex

```
@article{yu2024mambaout,
  title={MambaOut: Do We Really Need Mamba for Vision?},
  author={Yu, Weihao and Wang, Xinchao},
  journal={arXiv preprint arXiv:2405.xxxxx},
  year={2024}
}
```

## Acknowledgment

Weihao was partly supported by the Snap Research Fellowship, Google TPU Research Cloud (TRC), and the Google Cloud Research Credits program. We thank Dongze Lian, Qiuhong Shen, Xingyi Yang, and Gongfan Fang for valuable discussions.

Our implementation is based on [pytorch-image-models](https://github.com/huggingface/pytorch-image-models), [poolformer](https://github.com/sail-sg/poolformer), [ConvNeXt](https://github.com/facebookresearch/ConvNeXt), [metaformer](https://github.com/sail-sg/metaformer) and [inceptionnext](https://github.com/sail-sg/inceptionnext).

distributed_train.sh

Lines changed: 5 additions & 0 deletions
#!/bin/bash
# Usage: sh distributed_train.sh <num_gpus> <data_path> [additional train.py args]
NUM_PROC=$1
shift
python3 -m torch.distributed.launch --nproc_per_node=$NUM_PROC train.py "$@"

models/__init__.py

Lines changed: 1 addition & 0 deletions
from .mambaout import *
