The repository contains a PyTorch implementation of a pre-trained face parser based on SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation (NeurIPS 2022).
The code is based on SegNext-official-repo. We use CelebAMask-HQ as the training data.
Note: ImageNet pre-trained models are available on TsingHua Cloud.
Rank 1 on Pascal VOC dataset: Leaderboard
Images 0 to 27999 are used for training, and images 28000 to 29999 for validation.
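The index-based split stated above can be sketched as follows. This is a minimal illustration, not the repository's own data-loading code: the names `TRAIN_IDS`, `VAL_IDS`, and `split_of` are hypothetical, and the sketch assumes images are identified by their integer index in CelebAMask-HQ (30,000 images in total).

```python
# Hypothetical helper illustrating the train/val split by image index.
# Assumption: each CelebAMask-HQ image is identified by an integer 0..29999.
TRAIN_IDS = range(0, 28000)      # images 0 .. 27999
VAL_IDS = range(28000, 30000)    # images 28000 .. 29999


def split_of(image_id: int) -> str:
    """Return 'train' or 'val' for a given image index."""
    if image_id in TRAIN_IDS:
        return "train"
    if image_id in VAL_IDS:
        return "val"
    raise ValueError(f"image id {image_id} is outside 0..29999")


print(len(TRAIN_IDS), len(VAL_IDS))
print(split_of(27999), split_of(28000))
```

With this split, 28,000 images go to training and 2,000 to validation.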
| Method | Backbone | Pretrained | Iters | mIoU(ss) | Params | FLOPs | Config | Download | 
|---|---|---|---|---|---|---|---|---|
| SegNeXt | MSCAN-T | IN-1K | 160K | 77.86 | 4M | 7G | config | Google Drive | 
| SegNeXt | MSCAN-S | IN-1K | 160K | 78.19 | 14M | 16G | config | Google Drive | 
| SegNeXt | MSCAN-B | IN-1K | 160K | 78.97 | 28M | 35G | config | Google Drive | 
| SegNeXt | MSCAN-L | IN-1K | 160K | 79.34 | 49M | 70G | config | Google Drive | 
| Method | Backbone | Pretrained | Iters | mIoU(ss/ms) | Params | FLOPs | Config | Download | 
|---|---|---|---|---|---|---|---|---|
| SegNeXt | MSCAN-T | IN-1K | 160K | 41.1/42.2 | 4M | 7G | config | TsingHua Cloud | 
| SegNeXt | MSCAN-S | IN-1K | 160K | 44.3/45.8 | 14M | 16G | config | TsingHua Cloud | 
| SegNeXt | MSCAN-B | IN-1K | 160K | 48.5/49.9 | 28M | 35G | config | TsingHua Cloud | 
| SegNeXt | MSCAN-L | IN-1K | 160K | 51.0/52.1 | 49M | 70G | config | TsingHua Cloud | 
| Method | Backbone | Pretrained | Iters | mIoU(ss/ms) | Params | FLOPs | Config | Download | 
|---|---|---|---|---|---|---|---|---|
| SegNeXt | MSCAN-T | IN-1K | 160K | 79.8/81.4 | 4M | 56G | config | TsingHua Cloud | 
| SegNeXt | MSCAN-S | IN-1K | 160K | 81.3/82.7 | 14M | 125G | config | TsingHua Cloud | 
| SegNeXt | MSCAN-B | IN-1K | 160K | 82.6/83.8 | 28M | 276G | config | TsingHua Cloud | 
| SegNeXt | MSCAN-L | IN-1K | 160K | 83.2/83.9 | 49M | 578G | config | TsingHua Cloud | 
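For readers unfamiliar with the mIoU columns in the tables above, the following is a simplified sketch of the metric: per-class intersection over union, averaged over classes. It is not the evaluation code used by this repository. The function name `mean_iou` and the toy labels are hypothetical, and MMSegmentation's own evaluation may differ in details such as ignore labels and how statistics are accumulated across images.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class labels
    of equal length (one label per pixel).
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy example with 3 classes and 6 pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, target, num_classes=3), 4))  # 0.5
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, which average to 0.5.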
Note: The number of FLOPs (G) is calculated at an input size of 512 
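As a rough intuition for why the FLOPs column depends on the input size: the multiply-accumulate (MAC) count of a convolution grows linearly with the number of output pixels. The sketch below is illustrative only. The helper `conv2d_macs` and the layer sizes are hypothetical, and this is not how torchprofile counts operations for a full model.

```python
def conv2d_macs(c_in, c_out, kernel, out_h, out_w, groups=1):
    """Multiply-accumulates for one 2D convolution (bias ignored)."""
    return (c_in // groups) * c_out * kernel * kernel * out_h * out_w


# Same hypothetical 3x3 layer evaluated at two output resolutions.
small = conv2d_macs(64, 64, 3, 512, 512)
large = conv2d_macs(64, 64, 3, 1024, 1024)

# Doubling height and width quadruples the cost.
print(large / small)  # 4.0
```

This is why FLOPs figures are only comparable when they are measured at the same input size.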
Install the dependencies and download ADE20K according to the guidelines in MMSegmentation. The code is based on MMSegmentation-v0.24.1.
pip install openmim
mim install mmcv-full==1.5.1 mmcls==0.20.1
cd SegNeXt-FaceParser
python setup.py develop
We use 8 GPUs for training by default. Run:
./tools/dist_train.sh /path/to/config 8
To evaluate the model, run:
./tools/dist_test.sh /path/to/config /path/to/checkpoint_file 8 --eval mIoU
Install torchprofile using
pip install torchprofile
To calculate FLOPs for a model, run:
python tools/get_flops.py /path/to/config --shape 512 512
For technical problems, please open an issue.
If you find this repo useful for your research, please consider citing:
@misc{SegNeXt-FaceParser, 
  author={Zhian Liu}, 
  title={SegNeXt-FaceParser}, 
  year={2023}, 
  url={https://github.com/e4s2022/SegNeXt-FaceParser} 
}
@article{guo2022segnext,
  title={SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation},
  author={Guo, Meng-Hao and Lu, Cheng-Ze and Hou, Qibin and Liu, Zhengning and Cheng, Ming-Ming and Hu, Shi-Min},
  journal={arXiv preprint arXiv:2209.08575},
  year={2022}
}
@inproceedings{CelebAMask-HQ,
  title={MaskGAN: Towards Diverse and Interactive Facial Image Manipulation},
  author={Lee, Cheng-Han and Liu, Ziwei and Wu, Lingyun and Luo, Ping},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}
@article{guo2022visual,
  title={Visual Attention Network},
  author={Guo, Meng-Hao and Lu, Cheng-Ze and Liu, Zheng-Ning and Cheng, Ming-Ming and Hu, Shi-Min},
  journal={arXiv preprint arXiv:2202.09741},
  year={2022}
}
@inproceedings{
    ham,
    title={Is Attention Better Than Matrix Decomposition?},
    author={Zhengyang Geng and Meng-Hao Guo and Hongxu Chen and Xia Li and Ke Wei and Zhouchen Lin},
    booktitle={International Conference on Learning Representations},
    year={2021},
}
Our implementation is mainly based on MMSegmentation, SegFormer and Enjoy-Hamburger. Thanks to their authors.
This repo is under the Apache-2.0 license. For commercial use, please contact the authors.



