Source code and dataset for our paper “USOD10K: A New Benchmark Dataset for Underwater Salient Object Detection” by Lin Hong, Xin Wang, Gan Zhang, and Ming Zhao (IEEE TIP, 2025).
Created by Lin Hong, email: eelinhong@ust.hk or lin.hong@tum.de
Note: The USOD dataset contains only 300 images. The USOD10K dataset includes most of the images from USOD but uses different annotation standards. If you find it difficult to access the USOD dataset or manage conflicting annotations, it is recommended to benchmark your methods on the USOD10K dataset.
Download: Baidu Netdisk: USOD10K, fetch code: [good], or Google Drive.
USOD10K is the first large-scale dataset for Underwater Salient Object Detection (USOD). It is free for academic research and may not be used for commercial purposes.
Note: for practical training and reliable test results of deep methods on the USOD10K dataset, the training set, validation set, and test set should each contain enough samples of every category (the training and validation sets are merged in the TC-USOD baseline). Hence we split USOD10K roughly 7:2:1 into training, validation, and test sets. The dataset folder is organized as follows:
Data
|-- USOD10K
| |-- USOD10K-TR
| |-- |-- USOD10K-TR-RGB
| |-- |-- USOD10K-TR-GT
| |-- |-- USOD10K-TR-depth
| |-- |-- USOD10K-TR-Boundary
| |-- USOD10K-Val
| |-- |-- USOD10K-Val-RGB
| |-- |-- USOD10K-Val-GT
| |-- |-- USOD10K-Val-depth
| |-- |-- USOD10K-Val-Boundary
| |-- USOD10K-TE
| |-- |-- USOD10K-TE-RGB
| |-- |-- USOD10K-TE-GT
| |-- |-- USOD10K-TE-depth
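After unpacking the download, it can help to confirm that the folders match the tree above before training. The following is a minimal sketch, not part of the released code: the function name `missing_folders` and the `EXPECTED` table are illustrative, and the folder names are copied from the tree (note that the test split lists no Boundary folder).

```python
from pathlib import Path

# Expected sub-folders per split, copied from the tree above.
# The test split (TE) has no Boundary folder in that tree.
EXPECTED = {
    "USOD10K-TR": ["RGB", "GT", "depth", "Boundary"],
    "USOD10K-Val": ["RGB", "GT", "depth", "Boundary"],
    "USOD10K-TE": ["RGB", "GT", "depth"],
}


def missing_folders(data_root):
    """Return the expected USOD10K sub-folders that are absent under data_root."""
    base = Path(data_root) / "USOD10K"
    missing = []
    for split, kinds in EXPECTED.items():
        for kind in kinds:
            folder = base / split / f"{split}-{kind}"
            if not folder.is_dir():
                missing.append(folder)
    return missing


if __name__ == "__main__":
    # "Data" is the top-level folder name shown in the tree.
    for folder in missing_folders("Data"):
        print("missing:", folder)
```

An empty result means every folder in the tree was found; it does not check the files inside them.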
The TC-USOD baseline: Baidu Netdisk, fetch code: [ie0k], or Google Drive.
The TC-USOD baseline is simple but strong. It adopts a hybrid encoder-decoder architecture that uses transformers as the basic computational building block of the encoder and convolutions as the basic building block of the decoder.
How to generate predicted saliency maps yourself or retrain this model: create a folder named checkpoint under the TC_USOD folder (cd TC_USOD, then mkdir checkpoint) and put the downloaded TC-USOD baseline model in it to generate the predicted saliency maps (you can also find them in TC_USOD/preds/USOD10K in this project). You can also retrain this method on the available USOD10K dataset to get your own model.
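The folder setup described above can also be scripted. This is a small sketch under stated assumptions: the helper names `ensure_checkpoint_dir` and `has_weights` are hypothetical, and because the file name of the downloaded model is not given here, the check only looks for any file in the folder.

```python
from pathlib import Path


def ensure_checkpoint_dir(project_root="TC_USOD"):
    """Create <project_root>/checkpoint if it does not exist and return its path."""
    ckpt_dir = Path(project_root) / "checkpoint"
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    return ckpt_dir


def has_weights(ckpt_dir):
    """Return True if the checkpoint folder contains at least one regular file."""
    return any(p.is_file() for p in Path(ckpt_dir).iterdir())
```

Calling `ensure_checkpoint_dir()` from the directory that contains TC_USOD is equivalent to the `mkdir checkpoint` step; `has_weights` then tells you whether the downloaded model has been copied in.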
- Python 3.8
- PyTorch 1.6.0
- Torchvision 0.7.0
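To compare an existing environment against the versions listed above without importing the packages, something like the following sketch may be useful. It relies only on the standard-library `importlib.metadata` (available from Python 3.8); the function name `version_mismatches` and the `REQUIRED` table are illustrative, and stripping a local build suffix such as `+cu101` is an assumption about how you want to compare versions.

```python
from importlib import metadata

# Versions taken from the list above, keyed by pip distribution name.
REQUIRED = {"torch": "1.6.0", "torchvision": "0.7.0"}


def version_mismatches(required=REQUIRED, get_version=metadata.version):
    """Return {package: (wanted, found)} for packages that differ or are missing.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, wanted in required.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Treat builds such as "1.6.0+cu101" as version 1.6.0.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in version_mismatches().items():
        print(f"{name}: wanted {wanted}, found {found}")
```

Newer versions may still work; the listed versions are simply the ones stated above.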
We retrained 35 SOTA methods from the fields of SOD and USOD, which took about 1,750 hours in total. Here is the qualitative evaluation of the 35 SOTA methods and the proposed TC-USOD baseline.

(1) Retrained models are available: Baidu Netdisk, fetch code: [usod] | Google Drive
(2) Predicted saliency maps of USOD10K are available: Baidu Netdisk, fetch code: [usod] | Google Drive
(3) Predicted saliency maps of USOD are available: Baidu Netdisk, fetch code: [usod] | Google Drive
(4) Evaluation results are available: Baidu Netdisk, fetch code: [usod] | Google Drive
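If you want to sanity-check a downloaded saliency map against its ground truth, mean absolute error (MAE) is one metric commonly used in SOD work. The sketch below is only an illustration in plain Python: it is not the evaluation tool used for the released results, the metrics reported there are not listed here, and the function name and the assumption that both maps are already scaled to [0, 1] are ours.

```python
def mean_absolute_error(pred, gt):
    """MAE between two equally sized 2-D maps (lists of rows) with values in [0, 1]."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    count = sum(len(row) for row in gt)
    if count == 0:
        raise ValueError("maps must not be empty")
    # Sum the per-pixel absolute differences, then average over all pixels.
    total = sum(
        abs(p - g)
        for pred_row, gt_row in zip(pred, gt)
        for p, g in zip(pred_row, gt_row)
    )
    return total / count
```

For example, a 2x2 prediction `[[0.0, 1.0], [0.5, 0.5]]` against the mask `[[0, 1], [1, 0]]` has two exact pixels and two that are off by 0.5, so the MAE is 0.25. Lower is better.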
If you find our work helpful, please cite:
@ARTICLE{10102831,
author={Hong, Lin and Wang, Xin and Zhang, Gan and Zhao, Ming},
journal={IEEE Transactions on Image Processing},
title={USOD10K: A New Benchmark Dataset for Underwater Salient Object Detection},
year={2025},
volume={34},
number={},
pages={1602-1615},
doi={10.1109/TIP.2023.3266163}}
(1) NJUD [baidu pan fetch code: 7mrn | Google drive]
(2) NLPR [baidu pan fetch code: tqqm | Google drive]
(3) DUTLF-Depth [baidu pan fetch code: 9jac | Google drive]
(4) STERE [baidu pan fetch code: 93hl | Google drive]
(5) LFSD [baidu pan fetch code: l2g4 | Google drive]
(6) RGBD135 [baidu pan fetch code: apzb | Google drive]
(7) SSD [baidu pan fetch code: j3v0 | Google drive]
(8) SIP [baidu pan fetch code: q0j5 | Google drive]
We thank the authors of VST for providing the T2T-ViT backbone, the authors of DPT for providing the method we used to estimate depth maps from the single underwater images in USOD10K, the authors of SVAM-Net for providing the USOD dataset, and Zhao Zhang for providing the efficient evaluation tool.
We hope our work will boost the development of USOD research. As a young research field, USOD is still far from being solved, leaving large room for further improvement!
