SafeEar: Content Privacy-Preserving Audio Deepfake Detection

By [1] Zhejiang University, [2] Tsinghua University.

Xinfeng Li* [1], Kai Li* [2], Yifan Zheng [1], Chen Yan† [1], Xiaoyu Ji [1], Wenyuan Xu [1].

This repository is the official implementation of SafeEar, accepted to ACM CCS 2024 (Core-A*, CCF-A, Big4).

Please also visit our (1) Project Website, (2) Full CVoiceFake Dataset, and (3) Sampled CVoiceFake Dataset.

🔥News

[2025-03-18]: Added batch testing support for ASVspoof 2019 and 2021, and fixed several bugs in the datasets and trainer.

[2024-12-10]: Fixed the known bugs in training and testing, and uploaded the data generation files to datas/.

[2024-12-01]: Uploaded the checkpoint used for data generation to datas/.

✨Key Highlights:

In this paper, we propose SafeEar, a novel framework that detects deepfake audio without accessing the speech content within it. Our key idea is to turn a neural audio codec into a decoupling model that separates the semantic and acoustic information in audio samples, and to use only the acoustic information (e.g., prosody and timbre) for deepfake detection. In this way, no semantic content is exposed to the detector. To overcome the challenge of identifying diverse deepfake audio without semantic clues, we enhance the deepfake detector with multi-head self-attention and codec augmentation. Extensive experiments on four benchmark datasets demonstrate SafeEar’s effectiveness in detecting various deepfake techniques, with an equal error rate (EER) as low as 2.02%. At the same time, it shields speech content in five languages from being deciphered by both machine and human auditory analysis, as shown by word error rates (WERs) all above 93.93% and by our user study. Furthermore, the benchmark we constructed for anti-deepfake and anti-content-recovery evaluation provides a basis for future research on audio privacy preservation and deepfake detection.
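For readers unfamiliar with the EER metric quoted above, the following is a minimal sketch of how an equal error rate can be computed from detector scores. It is not the repository's evaluation code: the function name `compute_eer` and the score convention (a higher score means "more likely bona fide") are assumptions made for illustration.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold) for two lists of detector scores.

    Assumed convention: higher score = more likely bona fide.
    The EER is the error rate at the threshold where the false
    acceptance rate (FAR) and false rejection rate (FRR) are closest.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # FRR: fraction of bona fide samples rejected (score below threshold).
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # FAR: fraction of spoofed samples accepted (score at or above threshold).
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Toy scores: one bona fide sample and one spoofed sample overlap.
    bonafide = [0.9, 0.8, 0.7, 0.4]
    spoof = [0.1, 0.2, 0.3, 0.6]
    eer, threshold = compute_eer(bonafide, spoof)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

A lower EER means the detector separates bona fide and spoofed audio more cleanly. Perfectly separated score sets yield an EER of 0.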

🚀Overall Pipeline

🔧Installation

Clone the repository:

git clone git@github.com:LetterLiGo/SafeEar.git
cd SafeEar/

Create and activate the conda environment:

conda create -n safeear python=3.9 
conda activate safeear

Install PyTorch, torchvision, and torchaudio following the official instructions. The code requires Python 3.9, PyTorch 1.13, torchvision 0.14, and torchaudio 0.13. The command below installs the CUDA 11.6 builds:

pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116

Install other dependencies:

pip install pip==24.0
pip install -r requirements.txt
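After installation, a quick check that the environment matches the versions listed above can save debugging time later. The following is a small sketch using only the Python standard library. The helper names `version_matches` and `check_environment` are hypothetical and not part of this repository.

```python
from importlib import metadata

# Required major.minor releases, taken from the installation notes above.
REQUIRED = {"torch": "1.13", "torchvision": "0.14", "torchaudio": "0.13"}


def version_matches(installed, required_prefix):
    """True if an installed version such as '1.13.1+cu116' is a 1.13 release."""
    release = installed.split("+")[0]  # drop a local tag such as '+cu116'
    return release.split(".")[:2] == required_prefix.split(".")


def check_environment(required=REQUIRED):
    """Map each package to (installed_version_or_None, matches_requirement)."""
    report = {}
    for package, prefix in required.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = (None, False)
            continue
        report[package] = (installed, version_matches(installed, prefix))
    return report


if __name__ == "__main__":
    for package, (installed, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{package}: {installed or 'not installed'} [{status}]")
```

Run it inside the activated safeear environment. Any MISMATCH line points to a package that should be reinstalled with the pinned version.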

📊Model Performance

ASVspoof 2019 & 2021

Speech Recognition Performance

Data preparation

ASVspoof 2019 & 2021

Please download the ASVspoof 2019 and ASVspoof 2021 datasets and extract them to the datas/datasets directory.

datas/datasets/ASVSpoof2019
datas/datasets/ASVSpoof2021
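Directory names are case-sensitive on Linux, so the extracted folder names must match the paths passed to later commands exactly. The following is a minimal sketch of a layout check. The helper `check_dataset_dirs` is hypothetical and not part of this repository, and the expected names in the usage example are taken from the feature-generation commands in this README.

```python
from pathlib import Path


def check_dataset_dirs(root, expected_names):
    """Map each expected directory name to 'ok', 'missing', or 'case mismatch: <actual>'."""
    root = Path(root)
    # Compare against the real directory listing so the check also
    # behaves correctly on case-insensitive filesystems.
    actual = [p.name for p in root.iterdir() if p.is_dir()] if root.is_dir() else []
    status = {}
    for name in expected_names:
        if name in actual:
            status[name] = "ok"
            continue
        lookalikes = [a for a in actual if a.lower() == name.lower()]
        status[name] = f"case mismatch: {lookalikes[0]}" if lookalikes else "missing"
    return status


if __name__ == "__main__":
    # Run from the repository root.
    result = check_dataset_dirs("datas/datasets", ["ASVSpoof2019", "ASVSpoof2021"])
    for name, state in result.items():
        print(f"{name}: {state}")
```

If a directory is reported as a case mismatch, either rename it or adjust the paths in the commands and configuration files so that they agree.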

Generate the Hubert L9 feature files

mkdir model_zoos
cd model_zoos
wget https://dl.fbaipublicfiles.com/hubert/hubert_base_ls960.pt
wget "https://cloud.tsinghua.edu.cn/f/413a0cd2e6f749eea956/?dl=1" -O SpeechTokenizer.pt
cd ../datas
# Generate the Hubert L9 feature files for ASVspoof 2019
python dump_hubert_avg_feature.py datasets/ASVSpoof2019 datasets/ASVSpoof2019_Hubert_L9
# Generate the Hubert L9 feature files for ASVspoof 2021
python dump_hubert_avg_feature.py datasets/ASVSpoof2021 datasets/ASVSpoof2021_Hubert_L9

📚Training

Before starting training, please modify the parameter configurations in config/.

Use the following commands to start training:

python train.py --conf_dir config/train19.yaml
python train.py --conf_dir config/train21.yaml

📈Testing/Inference

To evaluate a model on one or more GPUs, set CUDA_VISIBLE_DEVICES to the GPUs you want to use and specify the dataset, model, and checkpoint in the configuration file:

python test.py --conf_dir Exps/ASVspoof19/config.yaml
python test.py --conf_dir Exps/ASVspoof21/config.yaml
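The commands above do not show how CUDA_VISIBLE_DEVICES is set. The following is a small sketch of one way to choose GPUs when launching an evaluation from Python. The wrapper `run_test` is hypothetical and not part of this repository. It only relies on the `--conf_dir` flag shown above and on the standard CUDA_VISIBLE_DEVICES environment variable. Whether more than one visible GPU is actually used depends on the settings in the configuration file.

```python
import os
import subprocess
import sys


def run_test(config_path, gpu_ids, dry_run=True):
    """Build (and optionally launch) a test.py run restricted to the given GPUs.

    Returns (command, env). With dry_run=True nothing is executed.
    """
    env = dict(os.environ)
    # Only the listed GPU indices are visible to the child process.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    command = [sys.executable, "test.py", "--conf_dir", config_path]
    if not dry_run:
        # Must be called from the repository root, where test.py lives.
        subprocess.run(command, env=env, check=True)
    return command, env


if __name__ == "__main__":
    command, env = run_test("Exps/ASVspoof19/config.yaml", gpu_ids=[0, 1])
    print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(command))
```

The equivalent one-off shell form is to prefix the command with the variable, for example CUDA_VISIBLE_DEVICES=0,1 followed by the python test.py command.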

Bugs and Issues

If you encounter RuntimeError: Failed to load audio from <_io.BytesIO object at 0x7f45cb978f90>, install an ffmpeg version below 4.4 with the following command:

conda install -c anaconda 'ffmpeg<4.4'

📜Citation

If you find our work/code/dataset helpful, please consider citing:

@inproceedings{li2024safeear,
  author       = {Li, Xinfeng and Li, Kai and Zheng, Yifan and Yan, Chen and Ji, Xiaoyu and Xu, Wenyuan},
  title        = {{SafeEar: Content Privacy-Preserving Audio Deepfake Detection}},
  booktitle    = {Proceedings of the 2024 {ACM} {SIGSAC} Conference on Computer and Communications Security (CCS)},
  year         = {2024},
}
