diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index d4085b7..cd09914 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -31,7 +31,10 @@ cd avex
 ### Install with uv (Recommended)
 
 ```bash
-# Install the project with dev dependencies
+# Install the project without dev dependencies
+uv sync
+
+# For ESP developers and contributors, install the project with dev dependencies
 uv sync --group dev
 ```
 
diff --git a/README.md b/README.md
index f502ceb..72afc97 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,8 @@
-# avex - Animal Vocalization Encoder Library
+# AVEX - Animal Vocalization Encoder Library
+[![arXiv](https://img.shields.io/badge/arXiv-2508.11845-b31b1b.svg)](https://arxiv.org/abs/2508.11845)
+[![PyPI](https://img.shields.io/pypi/v/avex.svg)](https://pypi.org/project/avex/)
+[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Collection-yellow)](https://huggingface.co/collections/EarthSpeciesProject/esp-aves2)
 
 ![CI status](https://github.com/earthspecies/avex/actions/workflows/pythonapp.yml/badge.svg?branch=main)
 ![Pre-commit status](https://github.com/earthspecies/avex/actions/workflows/pre-commit.yml/badge.svg?branch=main)
@@ -7,7 +10,7 @@ An API for model loading and inference, and a Python-based system for training a
 
 ## Description
 
-The Animal Vocalization Encoder library avex provides a unified interface for working with pre-trained bioacoustics representation learning models, with support for:
+The Animal Vocalization Encoder library AVEX provides a unified interface for working with pre-trained bioacoustics representation learning models, with support for:
 
 - **Model Loading**: Load pre-trained models with checkpoints and class mappings
 - **Embedding Extraction**: Extract features from audio for downstream tasks
@@ -148,9 +151,9 @@ The framework supports the following audio representation learning models:
 - **ATST** - Audio Spectrogram Transformer
 - **ResNet** - ResNet models (ResNet18, ResNet50, ResNet152)
 - **CLIP** - Contrastive Language-Audio Pretraining models
-- **BirdNet** - BirdNet models for bioacoustic classification
-- **Perch** - Perch models for bioacoustics
-- **SurfPerch** - SurfPerch models
+- **BirdNet** - BirdNet models for bioacoustic classification (external TensorFlow model; some features may not be available)
+- **Perch** - Perch models for bioacoustics (external TensorFlow model; some features may not be available)
+- **SurfPerch** - SurfPerch models (external TensorFlow model; some features may not be available)
 
 See [Supported Models](docs/supported_models.md) for detailed information and configuration examples.
 
@@ -175,11 +178,32 @@ See [Probe System](docs/probe_system.md) and [API Probes](docs/api_probes.md) fo
 If you use this framework in your research, please cite:
 
 ```bibtex
-@article{miron2025matters,
+@inproceedings{miron2025matters,
   title={What Matters for Bioacoustic Encoding},
-  author={Miron, Marius and Robinson, David and Alizadeh, Milad and Gilsenan-McMahon, Ellen and Narula, Gagan and Pietquin, Olivier and Geist, Matthieu and Chemla, Emmanuel and Cusimano, Maddie and Effenberger, Felix and others},
-  journal={arXiv preprint arXiv:2508.11845},
-  year={2025}
+  author={Miron, Marius and Robinson, David and Alizadeh, Milad and Gilsenan-McMahon, Ellen and Narula, Gagan and Chemla, Emmanuel and Cusimano, Maddie and Effenberger, Felix and Hagiwara, Masato and Hoffman, Benjamin and Keen, Sara and Kim, Diane and Lawton, Jane K. and Liu, Jen-Yu and Raskin, Aza and Pietquin, Olivier and Geist, Matthieu},
+  booktitle={The Fourteenth International Conference on Learning Representations},
+  year={2026}
+}
+```
+
+Related ESP papers:
+
+```bibtex
+@inproceedings{miron2026probing,
+  title={Multi-layer attentive probing improves transfer of audio representations for bioacoustics},
+  author={Miron, Marius and Robinson, David and Hagiwara, Masato and Parcollet, Titouan and Cauzinille, Jules and Narula, Gagan and Alizadeh, Milad and Gilsenan-McMahon, Ellen and Keen, Sara and Chemla, Emmanuel and Hoffman, Benjamin and Cusimano, Maddie and Kim, Diane and Effenberger, Felix and Lawton, Jane K. and Raskin, Aza and Pietquin, Olivier and Geist, Matthieu},
+  booktitle={ICASSP 2026-2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+  pages={1--5},
+  year={2026},
+  organization={IEEE}
+}
+@inproceedings{hagiwara2023aves,
+  title={{AVES}: Animal vocalization encoder based on self-supervision},
+  author={Hagiwara, Masato},
+  booktitle={ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+  pages={1--5},
+  year={2023},
+  organization={IEEE}
 }
 ```
 
@@ -187,12 +211,6 @@ If you use this framework in your research, please cite:
 
-We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for:
+We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, testing, code style guidelines, and the pull request process.
 
-- Development setup
-- Running tests
-- Code style guidelines
-- Adding new functionality
-- Pull request process
-
 ## License
 
 This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
@@ -200,4 +218,6 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file
 ## Acknowledgments
 
 - Built on top of PyTorch
-- Integrates with various pre-trained audio models
+- ICLR 2026 and ICASSP 2026 reviewers for their feedback
+- Titouan Parcollet for templating and engineering feedback
+- The bioacoustics community (IBAC, BioDCASE, ABS)
diff --git a/docs/index.md b/docs/index.md
index 1821fbf..aa80785 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,10 +1,10 @@
-# Representation Learning Framework Documentation
+# AVEX - Animal Vocalization Encoder Library Documentation
 
 Welcome to the Representation Learning Framework documentation. This framework provides an API for model loading and inference, and a Python-based system for training and evaluating bioacoustics representation learning models.
 
 ## Getting Started
 
-### What is avex?
+### What is AVEX?
 
-The Representation Learning Framework is an API for model loading and inference, and a Python-based system for training and evaluating bioacoustics representation learning models. It provides:
+AVEX is an API for model loading and inference, and a Python-based system for training and evaluating bioacoustics representation learning models. It provides:
 
diff --git a/pyproject.toml b/pyproject.toml
index 67736b4..be75640 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "avex"
-version = "0.5.0a1"
+version = "1.0.0"
 description = "A comprehensive Python-based system for training, evaluating, and analyzing audio representation learning models with support for both supervised and self-supervised learning paradigms"
 readme = "README.md"
 requires-python = ">=3.10,<3.13"
diff --git a/uv.lock b/uv.lock
index 2166f5b..db7cdb1 100644
--- a/uv.lock
+++ b/uv.lock
@@ -380,7 +380,7 @@ wheels = [
 
 [[package]]
 name = "avex"
-version = "0.5.0a1"
+version = "1.0.0"
 source = { editable = "." }
 dependencies = [
     { name = "birdnetlib" },
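As a note for reviewers: the two install paths split by the CONTRIBUTING.md hunk above can be exercised from a fresh clone roughly as follows. This is a sketch, not part of the patch; it assumes `uv` is on the PATH, is run from the repository root, and the exact contents of the dev dependency group are whatever `pyproject.toml` defines.

```shell
# Library users: resolve and install runtime dependencies only
uv sync

# Contributors (e.g. ESP developers): additionally install the dev group
# (contents are defined under [dependency-groups] in pyproject.toml)
uv sync --group dev

# Smoke-check that the package resolved at the bumped version (assumption:
# the environment created by uv sync exposes the installed distribution)
uv run python -c "import importlib.metadata as m; print(m.version('avex'))"
```

Since the version bump to 1.0.0 also touches `uv.lock`, running `uv sync` locally and committing the regenerated lockfile keeps the two files consistent.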