Canapy Tools Documentation

Welcome to Canapy Tools! This repository contains a collection of scripts designed to facilitate audio model training and annotation. Below is a comprehensive guide to help you utilize each script effectively.

Getting Started with Canapy

Launch the Original Project

1. Set Up the Environment

First, you need to set up a Python 3.11 environment with CUDA.

Next, clone the repository:

Repository URL: https://github.com/birds-canopy/canapy.git

To install the dependencies, run the following command:

pip install -e <path_to_canapy_directory_containing_pyproject.toml>

2. Verify the Main Branch and Launch the Dashboard

Make sure you're on the main branch, then use one of the following commands to launch the dashboard. Replace the example dataset path with the path to your own dataset:

canapy dash -d D:\Inria\Datasets\M1-2016-spring -o output

If your audio files and annotations are in separate folders, use this command instead:

canapy dash -a song_dataset/annotations -s song_dataset/audio -o output

Main Scripts

Canapy_Main_Pipeline.py: Use this script to easily train a model and annotate unannotated audio files.
Canapy_Training.py: This script is specifically for training a model with Canapy.
Canapy_Annotation.py: Use this script to annotate unannotated audio files using a pre-trained model.

Reproducing Experiments

Basic Experiments

@TODO: put filenames at the beginning of all files

"Basic" refers to training the model on all available data to get the best performance, and repeating the training across multiple seeds to evaluate how reliable the results are:

BASIC_Training_Annotation.py: This script trains models and annotates a dataset across several seeds within a working directory.
BASIC_Metrics.py: Use this script to calculate metrics, including the syllable error rate and frame error rate, from the annotations generated by BASIC_Training_Annotation.py.
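
For orientation, the two metrics named above can be sketched as follows. This is a minimal illustration and not the code in BASIC_Metrics.py: the function names are hypothetical, the syllable error rate is assumed to be the Levenshtein edit distance between the predicted and reference label sequences divided by the reference length, and the frame error rate is assumed to be the fraction of frames whose labels differ.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def syllable_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by the reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference sequence is empty")
    return edit_distance(ref, hyp) / len(ref)


def frame_error_rate(ref_frames: Sequence[str], hyp_frames: Sequence[str]) -> float:
    """Fraction of frames whose predicted label differs from the reference."""
    if len(ref_frames) != len(hyp_frames):
        raise ValueError("frame sequences must have the same length")
    if not ref_frames:
        raise ValueError("frame sequences are empty")
    errors = sum(r != h for r, h in zip(ref_frames, hyp_frames))
    return errors / len(ref_frames)
```

With these definitions, a syllable error rate above 1.0 is possible when the prediction contains many more syllables than the reference, whereas the frame error rate always stays between 0 and 1.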

Data Size Experiments

"Datasize" involves progressively increasing the training set to observe Canapy’s performance with growing amounts of data:

DATASIZE_Training_Annotation.py: Train models on various train/test splits within a working directory.
DATASIZE_Metrics.py: Calculate metrics, such as the syllable error rate and frame error rate, from the annotations generated by DATASIZE_Training_Annotation.py.
DATASIZE_Graph.py: This script plots and saves a graph representing the performance evolution of Canapy based on training set size.
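
The idea of a progressively growing training set can be sketched as below. This is an assumption-laden illustration, not the logic of DATASIZE_Training_Annotation.py: the helper name nested_training_subsets is hypothetical, and subsets are sized here by number of files, whereas the actual scripts may size them differently (for example by audio duration).

```python
import random
from typing import Dict, List, Sequence


def nested_training_subsets(
    files: Sequence[str], sizes: Sequence[int], seed: int = 0
) -> Dict[int, List[str]]:
    """Return one training subset per size; each larger subset contains the smaller ones."""
    if any(size < 0 or size > len(files) for size in sizes):
        raise ValueError("each size must be between 0 and the number of files")
    # Sort first so the result depends only on the seed, not on the input order.
    shuffled = sorted(files)
    random.Random(seed).shuffle(shuffled)
    # Taking prefixes of one shuffled list guarantees the subsets are nested.
    return {size: shuffled[:size] for size in sorted(sizes)}


# Example usage with hypothetical file names.
subsets = nested_training_subsets([f"song_{i}.wav" for i in range(10)], [2, 5, 10], seed=42)
```

Nesting the subsets means that each increase in size only adds data, so a change in performance between two sizes is not confounded by a completely different choice of training files.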

Additional Scripts

CheckWavSamplingRate.py: Retrieves the sampling rate of a .wav file, which is essential for training with Canapy.
Check_Dataset_Syllables.py: Extracts and displays all labels represented in an annotation folder.
EditDistanceCalculator_Marron1.py: Calculates the edit distance between two annotation folders in the Marron1 CSV format.
EditDistanceCalculator_Vak.py: Computes the edit distance between two annotation folders in the SimpleSeq format.
FrameErrorRate.py: Calculates the frame error rate between two annotation files in the Marron1 CSV format.
Marron1ToDAS.py: Converts an annotation folder to a format accepted by DAS.
Marron1ToSimpleSeq.py: Converts an annotation folder from the Marron1 CSV format to the SimpleSeq format (this conversion is handled automatically by Canapy).
Model_reader.py: Reads the configuration of a model trained by Canapy.
SimpleSeqToMarron1.py: Converts an annotation folder from the SimpleSeq format to the Marron1 CSV format.
UMAP_Marron1.py: Displays a UMAP visualization of a dataset in Marron1 CSV format.
Wav_Copy.py: Copies .wav files from one directory to another.
Wav_Analyze.py: Analyzes the durations of the .wav audio files in a dataset, which helps when choosing the sequence values for a Datasize experiment.
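
As an illustration of the sampling rate check mentioned above, the following sketch reads the rate with Python's standard wave module. It is not the code in CheckWavSamplingRate.py; the function names are hypothetical, and the wave module is assumed to be sufficient, which holds for uncompressed PCM .wav files.

```python
import wave
from pathlib import Path
from typing import Dict, Union


def wav_sampling_rate(path: Union[str, Path]) -> int:
    """Return the sampling rate (in Hz) stored in the header of a .wav file."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getframerate()


def sampling_rates_in_folder(folder: Union[str, Path]) -> Dict[str, int]:
    """Map each .wav file name in a folder to its sampling rate."""
    return {p.name: wav_sampling_rate(p) for p in sorted(Path(folder).glob("*.wav"))}
```

Running sampling_rates_in_folder on a dataset folder before training makes it easy to spot files whose sampling rate differs from the rest.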
