Set up API endpoint for OrcaHello that analyzes audio files on demand #267

@adrmac

Description

The current OrcaHello pipeline is designed to ingest live streaming audio. Audio segments are continuously pulled from S3, processed into 60-second WAV files, analyzed, and written to CosmosDB if there are detections.

This architecture does not currently expose a "send me a WAV and I'll return detections" API. We would like a lightweight way to analyze static audio files without setting up the full Azure/S3 ingestion flow.
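For reference, the 60-second windowing step the live pipeline performs can be sketched in a few lines. This is an illustrative stand-in (the function name and defaults are assumptions, not the actual OrcaHello code), assuming mono PCM samples already decoded from the WAV:

```python
def segment_samples(samples, sample_rate=16000, segment_s=60):
    """Split a flat sequence of audio samples into fixed-length segments.

    The final segment may be shorter than segment_s if the clip does not
    divide evenly; the live pipeline's exact padding/trimming behavior is
    not specified here.
    """
    seg_len = sample_rate * segment_s
    return [samples[i:i + seg_len] for i in range(0, len(samples), seg_len)]
```

An on-demand endpoint could reuse the same windowing so that per-segment detections line up with the live system's output.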

Proposal
Expose a small API endpoint that accepts a WAV/FLAC file and returns time-stamped detections. Outputs should match the live system, including confidence scores and a spectrogram representation.

Benefits / use cases

  • testing model behavior
  • comparing model performance against other models
  • re-running false negatives to find blind spots
  • live endpoint for an 'analyze this clip' UI feature

Skills needed

  • Python / PyTorch
  • ffmpeg / torchaudio
  • FastAPI or ASP.NET

Originally discussed in orcasite#931

Metadata

Assignees: no one assigned
Labels: enhancement (New feature or request), inference system (Code to perform inference with the trained model(s))
Status: Todo
Milestone: no milestone