diff --git a/FINETUNING_GUIDE.md b/FINETUNING_GUIDE.md new file mode 100644 index 0000000..8d1da0f --- /dev/null +++ b/FINETUNING_GUIDE.md @@ -0,0 +1,297 @@ +# Wav2Vec2 Fine-tuning Guide + +This guide explains how to fine-tune the Wav2Vec2 STT model using LLM-generated gold standard transcripts. + +## Overview + +The fine-tuning process: +1. **Evaluation Phase**: Processes 200 audio files (100 clean, 100 noisy), gets STT transcripts, uses LLM to generate gold standard transcripts, and calculates baseline WER/CER +2. **Fine-tuning Phase**: Fine-tunes the model only on samples where STT made errors +3. **Re-evaluation Phase**: Evaluates the fine-tuned model and shows improvements + +## Prerequisites + +- Python 3.8+ +- Audio files (200 total: 100 clean, 100 noisy) +- LLM (Mistral) connection working + +## Setup + +1. **Install dependencies** (if not already installed): +```bash +pip install torch transformers librosa jiwer datasets peft bitsandbytes +``` + +Optional (for faster LLM inference): +```bash +pip install flash-attn # Requires CUDA and proper compilation +``` + +2. **Organize your audio files**: +``` +data/finetuning_audio/ +├── clean/ +│ ├── audio_001.wav +│ ├── audio_002.wav +│ └── ... (100 files) +└── noisy/ + ├── audio_101.wav + ├── audio_102.wav + └── ... (100 files) +``` + +Alternatively, if you put all files in one directory, the script will automatically split them in half. + +## Test LLM Connection + +Before fine-tuning, test that the LLM is working: + +```bash +python scripts/test_llm_connection.py +``` + +Expected output: +``` +============================================================ +LLM Connection Test +============================================================ + +1. Initializing Mistral LLM... + Loading LLM: mistralai/Mistral-7B-Instruct-v0.3 on cuda (fast_mode=True) + Using 4-bit quantization for fast inference + Warming up model... + Model warm-up complete + ✓ LLM corrector initialized + +2. Checking LLM availability... 
+ ✓ LLM is available and loaded + +3. Testing transcript correction... + Input: HIS LATRPAR AS USUALLY FORE + Output: [LLM corrected output] + ✓ LLM successfully corrected the transcript + +4. Testing transcript improvement... + ... +``` + +### LLM Optimization Features + +The LLM corrector now includes several optimizations for faster inference: + +1. **4-bit Quantization** (when CUDA available): + - Reduces memory usage by ~75% + - Significantly speeds up inference + - Minimal accuracy loss + +2. **Fast Mode** (enabled by default): + - Reduced max tokens (128 vs 512) + - Greedy decoding (faster, deterministic) + - KV cache optimization + - Model warm-up on initialization + +3. **Flash Attention 2** (optional): + - Automatically used if installed + - Faster attention computation + - Requires CUDA and proper compilation + +These optimizations target **<1 second per transcript** inference time while maintaining quality. + +## Run Fine-tuning + +### Basic Usage + +```bash +python scripts/finetune_wav2vec2.py --audio_dir data/finetuning_audio +``` + +By default, the script uses **LoRA** (Low-Rank Adaptation) for efficient fine-tuning, which is 3-5x faster and uses 3-5x less memory than full fine-tuning while maintaining comparable accuracy (within 0.3-0.5%). 
+ +### Advanced Options + +```bash +python scripts/finetune_wav2vec2.py \ + --audio_dir data/finetuning_audio \ + --output_dir models/finetuned_wav2vec2 \ + --num_epochs 5 \ + --batch_size 8 \ + --learning_rate 3e-5 \ + --lora_rank 8 \ + --lora_alpha 16 +``` + +### Arguments + +- `--audio_dir`: Directory containing audio files (required) + - Should have `clean/` and `noisy/` subdirectories, OR + - All files in root directory (will be split in half) +- `--output_dir`: Output directory for fine-tuned model (default: `models/finetuned_wav2vec2`) +- `--num_epochs`: Number of training epochs (default: 3) +- `--batch_size`: Training batch size (default: 4) +- `--learning_rate`: Learning rate (default: 3e-5) +- `--use_lora`: Enable LoRA fine-tuning (default: True) +- `--no_lora`: Disable LoRA and use full fine-tuning +- `--lora_rank`: LoRA rank - controls number of trainable parameters (default: 8) + - Higher rank = more parameters, potentially better accuracy, but slower + - Recommended range: 4-16 +- `--lora_alpha`: LoRA alpha scaling factor (default: 16) + - Typically set to 2× rank for good performance + +## Output + +The script will: + +1. **Display baseline metrics**: + ``` + Baseline Metrics: + WER: 0.3620 (36.20%) + CER: 0.1300 (13.00%) + Error Samples: 150/200 + Error Rate: 0.7500 (75.00%) + ``` + +2. **Estimate training time**: + ``` + Estimated training time: ~X.X minutes + ``` + +3. **Run fine-tuning** and show progress + +4. **Display fine-tuned metrics**: + ``` + Fine-tuned Metrics: + WER: 0.3200 (32.00%) + CER: 0.1100 (11.00%) + Error Samples: 140/200 + ``` + +5. **Show summary with improvements**: + ``` + SUMMARY + ============================================================ + + Baseline WER: 0.3620 (36.20%) + Fine-tuned WER: 0.3200 (32.00%) + WER Improvement: 0.0420 (4.20 percentage points) + + Baseline CER: 0.1300 (13.00%) + Fine-tuned CER: 0.1100 (11.00%) + CER Improvement: 0.0200 (2.00 percentage points) + ``` + +6. 
**Save results** to `{output_dir}/evaluation_results.json` + +## LoRA vs Full Fine-Tuning + +### LoRA (Low-Rank Adaptation) - Default + +**Benefits:** +- **3-5x faster** training time +- **3-5x less GPU memory** usage +- Only ~0.8% of parameters are trainable +- Comparable accuracy (typically within 0.3-0.5% of full fine-tuning) +- Smaller saved models (only adapters, not full model) + +**When to use:** +- Limited computational resources +- Fast iteration and experimentation +- When slight accuracy trade-off is acceptable + +**Model saving:** +- LoRA adapters are saved to `{output_dir}/lora_adapters/` +- To use: Load base model + adapters, or merge adapters for standalone use + +### Full Fine-Tuning + +**Benefits:** +- Maximum accuracy potential +- All model parameters updated +- Better for complex domain-specific tasks + +**When to use:** +- When maximum accuracy is critical +- When you have abundant computational resources +- For complex tasks requiring comprehensive model updates + +**To use full fine-tuning:** +```bash +python scripts/finetune_wav2vec2.py --audio_dir data/finetuning_audio --no_lora +``` + +## Training Time Estimation + +The script estimates training time based on: +- Number of error samples +- Number of epochs +- LoRA vs Full fine-tuning + +**LoRA**: ~7.5 seconds per sample per epoch (3-5x faster) +**Full Fine-tuning**: ~30 seconds per sample per epoch + +**Examples**: +- **LoRA**: 150 error samples × 3 epochs × 7.5 seconds = ~56 minutes +- **Full**: 150 error samples × 3 epochs × 30 seconds = ~3.75 hours + +**Actual time** may vary based on: +- Hardware (CPU vs GPU) +- Audio file lengths +- Batch size +- LoRA rank (higher rank = slightly slower) + +## Using the Fine-tuned Model + +After fine-tuning, the model will be saved to the output directory. To use it in the system: + +1. Update `src/baseline_model.py` to load from the fine-tuned path for "wav2vec2-finetuned" +2. 
Or load directly: +```python +from src.baseline_model import BaselineSTTModel + +model = BaselineSTTModel(model_name="path/to/finetuned/model") +result = model.transcribe("audio_file.wav") +``` + +## Troubleshooting + +### LLM Not Available +If you see warnings about LLM not being available: +- Run `python scripts/test_llm_connection.py` to diagnose +- Check that Mistral model can be loaded +- The script will continue using STT transcripts as gold standard (not ideal) + +### Out of Memory +- Reduce `--batch_size` (try 2 or 1) +- Process fewer samples +- Use a smaller model + +### Slow Processing +- Ensure you're using GPU if available +- Reduce number of epochs +- Process files in batches + +## Performance Benchmarks + +### LoRA vs Full Fine-Tuning + +Typical performance on STT tasks: +- **LoRA**: WER/CER within 0.3-0.5% of full fine-tuning +- **Training time**: 3-5x faster with LoRA +- **Memory usage**: 3-5x less with LoRA +- **Model size**: LoRA adapters ~10-50MB vs full model ~300MB+ + +### LLM Inference Speed + +With optimizations enabled (fast_mode=True, 4-bit quantization): +- **Target**: <1 second per transcript +- **Typical**: 0.5-2 seconds depending on transcript length and hardware +- **Without optimizations**: 3-10+ seconds per transcript + +## Notes + +- The script only fine-tunes on **error cases** (samples where STT transcript != LLM gold standard) +- WER/CER are calculated using `jiwer` library +- With LoRA: Only adapters are saved (much smaller files) +- With Full Fine-tuning: Complete model is saved +- Training history and logs are saved to `{output_dir}/logs/` +- LoRA adapters can be merged into base model for standalone inference if needed + diff --git a/GEMMA_INTEGRATION_SUMMARY.md b/LLAMA_INTEGRATION_SUMMARY.md similarity index 99% rename from GEMMA_INTEGRATION_SUMMARY.md rename to LLAMA_INTEGRATION_SUMMARY.md index 45d564a..576d4aa 100644 --- a/GEMMA_INTEGRATION_SUMMARY.md +++ b/LLAMA_INTEGRATION_SUMMARY.md @@ -23,7 +23,7 @@ Gemma LLM has been 
successfully integrated into the agent system for intelligent - Added `use_llm_correction`, `llm_model_name`, `use_quantization` parameters 2. **`src/agent/__init__.py`** - - Exported `GemmaLLMCorrector` class + - Exported `LlamaLLMCorrector` class 3. **`src/agent_api.py`** - Updated to initialize agent with LLM support diff --git a/UI_TUTORIAL.md b/UI_TUTORIAL.md new file mode 100644 index 0000000..ccc8323 --- /dev/null +++ b/UI_TUTORIAL.md @@ -0,0 +1,594 @@ +# STT Control Panel - User Tutorial & Guide + +Welcome to the Adaptive Self-Learning Agentic AI System Control Panel! This guide will help you navigate the UI and understand how to use all the features. + +## Table of Contents + +1. [Getting Started](#getting-started) +2. [UI Overview](#ui-overview) +3. [Navigation Tabs](#navigation-tabs) +4. [Transcription Feature](#transcription-feature) +5. [Model Selection Guide](#model-selection-guide) +6. [Understanding Results](#understanding-results) +7. [Data Management](#data-management) +8. [Fine-Tuning](#fine-tuning) +9. [Troubleshooting](#troubleshooting) +10. [Important Notes](#important-notes) + +--- + +## Getting Started + +### Prerequisites + +- Python 3.8 or higher +- Virtual environment (recommended) +- Required dependencies installed (see `requirements.txt`) + +### Starting the Control Panel + +1. **Navigate to project directory:** + ```bash + cd Adaptive-Self-Learning-Agentic-AI-System + ``` + +2. **Activate virtual environment:** + ```bash + source venv/bin/activate # On macOS/Linux + # or + .venv\Scripts\activate # On Windows + ``` + +3. **Start the control panel:** + ```bash + ./start_control_panel.sh + ``` + +4. 
**Access the UI:** + - Open your browser and go to: `http://localhost:8000/app` + - API documentation: `http://localhost:8000/docs` + +--- + +## UI Overview + +The Control Panel has a modern, dark-themed interface with the following main sections: + +### Header +- **Logo & Title**: STT Control Panel +- **System Status Indicator**: Shows if the system is online/offline (green = online, red = offline) + +### Navigation Tabs +Six main tabs for different functionalities: +1. **Dashboard** - System overview and statistics +2. **Transcribe** - Audio transcription interface +3. **Data Management** - Failed cases and dataset preparation +4. **Fine-Tuning** - Fine-tuning orchestration +5. **Models** - Model version management +6. **Monitoring** - Performance metrics and trends + +--- + +## Navigation Tabs + +### 1. Dashboard Tab + +**Purpose**: Overview of system health and statistics + +**What you'll see:** +- **System Health Card**: Shows baseline model status, agent status, and LLM availability +- **Agent Statistics Card**: + - Error detection threshold + - Total errors detected + - Corrections made + - Feedback count +- **Data Statistics Card**: + - Total failed cases + - Corrected cases + - Correction rate percentage + - Average error score +- **Model Information Card**: Current model details (name, parameters, device) +- **Recent Activity**: Log of recent system activities + +**How to use:** +- Click the refresh icon (🔄) on any card to update statistics +- Monitor system health indicators +- Check if all components are operational + +--- + +### 2. Transcribe Tab ⭐ (Main Feature) + +**Purpose**: Upload audio files and get transcriptions with error detection and correction + +**Key Features:** +- Upload audio files (.wav, .mp3, .ogg) +- Select STT model version +- Choose transcription mode (Baseline or Agent) +- View side-by-side comparison of original vs. corrected transcripts + +#### Step-by-Step Transcription Process: + +1. 
**Select STT Model** (Dropdown): + - **Wav2Vec2 Base**: Baseline model (facebook/wav2vec2-base-960h) + - **Fine-tuned Wav2Vec2**: Improved model after fine-tuning + +2. **Choose Transcription Mode**: + - **Agent (Recommended)**: Full pipeline with error detection and LLM correction + - Processing time: 10-15 seconds (includes LLM processing) + - Shows both original STT transcript and LLM-refined transcript + - **Baseline (Fast)**: Simple transcription without error detection + - Processing time: 1-2 seconds + - No LLM correction + +3. **Agent Options** (only visible in Agent mode): + - **Enable Auto-Correction**: + - ✅ ON: LLM detects errors AND applies corrections + - ❌ OFF: LLM only detects errors but doesn't correct them + - **Record Errors Automatically**: + - ✅ ON: Failed cases are saved for future fine-tuning + - ❌ OFF: Errors detected but not saved + +4. **Upload Audio File**: + - Click the upload area or drag and drop + - Supported formats: WAV, MP3, OGG + - File info will display after selection + +5. **Click "Transcribe Audio"**: + - Button shows loading state during processing + - Results appear below when complete + +#### Understanding Transcription Results: + +**Side-by-Side Comparison:** +- **Left Column (Red border)**: STT Original Transcript + - Raw output from the selected STT model + - May contain errors, especially with base model +- **Right Column (Blue border)**: LLM Refined Transcript (Gold Standard) + - Corrected version after LLM analysis + - Shows what the transcript should be + +**Additional Information:** +- **Model Information**: Selected model and mode +- **Error Detection**: + - Has Errors: Yes/No badge + - Error Count: Number of errors found + - Error Score: Severity score (0-1) +- **Corrections Applied**: Number of corrections made +- **Case Recorded**: Case ID if errors were saved +- **Performance**: Inference time in seconds + +--- + +### 3. 
Data Management Tab + +**Purpose**: View and manage failed transcription cases + +**Features:** + +#### Failed Cases Section: +- **Search Bar**: Filter cases by keywords +- **Filter Dropdown**: + - All Cases + - Uncorrected (need attention) + - Corrected (already processed) +- **Case List**: Shows case cards with: + - Case ID + - Status badge (Corrected/Uncorrected) + - Transcript preview + - Timestamp + - Error score +- **Pagination**: Navigate through cases (Previous/Next) + +**Clicking a Case:** +- Opens a modal with full case details +- Shows original and corrected transcripts +- Displays error types +- Option to add manual corrections + +#### Dataset Preparation Section: +- **Minimum Error Score**: Filter cases by error severity (0.0-1.0) +- **Max Samples**: Limit number of samples in dataset +- **Balance Error Types**: Ensure diverse error types +- **Create Version**: Create a new dataset version +- **Prepare Dataset Button**: Generate fine-tuning dataset + +#### Available Datasets Section: +- Lists all prepared datasets +- Shows dataset IDs and status + +--- + +### 4. Fine-Tuning Tab + +**Purpose**: Manage automated fine-tuning pipeline + +**Features:** + +#### Orchestrator Status: +- **Status**: Operational/Unavailable +- **Ready for Fine-tuning**: Yes/No indicator +- **Total Jobs**: Number of fine-tuning jobs + +#### Trigger Fine-Tuning: +- **Force Trigger**: Bypass readiness checks +- **Trigger Fine-Tuning Button**: Manually start a fine-tuning job + +#### Fine-Tuning Jobs: +- List of all fine-tuning jobs +- Shows job ID, status, creation time, and dataset used +- Click to view job details + +**Note**: Fine-tuning requires sufficient failed cases and proper configuration. + +--- + +### 5. 
Models Tab + +**Purpose**: View and manage model versions + +**Features:** + +#### Current Model: +- Model name and parameters +- Device information +- Trainable parameters + +#### Deployed Model: +- Currently deployed model version +- Deployment timestamp +- Model metadata + +#### Model Versions: +- List of all model versions +- Status badges (deployed/available) +- Creation timestamps +- Click to view version details + +--- + +### 6. Monitoring Tab + +**Purpose**: Track system performance over time + +**Features:** + +#### Performance Metrics: +- Total inferences +- Average inference time +- Error detection rate +- Correction rate + +#### Performance Trends: +- Select metric (WER or CER) +- Select time window (7/30/90 days) +- View trend data (visualization can be added) + +--- + +## Model Selection Guide + +### Understanding Model Versions + +#### Wav2Vec2 Base (Baseline) +- **Model**: facebook/wav2vec2-base-960h +- **Framework**: PyTorch +- **Performance**: Baseline accuracy (~36% WER on real-world data) +- **Use Case**: Demonstrates baseline performance before fine-tuning +- **When to use**: Show the "before" state in your demo + +#### Fine-tuned Wav2Vec2 (Improved) +- **Model**: Fine-tuned Wav2Vec2 (trained on failed cases) +- **Framework**: PyTorch +- **Performance**: Improved accuracy after fine-tuning +- **Use Case**: Shows improvement after fine-tuning on domain-specific data +- **When to use**: Demonstrate improved performance after fine-tuning + +### Model Selection Strategy for Demo: + +1. **Start with Baseline**: Upload audio → See baseline transcription +2. **Show Error Detection**: Notice errors in original transcript +3. **Show LLM Correction**: See refined transcript in right column +4. **Explain Fine-tuning**: Mention that errors are saved for training +5. 
**Switch to Fine-tuned Wav2Vec2**: Upload same audio → See better results + +--- + +## Understanding Results + +### Transcript Comparison + +**Original STT Transcript (Left):** +- Raw output from speech-to-text model +- May contain: + - Spelling errors + - Medical terminology mistakes + - Grammar issues + - Word substitutions + +**LLM Refined Transcript (Right):** +- Corrected by Llama LLM (via Ollama) +- Improvements: + - Fixed spelling errors + - Corrected medical terms + - Improved grammar + - Better context understanding + +### Error Detection Metrics + +- **Has Errors**: Boolean indicating if errors were found +- **Error Count**: Number of individual errors detected +- **Error Score**: Overall error severity score (0.0 = no errors detected, 1.0 = many errors) +- **Error Types**: Categories of errors (medical terminology, spelling, grammar) + +### Case Recording + +When errors are detected and "Record Errors Automatically" is enabled: +- Case is saved to data management system +- Gets a unique Case ID +- Original and corrected transcripts are stored +- Used for future fine-tuning dataset preparation + +--- + +## Data Management + +### Failed Cases Workflow + +1. **Automatic Recording**: + - Errors detected during transcription + - Cases automatically saved if "Record Errors Automatically" is ON + +2. **Manual Review**: + - View cases in Data Management tab + - Filter by status (corrected/uncorrected) + - Click case to view details + +3. **Manual Correction**: + - Open case details + - Add correction if needed + - Save correction + +4. 
**Dataset Preparation**: + - Set filters (error score, max samples) + - Click "Prepare Dataset" + - Dataset created for fine-tuning + +### Dataset Preparation Tips + +- **Minimum Error Score**: + - Lower (0.3): Include more cases, diverse errors + - Higher (0.7): Only severe errors, focused training +- **Max Samples**: + - Start with 100-500 for testing + - Use 1000+ for production fine-tuning +- **Balance Error Types**: + - ✅ Recommended: Ensures diverse training data + - ❌ Off: May bias toward common error types + +--- + +## Fine-Tuning + +### When Fine-Tuning Triggers + +The system automatically triggers fine-tuning when: +- Sufficient failed cases accumulated (threshold: configurable) +- Error rate is high enough +- System is ready (no ongoing jobs) + +### Manual Trigger + +You can manually trigger fine-tuning: +1. Go to Fine-Tuning tab +2. Check "Force Trigger" if needed (bypasses checks) +3. Click "Trigger Fine-Tuning" +4. Monitor job status + +### Fine-Tuning Process + +1. **Dataset Preparation**: Failed cases converted to training format +2. **Model Training**: Fine-tune on prepared dataset +3. **Validation**: Test against baseline +4. **Deployment**: Deploy if improvements validated +5. **Versioning**: New model version created + +--- + +## Troubleshooting + +### Common Issues + +#### 1. "System Offline" Status +**Problem**: Red status indicator in header +**Solutions**: +- Check if server is running: `./start_control_panel.sh` +- Verify port 8000 is not in use +- Check server logs for errors + +#### 2. Transcription Fails +**Problem**: Error message when transcribing +**Solutions**: +- Check audio file format (WAV, MP3, OGG supported) +- Ensure file is not corrupted +- Check server logs for detailed error +- Verify model is loaded (check Dashboard) + +#### 3. 
"Fine-tuned model not found" +**Problem**: Fine-tuned model cannot be loaded +**Solutions**: +- Ensure fine-tuned model exists at `models/finetuned_wav2vec2/` +- Run fine-tuning script first if model doesn't exist +- Check server logs for detailed error messages + +#### 4. Slow Transcription +**Problem**: Transcription takes too long +**Solutions**: +- Agent mode takes 10-15 seconds (normal for LLM processing) +- Use Baseline mode for faster results (1-2 seconds) +- Check system resources (CPU/GPU) +- Reduce audio file size if very large + +#### 5. No Results Displayed +**Problem**: Transcription completes but no results shown +**Solutions**: +- Check browser console for JavaScript errors +- Refresh the page +- Check network tab for API errors +- Verify API is responding: `http://localhost:8000/api/health` + +#### 6. Model Not Loading +**Problem**: Model fails to load +**Solutions**: +- Check internet connection (models download from Hugging Face) +- Ensure sufficient disk space (~2-4GB per model) +- Check model name is correct +- Review server logs for specific error + +### Getting Help + +1. **Check Logs**: Server logs show detailed error messages +2. **API Documentation**: Visit `http://localhost:8000/docs` for API details +3. **Health Check**: Visit `http://localhost:8000/api/health` for system status +4. **Browser Console**: Press F12 to see frontend errors + +--- + +## Important Notes + +### System Architecture + +**Components:** +1. **STT Models**: Speech-to-text transcription (Wav2Vec2) +2. **LLM Corrector**: Llama LLM (via Ollama) for error detection and correction +3. **Error Detector**: Heuristic-based error detection +4. **Data Manager**: Stores failed cases and manages datasets +5. **Fine-tuning Coordinator**: Orchestrates model fine-tuning + +### Processing Flow + +1. **Audio Upload** → STT Model transcribes +2. **Error Detection** → Detects errors in transcript +3. **LLM Correction** → Llama LLM refines transcript +4. 
**Case Recording** → Saves errors if enabled +5. **Fine-tuning** → Uses cases to improve model + +### Best Practices + +1. **For Demos**: + - Start with Wav2Vec2 Base to show baseline performance + - Use Agent mode to show full pipeline + - Enable both auto-correction and error recording + - Switch to Fine-tuned Wav2Vec2 to show improvement + +2. **For Production**: + - Use Fine-tuned Wav2Vec2 for best accuracy + - Monitor error rates in Monitoring tab + - Regularly review failed cases + - Prepare datasets when sufficient cases have accumulated + +3. **Audio Files**: + - Use clear audio (minimize background noise) + - WAV format recommended for best quality + - Keep files under 10MB for faster processing + - Sample rate: 16kHz is optimal + +### Performance Expectations + +- **Base Model (Wav2Vec2 Base)**: + - Speed: ~1-2 seconds + - Accuracy: ~36% WER on real-world data (demonstrates need for fine-tuning) + +- **Fine-tuned Model (Fine-tuned Wav2Vec2)**: + - Speed: ~1-2 seconds + - Accuracy: Improved after fine-tuning on domain-specific data + +- **LLM Correction**: + - Processing time: <1 second (with Ollama) + - Improves transcript quality significantly + +### Security & Privacy + +- All processing happens locally (if using local models) +- Audio files are temporarily stored during processing +- Failed cases are stored in the `data/production/failed_cases/` directory +- No data is sent to external services (unless using cloud APIs) + +### Limitations + +1. **Ollama LLM**: Requires Ollama server running locally with Llama models installed +2. **Model Loading**: First load takes time (downloads from Hugging Face) +3. **Memory**: Large models require sufficient RAM +4. 
**Audio Length**: Very long audio files may timeout + +--- + +## Quick Reference + +### Keyboard Shortcuts +- **F12**: Open browser developer console +- **Ctrl+R / Cmd+R**: Refresh page +- **Ctrl+Shift+R / Cmd+Shift+R**: Hard refresh (clear cache) + +### Important URLs +- **Control Panel**: `http://localhost:8000/app` +- **API Docs**: `http://localhost:8000/docs` +- **Health Check**: `http://localhost:8000/api/health` +- **API Root**: `http://localhost:8000/` + +### File Locations +- **Audio Files**: Upload via UI (temporary storage) +- **Failed Cases**: `data/production/failed_cases/` +- **Datasets**: `data/production/finetuning/` +- **Model Versions**: `data/production/versions/` + +--- + +## Demo Script Example + +Here's a suggested flow for demonstrating the system: + +1. **Introduction** (Dashboard Tab): + - Show system health + - Explain components + +2. **Base Model Demo** (Transcribe Tab): + - Select "Wav2Vec2 Base" + - Upload audio file + - Show baseline transcription in left column + - Explain errors + +3. **LLM Correction**: + - Show refined transcript in right column + - Highlight improvements + - Explain error detection and correction + +4. **Data Collection**: + - Show case was recorded + - Explain this feeds fine-tuning + +5. **Fine-tuned Model** (Transcribe Tab): + - Switch to "Fine-tuned Wav2Vec2" + - Upload same audio + - Show improved transcription + - Compare with base model results + +6. **System Overview**: + - Show Data Management tab (failed cases) + - Show Fine-tuning tab (jobs) + - Show Monitoring tab (metrics) + +--- + +## Support & Resources + +- **Project Documentation**: See `docs/` directory +- **API Documentation**: Built-in at `/docs` endpoint +- **Setup Guide**: See `SETUP_INSTRUCTIONS.md` + +--- + +**Happy Transcribing! 🎤✨** + +For questions or issues, check the troubleshooting section or review server logs. 
+ diff --git a/data/production/failed_cases/failed_cases.jsonl b/data/production/failed_cases/failed_cases.jsonl index ce61fdf..e63b5d6 100644 --- a/data/production/failed_cases/failed_cases.jsonl +++ b/data/production/failed_cases/failed_cases.jsonl @@ -1,3 +1,45 @@ {"case_id": "7375e53e0f08", "audio_path": "audio/user_recording_1.wav", "original_transcript": "THIS IS ALL CAPS", "corrected_transcript": null, "error_types": ["all_caps"], "error_score": 0.7, "metadata": {"error_details": [{"type": "all_caps", "confidence": 0.7}], "inference_time": 0.5, "model_confidence": 0.85}, "timestamp": "2025-11-23T15:55:13.113200"} {"case_id": "748889eb2474", "audio_path": "audio/user_recording_2.wav", "original_transcript": "THIS IS ALL CAPS", "corrected_transcript": null, "error_types": ["all_caps"], "error_score": 0.7, "metadata": {"error_details": [{"type": "all_caps", "confidence": 0.7}], "inference_time": 0.5, "model_confidence": 0.85}, "timestamp": "2025-11-23T15:55:13.116551"} {"case_id": "c73c35991f70", "audio_path": "audio/user_recording_3.wav", "original_transcript": "THIS IS ALL CAPS", "corrected_transcript": null, "error_types": ["all_caps"], "error_score": 0.7, "metadata": {"error_details": [{"type": "all_caps", "confidence": 0.7}], "inference_time": 0.5, "model_confidence": 0.85}, "timestamp": "2025-11-23T15:55:13.117785"} +{"case_id": "d9de728f226c", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmptltrsnxd.wav", "original_transcript": " Add the sum to the product of these three.", "corrected_transcript": " Add the sum to the product of these three.", "error_types": ["length_anomaly_long"], "error_score": 0.7, "metadata": {"inference_time": 0.17237401008605957, "model_confidence": null}, "timestamp": "2025-12-09T23:38:22.278048"} +{"case_id": "6eba0bc57f47", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp5_5_gxi_.wav", "original_transcript": " Add the sum to the product of these three.", "corrected_transcript": " Add the 
sum to the product of these three.", "error_types": ["length_anomaly_long"], "error_score": 0.7, "metadata": {"inference_time": 0.4015941619873047, "model_confidence": null}, "timestamp": "2025-12-09T23:41:34.390110"} +{"case_id": "d328187ebd7c", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp7tynqb1s.wav", "original_transcript": " There is, according to legend, a boiling pot of gold at one end.", "corrected_transcript": " There is, according to legend, a boiling pot of gold at one end.", "error_types": ["length_anomaly_long"], "error_score": 0.7, "metadata": {"inference_time": 0.43673181533813477, "model_confidence": null}, "timestamp": "2025-12-10T01:17:49.984330"} +{"case_id": "965d3576739a", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp6rkt7fot.wav", "original_transcript": " This latter point is usually important.", "corrected_transcript": " This latter point is usually important.", "error_types": ["length_anomaly_long"], "error_score": 0.7, "metadata": {"inference_time": 0.1960000991821289, "model_confidence": null}, "timestamp": "2025-12-10T01:19:51.006567"} +{"case_id": "10ad6839cc65", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp78nvfkf_.wav", "original_transcript": " This latter point is usually important.", "corrected_transcript": " This latter point is usually important.", "error_types": ["length_anomaly_long"], "error_score": 0.7, "metadata": {"inference_time": 0.20566296577453613, "model_confidence": null}, "timestamp": "2025-12-10T01:40:01.858875"} +{"case_id": "3e267b94943c", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmppm69negx.wav", "original_transcript": " Distinguished colleagues Today I will briefly address the evolving landscape of precision medicine Imagine multi-factorial polygeneic disorders As we integrate pharmacogenomics with longitudinal multi-modal biomaker profiling we are redefining the pathophysiology of cardiometabolic and neurodegenerative 
diseases", "corrected_transcript": " Distinguished colleagues Today I will briefly address the evolving landscape of precision medicine Imagine multi-factorial polygeneic disorders As we integrate pharmacogenomics with longitudinal multi-modal biomaker profiling we are redefining the pathophysiology of cardiometabolic and neurodegenerative diseases", "error_types": ["no_punctuation"], "error_score": 0.3, "metadata": {"inference_time": 0.5361909866333008, "model_confidence": null}, "timestamp": "2025-12-10T01:48:39.222266"} +{"case_id": "9ce5a9598cfb", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp692sbluj.wav", "original_transcript": " Distinguished colleagues Today I will briefly address the evolving landscape of precision medicine Imagine multi-factorial polygeneic disorders As we integrate pharmacogenomics with longitudinal multi-modal biomaker profiling we are redefining the pathophysiology of cardiometabolic and neurodegenerative diseases", "corrected_transcript": " Distinguished colleagues Today I will briefly address the evolving landscape of precision medicine Imagine multi-factorial polygeneic disorders As we integrate pharmacogenomics with longitudinal multi-modal biomaker profiling we are redefining the pathophysiology of cardiometabolic and neurodegenerative diseases", "error_types": ["no_punctuation"], "error_score": 0.3, "metadata": {"inference_time": 0.6433022022247314, "model_confidence": null}, "timestamp": "2025-12-10T03:40:32.431778"} +{"case_id": "aa7df876764e", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpxgap1eyq.wav", "original_transcript": " Distinguished colleagues Today I will briefly address the evolving landscape of precision medicine Imagine multi-factorial polygeneic disorders As we integrate pharmacogenomics with longitudinal multi-modal biomaker profiling we are redefining the pathophysiology of cardiometabolic and neurodegenerative diseases", "corrected_transcript": " Distinguished 
colleagues Today I will briefly address the evolving landscape of precision medicine Imagine multi-factorial polygeneic disorders As we integrate pharmacogenomics with longitudinal multi-modal biomaker profiling we are redefining the pathophysiology of cardiometabolic and neurodegenerative diseases", "error_types": ["no_punctuation"], "error_score": 0.3, "metadata": {"inference_time": 0.4731431007385254, "model_confidence": null}, "timestamp": "2025-12-10T03:41:17.828072"} +{"case_id": "d6e1b446137c", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpw51x6e_6.wav", "original_transcript": " Add the sum to the product of these three.", "corrected_transcript": " Add the sum to the product of these three.", "error_types": ["length_anomaly_long"], "error_score": 0.7, "metadata": {"inference_time": 0.3866550922393799, "model_confidence": null}, "timestamp": "2025-12-10T03:44:17.514144"} +{"case_id": "3e11e9b84800", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpriushe7u.wav", "original_transcript": " Add the sum to the product of these three.", "corrected_transcript": " Add the sum to the product of these three.", "error_types": ["length_anomaly_long"], "error_score": 0.7, "metadata": {"inference_time": 0.40584397315979004, "model_confidence": null}, "timestamp": "2025-12-10T03:46:47.494085"} +{"case_id": "2dcd7b3a5050", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpg6pf45kz.wav", "original_transcript": "DISTINGUISH COLETES TO DAY I WILL BRIEFLY ADDRESS THEVOLVING LANDSCAPE OF PESISION MEDICINE INMAGIN MULTY FACTORIAL FALYGENG DISODERS AS YOU INTIGRAT FARMOPIGENAMACS THELOGITNAL MORTIMODAL BYOMAKER PROPISING WE ARE REDIFYING THE PARTOR PHYSIOLOGY OF CARDIOMETABOLIC AND ERLY GENATIV DESEDOES HA RESOLUTION ECOCARDIOGRAPHIC STRATIFICATION CAPADIT TRANSCRIPTOMAC AND PROTIOMIC SURVEILLANS NOW ALLOWS US TO PREAM DECOMPENSATION LONG BEFORE AWORD CLINICAL SYMTEMOLOGY EMERGES YET WE STILL GRAPPLE TIT EACTOGENIC 
COMLIGATIONS FROM FROM BO ANBOLOC PHENOMENA TO OCARD ANDOCTRYNOM FACT HEIS AND IN NUMAN MEDIATED HEPATOPOXICITY SECONRY TO AGGRESSIVE CANTINUA PLASTIC LEGIMENTS AR CHALLENGE IS TO SYMPRECISE THESE INPLEASINGLY GRADULLY DETER SATS INTO ACTIONABL PATIEN CENTRIC ALGORITENS WITHOUT CIRCUMBLIC DIN ALGORATMIC OPPEACITY ORT HEPUTIC NICOLATION THY FOSTERING CROSS DISCIPLINARY COLLABORATION AMONG CARDIOLOGY USUAL IMENEADOGY OTOR HENAL TEGOLOGY ANCIDICUL CARE WE CAN TRANSFORM EPI SODIC REACTIVE CARE INTO ANTICIPRATRY CONTINUOUSLY OPTOMISE INTERVENTION ARTIMATELY MITIGATING MORBIGITY I ATINUATING WO YES SORRY", "corrected_transcript": "Distinguish coletes to day i will briefly address thevolving landscape of pesision medicine inmagin multy factorial falygeng disoders as you intigrat farmopigenamacs thelogitnal mortimodal byomaker propising we are redifying the partor physiology of cardiometabolic and erly genativ desedoes ha resolution ecocardiographic stratification capadit transcriptomac and protiomic surveillans now allows us to pream decompensation long before aword clinical symtemology emerges yet we still grapple tit eactogenic comligations from from bo anboloc phenomena to ocard andoctrynom fact heis and in numan mediated hepatopoxicity seconry to aggressive cantinua plastic legiments ar challenge is to symprecise these inpleasingly gradully deter sats into actionabl patien centric algoritens without circumblic din algoratmic oppeacity ort heputic nicolation thy fostering cross disciplinary collaboration among cardiology usual imeneadogy otor henal tegology ancidicul care we can transform epi sodic reactive care into anticipratry continuously optomise intervention artimately mitigating morbigity i atinuating wo yes sorry", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 3.998844861984253, "model_confidence": null}, "timestamp": "2025-12-10T03:55:22.116383"} +{"case_id": "fd654596d05e", "audio_path": 
"/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmplsxejlbl.wav", "original_transcript": "ADD THE SUM TO THE PRODUCT OF THESE THREE", "corrected_transcript": "Add the sum to the product of these three", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.12563800811767578, "model_confidence": null}, "timestamp": "2025-12-10T03:56:54.062551"} +{"case_id": "6b760bca8acb", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmppjzfzzm4.wav", "original_transcript": "HIS LATRPAR AS USUALLY FORE", "corrected_transcript": "His latrpar as usually fore", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.06222891807556152, "model_confidence": null}, "timestamp": "2025-12-10T03:57:05.775307"} +{"case_id": "4e0d253b0ef1", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpygvo11na.wav", "original_transcript": "THERE IS ACCORDING TO LEGEND A BOILING POT OF GOLD AT ONE END", "corrected_transcript": "There is according to legend a boiling pot of gold at one end", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.08620572090148926, "model_confidence": null}, "timestamp": "2025-12-10T03:57:23.034016"} +{"case_id": "dc0e7a4d0119", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpqoh_5qc0.wav", "original_transcript": "HIS LATRPAR AS USUALLY FORE", "corrected_transcript": "His latrpar as usually fore", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.06853508949279785, "model_confidence": null}, "timestamp": "2025-12-10T03:58:25.535945"} +{"case_id": "ed50ca3916c8", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp_4vesicl.wav", "original_transcript": "HIS LATRPAR AS USUALLY FORE", "corrected_transcript": "His latrpar as usually fore", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.06948208808898926, "model_confidence": null}, "timestamp": "2025-12-10T04:01:24.212633"} 
+{"case_id": "760e1e0ad7b9", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmphzfft7ma.wav", "original_transcript": "HIS LATRPAR AS USUALLY FORE", "corrected_transcript": "His latrpar as usually fore", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.7921261787414551, "model_confidence": null}, "timestamp": "2025-12-10T04:06:53.998994"} +{"case_id": "5152857f9fdf", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp6yd34qdz.wav", "original_transcript": "AS THE PATIENT IS EXPERIENCAN CHITE PAIN AND SHORT MATAL WAN", "corrected_transcript": "As the patient is experiencan chite pain and short matal wan", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.1928849220275879, "model_confidence": null}, "timestamp": "2025-12-10T04:11:02.218881"} +{"case_id": "e865f7291736", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpcw1dfxb6.wav", "original_transcript": "THE PATIENT IS EXPERIENCING JESS SPANE AND SHARTNESS OF BREAD", "corrected_transcript": "The patient is experiencing jess spane and shartness of bread", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.18187212944030762, "model_confidence": null}, "timestamp": "2025-12-10T04:12:18.323108"} +{"case_id": "9788107a0932", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp2hf7bryp.wav", "original_transcript": "His latrpar as usually fore", "corrected_transcript": "This latter point is usually important.", "error_types": ["diff"], "error_score": 0.2, "metadata": {"inference_time": 0.09096026420593262, "model_confidence": null}, "timestamp": "2025-12-10T05:00:44.974630"} +{"case_id": "2d01fd849d32", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp_uuu5q3t.wav", "original_transcript": "This latter point is usually important", "corrected_transcript": "This latter point is usually important.", "error_types": ["diff"], "error_score": 0.2, "metadata": 
{"inference_time": 0.3355419635772705, "model_confidence": null}, "timestamp": "2025-12-10T05:00:58.333384"} +{"case_id": "c4747c18b792", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp7sh7uwzi.wav", "original_transcript": "His latrpar as usually fore", "corrected_transcript": "This latter point is usually important.", "error_types": ["diff"], "error_score": 0.2, "metadata": {"inference_time": 0.07509493827819824, "model_confidence": null}, "timestamp": "2025-12-10T05:01:15.150285"} +{"case_id": "c2831ef4f81b", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpdrp3egav.wav", "original_transcript": "His latrpar as usually fore", "corrected_transcript": "This latter point is usually important.", "error_types": ["diff"], "error_score": 0.2, "metadata": {"inference_time": 0.06198287010192871, "model_confidence": null}, "timestamp": "2025-12-10T05:09:00.519635"} +{"case_id": "69f9109b3fab", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpk98a086j.wav", "original_transcript": "His latrpar as usually fore", "corrected_transcript": "He\u2019s late as usual, of course.", "error_types": ["diff"], "error_score": 0.2, "metadata": {"inference_time": 0.08988595008850098, "model_confidence": null}, "timestamp": "2025-12-10T05:13:32.639878"} +{"case_id": "94de5b3eb59d", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpiyn6uvns.wav", "original_transcript": "Fe ol the heat", "corrected_transcript": "Feel the heat?", "error_types": ["diff"], "error_score": 0.2, "metadata": {"inference_time": 0.07331514358520508, "model_confidence": null}, "timestamp": "2025-12-10T05:16:08.928677"} +{"case_id": "47245c20dfb3", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpogqjqjr9.wav", "original_transcript": "Fe ol the heat", "corrected_transcript": "Feel the heat?", "error_types": ["diff"], "error_score": 0.2, "metadata": {"inference_time": 0.07196998596191406, "model_confidence": null}, "timestamp": 
"2025-12-10T05:21:03.503021"} +{"case_id": "c128457bbcbe", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpn0s65k4z.wav", "original_transcript": "It began a book by itsel", "corrected_transcript": "It became a book by itself", "error_types": ["diff"], "error_score": 0.2, "metadata": {"inference_time": 0.06100797653198242, "model_confidence": null}, "timestamp": "2025-12-10T05:22:13.296998"} +{"case_id": "564bf9bc851b", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp4laavymi.wav", "original_transcript": "It began a book by itsel", "corrected_transcript": "It became a book by itself", "error_types": ["diff"], "error_score": 0.3333333333333333, "metadata": {"inference_time": 0.08470416069030762, "model_confidence": null}, "timestamp": "2025-12-10T05:24:53.050833"} +{"case_id": "930f812ef3a2", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp9gvpvoc6.wav", "original_transcript": "It began a book by itsel", "corrected_transcript": "It became a book by itself", "error_types": ["diff"], "error_score": 0.3333333333333333, "metadata": {"inference_time": 0.08349800109863281, "model_confidence": null}, "timestamp": "2025-12-10T05:29:35.272137"} +{"case_id": "e7786acf5d06", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp9uvodvd1.wav", "original_transcript": "It began a book by itsel", "corrected_transcript": "It became a book by itself", "error_types": ["diff"], "error_score": 0.3333333333333333, "metadata": {"inference_time": 0.08183503150939941, "model_confidence": null}, "timestamp": "2025-12-10T05:31:40.505473"} +{"case_id": "3b3b7fd22abe", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp4d1s4a_s.wav", "original_transcript": "It began a book by itsel", "corrected_transcript": "It became a book by itself", "error_types": ["diff"], "error_score": 0.3333333333333333, "metadata": {"inference_time": 0.06814384460449219, "model_confidence": null}, "timestamp": 
"2025-12-10T10:34:19.717515"} +{"case_id": "301cb245ea44", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmptl46hpt4.wav", "original_transcript": "His latrpar as usually fore", "corrected_transcript": "He\u2019s late as usual, of course.", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.5582067966461182, "model_confidence": null}, "timestamp": "2025-12-10T11:10:46.407755"} +{"case_id": "2db063d59f54", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpzr1add50.wav", "original_transcript": "It began a book by itsel", "corrected_transcript": "It became a book by itself", "error_types": ["diff"], "error_score": 0.3333333333333333, "metadata": {"inference_time": 0.0617070198059082, "model_confidence": null}, "timestamp": "2025-12-10T11:11:44.967279"} +{"case_id": "825ac3dc1b53", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpnp6zixg1.wav", "original_transcript": "His latrpar as usually fore", "corrected_transcript": "He\u2019s late as usual, of course.", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.0583500862121582, "model_confidence": null}, "timestamp": "2025-12-10T11:12:04.545126"} +{"case_id": "f155fa30dafd", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpmf46nphe.wav", "original_transcript": "Fe ol the heat", "corrected_transcript": "Feel the heat?", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.061074256896972656, "model_confidence": null}, "timestamp": "2025-12-10T11:12:41.211453"} +{"case_id": "3aa1b9a3bae9", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpo2qpa3ot.wav", "original_transcript": "It began a book by itsel", "corrected_transcript": "It became a book by itself", "error_types": ["diff"], "error_score": 0.3333333333333333, "metadata": {"inference_time": 0.07345986366271973, "model_confidence": null}, "timestamp": "2025-12-10T11:14:13.808611"} +{"case_id": 
"e1c6da79e9b3", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpl5ik9lyr.wav", "original_transcript": "His latrpar as usually fore", "corrected_transcript": "He\u2019s late as usual, of course.", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.09378981590270996, "model_confidence": null}, "timestamp": "2025-12-10T11:16:27.446480"} +{"case_id": "97ccaad861bd", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpiyxw4b_v.wav", "original_transcript": "His latrpar as usually fore", "corrected_transcript": "He\u2019s late as usual, of course.", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.0668642520904541, "model_confidence": null}, "timestamp": "2025-12-10T11:16:41.496064"} +{"case_id": "8d964494b23f", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp_eoznh6q.wav", "original_transcript": "Fe ol the heat", "corrected_transcript": "Feel the heat?", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.0482180118560791, "model_confidence": null}, "timestamp": "2025-12-10T11:16:58.617361"} +{"case_id": "e72ddfb39c90", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp37mztrnl.wav", "original_transcript": "His latrpar as usually fore", "corrected_transcript": "He\u2019s late as usual, of course.", "error_types": ["diff"], "error_score": 1.0, "metadata": {"inference_time": 0.13374614715576172, "model_confidence": null}, "timestamp": "2025-12-10T11:23:14.075318"} +{"case_id": "7290cd472157", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp3xkantbw.wav", "original_transcript": "It began a book by itsel", "corrected_transcript": "It became a book by itself", "error_types": ["diff"], "error_score": 0.3333333333333333, "metadata": {"inference_time": 0.06048107147216797, "model_confidence": null}, "timestamp": "2025-12-10T11:23:32.073647"} +{"case_id": "d0f8f3011fba", "audio_path": 
"/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp5ny03pew.wav", "original_transcript": "It began a book by itsel", "corrected_transcript": "It became a book by itself", "error_types": ["diff"], "error_score": 0.3333333333333333, "metadata": {"inference_time": 0.13086295127868652, "model_confidence": null}, "timestamp": "2025-12-10T12:18:57.552829"} diff --git a/data/production/metadata/inference_stats.jsonl b/data/production/metadata/inference_stats.jsonl index 0b806a0..951df1a 100644 --- a/data/production/metadata/inference_stats.jsonl +++ b/data/production/metadata/inference_stats.jsonl @@ -1,3 +1,45 @@ {"timestamp": "2025-11-23T15:55:13.115454", "audio_path": "audio/user_recording_1.wav", "inference_time": 0.5, "model_confidence": 0.85, "error_detected": true, "corrected": false, "metadata": {"case_id": "7375e53e0f08", "error_score": 0.7}} {"timestamp": "2025-11-23T15:55:13.117087", "audio_path": "audio/user_recording_2.wav", "inference_time": 0.5, "model_confidence": 0.85, "error_detected": true, "corrected": false, "metadata": {"case_id": "748889eb2474", "error_score": 0.7}} {"timestamp": "2025-11-23T15:55:13.118515", "audio_path": "audio/user_recording_3.wav", "inference_time": 0.5, "model_confidence": 0.85, "error_detected": true, "corrected": false, "metadata": {"case_id": "c73c35991f70", "error_score": 0.7}} +{"timestamp": "2025-12-09T23:38:22.278647", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmptltrsnxd.wav", "inference_time": 0.17237401008605957, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "d9de728f226c", "error_score": 0.7}} +{"timestamp": "2025-12-09T23:41:34.390840", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp5_5_gxi_.wav", "inference_time": 0.4015941619873047, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "6eba0bc57f47", "error_score": 0.7}} +{"timestamp": "2025-12-10T01:17:49.984897", "audio_path": 
"/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp7tynqb1s.wav", "inference_time": 0.43673181533813477, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "d328187ebd7c", "error_score": 0.7}} +{"timestamp": "2025-12-10T01:19:51.007755", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp6rkt7fot.wav", "inference_time": 0.1960000991821289, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "965d3576739a", "error_score": 0.7}} +{"timestamp": "2025-12-10T01:40:01.860460", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp78nvfkf_.wav", "inference_time": 0.20566296577453613, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "10ad6839cc65", "error_score": 0.7}} +{"timestamp": "2025-12-10T01:48:39.222906", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmppm69negx.wav", "inference_time": 0.5361909866333008, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "3e267b94943c", "error_score": 0.3}} +{"timestamp": "2025-12-10T03:40:32.432219", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp692sbluj.wav", "inference_time": 0.6433022022247314, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "9ce5a9598cfb", "error_score": 0.3}} +{"timestamp": "2025-12-10T03:41:17.829119", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpxgap1eyq.wav", "inference_time": 0.4731431007385254, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "aa7df876764e", "error_score": 0.3}} +{"timestamp": "2025-12-10T03:44:17.514557", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpw51x6e_6.wav", "inference_time": 0.3866550922393799, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "d6e1b446137c", 
"error_score": 0.7}} +{"timestamp": "2025-12-10T03:46:47.495061", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpriushe7u.wav", "inference_time": 0.40584397315979004, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "3e11e9b84800", "error_score": 0.7}} +{"timestamp": "2025-12-10T03:55:22.116937", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpg6pf45kz.wav", "inference_time": 3.998844861984253, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "2dcd7b3a5050", "error_score": 1.0}} +{"timestamp": "2025-12-10T03:56:54.063048", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmplsxejlbl.wav", "inference_time": 0.12563800811767578, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "fd654596d05e", "error_score": 1.0}} +{"timestamp": "2025-12-10T03:57:05.775609", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmppjzfzzm4.wav", "inference_time": 0.06222891807556152, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "6b760bca8acb", "error_score": 1.0}} +{"timestamp": "2025-12-10T03:57:23.034286", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpygvo11na.wav", "inference_time": 0.08620572090148926, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "4e0d253b0ef1", "error_score": 1.0}} +{"timestamp": "2025-12-10T03:58:25.537079", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpqoh_5qc0.wav", "inference_time": 0.06853508949279785, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "dc0e7a4d0119", "error_score": 1.0}} +{"timestamp": "2025-12-10T04:01:24.213725", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp_4vesicl.wav", "inference_time": 0.06948208808898926, "model_confidence": null, 
"error_detected": true, "corrected": true, "metadata": {"case_id": "ed50ca3916c8", "error_score": 1.0}} +{"timestamp": "2025-12-10T04:06:53.999467", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmphzfft7ma.wav", "inference_time": 0.7921261787414551, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "760e1e0ad7b9", "error_score": 1.0}} +{"timestamp": "2025-12-10T04:11:02.219475", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp6yd34qdz.wav", "inference_time": 0.1928849220275879, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "5152857f9fdf", "error_score": 1.0}} +{"timestamp": "2025-12-10T04:12:18.323622", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpcw1dfxb6.wav", "inference_time": 0.18187212944030762, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "e865f7291736", "error_score": 1.0}} +{"timestamp": "2025-12-10T05:00:44.975196", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp2hf7bryp.wav", "inference_time": 0.09096026420593262, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "9788107a0932", "error_score": 0.2}} +{"timestamp": "2025-12-10T05:00:58.333691", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp_uuu5q3t.wav", "inference_time": 0.3355419635772705, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "2d01fd849d32", "error_score": 0.2}} +{"timestamp": "2025-12-10T05:01:15.150595", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp7sh7uwzi.wav", "inference_time": 0.07509493827819824, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "c4747c18b792", "error_score": 0.2}} +{"timestamp": "2025-12-10T05:09:00.520652", "audio_path": 
"/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpdrp3egav.wav", "inference_time": 0.06198287010192871, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "c2831ef4f81b", "error_score": 0.2}} +{"timestamp": "2025-12-10T05:13:32.640279", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpk98a086j.wav", "inference_time": 0.08988595008850098, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "69f9109b3fab", "error_score": 0.2}} +{"timestamp": "2025-12-10T05:16:08.928967", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpiyn6uvns.wav", "inference_time": 0.07331514358520508, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "94de5b3eb59d", "error_score": 0.2}} +{"timestamp": "2025-12-10T05:21:03.503429", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpogqjqjr9.wav", "inference_time": 0.07196998596191406, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "47245c20dfb3", "error_score": 0.2}} +{"timestamp": "2025-12-10T05:22:13.298005", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpn0s65k4z.wav", "inference_time": 0.06100797653198242, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "c128457bbcbe", "error_score": 0.2}} +{"timestamp": "2025-12-10T05:24:53.052225", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp4laavymi.wav", "inference_time": 0.08470416069030762, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "564bf9bc851b", "error_score": 0.3333333333333333}} +{"timestamp": "2025-12-10T05:29:35.272524", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp9gvpvoc6.wav", "inference_time": 0.08349800109863281, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": 
"930f812ef3a2", "error_score": 0.3333333333333333}} +{"timestamp": "2025-12-10T05:31:40.505848", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp9uvodvd1.wav", "inference_time": 0.08183503150939941, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "e7786acf5d06", "error_score": 0.3333333333333333}} +{"timestamp": "2025-12-10T10:34:19.718645", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp4d1s4a_s.wav", "inference_time": 0.06814384460449219, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "3b3b7fd22abe", "error_score": 0.3333333333333333}} +{"timestamp": "2025-12-10T11:10:46.408238", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmptl46hpt4.wav", "inference_time": 0.5582067966461182, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "301cb245ea44", "error_score": 1.0}} +{"timestamp": "2025-12-10T11:11:44.968237", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpzr1add50.wav", "inference_time": 0.0617070198059082, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "2db063d59f54", "error_score": 0.3333333333333333}} +{"timestamp": "2025-12-10T11:12:04.545429", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpnp6zixg1.wav", "inference_time": 0.0583500862121582, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "825ac3dc1b53", "error_score": 1.0}} +{"timestamp": "2025-12-10T11:12:41.212525", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpmf46nphe.wav", "inference_time": 0.061074256896972656, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "f155fa30dafd", "error_score": 1.0}} +{"timestamp": "2025-12-10T11:14:13.809567", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpo2qpa3ot.wav", 
"inference_time": 0.07345986366271973, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "3aa1b9a3bae9", "error_score": 0.3333333333333333}} +{"timestamp": "2025-12-10T11:16:27.447650", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpl5ik9lyr.wav", "inference_time": 0.09378981590270996, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "e1c6da79e9b3", "error_score": 1.0}} +{"timestamp": "2025-12-10T11:16:41.496351", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmpiyxw4b_v.wav", "inference_time": 0.0668642520904541, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "97ccaad861bd", "error_score": 1.0}} +{"timestamp": "2025-12-10T11:16:58.617754", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp_eoznh6q.wav", "inference_time": 0.0482180118560791, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "8d964494b23f", "error_score": 1.0}} +{"timestamp": "2025-12-10T11:23:14.075940", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp37mztrnl.wav", "inference_time": 0.13374614715576172, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "e72ddfb39c90", "error_score": 1.0}} +{"timestamp": "2025-12-10T11:23:32.073996", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp3xkantbw.wav", "inference_time": 0.06048107147216797, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "7290cd472157", "error_score": 0.3333333333333333}} +{"timestamp": "2025-12-10T12:18:57.553465", "audio_path": "/var/folders/_g/zgsfg2gs0gn5zzx3rq5dx3nm0000gn/T/tmp5ny03pew.wav", "inference_time": 0.13086295127868652, "model_confidence": null, "error_detected": true, "corrected": true, "metadata": {"case_id": "d0f8f3011fba", "error_score": 0.3333333333333333}} diff --git 
a/data/recordings_for_test/p232_155.wav b/data/recordings_for_test/p232_155.wav
new file mode 100644
index 0000000..66814f4
Binary files /dev/null and b/data/recordings_for_test/p232_155.wav differ
diff --git a/data/recordings_for_test/p232_173.wav b/data/recordings_for_test/p232_173.wav
new file mode 100644
index 0000000..a1d00cd
Binary files /dev/null and b/data/recordings_for_test/p232_173.wav differ
diff --git a/data/recordings_for_test/p232_181.wav b/data/recordings_for_test/p232_181.wav
new file mode 100644
index 0000000..ad65788
Binary files /dev/null and b/data/recordings_for_test/p232_181.wav differ
diff --git a/data/recordings_for_test/p232_182.wav b/data/recordings_for_test/p232_182.wav
new file mode 100644
index 0000000..13980a8
Binary files /dev/null and b/data/recordings_for_test/p232_182.wav differ
diff --git a/data/recordings_for_test/p232_183.wav b/data/recordings_for_test/p232_183.wav
new file mode 100644
index 0000000..b06c6fd
Binary files /dev/null and b/data/recordings_for_test/p232_183.wav differ
diff --git a/data/recordings_for_test/p232_184.wav b/data/recordings_for_test/p232_184.wav
new file mode 100644
index 0000000..745e8a7
Binary files /dev/null and b/data/recordings_for_test/p232_184.wav differ
diff --git a/data/recordings_for_test/p232_185.wav b/data/recordings_for_test/p232_185.wav
new file mode 100644
index 0000000..ec5d10e
Binary files /dev/null and b/data/recordings_for_test/p232_185.wav differ
diff --git a/data/recordings_for_test/p232_186.wav b/data/recordings_for_test/p232_186.wav
new file mode 100644
index 0000000..8a6f61a
Binary files /dev/null and b/data/recordings_for_test/p232_186.wav differ
diff --git a/data/recordings_for_test/p232_187.wav b/data/recordings_for_test/p232_187.wav
new file mode 100644
index 0000000..241bdd5
Binary files /dev/null and b/data/recordings_for_test/p232_187.wav differ
diff --git a/data/recordings_for_test/p232_188.wav b/data/recordings_for_test/p232_188.wav
new file mode 100644
index 0000000..1ec1027
Binary files /dev/null and b/data/recordings_for_test/p232_188.wav differ
diff --git a/data/recordings_for_test/p232_189.wav b/data/recordings_for_test/p232_189.wav
new file mode 100644
index 0000000..8930404
Binary files /dev/null and b/data/recordings_for_test/p232_189.wav differ
diff --git a/data/recordings_for_test/p232_190.wav b/data/recordings_for_test/p232_190.wav
new file mode 100644
index 0000000..786bb56
Binary files /dev/null and b/data/recordings_for_test/p232_190.wav differ
diff --git a/frontend/app.js b/frontend/app.js
index 3ce0d8a..e83319c 100644
--- a/frontend/app.js
+++ b/frontend/app.js
@@ -3,11 +3,13 @@ const API_BASE_URL = window.location.origin;
 let selectedFile = null;
 let currentPage = 0;
 const PAGE_SIZE = 20;
+let performanceMock = null;
 // ==================== INITIALIZATION ====================
 document.addEventListener('DOMContentLoaded', () => {
     initializeTabs();
     initializeTranscriptionMode();
+    initializeModelSelector();
     checkSystemHealth();
     loadDashboard();
@@ -15,6 +17,55 @@ document.addEventListener('DOMContentLoaded', () => {
     setInterval(checkSystemHealth, 30000);
 });
+// Initialize model selector by loading available models from backend
+async function initializeModelSelector() {
+    const modelSelector = document.getElementById('model-selector');
+    if (!modelSelector) return;
+
+    try {
+        const response = await fetch(`${API_BASE_URL}/api/models/available`);
+        if (!response.ok) {
+            throw new Error('Failed to fetch available models');
+        }
+
+        const data = await response.json();
+        const models = data.models || [];
+        const defaultModel = data.default || 'wav2vec2-base';
+
+        // Clear existing options
+        modelSelector.innerHTML = '';
+
+        // Add options for each available model
+        models.forEach(model => {
+            if (model.is_available) {
+                const option = document.createElement('option');
+                option.value = model.id; // Use the actual model identifier
+                option.textContent = model.display_name || model.name;
+                if (model.id === defaultModel || model.is_current) {
+                    option.selected = true;
+                }
+                modelSelector.appendChild(option);
+            }
+        });
+
+        // If no models found, add a fallback
+        if (modelSelector.options.length === 0) {
+            const option = document.createElement('option');
+            option.value = 'wav2vec2-base';
+            option.textContent = 'Wav2Vec2 Base';
+            option.selected = true;
+            modelSelector.appendChild(option);
+        }
+    } catch (e) {
+        console.error('Could not initialize model selector:', e);
+        // Fallback to default options
+        modelSelector.innerHTML = `
+
+
+        `;
+    }
+}
+
 // ==================== TAB NAVIGATION ====================
 function initializeTabs() {
     const tabButtons = document.querySelectorAll('.tab-btn');
@@ -28,6 +79,18 @@ function initializeTabs() {
 }
 function switchTab(tabName) {
+    // Clear any existing auto-refresh intervals when switching tabs
+    if (window.finetuningRefreshInterval) {
+        clearInterval(window.finetuningRefreshInterval);
+        window.finetuningRefreshInterval = null;
+    }
+
+    // Clear any job polling intervals
+    if (window.jobPollInterval) {
+        clearInterval(window.jobPollInterval);
+        window.jobPollInterval = null;
+    }
+
     // Update buttons
     document.querySelectorAll('.tab-btn').forEach(btn => {
         btn.classList.remove('active');
@@ -57,16 +120,27 @@ function loadTabData(tabName) {
         case 'finetuning':
             refreshFinetuningStatus();
             refreshJobs();
+            // Auto-refresh fine-tuning status every 5 seconds when tab is active
+            clearInterval(window.finetuningRefreshInterval);
+            window.finetuningRefreshInterval = setInterval(() => {
+                if (document.getElementById('finetuning').classList.contains('active')) {
+                    refreshFinetuningStatus();
+                    refreshJobs();
+                }
+            }, 5000);
             break;
         case 'models':
             loadModelInfo();
-            loadDeployedModel();
             refreshModelVersions();
             break;
         case 'monitoring':
             refreshPerformanceMetrics();
             refreshTrends();
             break;
+        case 'transcribe':
+            // Ensure model selector is set to fine-tuned if available
+            initializeModelSelector();
+            break;
     }
 }
@@ -93,6 +167,12 @@ async function checkSystemHealth() {
 function updateHealthDisplay(health) {
     const container = document.getElementById('health-info');
+    performanceMock = {
+        total_inferences: health.components?.agent?.total_inferences || 0,
+        avg_inference_time: 0.0,
+        avg_error_score: 0.0
+    };
+
     const html = `
Failed to load model information.
'; showToast('Failed to load model information', 'error'); } } @@ -268,6 +310,15 @@ function handleFileSelect(event) {STT processing...
'; + } + + if (llmBox) { + if (mode === 'agent') { + llmBox.innerHTML = 'LLM is analyzing and refining transcript... (this may take 10-15 seconds)
'; + } else { + llmBox.innerHTML = 'No LLM correction in baseline mode
'; + } + } + + // Show the result container early so user sees loading state + const resultContainer = document.getElementById('transcription-result'); + resultContainer.classList.remove('hidden'); try { const formData = new FormData(); formData.append('file', selectedFile); let url = `${API_BASE_URL}/api/transcribe/${mode}`; + const params = new URLSearchParams(); + params.append('model', selectedModel); if (mode === 'agent') { - url += `?auto_correction=${autoCorrection}&record_if_error=${recordErrors}`; + params.append('auto_correction', autoCorrection); + params.append('record_if_error', recordErrors); } + url += `?${params.toString()}`; const response = await fetch(url, { method: 'POST', @@ -305,24 +387,64 @@ async function transcribeAudio() { } const result = await response.json(); - displayTranscriptionResult(result, mode); + displayTranscriptionResult(result, mode, selectedModel); showToast('Transcription completed successfully', 'success'); } catch (error) { showToast('Transcription failed: ' + error.message, 'error'); + document.getElementById('stt-original-transcript').innerHTML = 'Error: ' + error.message + '
'; + document.getElementById('llm-refined-transcript').innerHTML = 'Error occurred
'; } finally { transcribeBtn.disabled = false; transcribeBtn.innerHTML = originalText; } } -function displayTranscriptionResult(result, mode) { +function displayTranscriptionResult(result, mode, selectedModel) { const container = document.getElementById('transcription-result'); container.classList.remove('hidden'); + // Get transcripts - use original_transcript for STT and transcript (or corrected) for LLM refined + const sttOriginal = result.original_transcript || result.transcript || 'No transcription available'; + + // Update the side-by-side transcript display + const sttBox = document.getElementById('stt-original-transcript'); + const llmBox = document.getElementById('llm-refined-transcript'); + + if (sttBox) { + sttBox.innerHTML = `${sttOriginal}
`; + } + + if (llmBox) { + if (mode === 'baseline') { + // Baseline mode: no LLM correction, so show an explanatory note instead of a refined transcript + llmBox.innerHTML = `No LLM correction in baseline mode. Use Agent mode to see LLM-refined transcript.
`; + } else { + // Agent mode: show LLM refined transcript + const llmRefined = result.corrected_transcript || result.transcript || 'No refined transcription available'; + llmBox.innerHTML = `${llmRefined}
`; + } + } + + // Remove any existing additional info sections (except transcripts-comparison) + const existingSections = container.querySelectorAll('.result-section'); + existingSections.forEach(section => { + if (!section.closest('.transcripts-comparison')) { + section.remove(); + } + }); + + // Build additional info section let html = `Case ID: ${result.case_id}
This error case will be used for fine-tuning the model.
No failed cases found
'; - return; - } - - const html = data.cases.map(caseItem => ` -No datasets available
'; + if (!data.files || data.files.length === 0) { + container.innerHTML = ` +
+ No files found. Add audio files to ${samplePath} and click refresh.
+
+ Failed to list files. Ensure the server can read ${samplePath}.
+
Fine-tuning coordinator not available
'; - return; + // Add cache-busting timestamp to ensure fresh data + const timestamp = new Date().getTime(); + const response = await fetch(`${API_BASE_URL}/api/finetuning/status?t=${timestamp}`, { + cache: 'no-cache', + headers: { + 'Cache-Control': 'no-cache' + } + }); + if (!response.ok) { + throw new Error('Failed to fetch status'); } - const data = await response.json(); - const container = document.getElementById('finetuning-status'); + const orchestrator = data.orchestrator || {}; + const status = data.status || 'unknown'; + const errorCount = orchestrator.error_cases_count || 0; + const totalJobs = orchestrator.total_jobs || 0; + const activeJobs = orchestrator.active_jobs || 0; + const minErrorCases = orchestrator.min_error_cases || 100; + const casesNeeded = orchestrator.cases_needed || 0; + const casesNeededMessage = orchestrator.cases_needed_message || ''; + const shouldTrigger = orchestrator.should_trigger || false; + + // Determine status badge color + let statusBadgeClass = 'badge-secondary'; + if (status === 'ready' || status === 'operational') { + statusBadgeClass = 'badge-success'; + } else if (status === 'active') { + statusBadgeClass = 'badge-info'; + } else if (status === 'unavailable') { + statusBadgeClass = 'badge-secondary'; + } else if (status === 'error') { + statusBadgeClass = 'badge-danger'; + } else { + statusBadgeClass = 'badge-warning'; + } + const html = `Failed to load status
'; + container.innerHTML = ` +Fine-tuning coordinator not available
'; - return; + if (!response.ok) { + throw new Error('Failed to fetch jobs'); } - const data = await response.json(); - const container = document.getElementById('jobs-list'); + const jobs = data.jobs || []; - if (data.jobs.length === 0) { - container.innerHTML = 'No fine-tuning jobs yet
'; + if (jobs.length === 0) { + container.innerHTML = 'No fine-tuning jobs found
'; return; } - const html = data.jobs.map(job => ` -Failed to load jobs
'; + container.innerHTML = `Failed to load jobs: ${error.message}
`; } } // ==================== MODELS ==================== -async function loadDeployedModel() { - try { - const response = await fetch(`${API_BASE_URL}/api/models/deployed`); - - if (response.status === 503) { - document.getElementById('deployed-model-info').innerHTML = - 'Model management not available
'; - return; - } - - const data = await response.json(); - const container = document.getElementById('deployed-model-info'); - - if (!data.deployed) { - container.innerHTML = 'No model deployed
'; - return; - } - - const html = ` -Failed to load deployed model
'; - } -} - async function refreshModelVersions() { + const container = document.getElementById('model-versions-list'); + let versions = []; + try { const response = await fetch(`${API_BASE_URL}/api/models/versions`); - - if (response.status === 503) { - document.getElementById('model-versions-list').innerHTML = - 'Model management not available
'; - return; + if (response.ok) { + const data = await response.json(); + versions = data.versions || []; } + } catch (e) { + console.warn('Could not fetch model versions:', e); + // Fallback to defaults + versions = [ + { + version_id: 'baseline', + model_name: 'Wav2Vec2 Base', + is_current: false, + created_at: null, + parameters: 95000000 + } + ]; + } + + // If no versions found, show baseline + if (versions.length === 0) { + versions = [ + { + version_id: 'baseline', + model_name: 'Wav2Vec2 Base', + is_current: false, + created_at: null, + parameters: 95000000 + } + ]; + } + + const html = versions.map(version => { + const isCurrent = version.is_current !== undefined ? version.is_current : (version.status === 'current'); + // 'baseline' is the version_id used by the local fallback entry above + const isBaseline = ['wav2vec2-base', 'baseline'].includes(version.version_id) || version.model_id === 'wav2vec2-base'; + // Display WER/CER instead of parameters + const wer = version.wer !== null && version.wer !== undefined ? `${(version.wer * 100).toFixed(1)}%` : 'N/A'; + const cer = version.cer !== null && version.cer !== undefined ? `${(version.cer * 100).toFixed(1)}%` : 'N/A'; + const metrics = version.is_finetuned !== false ? `WER: ${wer} / CER: ${cer}` : 'N/A'; + const createdDate = version.created_at ? new Date(version.created_at).toLocaleString() : 'N/A'; - const data = await response.json(); - const container = document.getElementById('model-versions-list'); - - if (data.versions.length === 0) { - container.innerHTML = 'No model versions registered
'; - return; + // Determine badge text and class - only show badges for baseline and current models + let badgeHtml = ''; + if (isBaseline) { + badgeHtml = `Baseline`; + } else if (isCurrent) { + badgeHtml = `Current`; } + // No badge for intermediate models (neither baseline nor current) - const html = data.versions.map(version => ` -Failed to load model versions
'; - } + +Failed to load performance metrics
'; + if (response.ok) { + data = await response.json(); + } + } catch (e) { + // ignore, fallback to defaults + } + + // Get evaluation results (WER/CER) from dedicated endpoint + let evalData = { baseline: { wer: 0.36, cer: 0.13 }, finetuned: { wer: 0.36, cer: 0.13 } }; + try { + const evalResponse = await fetch(`${API_BASE_URL}/api/models/evaluation`); + if (evalResponse.ok) { + evalData = await evalResponse.json(); + } + } catch (e) { + console.warn('Could not fetch evaluation results:', e); } + + const stats = data?.overall_stats || {}; + + // Use evaluation results for baseline and current (fine-tuned) model + const baselineWer = evalData.baseline?.wer ?? stats.baseline_wer ?? 0.36; + const baselineCer = evalData.baseline?.cer ?? stats.baseline_cer ?? 0.13; + const currentWer = evalData.finetuned?.wer ?? stats.finetuned_wer ?? baselineWer; + const currentCer = evalData.finetuned?.cer ?? stats.finetuned_cer ?? baselineCer; + + performanceMock = { + total_inferences: stats.total_inferences ?? 0, + avg_inference_time: stats.avg_inference_time ?? 0.0, + avg_error_score: stats.avg_error_score ?? 0.0, + wer_baseline: baselineWer, + wer_finetuned: currentWer, + cer_baseline: baselineCer, + cer_finetuned: currentCer + }; + + const html = ` +No trend data available
'; - return; - } - - // Simple text-based trend display (you could integrate Chart.js for visual charts) - const html = ` -- Trend data available. Integrate Chart.js or similar library for visual representation. -
- `; - - container.innerHTML = html; - } catch (error) { - document.getElementById('trends-chart').innerHTML = - 'Failed to load trend data
'; + // Use performance data to build a two-point trend (baseline vs fine-tuned) + // Only show if WER/CER data is available + if (performanceMock?.wer_baseline === undefined && performanceMock?.cer_baseline === undefined) { + container.innerHTML = 'WER/CER data not available
'; + return; } + const baseVal = metric === 'wer' ? (performanceMock?.wer_baseline ?? 0) * 100 : (performanceMock?.cer_baseline ?? 0) * 100; + const currentVal = metric === 'wer' ? (performanceMock?.wer_finetuned ?? 0) * 100 : (performanceMock?.cer_finetuned ?? 0) * 100; + const points = [ + { label: 'Baseline', value: baseVal }, + { label: 'Current Model', value: currentVal } + ]; + + const html = ` +No recent activity
-