14 changes: 14 additions & 0 deletions .claude/settings.local.json
@@ -0,0 +1,14 @@
{
  "permissions": {
    "allow": [
      "Bash(npm install)",
      "Bash(npm install:*)",
      "Bash(if exist node_modules rmdir /s /q node_modules)",
      "Bash(if exist package-lock.json del package-lock.json)",
      "Bash(powershell:*)",
      "Read(//c/Users/qc_de/simple-whisper-transcription/**)"

Bug: Absolute Paths in Config Files Cause Cross-Platform Issues

Local configuration files (.claude/settings.local.json and whisper/.claude/settings.local.json) were committed with hardcoded, user- and Windows-specific absolute paths. This prevents the configuration from working for other developers or on different operating systems.


    ],
    "deny": [],
    "ask": []
  }
}
5 changes: 4 additions & 1 deletion .gitignore
@@ -1,2 +1,5 @@
node_modules
.cursor
.cursor
*.onnx
whisper/models
models
141 changes: 141 additions & 0 deletions WHISPER_INTEGRATION.md
@@ -0,0 +1,141 @@
# Whisper AI Integration for ScrumAI

This integration adds real-time speech transcription to the ScrumAI application using OpenAI's Whisper model.

## Features

- **Real-time Speech Transcription**: Converts speech to text in real-time during meetings
- **Live Display**: Transcripts appear instantly in the "Transcript" tab
- **Automatic Saving**: When a meeting ends, transcripts are automatically saved as `meetingnotes_[timestamp].txt` to your home directory
- **Keyword Extraction**: Automatically extracts keywords from the transcript for the Keywords tab
- **Cross-platform**: Works on Windows, macOS, and Linux

## Setup Instructions

### 1. Install Python Dependencies

On Windows, run the setup script (a batch file) to create a Python virtual environment and install dependencies:

```bat
setup_whisper.bat
```

This will:
- Create a Python virtual environment (`whisper_env`)
- Install required packages (numpy, sounddevice, onnxruntime, PyYAML)
- Set up the Whisper models

### 2. Verify Installation

The application will automatically detect if the required files are present:
- Python executable in `whisper_env` or system PATH
- Whisper model files in `whisper/models/`
- Configuration file at `whisper/config.yaml`
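
The checks listed above can be sketched in Python as follows. This is only an illustration of the idea, not the application's actual detection code: the function name `find_missing` and the virtual-environment layouts it probes (`Scripts/python.exe` on Windows, `bin/python` elsewhere) are assumptions.

```python
import shutil
from pathlib import Path
from typing import List


def find_missing(root: Path) -> List[str]:
    """Return the required items that cannot be found under `root`.

    Hypothetical helper: the real application may check differently.
    """
    missing = []

    # Accept an interpreter inside whisper_env (Windows or POSIX layout),
    # or fall back to any Python found on the system PATH.
    venv_candidates = [
        root / "whisper_env" / "Scripts" / "python.exe",
        root / "whisper_env" / "bin" / "python",
    ]
    has_python = any(p.is_file() for p in venv_candidates) or bool(
        shutil.which("python") or shutil.which("python3")
    )
    if not has_python:
        missing.append("python")

    # Model and configuration files named in this document.
    for rel in (
        "whisper/models/WhisperEncoder.onnx",
        "whisper/models/WhisperDecoder.onnx",
        "whisper/config.yaml",
    ):
        if not (root / rel).is_file():
            missing.append(rel)
    return missing
```

Calling `find_missing(Path("."))` from the project root returns an empty list when everything is in place.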

### 3. Run the Application

Start the ScrumAI application as usual:

```bash
npm start
```

## How to Use

1. **Start Meeting**: Click the "Start Meeting" button
   - This will initialize the Whisper transcription service
   - A microphone permission dialog may appear - grant permission
   - You'll see status messages in the console

2. **Begin Speaking**: Start talking normally
   - Real-time transcripts will appear in the "Transcript" tab
   - Keywords will be automatically extracted and shown in the "Keywords" tab
   - Timestamps are added to each transcript entry

3. **Stop Meeting**: Click the "Stop Meeting" button
   - This stops the transcription
   - Automatically saves the full transcript as `meetingnotes_[timestamp].txt` in your home directory
   - Shows a confirmation dialog with the saved file location
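
As a rough illustration of the naming and entry format, the Python sketch below builds a `meetingnotes_[timestamp].txt` name and a timestamped transcript line. The timestamp pattern (a UTC ISO-style stamp with `:` and `.` replaced by `-`) and the `[HH:MM:SS]: text` layout are assumptions modelled on the sample note files, and the helper names are hypothetical; the application itself may build these differently.

```python
from datetime import datetime, timezone
from pathlib import Path


def notes_filename(when: datetime) -> str:
    """Build a file name such as meetingnotes_2025-09-14T08-44-59-295Z.txt.

    Assumed pattern: UTC time, with ':' and '.' replaced by '-'.
    """
    utc = when.astimezone(timezone.utc)
    stamp = utc.strftime("%Y-%m-%dT%H-%M-%S")
    millis = utc.microsecond // 1000
    return f"meetingnotes_{stamp}-{millis:03d}Z.txt"


def notes_path(when: datetime) -> Path:
    """Place the notes file in the current user's home directory."""
    return Path.home() / notes_filename(when)


def format_entry(when: datetime, text: str) -> str:
    """Format one transcript line as '[HH:MM:SS]: text' (assumed layout)."""
    return f"[{when.strftime('%H:%M:%S')}]: {text.strip()}"
```

For example, `notes_filename(datetime(2025, 9, 14, 8, 44, 59, 295000, tzinfo=timezone.utc))` yields `meetingnotes_2025-09-14T08-44-59-295Z.txt`.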

## File Structure

```
scrumAI/
├── whisper/
│   ├── transcriber_for_nodejs.py    # Main transcription script
│   ├── standalone_model.py          # Whisper model wrapper
│   ├── standalone_whisper.py        # Whisper implementation
│   ├── config.yaml                  # Configuration
│   ├── mel_filters.npz              # Mel filter coefficients
│   ├── requirements_minimal.txt     # Python dependencies
│   └── models/
│       ├── WhisperEncoder.onnx      # Encoder model
│       └── WhisperDecoder.onnx      # Decoder model
├── src/
│   ├── services/
│   │   └── whisperService.js        # Node.js Whisper service wrapper
│   └── electron/
│       ├── main.js                  # Updated with Whisper integration
│       └── preload.js               # Updated with IPC methods
├── whisper_env/                     # Python virtual environment
└── setup_whisper.bat                # Setup script
```

## Configuration

The Whisper service can be configured by editing `whisper/config.yaml`:

```yaml
# Audio settings
sample_rate: 16000 # Audio sample rate in Hz
chunk_duration: 4 # Duration of each audio chunk in seconds
channels: 1 # Number of audio channels (1 for mono)

# Processing settings
max_workers: 4 # Number of parallel transcription workers
silence_threshold: 0.001 # Threshold for silence detection
queue_timeout: 1.0 # Timeout for audio queue operations

# Model paths
encoder_path: "whisper/models/WhisperEncoder.onnx"
decoder_path: "whisper/models/WhisperDecoder.onnx"
```
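
To show how these settings relate to each other, here is a minimal Python sketch that validates the audio and processing values and derives the number of audio frames in one chunk. The `AudioConfig` class and its method names are hypothetical and not part of the transcriber; the input is assumed to be the dictionary produced by parsing `config.yaml` (for instance with PyYAML's `yaml.safe_load`). Keys the class does not know, such as the model paths, are ignored here.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class AudioConfig:
    # Defaults mirror the values shown in config.yaml above.
    sample_rate: int = 16000
    chunk_duration: float = 4
    channels: int = 1
    max_workers: int = 4
    silence_threshold: float = 0.001
    queue_timeout: float = 1.0

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "AudioConfig":
        """Build a config from a parsed YAML mapping, ignoring unknown keys."""
        known = {k: raw[k] for k in cls.__dataclass_fields__ if k in raw}
        cfg = cls(**known)
        if cfg.sample_rate <= 0 or cfg.chunk_duration <= 0:
            raise ValueError("sample_rate and chunk_duration must be positive")
        if cfg.max_workers < 1:
            raise ValueError("max_workers must be at least 1")
        return cfg

    @property
    def frames_per_chunk(self) -> int:
        """Audio frames captured per chunk: sample_rate * chunk_duration."""
        return int(self.sample_rate * self.chunk_duration)
```

With the default values, one chunk holds 16000 × 4 = 64000 frames, which is why shortening `chunk_duration` makes each transcription unit smaller.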

## Troubleshooting

### Common Issues

1. **"Python not found"**
   - Ensure Python 3.8+ is installed
   - Run `setup_whisper.bat` to create virtual environment

2. **"Model files not found"**
   - Ensure the Whisper ONNX models are in `whisper/models/`
   - Check that `WhisperEncoder.onnx` and `WhisperDecoder.onnx` exist

3. **"Microphone access denied"**
   - Grant microphone permissions to the application
   - Check your operating system's privacy settings

4. **No transcript appearing**
   - Check the console for error messages
   - Ensure you're speaking loud enough (above silence threshold)
   - Verify the microphone is working in other applications

### Performance Tips

- For better performance on lower-end hardware, reduce `max_workers` in config.yaml
- Increase `silence_threshold` if picking up too much background noise
- Decrease `chunk_duration` for more responsive transcription (but higher CPU usage)
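
To make the effect of `silence_threshold` concrete, the sketch below shows one common way such a threshold can be applied: a chunk is skipped when its root-mean-square (RMS) level falls below the threshold. Whether the transcriber uses RMS or some other measure is an assumption, and the function names are hypothetical.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a chunk of float samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_silent(samples: Sequence[float], threshold: float = 0.001) -> bool:
    """Treat a chunk as silence when its RMS level is below the threshold.

    Assumption: the real silence check may use a different energy measure.
    """
    return rms(samples) < threshold
```

Under this model, raising the threshold causes more quiet chunks (including background noise) to be skipped, while lowering it lets softer speech through.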

## Technical Details

The integration works by:

1. **Electron Main Process** spawns a Python child process running the Whisper transcriber
2. **Python Process** captures audio from the microphone and processes it through the Whisper model
3. **IPC Communication** sends transcript data back to the Electron app via JSON over stdout
4. **Renderer Process** receives transcript events and updates the UI in real-time
5. **File I/O** saves the complete transcript when the meeting ends
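
The stdout channel in step 3 can be pictured as one JSON object per line. The Python sketch below is an illustration under assumptions, not the transcriber's actual protocol: the `emit` and `parse_line` helpers and the field names (`type`, `text`, `timestamp`) are hypothetical.

```python
import json
import sys
from typing import IO, Optional


def emit(event: str, stream: Optional[IO[str]] = None, **fields) -> None:
    """Write one JSON message per line and flush immediately.

    Flushing matters because the parent process reads stdout as a pipe,
    and buffered output would delay transcripts.
    """
    out = stream if stream is not None else sys.stdout
    out.write(json.dumps({"type": event, **fields}) + "\n")
    out.flush()


def parse_line(line: str) -> Optional[dict]:
    """Decode one line from the child process.

    Returns None for blank lines and anything that is not a JSON object,
    so stray log output does not break the reader.
    """
    line = line.strip()
    if not line:
        return None
    try:
        msg = json.loads(line)
    except json.JSONDecodeError:
        return None
    return msg if isinstance(msg, dict) else None
```

A call such as `emit("transcript", text="hello", timestamp="08:50:35")` writes a single line that the receiving side can split on newlines and decode independently.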

The system is designed to handle errors such as denied microphone access or model loading failures gracefully.
4 changes: 4 additions & 0 deletions meetingnotes_2025-09-14T08-44-59-295Z.txt
@@ -0,0 +1,4 @@
Meeting Session: meeting_20250914_084458
Started: 2025-09-14 08:44:59
============================================================

4 changes: 4 additions & 0 deletions meetingnotes_2025-09-14T08-47-47-141Z.txt
@@ -0,0 +1,4 @@
Meeting Session: meeting_20250914_084746
Started: 2025-09-14 08:47:47
============================================================

4 changes: 4 additions & 0 deletions meetingnotes_2025-09-14T08-49-08-891Z.txt
@@ -0,0 +1,4 @@
Meeting Session: meeting_20250914_084908
Started: 2025-09-14 08:49:08
============================================================

4 changes: 4 additions & 0 deletions meetingnotes_2025-09-14T08-49-42-449Z.txt
@@ -0,0 +1,4 @@
Meeting Session: meeting_20250914_084941
Started: 2025-09-14 08:49:42
============================================================

28 changes: 28 additions & 0 deletions meetingnotes_2025-09-14T08-50-09-832Z.txt
@@ -0,0 +1,28 @@
Meeting Session: meeting_20250914_085009
Started: 2025-09-14 08:50:09
============================================================

[08:50:15]: , please. Hello, my name is Sean.
[08:50:19]: , I'm a Coron's two-roomed.
[08:50:26]: , finally why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is it? why is
[08:50:27]: , yeah, can you talk to me more about.
[08:50:31]: , but NYU Car Art, you know, I don't know anything much.
[08:50:35]: , thank you for your time.
[08:50:39]: the other one.
[08:50:43]: , please.
[08:50:47]: , you.
[08:50:51]: , it's not doing this, can you do one thing?
[08:50:55]: , you do this. Your data is getting...
[08:50:59]: , but I think that's what I want to look at.
[08:51:03]: , he's basically ever submitted in like now. He's not that good.
[08:51:07]: , you know we signed by just a minute.
[08:51:11]: the problem. Nobody
[08:51:15]: , and then we will remove the Ds store, all that stuff and then put it in like properly, you'll put it in.
[08:51:19]: , I'll keep the...
[08:51:23]: , and then make it like the first thing called the White Cash, all the random things.
[08:51:27]: , and ex-piles it.
[08:51:31]: , but it will come and emerge this time.
[08:51:35]: , but it's not the same.
[08:51:39]: , you can more job. Keep that way. Yeah.
[08:51:43]: , and we'll see you next time.
[08:51:47]: , okay.
31 changes: 31 additions & 0 deletions meetingnotes_2025-09-14T09-04-00-968Z.txt
@@ -0,0 +1,31 @@
Meeting Session: meeting_20250914_090400
Started: 2025-09-14 09:04:00
============================================================

[09:04:08]: , this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this
[09:04:10]: , but it doesn't
[09:04:14]: , it is.
[09:04:18]: , thanks a lot.
[09:04:27]: , I'm not a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy.
[09:04:27]: , with 26 letters and 10 numbers.
[09:04:30]: , but I also forgot to consider other characters.
[09:04:34]: the animation.
[09:04:38]: , like, at the eight hashtag dollar, or same sign.
[09:04:48]: the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character
[09:04:49]: , and I'm a bad guy here for two.
[09:04:50]: , and I'm so happy to be here.
[09:04:54]: , oh my god, this is such a terrible animal.
[09:04:59]: , okay. Do you want me to keep talking? Yes, yes.
[09:05:03]: , so let's keep talking. Let's talk about something go ahead and like which it can understand in like...
[09:05:06]: , yeah. Okay. Yeah. Um,
[09:05:10]: , usually is an AI powered event discovery social media platform.
[09:05:14]: , that addresses the loneliness epidemic.
[09:05:18]: , where 103 people from our generation suffer.
[09:05:22]: , where from chronic loneliness and we believe that the best
[09:05:26]: , and I think that's why I'm here.
[09:05:30]: the same thing.
[09:05:34]: , and when you register for the event.
[09:05:38]: , for an event, you're a match of what's so good. We then...
[09:05:42]: , so they can host a data.
[09:05:46]: , so that way we have an equal system.
[09:05:50]: , and businesses with the B2B and the B2B.
7 changes: 7 additions & 0 deletions meetingnotes_2025-09-14T11-34-49-983Z.txt
@@ -0,0 +1,7 @@
[04:34:02] , hello, hello, hello, hello.
[04:34:06] , what is happening.
[04:34:14] , hello.
[04:34:18] , where am I is this New York?
[04:34:24] the other side. What are you guys doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing? What are you doing
[04:34:26] , where are you sleeping so much wake up please?
[04:34:34] , how is it good morning?
2 changes: 2 additions & 0 deletions meetingnotes_2025-09-14T12-29-57-555Z.txt
@@ -0,0 +1,2 @@
[05:29:41] , hello, it's Al, how are you guys doing?
[05:29:49] , what is up? Is this New York?
4 changes: 4 additions & 0 deletions meetingnotes_2025-09-14T12-31-12-563Z.txt
@@ -0,0 +1,4 @@
[05:30:58] , hi hi hello this is new y'all
[05:31:02] , hello hello and see you.
[05:31:06] , and keep coming, and then keep coming.
[05:31:10] , keep coming in all that.
7 changes: 7 additions & 0 deletions meetingnotes_2025-09-14T13-26-56-089Z.txt
@@ -0,0 +1,7 @@
[06:26:29] , all it is okay with it.
[06:26:33] , all it is okay with that.
[06:26:37] , with it. Yeah, yeah. Oh.
[06:26:41] , what is this guy? I listen to it.
[06:26:45] , what was this?
[06:26:49] the person was done.
[06:26:53] , we should be very smart, you're not full.
9 changes: 9 additions & 0 deletions meetingnotes_2025-09-14T15-01-14-612Z.txt
@@ -0,0 +1,9 @@
[08:00:29] , hello, nor constant.
[08:00:33] , hello, hi nice.
[08:00:37] , okay thank you. I broke that out. The keywords are the
[08:00:41] , is broken. I would be ignored.
[08:00:45] , I am the best.
[08:00:49] , it's a message.
[08:00:53] , yeah.
[08:00:57] , oh we have grown one second so in the other end.
[08:01:12] , it's generally not only after it's done, it's no pressure to do it, it's a buffer.
6 changes: 6 additions & 0 deletions meetingnotes_2025-09-14T15-04-27-507Z.txt
@@ -0,0 +1,6 @@
[08:04:04] , and I'm going to work on this.
[08:04:08] , what do you think about that?
[08:04:13] , I'll go on his voice, but I'm going to work on this.
[08:04:16] the same thing.
[08:04:21] , and then we will get the best of you. And then, yeah, I'll tell you something more.
[08:04:24] the dot net devocable is going to be taking this time you know.
27 changes: 27 additions & 0 deletions meetingnotes_2025-09-14T16-05-55-698Z.txt
@@ -0,0 +1,27 @@
[09:04:08] , this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this is a this
[09:04:10] , but it doesn't
[09:04:14] , it is.
[09:04:18] , thanks a lot.
[09:04:27] , I'm not a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy. I'm a bad guy.
[09:04:27] , with 26 letters and 10 numbers.
[09:04:30] , but I also forgot to consider other characters.
[09:04:34] the animation.
[09:04:38] , like, at the eight hashtag dollar, or same sign.
[09:04:48] the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character of the character
[09:04:49] , and I'm a bad guy here for two.
[09:04:50] , and I'm so happy to be here.
[09:04:54] , oh my god, this is such a terrible animal.
[09:04:59] , okay. Do you want me to keep talking? Yes, yes.
[09:05:03] , so let's keep talking. Let's talk about something go ahead and like which it can understand in like...
[09:05:06] , yeah. Okay. Yeah. Um,
[09:05:10] , usually is an AI powered event discovery social media platform.
[09:05:14] , that addresses the loneliness epidemic.
[09:05:18] , where 103 people from our generation suffer.
[09:05:22] , where from chronic loneliness and we believe that the best
[09:05:26] , and I think that's why I'm here.
[09:05:30] the same thing.
[09:05:34] , and when you register for the event.
[09:05:38] , for an event, you're a match of what's so good. We then...
[09:05:42] , so they can host a data.
[09:05:46] , so that way we have an equal system.
[09:05:50] , and businesses with the B2B and the B2B.