OpenAI Whisper Speech-to-Text Transcription in the Browser with Backblaze B2

A JavaScript example app that runs OpenAI's Whisper automatic speech recognition (ASR) model entirely in the browser using Transformers.js and WebAssembly — no server GPU required. Audio files and transcripts are stored in Backblaze B2 cloud storage.

Upload audio (MP3, WAV, M4A, FLAC, OGG, WEBM), transcribe it to text client-side with Whisper, and save both the recording and the transcript to S3-compatible Backblaze B2 object storage — all from a single-page web app.

Why Client-Side Whisper?

No GPU server costs: the Whisper model runs in your browser via WebAssembly, so there's no inference server to provision or pay for
Privacy: transcription runs on the user's device, so audio is never sent to a speech-recognition service (it is uploaded only to your own B2 bucket for storage)
Simple to deploy: a static frontend plus a lightweight Node.js backend for pre-signed URLs is all you need

Technologies

Transformers.js — Run Hugging Face AI models like Whisper in the browser with WebAssembly
OpenAI Whisper — State-of-the-art open-source automatic speech recognition (ASR) model
Backblaze B2 — S3-compatible cloud object storage at $6/TB/month

What This Demonstrates

Client-side AI transcription: Run OpenAI Whisper entirely in the browser — no server GPU required
Cost-effective cloud storage: Store audio files and transcripts in Backblaze B2
Secure direct uploads: Browser-to-cloud uploads using S3 pre-signed URLs
Simple architecture: End-to-end flow from upload → transcribe → store

Architecture

User → Upload Audio → B2 Storage
                    ↓
Browser Whisper (Transformers.js) → Transcribe
                    ↓
      Transcript → B2 Storage

Flow

User selects/drops audio file in browser
Backend generates pre-signed PUT URL for B2
Browser uploads audio directly to B2
Browser loads Whisper model (Xenova/whisper-tiny.en)
Browser transcribes audio locally
Backend generates pre-signed PUT URL for transcript
Browser uploads transcript JSON to B2
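
The flow above can be sketched as one client-side function. This is a minimal sketch, not the app's actual code: `runFlow`, `postJson`, `putTo`, and the injected `transcribe` and `fetchImpl` parameters are hypothetical names, and the `{ text }` transcript shape is an assumption. Only the two endpoint paths and their request/response fields come from the API Endpoints section.

```javascript
// Hypothetical helper: POST a JSON body and return the parsed JSON response.
async function postJson(fetchImpl, url, body) {
  const res = await fetchImpl(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`POST ${url} failed with status ${res.status}`);
  return res.json();
}

// Hypothetical helper: PUT a body to a pre-signed URL.
async function putTo(fetchImpl, url, body, contentType) {
  const res = await fetchImpl(url, {
    method: 'PUT',
    headers: { 'Content-Type': contentType },
    body,
  });
  if (!res.ok) throw new Error(`PUT failed with status ${res.status}`);
}

// Sketch of the upload -> transcribe -> store flow.
// `file` is a File/Blob-like object with `name` and `type`;
// `transcribe` is any async function that turns the file into text.
async function runFlow({ apiUrl, file, transcribe, fetchImpl = fetch }) {
  // Ask the backend for a pre-signed URL, then upload the audio directly to B2.
  const audio = await postJson(fetchImpl, `${apiUrl}/api/presign-audio`, {
    filename: file.name,
    contentType: file.type,
  });
  await putTo(fetchImpl, audio.uploadUrl, file, file.type);

  // Transcribe locally in the browser.
  const text = await transcribe(file);

  // Pre-sign and upload the transcript JSON (assumed shape: { text }).
  const transcript = await postJson(fetchImpl, `${apiUrl}/api/presign-transcript`, {
    fileId: audio.fileId,
  });
  await putTo(fetchImpl, transcript.uploadUrl, JSON.stringify({ text }), 'application/json');

  return { audioKey: audio.key, transcriptKey: transcript.key, text };
}
```

Injecting `fetchImpl` and `transcribe` keeps the sequencing testable without a network or a loaded model. If the backend signs the `Content-Type` into the URL, the header sent on the PUT has to match it.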

Quick Start

Prerequisites

Node.js 18+
Backblaze B2 Account (free tier available)
- Create a bucket
- Generate an Application Key with the listBuckets, readFiles, writeFiles, and writeBuckets capabilities (writeBuckets is needed only for the automatic CORS setup)

1. Clone & Install

git clone https://github.com/backblaze-b2-samples/b2-whisper-transformersjs-transcriber.git
cd b2-whisper-transformersjs-transcriber/backend
npm install

2. Configure B2 Credentials

cp .env.example .env

Edit .env with your B2 credentials:

B2_ENDPOINT=https://s3.us-west-002.backblazeb2.com
B2_REGION=us-west-002
B2_KEY_ID=your_key_id_here
B2_APP_KEY=your_app_key_here
B2_BUCKET=your-bucket-name

Get your B2 endpoint and region from your bucket details page
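
A backend can fail fast when these settings are missing. The sketch below is an illustration only: `loadB2Config` and `REQUIRED_VARS` are hypothetical names, not part of the repository, and only the five variable names come from the `.env` example above.

```javascript
// Names of the variables listed in the .env example.
const REQUIRED_VARS = ['B2_ENDPOINT', 'B2_REGION', 'B2_KEY_ID', 'B2_APP_KEY', 'B2_BUCKET'];

// Hypothetical helper: read the B2 settings from an env-like object and
// throw a readable error if any are empty or still end in the "_here" placeholder.
function loadB2Config(env) {
  const missing = REQUIRED_VARS.filter((name) => {
    const value = (env[name] || '').trim();
    return value === '' || value.endsWith('_here');
  });
  if (missing.length > 0) {
    throw new Error(`Missing B2 settings in .env: ${missing.join(', ')}`);
  }
  return {
    endpoint: env.B2_ENDPOINT.trim(),
    region: env.B2_REGION.trim(),
    keyId: env.B2_KEY_ID.trim(),
    appKey: env.B2_APP_KEY.trim(),
    bucket: env.B2_BUCKET.trim(),
  };
}
```

At startup this would typically be called as `loadB2Config(process.env)` once the `.env` file has been loaded.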

3. Start the App

npm start

That's it! The server automatically:

✅ Configures B2 CORS for browser uploads
✅ Serves both frontend and API
✅ Opens at http://localhost:3000

4. Use the App

Open http://localhost:3000 in your browser
Upload an audio file (MP3, WAV, M4A, etc.)
Click "Transcribe with Whisper"
View transcription and access files in B2

⚠️ The first run downloads the Whisper model (~40MB), which takes about 1-2 minutes

Manual CORS Setup

If auto-setup fails (missing permissions), run manually:

npm run setup-cors

Required B2 Key Permissions:

listBuckets
readFiles
writeFiles
writeBuckets ← Required for CORS setup

Alternative - B2 CLI:

b2 update-bucket --cors-rules '[
  {
    "corsRuleName": "allowBrowserUploads",
    "allowedOrigins": ["*"],
    "allowedHeaders": ["*"],
    "allowedOperations": ["s3_put", "s3_get", "s3_head"],
    "maxAgeSeconds": 3600
  }
]' <bucket-name> allPublic

Alternative - B2 Web Console:

Go to https://secure.backblaze.com/b2_buckets.htm
Click your bucket → Bucket Settings → CORS Rules
Add the rules shown above
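
The rule above allows every origin. For anything beyond local testing you may prefer to list only the origins that serve the frontend. The following sketch generates the same rule for a given origin list; `buildCorsRules` is a hypothetical helper, and the field names and operations are copied from the rule shown above.

```javascript
// Hypothetical helper: build a B2 CORS rule like the one above, but limited
// to the origins that actually serve the frontend.
function buildCorsRules(allowedOrigins) {
  if (!Array.isArray(allowedOrigins) || allowedOrigins.length === 0) {
    throw new Error('Provide at least one origin, e.g. "http://localhost:3000"');
  }
  return [
    {
      corsRuleName: 'allowBrowserUploads',
      allowedOrigins,
      allowedHeaders: ['*'],
      allowedOperations: ['s3_put', 's3_get', 's3_head'],
      maxAgeSeconds: 3600,
    },
  ];
}

// Print JSON that can be pasted into the CLI command or the web console.
console.log(JSON.stringify(buildCorsRules(['http://localhost:3000']), null, 2));
```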

Usage

Open the frontend in your browser
Ensure the Backend API URL is correct (default: http://localhost:3000)
Drag and drop an audio file or click to browse
- Or download this example audio clip to test
Audio automatically uploads to B2
Click "Transcribe with Whisper"
Wait for transcription (first run downloads model)
View results and access files in B2

Deployment

Deploy Backend

Railway / Render / Fly.io:

Set environment variables from .env
Deploy backend/ directory
Update frontend apiUrl to deployed URL

Docker:

FROM node:18-alpine
WORKDIR /app
COPY backend/package*.json ./
RUN npm install
COPY backend/ ./
CMD ["node", "server.js"]

Deploy Frontend

Static Hosting (Netlify, Vercel, Cloudflare Pages):

Deploy frontend/ directory
Set API URL in settings or hardcode in index.html:170

B2 Static Hosting:

Upload frontend/index.html to B2 bucket
Enable website hosting on bucket
Access via B2 website URL

B2 Configuration

Bucket Settings

Create bucket (Private or Public based on needs)
For public access to audio/transcripts, set bucket to Public
Enable CORS so the browser can upload directly to B2 (the B2 endpoint is a different origin from the page that serves the frontend):

[
  {
    "corsRuleName": "allowAll",
    "allowedOrigins": ["*"],
    "allowedHeaders": ["*"],
    "allowedOperations": ["s3_put", "s3_get", "s3_head"],
    "maxAgeSeconds": 3600
  }
]

Generate B2 Keys

# Using B2 CLI
b2 create-key <keyName> listBuckets,readFiles,writeFiles

Or use B2 Web UI → App Keys → Create Key

API Endpoints

POST /api/presign-audio

Request:

{
  "filename": "audio.mp3",
  "contentType": "audio/mpeg"
}

Response:

{
  "uploadUrl": "https://...",
  "publicUrl": "https://...",
  "key": "audio/uuid.mp3",
  "fileId": "uuid"
}

POST /api/presign-transcript

Request:

{
  "fileId": "uuid"
}

Response:

{
  "uploadUrl": "https://...",
  "publicUrl": "https://...",
  "key": "transcripts/uuid.json"
}
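
The `key` values in both responses share one `fileId`, which is what links a transcript to its audio file. A sketch of how such keys could be derived is shown below; `buildKeys` is a hypothetical helper and the backend may build its keys differently. The caller supplies the id, for example from `crypto.randomUUID()`.

```javascript
// Hypothetical helper: derive object keys in the shape shown in the responses
// above ("audio/<id>.<ext>" and "transcripts/<id>.json").
function buildKeys(filename, fileId) {
  const dot = filename.lastIndexOf('.');
  // Keep only alphanumeric characters in the extension so a crafted
  // filename cannot add extra path segments to the key.
  const ext = dot > 0
    ? filename.slice(dot + 1).toLowerCase().replace(/[^a-z0-9]/g, '')
    : '';
  return {
    fileId,
    audioKey: ext ? `audio/${fileId}.${ext}` : `audio/${fileId}`,
    transcriptKey: `transcripts/${fileId}.json`,
  };
}
```

Using a generated id rather than the original filename avoids collisions when two users upload files with the same name.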

Technical Details

Whisper Model

This example uses the Xenova/whisper-tiny.en model, a quantized version of OpenAI's Whisper optimized for in-browser inference via Transformers.js. You can swap it for larger Whisper variants (base, small, medium) for higher accuracy at the cost of longer load times.

Model: Xenova/whisper-tiny.en (English only, 39M params)
Library: Transformers.js — Run Hugging Face transformer models in the browser
Quantization: q8 (8-bit) for faster WebAssembly inference
Size: ~40MB download (cached in browser after first load)
Speed: ~30 seconds to transcribe 1 minute of audio
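
Loading and calling the model could look roughly like the sketch below. It assumes the Transformers.js `pipeline` factory is passed in by the caller (the package to import it from depends on the Transformers.js version the app uses); `createTranscriber` is a hypothetical wrapper, and the chunk and stride values are illustrative rather than the app's settings.

```javascript
// Sketch: wrap the Transformers.js `pipeline` factory so the model loads once.
// In the browser the caller would pass the real factory imported from
// Transformers.js; it is injected here so the sketch has no hard dependency.
function createTranscriber(pipeline, model = 'Xenova/whisper-tiny.en') {
  let asrPromise = null; // cached so the model is only loaded once per page

  return async function transcribe(audio, onProgress) {
    if (!asrPromise) {
      asrPromise = pipeline('automatic-speech-recognition', model, {
        progress_callback: onProgress, // receives model download progress events
      });
    }
    const asr = await asrPromise;
    // Whisper processes 30-second windows; chunking with an overlap lets
    // longer recordings be transcribed piece by piece.
    const result = await asr(audio, { chunk_length_s: 30, stride_length_s: 5 });
    return result.text.trim();
  };
}
```

`audio` can be a URL string or a `Float32Array` of audio samples. Swapping in a larger variant only means passing a different model id as the second argument.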

Storage

Provider: Backblaze B2
API: S3-compatible API with pre-signed URLs
Pricing: $6/TB/month storage, uploads are FREE
Documentation: B2 S3-Compatible API Docs

Supported Audio Formats

MP3, WAV, OGG, M4A, WEBM, FLAC
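
The `contentType` sent to `/api/presign-audio` has to describe the file. Browsers usually fill in `file.type`, but a fallback keyed on the extension can help when it is empty. The mapping below is a sketch: `contentTypeFor` is a hypothetical helper and the MIME types are common choices for these formats, not values taken from the app.

```javascript
// Common MIME types for the supported extensions (assumed, not from the app).
const AUDIO_TYPES = new Map([
  ['mp3', 'audio/mpeg'],
  ['wav', 'audio/wav'],
  ['ogg', 'audio/ogg'],
  ['m4a', 'audio/mp4'],
  ['webm', 'audio/webm'],
  ['flac', 'audio/flac'],
]);

// Hypothetical helper: return the MIME type for a filename,
// or null when the extension is not a supported audio format.
function contentTypeFor(filename) {
  const ext = filename.split('.').pop().toLowerCase();
  return AUDIO_TYPES.get(ext) || null;
}
```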

Browser Compatibility

Chrome 90+
Edge 90+
Firefox 90+
Safari 15.4+

Requires WebAssembly and ES6 modules support.

Limitations

First transcription loads model (~40MB, one-time)
Whisper-tiny less accurate than base/small/medium
English only (use Xenova/whisper-tiny for multilingual)
Browser must stay open during transcription
Large files (>30min) may be slow

Potential Improvements

Add recording directly in browser (MediaRecorder API)
Support larger Whisper models (base, small, medium)
Progress callback for transcription
Batch processing multiple files
Word-level timestamps
Speaker diarization
Multi-language support using Xenova/whisper-tiny (not -tiny.en)

Related Resources

Transformers.js Documentation — Run Hugging Face AI models in the browser with WebAssembly
Transformers.js GitHub — Source code and examples
OpenAI Whisper — Original Whisper automatic speech recognition model
Whisper Models on Hugging Face — Pre-trained Whisper model variants (tiny, base, small, medium, large)
Backblaze B2 Documentation — Cloud storage API docs
B2 S3-Compatible API — Use standard S3 SDKs with Backblaze B2

Troubleshooting

CORS Error: "Access to fetch has been blocked by CORS policy"

Problem: Browser shows CORS error when uploading audio.

Solution:

Run npm run setup-cors in the backend directory
Or manually configure CORS on your B2 bucket (see the Manual CORS Setup section)
Verify CORS is set: Go to B2 Console → Your Bucket → Settings → CORS Rules

Required CORS settings:

Allowed Origins: * (or specific origins like http://localhost:3000)
Allowed Methods: GET, PUT, HEAD
Allowed Headers: *

Backend Connection Error

Problem: Frontend can't connect to backend API.

Solution:

Verify backend is running: curl http://localhost:3000/health
Check API URL in frontend matches backend (default: http://localhost:3000)
Look for CORS errors in the browser console and request errors in the backend logs
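
The same check can be run from the frontend before an upload starts. This is a sketch under assumptions: `checkBackend` is a hypothetical helper, the 3-second timeout is arbitrary, and only the `/health` path comes from the curl command above.

```javascript
// Hypothetical helper: probe the backend's /health endpoint with a timeout
// and return a result object instead of throwing.
async function checkBackend(apiUrl, { fetchImpl = fetch, timeoutMs = 3000 } = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`${apiUrl}/health`, { signal: controller.signal });
    return res.ok
      ? { ok: true, detail: `backend reachable (HTTP ${res.status})` }
      : { ok: false, detail: `backend responded with HTTP ${res.status}` };
  } catch (err) {
    // Network failures, CORS rejections, and timeouts all end up here.
    return { ok: false, detail: `cannot reach backend: ${err.message}` };
  } finally {
    clearTimeout(timer);
  }
}
```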

Transcription Fails or Hangs

Problem: Whisper model fails to load or transcribe.

Solution:

First run takes time: Model downloads ~40MB, wait 1-2 minutes
Check browser console: Look for specific errors
Try smaller file: Test with <1 minute audio first
Clear cache: Hard refresh browser (Ctrl+Shift+R / Cmd+Shift+R)
Use supported browser: Chrome, Edge, or Firefox recommended

Upload Works but Can't Access Files

Problem: Files upload but URLs don't work.

Solution:

Check bucket is public or URLs are pre-signed
Verify endpoint URL matches bucket region
Try accessing URL directly in browser
Check B2 bucket lifecycle rules aren't deleting files
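
The endpoint and region check can be automated. The sketch below assumes endpoints follow the `https://s3.<region>.backblazeb2.com` pattern shown in the `.env` example; `regionFromEndpoint` and `endpointMatchesRegion` are hypothetical helpers.

```javascript
// Hypothetical helper: extract the region from a B2 S3 endpoint URL,
// or return null if the hostname does not follow the expected pattern.
// Throws if `endpoint` is not a valid URL.
function regionFromEndpoint(endpoint) {
  const host = new URL(endpoint).hostname;
  const match = /^s3\.([a-z0-9-]+)\.backblazeb2\.com$/.exec(host);
  return match ? match[1] : null;
}

// True when B2_ENDPOINT and B2_REGION agree with each other.
function endpointMatchesRegion(endpoint, region) {
  return regionFromEndpoint(endpoint) === region;
}
```

For example, `endpointMatchesRegion(process.env.B2_ENDPOINT, process.env.B2_REGION)` returning `false` points to a mismatch between the two settings.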

ContentScript.bundle.js Errors

Problem: Console shows errors from contentScript.bundle.js.

Solution: These are from browser extensions (like Claude Code). Safe to ignore - they don't affect the app.

License

This project is licensed under the MIT License. See the LICENSE file for details.
