fix: WHIP/WHEP inactive status, CouchDB resilience, and CORS centralization by LucasMaupin · Pull Request #194 · Eyevinn/intercom-manager

LucasMaupin · 2026-02-26T16:06:42Z

Summary

Fixes three related issues with WHIP/WHEP session management and CouchDB stability:

WHIP/WHEP sessions never shown as inactive: All three API response mappings (api_productions.ts x2, production_manager.ts x1) forced isActive: true for WHIP sessions, so they'd jump straight from active to gone when expired. Now passes through the actual DB value.
CouchDB crash loop (146+ restarts on intercomprod): Transient network errors (ECONNRESET, socket hang up) on CouchDB calls were unhandled and crashed the Node.js process. Adds a withRetry() wrapper with exponential backoff around ALL nano DB operations, plus connection retry (5 attempts) and a 10s request timeout.
CORS centralization: Moves wildcard CORS from per-route onSend hooks in api_whip.ts/api_whep.ts to a centralized delegator in the @fastify/cors plugin. WHIP/WHEP routes get origin: '*', everything else uses CORS_ORIGIN.

Changes

`src/db/couchdb.ts`

withRetry<T>() — generic retry wrapper for all DB operations (3 retries, 100/200/400ms backoff)
isTransientError() — detects ECONNRESET, ETIMEDOUT, socket hang up, EHOSTUNREACH
connect() — retry with backoff (5 attempts, 1/2/4/8/16s)
nano client configured with 10s request timeout
Proper 404 handling in getNextSequence (no longer swallows non-404 errors)
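
The retry behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the names withRetry and isTransientError come from the description, but the signatures, the injectable sleep parameter, and the way errors are matched against the marker list are assumptions made so the example is self-contained.

```typescript
// Sketch only: function names mirror the PR description; bodies and signatures are assumptions.

const TRANSIENT_MARKERS = ['ECONNRESET', 'ETIMEDOUT', 'EHOSTUNREACH', 'socket hang up'];

// Treat an error as transient if its code or message mentions a known marker.
function isTransientError(err: unknown): boolean {
  const e = err as { code?: unknown; message?: unknown } | null;
  const text = `${String(e?.code ?? '')} ${String(e?.message ?? '')}`;
  return TRANSIENT_MARKERS.some((marker) => text.includes(marker));
}

// Hypothetical injection point so callers (and tests) can replace the real timer.
type Sleep = (ms: number) => Promise<void>;
const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `operation`, retrying transient failures up to `retries` times with
// delays of baseDelayMs * 2^attempt (100, 200, 400 ms with the defaults).
async function withRetry<T>(
  operation: () => Promise<T>,
  retries = 3,
  baseDelayMs = 100,
  sleep: Sleep = defaultSleep
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Non-transient errors and exhausted retry budgets propagate to the caller.
      if (!isTransientError(err) || attempt >= retries) throw err;
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

A call site would wrap each DB call in a closure, for example `withRetry(() => db.get(id))`, where `db` stands in for a nano database handle. Errors that are not transient, such as a 404, are rethrown immediately and are not retried.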

`src/db/couchdb.test.ts`

360 lines of new tests covering retry behavior, transient error handling, connection failures

`src/server.ts`

process.exit(1) in uncaughtException handler for clean pod restart
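
As an illustration of the exit-on-uncaught-exception behavior, here is a minimal sketch. The ProcessLike interface and the installFatalHandler name are hypothetical and exist only to make the example testable without touching the real process object. The actual handler in src/server.ts may look different.

```typescript
// Hypothetical minimal slice of Node's `process`; injected so the sketch can be exercised safely.
interface ProcessLike {
  on(event: 'uncaughtException', handler: (err: Error) => void): unknown;
  exit(code: number): void;
}

// Register a handler that logs the error and exits non-zero, so the
// orchestrator restarts the pod instead of leaving a half-broken process running.
function installFatalHandler(
  proc: ProcessLike,
  log: (msg: string, err: Error) => void
): void {
  proc.on('uncaughtException', (err) => {
    log('Uncaught exception, exiting for a clean restart', err);
    proc.exit(1);
  });
}

// In a Node.js entry point this could be wired as (assumption, not from the PR):
// installFatalHandler(process, (msg, err) => console.error(msg, err));
```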

`src/api_productions.ts`

Lines 514, 948: isActive: s.isWhip ? true : ... → isActive: !!s.isActive

`src/production_manager.ts`

Line 490: same isWhip override removed
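
The effect of the isActive change in the response mappings can be sketched like this. Only isWhip and isActive are named in the PR. The StoredSession and ParticipantView shapes and the toParticipantView helper are hypothetical stand-ins for the real mapping code.

```typescript
// Hypothetical session shape; only isWhip and isActive appear in the PR description.
interface StoredSession {
  name: string;
  isWhip?: boolean;
  isActive?: boolean;
}

interface ParticipantView {
  name: string;
  isWhip: boolean;
  isActive: boolean;
}

function toParticipantView(s: StoredSession): ParticipantView {
  return {
    name: s.name,
    isWhip: !!s.isWhip,
    // WHIP sessions are no longer forced to true; the stored value passes
    // through, coerced to a boolean so a missing field reads as inactive.
    isActive: !!s.isActive
  };
}
```

With this mapping, a WHIP session whose stored isActive is false is reported as inactive rather than active, which is what lets the frontend show the "(inactive)" state before the session disappears.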

`src/api.ts`

Centralized CORS delegator: WHIP/WHEP routes get origin: '*', all other routes use CORS_ORIGIN
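
The per-route decision made by the delegator could look roughly like the sketch below. The route prefixes, the corsOptionsFor helper, and the treatment of CORS_ORIGIN as a single origin string are assumptions for illustration. The PR does not list the actual WHIP/WHEP paths or the format of CORS_ORIGIN.

```typescript
// Hypothetical route prefixes; the real WHIP/WHEP paths are not given in the PR.
const WILDCARD_PREFIXES = ['/api/v1/whip', '/api/v1/whep'];

interface CorsChoice {
  origin: string;
}

// Pick CORS options per request: wildcard for WHIP/WHEP routes,
// the configured origin for everything else.
function corsOptionsFor(url: string, corsOrigin: string): CorsChoice {
  const isStreamingRoute = WILDCARD_PREFIXES.some((prefix) => url.startsWith(prefix));
  return { origin: isStreamingRoute ? '*' : corsOrigin };
}

// Possible wiring with @fastify/cors, which accepts a function that returns
// a (req, callback) delegator (untested here, shown as an assumption):
// fastify.register(cors, () => (req, callback) => {
//   callback(null, corsOptionsFor(req.url, process.env.CORS_ORIGIN ?? ''));
// });
```

Keeping the decision in one pure function means the routing rule can be unit tested without starting a server, and the WHIP/WHEP route files no longer need their own onSend hooks.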

`src/api_whip.ts` / `src/api_whep.ts`

Removed redundant per-route onSend CORS hooks

Test plan

All 228 backend tests pass
Start a WHIP session → verify it shows as active
Stop the WHIP source → verify it transitions to "(inactive)" before disappearing
Simulate CouchDB transient error → verify retry succeeds without process crash
Verify WHIP/WHEP endpoints respond with Access-Control-Allow-Origin: *
Verify non-WHIP/WHEP endpoints use restrictive CORS from CORS_ORIGIN
Deploy to beta and confirm pod restart count stabilizes

🤖 Generated with Claude Code

Both participant API endpoints were overriding isActive to always return true for WHIP/WHEP sessions, preventing them from ever appearing as inactive in the frontend. Sessions would jump straight from active to gone when they expired. Now the actual DB value is passed through so the frontend can show the inactive state.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

The existing insertWithRetry only covered insert operations. Transient network errors (ECONNRESET, socket hang up) on any CouchDB call would still crash the process. This adds:

- withRetry() wrapper around ALL nano DB operations (get, list, insert, destroy) with exponential backoff (3 retries, 100/200/400ms)
- isTransientError() detection for ECONNRESET, ETIMEDOUT, socket hang up, and EHOSTUNREACH
- Connection retry with backoff (5 attempts) in connect()
- 10s request timeout in nano client config
- process.exit(1) in uncaughtException handler to ensure clean restart instead of zombie process
- 360 lines of new CouchDB tests covering retry behavior

Fixes the recurring pod restarts (146+ on intercomprod) caused by unhandled socket hang up errors from intermediate network components.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Move wildcard CORS from per-route onSend hooks in api_whip.ts and api_whep.ts to a centralized delegator in the @fastify/cors plugin. WHIP/WHEP routes get origin: '*', everything else uses CORS_ORIGIN.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Third location where isActive was forced to true for WHIP sessions, in the getUsersForLine response mapping. Also adds production manager resilience tests for session lifecycle edge cases.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

LucasMaupin requested a review from birme as a code owner February 26, 2026 16:06

LucasMaupin and others added 3 commits February 26, 2026 17:11

LucasMaupin changed the title ~~fix: WHIP/WHEP sessions never shown as inactive~~ fix: WHIP/WHEP inactive status, CouchDB resilience, and CORS centralization Feb 26, 2026

birme merged commit 8fc168b into main Feb 26, 2026
4 checks passed

LucasMaupin deleted the fix/whip-whep-inactive-status branch February 27, 2026 11:08

LucasMaupin mentioned this pull request Feb 27, 2026

Recover user sessions after Manager redeploy/crash #84

Closed

birme merged 4 commits into main from fix/whip-whep-inactive-status