Kaiser Echo - Bilingual Voice Agent Demo

A sophisticated voice agent web application showcasing full-stack development skills with real-time voice interaction, conversation memory, and a customizable animated avatar.

✨ Features

🎤 Voice Interaction

Real-time speech-to-text using Web Speech API
Text-to-speech responses with natural voice
Bilingual support: English and German
Visual feedback during listening, processing, and speaking
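
As a rough illustration of the bilingual setup, the sketch below maps the two supported languages to speech locales and picks a matching text-to-speech voice. The names `SPEECH_LOCALES`, `VoiceLike`, and `pickVoice` are hypothetical, and the regional variants (`en-US`, `de-DE`) are assumptions rather than values taken from the project.

```typescript
type Language = "en" | "de";

// BCP 47 tags for recognition and synthesis; the regional variants are an assumption.
const SPEECH_LOCALES: Record<Language, string> = { en: "en-US", de: "de-DE" };

// Subset of the browser's SpeechSynthesisVoice that this sketch needs.
interface VoiceLike {
  name: string;
  lang: string;
}

// Prefer an exact locale match, then fall back to any voice in the same base language.
function pickVoice<T extends VoiceLike>(voices: T[], language: Language): T | undefined {
  const exact = voices.filter((v) => v.lang === SPEECH_LOCALES[language])[0];
  if (exact) return exact;
  // e.g. "de-AT" is acceptable when no "de-DE" voice is installed
  return voices.filter((v) => v.lang.toLowerCase().indexOf(language) === 0)[0];
}
```

In a browser, the list returned by `speechSynthesis.getVoices()` could be passed as `voices`, and the chosen locale assigned to the recognizer's `lang` property.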

🧠 Memory System (Star Feature)

Remembers user information across conversations
Extracts facts automatically: name, job, interests, location, preferences, goals
Context-aware responses that reference past information
Persistent memory across sessions with user authentication

🎨 Customizable Avatar

Animated 2D canvas-based avatar
Lip-sync animation while the agent is speaking (simulated mouth movement)
Emotion states: neutral, happy, thinking
Full customization: skin tone, hair style/color, eye color, glasses, accessories
Smooth animations and professional design

🔐 Authentication

Anonymous mode: try immediately without signup
Passwordless magic link authentication
Seamless migration from anonymous to authenticated
Data persistence across devices

💰 Cost Controls

Hard limits: 2000 conversations/month
Per-user limits: 20 conversations/day
Usage tracking in database
Friendly error messages when limits reached
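
A minimal sketch of how the two limits above could be enforced before a conversation starts. The `UsageSnapshot` shape, the `checkLimits` helper, and the message texts are hypothetical; only the numeric limits come from this README.

```typescript
// Hypothetical shape; the real usage-tracking columns are not shown in this README.
interface UsageSnapshot {
  userConversationsToday: number;      // conversations by this user today
  totalConversationsThisMonth: number; // conversations across all users this month
}

interface LimitResult {
  allowed: boolean;
  message?: string; // friendly text to show the user when blocked
}

const DAILY_USER_LIMIT = 20;     // per-user limit stated above
const MONTHLY_HARD_LIMIT = 2000; // global hard limit stated above

function checkLimits(usage: UsageSnapshot): LimitResult {
  // Check the global cap first: once it is hit, nobody can start a conversation.
  if (usage.totalConversationsThisMonth >= MONTHLY_HARD_LIMIT) {
    return {
      allowed: false,
      message: "The demo has reached its monthly limit. Please come back next month!",
    };
  }
  if (usage.userConversationsToday >= DAILY_USER_LIMIT) {
    return {
      allowed: false,
      message: "You have reached today's limit of 20 conversations. See you tomorrow!",
    };
  }
  return { allowed: true };
}
```

A request handler would call `checkLimits` with counts read from the database and return the `message` to the client when `allowed` is false.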

🛠️ Tech Stack

Frontend

React 18 with TypeScript
Vite for fast development
Tailwind CSS for styling
Zustand for state management
Web Speech API for voice recognition (built-in, no cost)
Canvas API for avatar animation

Backend

Cloudflare Workers (serverless, generous free tier)
REST API architecture

Services

Groq API with Llama 3.1 (fast, free tier)
Web Speech Synthesis API for TTS (built-in, no cost)
Supabase (PostgreSQL with free tier)

📁 Project Structure

kaiser-echo/
├── src/                          # Frontend source
│   ├── components/               # React components
│   │   ├── Avatar.tsx           # Animated avatar with lip-sync
│   │   ├── VoiceInterface.tsx   # Voice interaction controls
│   │   ├── ChatHistory.tsx      # Conversation display
│   │   ├── AvatarCustomizer.tsx # Avatar customization panel
│   │   ├── AuthModal.tsx        # Authentication modal
│   │   └── LanguageSelector.tsx # Language switcher
│   ├── hooks/                   # Custom React hooks
│   │   ├── useSpeechRecognition.ts
│   │   └── useSpeechSynthesis.ts
│   ├── store/                   # State management
│   │   └── useAppStore.ts       # Zustand store
│   ├── utils/                   # Utilities
│   │   └── api.ts               # API client
│   ├── types.ts                 # TypeScript types
│   ├── App.tsx                  # Main app component
│   ├── main.tsx                 # Entry point
│   └── index.css                # Global styles
├── api/                         # Backend API
│   ├── handlers/                # Request handlers
│   │   ├── chat.ts             # Chat with memory
│   │   ├── facts.ts            # Facts retrieval
│   │   ├── auth.ts             # Magic link auth
│   │   └── avatar.ts           # Avatar customization
│   ├── services/                # Business logic
│   │   ├── llm.ts              # LLM integration
│   │   └── factExtraction.ts  # Fact extraction
│   ├── utils/                   # Utilities
│   │   ├── cors.ts             # CORS handling
│   │   ├── supabase.ts         # Database client
│   │   └── limits.ts           # Usage tracking
│   ├── router.ts                # Request routing
│   └── index.ts                 # Worker entry
├── supabase-schema.sql          # Database schema
├── package.json
├── vite.config.ts
├── tailwind.config.js
├── wrangler.toml                # Cloudflare Workers config
└── README.md

🚀 Setup Instructions

Prerequisites

Node.js 18+ and npm/pnpm/yarn
A Cloudflare account (free)
A Supabase account (free)
A Groq API key (free)

1. Clone and Install

git clone <repository-url>
cd kaiser-echo
npm install

2. Set Up Supabase

Go to supabase.com and create a new project
Once created, go to SQL Editor
Copy the contents of supabase-schema.sql and run it
Get your project URL and service key from Settings → API

3. Get API Keys

Groq API (for LLM)

Go to console.groq.com
Sign up and get your API key
Free tier: 30 requests/minute

4. Configure Environment Variables

Create .env file:

cp .env.example .env

Fill in your keys:

# Groq API (note: Vite exposes every VITE_-prefixed variable to the browser bundle)
VITE_GROQ_API_KEY=your_groq_api_key_here

# Supabase
VITE_SUPABASE_URL=your_supabase_project_url
VITE_SUPABASE_ANON_KEY=your_supabase_anon_key

# API URL (use localhost for development)
VITE_API_URL=http://localhost:8787

5. Configure Cloudflare Workers

Create .dev.vars file for local development:

GROQ_API_KEY=your_groq_api_key
SUPABASE_URL=your_supabase_url
SUPABASE_SERVICE_KEY=your_supabase_service_key
ENVIRONMENT=development
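
A missing variable is easier to diagnose at startup than in the middle of a request. The sketch below checks the keys listed above; `missingEnvKeys` is a hypothetical helper, not part of the project.

```typescript
// Keys from the .dev.vars example above that the Worker cannot run without.
const REQUIRED_KEYS = ["GROQ_API_KEY", "SUPABASE_URL", "SUPABASE_SERVICE_KEY"] as const;

type EnvKey = (typeof REQUIRED_KEYS)[number];
type Env = Partial<Record<EnvKey | "ENVIRONMENT", string>>;

// Returns the required keys that are absent or blank, in declaration order.
function missingEnvKeys(env: Env): EnvKey[] {
  return REQUIRED_KEYS.filter((key) => (env[key] ?? "").trim() === "");
}
```

A Worker entry point could call this with its `env` argument and respond with a 500 that names the missing keys.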

6. Run Development Servers

Terminal 1 - Frontend:

npm run dev

Terminal 2 - Backend:

npm run worker:dev

Visit http://localhost:3000

📦 Deployment

Deploy Backend to Cloudflare Workers

Install Wrangler CLI:

npm install -g wrangler

Login to Cloudflare:

wrangler login

Set secrets:

wrangler secret put GROQ_API_KEY
wrangler secret put SUPABASE_URL
wrangler secret put SUPABASE_SERVICE_KEY

Deploy:

npm run worker:deploy

Note the deployed URL (e.g., https://kaiser-echo-api.your-subdomain.workers.dev)

Deploy Frontend to Vercel

Install Vercel CLI:

npm install -g vercel

Update .env with your Worker URL:

VITE_API_URL=https://kaiser-echo-api.your-subdomain.workers.dev

Deploy:

npm run build
vercel --prod

🎯 Usage

First Visit

Click the microphone button
Allow microphone access
Start speaking in English or German
The avatar responds with voice and animation

Using Memory

The system automatically extracts and remembers:

Your name when you introduce yourself
Your job when mentioned
Your interests and hobbies
Your location
Your preferences
Your goals

Example:

You: "Hi, I'm Thomas. I'm a software developer from Berlin."
Kaiser: "Nice to meet you, Thomas! How can I help you today?"
[Later in conversation]
Kaiser: "As a software developer in Berlin, you might be interested in..."

Customization

Click "Customize Avatar" to change appearance
Click language buttons to switch between English and German
All changes are saved automatically

Authentication

After 3 conversation exchanges, you'll see a "Save Progress" prompt
Enter your email to receive a magic link
Click the link to authenticate
Your conversations and avatar are now saved across devices

💡 Implementation Highlights

Memory System Architecture

Instead of a RAG pipeline with embeddings, the memory system uses a simpler approach:

Store all conversations in database
Extract facts using pattern matching every 6 messages
Inject facts into LLM system prompt
Retrieve recent history (last 10 messages)
Summarize old conversations for context
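
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the regular expressions cover only a few English phrasings, and the names `FACT_PATTERNS`, `extractFacts`, `shouldExtract`, and `buildMessages`, as well as the system prompt wording, are hypothetical.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

type Facts = Record<string, string>;

// Illustrative English-only patterns; the project's real patterns are not shown in this README.
const FACT_PATTERNS: Array<{ key: string; pattern: RegExp }> = [
  { key: "name", pattern: /\b(?:I'm|I am|my name is)\s+([A-Z][a-zA-Z]+)/ },
  { key: "job", pattern: /\bI(?:'m| am) an? ([a-z ]+?)(?= from\b| in\b|[.,!]|$)/ },
  { key: "location", pattern: /\b(?:from|live in|based in) ([A-Z][a-zA-Z]+)/ },
];

const EXTRACTION_INTERVAL = 6; // extract facts every 6 messages
const HISTORY_WINDOW = 10;     // send only the last 10 messages to the LLM

// Run every pattern against the text and keep the first capture of each.
function extractFacts(text: string): Facts {
  const facts: Facts = {};
  for (const { key, pattern } of FACT_PATTERNS) {
    const match = pattern.exec(text);
    if (match) facts[key] = match[1].trim();
  }
  return facts;
}

// True when the running message count reaches a multiple of the interval.
function shouldExtract(messageCount: number): boolean {
  return messageCount > 0 && messageCount % EXTRACTION_INTERVAL === 0;
}

// Inject known facts into the system prompt and append recent history.
function buildMessages(facts: Facts, history: ChatMessage[], userInput: string): ChatMessage[] {
  const factLines = Object.keys(facts).map((key) => `- ${key}: ${facts[key]}`);
  const base = "You are Kaiser, a friendly voice assistant."; // assumed prompt wording
  const system =
    factLines.length > 0
      ? `${base}\nKnown facts about the user:\n${factLines.join("\n")}`
      : base;
  return [
    { role: "system", content: system },
    ...history.slice(-HISTORY_WINDOW),
    { role: "user", content: userInput },
  ];
}
```

The resulting array follows the common system/user/assistant chat format, so it could be sent as the `messages` field of a chat completion request.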

Lip-Sync Algorithm

// While the agent is speaking, simulate talking by picking a random
// mouth openness between 0.2 and 1.0 on each update.
if (voiceState === 'speaking') {
  setMouthState({
    openness: Math.random() * 0.8 + 0.2
  })
}
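
Jumping to a fresh random value on every update can look jittery. One possible variation, shown here as a sketch rather than the project's implementation, eases the mouth toward a random target and closes it when speech stops. The `VoiceState` members, the `nextMouthOpenness` helper, and the `smoothing` parameter are assumptions.

```typescript
// Assumed set of states; the README mentions listening, processing, and speaking.
type VoiceState = "idle" | "listening" | "processing" | "speaking";

function nextMouthOpenness(
  current: number,
  voiceState: VoiceState,
  random: () => number = Math.random, // injectable for deterministic tests
  smoothing: number = 0.5,            // 0 = never move, 1 = jump straight to the target
): number {
  // Same target range as the snippet above while speaking; closed otherwise.
  const target = voiceState === "speaking" ? random() * 0.8 + 0.2 : 0;
  // Move part of the way toward the target to avoid frame-to-frame jumps.
  return current + (target - current) * smoothing;
}
```

In the component this could be used as `setMouthState({ openness: nextMouthOpenness(prev, voiceState) })` inside the animation loop, where `prev` is the previous openness value.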

Cost Optimization

Web Speech API for STT (free, built-in)
Web Speech Synthesis for TTS (free, built-in)
Groq for LLM (free tier: 30 req/min)
Cloudflare Workers (free tier: 100K req/day)
Supabase (free tier: 500MB database, 2GB bandwidth)

Total cost: $0/month for demo usage!

🔒 Security Considerations

Service key stored securely in Cloudflare Workers secrets
Row Level Security (RLS) enabled on Supabase tables
CORS properly configured
Magic links expire after 30 minutes
Rate limiting via usage tracking
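
The 30-minute expiry can be checked with a small pure function. This is a sketch only: the `MagicToken` shape is hypothetical (the real `magic_tokens` columns are not shown in this README), and treating tokens as single-use via a `used` flag is an assumption.

```typescript
const MAGIC_LINK_TTL_MS = 30 * 60 * 1000; // links expire after 30 minutes

// Hypothetical row shape for illustration.
interface MagicToken {
  token: string;
  createdAt: Date;
  used: boolean; // assumption: a token can be redeemed only once
}

function isTokenValid(row: MagicToken, now: Date = new Date()): boolean {
  const ageMs = now.getTime() - row.createdAt.getTime();
  // Reject used tokens, tokens dated in the future, and tokens at or past the TTL.
  return !row.used && ageMs >= 0 && ageMs < MAGIC_LINK_TTL_MS;
}
```

Passing `now` explicitly keeps the check deterministic in tests; in a handler the default of the current time applies.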

🧪 Testing

The app works best in:

Chrome, Edge (full Web Speech API support)
Safari (limited but functional)
Not recommended: Firefox (speech synthesis works, but speech recognition is not available by default)
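
Because support varies by browser, a feature check before enabling the microphone button can avoid confusing failures. The sketch below takes the globals as a parameter so it also runs outside a browser; `SpeechGlobals`, `SpeechSupport`, and `detectSpeechSupport` are hypothetical names.

```typescript
// Minimal view of the browser globals this check needs.
interface SpeechGlobals {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown; // prefixed constructor exposed by Chrome and Safari
  speechSynthesis?: unknown;
}

interface SpeechSupport {
  recognition: boolean; // speech-to-text available
  synthesis: boolean;   // text-to-speech available
}

function detectSpeechSupport(globals: SpeechGlobals): SpeechSupport {
  return {
    recognition: Boolean(globals.SpeechRecognition || globals.webkitSpeechRecognition),
    synthesis: Boolean(globals.speechSynthesis),
  };
}
```

In the app, `window` would be passed as the argument, and the UI could fall back to text input when `recognition` is false.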

📊 Database Schema

users               # Authenticated users
sessions            # Anonymous and authenticated sessions
messages            # Conversation history
user_facts          # Extracted knowledge
daily_usage         # Usage tracking
magic_tokens        # Authentication tokens

🎨 Design Decisions

Canvas over SVG: Better performance for animations
Zustand over Redux: Simpler state management
Cloudflare Workers over Express: Serverless, cheaper
Pattern matching over LLM: Faster, cheaper fact extraction
Magic links over passwords: Better UX, more secure

🚧 Future Enhancements

LLM-based fact extraction for better accuracy
Google Cloud TTS for higher quality voices
More avatar styles and animations
Conversation export
Multi-turn context with summarization
Voice activity detection
Mobile app version

📝 License

This is a portfolio demo project. Feel free to use it as inspiration!

👨‍💻 Developer

Built to showcase full-stack development skills including:

Modern React patterns and hooks
Canvas animation and audio processing
Real-time voice interaction
RESTful API design
Serverless architecture
Database design and optimization
Authentication flows
Cost-conscious architecture

Note: This is a demo application designed to showcase development skills. For production use, add proper error handling, monitoring, analytics, and scalability considerations.
