A voice-first AI assistant that runs entirely in your browser. Talk to Nala, and she talks back — no cloud APIs, no API keys, no data leaving your device.
Built with React, Express, PostgreSQL, WebLLM (browser-local LLM), and the Web Speech API.
- Voice in, voice out - Speak naturally, hear Nala respond aloud
- Fully local AI - LLM runs in your browser via WebGPU (no external API calls)
- Conversation memory - All chats persist in PostgreSQL across sessions
- Animated avatar - Lion character with lip-synced mouth animation while speaking
- Auto-continue - Conversation flows naturally; Nala listens again after responding
- Say "stop" - End a conversation hands-free with a voice command
- Mobile responsive - Works on desktop and mobile with adaptive layout
- Guided onboarding - Starter prompts help new users get started immediately
The entire app runs with a single Docker command. You need Docker installed.
git clone https://github.com/jim-schwoebel/nala-cal-poly-demo.git
cd nala-cal-poly-demo
docker compose up --buildOpen http://localhost:5173 in Chrome (113+ required for WebGPU).
On first load, the AI model downloads (~1-4 GB). It's cached in your browser after that.
Browser (React + Vite + TypeScript)
├── WebLLM - Local LLM inference via WebGPU
├── Web Speech API - Speech-to-text (mic input)
├── Kokoro TTS * - Neural text-to-speech (optional)
└── REST client - Talks to Express API
│
Express.js API Server (persistence only)
│
PostgreSQL - Conversations + messages
* Kokoro TTS is opt-in via VITE_TTS=kokoro. Default uses the browser's built-in speech synthesis for lower latency.
- You tap the mic and speak
- Web Speech API converts your voice to text
- Your message is saved to PostgreSQL
- WebLLM generates a response locally in the browser
- The response is saved to PostgreSQL
- Nala speaks the response aloud (with lip-synced avatar)
- The mic automatically re-activates for your next turn
nala/
├── client/ # React + Vite frontend
│ └── src/
│ ├── components/ # UI components (avatar, waveform, chat, etc.)
│ ├── hooks/ # Custom hooks (voice, LLM, conversations)
│ └── services/ # API client
├── server/ # Express.js backend
│ └── src/
│ ├── routes/ # REST API endpoints
│ └── db/ # PostgreSQL connection, queries, migrations
├── shared/ # Shared TypeScript types
├── docker-compose.yml # Run everything with one command
└── CLAUDE.md # AI coding assistant context
docker compose up --buildSource files are volume-mounted — edit code and see changes instantly via hot reload.
| Service | URL | Purpose |
|---|---|---|
| Client | http://localhost:5173 | React app (Vite dev server) |
| Server | http://localhost:3001 | Express API |
| Database | localhost:5432 | PostgreSQL |
Prerequisites: Node.js 20+, PostgreSQL 14+
# Install dependencies
npm install
# Set up database
createdb nala
psql nala -f server/src/db/migrations/001-initial-schema.sql
# Start server (terminal 1)
cd server && npm run dev
# Start client (terminal 2)
cd client && npm run dev# Client tests (all 38 pass)
npm run test:client
# Server tests (requires PostgreSQL)
createdb nala_test
cd server && DATABASE_URL=postgresql://localhost:5432/nala_test npx vitest run| Environment Variable | Default | Description |
|---|---|---|
VITE_TTS |
(unset) | Set to kokoro for neural TTS voice (higher quality, slower first load) |
DATABASE_URL |
postgresql://localhost:5432/nala |
PostgreSQL connection string |
PORT |
3001 |
Express server port |
API_URL |
http://localhost:3001 |
Backend URL for Vite proxy (Docker sets this automatically) |
- Chrome 113+ (required for WebGPU)
- Microphone access (prompted on first use)
- ~2-4 GB free memory for the LLM model
Firefox and Safari do not yet support WebGPU and will not work.
| Layer | Technology | Purpose |
|---|---|---|
| Frontend | React 18, Vite 5, TypeScript | UI framework |
| AI | WebLLM (Llama 3.1 8B) | Browser-local LLM |
| Speech Input | Web Speech API | Voice-to-text |
| Speech Output | Web Speech API / Kokoro TTS | Text-to-voice |
| Visualization | Canvas API | Animated waveform orb |
| Avatar | SVG + requestAnimationFrame | Lion with lip sync |
| Backend | Express.js, TypeScript | REST API (persistence only) |
| Database | PostgreSQL 16 | Conversation storage |
| DevOps | Docker Compose | One-command setup |
This project was built live in a single Claude Code session (~2 hours) as a tutorial for Cal Poly Pomona's hackathon. Below is the step-by-step process so you can replicate it for your own projects.
- Install Claude Code (
npm install -g @anthropic-ai/claude-code) - Install Docker
- A GitHub account with the GitHub CLI (
gh) installed
Create a project directory, initialize git, and launch Claude Code:
mkdir nala && cd nala
git init
claudeInstall the superpowers plugin inside Claude Code — it dramatically improves planning and spec quality:
/plugin superpowers
/reload-plugins
CLAUDE.md is like long-term memory for Claude. It tells Claude what you're building, your tech stack, coding conventions, and file structure. Give Claude a prompt like:
"Create a CLAUDE.md file for this project. We're building Nala — a voice assistant web app. Stack: React + Vite + TypeScript frontend, Express.js backend, PostgreSQL for conversation memory, Web Speech API for browser-native voice input/output."
Structure it with WHAT (overview), WHY (design decisions), and HOW (coding standards). This keeps Claude focused across long sessions and prevents it from going off-track.
This is the most important step. Use the superpowers design skill:
"Use superpowers to design: Design the complete architecture for Nala. I need PostgreSQL schema, API endpoints, React component tree, and data flow."
Superpowers treats you like a colleague — it asks clarifying questions one at a time:
- "Should you have a users table with no auth, or full authentication?"
- "How many messages should be stored for context?"
- "When should the waveform display — recording only, or also while thinking?"
- "Monorepo or separate repos?"
Answer each question. The back-and-forth takes ~10 minutes but produces a comprehensive design spec saved to docs/superpowers/specs/. This spec is the foundation for everything that follows.
Key insight: If you skip this step and go straight to coding, you'll take side roads. As Jim put it in the session: "You can run a marathon, but if you don't know the direction you're headed, you might land in a forest and not the ocean."
Once the spec is approved, say:
"Approved, please proceed toward implementation."
Claude writes a detailed implementation plan with 18 bite-sized tasks — each with exact file paths, code, test commands, and commit messages. The plan is saved to docs/superpowers/plans/ and goes through a review loop before execution.
Choose subagent-driven development when prompted. Claude dispatches a fresh subagent for each task, reviews between tasks, and commits after each one:
- Project scaffolding (monorepo, workspaces, configs)
- Database schema + migrations + query functions
- Express API routes (conversations, messages)
- Client API service
- React hooks (conversations, messages, WebLLM, voice I/O, audio analyser)
- UI components (chat history, sidebar, waveform, mic button, model loader)
- App shell wiring everything together
Each task follows TDD: write the test, verify it fails, implement, verify it passes, commit. You can let this run and go make coffee — it will ask if it hits a blocker.
Instead of installing PostgreSQL locally, containerize everything:
"Can you just containerize this application and put Postgres as a Docker container and have the whole app be a container?"
Claude writes a docker-compose.yml with three services (db, server, client), volume mounts for hot reload, and auto-migration. One command to run: docker compose up --build.
This is where the magic happens. Use the app, notice what feels wrong, and tell Claude in natural language:
"The sessions should auto-record after the AI responds so I don't have to keep clicking. When I say 'stop', it should stop the conversation. The visualization isn't engaging enough. I had trouble knowing what to say initially."
Claude implements all four changes in one shot. Keep iterating:
- "Do a UX audit and suggest 3-5 improvements" — Claude reads all your code, identifies 40+ issues, and recommends the highest-impact fixes
- "Make it look like Stripe" — redesigns the sidebar with Stripe's color palette, typography, and spacing
- "Make Nala an avatar like from the Lion King with lip sync" — creates an SVG lion with animated mouth driven by
requestAnimationFrame - "The text-to-speech is too robotic" — swaps Web Speech API for Kokoro TTS (82M neural model running in-browser)
While the code is building, open another terminal and have Claude work on other things simultaneously:
# Terminal 1: Building the app (takes 30+ min)
claude # executing the implementation plan
# Terminal 2: Documentation (takes 5 min)
claude # "Create marketing docs, user personas, competitive analysis"
# Terminal 3: Security (takes 5 min)
claude # "Write a security audit with OWASP ASVS assessment and threat model"Documentation and audits complete much faster than code and don't interfere with each other.
Write a README, add a license, and push:
"Document everything with a README on how to get this setup. Assume Apache 2.0 license. Encourage public contributions."
Push to GitHub:
"Commit and push to GitHub."
- Commit early and often — remind Claude to commit. You can always revert if something breaks
- Plan before you code — a 15-minute spec saves hours of side roads
- Use superpowers for everything — design, implementation, UX audits, docs
- Paste errors directly — when something breaks, paste the error into Claude and it will debug
- Think in swim lanes — code in one terminal, docs in another, security in a third
- Feature flags for experiments — keep both options (e.g.,
VITE_TTS=kokoro) instead of replacing - Security matters — run an automated OWASP audit even for hackathon projects
- Ship your code — version control everything in GitHub so you never lose work
| Metric | Value |
|---|---|
| Total time | ~2 hours |
| Commits | 35+ |
| Lines of code | ~5,000+ |
| Test files | 13 (38 tests passing) |
| Docker services | 3 (client, server, db) |
| Parallel Claude sessions | 3 |
We welcome contributions! See CONTRIBUTING.md for guidelines.
Quick version:
- Fork the repo
- Make your changes
- Run tests:
npm run test:client - Open a PR
Check open issues for ideas on what to work on.
