Nala

A voice-first AI assistant that runs entirely in your browser. Talk to Nala, and she talks back — no cloud APIs, no API keys, no data leaving your device.

Built with React, Express, PostgreSQL, WebLLM (browser-local LLM), and the Web Speech API.

Features

Voice in, voice out - Speak naturally, hear Nala respond aloud
Fully local AI - LLM runs in your browser via WebGPU (no external API calls)
Conversation memory - All chats persist in PostgreSQL across sessions
Animated avatar - Lion character with lip-synced mouth animation while speaking
Auto-continue - Conversation flows naturally; Nala listens again after responding
Say "stop" - End a conversation hands-free with a voice command
Mobile responsive - Works on desktop and mobile with adaptive layout
Guided onboarding - Starter prompts help new users get started immediately

Quick Start

The entire app runs with a single Docker command. You need Docker installed.

git clone https://github.com/jim-schwoebel/nala-cal-poly-demo.git
cd nala-cal-poly-demo
docker compose up --build

Open http://localhost:5173 in Chrome (113+ required for WebGPU).

On first load, the AI model downloads (~1-4 GB). It's cached in your browser after that.

Architecture

Browser (React + Vite + TypeScript)
  ├── WebLLM         - Local LLM inference via WebGPU
  ├── Web Speech API  - Speech-to-text (mic input)
  ├── Kokoro TTS *    - Neural text-to-speech (optional)
  └── REST client     - Talks to Express API
        │
Express.js API Server (persistence only)
        │
PostgreSQL - Conversations + messages

* Kokoro TTS is opt-in via VITE_TTS=kokoro. Default uses the browser's built-in speech synthesis for lower latency.

Data Flow

You tap the mic and speak
Web Speech API converts your voice to text
Your message is saved to PostgreSQL
WebLLM generates a response locally in the browser
The response is saved to PostgreSQL
Nala speaks the response aloud (with lip-synced avatar)
The mic automatically re-activates for your next turn

Project Structure

nala/
├── client/               # React + Vite frontend
│   └── src/
│       ├── components/   # UI components (avatar, waveform, chat, etc.)
│       ├── hooks/        # Custom hooks (voice, LLM, conversations)
│       └── services/     # API client
├── server/               # Express.js backend
│   └── src/
│       ├── routes/       # REST API endpoints
│       └── db/           # PostgreSQL connection, queries, migrations
├── shared/               # Shared TypeScript types
├── docker-compose.yml    # Run everything with one command
└── CLAUDE.md             # AI coding assistant context

Development

With Docker (recommended)

docker compose up --build

Source files are volume-mounted — edit code and see changes instantly via hot reload.

Service	URL	Purpose
Client	http://localhost:5173	React app (Vite dev server)
Server	http://localhost:3001	Express API
Database	localhost:5432	PostgreSQL

Without Docker

Prerequisites: Node.js 20+, PostgreSQL 14+

# Install dependencies
npm install

# Set up database
createdb nala
psql nala -f server/src/db/migrations/001-initial-schema.sql

# Start server (terminal 1)
cd server && npm run dev

# Start client (terminal 2)
cd client && npm run dev

Running Tests

# Client tests (all 38 pass)
npm run test:client

# Server tests (requires PostgreSQL)
createdb nala_test
cd server && DATABASE_URL=postgresql://localhost:5432/nala_test npx vitest run

Configuration

Environment Variable	Default	Description
`VITE_TTS`	(unset)	Set to `kokoro` for neural TTS voice (higher quality, slower first load)
`DATABASE_URL`	`postgresql://localhost:5432/nala`	PostgreSQL connection string
`PORT`	`3001`	Express server port
`API_URL`	`http://localhost:3001`	Backend URL for Vite proxy (Docker sets this automatically)

Browser Requirements

Chrome 113+ (required for WebGPU)
Microphone access (prompted on first use)
~2-4 GB free memory for the LLM model

Firefox and Safari do not yet support WebGPU and will not work.

Tech Stack

Layer	Technology	Purpose
Frontend	React 18, Vite 5, TypeScript	UI framework
AI	WebLLM (Llama 3.1 8B)	Browser-local LLM
Speech Input	Web Speech API	Voice-to-text
Speech Output	Web Speech API / Kokoro TTS	Text-to-voice
Visualization	Canvas API	Animated waveform orb
Avatar	SVG + requestAnimationFrame	Lion with lip sync
Backend	Express.js, TypeScript	REST API (persistence only)
Database	PostgreSQL 16	Conversation storage
DevOps	Docker Compose	One-command setup

Tutorial: Building Nala from Scratch with Claude Code

This project was built live in a single Claude Code session (~2 hours) as a tutorial for Cal Poly Pomona's hackathon. Below is the step-by-step process so you can replicate it for your own projects.

Prerequisites

Install Claude Code (npm install -g @anthropic-ai/claude-code)
Install Docker
A GitHub account with the GitHub CLI (gh) installed

Phase 1: Setup (~2 min)

Create a project directory, initialize git, and launch Claude Code:

mkdir nala && cd nala
git init
claude

Install the superpowers plugin inside Claude Code — it dramatically improves planning and spec quality:

/plugin superpowers
/reload-plugins

Phase 2: Write a CLAUDE.md File (~3 min)

CLAUDE.md is like long-term memory for Claude. It tells Claude what you're building, your tech stack, coding conventions, and file structure. Give Claude a prompt like:

"Create a CLAUDE.md file for this project. We're building Nala — a voice assistant web app. Stack: React + Vite + TypeScript frontend, Express.js backend, PostgreSQL for conversation memory, Web Speech API for browser-native voice input/output."

Structure it with WHAT (overview), WHY (design decisions), and HOW (coding standards). This keeps Claude focused across long sessions and prevents it from going off-track.

Phase 3: Design the Architecture (~15 min)

This is the most important step. Use the superpowers design skill:

"Use superpowers to design: Design the complete architecture for Nala. I need PostgreSQL schema, API endpoints, React component tree, and data flow."

Superpowers treats you like a colleague — it asks clarifying questions one at a time:

"Should you have a users table with no auth, or full authentication?"
"How many messages should be stored for context?"
"When should the waveform display — recording only, or also while thinking?"
"Monorepo or separate repos?"

Answer each question. The back-and-forth takes ~10 minutes but produces a comprehensive design spec saved to docs/superpowers/specs/. This spec is the foundation for everything that follows.

Key insight: If you skip this step and go straight to coding, you'll take side roads. As Jim put it in the session: "You can run a marathon, but if you don't know the direction you're headed, you might land in a forest and not the ocean."

Phase 4: Generate the Implementation Plan (~10 min)

Once the spec is approved, say:

"Approved, please proceed toward implementation."

Claude writes a detailed implementation plan with 18 bite-sized tasks — each with exact file paths, code, test commands, and commit messages. The plan is saved to docs/superpowers/plans/ and goes through a review loop before execution.

Phase 5: Execute the Plan (~30 min)

Choose subagent-driven development when prompted. Claude dispatches a fresh subagent for each task, reviews between tasks, and commits after each one:

Project scaffolding (monorepo, workspaces, configs)
Database schema + migrations + query functions
Express API routes (conversations, messages)
Client API service
React hooks (conversations, messages, WebLLM, voice I/O, audio analyser)
UI components (chat history, sidebar, waveform, mic button, model loader)
App shell wiring everything together

Each task follows TDD: write the test, verify it fails, implement, verify it passes, commit. You can let this run and go make coffee — it will ask if it hits a blocker.

Phase 6: Containerize (~5 min)

Instead of installing PostgreSQL locally, containerize everything:

"Can you just containerize this application and put Postgres as a Docker container and have the whole app be a container?"

Claude writes a docker-compose.yml with three services (db, server, client), volume mounts for hot reload, and auto-migration. One command to run: docker compose up --build.

Phase 7: Iterate on UX (~30 min)

This is where the magic happens. Use the app, notice what feels wrong, and tell Claude in natural language:

"The sessions should auto-record after the AI responds so I don't have to keep clicking. When I say 'stop', it should stop the conversation. The visualization isn't engaging enough. I had trouble knowing what to say initially."

Claude implements all four changes in one shot. Keep iterating:

"Do a UX audit and suggest 3-5 improvements" — Claude reads all your code, identifies 40+ issues, and recommends the highest-impact fixes
"Make it look like Stripe" — redesigns the sidebar with Stripe's color palette, typography, and spacing
"Make Nala an avatar like from the Lion King with lip sync" — creates an SVG lion with animated mouth driven by requestAnimationFrame
"The text-to-speech is too robotic" — swaps Web Speech API for Kokoro TTS (82M neural model running in-browser)

Phase 8: Work in Parallel

While the code is building, open another terminal and have Claude work on other things simultaneously:

# Terminal 1: Building the app (takes 30+ min)
claude  # executing the implementation plan

# Terminal 2: Documentation (takes 5 min)
claude  # "Create marketing docs, user personas, competitive analysis"

# Terminal 3: Security (takes 5 min)
claude  # "Write a security audit with OWASP ASVS assessment and threat model"

Documentation and audits complete much faster than code and don't interfere with each other.

Phase 9: Ship (~5 min)

Write a README, add a license, and push:

"Document everything with a README on how to get this setup. Assume Apache 2.0 license. Encourage public contributions."

Push to GitHub:

"Commit and push to GitHub."

Tips from the Session

Commit early and often — remind Claude to commit. You can always revert if something breaks
Plan before you code — a 15-minute spec saves hours of side roads
Use superpowers for everything — design, implementation, UX audits, docs
Paste errors directly — when something breaks, paste the error into Claude and it will debug
Think in swim lanes — code in one terminal, docs in another, security in a third
Feature flags for experiments — keep both options (e.g., VITE_TTS=kokoro) instead of replacing
Security matters — run an automated OWASP audit even for hackathon projects
Ship your code — version control everything in GitHub so you never lose work

Session Stats

Metric	Value
Total time	~2 hours
Commits	35+
Lines of code	~5,000+
Test files	13 (38 tests passing)
Docker services	3 (client, server, db)
Parallel Claude sessions	3

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

Quick version:

Fork the repo
Make your changes
Run tests: npm run test:client
Open a PR

Check open issues for ideas on what to work on.

License

Apache 2.0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Nala

Features

Quick Start

Architecture

Data Flow

Project Structure

Development

With Docker (recommended)

Without Docker

Running Tests

Configuration

Browser Requirements

Tech Stack

Tutorial: Building Nala from Scratch with Claude Code

Prerequisites

Phase 1: Setup (~2 min)

Phase 2: Write a CLAUDE.md File (~3 min)

Phase 3: Design the Architecture (~15 min)

Phase 4: Generate the Implementation Plan (~10 min)

Phase 5: Execute the Plan (~30 min)

Phase 6: Containerize (~5 min)

Phase 7: Iterate on UX (~30 min)

Phase 8: Work in Parallel

Phase 9: Ship (~5 min)

Tips from the Session

Session Stats

Contributing

License

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

Nala

Features

Quick Start

Architecture

Data Flow

Project Structure

Development

With Docker (recommended)

Without Docker

Running Tests

Configuration

Browser Requirements

Tech Stack

Tutorial: Building Nala from Scratch with Claude Code

Prerequisites

Phase 1: Setup (~2 min)

Phase 2: Write a CLAUDE.md File (~3 min)

Phase 3: Design the Architecture (~15 min)

Phase 4: Generate the Implementation Plan (~10 min)

Phase 5: Execute the Plan (~30 min)

Phase 6: Containerize (~5 min)

Phase 7: Iterate on UX (~30 min)

Phase 8: Work in Parallel

Phase 9: Ship (~5 min)

Tips from the Session

Session Stats

Contributing

License