One command for all your AI CLI tools. No API keys required. Also usable as a Python library.
```bash
ai sonnet "explain this code"
ai gemini "write a haiku"
ai llama-3.3 "review my PR"
```

Uses your existing CLI subscriptions - not API tokens. If you have Claude Code, Gemini CLI, or Codex installed, this tool dispatches to them directly. No per-token costs, no API key management.
Instead of remembering different CLI tools and flags:
```bash
claude --print --model sonnet ...
gemini --model gemini-2.5-flash ...
codex exec --model gpt-5.2-codex ...
```

Just use:

```bash
ai <model> "prompt"
```

- No API keys - uses your existing CLI tools (Claude, Gemini, Codex, Qwen, Ollama)
- GLM API - Zhipu AI GLM models via Coding Plan API
- One command for all providers
- Multi-model parallel - `ai opus pro gpt "prompt"` runs all 3 simultaneously
- Smart aliases - `llama-3.3` instead of `meta-llama/llama-3.3-70b-instruct:free`
- Auto-discovery - `ai init` detects installed tools and fetches available models
- Run mode - `ai run` generates a command, shows it, and executes on confirm
- YOLO mode - `ai yolo` auto-approves file edits across all providers
- File context - `ai -F file.txt` injects file contents into prompts
- Free OpenRouter models - 31 free cloud models when you need them
- Stdin support - `cat code.py | ai sonnet "review this"`
Installation:

```bash
# Make executable
chmod +x ai.py

# Create symlink to make 'ai' available globally
ln -sf "/path/to/ai-cli/ai.py" /usr/local/bin/ai

# Or run as a Python module (no symlink needed)
python -m ai_cli <alias> "prompt"
```
```bash
# Initialize (detect installed tools + fetch models)
ai init
```

For OpenRouter free models (optional):

```bash
echo "OPENROUTER_API_KEY=your-key-here" > .env
ai init  # re-run to fetch free models
```

For GLM (Zhipu AI) Coding Plan (optional):

```bash
echo "ZHIPU_API_KEY=your-key-here" >> .env
ai init  # re-run to discover GLM models
```

```bash
# Basic usage
ai sonnet "Explain recursion"
ai gemini "Write a haiku"
ai llama-3.3 "Code review this"
# Set a default model
ai default sonnet
# Use default model (no alias needed)
ai "Explain recursion"
# JSON output
ai json sonnet "Return structured data"
# Get terminal command (print only)
ai cmd "list all docker containers"
# Generate, confirm, and execute command
ai run "list all docker containers"
ai sonnet run "stop nginx" # model first also works
ai run -y "stop nginx" # skip confirmation
# YOLO mode (auto-approve file edits)
ai yolo sonnet "refactor main.py"
# Stdin for large prompts
cat file.txt | ai haiku
# List available models
ai list
# Re-initialize to update models
ai init
# Multi-model (parallel execution)
ai opus pro gpt "review this code"
ai sonnet haiku mimo "explain recursion"
# File context (inject file contents into prompt)
ai sonnet -F main.py "explain this code"
ai glm -F main.py,config.json "review these files"
ai opus -F src/ "summarize this directory"
```

Query multiple models simultaneously and compare responses:
```bash
ai opus pro gpt "what are the pros and cons of microservices?"
```

Output:

```
━━━ opus (18.2s) ━━━
[opus response]
━━━ pro (12.1s) ━━━
[pro response]
━━━ gpt (15.3s) ━━━
[gpt response]
```

- Runs all models in parallel (total time ≈ slowest model, not sum; see the sketch below)
- Shows timing for each response
- Errors in one model don't affect others
- Works with stdin: `echo "prompt" | ai opus pro gpt`
Tab completion for commands and model aliases:
```bash
# Zsh
echo 'eval "$(ai completions zsh)"' >> ~/.zshrc

# Bash
echo 'eval "$(ai completions bash)"' >> ~/.bashrc

# Fish
echo 'ai completions fish | source' >> ~/.config/fish/config.fish
```

Then `ai son<TAB>` completes to `ai sonnet`.
If prompts containing `?`, `*`, or other glob characters fail, add this to `~/.zshrc`:

```bash
alias ai='noglob ai'
```

This lets you type `ai sonnet "what's up?"` without zsh's glob expansion getting in the way.
Use ai-cli as a Python library in your scripts:
```python
from ai_cli import AIClient

client = AIClient()
response = client.call("sonnet", "Explain Python's GIL")

# With options
response = client.call("opus", "List 3 colors", json_mode=True)

# Multi-model parallel
results = client.call_multi(["opus", "pro", "gpt"], "Explain X")
for alias, response in results.items():
    print(f"{alias}: {response}")

# List available models
for alias, (provider, model) in client.list_models().items():
    print(f"{alias} -> {provider}:{model}")
```

Start a local HTTP server for cross-language access:
```bash
ai serve                        # Auto-generates auth token, prints it
ai serve 3000                   # Custom port
ai serve --token mytoken        # Use specific token
AI_CLI_SERVER_TOKEN=x ai serve  # Token from env var
ai serve --no-auth              # Disable auth (not recommended)
```

The server requires Bearer token authentication. Token priority: `--token` flag > `AI_CLI_SERVER_TOKEN` env var > auto-generated.
From JavaScript:
```js
const TOKEN = 'your-token-here'; // printed when server starts

const response = await fetch('http://localhost:8765/call', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${TOKEN}`
  },
  body: JSON.stringify({ alias: 'sonnet', prompt: 'Hello!' })
});
const { result } = await response.json();
```

Endpoints:

- `GET /health` - Health check (no auth required)
- `GET /models` - List available models
- `GET /providers` - List providers
- `POST /call` - Execute prompt (`{alias, prompt, json_mode?, yolo?}`)
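
The same `POST /call` request from Python's standard library, for a quick check from another script (assumes the default port 8765 and the token printed at startup; the `result` field is the one shown in the JavaScript example):

```python
# Illustrative client for POST /call with Bearer auth.
import json
from urllib.request import Request, urlopen

TOKEN = "your-token-here"  # printed when the server starts
req = Request(
    "http://localhost:8765/call",
    data=json.dumps({"alias": "sonnet", "prompt": "Hello!"}).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {TOKEN}",
    },
)
with urlopen(req) as resp:
    print(json.load(resp)["result"])
```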
| Provider | Aliases | Models |
|---|---|---|
| Claude (Anthropic) | haiku, sonnet, opus | haiku, sonnet, opus |
| Codex (OpenAI) | gpt, codex-max, codex-mini | gpt-5.2, gpt-5.1-codex-max, gpt-5.1-codex-mini |
| Gemini (Google) | pro, flash, pro-2.5, flash-2.5 | gemini-3-pro-preview, gemini-3-flash-preview, gemini-2.5-* |
| Qwen (Alibaba) | qwen, qwen-vision | coder-model, vision-model |
| Ollama (local) | Auto-generated | Detected via `ollama list` |
| Provider | Aliases | Models |
|---|---|---|
| GLM (Zhipu AI) | glm, glm4, glm-air | glm-4.7, glm-4.6, glm-4.5-air, glm-4.5-flash |

Requires `ZHIPU_API_KEY` in `.env` or environment.
31 free models are fetched from the OpenRouter API, with smart aliases auto-generated:

```bash
ai trinity      # arcee-ai/trinity-mini:free
ai llama-3.3    # meta-llama/llama-3.3-70b-instruct:free
ai mistral      # mistralai/mistral-7b-instruct:free
ai deepseek-r1  # deepseek/deepseek-r1-0528:free
```

Note: OpenRouter's `glm` alias is overridden by the native GLM provider. Use `openrouter:z-ai/glm-4.5-air:free` to access OpenRouter's GLM models.
```
$ ai list
Available models:

claude: (3)
  haiku
  sonnet
  opus

codex: (4)
  gpt -> gpt-5.2
  codex-max -> gpt-5.1-codex-max
  codex-mini -> gpt-5.1-codex-mini
  gpt-5.2

gemini: (5)
  pro -> gemini-3-pro-preview
  flash -> gemini-3-flash-preview
  ...

glm: (4)
  glm -> glm-4.7
  glm4 -> glm-4.7
  glm-air -> glm-4.5-air
  glm-4.6

openrouter: (31)
  trinity -> arcee-ai/trinity-mini:free
  llama-3.3 -> meta-llama/llama-3.3-70b-instruct:free
  ...

Installed CLI tools: codex, claude, gemini, qwen, ollama
Default model: sonnet
```
- `ai init` detects installed CLI tools and fetches available models
- Auto-generates short aliases from model names
- On each call, resolves alias → (provider, model)
- Dispatches to the appropriate provider (CLI subprocess or HTTP API)
- Returns the output (see the sketch below)
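
A simplified sketch of that flow, with illustrative names and a truncated alias table rather than the actual `ai_cli` internals (the `claude` invocation mirrors the flags shown at the top of this README):

```python
# Illustrative sketch: resolve alias -> (provider, model), then dispatch.
import subprocess

# A slice of the alias table that `ai init` builds.
MODELS = {
    "sonnet": ("claude", "sonnet"),
    "pro": ("gemini", "gemini-3-pro-preview"),
    "llama-3.3": ("openrouter", "meta-llama/llama-3.3-70b-instruct:free"),
}

def call(alias: str, prompt: str) -> str:
    provider, model = MODELS[alias]  # resolve alias -> (provider, model)
    if provider == "claude":         # CLI providers are run as subprocesses
        proc = subprocess.run(
            ["claude", "--print", "--model", model],
            input=prompt, capture_output=True, text=True,
        )
        return proc.stdout
    raise NotImplementedError(f"{provider} dispatches via its own CLI or HTTP API")
```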
```
ai_cli/
├── client.py          # AIClient - library interface
├── providers/         # Provider implementations
│   ├── claude.py      # Claude CLI
│   ├── codex.py       # Codex CLI
│   ├── gemini.py      # Gemini CLI
│   ├── glm.py         # GLM API (Zhipu AI)
│   ├── qwen.py        # Qwen CLI
│   ├── ollama.py      # Ollama CLI
│   └── openrouter.py  # OpenRouter HTTP
├── cli.py             # CLI entry point
└── server.py          # HTTP server
```
MIT