12 changes: 8 additions & 4 deletions AGENTS.md
@@ -1,8 +1,12 @@
## graphify

This project has a graphify knowledge graph at graphify-out/.
This project has a knowledge graph at graphify-out/ with god nodes, community structure, and cross-file relationships.

When the user types `/graphify`, invoke the `skill` tool with `skill: "graphify"` before doing anything else.

Rules:
- Before answering architecture or codebase questions, read graphify-out/GRAPH_REPORT.md for god nodes and community structure
- If graphify-out/wiki/index.md exists, navigate it instead of reading raw files
- After modifying code files in this session, run `graphify update .` to keep the graph current (AST-only, no API cost)
- For codebase questions, first run `graphify query "<question>"` when graphify-out/graph.json exists. Use `graphify path "<A>" "<B>"` for relationships and `graphify explain "<concept>"` for focused concepts. These return a scoped subgraph, usually much smaller than GRAPH_REPORT.md or raw grep output.
- Dirty graphify-out/ files are expected after hooks or incremental updates; dirty graph files are not a reason to skip graphify. Only skip graphify if the task is about stale or incorrect graph output, or the user explicitly says not to use it.
- If graphify-out/wiki/index.md exists, use it for broad navigation instead of raw source browsing.
- Read graphify-out/GRAPH_REPORT.md only for broad architecture review or when query/path/explain do not surface enough context.
- After modifying code, run `graphify update .` to keep the graph current (AST-only, no API cost).
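
The lookup order in the updated rules can be sketched as a small planning function. This is only an illustration of one reading of the rules: the helper name `plan_graph_lookup`, its `present` argument (the set of files found under `graphify-out/`), and the `broad` flag are hypothetical and not part of graphify.

```python
from __future__ import annotations


def plan_graph_lookup(present: set[str], question: str, broad: bool = False) -> list[str]:
    """Return the steps to try, in the order the rules above list them.

    `present` holds paths relative to graphify-out/ (hypothetical input).
    """
    steps: list[str] = []
    if "graph.json" in present:
        # Scoped query first: it returns a subgraph, usually much smaller
        # than GRAPH_REPORT.md or raw grep output.
        steps.append(f'graphify query "{question}"')
    if broad and "wiki/index.md" in present:
        # The wiki index is meant for broad navigation.
        steps.append("read graphify-out/wiki/index.md")
    if broad or not steps:
        # Assumption: fall back to the full report for broad reviews,
        # or when no scoped query is possible.
        steps.append("read graphify-out/GRAPH_REPORT.md")
    return steps


print(plan_graph_lookup({"graph.json"}, "where is auth handled?"))
```

Note that dirty files under `graphify-out/` do not change this plan, since the rules say they are not a reason to skip graphify.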
48 changes: 33 additions & 15 deletions README.md
@@ -175,6 +175,8 @@ Install only what you need:
| `leiden` | Leiden community detection (Python < 3.13 only) | `uv tool install "graphifyy[leiden]"` |
| `ollama` | Ollama local inference | `uv tool install "graphifyy[ollama]"` |
| `openai` | OpenAI / OpenAI-compatible APIs | `uv tool install "graphifyy[openai]"` |
| `minimax` | MiniMax OpenAI-compatible API (`--backend minimax`) | `uv tool install "graphifyy[minimax]"` |
| `nim` | NVIDIA NIM / AI Catalog OpenAI-compatible API (`--backend nim`) | `uv tool install "graphifyy[nim]"` |
| `gemini` | Google Gemini API | `uv tool install "graphifyy[gemini]"` |
| `anthropic` | Anthropic Claude API (`--backend claude`, uses `ANTHROPIC_API_KEY`) | `uv tool install "graphifyy[anthropic]"` |
| `bedrock` | AWS Bedrock (uses IAM, no API key) | `uv tool install "graphifyy[bedrock]"` |
@@ -312,7 +314,7 @@ See the [full command reference](#full-command-reference) below.

Create a `.graphifyignore` in your project root — same syntax as `.gitignore`, including `!` negation.

**`.gitignore` is respected automatically.** graphify reads the `.gitignore` in each directory. If a `.graphifyignore` is also present, the two are **merged** — `.graphifyignore` patterns are evaluated last, so they win on conflicts (including `!` negations). Adding a `.graphifyignore` only ever excludes more; it never re-includes a file your `.gitignore` already excluded. Subdirectory scoping works the same way as git — an ignore file only affects its own subtree.
**`.gitignore` is respected automatically.** Graphify loads `.gitignore` first, then `.graphifyignore`, so project-wide data/log/vendor exclusions apply and graphify-specific rules can override them with normal last-match-wins semantics. Subdirectory scoping works the same way as git — an ignore file only affects its own subtree.
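
To make "last-match-wins" concrete, here is a minimal sketch, assuming simplified glob matching via Python's `fnmatch`. It is not graphify's implementation and it ignores real gitignore details such as directory anchoring and `**`. The function name `is_ignored` and the sample patterns are hypothetical.

```python
from fnmatch import fnmatchcase


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Apply ignore rules in order; the last rule that matches decides."""
    ignored = False
    for raw in patterns:
        rule = raw.strip()
        if not rule or rule.startswith("#"):
            continue  # skip blank lines and comments
        negated = rule.startswith("!")
        if negated:
            rule = rule[1:]
        if fnmatchcase(path, rule):
            # A later match overrides any earlier decision.
            ignored = not negated
    return ignored


gitignore = ["data/*", "*.log"]                    # hypothetical project-wide rules
graphifyignore = ["!data/schema.sql", "vendor/*"]  # hypothetical graphify-specific rules
rules = gitignore + graphifyignore                 # .gitignore first, .graphifyignore last

print(is_ignored("data/dump.csv", rules))    # True
print(is_ignored("data/schema.sql", rules))  # False
print(is_ignored("vendor/lib.py", rules))    # True
```

Because `.graphifyignore` rules come last in the combined list, a `!` negation there can re-include a file that `.gitignore` excluded, which is the override behavior the new wording describes.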

```
# .graphifyignore
@@ -401,23 +403,36 @@ docker run -p 8080:8080 -v "$(pwd)/graphify-out:/data" graphify \

## Environment variables

These are only needed for **headless / CI extraction** (`graphify extract`). When running via the `/graphify` skill inside your IDE, the model API is provided by your IDE session — no extra keys needed.
These are only needed for **headless / CI extraction** (`graphify extract`) or when you want the `/graphify` skill to use a direct backend instead of the host assistant's own model. Automatic semantic extraction starts with local Ollama for laptop-safe <=8B-class models, tries the local fallback chain (`qwen2.5-coder:3b` → `gemma3:4b` by default), and uses MiniMax as the final spillover when local chunks are slow or too large, or when laptop load is high. NVIDIA NIM remains available only when explicitly selected.

| Variable | Used for | When required |
|---|---|---|
| `OLLAMA_BASE_URL` | Ollama local inference URL | optional — default `http://localhost:11434/v1` |
| `GRAPHIFY_OLLAMA_MODEL` or `OLLAMA_MODEL` | Ollama model name | optional — default `qwen2.5-coder:3b`; must include a size and stay within the <=8B local safety class |
| `GRAPHIFY_OLLAMA_FALLBACK_MODELS` | Ordered local Ollama fallback models | optional — default `qwen2.5-coder:3b,gemma3:4b`; set `none` to disable local model fallback |
| `GRAPHIFY_OLLAMA_TOKEN_BUDGET` | Ollama semantic chunk packing cap | optional — default `20000`; keeps prompt + output inside the 32k local context before adaptive retry |
| `GRAPHIFY_OLLAMA_NUM_CTX` | Override Ollama KV-cache window size | optional — auto-sized by default |
| `GRAPHIFY_OLLAMA_KEEP_ALIVE` | Time to keep Ollama model loaded | optional — default `30s`; set `0` to unload after each chunk |
| `GRAPHIFY_OLLAMA_NUM_GPU` | Ollama GPU layer offload target | optional — default `999` to keep the local model on GPU |
| `GRAPHIFY_OLLAMA_MAIN_GPU` | Ollama GPU index | optional — default `0` |
| `GRAPHIFY_OLLAMA_NUM_THREAD` | Ollama CPU helper thread cap | optional — default `min(4, CPU/4)` with floor `2`; keeps GPU-fed local runs responsive without stealing daily-driving CPU |
| `GRAPHIFY_OLLAMA_BALANCE` | Ollama/MiniMax balancing | optional — `auto` (default), `local`, `remote`, or `defer` |
| `GRAPHIFY_OLLAMA_MINIMAX_MAX_FRACTION` | Cost cap for dynamic MiniMax spillover | optional — default `0.25` |
| `GRAPHIFY_DISABLE_MINIMAX_FALLBACK` | Disable Ollama→MiniMax cloud fallback | optional — set `1` for strict local-only semantic extraction |
| `MINIMAX_API_KEY` or `GRAPHIFY_MINIMAX_API_KEY` | MiniMax OpenAI-compatible token-plan fallback | `--backend minimax` or dynamic spill/fallback when Ollama is slow or fails |
| `GRAPHIFY_MINIMAX_MODEL` or `MINIMAX_MODEL` | MiniMax model override | optional — default `MiniMax-M3` |
| `NVIDIA_NIM_API_KEY`, `GRAPHIFY_NVIDIA_NIM_API_KEY`, `NVIDIA_API_KEY`, or `NGC_API_KEY` | NVIDIA NIM / AI Catalog backend | explicit `--backend nim` only |
| `GRAPHIFY_NVIDIA_NIM_MODEL`, `NVIDIA_NIM_MODEL`, or `NIM_MODEL` | NVIDIA NIM model override | optional — default `meta/llama-3.1-8b-instruct` |
| `NVIDIA_NIM_BASE_URL` or `NIM_BASE_URL` | NVIDIA NIM endpoint override | optional — default `https://integrate.api.nvidia.com/v1` |
| `ANTHROPIC_API_KEY` | Claude (Anthropic) backend | `--backend claude` |
| `ANTHROPIC_BASE_URL` | Anthropic-compatible endpoint URL (LiteLLM proxy, gateways, ...) | `--backend claude` (default: `https://api.anthropic.com`) |
| `ANTHROPIC_MODEL` | Model name for the Claude backend — for custom endpoints, use the model name/alias your server exposes | `--backend claude` (default: `claude-sonnet-4-6`) |
| `GEMINI_API_KEY` or `GOOGLE_API_KEY` | Google Gemini backend | `--backend gemini` |
| `OPENAI_API_KEY` | OpenAI or OpenAI-compatible APIs | `--backend openai` (local servers accept any non-empty value) |
| `OPENAI_BASE_URL` | OpenAI-compatible server URL (llama.cpp, vLLM, LM Studio, ...) | `--backend openai` (default: `https://api.openai.com/v1`) |
| `OPENAI_MODEL` | Model name for the OpenAI backend — for self-hosted servers, use the model name/alias your server exposes (check its `/v1/models` endpoint), e.g. `LFM2.5-8B-A1B-UD-Q4_K_XL` for llama.cpp | `--backend openai` (default: `gpt-4.1-mini`) |
| `OPENAI_MODEL` | Model name for the OpenAI backend — for self-hosted servers, use the model name/alias your server exposes | `--backend openai` (default: `gpt-4.1-mini`) |
| `DEEPSEEK_API_KEY` | DeepSeek backend | `--backend deepseek` |
| `MOONSHOT_API_KEY` | Kimi Code backend | `--backend kimi` |
| `OLLAMA_BASE_URL` | Ollama local inference URL | `--backend ollama` (default: `http://localhost:11434`) |
| `OLLAMA_MODEL` | Ollama model name | `--backend ollama` (default: auto-detect) |
| `GRAPHIFY_OLLAMA_NUM_CTX` | Override Ollama KV-cache window size | optional — auto-sized by default |
| `GRAPHIFY_OLLAMA_KEEP_ALIVE` | Minutes to keep Ollama model loaded | optional — set `0` to unload after each chunk |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI Service backend | `--backend azure` |
| `AZURE_OPENAI_ENDPOINT` | Azure resource endpoint URL | `--backend azure` (required alongside API key) |
| `AZURE_OPENAI_API_VERSION` | Azure API version override | optional — default `2024-12-01-preview` |
@@ -437,14 +452,18 @@ These are only needed for **headless / CI extraction** (`graphify extract`). Whe
| `GRAPHIFY_MAX_GRAPH_BYTES` | Override the 512 MiB graph.json size cap — e.g. `700MB`, `2GB`, or plain bytes | optional — useful for very large corpora |
| `GRAPHIFY_LLM_TEMPERATURE` | Override LLM temperature for semantic extraction — e.g. `0.7`, or `none` to omit | optional — auto-omitted for o1/o3/o4/gpt-5 reasoning models |

For user-wide MiniMax defaults that work even when a coding agent is launched without your shell environment, put the key in `~/.graphify/credentials.json` as `{"api_keys":{"MINIMAX_API_KEY":"..."}}` and keep that file out of git.

For semantic rebuilds that can wait, run daytime commands with `GRAPHIFY_OLLAMA_BALANCE=defer`; graphify writes `graphify-out/semantic-rebuild-queue.jsonl` with the night-window rebuild hint. Use `graphify update .` immediately for low-load AST indexing, then run queued semantic rebuilds after 20:00 when the laptop is idle (03:00-06:00 remains the safest window).

---

## Privacy

- **Code files** — processed locally via tree-sitter. Nothing leaves your machine. A code-only corpus requires no API key — `graphify extract` runs fully offline.
- **Video / audio** — transcribed locally with faster-whisper. Nothing leaves your machine.
- **Docs, PDFs, images** — sent to your AI assistant for semantic extraction (via the `/graphify` skill, using whatever model your IDE session runs). Headless `graphify extract` requires `GEMINI_API_KEY` / `GOOGLE_API_KEY` (Gemini), `MOONSHOT_API_KEY` (Kimi), `ANTHROPIC_API_KEY` (Claude), `OPENAI_API_KEY` (OpenAI), `DEEPSEEK_API_KEY` (DeepSeek), a running Ollama instance (`OLLAMA_BASE_URL`), AWS credentials via the standard provider chain (Bedrock - no API key needed, uses IAM), or the `claude` CLI binary (Claude Code - no API key needed, uses your Claude subscription). The `--dedup-llm` flag uses the same key.
- **Data residency** — `graphify extract` auto-detects which provider to use based on which API key is set (priority: Gemini → Kimi → Claude → OpenAI → DeepSeek → Azure → Bedrock → Ollama). For code with data-residency requirements, use `--backend ollama` (fully local) or pass an explicit `--backend` flag. Kimi (`MOONSHOT_API_KEY`) routes to Moonshot AI servers in China.
- **Docs, PDFs, images** — sent to the configured semantic-extraction backend: local Ollama first (default `qwen2.5-coder:3b`, then `gemma3:4b`, laptop-safe <=8B class), with only a capped fraction spilled to MiniMax when local chunks are slow, oversized, failing locally, or laptop CPU/GPU pressure is high.
- **Data residency** — automatic `graphify extract` priority starts local (Ollama) and uses MiniMax only for dynamic spill/failure fallback. Ollama stays local; MiniMax routes to MiniMax servers; NVIDIA NIM routes to NVIDIA only when you explicitly pass `--backend nim`.
- No telemetry, no usage tracking, no analytics.
- **Query logging** — every `graphify query`, `graphify path`, `graphify explain`, and MCP `query_graph` call is logged to `~/.cache/graphify-queries.log` in JSON Lines format (timestamp, question, corpus, nodes returned, duration). Full subgraph responses are **not** stored by default. Set `GRAPHIFY_QUERY_LOG_DISABLE=1` to opt out, or `GRAPHIFY_QUERY_LOG=/dev/null` to silence without disabling the code path.

@@ -600,10 +619,10 @@ graphify devin uninstall
graphify antigravity install # .agents/rules + .agents/workflows (Google Antigravity)
graphify antigravity uninstall

graphify extract ./docs # headless LLM extraction for CI (no IDE needed)
graphify extract ./docs --backend gemini # explicit backend: gemini, kimi, claude, openai, deepseek, ollama, bedrock, or claude-cli
graphify extract ./docs # headless LLM extraction; auto: laptop-safe Ollama primary, capped MiniMax spillover
graphify extract ./docs --backend gemini # explicit backend: ollama, minimax, nim, gemini, kimi, claude, openai, deepseek, bedrock, or claude-cli
graphify extract ./docs --backend gemini --model gemini-3.1-pro-preview
graphify extract ./docs --backend ollama # local Ollama (set OLLAMA_BASE_URL / OLLAMA_MODEL) - no API key needed for loopback
graphify extract ./docs --backend ollama # local Ollama (default qwen2.5-coder:3b) - no API key needed for loopback
OPENAI_BASE_URL=http://localhost:8080/v1 OPENAI_MODEL=my-model graphify extract ./docs --backend openai # any OpenAI-compatible server (llama.cpp, vLLM, LM Studio)
ANTHROPIC_BASE_URL=http://localhost:4000 ANTHROPIC_MODEL=my-model graphify extract ./docs --backend claude # any Anthropic-compatible endpoint (LiteLLM proxy, gateways)
GRAPHIFY_OLLAMA_NUM_CTX=32768 graphify extract ./docs --backend ollama # override KV-cache window (auto-sized by default)
@@ -649,7 +668,7 @@ graphify clone https://github.com/karpathy/nanoGPT
graphify merge-graphs a.json b.json --out merged.json
graphify --version # print installed version
graphify watch ./src
graphify check-update ./src
graphify check-update ./src # prints pending semantic/night-window hints; never runs heavy work
graphify update ./src
graphify update ./src --no-cluster # skip reclustering, write raw AST graph only
graphify update ./src --force # overwrite even if new graph has fewer nodes
@@ -659,8 +678,7 @@ graphify cluster-only ./my-project --max-concurrency 16 --batch-size 200 # para
graphify cluster-only ./my-project --resolution 1.5 # more, smaller communities
graphify cluster-only ./my-project --exclude-hubs 99 # exclude p99 degree nodes from partitioning
graphify cluster-only ./my-project --no-label # keep "Community N" placeholders
graphify cluster-only ./my-project --backend=gemini # backend for community naming
graphify cluster-only ./my-project --backend=gemini --model gemini-2.5-pro # specific model
graphify cluster-only ./my-project --backend=ollama # backend for community naming
graphify label ./my-project # (re)name communities with the configured backend
graphify label ./my-project --backend=openai --model gpt-4o # force a specific backend and model
```
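
The local-only settings described in the environment table can be combined with the `--backend ollama` command above. The sketch below builds the argument list and environment for such a run, for example from a CI script. The helper `local_only_extract` and its parameters are hypothetical; only the command, flag, and variable names come from the README text in this diff.

```python
from __future__ import annotations

import os
from typing import Optional


def local_only_extract(
    path: str,
    num_ctx: Optional[int] = None,
    base_env: Optional[dict] = None,
) -> tuple[list[str], dict]:
    """Build argv and environment for a strict local-only `graphify extract` run."""
    env = dict(os.environ if base_env is None else base_env)
    # Documented switch: set `1` for strict local-only semantic extraction.
    env["GRAPHIFY_DISABLE_MINIMAX_FALLBACK"] = "1"
    if num_ctx is not None:
        # Optional override of the Ollama KV-cache window (auto-sized by default).
        env["GRAPHIFY_OLLAMA_NUM_CTX"] = str(num_ctx)
    argv = ["graphify", "extract", path, "--backend", "ollama"]
    return argv, env


argv, env = local_only_extract("./docs", num_ctx=32768)
print(" ".join(argv))
```

To actually run the extraction, the pair can be passed to `subprocess.run(argv, env=env, check=True)`. The helper only builds the invocation so it can be inspected or logged first.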