"If your docs are clear enough, AI writes the code. The real work IS the documentation. Code is just the printout." — Stream Coding methodology by Francesco Marinoni Moretto
Stream Coding is a documentation-first development methodology for AI-assisted coding. Instead of writing code and hoping it works, you write specs so precise that code generation becomes automatic — and verifiable.
This repository is a portable methodology kit for AI coding agents. Install it into any project and your agent gains structured skills, quality gates, and curated design intelligence — all auto-loaded from a single config directory.
| Agent | Config File | Status |
|---|---|---|
| Antigravity (Google) | GEMINI.md + .agents/ | ✅ Verified |
| Claude Code (Anthropic) | CLAUDE.md + .agents/ | ✅ Verified |
| Gemini CLI (Google) | GEMINI.md + .agents/ | ✅ Verified |
All platforms share the same .agents/ directory — skills, rules, scripts, and templates. Only the root config file differs per agent.
Most AI coding methodologies rely on the agent's self-discipline — rules that say "you should" but can't verify "you did." Stream Coding is different:
- Deterministic gates, not guidelines. Quality is enforced by Python scripts (verify.py, spec_precheck.py, spec_conformance.py, tdd_check.py, security_scan.py) that run real checks and produce pass/fail — not by hoping the AI follows instructions. A process rule has ~70% compliance. A script has ~100%. (A minimal sketch of this kind of check appears after this list.)
- Full pipeline, not a skill pack. This isn't a collection of tips — it's a 6-stage pipeline (DISCOVER → STRATEGY → SPEC → BUILD → VERIFY → HARDEN) where each gate blocks the next stage. You can't ship unverified code because the commit gate won't let you.
- Self-improving. Every project you ship feeds learnings back (/learn-eval) and the methodology evolves (/methodology-review with ablation testing). Stale gates get removed. New gaps get filled. The system compounds.
- 5% research, 80% documentation, 15% code. Most methodologies optimize the coding step. Stream Coding optimizes the input to the coding step — because a 10/10 spec produces code that works on the first pass, and a 7/10 spec produces code that needs 30% rework.
- Agent-agnostic from day one. One .agents/ directory, verified on 3 platforms. No vendor lock-in, no platform-specific markup. Markdown in, quality out.
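To make "deterministic gate" concrete, here is a minimal sketch of a pass/fail check in the same spirit as those scripts. The check commands are placeholders, not the methodology's actual gates; the real scripts live inside the individual skills.

```python
#!/usr/bin/env python3
"""Minimal sketch of a deterministic gate: run checks, report pass/fail, exit non-zero on failure.

Illustrative only. The real gate scripts (verify.py, spec_precheck.py, ...) implement
their own specific checks inside the relevant skills.
"""
import subprocess
import sys

# Placeholder commands: substitute your project's real build/lint/test commands.
CHECKS = {
    "build": ["python", "-m", "compileall", "src"],
    "lint": ["python", "-m", "ruff", "check", "src"],
    "tests": ["python", "-m", "pytest", "-q"],
}


def main() -> int:
    failures = []
    for name, cmd in CHECKS.items():
        result = subprocess.run(cmd, capture_output=True, text=True)
        status = "PASS" if result.returncode == 0 else "FAIL"
        print(f"[{status}] {name}")
        if result.returncode != 0:
            failures.append(name)
    # Binary outcome: either every check passed, or the gate blocks the next stage.
    return 1 if failures else 0


if __name__ == "__main__":
    sys.exit(main())
```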
Without Stream Coding:
Vague prompt → AI guesses → looks right → breaks in prod → rework → 2-3x velocity
With Stream Coding:
Clear spec → AI executes → verified by gates → minimal rework → 10-20x velocity
The difference isn't the AI model. It's the quality of what you feed it.
git clone https://github.com/Pyl-Tech/stream-coding.git
cd stream-coding
# Interactive setup — choose your platforms
./install.sh /path/to/your/project
# All platforms (non-interactive)
./install.sh --all /path/to/your/project
# Specific platforms
./install.sh --platforms agy,claude /path/to/your/project
# Preview without copying
./install.sh --dry-run --all /path/to/your/project

| Flag | Effect |
|---|---|
| --force | Overwrite existing files |
| --dry-run | Preview what would be installed |
| --all | Install all platforms (skip selection) |
| --platforms | Comma-separated: agy, gemini, claude |
This copies methodology files, skills, and platform configs into your project. Your AI agent loads them automatically on every session.
Every significant feature follows this flow:
📋 Write a spec → ⛔ Pass the Spec Gate → 🧪 Write tests first → ⚡ Implement → 🔍 Verify → ✅ Commit
You spend 80% of your time on documentation, not code. That's by design — it's where the leverage is.
| When you need to... | Run |
|---|---|
| Research before building | /research |
| Shape a vague idea into a concept | /clarify |
| Clarify what you're building | /strategy |
| Check if your spec is ready | /spec-gate |
| Write tests from the spec | /tdd |
| Build a feature end-to-end | /orchestrate |
| Verify everything works | /verification-loop |
| Document an existing codebase | /reverse-spec |
| Debug a multi-cause issue | /trace |
The full methodology lives in GEMINI.md at your project root. Read it once — it's the operating manual for how any AI agent behaves on your project. CLAUDE.md is a lean router (~50 lines, ~423 tokens) that gives Claude Code the same enforcement without token bloat.
| Stage | Icon | What You Do | Time |
|---|---|---|---|
| DISCOVER | 🔬 | Search for existing solutions, shape vague ideas | — |
| STRATEGY | 🎯 | Answer 7 questions: what, why, for whom, what NOT | 40% |
| SPEC | 📋 | Write AI-ready docs, pass the 16-item Spec Gate | 40% |
| BUILD | ⚡ | Tests first (RED), then implement (GREEN) | 5% |
| VERIFY | 🔍 | Build, lint, test, security scan, code review | 5% |
| HARDEN | ✅ | Sync docs, clean up, extract learnings, COMMIT | 5% |
⛔ Commits only happen in HARDEN. Not before. This prevents shipping unverified code.
Every gate is a hard stop. If it fails, you fix it — not skip it.
| Gate | Question It Answers | When |
|---|---|---|
| Spec Gate (16 items) | Can AI implement this without asking questions? | Before any code |
| TDD Gate | Do failing tests exist before implementation? | Before writing production code |
| Verify Gate | Does everything build, pass, and scan clean? | After implementation |
| Commit Gate | verify + doc-update + refactor-clean + learn-eval all complete? | Before commit |
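As an illustration of how the Commit Gate can be enforced deterministically rather than by self-discipline, the sketch below refuses to pass unless each HARDEN step has left evidence behind. The artifact paths are assumptions for illustration; the actual Claude Code enforcement is the .claude/hooks/commit-gate.sh hook shown later in this README.

```python
#!/usr/bin/env python3
"""Sketch of a commit-gate check: block the commit unless HARDEN evidence exists.

Illustrative only. The artifact paths below are hypothetical; the actual Claude Code
enforcement lives in .claude/hooks/commit-gate.sh.
"""
from pathlib import Path
import sys

# Hypothetical evidence each HARDEN step is assumed to leave behind.
REQUIRED_ARTIFACTS = {
    "verify": Path("docs/reports/VERIFICATION_REPORT.md"),
    "doc-update": Path("docs/reports/DOC_UPDATE.md"),
    "refactor-clean": Path("docs/reports/REFACTOR_CLEAN.md"),
    "learn-eval": Path(".agents/learnings.md"),
}


def main() -> int:
    missing = [step for step, path in REQUIRED_ARTIFACTS.items() if not path.exists()]
    if missing:
        print(f"Commit blocked. Missing HARDEN evidence for: {', '.join(missing)}")
        return 1
    print("Commit gate passed.")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```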
- Documentation IS the work. Code is just the printout.
- When code fails, fix the spec — then regenerate the code.
- A 7/10 spec produces 7/10 code that needs 30% rework. Aim for 10/10.
- Never silently fill a spec gap. If the spec doesn't say it, ask — don't guess.
1. /research — Is there an existing solution?
2. /clarify — Shape the idea if it's vague (skip if already sharp)
3. /strategy — Clarify the 7 Questions
4. Write the spec — Anti-patterns, test cases, error matrix, deep links
5. /spec-gate — Score must be 10/10
6. /adversarial-review — Stress-test with a different model
7. /plan — Break into implementation steps
8. /tdd — Write tests from the spec (they must FAIL)
9. Implement — Write code to make tests pass
10. /verification-loop — Build + lint + test + security
11. /code-review — Does code match spec?
12. /doc-update — Sync documentation
13. /learn-eval — Extract what worked / what didn't
14. Commit & push
Skip to BUILD — but still write the test first:
1. /tdd — Write a test that reproduces the bug (RED; a sketch follows this list)
2. Fix the code — Make the test pass (GREEN)
3. /verification-loop — Verify nothing else broke
4. Commit
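For step 1 above, a bug-reproducing test written before the fix might look like the sketch below. The module and function names are invented for illustration.

```python
# test_pricing_bug.py -- hypothetical failing test written BEFORE the fix (RED).
# Assumes a module `pricing` with an `apply_discount` function; all names are invented.
from pricing import apply_discount


def test_discount_never_produces_negative_total():
    # Reported bug: stacking a 100% voucher with a loyalty discount yields a negative total.
    total = apply_discount(price=10.0, voucher_percent=100, loyalty_percent=5)
    assert total >= 0.0  # fails until the fix clamps the result
```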
1. /trace — Form 2+ hypotheses, test each one
2. /build-fix — If it's a build issue
3. /self-debug — If the agent itself is looping
1. /research — What are the options?
2. /experiment — If it needs data, run an experiment
3. /strategy — Document the decision as an ADR
1. Install SC — ./install.sh /path/to/legacy-app
2. /reverse-spec — Produces SC-compatible specs from existing code
3. /strategy — Define the migration target
4. Write migration spec — Delta between as-is and to-be
5. /spec-gate — Validate the migration spec
6. /tdd → implement — Build the migration
7. /verification-loop — Verify everything works
8. Commit & push
1. /refactor — Diagnose pain points, propose target architecture, plan migration
(Strangler Fig by default — one step at a time, tests throughout)
1. /dep-upgrade — Identify breaking changes → audit callsites → choose strategy
→ behavior tests first → migrate → verify checklist
1. /test-retrofit — Scope by risk → discover intended behavior (docs → callers → git)
→ distinguish intended vs. bug → write characterization tests
→ integrate into CI as permanent safety net (a characterization test sketch follows below)
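A characterization test pins down what the legacy code does today, not what a spec says it should do. A hedged sketch with invented module and function names:

```python
# test_legacy_billing.py -- hypothetical characterization test for untested legacy code.
# It pins the CURRENT observed behavior; whether that behavior is intended or a bug
# is decided later, in the Distinguish phase.
from legacy_billing import invoice_total


def test_negative_quantity_is_clamped_to_zero_today():
    # Observed today: negative quantities are silently treated as zero instead of raising.
    assert invoice_total(unit_price=5.0, quantity=-3) == 0.0
```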
| Change Type | What to do |
|---|---|
| Typo, rename | Direct fix → lint → commit |
| Bug fix with clear repro | TDD + fix → verify → commit |
| New feature, clear criteria | SPEC → BUILD → VERIFY → HARDEN |
| New feature, ambiguous | Full pipeline (DISCOVER through HARDEN) |
| Legacy app, needs docs first | /reverse-spec → then normal pipeline |
| High-stakes migration | /reverse-spec → full pipeline + ADR + adversarial review |
This is not a static rulebook. The methodology updates itself from real project experience.
Every project you ship
↓
/learn-eval ← capture what worked and what failed
↓
.agents/learnings.md ← LOCAL cross-project pattern registry
↓
Learnings Hub (opt-in) ← REMOTE shared registry (universal learnings only)
↓
/methodology-review
├─ 2a: Ablation test ← is each gate still load-bearing?
│ run a task WITH and WITHOUT the component
│ if zero findings across 3+ projects → candidate for removal
│
└─ 2b: Gap fill ← what's missing based on accumulated learnings?
generate fix candidates (structural > process)
A/B test each on real data that exposed the gap
prefer additive changes to existing scripts
↓
Methodology updated ← rules, gates, skills, GEMINI.md
↓
Better specs → better AI output → next project starts smarter
↓
[loop]
Run after every BUILD → HARDEN cycle. Classify what happened:
| Outcome | What to extract |
|---|---|
| Perfect first pass | The spec pattern that made it work — capture it to reproduce |
| Minor rework (1–2 fixes) | The ambiguity that caused it — add to anti-patterns |
| Major rework | The gap — strengthen the Spec Gate |
| Spec was fundamentally wrong | The failing assumption — add to the Assumptions checklist |
Learnings are saved to .agents/learnings.md with evidence, scope (project-specific vs universal), and a link to the spec that produced them.
The Learnings Hub is an optional MCP server that centralizes universal learnings across all your workspaces. It runs on Cloud Run and is connected automatically by the installer for all platforms (Antigravity, Gemini CLI, Claude Code).
| Question | Answer |
|---|---|
| What gets sent? | Only universal, methodology-level learnings — spec-writing patterns, review calibration findings, methodology gaps. Never project-specific data, never source code, never business logic. |
| What stays local? | Everything in .agents/learnings.md. Project-specific learnings ("BigQuery streaming inserts block DML", "chi router prefix resolution") are never submitted. |
| Authentication | Google OAuth 2.0 with PKCE. First use triggers a browser-based consent flow. Credentials are cached locally. |
| Where does data live? | Git repository on Cloud Run (europe-west1). Data is stored in a private Git repo, not in any third-party database. |
| Is it required? | No. If the MCP server is unavailable or not configured, the agent reports it to you and continues — local .agents/learnings.md always works standalone. |
| Can I opt out? | Yes. Remove learnings-hub from your mcp_config.json. No data is sent without the MCP connection. |
| GDPR | No personal data is collected beyond the Google account email used for auth. Learnings are methodology patterns ("always define error envelopes before endpoint implementation"), not user data. You can request deletion at any time. |
Scope filter enforced by the agent: Only learnings that would apply to ANY project regardless of language, framework, or domain are submitted. The agent checks this before every submission.
/methodology-review does two things:
2a — Ablation (remove what's stale): For each gate and skill, test whether removing it degrades output quality. Models improve; gates go stale. A component that produces zero findings across 3+ projects is dead weight.
2b — Gap fill (add what's missing): When learnings reveal a recurring failure pattern, generate fix candidates, A/B test them against real data, and promote the winning fix into the narrowest possible surface:
| Routing question | Surface |
|---|---|
| Should this apply on every task, unconditionally? | Rule (.agents/rules/) |
| Is this an on-demand playbook or slash command? | Skill (.agents/skills/<name>/SKILL.md) |
| Is this a one-shot deterministic check? | Script (inside an existing skill's scripts/) |
Structural beats process. A script that auto-checks has ~100% compliance. A self-enforced process step has ~70%. When a gap can be automated, it must be.
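For example, if accumulated learnings showed that specs repeatedly shipped without an error matrix, the gap could be closed with a small additive script rather than another process rule. A hedged sketch follows; the required section names are assumptions, not the actual 16-item Spec Gate.

```python
#!/usr/bin/env python3
"""Sketch of a narrow, additive spec check: required sections must be present.

The section names are assumptions for illustration, not the actual 16-item Spec Gate.
"""
import sys
from pathlib import Path

REQUIRED_SECTIONS = ["## Anti-patterns", "## Test Cases", "## Error Matrix"]


def main(spec_path: str) -> int:
    text = Path(spec_path).read_text(encoding="utf-8")
    missing = [section for section in REQUIRED_SECTIONS if section not in text]
    if missing:
        print(f"FAIL: {spec_path} is missing sections: {', '.join(missing)}")
        return 1
    print(f"PASS: {spec_path}")
    return 0


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: check_spec_sections.py <spec.md>")
        sys.exit(2)
    sys.exit(main(sys.argv[1]))
```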
Every project you ship makes the methodology marginally better for every project that comes after it. The learning registry, the gates, and the rules all compound. This is the mechanism that separates a living methodology from a static style guide.
This repository is the shared methodology. Everyone uses it, everyone improves it.
- Fork this repo
- Make your change — new rule, skill improvement, gate refinement, data update
- Run /spec-gate on any new rules or skills
- Run /methodology-review — ablation before addition. Can something be removed or simplified instead?
- Submit a PR with:
  - What changed
  - Why (what learning or failure prompted this)
  - Evidence (link to the project where this was discovered)
| Type | Where | Example |
|---|---|---|
| New coding standard | .agents/rules/ | "Always use structured logging in Python" |
| New language rule | .agents/skills/<lang>/SKILL.md | Adding a Dart or Ruby language skill |
| Skill improvement | .agents/skills/<name>/ | Better /tdd instructions for E2E tests |
| Gate refinement | spec-gate or verification-loop skill | New Spec Gate check item |
| Skill enhancement | .agents/skills/<skill>/ | Better design intelligence data |
| Template update | .agents/templates/ | More practical ADR template |
| Methodology change | GEMINI.md | New stage skip rule, better trigger behavior |
- Every change must be spec-referenced. Don't add rules "because it feels right" — link to a real failure or learning that prompted it.
- Structural enforcement > process enforcement. If a check can be automated as a script, it should be — AI self-discipline checks have ~70% compliance; scripts have ~100%.
- Methodology changes require ablation. Before adding, ask: "What can I remove?" Run /methodology-review to evaluate.
| File | Purpose |
|---|---|
| GEMINI.md | Full methodology (repo root — auto-loaded by Antigravity & Gemini CLI) |
| CLAUDE.md | Lean router for Claude Code (~423 tokens, routes to GEMINI.md + .agents/) |
| coding-standards.md | Immutability, file organization, error handling, validation |
| git-discipline.md | Commit format, -F protocol, staging safety rules |
| harden-stage.md | Commit gate, ADR triggers, divergence prevention |
| build-execution.md | Spec-Test-Implement loop, smallest diff |
| spec-writing.md | 4 mandatory sections, granularity, constraint tiers |
| strategy-stage.md | 7 Questions, documentation audit, exit criteria |
| verify-stage.md | Verify gate, evaluator separation |
| workflow-orchestration.md | Decision tree, session-start protocol, skill reference |
| antigravity-integration.md | Antigravity-specific mapping and artifact paths |
Every capability lives in .agents/skills/<name>/SKILL.md. Slash commands (/research, /tdd, etc.) and auto-invoked skills share the same structure — the agent loads them when the task matches.
| Command | Stage | Purpose |
|---|---|---|
| /research | 🔬 DISCOVER | Research before implementation (Adopt > Adapt > Build) |
| /experiment | 🔬 DISCOVER | Data-driven decision making (run experiments, not debates) |
| /clarify | 🔬 DISCOVER | Shape vague ideas into sharp concepts (convergence test exit gate) |
| /strategy | 🎯 STRATEGY | Clarify WHAT and WHY (7 Questions Framework) |
| /audit | 🎯 STRATEGY | Clean existing documentation (target 40-50% reduction) |
| /reverse-spec | 📋 SPEC | Reverse-engineer SC-compatible specs from existing code |
| /spec-gate | 📋 SPEC | Verify spec completeness before coding (16-item checklist) |
| /adversarial-review | 📋 SPEC | Stress-test specs with epistemic pre-scan + adversarial prompt |
| /plan | ⚡ BUILD | Create implementation plan from validated spec |
| /orchestrate | ⚡ BUILD | Multi-step pipeline (feature/bugfix/refactor/security) |
| /tdd | ⚡ BUILD | Test-driven development cycle (includes E2E) |
| /build-fix | ⚡ BUILD | Incremental build error resolution |
| /trace | ⚡ BUILD | Causal debugging with parallel hypothesis testing |
| /verification-loop | 🔍 VERIFY | Full quality check (build, lint, test, security) |
| /code-review | 🔍 VERIFY | Audit code ↔ spec conformance |
| /security-review | 🔍 VERIFY | Full security audit (OWASP, secrets, dependencies) |
| /doc-update | ✅ HARDEN | Documentation ↔ code synchronization |
| /refactor-clean | ✅ HARDEN | Dead code detection and cleanup |
| /learn-eval | ✅ HARDEN | Extract spec patterns for continuous learning |
| /update-codemaps | ✅ HARDEN | Generate token-lean architecture documentation |
| /retrocheck | ✅ HARDEN | Retroactive alignment against current specs and rules |
| /aside | — | Ask a question mid-task without losing context |
| /self-debug | — | Agent failure recovery and loop detection |
| /methodology-review | — | Ablation and optimization of the methodology itself |
| /reload | — | Force full reload of all agent configuration |
| Skill | Trigger Context | Key Assets |
|---|---|---|
| ui-ux-pro-max | Frontend design, landing pages, dashboards | 4,800+ records: 85 styles, 161 palettes, 1,924 Google Fonts, 16 stacks |
| performance-optimization | Slow endpoints, bottlenecks | Measure → Identify → Fix → Verify → Guard cycle |
| ci-cd | Deployment, pipelines, quality gates | Pipeline templates, rollback thresholds |
| database | Database operations, migrations, modeling | PostgreSQL-first patterns, migration safety |
| compliance | Personal data, GDPR, privacy | Privacy checklist, data handling rules |
| adk-agent | Google ADK, AgentEngine, VertexAI deployment | Project structure, tool_context, eval patterns |
| mcp-builder | MCP servers, LLM tool integrations | FastMCP/MCP SDK patterns, evaluation criteria |
| mcp-ge | Gemini Enterprise, Agent Designer, Cloud Run | OAuth, StreamableHTTP, identity resolution |
| a2ui | Agent-to-UI (A2UI), GE rich responses | v0.8 component reference, silent failure patterns |
| terraform-gcp | GCP infrastructure as code | Module structure, IAM patterns, state management |
| refactor | Architectural refactoring — extract modules, invert deps, migrate patterns | 4-phase process: Diagnose → Target → Plan → Execute |
| dep-upgrade | Dependency upgrades and library migrations with breaking change impact analysis | 5-phase: Identify → Audit → Strategy → Migrate → Verify |
| test-retrofit | Retroactive test coverage for legacy code with no tests — characterize before locking | 5-phase: Scope → Discover → Distinguish → Write → Guard |
| cross-spec-validator | Multi-spec projects (≥3 specs), cross-spec coherence | 6 checks: singletons, constants, bitfields, reentrancy, coverage, cycles |
| api-design | REST API endpoint design and review | Endpoint checklist, error envelope, versioning |
| sentrux | Architectural structural governance | .sentrux/rules.toml, DSM, quality signal |
| shipping | Deployment, feature flags, rollback plans | Release checklist, monitoring setup |
| gcp | Google Cloud Platform conventions | IAM patterns, service configuration |
| golang | Go conventions and constraints | Formatting, error handling, concurrency, testing |
| python | Python conventions and constraints | Type hints, formatting, testing, Django/FastAPI |
| typescript | TypeScript/JavaScript conventions | Type safety, React patterns, validation |
| java | Java conventions and constraints | Immutability, Spring patterns, modern Java |
| kotlin | Kotlin conventions and constraints | Null safety, coroutines, Compose patterns |
| swift | Swift conventions and constraints | Concurrency, memory, SwiftUI patterns |
| rust | Rust conventions and constraints | Ownership, borrowing, error handling, unsafe |
| cpp | C++ conventions and constraints | RAII, memory safety, modern C++ |
| php | PHP conventions and constraints | PSR-12, strict types, Laravel/Symfony |
| skill-creator | Creating or improving skills | Eval harness, description optimization, packaging |
| docx | Word document generation and editing | validate.py, XML conformance |
| xlsx | Excel spreadsheet generation | recalc.py, formula verification |
| pdf | PDF creation, manipulation, extraction | pypdf-based scripts |
| pdf-reading | PDF inspection and content extraction | Text extraction, page rasterization |
| pptx | PowerPoint generation and editing | pptxgenjs-based patterns |
| Template | Purpose |
|---|---|
| STRATEGIC_BLUEPRINT.md | Strategic blueprint for 🎯 STRATEGY |
| SPEC_GATE_CHECKLIST.md | 16-item spec completeness gate |
| ADR_TEMPLATE.md | Architecture Decision Records |
| RUNBOOK.md | Operational runbook for deployments |
| VERIFICATION_REPORT.md | Verification loop output format |
| TRACE_REPORT.md | Causal debugging trace format |
| sentrux-rules.toml | Sentrux architectural constraint template |
your-repo/
├── GEMINI.md # Full methodology (Antigravity + Gemini CLI)
├── CLAUDE.md # Lean router (Claude Code — ~423 tokens)
├── .claude/
│ └── hooks/
│ └── commit-gate.sh # Deterministic commit gate (Claude Code)
└── .agents/ # Shared across ALL agents
├── rules/ # 9 rules (methodology core)
│ ├── coding-standards.md
│ ├── git-discipline.md
│ ├── build-execution.md
│ └── ... # Stage-specific methodology
├── skills/ # 58 skills (slash commands + auto-invoked)
│ ├── golang/ # Language conventions (on-demand)
│ ├── python/
│ ├── typescript/
│ ├── java/
│ ├── kotlin/
│ ├── swift/
│ ├── rust/
│ ├── cpp/
│ ├── php/
│ ├── security-review/ # Gate enforcement (on-demand)
│ ├── spec-gate/
│ └── <skill-name>/
│ ├── SKILL.md
│ ├── scripts/ # Executable scripts
│ └── data/ # Curated databases
└── templates/ # 7 document templates
This repository is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made.
- No endorsement — You may not use the names of the original authors in a way that suggests they endorse your use.
Full license text: https://creativecommons.org/licenses/by/4.0/legalcode
This repository builds on the following works. Attribution is provided in compliance with each source license.
The core Stream Coding methodology, mantras, stage names, gate definitions, and Golden Rules originate from:
Stream Coding by Francesco Marinoni Moretto
Source: https://github.com/frmoretto/stream-coding
License: CC BY 4.0
Copyright © 2025 Francesco Marinoni Moretto
Changes made in this repository relative to the original:
- Extended to support multiple AI agents: Antigravity, Claude Code, and Gemini CLI (verified via PoC)
- Extended the gate system: 6-item Spec Gate → 16-item gate with Language Constraint Gate and Frontend Design Token Gate
- Added 58 on-demand skills (25 slash-command + 33 auto-invoked) with executable scripts and curated data assets
- Added deterministic shell hooks (commit gate) for Claude Code
- Added agentic-specific security vectors (prompt injection via CI output, unauthorized code execution)
- Added Session-Start Protocol, ADR trigger criteria, API error envelope standard, and Constraint Tiers table
- Extended rules coverage: GCP, ADK, Terraform, MCP, A2UI, Sentrux, 12 language-specific skill files (Go, Python, TypeScript, Java, Kotlin, Swift, Rust, C++, PHP + 3 domain-specific)
- Translated all methodology documentation to English (English-only requirement for agentic reproducibility)
The deterministic structural pre-check scripts (spec_precheck.py and related tools) are adapted from:
specverify by Francesco Marinoni Moretto
Source: https://github.com/frmoretto/specverify
License: MIT
The curated design intelligence database used by the ui-ux-pro-max skill (85 styles, 161 palettes, 1,924 Google Fonts, 16 framework stacks) is sourced from:
UI/UX Pro Max Skill
Source: https://github.com/nextlevelbuilder/ui-ux-pro-max-skill
License: MIT
This repository was assembled and extended by:
Jeremy Garreau — jeremy-garreau
Adaptation and extensions: Stream Coding for AI coding agents (Antigravity, Claude Code, Gemini CLI)
License: CC BY 4.0
If you use this repository in your own work, you must provide attribution to the original Stream Coding methodology by Francesco Marinoni Moretto as described above.