
Stream Coding

"If your docs are clear enough, AI writes the code. The real work IS the documentation. Code is just the printout."Stream Coding methodology by Francesco Marinoni Moretto

What Is This?

Stream Coding is a documentation-first development methodology for AI-assisted coding. Instead of writing code and hoping it works, you write specs so precise that code generation becomes automatic — and verifiable.

This repository is a portable methodology kit for AI coding agents. Install it into any project and your agent gains structured skills, quality gates, and curated design intelligence — all auto-loaded from a single config directory.

Verified Platforms

| Agent | Config File | Status |
|---|---|---|
| Antigravity (Google) | GEMINI.md + .agents/ | ✅ Verified |
| Claude Code (Anthropic) | CLAUDE.md + .agents/ | ✅ Verified |
| Gemini CLI (Google) | GEMINI.md + .agents/ | ✅ Verified |

All platforms share the same .agents/ directory — skills, rules, scripts, and templates. Only the root config file differs per agent.

Why This One?

Most AI coding methodologies rely on the agent's self-discipline — rules that say "you should" but can't verify "you did." Stream Coding is different:

  1. Deterministic gates, not guidelines. Quality is enforced by Python scripts (verify.py, spec_precheck.py, spec_conformance.py, tdd_check.py, security_scan.py) that run real checks and produce pass/fail — not by hoping the AI follows instructions. A process rule has ~70% compliance. A script has ~100%.

  2. Full pipeline, not a skill pack. This isn't a collection of tips — it's a 6-stage pipeline (DISCOVER → STRATEGY → SPEC → BUILD → VERIFY → HARDEN) where each gate blocks the next stage. You can't ship unverified code because the commit gate won't let you.

  3. Self-improving. Every project you ship feeds learnings back (/learn-eval) and the methodology evolves (/methodology-review with ablation testing). Stale gates get removed. New gaps get filled. The system compounds.

  4. 5% research, 80% documentation, 15% code. Most methodologies optimize the coding step. Stream Coding optimizes the input to the coding step — because a 10/10 spec produces code that works on the first pass, and a 7/10 spec produces code that needs 30% rework.

  5. Agent-agnostic from day one. One .agents/ directory, verified on 3 platforms. No vendor lock-in, no platform-specific markup. Markdown in, quality out.

The core insight

Without Stream Coding:
  Vague prompt → AI guesses → looks right → breaks in prod → rework → 2-3x velocity

With Stream Coding:
  Clear spec → AI executes → verified by gates → minimal rework → 10-20x velocity

The difference isn't the AI model. It's the quality of what you feed it.


Quick Start

1. Install into your project

```bash
git clone https://github.com/Pyl-Tech/stream-coding.git
cd stream-coding

# Interactive setup — choose your platforms
./install.sh /path/to/your/project

# All platforms (non-interactive)
./install.sh --all /path/to/your/project

# Specific platforms
./install.sh --platforms agy,claude /path/to/your/project

# Preview without copying
./install.sh --dry-run --all /path/to/your/project
```

| Flag | Effect |
|---|---|
| --force | Overwrite existing files |
| --dry-run | Preview what would be installed |
| --all | Install all platforms (skip selection) |
| --platforms | Comma-separated: agy, gemini, claude |

This copies methodology files, skills, and platform configs into your project. Your AI agent loads them automatically on every session.

2. Understand the core loop

Every significant feature follows this flow:

📋 Write a spec  →  ⛔ Pass the Spec Gate  →  🧪 Write tests first  →  ⚡ Implement  →  🔍 Verify  →  ✅ Commit

You spend 80% of your time on documentation, not code. That's by design — it's where the leverage is.

3. Key commands

| When you need to... | Run |
|---|---|
| Research before building | /research |
| Shape a vague idea into a concept | /clarify |
| Clarify what you're building | /strategy |
| Check if your spec is ready | /spec-gate |
| Write tests from the spec | /tdd |
| Build a feature end-to-end | /orchestrate |
| Verify everything works | /verification-loop |
| Document an existing codebase | /reverse-spec |
| Debug a multi-cause issue | /trace |

4. Read the methodology

The full methodology lives in GEMINI.md at your project root. Read it once — it's the operating manual for how any AI agent behaves on your project. CLAUDE.md is a lean router (~50 lines, ~423 tokens) that gives Claude Code the same enforcement without token bloat.


How It Works

The 6 Stages

| Stage | Icon | What You Do | Time |
|---|---|---|---|
| DISCOVER | 🔬 | Search for existing solutions, shape vague ideas | 5% |
| STRATEGY | 🎯 | Answer 7 questions: what, why, for whom, what NOT | 40% |
| SPEC | 📋 | Write AI-ready docs, pass the 16-item Spec Gate | 40% |
| BUILD | ⚡ | Tests first (RED), then implement (GREEN) | 5% |
| VERIFY | 🔍 | Build, lint, test, security scan, code review | 5% |
| HARDEN | ✅ | Sync docs, clean up, extract learnings, COMMIT | 5% |

Commits only happen in HARDEN. Not before. This prevents shipping unverified code.
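
That rule is mechanically enforceable. Below is a minimal sketch of the idea in Python (the shipped hook is commit-gate.sh, and the marker paths here are hypothetical), assuming each HARDEN step leaves an artifact behind when it completes:

```python
#!/usr/bin/env python3
"""Sketch of the commit gate idea (the shipped hook is commit-gate.sh).

The marker paths are hypothetical: assume each HARDEN step drops an
artifact when it completes, and the gate refuses to commit otherwise.
"""
import pathlib
import sys

# Hypothetical artifacts; the real gate's checks live in the hook itself.
REQUIRED_ARTIFACTS = [
    ".agents/out/verification-report.md",  # /verification-loop
    ".agents/out/doc-update.done",         # /doc-update
    ".agents/out/refactor-clean.done",     # /refactor-clean
    ".agents/out/learn-eval.done",         # /learn-eval
]

missing = [p for p in REQUIRED_ARTIFACTS if not pathlib.Path(p).exists()]
if missing:
    print("Commit blocked -- incomplete HARDEN steps:")
    for path in missing:
        print(f"  missing: {path}")
    sys.exit(1)  # a nonzero exit aborts the commit when run as a hook
sys.exit(0)
```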

The 4 Gates

Every gate is a hard stop. If it fails, you fix it — not skip it.

| Gate | Question It Answers | When |
|---|---|---|
| Spec Gate (16 items) | Can AI implement this without asking questions? | Before any code |
| TDD Gate | Do failing tests exist before implementation? | Before writing production code |
| Verify Gate | Does everything build, pass, and scan clean? | After implementation |
| Commit Gate | verify + doc-update + refactor-clean + learn-eval all complete? | Before commit |
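
What makes these gates hard stops is that they are scripts, not prose. A minimal sketch of the Verify Gate pattern, with placeholder commands standing in for your project's real build, lint, and test toolchain (the shipped verify.py will differ):

```python
#!/usr/bin/env python3
"""Sketch of a deterministic verify gate (not the shipped verify.py).

Each check is a real command with a real exit code; the gate passes
only if every check passes. Commands below are placeholders.
"""
import subprocess
import sys

CHECKS = [
    ("build", ["python", "-m", "compileall", "-q", "."]),
    ("lint",  ["python", "-m", "pyflakes", "."]),       # placeholder linter
    ("test",  ["python", "-m", "pytest", "-q"]),
]

def main() -> int:
    failed = []
    for name, cmd in CHECKS:
        result = subprocess.run(cmd)
        print(f"[{'PASS' if result.returncode == 0 else 'FAIL'}] {name}")
        if result.returncode != 0:
            failed.append(name)
    if failed:
        print(f"Verify Gate FAILED: {', '.join(failed)}")
        return 1  # blocks HARDEN -- and therefore the commit
    print("Verify Gate passed.")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```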

The Golden Rules

  1. Documentation IS the work. Code is just the printout.
  2. When code fails, fix the spec — then regenerate the code.
  3. A 7/10 spec produces 7/10 code that needs 30% rework. Aim for 10/10.
  4. Never silently fill a spec gap. If the spec doesn't say it, ask — don't guess.

Day-to-Day Usage

"I need to build a new feature"

1. /research          — Is there an existing solution?
2. /clarify           — Shape the idea if it's vague (skip if already sharp)
3. /strategy          — Clarify the 7 Questions
4. Write the spec     — Anti-patterns, test cases, error matrix, deep links
5. /spec-gate         — Score must be 10/10 (a structural pre-check sketch follows this list)
6. /adversarial-review — Stress-test with a different model
7. /plan              — Break into implementation steps
8. /tdd               — Write tests from the spec (they must FAIL)
9. Implement          — Write code to make tests pass
10. /verification-loop — Build + lint + test + security
11. /code-review      — Does code match spec?
12. /doc-update       — Sync documentation
13. /learn-eval       — Extract what worked / what didn't
14. Commit & push

"I need to fix a bug"

Skip to BUILD — but still write the test first:

1. /tdd               — Write a test that reproduces the bug (RED; see the sketch after this list)
2. Fix the code       — Make the test pass (GREEN)
3. /verification-loop — Verify nothing else broke
4. Commit
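
Step 1 is mechanically checkable: the new test must fail, for the right reason, before any production code changes. A minimal sketch (the shipped tdd_check.py may work differently):

```python
#!/usr/bin/env python3
"""Sketch of a RED-phase check (the shipped tdd_check.py may differ).

Confirms the reproducing test actually fails before the fix exists.
"""
import subprocess
import sys

def assert_red(test_id: str) -> int:
    # pytest exit codes: 0 = all passed, 1 = test failures,
    # 5 = no tests collected (e.g. a typo in the test id).
    result = subprocess.run(["python", "-m", "pytest", "-q", test_id])
    if result.returncode == 0:
        print("Test already passes -- it does not reproduce the bug.")
        return 1
    if result.returncode != 1:
        print("Test run did not fail cleanly; check the test id.")
        return 1
    print("RED confirmed: safe to write the fix.")
    return 0

if __name__ == "__main__":
    sys.exit(assert_red(sys.argv[1]))
```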

"I need to debug something complex"

1. /trace             — Form 2+ hypotheses, test each one
2. /build-fix         — If it's a build issue
3. /self-debug        — If the agent itself is looping

"I need to make an architecture decision"

1. /research          — What are the options?
2. /experiment        — If it needs data, run an experiment
3. /strategy          — Document the decision as an ADR

"I have an existing app I want to migrate / document"

1. Install SC           — ./install.sh /path/to/legacy-app
2. /reverse-spec        — Produces SC-compatible specs from existing code
3. /strategy            — Define the migration target
4. Write migration spec — Delta between as-is and to-be
5. /spec-gate           — Validate the migration spec
6. /tdd → implement     — Build the migration
7. /verification-loop   — Verify everything works
8. Commit & push

"I want to restructure the architecture of an existing codebase"

1. /refactor            — Diagnose pain points, propose target architecture, plan migration
                          (Strangler Fig by default — one step at a time, tests throughout)

"I need to upgrade a dependency or migrate a library"

1. /dep-upgrade         — Identify breaking changes → audit callsites → choose strategy
                          → behavior tests first → migrate → verify checklist

"We have no tests on this legacy module and I need to add some"

1. /test-retrofit       — Scope by risk → discover intended behavior (docs → callers → git)
                          → distinguish intended vs. bug → write characterization tests
                          → integrate into CI as a permanent safety net (a minimal sketch follows)
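
A characterization test pins today's observed behavior, intended or buggy, so the retrofit cannot drift silently. A minimal sketch, where legacy_pricing.quote and all expected values are hypothetical stand-ins captured from the current code:

```python
"""Characterization-test sketch for an untested legacy module.

legacy_pricing.quote is hypothetical; every expected value is whatever
the CURRENT code returns, recorded before any change is made.
"""
import pytest

from legacy_pricing import quote  # hypothetical legacy module

@pytest.mark.parametrize(
    ("qty", "expected"),
    [
        (1, 9.99),    # captured from current output, not from a spec
        (10, 89.90),  # bulk discount as currently implemented
        (0, 0.0),     # looks wrong? pin it anyway, file the bug separately
    ],
)
def test_quote_characterization(qty, expected):
    # Locks in today's behavior so later changes can't drift unnoticed.
    assert quote(qty) == pytest.approx(expected)
```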

Not every change needs the full pipeline

| Change Type | What to do |
|---|---|
| Typo, rename | Direct fix → lint → commit |
| Bug fix with clear repro | TDD + fix → verify → commit |
| New feature, clear criteria | SPEC → BUILD → VERIFY → HARDEN |
| New feature, ambiguous | Full pipeline (DISCOVER through HARDEN) |
| Legacy app, needs docs first | /reverse-spec → then normal pipeline |
| High-stakes migration | /reverse-spec → full pipeline + ADR + adversarial review |

Stream Coding Is Self-Improving

This is not a static rulebook. The methodology updates itself from real project experience.

Every project you ship
        ↓
  /learn-eval                      ← capture what worked and what failed
        ↓
  .agents/learnings.md             ← LOCAL cross-project pattern registry
        ↓
  Learnings Hub (opt-in)           ← REMOTE shared registry (universal learnings only)
        ↓
  /methodology-review
    ├─ 2a: Ablation test           ← is each gate still load-bearing?
    │       run a task WITH and WITHOUT the component
    │       if zero findings across 3+ projects → candidate for removal
    │
    └─ 2b: Gap fill                ← what's missing based on accumulated learnings?
            generate fix candidates (structural > process)
            A/B test each on real data that exposed the gap
            prefer additive changes to existing scripts
        ↓
  Methodology updated              ← rules, gates, skills, GEMINI.md
        ↓
  Better specs → better AI output → next project starts smarter
        ↓
        [loop]

Step 1 — Extract learnings with /learn-eval

Run after every BUILD → HARDEN cycle. Classify what happened:

| Outcome | What to extract |
|---|---|
| Perfect first pass | The spec pattern that made it work — capture it to reproduce |
| Minor rework (1–2 fixes) | The ambiguity that caused it — add to anti-patterns |
| Major rework | The gap — strengthen the Spec Gate |
| Spec was fundamentally wrong | The failing assumption — add to the Assumptions checklist |

Learnings are saved to .agents/learnings.md with evidence, scope (project-specific vs universal), and a link to the spec that produced them.

Learnings Hub — shared learning registry

The Learnings Hub is an optional MCP server that centralizes universal learnings across all your workspaces. It runs on Cloud Run and is connected automatically by the installer for all platforms (Antigravity, Gemini CLI, Claude Code).

| Question | Answer |
|---|---|
| What gets sent? | Only universal, methodology-level learnings — spec-writing patterns, review calibration findings, methodology gaps. Never project-specific data, never source code, never business logic. |
| What stays local? | Everything in .agents/learnings.md. Project-specific learnings ("BigQuery streaming inserts block DML", "chi router prefix resolution") are never submitted. |
| Authentication | Google OAuth 2.0 with PKCE. First use triggers a browser-based consent flow. Credentials are cached locally. |
| Where does data live? | Git repository on Cloud Run (europe-west1). Data is stored in a private Git repo, not in any third-party database. |
| Is it required? | No. If the MCP server is unavailable or not configured, the agent reports it to you and continues — local .agents/learnings.md always works standalone. |
| Can I opt out? | Yes. Remove learnings-hub from your mcp_config.json. No data is sent without the MCP connection. |
| GDPR | No personal data is collected beyond the Google account email used for auth. Learnings are methodology patterns ("always define error envelopes before endpoint implementation"), not user data. You can request deletion at any time. |

Scope filter enforced by the agent: Only learnings that would apply to ANY project regardless of language, framework, or domain are submitted. The agent checks this before every submission.

Step 2 — Evolve the methodology with /methodology-review

/methodology-review does two things:

2a — Ablation (remove what's stale): For each gate and skill, test whether removing it degrades output quality. Models improve; gates go stale. A component that produces zero findings across 3+ projects is dead weight.
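
The removal rule is mechanical enough to script. A sketch of the decision with invented finding counts:

```python
"""Sketch of the ablation removal rule: a component with zero findings
across 3+ recent projects is a removal candidate. Counts are invented."""

MIN_PROJECTS = 3

def removal_candidates(findings_by_gate: dict[str, list[int]]) -> list[str]:
    # Maps gate/skill name -> findings it produced in each recent project.
    return [
        gate for gate, counts in findings_by_gate.items()
        if len(counts) >= MIN_PROJECTS and sum(counts) == 0
    ]

print(removal_candidates({
    "spec-gate": [4, 2, 3],         # still load-bearing: keep
    "legacy-lint-gate": [0, 0, 0],  # dead weight: candidate for removal
}))
```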

2b — Gap fill (add what's missing): When learnings reveal a recurring failure pattern, generate fix candidates, A/B test them against real data, and promote the winning fix into the narrowest possible surface:

| Routing question | Surface |
|---|---|
| Should this apply on every task, unconditionally? | Rule (.agents/rules/) |
| Is this an on-demand playbook or slash command? | Skill (.agents/skills/<name>/SKILL.md) |
| Is this a one-shot deterministic check? | Script (inside an existing skill's scripts/) |

Structural beats process. A script that auto-checks has ~100% compliance. A self-enforced process step has ~70%. When a gap can be automated, it must be.

The result

Every project you ship makes the methodology marginally better for every project that comes after it. The learning registry, the gates, and the rules all compound. This is the mechanism that separates a living methodology from a static style guide.


Contributing

This repository is the shared methodology. Everyone uses it, everyone improves it.

How to contribute

  1. Fork this repo
  2. Make your change — new rule, skill improvement, gate refinement, data update
  3. Run /spec-gate on any new rules or skills
  4. Run /methodology-review — ablation before addition. Can something be removed or simplified instead?
  5. Submit a PR with:
    • What changed
    • Why (what learning or failure prompted this)
    • Evidence (link to the project where this was discovered)

What you can contribute

| Type | Where | Example |
|---|---|---|
| New coding standard | .agents/rules/ | "Always use structured logging in Python" |
| New language rule | .agents/skills/<lang>/SKILL.md | Adding a Dart or Ruby language skill |
| Skill improvement | .agents/skills/<name>/ | Better /tdd instructions for E2E tests |
| Gate refinement | spec-gate or verification-loop skill | New Spec Gate check item |
| Skill enhancement | .agents/skills/<skill>/ | Better design intelligence data |
| Template update | .agents/templates/ | More practical ADR template |
| Methodology change | GEMINI.md | New stage skip rule, better trigger behavior |

Contribution rules

  • Every change must be spec-referenced. Don't add rules "because it feels right" — link to a real failure or learning that prompted it.
  • Structural enforcement > process enforcement. If a check can be automated as a script, it should be — AI self-discipline checks have ~70% compliance; scripts have ~100%.
  • Methodology changes require ablation. Before adding, ask: "What can I remove?" Run /methodology-review to evaluate.

Reference

Rules (9 files — methodology core, loaded on stage entry)

| File | Purpose |
|---|---|
| GEMINI.md | Full methodology (repo root — auto-loaded by Antigravity & Gemini CLI) |
| CLAUDE.md | Lean router for Claude Code (~423 tokens, routes to GEMINI.md + .agents/) |
| coding-standards.md | Immutability, file organization, error handling, validation |
| git-discipline.md | Commit format, -F protocol, staging safety rules |
| harden-stage.md | Commit gate, ADR triggers, divergence prevention |
| build-execution.md | Spec-Test-Implement loop, smallest diff |
| spec-writing.md | 4 mandatory sections, granularity, constraint tiers |
| strategy-stage.md | 7 Questions, documentation audit, exit criteria |
| verify-stage.md | Verify gate, evaluator separation |
| workflow-orchestration.md | Decision tree, session-start protocol, skill reference |
| antigravity-integration.md | Antigravity-specific mapping and artifact paths |

Skills (58 — all capabilities, on-demand)

Every capability lives in .agents/skills/<name>/SKILL.md. Slash commands (/research, /tdd, etc.) and auto-invoked skills share the same structure — the agent loads them when the task matches.

Pipeline Skills (25 slash commands)

| Command | Stage | Purpose |
|---|---|---|
| /research | 🔬 DISCOVER | Research before implementation (Adopt > Adapt > Build) |
| /experiment | 🔬 DISCOVER | Data-driven decision making (run experiments, not debates) |
| /clarify | 🔬 DISCOVER | Shape vague ideas into sharp concepts (convergence test exit gate) |
| /strategy | 🎯 STRATEGY | Clarify WHAT and WHY (7 Questions Framework) |
| /audit | 🎯 STRATEGY | Clean existing documentation (target 40-50% reduction) |
| /reverse-spec | 📋 SPEC | Reverse-engineer SC-compatible specs from existing code |
| /spec-gate | 📋 SPEC | Verify spec completeness before coding (16-item checklist) |
| /adversarial-review | 📋 SPEC | Stress-test specs with epistemic pre-scan + adversarial prompt |
| /plan | ⚡ BUILD | Create implementation plan from validated spec |
| /orchestrate | ⚡ BUILD | Multi-step pipeline (feature/bugfix/refactor/security) |
| /tdd | ⚡ BUILD | Test-driven development cycle (includes E2E) |
| /build-fix | ⚡ BUILD | Incremental build error resolution |
| /trace | ⚡ BUILD | Causal debugging with parallel hypothesis testing |
| /verification-loop | 🔍 VERIFY | Full quality check (build, lint, test, security) |
| /code-review | 🔍 VERIFY | Audit code ↔ spec conformance |
| /security-review | 🔍 VERIFY | Full security audit (OWASP, secrets, dependencies) |
| /doc-update | ✅ HARDEN | Documentation ↔ code synchronization |
| /refactor-clean | ✅ HARDEN | Dead code detection and cleanup |
| /learn-eval | ✅ HARDEN | Extract spec patterns for continuous learning |
| /update-codemaps | ✅ HARDEN | Generate token-lean architecture documentation |
| /retrocheck | ✅ HARDEN | Retroactive alignment against current specs and rules |
| /aside | | Ask a question mid-task without losing context |
| /self-debug | | Agent failure recovery and loop detection |
| /methodology-review | | Ablation and optimization of the methodology itself |
| /reload | | Force full reload of all agent configuration |

Auto-Invoked Skills (33 — loaded when context matches)

| Skill | Trigger Context | Key Assets |
|---|---|---|
| ui-ux-pro-max | Frontend design, landing pages, dashboards | 4,800+ records: 85 styles, 161 palettes, 1,924 Google Fonts, 16 stacks |
| performance-optimization | Slow endpoints, bottlenecks | Measure → Identify → Fix → Verify → Guard cycle |
| ci-cd | Deployment, pipelines, quality gates | Pipeline templates, rollback thresholds |
| database | Database operations, migrations, modeling | PostgreSQL-first patterns, migration safety |
| compliance | Personal data, GDPR, privacy | Privacy checklist, data handling rules |
| adk-agent | Google ADK, AgentEngine, VertexAI deployment | Project structure, tool_context, eval patterns |
| mcp-builder | MCP servers, LLM tool integrations | FastMCP/MCP SDK patterns, evaluation criteria |
| mcp-ge | Gemini Enterprise, Agent Designer, Cloud Run | OAuth, StreamableHTTP, identity resolution |
| a2ui | Agent-to-UI (A2UI), GE rich responses | v0.8 component reference, silent failure patterns |
| terraform-gcp | GCP infrastructure as code | Module structure, IAM patterns, state management |
| refactor | Architectural refactoring — extract modules, invert deps, migrate patterns | 4-phase process: Diagnose → Target → Plan → Execute |
| dep-upgrade | Dependency upgrades and library migrations with breaking change impact analysis | 5-phase: Identify → Audit → Strategy → Migrate → Verify |
| test-retrofit | Retroactive test coverage for legacy code with no tests — characterize before locking | 5-phase: Scope → Discover → Distinguish → Write → Guard |
| cross-spec-validator | Multi-spec projects (≥3 specs), cross-spec coherence | 6 checks: singletons, constants, bitfields, reentrancy, coverage, cycles |
| api-design | REST API endpoint design and review | Endpoint checklist, error envelope, versioning |
| sentrux | Architectural structural governance | .sentrux/rules.toml, DSM, quality signal |
| shipping | Deployment, feature flags, rollback plans | Release checklist, monitoring setup |
| gcp | Google Cloud Platform conventions | IAM patterns, service configuration |
| golang | Go conventions and constraints | Formatting, error handling, concurrency, testing |
| python | Python conventions and constraints | Type hints, formatting, testing, Django/FastAPI |
| typescript | TypeScript/JavaScript conventions | Type safety, React patterns, validation |
| java | Java conventions and constraints | Immutability, Spring patterns, modern Java |
| kotlin | Kotlin conventions and constraints | Null safety, coroutines, Compose patterns |
| swift | Swift conventions and constraints | Concurrency, memory, SwiftUI patterns |
| rust | Rust conventions and constraints | Ownership, borrowing, error handling, unsafe |
| cpp | C++ conventions and constraints | RAII, memory safety, modern C++ |
| php | PHP conventions and constraints | PSR-12, strict types, Laravel/Symfony |
| skill-creator | Creating or improving skills | Eval harness, description optimization, packaging |
| docx | Word document generation and editing | validate.py, XML conformance |
| xlsx | Excel spreadsheet generation | recalc.py, formula verification |
| pdf | PDF creation, manipulation, extraction | pypdf-based scripts |
| pdf-reading | PDF inspection and content extraction | Text extraction, page rasterization |
| pptx | PowerPoint generation and editing | pptxgenjs-based patterns |

Templates (7 files)

| Template | Purpose |
|---|---|
| STRATEGIC_BLUEPRINT.md | Strategic blueprint for 🎯 STRATEGY |
| SPEC_GATE_CHECKLIST.md | 16-item spec completeness gate |
| ADR_TEMPLATE.md | Architecture Decision Records |
| RUNBOOK.md | Operational runbook for deployments |
| VERIFICATION_REPORT.md | Verification loop output format |
| TRACE_REPORT.md | Causal debugging trace format |
| sentrux-rules.toml | Sentrux architectural constraint template |

What Gets Installed

your-repo/
├── GEMINI.md                    # Full methodology (Antigravity + Gemini CLI)
├── CLAUDE.md                    # Lean router (Claude Code — ~423 tokens)
├── .claude/
│   └── hooks/
│       └── commit-gate.sh       # Deterministic commit gate (Claude Code)
└── .agents/                     # Shared across ALL agents
    ├── rules/                   # 9 rules (methodology core)
    │   ├── coding-standards.md
    │   ├── git-discipline.md
    │   ├── build-execution.md
    │   └── ...                  # Stage-specific methodology
    ├── skills/                  # 58 skills (slash commands + auto-invoked)
    │   ├── golang/              # Language conventions (on-demand)
    │   ├── python/
    │   ├── typescript/
    │   ├── java/
    │   ├── kotlin/
    │   ├── swift/
    │   ├── rust/
    │   ├── cpp/
    │   ├── php/
    │   ├── security-review/     # Gate enforcement (on-demand)
    │   ├── spec-gate/
    │   └── <skill-name>/
    │       ├── SKILL.md
    │       ├── scripts/         # Executable scripts
    │       └── data/            # Curated databases
    └── templates/               # 7 document templates

Legal

License

This repository is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

You are free to:

  • Share — copy and redistribute the material in any medium or format
  • Adapt — remix, transform, and build upon the material for any purpose, even commercially

Under the following terms:

  • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made.
  • No endorsement — You may not use the names of the original authors in a way that suggests they endorse your use.

Full license text: https://creativecommons.org/licenses/by/4.0/legalcode


Attribution & Credits

This repository builds on the following works. Attribution is provided in compliance with each source license.

Stream Coding Methodology — CC BY 4.0

The core Stream Coding methodology, mantras, stage names, gate definitions, and Golden Rules originate from:

Stream Coding by Francesco Marinoni Moretto
Source: https://github.com/frmoretto/stream-coding
License: CC BY 4.0
Copyright © 2025 Francesco Marinoni Moretto

Changes made in this repository relative to the original:

  • Extended to support multiple AI agents: Antigravity, Claude Code, and Gemini CLI (verified via PoC)
  • Extended the gate system: 6-item Spec Gate → 16-item gate with Language Constraint Gate and Frontend Design Token Gate
  • Added 58 on-demand skills (25 slash-command + 33 auto-invoked) with executable scripts and curated data assets
  • Added deterministic shell hooks (commit gate) for Claude Code
  • Added agentic-specific security vectors (prompt injection via CI output, unauthorized code execution)
  • Added Session-Start Protocol, ADR trigger criteria, API error envelope standard, and Constraint Tiers table
  • Extended rules coverage: GCP, ADK, Terraform, MCP, A2UI, Sentrux, 12 language-specific skill files (Go, Python, TypeScript, Java, Kotlin, Swift, Rust, C++, PHP + 3 domain-specific)
  • Translated all methodology documentation to English (English-only requirement for agentic reproducibility)

Spec Pre-Check Scripts — MIT

The deterministic structural pre-check scripts (spec_precheck.py and related tools) are adapted from:

specverify by Francesco Marinoni Moretto
Source: https://github.com/frmoretto/specverify
License: MIT

UI/UX Pro Max Design Intelligence — MIT

The curated design intelligence database used by the ui-ux-pro-max skill (85 styles, 161 palettes, 1,924 Google Fonts, 16 framework stacks) is sourced from:

UI/UX Pro Max Skill
Source: https://github.com/nextlevelbuilder/ui-ux-pro-max-skill
License: MIT

Adaptation

This repository was assembled and extended by:

Jeremy Garreau (jeremy-garreau)
Adaptation and extensions: Stream Coding for AI coding agents (Antigravity, Claude Code, Gemini CLI)
License: CC BY 4.0


If you use this repository in your own work, you must provide attribution to the original Stream Coding methodology by Francesco Marinoni Moretto as described above.

About

Stream Coding v2.1.0 — Documentation-first development methodology. 58 skills, 4-gate enforcement, multi-agent support (Antigravity, Claude Code, Gemini CLI). When docs are clear enough, code generation becomes automatic.
