Add autonomous-code-improvement skill #105

Open
telivity-otaip wants to merge 5 commits into addyosmani:main from telivity-otaip:feat/autonomous-code-improvement

@telivity-otaip commented Apr 25, 2026

Add autonomous-code-improvement skill

What this adds

A new Markdown-only skill at skills/autonomous-code-improvement/SKILL.md that documents the discipline for running an unattended code-improvement loop — scan, isolate, execute, multi-persona self-review, adversarial gate, domain guard, PR — with cost ceilings throughout.

No code, no executables, no tests in this PR. The skill is a pure prescription that fits the existing skill anatomy (Overview / When to Use / Process / Common Rationalizations / Red Flags / Verification).
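For orientation only (this sketch is not part of the PR, which stays Markdown-only), the control flow of the loop described above might look roughly like the following TypeScript. The stage names mirror the list above; `runPipeline`, `Stage`, `StageResult`, and the per-stage `costUsd` accounting are hypothetical names chosen for illustration and are not taken from the skill or from the reference implementation.

```typescript
// Hypothetical sketch of the loop's control flow; stage bodies are stubs
// supplied by the caller.
type StageName =
  | "scan"
  | "isolate"
  | "execute"
  | "self-review"
  | "adversarial-gate"
  | "domain-guard"
  | "open-pr";

interface StageResult {
  ok: boolean; // false halts the pipeline
  costUsd: number; // spend attributed to this stage (assumed unit)
  note?: string; // optional explanation when ok is false
}

interface Stage {
  name: StageName;
  run: () => StageResult;
}

interface PipelineOutcome {
  completed: StageName[];
  haltedAt?: StageName;
  reason?: string;
  totalCostUsd: number;
}

function runPipeline(stages: Stage[], costCeilingUsd: number): PipelineOutcome {
  const completed: StageName[] = [];
  let totalCostUsd = 0;
  for (const stage of stages) {
    const result = stage.run();
    totalCostUsd += result.costUsd;
    // The ceiling is checked after every stage, not only at the end.
    if (totalCostUsd > costCeilingUsd) {
      return { completed, haltedAt: stage.name, reason: "cost ceiling exceeded", totalCostUsd };
    }
    if (!result.ok) {
      return { completed, haltedAt: stage.name, reason: result.note ?? "stage failed", totalCostUsd };
    }
    completed.push(stage.name);
  }
  return { completed, totalCostUsd };
}
```

The point of the shape is that a failed gate or an exceeded ceiling stops the run before the PR stage is ever reached.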

Why it belongs here

Autonomous coding agents are now common (Cursor's background agents, Copilot Workspace, Devin, etc.), but their failure modes are consistent and well understood:

  1. They guess when they hit domain-specific code rather than asking
  2. They have no cost controls — a runaway loop is a $200 surprise
  3. They let the writing model approve its own work — the writer's review is theater

The existing skills in this repo address how a human-in-the-loop agent should work (TDD, incremental implementation, code review, debugging). This skill addresses what changes when the human is no longer in the loop on every diff. It complements rather than overlaps:

  • It cites and depends on code-review-and-quality, security-and-hardening, and test-driven-development (the three persona prompts pull from those)
  • It cites git-workflow-and-versioning for worktree isolation
  • It introduces concepts that don't appear elsewhere: domain-question markers, adversarial gates, cost checkpoints
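As a rough illustration of the domain-question idea, a guard could scan the agent's output for markers and refuse to proceed while any remain. This PR text does not spell out the marker syntax, so the `DOMAIN_QUESTION: <text>` form below is an assumption, as are the names `findDomainQuestions` and `domainGuard` and the fare-calculation example.

```typescript
interface DomainQuestion {
  line: number; // 1-based line number within the scanned text
  question: string;
}

// Assumed marker syntax: "DOMAIN_QUESTION: <text>" anywhere on a line.
const MARKER = /DOMAIN_QUESTION:\s*(.+)$/;

// Collect every marker so a human can answer them in one pass.
function findDomainQuestions(text: string): DomainQuestion[] {
  const found: DomainQuestion[] = [];
  const lines = text.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const match = MARKER.exec(lines[i]);
    if (match) {
      found.push({ line: i + 1, question: match[1].trim() });
    }
  }
  return found;
}

// The guard passes only when the agent left no open domain questions.
function domainGuard(text: string): { pass: boolean; questions: DomainQuestion[] } {
  const questions = findDomainQuestions(text);
  return { pass: questions.length === 0, questions };
}
```

Under these assumptions, a patch containing `// DOMAIN_QUESTION: Is the fuel surcharge taxed?` would fail the guard and surface the question instead of letting the agent guess.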

What it doesn't do

  • No new package, no executable code. This PR is one Markdown file, matching every other skill in the repo.
  • No changes to top-level README.md, plugin.json, or any existing skill. The skill plugs in by file presence alone — the discovery model already supports new skills via the standard convention.
  • No build/test infrastructure added to the repo. The reference implementation lives in a separate repo (linked in the SKILL).

Reference implementation

A working four-package TypeScript monorepo that implements every node of the pipeline ships separately as asil-monorepo (Autonomous Software Improvement Loop). 278 vitest tests, MIT-licensed, extracted from a production system in use today on an 80+ agent travel AI platform. Linked from the bottom of the SKILL.

This PR doesn't depend on the reference repo existing — the skill stands on its own as a process description. The link is for readers who want a turnkey starting point.

Scope check

Per CLAUDE.md's conventions:

  • ✅ Skill lives at skills/autonomous-code-improvement/SKILL.md
  • ✅ YAML frontmatter with name and description
  • ✅ Description starts with what the skill does, then "Use when…"
  • ✅ Sections: Overview, When to Use, Process, Common Rationalizations, Red Flags, Verification
  • ✅ No supporting files (content fits in one SKILL.md, well under the 100-line threshold for splitting)
  • ✅ No duplication with existing skills — references them instead

Credit

Skill authored by Dušan Milicevic (Telivity) — extracted from a production autonomous system that has been running unattended for months and opening PRs daily.

Dušan Milicevic added 2 commits April 25, 2026 16:44
A Markdown-only skill documenting the discipline for running an
unattended code-improvement loop: scan → isolate → execute →
multi-persona self-review → adversarial gate → domain guard → PR,
with cost checkpoints throughout.

Complements rather than overlaps the existing skills — pulls from
code-review-and-quality, security-and-hardening, test-driven-development,
and git-workflow-and-versioning. Introduces concepts that don't appear
elsewhere: domain-question markers, adversarial gates, cost ceilings.

Pure Markdown — no code, no executables, no tests. Matches the existing
skill anatomy (Overview, When to Use, Process, Common Rationalizations,
Red Flags, Verification).

A working TypeScript reference implementation is linked from the SKILL
but is not a dependency.

The two SKILL.md links pointed to a placeholder GitHub user; replaced
with the canonical telivity-otaip/asil URL.

@addyosmani (Owner) left a comment

Thank you! I'll definitely consider this.

This is a useful skill in a real niche, but three things before I can merge:

  1. Drop PR_DESCRIPTION.md from the diff - paste it into the PR description on GitHub instead, that's where it belongs.

  2. Step 6 (Self-review) should compose with the existing agents/code-reviewer.md, agents/security-auditor.md, and agents/test-engineer.md personas rather than listing them inline. Either delegate to /ship's fan-out pattern or explicitly explain how the autonomous-loop variant differs (diff-only scoping, no human merge step, etc.). Same goes for Step 3 - point at git-workflow-and-versioning for the worktree discipline.

  3. The https://github.com/telivity-otaip/asil link is marked '(pending publish)'. Please either publish it before merge or remove the reference-implementation section. Shipping a skill with a dead reference link is worse than shipping it with no reference at all.

Keep the DOMAIN_QUESTION marker design and the (taskId, systemId, agentId) cost-checkpoint primitive - both are genuinely new contributions.
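To make the cost-checkpoint primitive concrete for readers of this thread, here is a minimal sketch of spend tracked per (taskId, systemId, agentId) key. Only the three-part key comes from the discussion above; the `CostLedger` class, the single per-key ceiling, and the USD unit are assumptions, not the skill's or the reference implementation's API.

```typescript
interface CheckpointKey {
  taskId: string;
  systemId: string;
  agentId: string;
}

interface CheckpointResult {
  allowed: boolean; // false means this key has exceeded its ceiling
  spentUsd: number; // running total for this key
}

// Hypothetical ledger: one running total per (taskId, systemId, agentId).
class CostLedger {
  private readonly ceilingUsd: number;
  private spentUsd: Record<string, number> = {};

  constructor(ceilingUsd: number) {
    this.ceilingUsd = ceilingUsd;
  }

  // JSON encoding avoids collisions that a plain delimiter join could cause.
  private static keyOf(key: CheckpointKey): string {
    return JSON.stringify([key.taskId, key.systemId, key.agentId]);
  }

  // Record spend, then report whether this key may keep running.
  checkpoint(key: CheckpointKey, costUsd: number): CheckpointResult {
    const id = CostLedger.keyOf(key);
    const spentUsd = (this.spentUsd[id] ?? 0) + costUsd;
    this.spentUsd[id] = spentUsd;
    return { allowed: spentUsd <= this.ceilingUsd, spentUsd };
  }
}
```

Keying on all three fields means a runaway reviewer agent is stopped without also charging its spend against the writer agent working on the same task.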

I'll otherwise consider the value prop a little further but don't let that block you from the above.

@telivity-otaip (Author) commented

All three addressed:

  1. PR_DESCRIPTION.md removed from diff, content moved to the PR body.
  2. Step 3 now references skills/git-workflow-and-versioning; Step 6 now references agents/code-reviewer.md, agents/security-auditor.md, agents/test-engineer.md with the autonomous-loop deltas called out (diff-only scoping, no human merge step). Kept the DOMAIN_QUESTION marker and the (taskId, systemId, agentId) checkpoint as you suggested.
  3. Repo is live — dropped the (pending publish) note. Link: https://github.com/telivity-otaip/asil
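The two deltas named in item 2 can be sketched as follows: each persona receives only the diff, and because no human merge step follows, the gate requires every persona to approve. This is an illustrative sketch, not the skill's wording; `selfReview`, `Reviewer`, and `Verdict` are hypothetical names, the unanimous-approval rule is an assumption, and real reviewers would be asynchronous model calls rather than synchronous functions.

```typescript
type Persona = "code-reviewer" | "security-auditor" | "test-engineer";

interface Verdict {
  persona: Persona;
  approve: boolean;
  findings: string[];
}

// Diff-only scoping: a reviewer sees the diff text and nothing else.
type Reviewer = (diff: string) => Verdict;

function selfReview(diff: string, reviewers: Reviewer[]): { approved: boolean; verdicts: Verdict[] } {
  const verdicts = reviewers.map((review) => review(diff));
  // Assumed rule: with no human merge step downstream, any single
  // rejection blocks the PR, and an empty reviewer list never approves.
  const approved = verdicts.length > 0 && verdicts.every((v) => v.approve);
  return { approved, verdicts };
}
```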

@telivity-otaip requested a review from addyosmani May 2, 2026 05:50

Adds Red Flag, Common Rationalization, and Verification entries based
on 6/10 destructive-replacement diffs observed in a sandbox run on a
real codebase. Implementation guards landed in
telivity-otaip/asil#3.

@telivity-otaip (Author) commented

Ran the reference implementation on a real pnpm monorepo (HAIP) and found two safety bugs in the executor and scanner. Fixed in telivity-otaip/asil#3 (now merged). Used the findings to strengthen this skill — added an empty-content failure mode to Red Flags, Common Rationalizations, and Verification. The skill now warns explicitly about the destructive-replacement pattern that surfaced in 6/10 transcripts of the run.
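One way to picture a check against the destructive-replacement pattern is a guard that runs before a file is overwritten. The sketch below is not the fix from telivity-otaip/asil#3, whose details are not described in this thread; `checkReplacement` and the `minKeepRatio` threshold of 0.2 are hypothetical and chosen only to show the shape of the check.

```typescript
interface ReplacementCheck {
  safe: boolean;
  reason?: string; // set only when safe is false
}

// Reject writes that would blank out or drastically shrink an existing file.
function checkReplacement(original: string, proposed: string, minKeepRatio = 0.2): ReplacementCheck {
  if (original.trim().length === 0) {
    return { safe: true }; // nothing to destroy
  }
  if (proposed.trim().length === 0) {
    return { safe: false, reason: "proposed content is empty" };
  }
  // Assumed heuristic: compare raw lengths; a real guard might compare lines or symbols.
  const ratio = proposed.length / original.length;
  if (ratio < minKeepRatio) {
    return { safe: false, reason: "proposed content is far shorter than the original" };
  }
  return { safe: true };
}
```

A length ratio is a blunt heuristic and will flag legitimate large deletions, so in an unattended loop a failed check is probably better treated as "stop and escalate" than as proof of a bad diff.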
