Add autonomous-code-improvement skill #105
telivity-otaip wants to merge 5 commits into addyosmani:main
A Markdown-only skill documenting the discipline for running an unattended code-improvement loop: scan → isolate → execute → multi-persona self-review → adversarial gate → domain guard → PR, with cost checkpoints throughout. Complements rather than overlaps the existing skills — pulls from code-review-and-quality, security-and-hardening, test-driven-development, and git-workflow-and-versioning. Introduces concepts that don't appear elsewhere: domain-question markers, adversarial gates, cost ceilings. Pure Markdown — no code, no executables, no tests. Matches the existing skill anatomy (Overview, When to Use, Process, Common Rationalizations, Red Flags, Verification). A working TypeScript reference implementation is linked from the SKILL but is not a dependency.
The two SKILL.md links pointed to a placeholder GitHub user; replaced with the canonical telivity-otaip/asil URL.
addyosmani left a comment
Thank you! I'll definitely consider this.
This is a useful skill in a real niche, but three things before I can merge:
- Drop PR_DESCRIPTION.md from the diff - paste it into the PR description on GitHub instead, that's where it belongs.
- Step 6 (Self-review) should compose with the existing agents/code-reviewer.md, agents/security-auditor.md, and agents/test-engineer.md personas rather than listing them inline. Either delegate to /ship's fan-out pattern or explicitly explain how the autonomous-loop variant differs (diff-only scoping, no human merge step, etc.). Same goes for Step 3 - point at git-workflow-and-versioning for the worktree discipline.
- The https://github.com/telivity-otaip/asil link is marked '(pending publish)'. Please either publish it before merge or remove the reference-implementation section. Shipping a skill with a dead reference link is worse than shipping it with no reference at all.
Keep the DOMAIN_QUESTION marker design and the (taskId, systemId, agentId) cost-checkpoint primitive - both are genuinely new contributions.
I'll otherwise consider the value prop a little further but don't let that block you from the above.
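For readers unfamiliar with the (taskId, systemId, agentId) cost-checkpoint primitive mentioned in the review, here is a minimal sketch of what such a checkpoint could look like. It is not taken from the skill or from the asil repository: the `CostLedger` class, its method names, and the choice to track spend in integer cents are all assumptions made for illustration.

```typescript
// Hypothetical sketch: none of these names come from the skill or the asil repo.
type CheckpointKey = { taskId: string; systemId: string; agentId: string };

type CheckpointResult = { totalCents: number; withinCeiling: boolean };

class CostLedger {
  // Running spend per (taskId, systemId, agentId), in integer cents.
  private readonly spent = new Map<string, number>();
  private readonly ceilingCents: number;

  constructor(ceilingCents: number) {
    this.ceilingCents = ceilingCents;
  }

  private static keyOf(k: CheckpointKey): string {
    // JSON-encode the tuple so ids that contain separators cannot collide.
    return JSON.stringify([k.taskId, k.systemId, k.agentId]);
  }

  /** Record a spend and report whether this key is still under its ceiling. */
  checkpoint(key: CheckpointKey, costCents: number): CheckpointResult {
    const id = CostLedger.keyOf(key);
    const totalCents = (this.spent.get(id) ?? 0) + costCents;
    this.spent.set(id, totalCents);
    return { totalCents, withinCeiling: totalCents <= this.ceilingCents };
  }
}
```

A caller would invoke `checkpoint` after each model call and stop the task as soon as `withinCeiling` is false. Keying on the full tuple means one runaway agent exhausts only its own budget.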
All three addressed:
Adds Red Flag, Common Rationalization, and Verification entries based on 6/10 destructive-replacement diffs observed in a sandbox run on a real codebase. Implementation guards landed in telivity-otaip/asil#3.
Ran the reference implementation on a real pnpm monorepo (HAIP) and found two safety bugs in the executor and scanner. Fixed in telivity-otaip/asil#3 (now merged). Used the findings to strengthen this skill: added an empty-content failure mode to Red Flags, Common Rationalizations, and Verification. The skill now warns explicitly about the destructive-replacement pattern that surfaced in 6/10 transcripts of the run.
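To make the destructive-replacement failure mode concrete, a guard along the following lines could reject an edit before it is written to disk. This is an illustrative sketch only, not the guard that landed in telivity-otaip/asil#3: the `ProposedEdit` shape, the function names, and the 50% retained-lines threshold are all assumptions.

```typescript
// Hypothetical guard: the names and the 0.5 threshold are assumptions for illustration.
interface ProposedEdit {
  path: string;
  before: string; // file content before the agent's edit
  after: string;  // file content the agent wants to write
}

type GuardResult = { ok: true } | { ok: false; reason: string };

// Count non-blank lines so whitespace-only output does not look like content.
function countNonBlankLines(text: string): number {
  return text.split("\n").filter((line) => line.trim() !== "").length;
}

function guardDestructiveReplacement(
  edit: ProposedEdit,
  minRetainedRatio = 0.5,
): GuardResult {
  const beforeLines = countNonBlankLines(edit.before);
  const afterLines = countNonBlankLines(edit.after);

  // Empty-content failure mode: a non-empty file replaced with nothing.
  if (beforeLines > 0 && afterLines === 0) {
    return { ok: false, reason: `${edit.path}: replacement content is empty` };
  }
  // Destructive replacement: most of the file vanished in a single edit.
  if (beforeLines > 0 && afterLines / beforeLines < minRetainedRatio) {
    return {
      ok: false,
      reason: `${edit.path}: shrinks from ${beforeLines} to ${afterLines} non-blank lines`,
    };
  }
  return { ok: true };
}
```

A check like this also flags legitimate large deletions, so an unattended loop would probably escalate a rejected edit for human review rather than silently drop the task.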
Add autonomous-code-improvement skill
What this adds
A new Markdown-only skill at skills/autonomous-code-improvement/SKILL.md that documents the discipline for running an unattended code-improvement loop (scan, isolate, execute, multi-persona self-review, adversarial gate, domain guard, PR) with cost ceilings throughout.
No code, no executables, no tests in this PR. The skill is a pure prescription that fits the existing skill anatomy (Overview / When to Use / Process / Common Rationalizations / Red Flags / Verification).
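Purely as an illustration of the stage ordering described above (and not part of the skill itself), the loop can be pictured as a fixed sequence in which any failing stage, or an exceeded cost ceiling, stops the run before a PR is opened. The stage names follow the description; the `runLoop` function, the handler shape, and the `budgetExceeded` callback are hypothetical.

```typescript
// Stage names follow the PR description; everything else here is hypothetical.
const STAGES = [
  "scan",
  "isolate",
  "execute",
  "self-review",
  "adversarial-gate",
  "domain-guard",
  "pr",
] as const;
type Stage = (typeof STAGES)[number];

type StageOutcome = { pass: boolean; note?: string };
type StageHandlers = Record<Stage, () => StageOutcome>;

interface LoopResult {
  completed: Stage[];
  stoppedAt: Stage | null; // null means every stage ran, including "pr"
  note: string | null;
}

function runLoop(
  handlers: StageHandlers,
  budgetExceeded: () => boolean = () => false,
): LoopResult {
  const completed: Stage[] = [];
  for (const stage of STAGES) {
    // Cost checkpoint before every stage, so spend is bounded throughout.
    if (budgetExceeded()) {
      return { completed, stoppedAt: stage, note: "cost ceiling reached" };
    }
    const outcome = handlers[stage]();
    if (!outcome.pass) {
      // A failed gate ends the run; later stages, including "pr", never execute.
      return { completed, stoppedAt: stage, note: outcome.note ?? null };
    }
    completed.push(stage);
  }
  return { completed, stoppedAt: null, note: null };
}
```

Because the stages run strictly in order, the PR step is reachable only after every review and guard stage has passed.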
Why it belongs here
Autonomous coding agents are now common (Cursor's background agents, Copilot Workspace, Devin, etc.) but their failure modes are consistent and well-understood:
The existing skills in this repo address how a human-in-the-loop agent should work (TDD, incremental implementation, code review, debugging). This skill addresses what changes when the human is no longer in the loop on every diff. It complements rather than overlaps:
- code-review-and-quality, security-and-hardening, and test-driven-development (the three persona prompts pull from those)
- git-workflow-and-versioning for worktree isolation
What it doesn't do
Reference implementation
A working four-package TypeScript monorepo that implements every node of the pipeline ships separately as asil-monorepo (Autonomous Software Improvement Loop). 278 vitest tests, MIT-licensed, extracted from a production system in use today on an 80+ agent travel AI platform. Linked from the bottom of the SKILL.
This PR doesn't depend on the reference repo existing; the skill stands on its own as a process description. The link is for readers who want a turnkey starting point.
Scope check
Per CLAUDE.md's conventions:
- skills/autonomous-code-improvement/SKILL.md
- name and description
Credit
Skill authored by Dušan Milicevic (Telivity) — extracted from a production autonomous system that has been running unattended for months and opening PRs daily.