docs: add AGENTS.md, STYLEGUIDE.md, agent-assisted contribution (#114) #149
lipikaramaswamy wants to merge 4 commits into main
Local Claude Code session worktrees under .claude/worktrees/ shouldn't be tracked. Matches the convention already in the DataDesigner repo.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…orkflow (#114)
- AGENTS.md: architecture overview, pipeline diagrams, structural invariants
- STYLEGUIDE.md: code conventions ruff and ty cannot enforce
- CLAUDE.md: 3-line redirect to AGENTS.md
- CONTRIBUTING.md: new Agent-Assisted Development subsection establishing the plans/<issue-number>/<short-name>.md convention for non-trivial changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: lipikaramaswamy <lramaswamy@nvidia.com>
Greptile Summary
This PR adds developer-facing documentation —
Confidence Score: 5/5
This PR adds only documentation and a .gitignore entry; no production code is changed, so there is no risk to runtime behavior. All five files are documentation or configuration. The cross-links between AGENTS.md, STYLEGUIDE.md, and CONTRIBUTING.md are internally consistent, the pipeline diagrams match the described module boundaries, and the structural invariants align across all three guides. The .gitignore addition is a single, correct path entry. No files require special attention.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["CLAUDE.md\n(Claude Code entry point)"] -->|"@AGENTS.md"| B["AGENTS.md\nArchitecture + Invariants"]
B -->|cross-link| C["STYLEGUIDE.md\nCode Conventions"]
B -->|cross-link| D["CONTRIBUTING.md\nWorkflow + Agent-Assisted Dev"]
C -->|cross-link| B
C -->|cross-link| D
D -->|cross-link| B
D -->|cross-link| C
D --> E["plans/issue/name.md\nDesign review PRs"]
F[".gitignore"] -->|ignores| G[".claude/worktrees/\nSession worktrees"]
Reviews (5): Last reviewed commit: "docs: clarify TYPE_CHECKING guidance and..."
Use `TYPE_CHECKING` blocks for imports only needed for type hints — prevents circular imports and avoids loading heavy libraries at import time:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import pandas as pd
```

`pandas` is import-time expensive — only import it at the top level where it is actually needed at runtime.
TYPE_CHECKING guidance may mislead agents into wrong pandas placement
The example puts pandas inside a TYPE_CHECKING block, and the very next sentence says "only import it at the top level where it is actually needed at runtime." An agent reading these two adjacent statements could interpret "use TYPE_CHECKING blocks for heavy libraries" as universal advice and wrap every pandas import in TYPE_CHECKING — even in workflow files that use DataFrames at runtime — causing NameError at execution time. The intent is that TYPE_CHECKING is for type-hint-only imports, while runtime pandas usage always goes at the top level. Splitting these into two clearly labelled scenarios (type-hint-only vs. runtime-needed) would prevent the misreading.
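One way the two scenarios could be separated is sketched below. This is an illustration only, not the wording adopted in STYLEGUIDE.md: the standard-library modules `fractions` and `decimal` stand in for a heavy dependency such as pandas so the snippet runs without third-party packages, and the function names `describe`, `to_decimal`, and `broken_half` are hypothetical.

```python
from typing import TYPE_CHECKING

# Scenario 2 (runtime-needed): the name is called at runtime,
# so the import stays at the top level.
from decimal import Decimal

if TYPE_CHECKING:
    # Scenario 1 (type-hint-only): this branch is skipped at runtime,
    # so Fraction is visible to the type checker only.
    from fractions import Fraction


def describe(value: "Fraction") -> str:
    # Fraction appears only in a quoted annotation, so Python never
    # looks the name up when this function is defined or called.
    return f"{value.numerator}/{value.denominator}"


def to_decimal(text: str) -> Decimal:
    # Decimal is constructed here, which is why it needs a real import.
    return Decimal(text)


def broken_half() -> "Fraction":
    # Misuse: Fraction is constructed at runtime but was imported only
    # under TYPE_CHECKING, so calling this function raises NameError.
    return Fraction(1, 2)
```

Calling `broken_half()` fails only when the function body runs, not at import time, which is why this kind of mistake can get past a quick import check.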
Conventions for NeMo Anonymizer that ruff and ty cannot enforce. Read before adding a new module, workflow, or config class.

NeMo Anonymizer wraps [NeMo Data Designer](https://github.com/NVIDIA-NeMo/DataDesigner) (NDD) for LLM column generation. References to NDD below mean that library.
Minor naming inconsistency: AGENTS.md calls the library "DataDesigner" while STYLEGUIDE.md calls it "NeMo Data Designer". Both point to the same URL, but an agent doing a grep for one spelling will miss the other. Aligning to the same display name avoids confusion.
-NeMo Anonymizer wraps [NeMo Data Designer](https://github.com/NVIDIA-NeMo/DataDesigner) (NDD) for LLM column generation. References to NDD below mean that library.
+NeMo Anonymizer wraps [DataDesigner](https://github.com/NVIDIA-NeMo/DataDesigner) (NDD) for LLM column generation. References to NDD below mean that library.
Required by the copyright-check CI step (tools/codestyle/copyright_fixer.py). Matches the HTML-comment header format used by docs/concepts/*.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: lipikaramaswamy <lramaswamy@nvidia.com>
Address greptile-apps review feedback on PR #149:
- Rewrite the TYPE_CHECKING paragraph to explicitly split type-hint-only imports (TYPE_CHECKING block) from runtime use (top-level), and call out the NameError failure mode. Avoids the misread where an agent could wrap all heavy-library imports in TYPE_CHECKING.
- Use "DataDesigner" instead of "NeMo Data Designer" so the display name matches AGENTS.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: lipikaramaswamy <lramaswamy@nvidia.com>
Summary
- `AGENTS.md`: architecture overview, pipeline diagrams, and structural invariants for agents working in the codebase.
- `STYLEGUIDE.md`: code conventions ruff and ty cannot enforce (Pydantic vs dataclass, error handling, column-name constants, prompt construction).
- `CLAUDE.md`: 3-line redirect to `AGENTS.md` so Claude Code picks it up.
- `CONTRIBUTING.md`: new Agent-Assisted Development subsection establishing the `plans/<issue-number>/<short-name>.md` convention for non-trivial changes.
- `.gitignore`: ignores `.claude/worktrees/` (Claude Code session worktrees).

Filed #148 as a follow-up for three pre-existing column-name string-literal violations in `llm_replace_workflow.py:53-55` that the new STYLEGUIDE rule covers.

Closes #114.
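The `plans/<issue-number>/<short-name>.md` convention could be checked mechanically, for example in a pre-commit hook. The sketch below is not part of this PR. It assumes the issue number is purely numeric and the short name is lowercase and hyphen-separated; neither detail is stated in the PR. The helper `parse_plan_path` and the sample file name `agents-docs` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed shape: plans/<issue-number>/<short-name>.md
# - issue number: digits only (assumption)
# - short name: lowercase words joined by hyphens (assumption)
PLAN_PATH = re.compile(r"plans/(?P<issue>\d+)/(?P<name>[a-z0-9]+(?:-[a-z0-9]+)*)\.md")


def parse_plan_path(path: str) -> Optional[Tuple[int, str]]:
    """Return (issue_number, short_name) if the path fits the assumed pattern, else None."""
    match = PLAN_PATH.fullmatch(path)
    if match is None:
        return None
    return int(match["issue"]), match["name"]
```

Under these assumptions, `parse_plan_path("plans/114/agents-docs.md")` yields the issue number and short name, while a path missing the issue directory yields `None`.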
Test plan
- `mkdocs build --strict`
- `git check-ignore .claude/worktrees/foo` returns the path (worktrees properly ignored)