Contributing to NeMo Anonymizer

Thank you for your interest in contributing to NeMo Anonymizer! This document provides guidelines and information for contributors.

Please read our Code of Conduct before contributing.

Getting Started

Prerequisites

  • Python 3.11+
  • Git
  • uv - Python package manager
  • gh - GitHub CLI (optional, for PR workflows)

Note: Dev tools like ruff and ty are installed automatically by uv sync --group dev.

Setup

  1. Fork the repository on GitHub
  2. Clone your fork:
 git clone https://github.com/<your-username>/anonymizer.git
 cd anonymizer
  3. Set up the development environment:
 # Install Python dependencies (includes ruff, ty, pre-commit, pytest)
 make bootstrap           # dev dependencies
 make install-dev-docs    # dev + docs dependencies (needed for make docs-serve)

 # Install pre-commit hooks
 make install-pre-commit
  4. Add the upstream remote:
 git remote add upstream https://github.com/NVIDIA-NeMo/anonymizer.git

Repository Settings

This repository uses GitHub Rulesets to enforce consistent contribution standards. These rules are enforced automatically; you don't need to configure anything, but you should understand them to contribute successfully.

Branch Naming Convention

All branches (except main) must follow this naming pattern:

<author>/<description>
<author>/<issue-id>-<description>
<author>/<type>/<description>
<author>/<type>/<issue-id>-<description>

Rules:

  • <author>: Your GitHub username (lowercase, alphanumeric, hyphens allowed)
  • <issue-id>: Optional GitHub issue number prefix (e.g., 123-)
  • <description>: Brief description (lowercase, alphanumeric, hyphens)
  • <type>: Optional category prefix

Valid types: feature, bugfix, hotfix, release, docs, chore, test

Examples:

| Branch Name | Valid |
|---|---|
| jsmith/add-login-feature | ✅ |
| jsmith/123-add-login-feature | ✅ |
| jsmith/feature/123-add-login | ✅ |
| aagonzales/bugfix/456-fix-crash | ✅ |
| dev-team/docs/update-readme | ✅ |
| feature/add-login | ❌ Missing author |
| JSmith/123-Add-Login | ❌ Must be lowercase |
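
To check a branch name locally before pushing, the pattern above can be approximated with a regular expression. This is a minimal sketch, not the ruleset's actual expression: `BRANCH_RE` and `is_valid_branch` are hypothetical names. It also assumes that an author segment equal to a bare type name (as in `feature/add-login`) is rejected, which is how the "Missing author" example is interpreted here.

```python
import re

# Valid <type> values, taken from the list above.
BRANCH_TYPES = ("feature", "bugfix", "hotfix", "release", "docs", "chore", "test")
_TYPES = "|".join(BRANCH_TYPES)

# Hypothetical approximation of the naming rule; the real ruleset may differ.
BRANCH_RE = re.compile(
    rf"^(?!(?:{_TYPES})/)"    # assumption: author segment is not a bare type name
    r"[a-z0-9][a-z0-9-]*/"    # <author>/
    rf"(?:(?:{_TYPES})/)?"    # optional <type>/
    r"(?:\d+-)?"              # optional <issue-id>-
    r"[a-z0-9][a-z0-9-]*$"    # <description>
)


def is_valid_branch(name: str) -> bool:
    """Return True if the name is `main` or fits the documented patterns."""
    return name == "main" or BRANCH_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("jsmith/feature/123-add-login", "feature/add-login"):
        print(candidate, is_valid_branch(candidate))
```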

Conventional Commits

All commits merged to main must follow the Conventional Commits specification:

<type>(<scope>): <description>

or without scope:

<type>: <description>

Rules:

  • <type>: Required, must be one of the valid types below
  • <scope>: Optional, indicates the area of the codebase affected
  • <description>: Required, brief description (max 100 characters)
  • Add ! after type/scope for breaking changes

Valid types:

| Type | Description |
|---|---|
| feat | New feature |
| fix | Bug fix |
| docs | Documentation changes |
| style | Code style changes (formatting, no logic change) |
| refactor | Code refactoring (no feature or fix) |
| perf | Performance improvements |
| test | Adding or updating tests |
| build | Build system or dependencies |
| ci | CI/CD configuration |
| chore | Maintenance tasks |
| revert | Reverting previous commits |

Examples:

| Commit Message | Valid |
|---|---|
| feat: add user authentication | ✅ |
| fix(auth): resolve token expiration bug | ✅ |
| docs: update API documentation | ✅ |
| chore(deps)!: bump major dependencies | ✅ Breaking change |
| Added new feature | ❌ Missing type |
| fix - resolve bug | ❌ Wrong format |
| FIX: resolve bug | ❌ Type must be lowercase |

Because we squash-merge, your PR title becomes the commit message on main, so it should follow this format.
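
A PR title can be checked against these rules before opening the PR. The sketch below is illustrative only: `COMMIT_RE` and `is_valid_title` are hypothetical names. It assumes the 100-character limit applies to the description part alone and that a scope contains no spaces or parentheses. The enforced ruleset may be stricter or looser.

```python
import re

# Valid commit types, taken from the table above.
COMMIT_TYPES = (
    "feat", "fix", "docs", "style", "refactor", "perf",
    "test", "build", "ci", "chore", "revert",
)
_TYPES = "|".join(COMMIT_TYPES)

# Hypothetical approximation of the Conventional Commits header format.
COMMIT_RE = re.compile(
    rf"^(?P<type>{_TYPES})"          # required lowercase type
    r"(?:\((?P<scope>[^()\s]+)\))?"  # optional (scope)
    r"(?P<breaking>!)?"              # optional ! for breaking changes
    r": (?P<description>\S.*)$"      # colon, space, description
)


def is_valid_title(title: str) -> bool:
    """Return True if the title fits the format and the description length limit."""
    match = COMMIT_RE.fullmatch(title)
    # Assumption: the 100-character limit covers the description only.
    return match is not None and len(match["description"]) <= 100


if __name__ == "__main__":
    for title in ("fix(auth): resolve token expiration bug", "fix - resolve bug"):
        print(repr(title), is_valid_title(title))
```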

Semantic Versioning for Tags

Release tags must follow Semantic Versioning:

MAJOR.MINOR.PATCH[-prerelease][+build]

Examples:

| Tag | Valid |
|---|---|
| 1.0.0 | ✅ |
| 2.1.3 | ✅ |
| 1.0.0-alpha | ✅ |
| 1.0.0-beta.1 | ✅ |
| 1.0.0-rc.1+build.123 | ✅ |
| v1.0.0 | ❌ A v prefix is not allowed |
| release-1.0 | ❌ Wrong format |
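
A tag can be validated with a regular expression before it is pushed. The sketch below is modeled on the pattern suggested by the Semantic Versioning specification. `SEMVER_RE` and `is_valid_tag` are hypothetical names, and the expression enforced by the repository's ruleset is assumed, not confirmed, to be equivalent.

```python
import re

# Modeled on the regular expression suggested by the SemVer 2.0.0 specification.
SEMVER_RE = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"            # MAJOR.MINOR.PATCH
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"        # -prerelease identifier
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"   # further dot-separated identifiers
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"          # +build metadata
)


def is_valid_tag(tag: str) -> bool:
    """Return True if the tag is a SemVer string with no `v` prefix."""
    return SEMVER_RE.fullmatch(tag) is not None


if __name__ == "__main__":
    for tag in ("1.0.0-rc.1+build.123", "v1.0.0"):
        print(tag, is_valid_tag(tag))
```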

Branch Protection

The main branch has the following protections:

| Rule | Setting |
|---|---|
| Required approvals | 1 |
| Code owner review | Required |
| Dismiss stale reviews | No |
| Require conversation resolution | Yes |
| Linear history | Required |
| Force pushes | Blocked |
| Deletions | Blocked |
| Merge strategy | Squash only |

Pull Request Process

  1. Create an issue first (if one doesn't exist) to discuss the change
  2. Create a branch following the naming convention:
 git checkout -b <username>/<issue-id>-<description>
  3. Make your changes and commit using conventional commits
  4. Run tests locally:
 make test
  5. Push your branch:
 git push origin <your-branch>
  6. Open a Pull Request using the PR template
  7. Address review feedback: reviewers from CODEOWNERS will be automatically assigned
  8. Merge: once approved, your PR will be squash-merged and the branch auto-deleted
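
The branch name in step 2 can be derived from an issue number and title. The helper below is a hypothetical convenience, not part of the repository's tooling. It assumes a simple slug rule: lowercase, with runs of other characters collapsed to single hyphens.

```python
import re
from typing import Optional


def _slug(text: str) -> str:
    """Lowercase the text and collapse non-alphanumeric runs into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def make_branch_name(author: str, description: str, issue_id: Optional[int] = None) -> str:
    """Build `<author>/<issue-id>-<description>`; the issue id is optional."""
    tail = _slug(description)
    if issue_id is not None:
        tail = f"{issue_id}-{tail}"
    return f"{_slug(author)}/{tail}"


if __name__ == "__main__":
    # The result can be passed to `git checkout -b`.
    print(make_branch_name("jsmith", "Fix crash on empty input", issue_id=456))
```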

CODEOWNERS

  • All src and tests files: @NVIDIA-NeMo/anonymizer-reviewers
  • All remaining files (pyproject.toml, uv.lock, SECURITY.md, LICENSE, .github/, etc.): @NVIDIA-NeMo/anonymizer-maintainers

Issues and Discussions

Issue Templates

We provide structured issue templates:

  • Bug Report: Report a bug with reproduction steps
  • Feature Request: Propose a new feature
  • Development Task: Track internal development work

Questions

For general questions, please use GitHub Discussions instead of opening an issue.

Developer Certificate of Origin

All contributions must be signed off to certify that you have the right to submit the code. This is done by adding a Signed-off-by line to your commit messages.

Sign off your commits:

git commit -s -m "feat: add new feature"

This adds a line like:

Signed-off-by: Your Name <your.email@example.com>

By signing off, you certify the Developer Certificate of Origin:

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or

(b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications...

See the full DCO file for details.
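
To confirm that a commit message carries the sign-off line before pushing, a small check like the following can help. It is a sketch under stated assumptions: `SIGNOFF_RE` and `has_signoff` are hypothetical names, and the pattern only checks the general `Name <email>` shape. It does not reproduce whatever validation the project's DCO check performs.

```python
import re

# Matches a trailer line such as:
#   Signed-off-by: Your Name <your.email@example.com>
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: \S.* <[^<>\s]+@[^<>\s]+>$",
    re.MULTILINE,  # let ^ and $ match at each line of the message
)


def has_signoff(message: str) -> bool:
    """Return True if any line of the commit message is a Signed-off-by trailer."""
    return SIGNOFF_RE.search(message) is not None


if __name__ == "__main__":
    signed = "feat: add new feature\n\nSigned-off-by: Your Name <your.email@example.com>"
    print(has_signoff(signed), has_signoff("feat: add new feature"))
```

The message of the most recent commit can be obtained with `git log -1 --format=%B` and passed to `has_signoff`.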

Testing

Running Tests

# Run unit tests
make test

# Run tests with coverage report
make coverage

# Run end-to-end tests
make test-e2e

# Run a specific test file
uv run pytest tests/engine/test_detection_workflow.py

Test Requirements

Before submitting a PR:

  • All existing tests pass (make test)
  • New features include tests
  • Bug fixes include regression tests

Code Style

Formatting and Checks

We use Ruff for code formatting and import sorting, and ty for type checking. Both run only on files that have changed relative to main.

# Format code and sort imports (auto-fixes in place)
make format

# Check formatting without modifying files (used in CI)
make format-check

# Type check with ty (advisory, non-blocking)
make typecheck

# Run all read-only checks (format-check + typecheck + lock-check)
make check

Pre-commit Hooks

We recommend setting up pre-commit hooks to catch formatting, linting, and type issues before committing:

make install-pre-commit

This installs hooks that run Ruff (format + lint), ty type checking, and uv lock verification on each commit.

Note: If pyproject.toml changes and uv.lock is stale, the uv-lock hook regenerates uv.lock and then fails the commit. Run git add uv.lock and retry.

Documentation

# Start local docs server with live-reload
make install-dev-docs    # first time only
make docs-serve          # visit http://127.0.0.1:8000

# Build docs locally (strict mode catches broken links)
make docs-build

Thank you for contributing to NeMo Anonymizer!