fix(docker): pre-cache snowflake embedding model in jailbreak Dockerfile #1848
lokeshrangineni wants to merge 2 commits into NVIDIA-NeMo:develop
Conversation
The Snowflake/snowflake-arctic-embed-m-long embedding model was only downloaded at container startup, requiring internet access at runtime and triggering a trust_remote_code prompt the user never explicitly opted into. Add a RUN step to both Dockerfile and Dockerfile-GPU that instantiates SnowflakeEmbed() at image build time, which calls the same from_pretrained logic (including trust_remote_code, add_pooling_layer, safe_serialization) as at runtime. This ensures the HuggingFace cache is warm when the container starts. Also extract the model name into a module-level constant SNOWFLAKE_EMBED_MODEL to avoid duplication within models.py. This mirrors the existing pattern already in place for GPT2. Fixes NVIDIA-NeMo#1648
Greptile Summary
This PR pre-caches the
| Filename | Overview |
|---|---|
| nemoguardrails/library/jailbreak_detection/Dockerfile | Adds HF_HOME/TRANSFORMERS_CACHE env vars and a Snowflake model pre-cache step; the pre-cache command imports a constant from the not-yet-released PyPI package, causing an ImportError at build time. |
| nemoguardrails/library/jailbreak_detection/Dockerfile-GPU | GPU variant with the same pre-cache step and the same ImportError risk as the CPU Dockerfile. |
| nemoguardrails/library/jailbreak_detection/model_based/models.py | Extracts SNOWFLAKE_EMBED_MODEL and SNOWFLAKE_EMBED_REVISION module-level constants and pins the model to a specific commit SHA; clean, no issues. |
| CHANGELOG.md | Adds an Unreleased section with a bug-fix entry for the Snowflake pre-cache; no issues. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[docker build] --> B[pip install requirements.txt\nnemoguardrails>=0.14.0 from PyPI]
B --> C[COPY . . to /app]
C --> D[Pre-cache GPT2\nhardcoded model name]
D --> E[Pre-cache Snowflake\nimport SNOWFLAKE_EMBED_REVISION\nfrom installed nemoguardrails]
E --> F{Constant exists\nin installed pkg?}
F -- Yes - after next PyPI release --> G[AutoTokenizer.from_pretrained\nrevision pinned]
F -- No - current 0.21.0 --> H[ImportError\nBuild fails]
G --> I[AutoModel.from_pretrained\ntrust_remote_code=True\nrevision pinned]
I --> J[Weights cached to\n/models/hf-cache/hub/]
J --> K[Runtime: offline load\nfrom HF_HOME cache]
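The last two flowchart steps (weights cached under `/models/hf-cache/hub/`, then an offline load at runtime) can be checked with a small script. The following is a minimal sketch, assuming the usual Hugging Face hub cache layout of `hub/models--<org>--<name>/snapshots/<revision>`; the helper names are hypothetical and not part of `nemoguardrails`, and the default path handling is simplified (it ignores overrides such as `HF_HUB_CACHE`).

```python
import os
from pathlib import Path


def hub_cache_dir(env=None) -> Path:
    """Return the hub cache directory derived from HF_HOME (simplified)."""
    env = os.environ if env is None else env
    default_home = Path.home() / ".cache" / "huggingface"
    return Path(env.get("HF_HOME", str(default_home))) / "hub"


def snapshot_dir(model_id: str, revision: str, env=None) -> Path:
    """Build the expected snapshot path for a model pinned to a commit SHA."""
    repo_folder = "models--" + model_id.replace("/", "--")
    return hub_cache_dir(env) / repo_folder / "snapshots" / revision


def is_precached(model_id: str, revision: str, env=None) -> bool:
    """True if the snapshot directory exists and contains at least one entry."""
    path = snapshot_dir(model_id, revision, env)
    return path.is_dir() and any(path.iterdir())
```

Run inside the built image with `HF_HOME=/models/hf-cache`, a call such as `is_precached("Snowflake/snowflake-arctic-embed-m-long", "92d97331f1f4b6a366c1f161354b9f3390cc219f")` would indicate whether the build-time download landed where the runtime expects it.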
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
nemoguardrails/library/jailbreak_detection/Dockerfile:37-41
The pre-cache step imports `SNOWFLAKE_EMBED_REVISION` from the installed `nemoguardrails` PyPI package. `COPY . .` copies only the `jailbreak_detection` directory contents (not the full nemoguardrails package), so even though `python -c` puts the working directory on `sys.path`, `/app` contains no `nemoguardrails` package and the import resolves against the pip-installed one. The constant `SNOWFLAKE_EMBED_REVISION` doesn't exist in any released version of `nemoguardrails` yet, so `pip install nemoguardrails>=0.14.0` (currently resolving to 0.21.0) causes an `ImportError` and the Docker build fails. The GPT2 step one line above avoids this fragility by hardcoding the model name directly. The same pattern should be used here.
```suggestion
RUN python -c "\
from transformers import AutoModel, AutoTokenizer; \
AutoTokenizer.from_pretrained('Snowflake/snowflake-arctic-embed-m-long', revision='92d97331f1f4b6a366c1f161354b9f3390cc219f'); \
AutoModel.from_pretrained('Snowflake/snowflake-arctic-embed-m-long', revision='92d97331f1f4b6a366c1f161354b9f3390cc219f', trust_remote_code=True, add_pooling_layer=False, safe_serialization=True)"
```
### Issue 2 of 2
nemoguardrails/library/jailbreak_detection/Dockerfile-GPU:37-41
Same `ImportError` as the CPU Dockerfile: the import of `SNOWFLAKE_EMBED_REVISION` from the installed `nemoguardrails` PyPI package will fail until a new version containing this constant is released. Hardcoding the values matches the GPT2 pattern and removes the dependency on a specific package version.
```suggestion
RUN python -c "\
from transformers import AutoModel, AutoTokenizer; \
AutoTokenizer.from_pretrained('Snowflake/snowflake-arctic-embed-m-long', revision='92d97331f1f4b6a366c1f161354b9f3390cc219f'); \
AutoModel.from_pretrained('Snowflake/snowflake-arctic-embed-m-long', revision='92d97331f1f4b6a366c1f161354b9f3390cc219f', trust_remote_code=True, add_pooling_layer=False, safe_serialization=True)"
```
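Both issues come down to importing a name that the installed package may not define yet. As an illustration only (the review's actual recommendation is to hardcode the values), the failure mode and a defensive lookup can be sketched as follows; `resolve_constant` is a hypothetical helper, not an existing `nemoguardrails` function.

```python
import importlib


def resolve_constant(module_name: str, attr: str, default):
    """Return module.attr if it can be resolved, else the supplied default.

    A plain `from module import attr` raises ImportError when either the
    module or the attribute is missing; this helper turns both cases into
    a fallback instead of a failed build step.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return default
    return getattr(module, attr, default)
```

A pre-cache step could then call `resolve_constant("nemoguardrails.library.jailbreak_detection.model_based.models", "SNOWFLAKE_EMBED_REVISION", "92d97331f1f4b6a366c1f161354b9f3390cc219f")`. The trade-off is that the fallback value is still duplicated in the Dockerfile, which is why hardcoding, as the GPT2 step does, is arguably the simpler choice.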
Reviews (2): Last reviewed commit: "refactor(docker): address CodeRabbit rev..." | Re-trigger Greptile
📝 Walkthrough
This PR addresses a bug where the Snowflake embedding model was downloaded at container runtime, requiring internet access at startup. The fix pre-caches the model during Docker build time, updates code to use a centralized model identifier constant, and documents the change.
Changes: Snowflake Embedding Model Pre-caching
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 6 passed
Actionable comments posted: 1
🧹 Nitpick comments (1)
nemoguardrails/library/jailbreak_detection/Dockerfile (1)
29-31: ⚡ Quick win: Consider setting explicit Hugging Face cache paths for clarity and future robustness.
Lines 29–31 pre-cache models without explicitly setting `HF_HOME` or `TRANSFORMERS_CACHE`. While the current Dockerfile runs as root (matching the build-time UID), explicitly configuring these paths is a best practice for maintainability. If a non-root user is later added or the runtime UID changes, the cache paths will be preserved. Consider adding environment variables early in the build: `ENV HF_HOME=/models/hf-cache` and `ENV TRANSFORMERS_CACHE=/models/hf-cache`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@nemoguardrails/library/jailbreak_detection/Dockerfile` around lines 29 - 31, Add explicit Hugging Face cache environment variables in the Dockerfile before pre-downloading the Snowflake embedding model: set HF_HOME and TRANSFORMERS_CACHE (e.g., to /models/hf-cache) early in the Dockerfile so the RUN python -c "from nemoguardrails.library.jailbreak_detection.model_based.models import SnowflakeEmbed; SnowflakeEmbed()" invocation uses those stable cache paths; ensure the directories are created and permissions set appropriately for the build/runtime user to avoid permission issues when SnowflakeEmbed() populates the cache.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@nemoguardrails/library/jailbreak_detection/model_based/models.py`:
- Around line 29-34: The pretrained tokenizer and model are being loaded with
trust_remote_code=True from SNOWFLAKE_EMBED_MODEL without a pinned revision;
update both AutoTokenizer.from_pretrained(...) and
AutoModel.from_pretrained(...) calls to include a specific revision=<commit-sha>
argument (use the repository commit SHA you want to pin) so the tokenizer and
model loads are reproducible and safer when trust_remote_code is enabled.
---
Nitpick comments:
In `@nemoguardrails/library/jailbreak_detection/Dockerfile`:
- Around line 29-31: Add explicit Hugging Face cache environment variables in
the Dockerfile before pre-downloading the Snowflake embedding model: set HF_HOME
and TRANSFORMERS_CACHE (e.g., to /models/hf-cache) early in the Dockerfile so
the RUN python -c "from
nemoguardrails.library.jailbreak_detection.model_based.models import
SnowflakeEmbed; SnowflakeEmbed()" invocation uses those stable cache paths;
ensure the directories are created and permissions set appropriately for the
build/runtime user to avoid permission issues when SnowflakeEmbed() populates
the cache.
📒 Files selected for processing (4)
- CHANGELOG.md
- nemoguardrails/library/jailbreak_detection/Dockerfile
- nemoguardrails/library/jailbreak_detection/Dockerfile-GPU
- nemoguardrails/library/jailbreak_detection/model_based/models.py
@Pouyanpi - I have created the PR as discussed on issue-1648. Please find the test evidence below. The one failing test is pre-existing and not related to the current changes. Please let me know if you recommend any changes.
🧪 Test Evidence
Environment:
$ pytest tests/test_jailbreak_model_based.py tests/test_jailbreak_actions.py tests/test_jailbreak_models.py -v
platform darwin -- Python 3.11.15, pytest-8.4.2, pluggy-1.6.0
collected 27 items
tests/test_jailbreak_model_based.py::test_lazy_import_does_not_require_heavy_deps PASSED
tests/test_jailbreak_model_based.py::test_model_based_classifier_imports PASSED
tests/test_jailbreak_model_based.py::test_model_based_classifier_missing_deps FAILED (*)
tests/test_jailbreak_model_based.py::test_initialize_model_with_none_classifier_path PASSED
tests/test_jailbreak_model_based.py::test_snowflake_embed_torch_imports PASSED
tests/test_jailbreak_model_based.py::test_check_jailbreak_with_classifier PASSED
tests/test_jailbreak_model_based.py::test_check_jailbreak_without_classifier PASSED
tests/test_jailbreak_model_based.py::test_check_jailbreak_no_classifier_available PASSED
tests/test_jailbreak_model_based.py::test_initialize_model_with_valid_path PASSED
tests/test_jailbreak_model_based.py::test_nv_embed_e5_removed PASSED
tests/test_jailbreak_model_based.py::test_snowflake_embed_still_available PASSED
tests/test_jailbreak_model_based.py::test_initialize_model_logging PASSED
tests/test_jailbreak_model_based.py::test_check_jailbreak_explicit_none_classifier PASSED
tests/test_jailbreak_model_based.py::test_check_jailbreak_valid_classifier_preserved PASSED
tests/test_jailbreak_actions.py::TestJailbreakDetectionActions::test_jailbreak_detection_model_with_nim_base_url PASSED
tests/test_jailbreak_actions.py::TestJailbreakDetectionActions::test_jailbreak_detection_model_api_key_not_set PASSED
tests/test_jailbreak_actions.py::TestJailbreakDetectionActions::test_jailbreak_detection_model_no_api_key_env_var PASSED
tests/test_jailbreak_actions.py::TestJailbreakDetectionActions::test_jailbreak_detection_model_local_runtime_error PASSED
tests/test_jailbreak_actions.py::TestJailbreakDetectionActions::test_jailbreak_detection_model_local_import_error PASSED
tests/test_jailbreak_actions.py::TestJailbreakDetectionActions::test_jailbreak_detection_model_local_success PASSED
tests/test_jailbreak_actions.py::TestJailbreakDetectionActions::test_jailbreak_detection_model_empty_context PASSED
tests/test_jailbreak_actions.py::TestJailbreakDetectionActions::test_jailbreak_detection_model_context_without_user_message PASSED
tests/test_jailbreak_actions.py::TestJailbreakDetectionActions::test_jailbreak_detection_model_legacy_server_endpoint PASSED
tests/test_jailbreak_actions.py::TestJailbreakDetectionActions::test_jailbreak_detection_model_none_response_handling PASSED
tests/test_jailbreak_models.py::test_jb_model_detected SKIPPED ()
tests/test_jailbreak_models.py::test_safe SKIPPED ()
tests/test_jailbreak_models.py::test_check_jailbreak_model SKIPPED (**)
1 failed, 23 passed, 3 skipped in 1.85s
Codecov Report
✅ All modified and coverable lines are covered by tests.
…o#1848
- Pin SNOWFLAKE_EMBED_REVISION to a specific HuggingFace commit SHA (92d97331) to ensure reproducible builds when trust_remote_code=True, reducing supply-chain risk.
- Switch Dockerfile pre-cache steps from SnowflakeEmbed() instantiation to bare from_pretrained calls to avoid loading ~400-600 MB of model weights into RAM during docker build, preventing OOM on constrained CI build agents.
- Add HF_HOME and TRANSFORMERS_CACHE env vars in both Dockerfiles so the HuggingFace cache path is stable regardless of runtime user/UID.
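The first two bullets of this commit can be sketched in Python. This is an illustration under assumptions, not the contents of `models.py`: the constant names and the `from_pretrained` arguments are taken from this thread, while `tokenizer_kwargs`, `model_kwargs`, and `precache` are hypothetical helpers introduced here to keep the pinned arguments in one place.

```python
SNOWFLAKE_EMBED_MODEL = "Snowflake/snowflake-arctic-embed-m-long"
# Full commit SHA quoted in the review suggestions above.
SNOWFLAKE_EMBED_REVISION = "92d97331f1f4b6a366c1f161354b9f3390cc219f"


def tokenizer_kwargs() -> dict:
    """Arguments for the tokenizer load: only the pinned revision."""
    return {"revision": SNOWFLAKE_EMBED_REVISION}


def model_kwargs() -> dict:
    """Arguments for the model load, mirroring those quoted in the thread."""
    return {
        "revision": SNOWFLAKE_EMBED_REVISION,
        "trust_remote_code": True,
        "add_pooling_layer": False,
        "safe_serialization": True,
    }


def precache() -> None:
    """Download tokenizer and model files into the Hugging Face cache."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModel, AutoTokenizer

    AutoTokenizer.from_pretrained(SNOWFLAKE_EMBED_MODEL, **tokenizer_kwargs())
    AutoModel.from_pretrained(SNOWFLAKE_EMBED_MODEL, **model_kwargs())
```

Pinning `revision` to a commit SHA means the remote code executed under `trust_remote_code=True` cannot change silently between builds, which is the supply-chain concern the commit message refers to.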
a6be550 to c69efe5
@Pouyanpi - I have incorporated all the code review comments. As far as I know, no further changes are needed from my side. Could you let me know if you would like me to change anything else in the PR?
Description
The `Snowflake/snowflake-arctic-embed-m-long` embedding model used by the model-based jailbreak classifier was only downloaded at container startup. This caused two problems:
- The container required internet access at runtime to download the model.
- The `trust_remote_code=True` prompt fired at runtime without the user ever explicitly opting in during the build.

Changes
- `Dockerfile` and `Dockerfile-GPU`: Added a `RUN` step after the existing GPT2 pre-cache step that instantiates `SnowflakeEmbed()` at image build time. This triggers the same `from_pretrained` calls (with `trust_remote_code=True`, `add_pooling_layer=False`, `safe_serialization=True`) as at runtime, warming the HuggingFace disk cache. This mirrors the pattern already in place for GPT2.
- `model_based/models.py`: Extracted the model name into a module-level constant `SNOWFLAKE_EMBED_MODEL` to avoid duplication within the class.

Why `SnowflakeEmbed()` instead of bare `from_pretrained` calls?
Instantiating `SnowflakeEmbed()` directly means the Dockerfile never diverges from the runtime call site: if arguments to `from_pretrained` change in the future (e.g. a new parameter), the pre-cache step picks it up automatically. The extra `.to(device)` and `.eval()` calls are pure in-memory operations with no effect on the on-disk HuggingFace cache.

Trade-off
Docker image size increases by ~400–600 MB for the Snowflake model weights. This is the same trade-off already accepted for GPT2.

Test results
Ran on a clean `conda` environment (Python 3.11, `torch 2.11 CPU`, `transformers 4.49.0`):

Related Issue(s)
Checklist
Made with Cursor
Summary by CodeRabbit
Bug Fixes
Chores