GPU Support
This module details how NVIDIA GPU drivers, CUDA toolkits, and machine learning frameworks (PyTorch, TensorFlow, PaddlePaddle) are integrated into the LabNow Docker ecosystem.
GPU compatibility is achieved by wrapping official NVIDIA CUDA development images with LabNow customizations.
Because NVIDIA images start from raw OS configurations, the build system wraps CUDA base images in a multi-step pipeline:
- Atom Wrap: `docker_atom/atom.Dockerfile` is built using the NVIDIA base image (e.g. `nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04`) passed via `BASE_IMG`. This yields an `nvidia-cuda` atom image.
- Base Wrap: `docker_base/base.Dockerfile` is built on top of the `nvidia-cuda` atom image to add Conda, Python, and base tools.
- CUDA Finalize: `docker_cuda/nvidia-cuda.Dockerfile` inherits the base-wrapped image, configures `NVIDIA_DISABLE_REQUIRE=1`, updates debpython path configurations, compiles and installs GPU monitoring utilities, and cleans up.
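The three stages above can be sketched as a chain of `docker build` calls. This is only an illustration: the image tags, the `example` prefix, and the build context `.` are hypothetical, and passing `BASE_IMG` to the second and third Dockerfiles is an assumption (the text only states it for the atom stage). The function prints the commands instead of running them.

```shell
# Hypothetical helper: print the three chained builds for review.
# Only the Dockerfile paths and the BASE_IMG argument name come from the text;
# the tags and the "." build context are illustrative assumptions.
print_gpu_build_chain() {
  local cuda_img="$1" prefix="$2"
  # Stage 1: wrap the raw NVIDIA image into an atom image.
  echo "docker build -f docker_atom/atom.Dockerfile --build-arg BASE_IMG=${cuda_img} -t ${prefix}/nvidia-cuda-atom ."
  # Stage 2: add Conda, Python, and base tools on top of the atom image.
  echo "docker build -f docker_base/base.Dockerfile --build-arg BASE_IMG=${prefix}/nvidia-cuda-atom -t ${prefix}/nvidia-cuda-base ."
  # Stage 3: finalize CUDA settings and GPU monitoring utilities.
  echo "docker build -f docker_cuda/nvidia-cuda.Dockerfile --build-arg BASE_IMG=${prefix}/nvidia-cuda-base -t ${prefix}/nvidia-cuda ."
}

print_gpu_build_chain "nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04" "example"
```

Each stage consumes the tag produced by the previous one, which is the ordering the list describes.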
During the CUDA Finalize stage, the GPU monitoring step:
- Downloads and builds `nvtop` from source to display NVIDIA, AMD, and Intel GPU status.
- Requires CMake >= 3.18. If the base OS has an older CMake, it temporarily adds the Kitware APT repository during build.
- Compiles `nvtop` so that it binds to the host NVML libraries, then removes the compile dependencies after install to minimize layer size.
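The CMake version gate described above could be written along these lines. This is a minimal sketch, not the project's actual script: the function names are hypothetical, and it assumes a `sort` that supports `-V` (GNU coreutils). It only reports whether a newer CMake is needed and leaves the repository setup out.

```shell
# version_ge A B: succeeds when version A >= version B.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# needs_newer_cmake HAVE [MIN]: succeeds when HAVE is older than MIN (default 3.18).
needs_newer_cmake() {
  local have="$1" min="${2:-3.18}"
  ! version_ge "$have" "$min"
}

# Report on the local CMake, if one is installed.
if command -v cmake >/dev/null 2>&1; then
  have="$(cmake --version | head -n1 | awk '{print $3}')"
  if needs_newer_cmake "$have"; then
    echo "cmake ${have} is older than 3.18: a newer CMake source is needed"
  else
    echo "cmake ${have} satisfies the nvtop requirement"
  fi
fi
```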
When `ARG_PROFILE_PYTHON` is populated with `torch`, `tf2`, or `paddle`, the core docker installation hook runs specialized setup procedures to configure CUDA acceleration.
The build script automatically checks whether the CUDA compiler (`nvcc`) is present:
- Evaluates `$CUDA_VERSION` to generate a shortened string `$CUDA_VER` (e.g., `12.1` -> `121`).
- Sets `$IDX` to `cu${CUDA_VER}` (e.g. `cu121`) if a GPU compiler is present, else defaults to `cpu`.
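The index derivation above can be sketched as a small function. The function name and the `yes`/`no` flag are hypothetical, and dropping the patch component (so that `12.8.1` becomes `128`) is an assumption; the text only gives the `12.1` -> `121` example.

```shell
# resolve_wheel_idx CUDA_VERSION HAS_NVCC
# Prints cu<major><minor> when a CUDA compiler is present, otherwise cpu.
resolve_wheel_idx() {
  local cuda_version="$1" has_nvcc="$2"
  if [ "$has_nvcc" = "yes" ] && [ -n "$cuda_version" ]; then
    # Assumption: keep major.minor only, then drop the dot (12.1 -> 121).
    local cuda_ver
    cuda_ver="$(echo "$cuda_version" | cut -d. -f1,2 | tr -d '.')"
    echo "cu${cuda_ver}"
  else
    echo "cpu"
  fi
}

# Detect nvcc on the current machine and print the resulting index.
if command -v nvcc >/dev/null 2>&1; then has_nvcc="yes"; else has_nvcc="no"; fi
resolve_wheel_idx "${CUDA_VERSION:-}" "$has_nvcc"
```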
For the `torch` profile, the script:
- Evaluates GPU compatibility: if the CUDA version is `< 11.7`, it installs PyTorch 1.x; otherwise it installs PyTorch 2.x.
- Runs `pip install` targeting the official PyTorch wheel index: `pip install --no-cache-dir --root-user-action=ignore -U --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/${IDX}`
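A minimal sketch of the CUDA `11.7` cutoff and the wheel index URL follows. The function names are hypothetical, `sort -V` is assumed to be available, and treating an empty CUDA version (a CPU build) as the 2.x case is an assumption. How the 1.x release is actually pinned is not stated in the text, so it is not shown.

```shell
# torch_series_for_cuda CUDA_VERSION: prints 1.x for CUDA < 11.7, else 2.x.
torch_series_for_cuda() {
  local cuda_version="$1"
  # Assumption: with no CUDA version (CPU build) the "else" branch applies.
  if [ -z "$cuda_version" ]; then
    echo "2.x"
    return
  fi
  # Version sort puts the smaller value first; if 11.7 comes first (or ties),
  # the given CUDA version is >= 11.7.
  if [ "$(printf '%s\n%s\n' "11.7" "$cuda_version" | sort -V | head -n1)" = "11.7" ]; then
    echo "2.x"
  else
    echo "1.x"
  fi
}

# torch_index_url IDX: builds the wheel index URL quoted in the text.
torch_index_url() {
  echo "https://download.pytorch.org/whl/${1}"
}

echo "CUDA 11.6 selects PyTorch $(torch_series_for_cuda 11.6) from $(torch_index_url cu116)"
echo "CUDA 12.1 selects PyTorch $(torch_series_for_cuda 12.1) from $(torch_index_url cu121)"
```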
For the TensorFlow profiles, the script:
- Installs either `tensorflow` (CPU/v2) or `tensorflow-gpu` (v1) based on the profile version (`tf1` or `tf2`).
For the `paddle` profile, the script:
- Checks whether `nvcc` is present and installs `paddlepaddle-gpu` if it is, otherwise `paddlepaddle`.
- Uses the official index URL `https://www.paddlepaddle.org.cn/packages/stable/${IDX}/`.
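The TensorFlow and PaddlePaddle package choices above reduce to a small lookup, sketched here. The function names and the `yes`/`no` flag are hypothetical, and version pins are omitted because the text does not give them.

```shell
# select_framework_package PROFILE HAS_NVCC
# Prints the pip package name for the tf1, tf2, and paddle profiles.
select_framework_package() {
  local profile="$1" has_nvcc="$2"
  case "$profile" in
    tf1) echo "tensorflow-gpu" ;;
    tf2) echo "tensorflow" ;;
    paddle)
      if [ "$has_nvcc" = "yes" ]; then
        echo "paddlepaddle-gpu"
      else
        echo "paddlepaddle"
      fi
      ;;
    *)
      echo "unknown profile: ${profile}" >&2
      return 1
      ;;
  esac
}

# paddle_index_url IDX: builds the PaddlePaddle index URL quoted in the text.
paddle_index_url() {
  echo "https://www.paddlepaddle.org.cn/packages/stable/${1}/"
}

echo "package: $(select_framework_package paddle no)"
echo "index:   $(paddle_index_url cpu)"
```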
A major source of layer bloat in GPU images is the duplicate NVIDIA CUDA runtime wheels shipped via pip packages (e.g. `nvidia-cuda-runtime-cu12`, `nvidia-cudnn-cu12`). These wheels duplicate files already present in the host system.
To drastically reduce image size, the build:
- Searches the `pip freeze` output for `nvidia-*` packages and purges them: `pip freeze | awk -F= 'tolower($1) ~ /^nvidia-/ {print $1}' | xargs -r pip uninstall -y`
- Installs lightweight, system-wide C++ shared libraries instead: `apt-get update && apt-get install -y --no-install-recommends libcusparselt0 libnccl2 libnccl-dev`
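To preview which packages the purge step would select without uninstalling anything, the `awk` filter can be run on its own. The sketch below wraps the same filter in a hypothetical helper and feeds it made-up sample lines; the package versions are illustrative only.

```shell
# list_nvidia_wheels: read `pip freeze` style lines on stdin and print the
# names of packages whose name starts with "nvidia-" (case-insensitive).
# This is the same awk expression as in the purge command above.
list_nvidia_wheels() {
  awk -F= 'tolower($1) ~ /^nvidia-/ {print $1}'
}

# Sample input with illustrative versions; in a real image use `pip freeze`.
printf '%s\n' \
  'torch==2.6.0' \
  'nvidia-cudnn-cu12==9.1.0.70' \
  'NVIDIA-nccl-cu12==2.21.5' \
  'numpy==2.1.0' | list_nvidia_wheels
```

Piping `pip freeze` through the helper gives a dry run of the purge; appending `| xargs -r pip uninstall -y` reproduces the command from the list.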
This step typically shaves several gigabytes off the final GPU image layers while maintaining full PyTorch/TensorFlow execution functionality.