Kmonte/explore dataloader cpp and vectorized by kmontemayor2-sc · Pull Request #679 · Snapchat/GiGL

kmontemayor2-sc · 2026-06-25T01:47:11Z

Scope of work done

Where is the documentation for this feature?: N/A

Did you add automated tests or write a test plan?

Updated Changelog.md? NO

Ready for code review?: NO

…spatch

… + dispatch

Lifts the existing per-anchor label-remap loop from DistABLPLoader._set_labels into a module-level _loop_set_labels function, which becomes the reference oracle for the upcoming vectorized kernel. _set_labels is rewired to dispatch to _loop_set_labels (python path, default) or vectorized_set_labels (vectorized/cpp paths, defined in the next task) based on resolve_collate_impl(). No observable behavior change on the default python path.
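A minimal sketch of the flag-based dispatch described above, not the code in this PR. The accepted values (python, vectorized, cpp) and the python default come from the commit messages. The function signatures, the `environ` parameter, and the error raised on an unknown value are assumptions made for illustration.

```python
import os

# Implementation names as listed in the commit messages.
COLLATE_IMPLS = ("python", "vectorized", "cpp")
_ENV_VAR = "GIGL_COLLATE_IMPL"


def resolve_collate_impl(environ=None):
    """Return the selected implementation name, defaulting to "python".

    `environ` is a hypothetical injection point so the resolver can be
    exercised without mutating the process environment.
    """
    env = os.environ if environ is None else environ
    raw = env.get(_ENV_VAR, "python").strip().lower()
    if raw not in COLLATE_IMPLS:
        # Assumption: unknown values fail loudly instead of falling back.
        raise ValueError(f"{_ENV_VAR}={raw!r}; expected one of {COLLATE_IMPLS}")
    return raw


def set_labels(batch, loop_fn, vectorized_fn, environ=None):
    """Dispatch to the loop oracle on the python path, else the vectorized kernel."""
    if resolve_collate_impl(environ) == "python":
        return loop_fn(batch)
    # Both "vectorized" and "cpp" share the vectorized label kernel.
    return vectorized_fn(batch)
```

Passing the two kernels in as arguments keeps the sketch self-contained; in the loader they would presumably be module-level functions.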

Replace the per-anchor Python loop in _loop_set_labels with a fully vectorized kernel (_remap_one_label_tensor + vectorized_set_labels) that uses torch.searchsorted and torch.split, keeping peak memory at O(N_anchors*M) with no Python loop over anchors. Bit-for-bit equivalence with the loop oracle is checked by a parameterized property matrix (7 cases) plus a mandatory 3-mutation check confirming that the tests catch multiplicity loss, ordering regression, and missing empty-anchor keys.
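To illustrate the searchsorted-plus-split idea, here is a hedged sketch of a loop oracle next to a vectorized remap. It is not the PR's kernel. The function names, the `-1` padding value, and the input layout (an `[N_anchors, M]` tensor of global ids remapped to positions in a unique, non-empty `node_ids` tensor) are all assumptions.

```python
import torch


def loop_remap(node_ids, label_tensor, pad=-1):
    """Reference oracle: one Python iteration per anchor row."""
    local_of = {int(g): i for i, g in enumerate(node_ids.tolist())}
    out = {}
    for anchor, row in enumerate(label_tensor.tolist()):
        kept = [local_of[g] for g in row if g != pad and g in local_of]
        out[anchor] = torch.tensor(kept, dtype=torch.long)
    return out


def vectorized_remap(node_ids, label_tensor, pad=-1):
    """Same result without looping over anchors.

    Assumes node_ids is non-empty and holds unique global ids.
    """
    n_anchors, width = label_tensor.shape
    sorted_ids, order = torch.sort(node_ids)
    flat = label_tensor.reshape(-1)  # row-major, so entries stay grouped by anchor
    # Build the anchor index on the labels' device to avoid CPU/GPU mismatches.
    anchor_of_entry = torch.arange(
        n_anchors, device=label_tensor.device
    ).repeat_interleave(width)
    # Candidate position of every label in the sorted id list.
    pos = torch.searchsorted(sorted_ids, flat).clamp(max=sorted_ids.numel() - 1)
    # Keep entries that are not padding and are actually present in node_ids.
    keep = (flat != pad) & (sorted_ids[pos] == flat)
    local = order[pos[keep]]  # map sorted position back to original position
    counts = torch.bincount(anchor_of_entry[keep], minlength=n_anchors)
    pieces = torch.split(local, counts.tolist())  # zero-length pieces are kept
    return {anchor: piece for anchor, piece in enumerate(pieces)}
```

Because the mask preserves row-major order and `torch.split` emits a (possibly empty) piece for every anchor, the sketch keeps duplicates, ordering, and empty-anchor keys, which are the three properties the mutation check targets.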

- ruff format on dist_ablp_neighborloader.py, vectorized_set_labels_test.py, dist_ablp_neighborloader_test.py (line-length wrapping)
- Add missing `_: int` rank param to _collect_homogeneous_labels and _collect_hetero_labels so mp.spawn's injected rank arg is accepted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add `collect_batches` and `assert_impls_equivalent` to `tests/test_assets/distributed/collate_equivalence.py` so callers can exercise any sequence of `COLLATE_IMPLS` against a fake loader factory and assert output identity. The two helpers manage the env-var lifecycle (`GIGL_COLLATE_IMPL`) and call `gc.collect()` after each run to avoid inter-run state leaks.

Three new test methods exercise the driver end-to-end with fake homogeneous / heterogeneous iterators and a deliberately mismatched loader to confirm the mismatch path raises.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
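The env-var lifecycle described above could look roughly like the sketch below. The helper names match the commit message, but their signatures, the context manager, and the use of `==` to compare batches are assumptions; a real helper comparing tensors would need something like `torch.equal`.

```python
import gc
import os
from contextlib import contextmanager

COLLATE_IMPLS = ("python", "vectorized", "cpp")
_ENV_VAR = "GIGL_COLLATE_IMPL"


@contextmanager
def collate_impl_env(impl):
    """Set the selector for one run, then restore the previous value."""
    previous = os.environ.get(_ENV_VAR)
    os.environ[_ENV_VAR] = impl
    try:
        yield
    finally:
        if previous is None:
            os.environ.pop(_ENV_VAR, None)
        else:
            os.environ[_ENV_VAR] = previous
        gc.collect()  # drop loader state before the next implementation runs


def collect_batches(loader_factory, impl):
    """Build a fresh loader under `impl` and drain it into a list."""
    with collate_impl_env(impl):
        return list(loader_factory())


def assert_impls_equivalent(loader_factory, impls=COLLATE_IMPLS):
    """Raise AssertionError if any implementation diverges from the first."""
    reference = collect_batches(loader_factory, impls[0])
    for impl in impls[1:]:
        if collect_batches(loader_factory, impl) != reference:
            raise AssertionError(f"{impl!r} output differs from {impls[0]!r}")
```

Restoring the variable in a `finally` block means a failing comparison does not leak the selector into later tests.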

…, re-export annotation Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Adds collate_equivalence_ablp_test.py with 5 parameterized cases:

- positive_and_negative
- positive_only
- positive_and_negative_label_cap
- positive_with_guaranteed_empty_anchor (ragged-key trap)
- mutation guard (proves the harness detects deliberate batch divergence)

All tests run the loader end-to-end under mp.spawn and compare all three COLLATE_IMPLS (python, vectorized, cpp) via assert_impls_equivalent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…dispatch Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…_dir=in, empty anchor) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…istLoader

Add a workload-agnostic, opt-in timing path to BaseDistLoader.__next__ that isolates channel-receive wall time from collation wall time, so callers can attribute per-batch next() cost. Disabled by default; no behavior change when off.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
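As a rough illustration of splitting receive time from collation time, the sketch below wraps two callables in an iterator. It does not reproduce BaseDistLoader; the class name, the `profile` flag, and the two timing lists are hypothetical.

```python
import time


class TimedLoader:
    """Iterator that can optionally time receive and collate separately."""

    def __init__(self, receive, collate, profile=False):
        self._receive = receive  # callable returning the next raw message
        self._collate = collate  # callable turning a message into a batch
        self._profile = profile  # opt-in; the default path is untouched
        self.recv_seconds = []
        self.collate_seconds = []

    def __iter__(self):
        return self

    def __next__(self):
        if not self._profile:
            # Default path: no timing calls, identical behavior.
            return self._collate(self._receive())
        t0 = time.perf_counter()
        message = self._receive()  # StopIteration propagates before any append
        t1 = time.perf_counter()
        batch = self._collate(message)
        t2 = time.perf_counter()
        self.recv_seconds.append(t1 - t0)
        self.collate_seconds.append(t2 - t1)
        return batch
```

A caller might sum `recv_seconds` and `collate_seconds` after an epoch to see which side dominates per-batch `next()` cost.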

Vertex AI worker containers do not inherit the launcher process environment, so forward GIGL_COLLATE_IMPL (when set) into the worker env list for both the single-pool and graph-store launch paths. Generic passthrough; no behavior change when unset.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
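The passthrough can be pictured as a small helper that emits an env entry only for selectors that are set. This is a sketch under assumptions: the helper name is hypothetical, and the `{"name": ..., "value": ...}` entry shape is assumed to be what the launcher's worker env list expects.

```python
import os


def selector_env_entries(environ=None, names=("GIGL_COLLATE_IMPL",)):
    """Return env-list entries for each selector that is set and non-empty.

    Unset selectors produce no entry, so workers keep their default.
    """
    env = os.environ if environ is None else environ
    return [{"name": name, "value": env[name]} for name in names if env.get(name)]
```

For example, `selector_env_entries({"GIGL_COLLATE_IMPL": "cpp"})` yields a single entry, while an empty environment yields an empty list, which matches the "no behavior change when unset" claim.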

…ion absent

Module-level `from gigl_core import collate_core` in _collate_dispatch.py broke all loader impls (python, vectorized, cpp) in environments whose installed gigl_core wheel predates the C++ extension. Move the import inside collate_cpp_homogeneous and collate_cpp_heterogeneous so only the cpp path requires the extension, and add a unit test proving that python/vectorized dispatch works when collate_core is not importable.
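The deferred-import pattern can be sketched as follows. The function names, the `module_name` parameter (added so the failure path can be exercised), and the `collate` attribute on the extension are hypothetical; only the idea of importing inside the cpp-path function comes from the commit message.

```python
import importlib


def collate_python(batch):
    """Pure-Python path: never touches the C++ extension."""
    return ("python", batch)


def collate_cpp(batch, module_name="gigl_core.collate_core"):
    """C++ path: import the extension only when this path is selected."""
    try:
        core = importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"cpp collate requires {module_name}, which is not importable"
        ) from exc
    # Hypothetical entry point; the real extension's API may differ.
    return core.collate(batch)
```

With the import inside the function, importing the dispatch module itself succeeds even when the extension is missing, and only callers that select the cpp path see an error.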

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…emap Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…acle Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- Add resolve_ablp_label_format to imports in dist_ablp_neighborloader.py
- Modify _set_labels to dispatch on label_format before collate_impl: edge_list -> edge_list_set_labels; dict path unchanged (vectorized/loop)
- Update class docstring to document AnchorLabels output under edge_list
- Add test_label_format_edge_list_equivalence: mp.spawn child sets GIGL_ABLP_LABEL_FORMAT=edge_list, asserts y_positive is AnchorLabels, and verifies .to_dict() matches the dict-format baseline exactly

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
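To make the edge-list versus dict relationship concrete, here is a hedged sketch of a dense edge-list container with a `to_dict()` conversion. The class name and `to_dict()` appear in the commit message; the field names, the requirement that entries are grouped by anchor, and the conversion logic are assumptions.

```python
from dataclasses import dataclass

import torch


@dataclass
class AnchorLabels:
    """Dense edge-list view of per-anchor labels (hypothetical field names)."""

    anchor_index: torch.Tensor  # [E] anchor id per entry, non-decreasing
    label_index: torch.Tensor   # [E] label node id per entry
    num_anchors: int            # includes anchors that have no labels

    def to_dict(self):
        """Expand to {anchor: label tensor}, keeping empty anchors as keys."""
        counts = torch.bincount(self.anchor_index, minlength=self.num_anchors)
        pieces = torch.split(self.label_index, counts.tolist())
        return {anchor: piece for anchor, piece in enumerate(pieces)}
```

Storing two flat tensors avoids building one small tensor per anchor at collate time; the dict is only materialized when a caller asks for it.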

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…pods

GIGL_COLLATE_IMPL / GIGL_ABLP_LABEL_FORMAT set when a pipeline is compiled never reached the remote component container (its env is fixed at compile time and does not inherit the submitter's shell), so the launcher's passthrough had nothing to forward and workers always used the default. Copy any set selector onto each component task at compile time; the launcher then forwards it to the worker pool.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

vectorized_set_labels and edge_list_set_labels built anchor_of_entry via torch.arange (CPU) then selected it with a mask derived from label_tensor; on GPU that raised 'indices should be either on cpu or on the same device as the indexed tensor', crashing training on the first batch. CPU-only unit tests could not catch it. Create the index on label_tensor.device and add a CUDA-gated regression test for both kernels.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
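A CUDA-gated regression test for this class of bug might look like the sketch below. The helper and test names are hypothetical and the kernel is reduced to the index-then-mask step; it only illustrates building the index on `label_tensor.device` and skipping the GPU case when CUDA is unavailable.

```python
import unittest

import torch


def valid_anchor_ids(label_tensor, pad=-1):
    """Anchor id of every non-padding entry, on the labels' own device."""
    n_anchors, width = label_tensor.shape
    # The fix: create the index on label_tensor.device, not the CPU default.
    anchor_of_entry = torch.arange(
        n_anchors, device=label_tensor.device
    ).repeat_interleave(width)
    mask = (label_tensor != pad).reshape(-1)
    return anchor_of_entry[mask]


class DeviceRegressionTest(unittest.TestCase):
    def _check(self, device):
        labels = torch.tensor([[3, -1], [4, 5]], device=device)
        out = valid_anchor_ids(labels)
        self.assertEqual(out.device, labels.device)
        self.assertEqual(out.tolist(), [0, 1, 1])

    def test_cpu(self):
        self._check("cpu")

    @unittest.skipUnless(torch.cuda.is_available(), "requires CUDA")
    def test_cuda(self):
        self._check("cuda")
```

On a CPU-only machine the CUDA case is reported as skipped, so the suite stays green while still running the GPU check wherever a device exists.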

Make several comments and docstrings self-contained and generic, and remove a stray one-off benchmark timing. Comment/docstring-only; no logic change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


kmontemayor and others added 30 commits June 23, 2026 21:52

feat(distributed): add GIGL_COLLATE_IMPL flag resolver for collate di…

b83f760

…spatch

test(distributed): assert python vs vectorized collate label equivalence

b52dc0b

test: add collate-equivalence comparison helper (homogeneous)

3586c08

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test: complete collate-equivalence helper (heterogeneous)

dd931bd

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(test): D3 driver — use COLLATE_IMPLS default, required test names…

6b712d0

…, re-export annotation Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

test: ABLP heterogeneous + edge_dir in/out collate equivalence

52e03b7

Collate core: scaffold gigl_core.collate_core pybind11 extension

be8cc46

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Collate core: per-hop count padding helpers + C++ test target

57e1cd3

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Collate core: homogeneous collate component-tensor builder

5109d71

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Collate core: heterogeneous collate (dict build, edge swap, padding)

a3e77df

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Collate core: pybind11 bindings for homogeneous/heterogeneous collate

59bc160

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Collate core: dispatcher flag resolution + PyG assembly shim

00f8204

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Collate core: dispatcher C++-path collate entry functions

e82b382

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Collate core: route both loaders' GLT body through GIGL_COLLATE_IMPL …

e940c9f

…dispatch Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Collate core: python-vs-cpp output equivalence tests (CORA/DBLP, edge…

8d53d8b

…_dir=in, empty anchor) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

feat(ablp): add GIGL_ABLP_LABEL_FORMAT selector (dict|edge_list)

dcf7e94

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

feat(ablp): add AnchorLabels dense edge-list container + per-tensor r…

1e4d529

…emap Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

feat(ablp): add edge_list_set_labels kernel, parity-tested vs loop or…

8f2fd44

…acle Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

feat(ablp): forward GIGL_ABLP_LABEL_FORMAT to sampling workers

6602187

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

docs(distributed): tidy collate/label comments and docstrings

e320878


kmontemayor2-sc wants to merge 31 commits into main from kmonte/explore-dataloader-cpp-and-vectorized
