Commpath#182
Merged
mplegendre
requested changes
May 27, 2026
Previously, chosen_realized_cachepath was copied into set_intercept_readlink_cachepath(), and chosen_realized_cachepath and chosen_parsed_cachepath were copied into set_should_intercept_cachepath(). This PR removes both setter functions and makes the original pointers global.
Removes chosen_cachepath and cachepath_bitindex from spindle_launch.h. Updates initialization of the matching variables in ldcs_process_data. determineValidCachePaths() moved from spindle_be.cc to ldcs_audit_server_process.c to get ldcs_process_data visibility. Added #include "parseloc.h" to ldcs_audit_server_process.c to get the declaration of determineValidCachePaths(). Relocated "parseloc.h" to src/util so ldcs_audit_server_process.c could find it. Trued up the signedness of types, which became necessary after making "parseloc.h" more visible; e.g., cachepath_bitidx is now uint64_t everywhere.
The three-message-reply response is now a single message with two strings. The symbolic version of the cachepath is no longer communicated as it was not being used.
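As a rough illustration of carrying two strings in a single message, the sketch below packs two NUL-terminated strings back to back in one buffer and unpacks them with bounds checks. This is not Spindle's wire format; pack_two_strings() and unpack_two_strings() are hypothetical names used only for this example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Spindle): pack two strings into one
 * buffer laid out as "first\0second\0". Returns a malloc'd buffer that
 * the caller frees, with the total size stored in *len_out, or NULL if
 * allocation fails. */
char *pack_two_strings(const char *first, const char *second, size_t *len_out)
{
    size_t n1 = strlen(first) + 1;   /* length including the NUL */
    size_t n2 = strlen(second) + 1;
    char *buf = malloc(n1 + n2);
    if (!buf)
        return NULL;
    memcpy(buf, first, n1);
    memcpy(buf + n1, second, n2);
    *len_out = n1 + n2;
    return buf;
}

/* Hypothetical helper: locate both strings inside a received buffer.
 * Returns 0 on success, -1 if either string is missing its terminator.
 * On success *first and *second point into buf (no copies are made). */
int unpack_two_strings(const char *buf, size_t len,
                       const char **first, const char **second)
{
    const char *end1 = memchr(buf, '\0', len);
    if (!end1)
        return -1;
    size_t off = (size_t)(end1 - buf) + 1;
    if (off >= len)
        return -1;                   /* nothing left for the second string */
    if (!memchr(buf + off, '\0', len - off))
        return -1;
    *first = buf;
    *second = buf + off;
    return 0;
}
```

One payload means the receiver handles a single message rather than tracking which of several replies has arrived so far.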
New name is ldcs_audit_server_md_allreduce_AND(). If we get to the point where we're using other allreduce operations we can solve the problem of duplicating the op list in md-land and cobo-land. For now, we're only using one op in md-land, so the op can go into the function name.
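To picture what an AND reduction gives you, here is a minimal local sketch. It assumes each server contributes a uint64_t bitmask in which bit i means "candidate cachepath i is usable here"; that reading is an assumption based on cachepath_bitidx being a uint64_t, not something taken from the PR's code. reduce_and() and lowest_common_index() are hypothetical names, and the loop merely stands in for the network allreduce.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local stand-in for an AND allreduce: combine one mask per
 * participant. A bit survives only if every participant set it. */
uint64_t reduce_and(const uint64_t *masks, size_t count)
{
    uint64_t result = UINT64_MAX;    /* identity element for bitwise AND */
    for (size_t i = 0; i < count; i++)
        result &= masks[i];
    return result;
}

/* Return the index of the lowest set bit, or -1 if no bit is set
 * (that is, no candidate was usable by everyone). */
int lowest_common_index(uint64_t mask)
{
    for (int i = 0; i < 64; i++) {
        if (mask & (UINT64_C(1) << i))
            return i;
    }
    return -1;
}
```

With masks 0xE, 0x7 and 0x6, the reduction yields 0x6, so index 1 is the first candidate every participant can use.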
Unlikely it would ever make a difference, but this is much more correct.
The theory is that eager clients are using an uninitialized cachepath variable. Delaying the consensus should make the failure happen more often.
"sending message of type: request_location_path" is now "sending message of type: CHOSEN_CACHEPATH_REQUEST"
Known to affect the symbolic form of candidate cachepaths. Not sure that's ever being used, but it's fixed now.
_message_type_to_str() can now be used in cobo_fe_comm.c. ldcs_audit_server_fe_broadcast() now reports message type. Only two messages are expected to be routed through there, but it's the correct way to report it.
Cleanup now takes both commpath and cachepath as prefixes for removing files created by Spindle.
The original LDCS_LOCATION_MOD checked to see if there were multiple servers running on a node and, if so, modified the location string so that each server had its own location. The code did not handle the case where the directory above the requested directory was not writeable, e.g., if the user passed in --location=/tmp, the code would try to create a directory /tmp-00 for the first server. That fails. With commpath and cachepath replacing location, and with new initialization paths, the existing code would modify only commpath after the commpath directory had been created. If the multiple-server case needs to be supported, commpath- and cachepath-specific code needs to be added back in.
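The failure mode described above can be sketched as follows. The helper below is hypothetical (append_server_suffix() is not the removed Spindle code), and the "-%02u" format is inferred only from the /tmp-00 example; it shows how suffixing the requested directory produces a sibling path whose parent may not be writable.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical reconstruction of the per-server suffixing idea: append
 * "-NN" to the requested location. Returns 0 on success, -1 if the
 * result does not fit in out. */
int append_server_suffix(const char *location, unsigned server_idx,
                         char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "%s-%02u", location, server_idx);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return 0;
}
```

For --location=/tmp and the first server this produces /tmp-00, which is a sibling of /tmp rather than a child of it. Creating it requires write permission on /, which ordinary users normally lack.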
That configure parameter is no longer supported.
Replaced with
--with-cachepaths=/tmp/commpath/cachepath
--with-commpath=/tmp/commpath
Replaced assert() with return -1 in:
src/client/beboot/spindle_bootstrap.c
src/client/client/client.c
Removed assert() with no replacement in:
src/client/client/client.c
src/client/client/intercept_readlink.c
src/client/client/should_intercept.c
Created Issue llnl#187 to remove debugging code in send_cachepath_query()
send_cachepath_query() now has delay_between_retries of 0.1 seconds and max 1000 retries.
Also returns immediately in case of network errors.
Also uses spindle_strdup() instead of strdup().
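The retry behavior described above might look roughly like the sketch below: a bounded number of attempts, a fixed delay between them, and an immediate return on a network error. This is not the body of send_cachepath_query(); the status enum, query_with_retries(), and the ready_after_n() stub are all hypothetical, and returning -1 for both failure cases is a simplification.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/* Hypothetical status codes for one query attempt. */
enum query_status { QUERY_READY = 0, QUERY_NOT_READY = 1, QUERY_NET_ERROR = -1 };

typedef enum query_status (*query_fn)(void *ctx);

/* Retry query() up to max_retries times, sleeping delay_ns between
 * attempts. Returns 0 once the answer is ready, -1 on a network error
 * (without retrying) or when the retries are exhausted. */
int query_with_retries(query_fn query, void *ctx, int max_retries, long delay_ns)
{
    struct timespec delay;
    delay.tv_sec = delay_ns / 1000000000L;
    delay.tv_nsec = delay_ns % 1000000000L;

    for (int attempt = 0; attempt < max_retries; attempt++) {
        enum query_status st = query(ctx);
        if (st == QUERY_READY)
            return 0;
        if (st == QUERY_NET_ERROR)
            return -1;               /* retrying will not help */
        nanosleep(&delay, NULL);
    }
    return -1;                       /* gave up after max_retries attempts */
}

/* Stub for demonstration: ctx points to an int countdown. A negative
 * value simulates a network error; zero means the answer is ready. */
enum query_status ready_after_n(void *ctx)
{
    int *remaining = ctx;
    if (*remaining < 0)
        return QUERY_NET_ERROR;
    if (*remaining == 0)
        return QUERY_READY;
    (*remaining)--;
    return QUERY_NOT_READY;
}
```

The values from the description correspond to calling query_with_retries(fn, ctx, 1000, 100000000L), which bounds the total wait at about 100 seconds.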
src/utils/parseloc.c
src/server/auditserver/ldcs_audit_server_handlers.c
Removed/reclassified logging statements.
mplegendre
approved these changes
Jun 5, 2026
Passed all tests on my clone. Let's see what it does over here.