DAOS-18989 dfs: add hardlink link/unlink support and mask internal mo…#18356
DAOS-18989 dfs: add hardlink link/unlink support and mask internal mo…#18356sherintg wants to merge 4 commits into
Conversation
|
Ticket title is 'Add hardlink link/unlink support APIs and mask internal mode bit' |
6e65b37 to
c9ce944
Compare
…de bit Implement dfs_link for regular files and maintain hardlink metadata through GIT entries. Handle hardlink-aware unlink path, including xattr cleanup/migration behavior. Ensure stat/mode APIs expose external mode only, and add DFS unit coverage for link/remove workflows. Allow-unstable-test: true Signed-off-by: Sherin T George <sherin-t.george@hpe.com>
c9ce944 to
bfcaca8
Compare
| } | ||
|
|
||
| restart: | ||
| D_ASSERT(DFS_IS_HARDLINK(entry.mode)); |
There was a problem hiding this comment.
entry is fetched from storage. if there is a data corruption issue, we hit this assertion, no?
assert in that case is unacceptable and an error (EIO) should be returned.
There was a problem hiding this comment.
The assert is making sure that the caller calls this routine only if the entry points to a hardlink.
There was a problem hiding this comment.
im still not clear why this should be an assert. correct me if wrong, but you are not setting entry.mode yourself. this is coming from fetch, no?
There was a problem hiding this comment.
The assert here is to make sure that caller calls this routine only when entry.mode has hardlink bit set like the code below.
int
remove_entry(dfs_t *dfs, daos_handle_t th, daos_handle_t parent_oh, const char name, size_t len,
struct dfs_entry entry)
{
...
if (DFS_IS_HARDLINK(entry.mode))
return remove_hardlink(dfs, th, parent_oh, name, len, entry);
/ Default remove entry code */
Also, I have now moved the assert to the very beginning of the function so that the assert is called only once.
There was a problem hiding this comment.
I share @mchaarawi's concern here. From my understanding, entry.mode is read from storage via fetch_entry(), so a corrupted mode bit is a realistic scenario — not just a programming error. In that case a D_ASSERT would abort the daemon, which seems disproportionate.
Would it make sense to replace the assert with an explicit runtime guard returning EIO? Something like:
if (!DFS_IS_HARDLINK(entry.mode)) {
D_ERROR("remove_hardlink called on non-hardlink entry (mode=0x%x)\n", entry.mode);
return EIO;
}This would survive data-corruption scenarios gracefully and make the precondition explicit to future readers without aborting the process. No issue to keep the assert if you consider this a programming invariant that can only be violated by a bug in the caller — but a short comment explaining that assumption would help clarify the intent.
There was a problem hiding this comment.
The expectation here is that the caller here should ensure that entry.mode has hardlink bit set before calling remove_hardlink(). I see similar logic used in other functions. For example open_dir() expects parent to be not null when ocreate flag is passed.
| args->stbuf->st_nlink = 1; | ||
| args->stbuf->st_mode = op_args->entry.mode; | ||
| args->stbuf->st_mode = DFS_EXTERNAL_MODE(op_args->entry.mode); |
There was a problem hiding this comment.
im confused here.. how can this be a hardlink but st_nlink still hardcoded to 1?
the entire stat and statx paths is not in limbo. i assume you will finish this in a follow on PR?
There was a problem hiding this comment.
I have restricted this PR only to dfs_link(), dfs_remove() and internal mode bit. I will be creating subsequent new PRs for fetching the metadata with hardlinks, updating metadata with hardlinks, move/rename.
| D_ERROR("Missing GIT entry for file %s", obj->name); | ||
| D_GOTO(out, EIO); |
There was a problem hiding this comment.
so what if someone removed the entry between the first fetch and this fetch?
this is not an error anymore, and the transaction restart behavior would detect that i hope.
logging an EIO error here sounds wrong, but correct me if im wrong please.
i do not think believe this function should log any error on the transaction commit/restart path.
There was a problem hiding this comment.
Thanks, you are right. Modified to return rc (ENOENT) in this case.
| link_entry = &src_entry; | ||
| } | ||
|
|
||
| rc = insert_entry(dfs->layout_v, parent->oh, th, name, len, 0, link_entry); |
There was a problem hiding this comment.
should this not use a conditional insert for insertion into the new parent?
can't you overwrite an existing entry there in this case?
There was a problem hiding this comment.
Thanks. I had thought about this and had made the change in the PoC code, but missed this time. I will add negative test case too.
| int | ||
| dfs_link(dfs_t *dfs, dfs_obj_t *obj, dfs_obj_t *parent, const char *name, dfs_obj_t **new_obj, | ||
| struct stat *stbuf) |
There was a problem hiding this comment.
this function could use a few code comments at every stage to explain the steps.
| * \param[in] parent Opened parent directory object where the new link is created. | ||
| * If NULL, use root obj. | ||
| * \param[in] name Link name of the new hard link. |
There was a problem hiding this comment.
for clarity, please rename these to new_*
| /* | ||
| * Delete all extended attributes stored under a directory entry. | ||
| * Enumerates every akey whose name starts with "x:" under the dkey | ||
| * identified by @name in directory object @oh, and punches each one. | ||
| */ | ||
| int | ||
| xattr_delete_entry(daos_handle_t oh, const char *name, daos_handle_t th) |
There was a problem hiding this comment.
there seems to be no tests added for this path.
perhaps xattr support should be left to a follow on PR?
There was a problem hiding this comment.
Added a test to validate delete of xattr from directory entry has happened. Rest of the xattr tests will be added in the next PR related to xattr set and get operations.
| uint64_t new_link_cnt; | ||
| bool local_tx = false; | ||
|
|
||
| if (!daos_handle_is_valid(th)) { |
There was a problem hiding this comment.
i thought tx will always be used in hardlinks case per the design
There was a problem hiding this comment.
Since dfs_remove() is in performance path, I have restricted the use of tx only for hardlinks.
There was a problem hiding this comment.
yes, but here we are in the hardlinks path here, so this should always be true?
There was a problem hiding this comment.
th will be TX_NONE in remove_entry if use_dtx is not set. The above check ensure that dtx is opened if use_dtx is not set and the file getting removed is a hardlink.
| /** | ||
| * Test dfs_link and dfs_remove covering the following steps: | ||
| * |
There was a problem hiding this comment.
some missing tests that i can think of:
- linking into an existing file, directory, symlink (all cases must be tested and fail with EEXIST)
- link with same directory under a different name
- xattr testing, but sounds like you want to leave this to a different PR (I agree), but you added some xattr code in the implementation in this PR
- error paths testing for dfs_link()
for later:
- hardlink + rename interactions
- chmod/chown with hardlinks are visible to all paths
- mode masking is properly covered in all cases including readdirplus (not tested here)
There was a problem hiding this comment.
I have added additional tests for negative scenarios and hardlinks in same directory.
The xattr changes to dfs_link()/remove were made from completeness perspective. Is it ok, I retain it for now? I will create the next PR for adding support for hardlinks to dfs_*xattr() and dfs_lookupx() routines and add tests validating correctness xattr migration and deletion.
…de bit Addressed review comments. Allow-unstable-test: true Signed-off-by: Sherin T George <sherin-t.george@hpe.com>
|
Test stage Functional on EL 9 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-18356/4/testReport/ |
knard38
left a comment
There was a problem hiding this comment.
First round of comments.
…de bit Addressed review comments. Allow-unstable-test: true Signed-off-by: Sherin T George <sherin-t.george@hpe.com>
|
Test stage NLT completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-18356/5/testReport/ |
|
Test stage Functional on EL 9 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-18356/5/execution/node/1043/log |
…de bit Addressed review comments regarding ctime not updated when hard link is first created for the file. Allow-unstable-test: true Signed-off-by: Sherin T George <sherin-t.george@hpe.com>
|
Test stage NLT completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-18356/6/testReport/ |
…de bit
Implement dfs_link for regular files and maintain hardlink metadata through GIT entries. Handle hardlink-aware unlink path, including xattr cleanup/migration behavior. Ensure stat/mode APIs expose external mode only, and add DFS unit coverage for link/remove workflows.
Steps for the author:
After all prior steps are complete: