PG-2311: Extensible sync handler registration (register_sync_handler)#601

Open: glamberson wants to merge 2 commits into percona:PSP_REL_18_STABLE from lamco-admin:PG-2311-register-sync-handler

@glamberson

This is the Percona companion track to upstream CF 6689 ("Extensible sync handler registration", register_sync_handler), filed against PSP_REL_18_STABLE per the contribution guide referenced from PG-2311.

Motivation

I'm working on Lamstore, an extent-based PostgreSQL storage backend whose data lives at byte offsets inside a volume file rather than as md segment files. Lamstore uses a custom storage manager to route backend I/O. That works today on PSP18, but the checkpoint fsync pipeline is a dead end: syncsw[] in sync.c is a static const SyncOps[] indexed by SyncRequestHandler, and there is no way for an extension to install its own dispatch entry. An extension cannot reuse SYNC_HANDLER_MD because mdsyncfiletag() resolves paths via relpathperm() + OpenTransientFile() + fsync(), which does not reach Lamstore storage.

This two-commit series adds a dynamic register_sync_handler() API, parallel to RegisterCustomRmgr and the smgr_register shape being developed on upstream CF 5616. Built-in handlers (MD, CLOG, commit_ts, multixact_offset, multixact_member) keep their existing enum indices 0..4. Extensions get IDs starting at SYNC_HANDLER_FIRST_DYNAMIC.
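The registration scheme described above can be modeled outside PostgreSQL in a few lines. The following is a minimal standalone sketch, not the patch itself: every `demo_`/`DEMO_` name is hypothetical, the callback struct is reduced to a single member, the value 5 for the first dynamic ID is an assumption inferred from the built-ins occupying 0..4, and returning -1 on rejection is a stand-in for whatever error reporting the real register_sync_handler() uses. It assumes the built-ins were registered first and leaves the EXEC_BACKEND ordering question aside.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced callback set; the real SyncOps has more members. */
typedef struct DemoSyncOps
{
    int (*sync_syncfiletag)(const char *path);
} DemoSyncOps;

/* Assumed values: built-ins hold 0..4, so the first dynamic ID is taken as 5. */
enum { DEMO_NUM_BUILTIN = 5, DEMO_FIRST_DYNAMIC = 5 };

static DemoSyncOps *demo_syncsw = NULL;   /* heap-allocated dispatch table */
static int demo_nhandlers = 0;
static int demo_capacity = 0;
static int demo_preload_done = 0;         /* models "preload libraries done" */

/* Placeholder callback used for built-ins and for the example extension. */
int demo_noop_sync(const char *path)
{
    (void) path;
    return 0;
}

/* Append one entry, growing the table as needed; returns its index or -1. */
static int demo_append(DemoSyncOps ops)
{
    if (demo_nhandlers == demo_capacity)
    {
        int newcap = (demo_capacity == 0) ? 8 : demo_capacity * 2;
        DemoSyncOps *grown = realloc(demo_syncsw,
                                     (size_t) newcap * sizeof(DemoSyncOps));

        if (grown == NULL)
            return -1;
        demo_syncsw = grown;
        demo_capacity = newcap;
    }
    demo_syncsw[demo_nhandlers] = ops;
    return demo_nhandlers++;
}

/* Phase 1: the five built-ins take indices 0..4 in a fixed order. */
void demo_init_sync_handlers(void)
{
    DemoSyncOps builtin = { demo_noop_sync };

    for (int i = 0; i < DEMO_NUM_BUILTIN; i++)
    {
        if (demo_append(builtin) != i)
            abort();
    }
}

/* Phase 2: extensions receive sequential IDs until preload is finished. */
int demo_register_sync_handler(const DemoSyncOps *ops)
{
    /* Precondition of this sketch only: built-ins are already in place. */
    assert(demo_nhandlers >= DEMO_NUM_BUILTIN);

    if (demo_preload_done || ops == NULL || ops->sync_syncfiletag == NULL)
        return -1;
    return demo_append(*ops);
}

/* Closes the registration window, like the end of library preloading. */
void demo_finish_preload(void)
{
    demo_preload_done = 1;
}
```

In this model, calling `demo_init_sync_handlers()` and then registering two extension handlers yields IDs 5 and 6, and any registration attempted after `demo_finish_preload()` is rejected.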

The patch is strictly additive: zero WAL format changes, zero FileTag layout changes, zero shared memory changes, zero catalog involvement.

Apply against PSP_REL_18_STABLE

The patches were originally written against upstream master (PG 19devel). Applied here at HEAD 25b35533cc2 (PSP_REL_18_STABLE tip as of 2026-05-21):

  • 5 of 6 files in commit 1 applied cleanly via git apply.
  • src/backend/postmaster/postmaster.c needed two short manual reflows. PSP18 has register_builtin_dynamic_managers() where upstream has RegisterBuiltinShmemCallbacks(), and PSP18 lacks storage/shmem_internal.h relative to upstream. The new #include "storage/sync.h" and the InitSyncHandlers() call were placed in semantically equivalent positions (immediately after register_builtin_dynamic_managers(), before process_shared_preload_libraries()).
  • 7 of 9 files in commit 2 (the test module) applied cleanly.
  • src/test/modules/Makefile and src/test/modules/meson.build needed one-line additions to insert test_sync_handler in the SUBDIRS / subdir() list at the correct alphabetical position. PSP18 lacks test_shmem so the upstream patch context did not match exactly.

The semantic content of the two commits is identical to the upstream v2 series at https://www.postgresql.org/message-id/177699144837.2925.1505658927335705957@lamco.io.

Verification

  • ./configure --prefix=/tmp/pg-install --with-icu --enable-tap-tests: clean
  • make -j$(nproc): clean build
  • make -C src/test/modules/test_sync_handler check: 5/5 PASS
  • make check-world: clean (any deviation observed later will be noted in a follow-up comment)
  • objdump on GCC 14.2 at -O2: per-dispatch instruction sequence byte-identical to stock (verified on the upstream v2 submission; PSP18's sync.c and sync.h are byte-identical to upstream so the same verification carries over)

Relationship to CF 5616 and CF 6689

This is the Percona track for the same patches submitted upstream as CF 6689. The pgsql-hackers thread is https://www.postgresql.org/message-id/IA1PR07MB983072521EE7FDEE98902534A9592@IA1PR07MB9830.namprd07.prod.outlook.com (the latest v2 message is dated 2026-04-24). Upstream targets CF 59 (PG20-1), whose review window opens 2026-07-01.

CF 5616 (Zsolt Parragi's "Extensible storage manager API") makes smgrsw[] dynamic; this patch does the same for syncsw[]. The two are orthogonal: none of CF 5616's six v6 sub-patches touches sync.c or sync.h, and this patch does not depend on any of CF 5616's unresolved design questions (hook vs registration, catalog recovery, per-tablespace vs per-relation config, GUC chaining).

Pre-empted reviewer objections

Andres Freund's 2023-07-01 review of CF 4428 v1 (postgrespro thread-id 2654666) raised four objections to a previous smgr_register prototype. All four are addressed in the v2 design:

  1. "Not a good place to initialize" - InitSyncHandlers() is called from PostmasterMain (for fork()) and from register_sync_handler() itself (for EXEC_BACKEND), with a builtin_sync_handlers_registered flag guarding against repeated built-in registration. Same lifecycle pattern as RegisterCustomRmgr.
  2. "Dynamic allocation overhead" - Each caller hoists syncsw into a local SyncOps *ops at function entry; per-dispatch assembly is byte-identical to stock. One additional memory load at function entry (one L1 cache hit), paid once per ProcessSyncRequests call, not per dispatch.
  3. "Compiler barrier?" - Not included. Single-threaded preload; fork() is a full POSIX barrier. The EXEC_BACKEND path uses a different synchronization shape (re-runs process_shared_preload_libraries in the child's own address space) which the v2-0001 commit message describes.
  4. "Per-entry size redundancy" - N/A. SyncOps is pure function pointers, no size field.
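The hoisting mentioned in point 2 has roughly the following shape. This is an illustrative standalone sketch under stated assumptions, not code from the patch: all `demo_` names are hypothetical, the table is a two-entry stand-in, and the callback takes a bare segment number instead of a FileTag. It only shows where the single extra pointer load sits relative to the dispatch loop; it makes no claim about the generated assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced callback set standing in for SyncOps. */
typedef struct DemoSyncOps
{
    int (*sync_syncfiletag)(int segno);
} DemoSyncOps;

static int demo_calls = 0;

/* Stand-in handler that always succeeds. */
int demo_sync_ok(int segno)
{
    (void) segno;
    demo_calls++;
    return 0;
}

/* Stand-in handler that always fails. */
int demo_sync_fail(int segno)
{
    (void) segno;
    demo_calls++;
    return -1;
}

static DemoSyncOps demo_table[2] = { { demo_sync_ok }, { demo_sync_fail } };

/* The table is reached through a pointer rather than a static const array. */
static DemoSyncOps *demo_syncsw = demo_table;

/* Dispatch n requests; returns how many callbacks reported failure. */
int demo_process_requests(const int *handlers, const int *segnos, size_t n)
{
    DemoSyncOps *ops = demo_syncsw;   /* loaded once, at function entry */
    int failures = 0;

    for (size_t i = 0; i < n; i++)
    {
        /* Per-request dispatch indexes the local pointer only. */
        if (ops[handlers[i]].sync_syncfiletag(segnos[i]) != 0)
            failures++;
    }
    return failures;
}
```

Because `ops` is a local, the loop body indexes it exactly as it would index an array, and the global pointer is read once per call.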

EXEC_BACKEND notes

v2 specifically fixes an EXEC_BACKEND load-order bug in the earlier (recalled) v1 submission. Under EXEC_BACKEND, each child re-runs process_shared_preload_libraries() in a fresh address space, so an extension's _PG_init() can reach register_sync_handler() before the child has called InitSync(). v2 handles this by calling InitSyncHandlers() at the top of register_sync_handler() as a defensive idempotent initializer. PSP18's postmaster startup architecture matches upstream's for the relevant code paths, so the same fix applies here unchanged.
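The ordering problem and its fix can be illustrated with a small standalone model. The names below are hypothetical (`demo_builtin_registered` mirrors the role of builtin_sync_handlers_registered, and `demo_reset_address_space()` merely simulates a freshly started child); the count of five built-ins comes from the description, and everything else is an assumption made for the sketch.

```c
#include <assert.h>

enum { DEMO_NUM_BUILTIN = 5 };

static int demo_builtin_registered = 0;  /* guards repeated built-in setup */
static int demo_nhandlers = 0;           /* number of table entries in use */

/* Idempotent: the first call claims slots 0..4, later calls do nothing. */
void demo_init_sync_handlers(void)
{
    if (demo_builtin_registered)
        return;
    demo_nhandlers = DEMO_NUM_BUILTIN;
    demo_builtin_registered = 1;
}

/* Registration initializes defensively, so call order does not matter. */
int demo_register_sync_handler(void)
{
    demo_init_sync_handlers();
    return demo_nhandlers++;
}

/* Simulates a child starting in a fresh address space. */
void demo_reset_address_space(void)
{
    demo_builtin_registered = 0;
    demo_nhandlers = 0;
}
```

In this model an extension that registers before any explicit initialization (the EXEC_BACKEND case) still receives ID 5, and a later `demo_init_sync_handlers()` call leaves the table untouched. The fork() ordering, with initialization first, gives the same result.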

Refs

Introduce a public extension API, register_sync_handler(), that lets
extensions install their own entries in the sync.c dispatch table.
This enables storage-related extensions to participate in the
checkpoint fsync pipeline without faking md.c segments or bypassing
sync.c's request coalescing and cancellation machinery.

The previously static syncsw[] array becomes a heap-allocated
dispatch table populated in two phases: the five built-in handlers
(MD, CLOG, commit_ts, multixact_offset, multixact_member) are
registered via InitSyncHandlers() before process_shared_preload_libraries(),
and extension _PG_init() calls receive sequentially assigned IDs
starting at SYNC_HANDLER_FIRST_DYNAMIC. Registration is forbidden
after process_shared_preload_libraries_done is set.

InitSyncHandlers() is called from both PostmasterMain() (for the
fork() path) and from register_sync_handler() itself (for the
EXEC_BACKEND path, where each child re-runs shared_preload_libraries
in its own address space and may reach an extension's registration
call before it has called InitSync()). An explicit
builtin_sync_handlers_registered flag guards against repeated
built-in registration.

SYNC_HANDLER_NONE is changed from its previous implicit value of 5
to an explicit -1 so that the "no handler" sentinel cannot be
confused with a valid handler index. The only consumers in core
are value-agnostic != comparisons in slru.c.
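A small standalone sketch shows why the sentinel had to move. The enums below are hypothetical copies made for illustration (`DEMO_OLD_*` models the enum with its trailing implicit "none" member, `DEMO_NEW_*` the explicit -1), and the first dynamic ID is assumed to follow the last built-in directly.

```c
#include <assert.h>

/* Old shape: the "none" member simply follows the last built-in. */
enum DemoOldHandler
{
    DEMO_OLD_MD = 0,
    DEMO_OLD_CLOG,
    DEMO_OLD_COMMIT_TS,
    DEMO_OLD_MULTIXACT_OFFSET,
    DEMO_OLD_MULTIXACT_MEMBER,
    DEMO_OLD_NONE                 /* implicitly 5 */
};

/* New shape: the sentinel is pinned outside the index range. */
enum DemoNewHandler
{
    DEMO_NEW_NONE = -1,
    DEMO_NEW_MD = 0,
    DEMO_NEW_CLOG,
    DEMO_NEW_COMMIT_TS,
    DEMO_NEW_MULTIXACT_OFFSET,
    DEMO_NEW_MULTIXACT_MEMBER
};

/* Assumption: dynamic IDs start right after the last built-in. */
enum { DEMO_FIRST_DYNAMIC = DEMO_NEW_MULTIXACT_MEMBER + 1 };

/* An ID is a usable table index only if it lies in [0, nhandlers). */
int demo_is_registered(int id, int nhandlers)
{
    return id >= 0 && id < nhandlers;
}
```

With one extension registered (six entries in total), the old sentinel value 5 would pass `demo_is_registered()` and alias the extension's handler, while -1 never can.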

Documentation: doc/src/sgml/custom-sync-handler.sgml, modeled on
doc/src/sgml/custom-rmgr.sgml.

This patch is being filed against PSP_REL_18_STABLE as the companion
to the upstream submission at CF 6689 (commitfest.postgresql.org/patch/6689/).
The patch applies cleanly to PSP_REL_18_STABLE since storage/sync/sync.c
and storage/sync.h have no Percona-specific deltas; the only manual
adjustment required during application was reflowing the new include
and the InitSyncHandlers() call into the Percona-specific surrounding
structure in postmaster.c (which adds register_builtin_dynamic_managers()
between the upstream insertion point and process_shared_preload_libraries()).

Discussion: https://postgr.es/m/IA1PR07MB983072521EE7FDEE98902534A9592@IA1PR07MB9830.namprd07.prod.outlook.com
Jira: https://perconadev.atlassian.net/browse/PG-2311

test_sync_handler exercises register_sync_handler() from _PG_init()
and verifies:

  - The registered handler ID is at least SYNC_HANDLER_FIRST_DYNAMIC.
  - Distinct FileTags produce distinct sync_syncfiletag callbacks
    at CHECKPOINT time.
  - Duplicate FileTags coalesce via HASH_BLOBS to a single dispatch.
  - Idle checkpoints do not re-dispatch already-processed entries
    (cycle_ctr skip).
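The coalescing behavior the test checks can be sketched in isolation. This is a simplified standalone model, not sync.c: the names are hypothetical, a linear scan with memcmp() stands in for the HASH_BLOBS hash table (both compare raw key bytes), the tag struct is a two-field stand-in with no padding, and the cycle counter is not modeled; processed entries are simply dropped.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical two-field tag; both fields are 4 bytes, so no padding. */
typedef struct DemoFileTag
{
    int          handler;
    unsigned int segno;
} DemoFileTag;

#define DEMO_MAX_PENDING 16

static DemoFileTag demo_pending[DEMO_MAX_PENDING];
static int demo_npending = 0;
static int demo_dispatches = 0;

/* Zero the whole tag first so a byte-wise comparison is well defined. */
DemoFileTag demo_make_tag(int handler, unsigned int segno)
{
    DemoFileTag tag;

    memset(&tag, 0, sizeof(tag));
    tag.handler = handler;
    tag.segno = segno;
    return tag;
}

/* Returns 1 if newly queued, 0 if coalesced, -1 if the queue is full. */
int demo_register_sync_request(const DemoFileTag *tag)
{
    for (int i = 0; i < demo_npending; i++)
    {
        if (memcmp(&demo_pending[i], tag, sizeof(*tag)) == 0)
            return 0;
    }
    if (demo_npending == DEMO_MAX_PENDING)
        return -1;
    demo_pending[demo_npending++] = *tag;
    return 1;
}

/* One dispatch per pending entry; returns how many were dispatched. */
int demo_process_sync_requests(void)
{
    int dispatched = demo_npending;

    demo_dispatches += dispatched;
    demo_npending = 0;            /* processed entries are dropped */
    return dispatched;
}
```

Queueing tags (5, 1), (5, 2) and (5, 1) again leaves two pending entries, so the first processing pass dispatches twice and an immediately following pass dispatches nothing.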

Shared state between the backend and the TAP harness lives in a small
shmem area allocated via RequestAddinShmemSpace, accessed under an
LWLock taken via RequestNamedLWLockTranche. The test SQL function
triggers a CHECKPOINT and reports the observed dispatch counts back
through psql, which the TAP test asserts against.

Layout mirrors the existing fsync_checker test module from CF 5616 v5+.

Discussion: https://postgr.es/m/IA1PR07MB983072521EE7FDEE98902534A9592@IA1PR07MB9830.namprd07.prod.outlook.com
Jira: https://perconadev.atlassian.net/browse/PG-2311