Overview
When --attempt-instant-ddl is used and the instant ALTER succeeds, gh-ost can deadlock during final cleanup and hang forever instead of exiting.
Root cause
initiateApplier writes a GhostTableMigrated changelog row. The streamer's changelog listener callback (Migrator.onChangelogStateEvent) publishes that signal synchronously via base.SendWithContext(ctx, mgtr.ghostTableMigrated, true) while holding EventsStreamer.listenersMutex (the send happens inside notifyListeners, which holds the mutex for the duration of the callback).
On the normal migration path there is a dedicated receiver:
    if !mgtr.migrationContext.Resume {
        <-mgtr.ghostTableMigrated
    }
But on the instant-DDL success path (go/logic/migrator.go), Migrate() returns early right after finalCleanup() and never receives from ghostTableMigrated:
    if err := mgtr.applier.AttemptInstantDDL(); err == nil {
        if err := mgtr.finalCleanup(); err != nil {
            return nil
        }
        ...
        return nil
    }
So the changelog send blocks forever and listenersMutex stays held. finalCleanup() then closes the binlog reader, but the reader's rows-event decode callback (EventsStreamer.shouldDecodeRowsEvent) needs the same mutex, and BinlogSyncer.Close() waits (via WaitGroup) for that goroutine to exit. The result is a permanent deadlock: gh-ost hangs and never completes the migration.
Reproduction
Run an instant-DDL-eligible migration against MySQL 8.0 with --attempt-instant-ddl, e.g. adding a column with a default:
    gh-ost --attempt-instant-ddl --execute \
      --alter="ADD COLUMN c INT NOT NULL DEFAULT 1" \
      --host=127.0.0.1 --port=3306 --database=db --table=t ...
gh-ost applies the instant DDL, logs the "migrated instantly" success, but then hangs in cleanup instead of exiting.
Proposed fix
Drain the GhostTableMigrated signal on the instant-DDL success path before finalCleanup(), mirroring the receive already present on the normal path, guarded by !Resume (resume migrations never emit the signal). I have a PR ready with the fix plus regression tests.
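For illustration, the shape of that change might look roughly like the following. This is a self-contained sketch under assumptions, not the PR itself: the migrator and migrationContext types here are minimal stand-ins, finishInstantDDL is a hypothetical helper name, and the real code would do this inline in Migrate().

```go
package main

import "fmt"

// Minimal stand-ins for the gh-ost types; only the fields needed here.
type migrationContext struct {
	Resume bool
}

type migrator struct {
	migrationContext   *migrationContext
	ghostTableMigrated chan bool
	cleanedUp          bool
}

// finalCleanup stands in for the real cleanup, which closes the binlog reader.
func (mgtr *migrator) finalCleanup() error {
	mgtr.cleanedUp = true
	return nil
}

// finishInstantDDL (hypothetical name) drains the signal before cleanup,
// mirroring the receive on the normal path. Resume migrations skip the
// receive because they never emit the signal.
func (mgtr *migrator) finishInstantDDL() error {
	if !mgtr.migrationContext.Resume {
		<-mgtr.ghostTableMigrated
	}
	return mgtr.finalCleanup()
}

func main() {
	mgtr := &migrator{
		migrationContext:   &migrationContext{},
		ghostTableMigrated: make(chan bool),
	}
	// Stands in for the changelog listener publishing the signal.
	go func() { mgtr.ghostTableMigrated <- true }()

	err := mgtr.finishInstantDDL()
	fmt.Println("err:", err, "cleaned up:", mgtr.cleanedUp)
}
```

Receiving before finalCleanup() lets the listener callback return and release listenersMutex, so the binlog reader goroutine can exit when the reader is closed.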