Skip to content

feat: Add graph.StorageMaintainer interface - BED-8353#94

Open
brandonshearin wants to merge 2 commits into
mainfrom
BED-8353
Open

feat: Add graph.StorageMaintainer interface - BED-8353#94
brandonshearin wants to merge 2 commits into
mainfrom
BED-8353

Conversation

@brandonshearin
Copy link
Copy Markdown
Contributor

@brandonshearin brandonshearin commented Jun 1, 2026

Description

Resolves: BED-8353

  • This change breaks out a piece of this PR DRAFT: Introduce regular database maintenance to datapipe #79
  • It defines a new optional interface that dawgs drivers can implement for optimizing storage via methods like vacuum, analyze, etc. Extensive Godoc strings added with help from Mr. Auggie
  • The OptimizeOption functional option pattern provides extensibility for future optimization parameters beyond just storage targets. This is the only difference between this interface proposal and the original implementation proposed in the draft PoC.

Type of Change

  • Chore (a change that does not modify the application functionality)
  • Bug fix (a change that fixes an issue)
  • New feature / enhancement (a change that adds new functionality)
  • Refactor (no behaviour change)
  • Test coverage
  • Build / CI / tooling
  • Documentation

Testing

  • Unit tests added / updated
  • Integration tests added / updated
  • Full test suite run (make test_all with CONNECTION_STRING set)

Screenshots (if appropriate):

Driver Impact

  • PostgreSQL driver (drivers/pg)
  • Neo4j driver (drivers/neo4j)

Checklist

  • Code is formatted
  • All existing tests pass
  • go.mod / go.sum are up to date if dependencies changed

@brandonshearin brandonshearin self-assigned this Jun 1, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 1, 2026

Review Change Stack

Walkthrough

The PR introduces a new storage optimization framework by defining a configurable public API in graph/optimize.go, validating the configuration pattern with tests, and providing an initial no-op implementation in the PostgreSQL driver.

Changes

Storage Optimization Framework

Layer / File(s) Summary
Optimization API contract and configuration
graph/optimize.go
TargetStorage enum defines Nodes and Relationships constants; OptimizeConfig holds selected targets; OptimizeOption function type mutates config; OptimizeTargets helper appends targets; StorageMaintainer interface requires Optimize(ctx, opts...) error method.
Option configuration and accumulation tests
graph/optimize_test.go
TestOptimizeOptions verifies that options mutate config correctly: no options leave Targets nil, empty OptimizeTargets() preserves nil, single call preserves argument order, and multiple calls accumulate targets sequentially.
PostgreSQL driver optimization entry point
drivers/pg/driver.go
Driver.Optimize method implements StorageMaintainer interface by building config from options and including a placeholder for future PostgreSQL VACUUM/ANALYZE work.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A framework for storage takes shape,
With targets to optimize escape—
Nodes, relationships in focus clear,
Tests validate the patterns here,
One driver awaits the real work, dear! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The pull request description covers all required template sections with relevant details including ticket reference, feature type, testing approach, and driver impact.
Title check ✅ Passed The pull request title accurately describes the main change: introducing a new graph.StorageMaintainer interface for storage optimization.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch BED-8353

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
graph/optimize.go (1)

7-14: ⚡ Quick win

Consider adding a String() method for debugging and logging.

The TargetStorage enum would benefit from a String() method to provide human-readable names when logging or debugging optimization operations.

🎯 Proposed enhancement
 const (
 	Nodes TargetStorage = iota
 	Relationships
 )
+
+func (t TargetStorage) String() string {
+	switch t {
+	case Nodes:
+		return "Nodes"
+	case Relationships:
+		return "Relationships"
+	default:
+		return fmt.Sprintf("TargetStorage(%d)", t)
+	}
+}

Note: This would require adding "fmt" to the imports.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@graph/optimize.go` around lines 7 - 14, Add a String() method on the
TargetStorage type to map enum values to human-readable names (e.g., return
"Nodes" for Nodes, "Relationships" for Relationships, and a fallback for unknown
values); update the graph/optimize.go imports to include "fmt" if using
fmt.Sprintf for the fallback. Implement the method with receiver (t
TargetStorage) String() string and reference the constants Nodes and
Relationships inside it so logging and debugging produce clear names.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@drivers/pg/driver.go`:
- Around line 180-188: The Optimize method currently builds graph.OptimizeConfig
but returns nil (no-op) which falsely signals success; update Driver.Optimize to
mark it as an explicit placeholder by adding a TODO comment and returning a
clear not-implemented error (e.g., errors.New("Driver.Optimize: not
implemented") or a package-specific ErrNotImplemented) instead of nil, and
ensure any capability advertising (StorageMaintainer) is removed or gated until
the VACUUM/ANALYZE implementation using graph.OptimizeConfig is completed so
callers can distinguish implemented vs placeholder behavior.

---

Nitpick comments:
In `@graph/optimize.go`:
- Around line 7-14: Add a String() method on the TargetStorage type to map enum
values to human-readable names (e.g., return "Nodes" for Nodes, "Relationships"
for Relationships, and a fallback for unknown values); update the
graph/optimize.go imports to include "fmt" if using fmt.Sprintf for the
fallback. Implement the method with receiver (t TargetStorage) String() string
and reference the constants Nodes and Relationships inside it so logging and
debugging produce clear names.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b7be0ac4-5cc2-4030-8d96-a44a76e2771d

📥 Commits

Reviewing files that changed from the base of the PR and between a637b4a and 681c8b3.

📒 Files selected for processing (3)
  • drivers/pg/driver.go
  • graph/optimize.go
  • graph/optimize_test.go

Comment thread drivers/pg/driver.go
Comment on lines +180 to +188
func (s *Driver) Optimize(ctx context.Context, opts ...graph.OptimizeOption) error {
config := &graph.OptimizeConfig{}
for _, opt := range opts {
opt(config)
}

// PostgreSQL VACUUM/ANALYZE implementation
return nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

No-op Optimize silently reports success.

The method builds config from the options but discards it and returns nil without performing any work. This is fine as an intentional placeholder, but as written it advertises the StorageMaintainer capability while doing nothing, so callers cannot distinguish "optimized" from "not yet implemented." Consider marking the placeholder as a TODO so it surfaces in tracking and isn't mistaken for a completed implementation.

📝 Suggested placeholder marker
-	// PostgreSQL VACUUM/ANALYZE implementation
-	return nil
+	// TODO(BED-8353): implement PostgreSQL VACUUM/ANALYZE using config.Targets.
+	// A nil/empty config.Targets means "optimize every region" per graph.OptimizeConfig.
+	return nil

Want me to open a follow-up issue to track the VACUUM/ANALYZE implementation?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@drivers/pg/driver.go` around lines 180 - 188, The Optimize method currently
builds graph.OptimizeConfig but returns nil (no-op) which falsely signals
success; update Driver.Optimize to mark it as an explicit placeholder by adding
a TODO comment and returning a clear not-implemented error (e.g.,
errors.New("Driver.Optimize: not implemented") or a package-specific
ErrNotImplemented) instead of nil, and ensure any capability advertising
(StorageMaintainer) is removed or gated until the VACUUM/ANALYZE implementation
using graph.OptimizeConfig is completed so callers can distinguish implemented
vs placeholder behavior.

@brandonshearin brandonshearin changed the title feature: Add graph.StorageMaintainer interface: BED-8353 feat: Add graph.StorageMaintainer interface - BED-8353 Jun 1, 2026
Comment thread graph/optimize.go
type OptimizeConfig struct {
// Targets is the set of storage regions to optimize. A nil or empty
// slice instructs the driver to optimize every region it knows about.
Targets []TargetStorage
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opinion: While convenient, an empty slice representing "all" supported storage targets may be misleading

Comment thread graph/optimize.go
Comment on lines +52 to +57
type StorageMaintainer interface {
// Optimize runs a storage maintenance pass against the regions
// identified by opts. With no options, every region the driver
// recognizes is optimized.
Optimize(ctx context.Context, opts ...OptimizeOption) error
}
Copy link
Copy Markdown
Member

@ddlees ddlees Jun 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really dig that this interface features Options as closures. I think it has to potential to be really powerful.

For example, we could potentially pass driver specific options and only apply them when applicable. We'll want to feel something like that out a bit but here's the general idea:

m.Optimize(ctx,
  graph.Targets(targets),
  pg.Partitions(partitions),
  pg.SkipReindex(),
)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context of having zero options is different than declaring zero storage targets.

On one hand, m.Optimize(ctx) optimizing all DAWGS storage makes perfect sense. If you're casting that from the graph db/driver, it says "optimize everything managed by this struct."

Where the expected behavior becomes murky is something like m.Optimize(OptimizeTargets(nil)). That to me could translate to, "of everything managed by this struct only optimize none of the targets."

We'll want to spend time considering how to expose targeting individual storage targets and all storage targets.

Copy link
Copy Markdown
Contributor Author

@brandonshearin brandonshearin Jun 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this API gracefully handles the edge case you are mentioning.

The foundational rule is: when OptimizeConfig.Targets is nil, the API should treat that as optimize all StorageTargets. The specific case you are describing is actually captured by the simple table test i made in optimize_test.go (test case #2).

Screenshot 2026-06-01 at 2 51 22 PM

This test shows that calling m.Optimize(OptimizeTargets()) will evaluate to an OptimizeConfig struct with a nil Targets property, which is what I think we want.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to reframe my last comment as a question. Does my understanding of this edge case align with your idea of a good design here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants