Skip to content

improvement(cleanup): cleanup refs along in logs cleanup job#4661

Merged
icecrasher321 merged 3 commits into
stagingfrom
improvement/cleanup-logs
May 19, 2026
Merged

improvement(cleanup): cleanup refs along in logs cleanup job#4661
icecrasher321 merged 3 commits into
stagingfrom
improvement/cleanup-logs

Conversation

@icecrasher321
Copy link
Copy Markdown
Collaborator

@icecrasher321 icecrasher321 commented May 19, 2026

Summary

Update cleanup logs job to cleanup associated refs.
Also updated cleanup job to use billing helpers from rest of platform.

Type of Change

  • New feature

Testing

Tested manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel
Copy link
Copy Markdown

vercel Bot commented May 19, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped May 19, 2026 7:02pm

Request Review

@cursor
Copy link
Copy Markdown

cursor Bot commented May 19, 2026

PR Summary

Medium Risk
Changes deletion criteria and adds storage-key cleanup during log retention, plus rewires plan/workspace resolution for dispatching cleanup jobs; mistakes could cause unintended data retention or premature deletion. Uses DB text search and additional subscription lookups, which may impact performance at scale.

Overview
Extends cleanup-logs to also delete cloud-stored “large execution values” referenced from workflowExecutionLogs.executionData, while avoiding deletion when the same storage key is still referenced by non-deleted logs; it also skips deleting logs for executions in resumable paused states and adds detailed deletion stats logging.

Reworks retention scoping in cleanup-dispatcher to derive workspace sets from the same effective plan lookup used elsewhere (handling personal vs organization workspaces), and tightens enterprise cleanup eligibility by verifying the workspace is org-mode and the org’s current subscription resolves to enterprise before applying org-configured retention.

Adds onError options to subscription lookup helpers (getOrganizationSubscription, getHighestPrioritySubscription, and new getHighestPriorityPersonalSubscription) so cleanup dispatch can fail fast per-tenant and skip problematic rows rather than silently misclassifying plans.

Reviewed by Cursor Bugbot for commit b9ffe66. Configure here.

Comment thread apps/sim/lib/billing/cleanup-dispatcher.ts Outdated
Comment thread apps/sim/background/cleanup-logs.ts
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 19, 2026

Greptile Summary

This PR extends the log cleanup job to also delete orphaned large-value storage refs (S3 keys embedded in executionData) when their associated execution logs are purged, and refactors the cleanup dispatcher to use getHighestPriorityPersonalSubscription — fixing a bug where personal workspaces owned by enterprise org members were incorrectly placed in the enterprise cleanup bucket instead of the free-plan bucket.

  • Large-value key cleanup (cleanup-logs.ts): on each batch, collectLargeValueKeys extracts storage keys from executionData, then filterLargeValueKeysWithoutRetainedReferences checks whether any surviving log row still references each key before deleting from storage. The cross-batch reference guard is correct — onBatch runs before the DB DELETE, so the notInArray exclusion of current-batch IDs is sufficient.
  • Dispatcher refactor (cleanup-dispatcher.ts): replaces the old SQL JOIN against the subscription table with per-user calls to getHighestPriorityPersonalSubscription, isolating personal-workspace plan resolution from org-level subscriptions. Individual per-user errors are now caught and logged without aborting the entire dispatch.
  • Billing utilities (billing.ts, plan.ts, subscription.ts): adds onError: 'throw' option to getOrganizationSubscription and getHighestPrioritySubscription, and extracts getHighestPriorityPersonalSubscription as a separate, personal-subscription-only lookup.

Confidence Score: 5/5

Safe to merge — the core cleanup correctness is sound and the dispatcher bug fix is well-reasoned. The two comments are background-job performance observations that do not affect data integrity.

The cross-batch reference guard in filterLargeValueKeysWithoutRetainedReferences is logically correct: onBatch runs before the DELETE, so excluding current-batch IDs via notInArray accurately represents which logs will survive. The dispatcher correctly isolates personal-workspace plan resolution from org-level subscriptions, fixing the previously identified enterprise-member bug. The per-key sequential full-table scan and redundant workspace queries in dispatchCleanupJobs are worth addressing but do not produce wrong results.

apps/sim/background/cleanup-logs.ts — the per-key position() scan loop warrants a closer look before this runs at enterprise scale.

Important Files Changed

Filename Overview
apps/sim/background/cleanup-logs.ts Adds large-value storage key cleanup alongside log deletion. The correctness logic is sound (onBatch runs before DELETE, cross-batch reference check via notInArray is correct), but filterLargeValueKeysWithoutRetainedReferences issues a sequential full-table text scan per unique key, which will be slow for large deployments.
apps/sim/lib/billing/cleanup-dispatcher.ts Replaces SQL-JOIN plan resolution with getHighestPriorityPersonalSubscription — correctly fixes the enterprise-member personal-workspace bug. resolveWorkspaceIdsForPlan is called redundantly (N+1 times) in dispatchCleanupJobs, causing repeated full workspace + subscription queries.
apps/sim/lib/billing/core/billing.ts Adds optional onError: 'throw' behaviour to getOrganizationSubscription — clean, backward-compatible change.
apps/sim/lib/billing/core/plan.ts Adds getHighestPriorityPersonalSubscription (personal-only subscription lookup) and refactors pickHighestPrioritySubscription into a shared helper. The [...orgSubs, ...personalSubs] ordering preserves the original org-over-personal tie-break. Looks correct.
apps/sim/lib/billing/core/subscription.ts Re-exports getHighestPriorityPersonalSubscription — trivial change, no issues.

Sequence Diagram

sequenceDiagram
    participant CJ as CleanupLogsJob
    participant CBD as chunkedBatchDelete
    participant DB as Database
    participant Filter as filterLargeValueKeysWithoutRetainedRefs
    participant S3 as StorageService

    CJ->>CBD: chunkedBatchDelete(workflowExecutionLogs)
    loop per workspace chunk x batch
        CBD->>DB: "SELECT rows WHERE workspaceId IN AND startedAt < retentionDate"
        DB-->>CBD: rows[]
        CBD->>CJ: onBatch(rows)
        CJ->>Filter: filterLargeValueKeysWithoutRetainedRefs(keys, deletedLogIds)
        loop per unique key
            Filter->>DB: SELECT id WHERE id NOT IN deletedLogIds AND position(key in executionData) LIMIT 1
            DB-->>Filter: referencingLog?
        end
        Filter-->>CJ: unreferencedKeys[]
        CJ->>S3: deleteFile for each unreferencedKey
        CBD->>DB: DELETE WHERE id IN batch ids
    end
Loading

Reviews (3): Last reviewed commit: "cleanup code" | Re-trigger Greptile

Comment thread apps/sim/background/cleanup-logs.ts
Comment thread apps/sim/lib/billing/cleanup-dispatcher.ts Outdated
@icecrasher321
Copy link
Copy Markdown
Collaborator Author

bugbot run

@icecrasher321
Copy link
Copy Markdown
Collaborator Author

@greptile

Comment thread apps/sim/lib/billing/cleanup-dispatcher.ts
Comment thread apps/sim/lib/billing/cleanup-dispatcher.ts Outdated
@icecrasher321
Copy link
Copy Markdown
Collaborator Author

bugbot run

@icecrasher321
Copy link
Copy Markdown
Collaborator Author

@greptile

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit b9ffe66. Configure here.

Comment thread apps/sim/background/cleanup-logs.ts
Comment thread apps/sim/lib/billing/cleanup-dispatcher.ts
@icecrasher321 icecrasher321 merged commit 6414b93 into staging May 19, 2026
14 checks passed
@waleedlatif1 waleedlatif1 deleted the improvement/cleanup-logs branch May 19, 2026 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant