Add instance health checks #234
✱ Stainless preview builds for hypeman
✅ hypeman-openapi
✅ hypeman-typescript
✅ hypeman-go
This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
Monitoring Plan: Instance Health Checks (PR #234)
This PR adds a new health-check subsystem to The main risks are: (1) validation errors in Key risks to watch:
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 36c8c34.

Summary
- add `health_check` policy and `health_status` response fields for http, tcp, and exec probes
- probe instances that are `Initializing` or `Running`, while keeping public health status `starting` until the instance reaches `Running`
- update `TestCreateInstanceWithNetwork` so the VM-starting network path waits for persisted `healthy` status
- add `lib/healthcheck/README.md`

Tests
- `go test ./lib/healthcheck`
- `go test ./lib/instances -run TestCreateInstanceWithNetwork -count=0`
- `go test ./lib/instances -run 'TestHealthCheck|TestValidateCreateRequestHealthCheck|TestValidateUpdateInstanceRequest|TestManagerUpdateInstanceHealthCheckOnlyPublishesLifecycleUpdate|TestLifecycleEventMetrics_ObserveSubscribersQueueDepthAndDrops|TestLifecycleSubscribers'`
- `go test ./cmd/api/api -run 'TestCreateInstance_MapsHealthCheckPolicy|TestUpdateInstance_MapsHealthCheckPatch|TestCreateInstance_MapsAutoStandbyPolicy|TestUpdateInstance_MapsAutoStandbyPatch'`
- `go test ./cmd/api -run TestDoesNotExist`
- `go test ./lib/providers`

Notes
- `go test ./lib/instances -run TestCreateInstanceWithNetwork -count=1` was attempted twice; both runs failed before instance creation because the existing nginx image readiness wait still saw image status `pending` after 60s.
- `go test ./cmd/api/api` is currently blocked by Docker Hub unauthenticated pull rate limits and local network bridge permissions in existing integration tests.
- `make generate-wire` is currently blocked because the checked-in wire binary was built with Go 1.24 and this package now requires Go 1.25; `wire_gen.go` was updated in the same small shape and `go test ./cmd/api -run TestDoesNotExist` passes.

Note
Medium Risk
Adds a new asynchronous health-check controller that probes running/initializing instances and persists runtime status, plus new API surface for configuring checks; timing/state handling and background scheduling increase behavioral and concurrency risk.
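The state handling flagged here comes down to one rule stated in the summary: probes run while an instance is `Initializing` or `Running`, but the public status stays `starting` until the instance is `Running`. Below is a minimal sketch of that rule, not the PR's code. The type names, the `Stopped` state, the `unhealthy` value, and both function names are assumptions made for illustration.

```go
package main

import "fmt"

// InstanceState and HealthStatus are hypothetical types for this sketch;
// the definitions in the PR may look different.
type InstanceState string
type HealthStatus string

const (
	StateInitializing InstanceState = "Initializing"
	StateRunning      InstanceState = "Running"
	StateStopped      InstanceState = "Stopped" // assumed, not named in the PR text

	HealthStarting  HealthStatus = "starting"
	HealthHealthy   HealthStatus = "healthy"
	HealthUnhealthy HealthStatus = "unhealthy" // assumed, not named in the PR text
)

// shouldProbe reports whether probes are scheduled for an instance in this state.
func shouldProbe(state InstanceState) bool {
	return state == StateInitializing || state == StateRunning
}

// publicHealthStatus maps the internally probed status to the one shown to
// API clients. The boolean is false when the instance is not being probed.
func publicHealthStatus(state InstanceState, probed HealthStatus) (HealthStatus, bool) {
	if !shouldProbe(state) {
		return "", false
	}
	if state != StateRunning {
		// Probes may already be passing, but the public view is masked.
		return HealthStarting, true
	}
	return probed, true
}

func main() {
	for _, st := range []InstanceState{StateInitializing, StateRunning, StateStopped} {
		status, ok := publicHealthStatus(st, HealthHealthy)
		fmt.Printf("state=%s probed=%s public=%q reported=%v\n", st, HealthHealthy, status, ok)
	}
}
```

Keeping the masking in one pure function makes the timing-sensitive part easy to cover with table-driven tests, separate from the scheduler.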
Overview
Adds a first-class workload health dimension to instances via a new `lib/healthcheck` package (policy normalization/validation, probe execution, and status tracking) plus persisted runtime state.

Extends the Instances API to accept and return `health_check` and to report `health_status`, wiring request/response mapping and validation into create/update flows and resetting runtime on policy changes.

Introduces an `instances.HealthCheckController` that subscribes to lifecycle events, schedules probes while instances are `Initializing`/`Running` (with `starting` masking until `Running`), runs HTTP/TCP/exec checks, and persists health runtime; the API process now wires and runs this controller. Separately, metadata writes are made atomic via temp-file + rename, and tests/integration paths are updated to cover health-check behavior and lifecycle consumer metrics.

Reviewed by Cursor Bugbot for commit a32c4c8. Bugbot is set up for automated code reviews on this repo.