Publish API, v2 metadata cache, multi-arch release image #4
Closed
cyberb wants to merge 98 commits into
New control-plane endpoints let CI clients upload snaps to S3 via presigned multipart URLs, without any S3 credentials of their own:
- POST /syncloud/v1/publish/init: create multipart, return presigned PUT urls
- POST /syncloud/v1/publish/part-url: re-mint a single part url (resume after expiry)
- POST /syncloud/v1/publish/finalise: complete upload, write sidecars + snap.yaml + icon
All endpoints share the existing cache-refresh token.
snap.yaml is the source of truth for app metadata. The finalise endpoint parses it and rejects writes that would change (name, summary, description, type) compared to the existing shared object, which catches per-arch drift before it corrupts the catalog.
New CLI in cmd/publish/, shipped as a multi-arch docker image syncloud/release (amd64, arm64, arm/v7) built via buildx in CI.
Layout written by finalise:
- v2/apps/<channel>/<app>/snap.yaml: shared, drift-checked
- v2/apps/<channel>/<app>/icon.png: shared, idempotent overwrite
- v2/apps/<channel>/<app>/<arch>/version: per-arch pointer
- v2/apps/<channel>/<app>/<arch>/<v>.snap: per-arch binary
- v2/apps/<channel>/<app>/<arch>/<v>.sha384
- v2/apps/<channel>/<app>/<arch>/<v>.size
Existing cmd/release/ binary and the index-v2 read path are untouched. Cache reader switch to the v2 layout will land separately.
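The layout listed in this commit message can be sketched as a small key builder. This is only an illustration: the type and function names are hypothetical, and later commits in this PR keep the binary and sidecars at legacy paths, so it reflects the layout exactly as listed here and nothing more.

```go
package main

import "fmt"

// appKeys holds the object keys for one (channel, app, arch, version) tuple.
// The struct and its field names are illustrative, not taken from the repository.
type appKeys struct {
	SnapYaml, Icon, Version, Snap, Sha384, Size string
}

// keysFor builds the keys following the layout listed in the commit message.
func keysFor(channel, app, arch, version string) appKeys {
	shared := fmt.Sprintf("v2/apps/%s/%s", channel, app) // shared across arches
	perArch := fmt.Sprintf("%s/%s", shared, arch)        // one directory per arch
	return appKeys{
		SnapYaml: shared + "/snap.yaml",
		Icon:     shared + "/icon.png",
		Version:  perArch + "/version",
		Snap:     fmt.Sprintf("%s/%s.snap", perArch, version),
		Sha384:   fmt.Sprintf("%s/%s.sha384", perArch, version),
		Size:     fmt.Sprintf("%s/%s.size", perArch, version),
	}
}

func main() {
	// "files", "amd64" and "42" are made-up sample values.
	k := keysFor("stable", "files", "amd64", "42")
	fmt.Println(k.SnapYaml)
	fmt.Println(k.Snap)
}
```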
Cache reads snap.yaml + icon from v2/apps/<channel>/<app>/ and lists
apps via v2/apps/<channel>/apps.json. Per-arch version/sha384/size
and the snap binary URL stay at legacy paths (apps/<n>_<v>_<a>.snap.*
and releases/<channel>/<app>.<arch>.version) so apps still using the
old release binary remain visible during migration.
model.App.Required is gone. snap.yaml's type field is the only source
for app/base distinction; cache filters UI apps by Type == "base".
Publish API in finalise:
- completes multipart at apps/<n>_<v>_<a>.snap (legacy path)
- writes sha384, size and version sidecars at legacy paths
- writes snap.yaml + icon.png to v2 paths
- adds the app to v2/apps/<channel>/apps.json if missing
- rejects drift in (name, summary, description, type) on snap.yaml
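The drift rule in the last bullet compares four fields of snap.yaml. A minimal sketch of that comparison follows, assuming the metadata has already been parsed into a struct; snapMeta and driftFields are hypothetical names, and a later commit message in this PR removes the check altogether.

```go
package main

import "fmt"

// snapMeta carries the four snap.yaml fields the commit message says are
// drift-checked. The type is illustrative, not the repository's model.
type snapMeta struct {
	Name, Summary, Description, Type string
}

// driftFields returns the names of the fields that differ between the
// existing shared snap.yaml and the incoming one. An empty result means
// the write would be accepted under the rule described above.
func driftFields(existing, incoming snapMeta) []string {
	var diff []string
	if existing.Name != incoming.Name {
		diff = append(diff, "name")
	}
	if existing.Summary != incoming.Summary {
		diff = append(diff, "summary")
	}
	if existing.Description != incoming.Description {
		diff = append(diff, "description")
	}
	if existing.Type != incoming.Type {
		diff = append(diff, "type")
	}
	return diff
}

func main() {
	old := snapMeta{Name: "files", Summary: "File browser", Description: "d", Type: "app"}
	next := old
	next.Summary = "Browse files"
	fmt.Println(driftFields(old, next))
}
```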
cmd/release and release/{info,storage*}.go deleted. The frozen
syncloud-release binary at the existing GitHub release tag stays
available for apps that have not yet migrated; new tags from this
branch no longer ship it. AWS_S3_ENDPOINT env var lets the SDK
target a custom S3-compatible host (groundwork for minio-backed
integration tests).
ListObjectsV2 with delimiter='/' returns the common prefixes under v2/apps/<channel>/, which is exactly the set of app ids. No separate index file to maintain, no race window when concurrent finalises both rewrite it, fewer keys in the bucket.

Cache now takes an AppLister; release.Multipart implements it. The store no longer starts without AWS credentials: they were always needed for the publish endpoints, and the cache now needs them too for LIST. model.AppsIndex is gone.

cmd/release is restored as a CI-internal binary. The new docker image is the going-forward way for apps to publish, but the existing integration tests in test/store_test.go still drive set-version / promote via ssh + the old binary. Rewriting that suite to use minio plus the new docker image is the next focused PR.
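The step from common prefixes to app ids can be sketched without touching the SDK. The function below takes the prefixes as plain strings, as a delimiter='/' listing would return them; appIDs is a hypothetical helper name and the actual AppLister implementation is not shown in this conversation.

```go
package main

import (
	"fmt"
	"strings"
)

// appIDs converts common prefixes such as "v2/apps/stable/files/" into app
// ids such as "files". Prefixes outside v2/apps/<channel>/ are skipped.
func appIDs(channel string, commonPrefixes []string) []string {
	base := "v2/apps/" + channel + "/"
	ids := make([]string, 0, len(commonPrefixes))
	for _, p := range commonPrefixes {
		if !strings.HasPrefix(p, base) {
			continue
		}
		id := strings.TrimSuffix(strings.TrimPrefix(p, base), "/")
		// With delimiter='/' each prefix is one level deep, so an id never
		// contains another slash; anything else is ignored defensively.
		if id == "" || strings.Contains(id, "/") {
			continue
		}
		ids = append(ids, id)
	}
	return ids
}

func main() {
	prefixes := []string{"v2/apps/stable/files/", "v2/apps/stable/platform/"}
	fmt.Println(appIDs("stable", prefixes))
}
```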
…in CI
Drone services:
- apps.syncloud.org nginx replaced with a minio service named "minio"
- New "seed minio" step downloads mc client and seeds the bucket with
v2 metadata (snap.yaml + icon) plus legacy snap binaries and sidecars
- test/seed.sh replaces test/publish.sh
Test config:
- config/test/secret.yaml carries base_url=http://minio/test and
bucket=test. util.Config gains these fields; cmd/store/main.go reads
them with sensible defaults so production is unchanged.
- store deploy gets AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and
AWS_S3_ENDPOINT env vars; ci/deploy-run.sh forwards them through ssh
sudo to deploy.sh's docker run
Tests:
- test/store_test.go drops every Ssh("apps.syncloud.org", "/syncloud-release ...")
call. A SetVersion helper writes the version pointer directly to
minio via aws-sdk-go. RefreshCache pokes the store's existing
/syncloud/v1/cache/refresh endpoint.
- TestPrepareStore now just installs the store; initial app data is
set up by the seed step.
Docker image smoke test:
- New "e2e publish image" step pulls
syncloud/release:${DRONE_BRANCH}-${DRONE_BUILD_NUMBER} (the image
built earlier in the same pipeline), docker-runs `publish snap`
against api.store.test, then verifies the app shows up via
/api/ui/v1/apps. Resolves the drone service network from
$(hostname) so the spawned container can reach api.store.test.
Legacy:
- cmd/release and release/{info,storage*}.go deleted for real.
build.sh no longer emits syncloud-release-${arch}.
- test/publish.sh and test/index-v2 deleted; the v2 metadata in
meta/snap.yaml + images/<app>.png is the only source of truth.
SnapRevision handler reads <baseUrl>/revisions/<key>.revision; the file was only ever written by the old syncloud-release binary. seed.sh now writes one per snap; publish API Finalise writes one per upload. SnapRevision handler now uses configurable baseUrl too.
…l titles to current names
…or prefix upstream.Path

base_url=http://minio/test caused the default director to mangle r.URL.Path before TrimPrefix could find the icon route prefix. Rebuild Path explicitly so it works regardless of upstream path.
clearer name for what it is: the thing that publishes apps to the store
…us GetObject

apps.syncloud.org has a bucket-level GetObject policy for *; the per-object ACL on PutObject and CreateMultipartUpload was redundant. Removing it also lets Garage (no per-object ACL support) become a drop-in S3 backend.
dxflrs/garage publishes arm/v7 images, so the arm pipeline can now run seed/test/build-test alongside amd64 and arm64.

The service is still called minio in the drone config so test/seed.sh and test/store_test.go keep using the same hostname; only the image, command and bootstrap change. Garage requires explicit layout assign + bucket/key creation before its S3 API accepts traffic, all done inline in the service command.

Garage has no per-object ACL and no bucket policy for anonymous reads, so we use its s3_web endpoint on port 3902 with website mode enabled on the bucket. Path-style: http://minio:3902/test/<key> serves objects anonymously, which is what snapd and the cache's HTTP reads need.

config/test/secret.yaml now points base_url at the web endpoint; AWS_S3_ENDPOINT stays at http://minio (S3 API on :80) for the store's SDK calls. Same code, two endpoints; production-style anonymous-via-bucket-policy collapses to one endpoint when we move back to real S3.
The service container is now just a thin bash script under test/, kept in source instead of wedged into jsonnet as a heredoc. Drone runs it as a detached step (alongside vm) so subsequent steps can talk to s3:80 (S3 API) and s3:3902 (web endpoint) via the build network. Service name renamed to "s3" since the implementation is no longer minio-specific. seed.sh, test/store_test.go, test/test.sh and config/test/secret.yaml updated to match.
Drop test/seed.sh + mc download. Replaced with test/cmd/seed/main.go that uses the same aws-sdk-go we already pull in for store_test.go. build-tests.sh now produces both test/test and test/seed; the drone 'seed s3' step is plain ./test/seed with no apt installs. Also flipped the step order: build test → seed s3, so the binary exists when seed runs.
Each arch pipeline now builds and pushes its own syncloud/store-publisher:<branch>-<build>-<arch> tag using plugins/docker. The e2e publish image step moves out of the amd64-only block so each pipeline tests its own arch's binary against the test store it already brought up. A new top-level pipeline 'publisher manifest' with depends_on: [amd64, arm64, arm] fans the 3 arch tags into a single multi-arch manifest at <branch>-<build> and <branch> using plugins/manifest. If any arch pipeline fails, drone skips the manifest pipeline so consumers never get an incomplete manifest. Removes the docker publish (multi-arch) buildx step from amd64 — no more QEMU emulation, no cross-pipeline race when testing.
seed: log last s3 error on timeout, 120s instead of 60

build 214 saw 'bucket test not ready after 60s' in seed step but the detached s3 step's logs aren't fetchable via drone REST. Now the script prints status to stdout, sleep-infinitys at the end so the container stays up even if init partially fails, and seed reports the actual SDK error from HeadBucket.
detached steps don't get registered with drone's service network DNS, only top-level services do. The script approach was dead in the water for that reason. Back to inline heredoc — uglier in jsonnet but works.
… work

dxflrs/garage image is distroless (no sh), so entrypoint:[sh,-c] exits immediately and the service container never registers its DNS alias. Switch to alpine:3.20 (has shell+wget), fetch the matching garage musl binary per arch at startup, init bucket+key, then exec server.
garage v1.0.1 rejects 'test' as an access key id: it must start with 'GK' followed by 12 hex bytes. Use a fixed valid id everywhere credentials are passed (drone, install.sh defaults, seed program, store integration test).
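The format rule stated here ('GK' followed by 12 hex bytes, so 24 hex characters) can be checked up front. The sketch below is an assumption-laden illustration: validGarageKeyID is a hypothetical helper, it accepts lowercase hex only, and the sample id is made up rather than the fixed id used in the repository.

```go
package main

import (
	"fmt"
	"regexp"
)

// garageKeyID matches "GK" followed by 24 lowercase hex characters
// (12 bytes). Accepting lowercase only is an assumption of this sketch.
var garageKeyID = regexp.MustCompile(`^GK[0-9a-f]{24}$`)

// validGarageKeyID reports whether id has the shape described above.
func validGarageKeyID(id string) bool {
	return garageKeyID.MatchString(id)
}

func main() {
	for _, id := range []string{"test", "GK0123456789abcdef01234567"} {
		fmt.Printf("%-28s valid=%v\n", id, validGarageKeyID(id))
	}
}
```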
…shpass install + stripped comments
config.BaseUrl and config.Bucket are required fields: every secret.yaml template (test/uat/prod) sets them, and deploy.sh fails fast if secret.yaml is missing. The 'if baseUrl == ""' / 'if bucket == ""' fallbacks (defaulting to apps.syncloud.org) were unreachable. Fail fast with a clear message instead and use config.BaseUrl / config.Bucket directly. api.Url was the only place that hardcoded apps.syncloud.org. With the defaulting removed nothing references it. Drop the constant.
The 409-on-drift behavior left no clean path for an app maintainer to change summary/description/type: operator had to delete the S3 object manually. Trust the app's CI and just write the new snap.yaml.
- api/snap_yaml_publisher.go: drop the Get+ParseSnapMeta+compare block, publish is now a single Put. p.write helper is gone too.
- api/publish_test.go: TestSnapYamlPublisher_DriftRejected removed. TestSnapYamlPublisher_IdenticalAccepted renamed to TestSnapYamlPublisher_OverwritesExisting and now actually checks the new bytes landed in S3.
- publish_helpers_test.go: shared fakeMP, fakeCache, postJSON, itoa
- snap_binary_publisher_test.go: SnapBinary init + finalise tests
- snap_yaml_publisher_test.go: SnapYaml first write + overwrite tests
- icon_publisher_test.go: Icon write + bad auth tests
Drop the '-linkmode external -extldflags -static' dance and go pure Go instead. With CGO off the Go compiler uses its internal linker which produces a static binary by default (no DT_NEEDED entries in the ELF). All our deps (aws-sdk-go, echo, resty, zap, yaml, sha3, cobra) are pure Go; net/os fall back to the pure-Go DNS resolver and getpwnam when CGO is disabled, which is fine for our usage. Removes the need for gcc + static glibc/musl archives on the build host and trivially supports cross-compile.
…o binary

Single static CGO=0 binary at build/bin/deploy-verify. The deploy steps no longer need 'apt install -y curl python3 sshpass openssh-client'; they just run the binary that was built earlier in the pipeline.

The binary does:
- Waits for /api/ui/v1/version (max 120s)
- POSTs /syncloud/v1/cache/refresh with SYNCLOUD_TOKEN to validate both the publish token and AWS creds end to end (200 = both work, 401 = token wrong, 500 = aws creds / endpoint wrong)
- GETs /api/ui/v1/apps?channel=stable and asserts the list is non-empty
- GETs /v2/snaps/find?architecture=amd64&channel=stable and asserts >0
- GETs / for the web UI
- On failure, ssh's to DEPLOY_HOST with /tmp/_deploy_key, runs 'sudo -n docker ps -a' and 'sudo -n docker logs syncloud-store' and dumps the output to stderr before exiting non-zero

Inputs from env: DEPLOY_URL, SYNCLOUD_TOKEN, DEPLOY_HOST, DEPLOY_USER, DEPLOY_KEYFILE (default /tmp/_deploy_key), same as the bash script used.

Unit tests cover all five HTTP checks via httptest, plus the JSON parsing helpers. ci/deploy-verify.sh deleted.
…ry takes MultipartStore

SnapYamlPublisher and IconPublisher only ever called mp.Put; they were getting a 6-method MultipartStore for one method. Split:
- ObjectPutter (api/snap_yaml_publisher.go): Put(key, body, contentType)
- MultipartStore (api/snap_binary_publisher.go): Create / PresignPart / Complete / Abort / HeadSize / Put, for the multipart upload of the snap binary plus its sidecars

NewSnapYamlPublisher and NewIconPublisher take ObjectPutter. Field renamed mp -> store in those two types. release.Multipart satisfies both interfaces so cmd/store/main.go wiring is unchanged.

Also drop dead code: release.Multipart.Get + the test fake's Get + getErr field. Get was the drift check's only user; gone with it.
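The interface split can be illustrated with a one-method interface and a publisher that depends only on it. The parameter types of Put are an assumption (the commit message gives only the names key, body, contentType), and memStore and iconPublisher are hypothetical stand-ins rather than the repository's types.

```go
package main

import "fmt"

// ObjectPutter is the narrow interface described above. The concrete
// parameter and return types are assumed for this sketch.
type ObjectPutter interface {
	Put(key string, body []byte, contentType string) error
}

// memStore is an in-memory stand-in for an S3-backed store.
type memStore struct {
	objects map[string][]byte
}

func (m *memStore) Put(key string, body []byte, _ string) error {
	m.objects[key] = body
	return nil
}

// iconPublisher needs only Put, so it accepts the one-method interface
// instead of a wider multipart store.
type iconPublisher struct {
	store ObjectPutter
}

func (p iconPublisher) publish(channel, app string, png []byte) error {
	key := fmt.Sprintf("v2/apps/%s/%s/icon.png", channel, app)
	return p.store.Put(key, png, "image/png")
}

func main() {
	store := &memStore{objects: map[string][]byte{}}
	p := iconPublisher{store: store}
	if err := p.publish("stable", "files", []byte("png-bytes")); err != nil {
		panic(err)
	}
	fmt.Println(len(store.objects["v2/apps/stable/files/icon.png"]))
}
```

A test fake for such a publisher then needs a single method, which is the same motivation the following commit messages give for the per-file fakes.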
fakeMP is a 6-method MultipartStore implementation. Icon test only needs Put, so it gets a 1-method fakeIconStore (struct holds a single objects map) inside the same _test.go file. Reads more directly than borrowing the big shared fake.
Both test files now declare a tiny per-file fake satisfying only the interface they need:
- snap_yaml_publisher_test.go: fakeYamlStore (1 method, Put)
- snap_binary_publisher_test.go: fakeBinaryStore (6 methods) + fakeRefresher

publish_helpers_test.go shrinks to just postJSON; that's echo plumbing for httptest, not a fake. fakeMP / fakeCache / itoa are gone with their no-longer-shared usage.
cmd/deploy-verify is gone. Replaced by verify/deploy_test.go: same checks expressed as Go tests, compiled with 'go test -c' into the same build/bin/deploy-verify path. Drone deploy steps run './build/bin/deploy-verify -test.v -test.failfast'.

Why this is better:
- No tests-of-tests. The deploy-verify code IS the test. Previously cmd/deploy-verify/main_test.go was httptest-mocking the same logic it was implementing; deleting it lost no coverage that matters.
- Source-declaration order across the file gives natural ordering: TestVersion (poll until store is up), TestCacheRefresh (validates token + AWS creds), TestApps, TestFind, TestWebUI. -test.failfast short-circuits cleanly after the first failure.
- TestMain reads env once (DEPLOY_URL, SYNCLOUD_TOKEN, DEPLOY_HOST, DEPLOY_USER, optional DEPLOY_KEYFILE); each test reuses it.
- dumpOnFail (t.Cleanup that fires only when t.Failed()) sshs to the remote and grabs 'sudo -n docker ps -a' + 'docker logs syncloud-store' through t.Logf, so failures show docker state inline in test output.

build.sh switches the deploy-verify line from 'go build' to 'go test -c -o build/bin/deploy-verify ./verify'.
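The dumpOnFail pattern (a cleanup that only does work when the test has failed) can be sketched against a small interface that *testing.T satisfies. The collect callback stands in for the ssh call, and fakeT exists only so the sketch runs outside 'go test'; both are assumptions, not the code in verify/deploy_test.go.

```go
package main

import "fmt"

// tb is the subset of testing.TB used here; *testing.T satisfies it.
type tb interface {
	Cleanup(func())
	Failed() bool
	Logf(format string, args ...any)
}

// dumpOnFail registers a cleanup that logs diagnostics only if the test
// failed. collect is a placeholder for fetching remote docker state.
func dumpOnFail(t tb, collect func() string) {
	t.Cleanup(func() {
		if t.Failed() {
			t.Logf("remote state:\n%s", collect())
		}
	})
}

// fakeT is a minimal stand-in for *testing.T used by this demo.
type fakeT struct {
	failed   bool
	cleanups []func()
	logs     []string
}

func (f *fakeT) Cleanup(fn func()) { f.cleanups = append(f.cleanups, fn) }
func (f *fakeT) Failed() bool      { return f.failed }
func (f *fakeT) Logf(format string, args ...any) {
	f.logs = append(f.logs, fmt.Sprintf(format, args...))
}

// finish runs cleanups in reverse registration order, like the testing package.
func (f *fakeT) finish() {
	for i := len(f.cleanups) - 1; i >= 0; i-- {
		f.cleanups[i]()
	}
}

func main() {
	for _, failed := range []bool{false, true} {
		t := &fakeT{failed: failed}
		dumpOnFail(t, func() string { return "docker ps output" })
		t.finish()
		fmt.Printf("failed=%v logged=%d\n", failed, len(t.logs))
	}
}
```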
Interface defined next to its concrete implementation release.Multipart. api/snap_yaml_publisher.go and api/icon_publisher.go now reference release.ObjectPutter. (MultipartStore stays in api/ for now since it's a publisher-side interface segregation, not directly mirroring a single release type's surface area.)
Without the tag verify/deploy_test.go was being picked up by 'go test ./...' and failing on the missing DEPLOY_URL env var. Now it only compiles into the binary that's built explicitly via 'go test -c -tags integration -o build/bin/deploy-verify ./verify' in build.sh.
…helpers_test.go Each test file now self-contains its echo-handler-to-httptest wrapper (iconPost / yamlPost / binaryPost — same body, different name to keep the files independent). Slight duplication, but each *_test.go reads top-to-bottom with everything it needs in one file. Shared helpers file is gone.
…ls, errors are typed

Publishers no longer touch echo.Context. Signatures are now:
- SnapBinaryPublisher.Init(model.PublishInitRequest) (*model.PublishInitResponse, error)
- SnapBinaryPublisher.PartUrl(model.PublishPartUrlRequest) (*model.PublishPartUrlResponse, error)
- SnapBinaryPublisher.Finalise(model.PublishFinaliseRequest) (*model.PublishFinaliseResponse, error)
- SnapYamlPublisher.Publish(model.PublishSnapYamlRequest) (*model.PublishSnapYamlResponse, error)
- IconPublisher.Publish(model.PublishIconRequest) (*model.PublishIconResponse, error)

api/errors.go: small apiError type with Status + Msg. unauthorized() / badRequest() / conflict() constructors. The publishers return these; the HTTP layer maps to status codes via errors.As.

api/publish_routes.go: per-endpoint echo handlers that bind the request, call the publisher, and route the response through the reply helper. registerPublishRoutes wires all five endpoints. The whole echo.Context surface lives in this file plus public.go.

Tests are much smaller now: no httptest, no iconPost/yamlPost/binaryPost echo plumbing. Each test calls the publisher method directly with a model struct and asserts on the returned response or typed error.
verify/ is now its own Go module (github.com/syncloud/store/verify) with its own go.mod / go.sum. 'go test ./...' in the main module no longer sees it — no need for the 'integration' build tag hack. build.sh: 'cd verify && go test -c -o ... .' instead of 'go test -c -tags integration -o ... ./verify'. verify/deploy_test.go: drop the //go:build integration tag.
…napRevision test/ has its own go.mod (module 'test'). The seed binary at test/cmd/seed/main.go imports github.com/syncloud/store/model since commit 664d57c (SnapRevision reuse), but test/go.mod didn't declare the cross-module dependency, so 'go build ./cmd/seed' from test/ failed with 'no required module provides package github.com/syncloud/store/model'. Add 'require github.com/syncloud/store v0.0.0' + 'replace github.com/syncloud/store => ../' so the local main module satisfies it. go mod tidy regenerated test/go.sum and bumped some indirects.
… one place api/publish_routes.go is gone. The five publish endpoints now live in public.go's Start() alongside the snapd-protocol routes (/v2/snaps/*), the assertion routes, the UI routes, etc. reply() helper moved into public.go too.
seed had been using model.SnapRevision (commit 664d57c) which forced test/go.mod to require + replace the main module. Inline the 4-field fmt.Sprintf in seed instead; test/ is back to having no syncloud/store imports, so the require + replace come out of test/go.mod too. The small duplication with api/snap_binary_publisher's struct literal is the price.
5min per-call timeout meant a single hung TCP could eat the whole 2min retry budget. Drop client timeout to 10s and budget 30 attempts × 10s. Log every attempt's error or status+body so failures show what was actually returned.
debian:bookworm-slim ships without a CA bundle, so deploy-verify's https Get to uat_deploy_url failed with x509: certificate signed by unknown authority. Bundle ca-certificates into the same apt install that already runs for openssh-client.
TestCacheRefresh actually takes >10s (it hits S3), so the global 10s client timeout broke it. Restore the 5min shared client and give TestVersion its own short-timeout probe client locally — the short timeout only matters for the version poll, where we want fast retries.
echo's default error handler renders returned error strings into the response body. The cache refresh path bubbled up storage/S3 errors that way, potentially exposing internal URLs and bucket layout to API callers. Log the real error via zap and return a generic message.
deploy-prepare.sh now writes DEPLOY_KEY -> /tmp/_deploy_key and seds the three secret.yaml placeholders unconditionally. No more grep/exists branches that depended on which env was being deployed. test-init.sh generates an ephemeral ed25519 keypair, installs the pub on the test target via sshpass, and the drone test step then exports DEPLOY_KEY from /tmp/_deploy_key so deploy-prepare can run the same shape as uat/prod (which get DEPLOY_KEY from a drone secret). The remote ssh setup is also split into three plain commands instead of one chained "mkdir && cat && chmod" string.
Drone deploy test step now injects the (already-public) garage test creds via env vars, same way uat/prod inject from secrets. All three envs now share the same secret.yaml shape so deploy-prepare.sh can sed the placeholders unconditionally.
The previous test loaded config/test/secret.yaml directly and asserted token == "test". Once config/test/secret.yaml became a @Placeholder@ template (so deploy-prepare can sed it the same way it does for uat/prod), yaml parsing failed and the assert no longer made sense. Write the test fixture inline via t.TempDir so the unit test owns its data.
All three secret.yaml files now have the same set of top-level keys, quoting the @Placeholder@ values so they parse as valid YAML on their own (deploy-prepare seds the placeholders before the file ever leaves the build container).
- uat and prod gain an explicit aws_s3_endpoint pointing at the regional AWS endpoint (no implicit SDK default).
- TestLoadConfig now loads the real config/test/secret.yaml and asserts on the placeholder strings.
- TestSecretYamlSchemaMatches reads all three files, extracts the top-level key set, and fails if any of them diverge.
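The key-set comparison behind a test like TestSecretYamlSchemaMatches can be sketched with the standard library alone. This is a deliberate simplification: it treats any unindented "key:" line as a top-level key rather than parsing YAML properly, and the sample documents and placeholder names are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// topLevelKeys returns the sorted top-level keys of a simple YAML mapping.
// Indented lines, comments and list items are skipped; this is not a full
// YAML parser.
func topLevelKeys(yamlText string) []string {
	var keys []string
	sc := bufio.NewScanner(strings.NewReader(yamlText))
	for sc.Scan() {
		line := sc.Text()
		if line == "" || strings.ContainsRune(" \t#-", rune(line[0])) {
			continue
		}
		if i := strings.Index(line, ":"); i > 0 {
			keys = append(keys, line[:i])
		}
	}
	sort.Strings(keys)
	return keys
}

// sameKeys reports whether two sorted key lists are identical.
func sameKeys(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical file contents; quoted placeholders keep each file valid YAML.
	test := "token: \"@token@\"\nbase_url: \"@base_url@\"\nbucket: \"@bucket@\"\n"
	prod := "bucket: \"@bucket@\"\ntoken: \"@token@\"\n"
	fmt.Println(sameKeys(topLevelKeys(test), topLevelKeys(test)))
	fmt.Println(sameKeys(topLevelKeys(test), topLevelKeys(prod)))
}
```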
Squash-merged to master as 88b1dc4.
What
Adds a publish API on the store, ships a multi-arch docker image to drive it, switches the cache to read app metadata from a new v2 layout, and retires cmd/release/ from the codebase.

Endpoints
- POST /syncloud/v1/publish/init
- POST /syncloud/v1/publish/part-url
- POST /syncloud/v1/publish/finalise

Shared-secret token is the existing one that /syncloud/v1/cache/refresh already uses.

Flow
- init with {name, version, arch, channel, size, sha384} → store opens an S3 multipart upload, returns N presigned PUT URLs.
- finalise with the completed parts + snap.yaml content + base64 icon → store calls CompleteMultipartUpload, writes sidecars, refreshes cache.

Path layout
snap.yaml and icon.png are the only new files. The snap binary and version/sha384/size sidecars stay at their existing paths to keep apps that are still on the old release binary visible during migration. finalise writes all six of these locations directly: no dual-write code, no mirror logic. Old layout files are written to their old paths because that's where they belong.

snap.yaml drift detection
finalise parses the new snap.yaml and any existing one and rejects with 409 Conflict if (name, summary, description, type) differs. Catches per-arch metadata divergence before it corrupts the catalog.

Cache reader
- v2/apps/<channel>/apps.json
- v2/.../snap.yaml
- /api/ui/v1/icons/<channel>/<app> → proxy rewrites to v2/apps/<channel>/<app>/icon.png

Required → type: base
model.App.Required is gone. The cache filters UI apps by Type == "base" read from snap.yaml. Only platform is base today.

cmd/release retired
cmd/release/ and release/{info,storage*}.go removed. The existing syncloud-release-<arch> binary at GitHub release tag 4 is unaffected; apps that haven't migrated keep using it until their next release. build.sh no longer emits out/syncloud-release-<arch>.

Dockerfile.publish (multi-stage, distroless) is built multi-arch (amd64/arm64/arm/v7) by a thegeeklab/drone-docker-buildx step and pushed to syncloud/release:<tag>.

Client
cmd/publish/ is the new CLI shipped via the docker image. Apps move from:
to:
snap.yaml is picked up from meta/snap.yaml, icon from meta/gui/icon.png; both are already committed across all 44 catalog repos.

Misc
- AWS_S3_ENDPOINT env var supported on the store: points the SDK at any S3-compatible host (groundwork for minio-backed integration tests in a follow-up).
- v2/apps/{master,stable}/apps.json + all 44 snap.yaml + 44 icons completed before this PR lands so the cache sees a populated catalog from minute one.

Not in this PR
- syncloud-release-<arch> (kept frozen for migration)
- syncloud/release image and publishes a test snap against a MinIO-backed store

Test plan
- go test ./... passes
- go vet ./... clean
- drone lint .drone.yml passes