CI: drop registry cache-export from build-base (Hub 400 root cause)
Validate / base-change-warning (push) Successful in 6s
Validate / docs-check (push) Successful in 13s
Validate / validate-with-pi (push) Successful in 4m9s
Validate / validate-omos (push) Successful in 4m31s
Validate / validate-base (push) Successful in 5m40s
Validate / validate-omos-with-pi (push) Successful in 12m49s
Validate / base-change-warning (push) Successful in 6s
Validate / docs-check (push) Successful in 13s
Validate / validate-with-pi (push) Successful in 4m9s
Validate / validate-omos (push) Successful in 4m31s
Validate / validate-base (push) Successful in 5m40s
Validate / validate-omos-with-pi (push) Successful in 12m49s
Diagnosed during manual v1.15.12 publish: buildkit's mode=max cache export to registry-1.docker.io reproducibly returns HTTP 400 with HTML body on the resumable-upload PUT. Image push (layers + manifest) works fine in parallel; only --cache-to fails. Removing cache-from/cache-to lets the publish complete. This explains all four prior CI failures (runs 332/333/334/336) which shared the exact same failure shape. Action-pin hypothesis (setup-buildx-action v4.1.0) was correctly disproven by run 336 with v4.0.0 pinned. Trade-off: every Dockerfile.base change now pays the full ~3 min multi-arch build. Unchanged bases short-circuit at the content-addressed probe step in base-decide and never re-build, so day-to-day cost is zero. Re-enable when moby/buildkit upstream resolves the cache-export protocol mismatch with Hub CDN, or when we can switch to a non-registry cache backend. CHANGELOG.md: full root-cause writeup in Unreleased section, including status update on every prior suspect (all ruled out).
This commit is contained in:
@@ -192,9 +192,14 @@ jobs:
|
||||
platforms: linux/amd64,linux/arm64
|
||||
push: true
|
||||
tags: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
|
||||
# Registry cache for faster repeat base rebuilds (e.g. Node bump).
|
||||
cache-from: type=registry,ref=${{ env.IMAGE }}:base-buildcache
|
||||
cache-to: type=registry,ref=${{ env.IMAGE }}:base-buildcache,mode=max
|
||||
# Registry cache disabled: buildkit's cache-export (mode=max) hits a
|
||||
# reproducible HTTP 400 from registry-1.docker.io on the resumable-
|
||||
# upload PUT (state-token format mismatch on Hub CDN, suspected to
|
||||
# have started ~2026-05-23). Image push itself works fine. We pay
|
||||
# the full base build on every Dockerfile.base change, but the base
|
||||
# tag itself is content-addressed (base-<hash>) so unchanged bases
|
||||
# short-circuit at the probe step and never re-build anyway. Re-
|
||||
# enable when upstream resolves; tracked in CHANGELOG v1.15.12.
|
||||
|
||||
# ── Phase 3: amd64 smoke per variant (gates the multi-arch publish) ─
|
||||
# Each smoke job builds amd64-only against the base tag and runs
|
||||
|
||||
Reference in New Issue
Block a user