ci: workflow-level 3-attempt retry around buildx build --push
Validate / docs-check (push) Successful in 7s
Validate / base-change-warning (push) Successful in 6s
Validate / validate-with-pi (push) Successful in 4m11s
Validate / validate-omos (push) Successful in 4m31s
Validate / validate-base (push) Successful in 5m19s
Validate / validate-omos-with-pi (push) Successful in 11m38s
Belt-and-braces against transient registry-1.docker.io blips (rate limits, brief 5xx, CDN flap). Replaces all five push docker/build-push-action@v7 invocations (1 base + 4 variants) with shell: bash steps that run docker buildx build --push in a for-loop with backoff (15s, 30s). Smoke build steps (load: true, no push) are untouched.

Does NOT mask deterministic failures: a true regression (e.g. the cache-export 400 we hit 2026-05-23..28) fails all 3 attempts identically and the job still fails by design. Orthogonal layer to both cache-export disablement and the ci-release-watcher skill's transient-rerun heuristic.

- AGENTS.md: new Critical conventions bullet documenting the retry pattern, the consistency rule across push steps, and why it's duplicated rather than factored (Gitea Actions doesn't support reusable composite shell steps cleanly).
- CHANGELOG.md: Unreleased section addendum.

No image-side change.
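The retry pattern described in the commit message can be sketched as a standalone shell function. This is an illustration only: `retry_with_backoff` and its parameters are hypothetical names, and the workflow itself inlines the loop in each push step rather than calling a helper. The sketch assumes the same linear backoff (attempt number times a base delay) and the same log lines as the inlined loop.

```shell
# Hypothetical helper (not part of the workflow):
#   retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND up to MAX_ATTEMPTS times. After a failed attempt N (except the
# last) it sleeps N * BASE_DELAY_SECONDS, i.e. 15s then 30s for 3 attempts
# with a 15s base. Returns 0 on the first success, 1 if every attempt fails.
retry_with_backoff() {
  local max_attempts="$1" base_delay="$2" attempt=1 backoff
  shift 2
  while [ "${attempt}" -le "${max_attempts}" ]; do
    echo "==> Build+push attempt ${attempt}/${max_attempts}"
    if "$@"; then
      echo "==> Attempt ${attempt} succeeded"
      return 0
    fi
    if [ "${attempt}" -lt "${max_attempts}" ]; then
      backoff=$(( attempt * base_delay ))
      echo "==> Attempt ${attempt} failed, sleeping ${backoff}s before retry"
      sleep "${backoff}"
    fi
    attempt=$(( attempt + 1 ))
  done
  echo "==> All ${max_attempts} build+push attempts failed"
  return 1
}
```

With this helper, the base push step would reduce to something like `retry_with_backoff 3 15 docker buildx build --platform linux/amd64,linux/arm64 --file Dockerfile.base --push --tag "${BASE_TAG_FULL}" .`. A deterministic failure still fails every attempt and the function returns non-zero, so the job fails as before.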
@@ -184,14 +184,17 @@ jobs:
           username: ${{ vars.DOCKERHUB_USERNAME }}
           password: ${{ secrets.DOCKERHUB_TOKEN }}
 
-      - name: Build and push base (multi-arch)
-        uses: docker/build-push-action@v7
-        with:
-          context: .
-          file: Dockerfile.base
-          platforms: linux/amd64,linux/arm64
-          push: true
-          tags: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
+      - name: Build and push base (multi-arch) — with retry
+        shell: bash
+        env:
+          BASE_TAG_FULL: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
+        run: |
+          set -euo pipefail
+          # 3-attempt retry around `docker buildx build --push` for transient
+          # registry-1.docker.io blips. Does NOT mask deterministic failures:
+          # a true regression (e.g. cache-export 400 hit 2026-05-23..28) will
+          # fail all 3 attempts identically and the job still fails — by
+          # design.
           # Registry cache disabled: buildkit's cache-export (mode=max) hits a
           # reproducible HTTP 400 from registry-1.docker.io on the resumable-
           # upload PUT (state-token format mismatch on Hub CDN, suspected to
@@ -200,6 +203,25 @@ jobs:
           # tag itself is content-addressed (base-<hash>) so unchanged bases
           # short-circuit at the probe step and never re-build anyway. Re-
           # enable when upstream resolves; tracked in CHANGELOG v1.15.12.
+          for attempt in 1 2 3; do
+            echo "==> Build+push attempt ${attempt}/3"
+            if docker buildx build \
+                --platform linux/amd64,linux/arm64 \
+                --file Dockerfile.base \
+                --push \
+                --tag "${BASE_TAG_FULL}" \
+                .; then
+              echo "==> Attempt ${attempt} succeeded"
+              exit 0
+            fi
+            if [[ "${attempt}" -lt 3 ]]; then
+              backoff=$(( attempt * 15 ))
+              echo "==> Attempt ${attempt} failed, sleeping ${backoff}s before retry"
+              sleep "${backoff}"
+            fi
+          done
+          echo "==> All 3 build+push attempts failed"
+          exit 1
 
   # ── Phase 3: amd64 smoke per variant (gates the multi-arch publish) ─
   # Each smoke job builds amd64-only against the base tag and runs
@@ -425,18 +447,40 @@ jobs:
             fi
             echo "EOF"
           } >> "$GITHUB_OUTPUT"
-      - uses: docker/build-push-action@v7
-        with:
-          context: .
-          file: Dockerfile.variant
-          platforms: linux/amd64,linux/arm64
-          push: true
-          build-args: |
-            BASE_IMAGE=${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
-            INSTALL_OPENCODE=true
-            INSTALL_OMOS=false
-            INSTALL_PI=false
-          tags: ${{ steps.tags.outputs.tags }}
+      - name: Build and push variant (with retry)
+        shell: bash
+        env:
+          TAGS: ${{ steps.tags.outputs.tags }}
+          BASE_IMAGE_FULL: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
+        run: |
+          set -euo pipefail
+          TAG_FLAGS=()
+          while IFS= read -r t; do [[ -n "$t" ]] && TAG_FLAGS+=( -t "$t" ); done <<< "${TAGS}"
+          # 3-attempt retry around `docker buildx build --push` (see build-base
+          # step for full rationale). Variant: base (opencode only).
+          for attempt in 1 2 3; do
+            echo "==> Build+push attempt ${attempt}/3"
+            if docker buildx build \
+                --platform linux/amd64,linux/arm64 \
+                --file Dockerfile.variant \
+                --push \
+                --build-arg "BASE_IMAGE=${BASE_IMAGE_FULL}" \
+                --build-arg "INSTALL_OPENCODE=true" \
+                --build-arg "INSTALL_OMOS=false" \
+                --build-arg "INSTALL_PI=false" \
+                "${TAG_FLAGS[@]}" \
+                .; then
+              echo "==> Attempt ${attempt} succeeded"
+              exit 0
+            fi
+            if [[ "${attempt}" -lt 3 ]]; then
+              backoff=$(( attempt * 15 ))
+              echo "==> Attempt ${attempt} failed, sleeping ${backoff}s before retry"
+              sleep "${backoff}"
+            fi
+          done
+          echo "==> All 3 build+push attempts failed"
+          exit 1
 
   build-variant-omos:
     needs: [base-decide, smoke-omos, resolve-versions]
@@ -473,19 +517,41 @@ jobs:
             fi
             echo "EOF"
           } >> "$GITHUB_OUTPUT"
-      - uses: docker/build-push-action@v7
-        with:
-          context: .
-          file: Dockerfile.variant
-          platforms: linux/amd64,linux/arm64
-          push: true
-          build-args: |
-            BASE_IMAGE=${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
-            INSTALL_OPENCODE=true
-            INSTALL_OMOS=true
-            INSTALL_PI=false
-            OMOS_VERSION=${{ needs.resolve-versions.outputs.omos_version }}
-          tags: ${{ steps.tags.outputs.tags }}
+      - name: Build and push variant (with retry)
+        shell: bash
+        env:
+          TAGS: ${{ steps.tags.outputs.tags }}
+          BASE_IMAGE_FULL: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
+          OMOS_VERSION: ${{ needs.resolve-versions.outputs.omos_version }}
+        run: |
+          set -euo pipefail
+          TAG_FLAGS=()
+          while IFS= read -r t; do [[ -n "$t" ]] && TAG_FLAGS+=( -t "$t" ); done <<< "${TAGS}"
+          # 3-attempt retry (see build-base step for rationale). Variant: omos.
+          for attempt in 1 2 3; do
+            echo "==> Build+push attempt ${attempt}/3"
+            if docker buildx build \
+                --platform linux/amd64,linux/arm64 \
+                --file Dockerfile.variant \
+                --push \
+                --build-arg "BASE_IMAGE=${BASE_IMAGE_FULL}" \
+                --build-arg "INSTALL_OPENCODE=true" \
+                --build-arg "INSTALL_OMOS=true" \
+                --build-arg "INSTALL_PI=false" \
+                --build-arg "OMOS_VERSION=${OMOS_VERSION}" \
+                "${TAG_FLAGS[@]}" \
+                .; then
+              echo "==> Attempt ${attempt} succeeded"
+              exit 0
+            fi
+            if [[ "${attempt}" -lt 3 ]]; then
+              backoff=$(( attempt * 15 ))
+              echo "==> Attempt ${attempt} failed, sleeping ${backoff}s before retry"
+              sleep "${backoff}"
+            fi
+          done
+          echo "==> All 3 build+push attempts failed"
+          exit 1
 
   build-variant-with-pi:
     needs: [base-decide, smoke-with-pi, resolve-versions]
@@ -522,19 +588,41 @@ jobs:
             fi
             echo "EOF"
           } >> "$GITHUB_OUTPUT"
-      - uses: docker/build-push-action@v7
-        with:
-          context: .
-          file: Dockerfile.variant
-          platforms: linux/amd64,linux/arm64
-          push: true
-          build-args: |
-            BASE_IMAGE=${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
-            INSTALL_OPENCODE=true
-            INSTALL_OMOS=false
-            INSTALL_PI=true
-            PI_VERSION=${{ needs.resolve-versions.outputs.pi_version }}
-          tags: ${{ steps.tags.outputs.tags }}
+      - name: Build and push variant (with retry)
+        shell: bash
+        env:
+          TAGS: ${{ steps.tags.outputs.tags }}
+          BASE_IMAGE_FULL: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
+          PI_VERSION: ${{ needs.resolve-versions.outputs.pi_version }}
+        run: |
+          set -euo pipefail
+          TAG_FLAGS=()
+          while IFS= read -r t; do [[ -n "$t" ]] && TAG_FLAGS+=( -t "$t" ); done <<< "${TAGS}"
+          # 3-attempt retry (see build-base step for rationale). Variant: with-pi.
+          for attempt in 1 2 3; do
+            echo "==> Build+push attempt ${attempt}/3"
+            if docker buildx build \
+                --platform linux/amd64,linux/arm64 \
+                --file Dockerfile.variant \
+                --push \
+                --build-arg "BASE_IMAGE=${BASE_IMAGE_FULL}" \
+                --build-arg "INSTALL_OPENCODE=true" \
+                --build-arg "INSTALL_OMOS=false" \
+                --build-arg "INSTALL_PI=true" \
+                --build-arg "PI_VERSION=${PI_VERSION}" \
+                "${TAG_FLAGS[@]}" \
+                .; then
+              echo "==> Attempt ${attempt} succeeded"
+              exit 0
+            fi
+            if [[ "${attempt}" -lt 3 ]]; then
+              backoff=$(( attempt * 15 ))
+              echo "==> Attempt ${attempt} failed, sleeping ${backoff}s before retry"
+              sleep "${backoff}"
+            fi
+          done
+          echo "==> All 3 build+push attempts failed"
+          exit 1
 
   build-variant-omos-with-pi:
     needs: [base-decide, smoke-omos-with-pi, resolve-versions]
@@ -571,20 +659,43 @@ jobs:
             fi
             echo "EOF"
           } >> "$GITHUB_OUTPUT"
-      - uses: docker/build-push-action@v7
-        with:
-          context: .
-          file: Dockerfile.variant
-          platforms: linux/amd64,linux/arm64
-          push: true
-          build-args: |
-            BASE_IMAGE=${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
-            INSTALL_OPENCODE=true
-            INSTALL_OMOS=true
-            INSTALL_PI=true
-            PI_VERSION=${{ needs.resolve-versions.outputs.pi_version }}
-            OMOS_VERSION=${{ needs.resolve-versions.outputs.omos_version }}
-          tags: ${{ steps.tags.outputs.tags }}
+      - name: Build and push variant (with retry)
+        shell: bash
+        env:
+          TAGS: ${{ steps.tags.outputs.tags }}
+          BASE_IMAGE_FULL: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
+          PI_VERSION: ${{ needs.resolve-versions.outputs.pi_version }}
+          OMOS_VERSION: ${{ needs.resolve-versions.outputs.omos_version }}
+        run: |
+          set -euo pipefail
+          TAG_FLAGS=()
+          while IFS= read -r t; do [[ -n "$t" ]] && TAG_FLAGS+=( -t "$t" ); done <<< "${TAGS}"
+          # 3-attempt retry (see build-base step for rationale). Variant: omos-with-pi.
+          for attempt in 1 2 3; do
+            echo "==> Build+push attempt ${attempt}/3"
+            if docker buildx build \
+                --platform linux/amd64,linux/arm64 \
+                --file Dockerfile.variant \
+                --push \
+                --build-arg "BASE_IMAGE=${BASE_IMAGE_FULL}" \
+                --build-arg "INSTALL_OPENCODE=true" \
+                --build-arg "INSTALL_OMOS=true" \
+                --build-arg "INSTALL_PI=true" \
+                --build-arg "PI_VERSION=${PI_VERSION}" \
+                --build-arg "OMOS_VERSION=${OMOS_VERSION}" \
+                "${TAG_FLAGS[@]}" \
+                .; then
+              echo "==> Attempt ${attempt} succeeded"
+              exit 0
+            fi
+            if [[ "${attempt}" -lt 3 ]]; then
+              backoff=$(( attempt * 15 ))
+              echo "==> Attempt ${attempt} failed, sleeping ${backoff}s before retry"
+              sleep "${backoff}"
+            fi
+          done
+          echo "==> All 3 build+push attempts failed"
+          exit 1
 
   # ── Phase 5: promote base-<hash> → base-latest (manifest copy only) ─
   promote-base-latest:
@@ -74,6 +74,7 @@ cd /tmp && npm pack @earendil-works/pi-coding-agent@0.75.5 && tar -xzf earendil-
 - **Resolved versions are logged by the smoke test** — `scripts/smoke-test.sh` prints a "Resolved component versions" table as its first step. CI logs always capture what got baked into a given image even when ARGs default to `latest`.
 - **`PI_VERSION` and `OMOS_VERSION` MUST be passed by CI as concrete versions**, not left at the `latest` default. The npm install steps in `Dockerfile.variant` (`npm install -g @earendil-works/pi-coding-agent` / `oh-my-opencode-slim@${OMOS_VERSION}`) produce identical layer-hashes when the ARG values are byte-identical across builds; combined with the registry buildcache (`base-buildcache`) the layer gets reused even when `latest` would have resolved to a newer upstream. This is the same class of bug that bit pi-devbox v0.74.0 → v0.75.5 (silent same-bytes-across-releases regression discovered 2026-05-23, fixed in pi-devbox v0.75.5b). It is currently *masked* in opencode-devbox by `OPENCODE_VERSION` being a hard-coded ARG that bumps every release — that bump invalidates the parent-chain cache key for the downstream pi/omos layers — but the masking would fail the moment a `vN.N.Nb` opencode-version-unchanged release ships that only bumps pi or omos. Preventative fix: `.gitea/workflows/docker-publish-split.yml` has a `resolve-versions` job that runs `npm view @earendil-works/pi-coding-agent version` and `npm view oh-my-opencode-slim version`, exposing concrete values as outputs that every variant smoke + build job consumes via build-args. Smoke tests assert via `EXPECTED_PI_VERSION` / `EXPECTED_OMOS_VERSION` env vars — would catch the regression on the next release rather than four releases later. **If you change the variant build-args list, the resolve-versions job, or the smoke EXPECTED_*_VERSION wiring, audit all affected jobs in lockstep.**
 - **Registry buildkit cache-export is currently disabled** — do NOT re-add `cache-from`/`cache-to` to the `build-base` step in `.gitea/workflows/docker-publish-split.yml` without first verifying that buildkit's `mode=max` cache-export to `registry-1.docker.io` no longer returns HTTP 400 from the Hub CDN edge. The regression surfaced ~2026-05-23 and broke five consecutive opencode-devbox publish attempts (runs #332/333/334/336 + a rerun); root-caused on 2026-05-28 by a manual host-side publish that reproduced the same 400 only on `--cache-to` while image push worked fine. Failure shape is stable (`Offset:0` in the `_state` token, HTML response body = CDN-tier rejection, not registry backend), repo-specific (we're the only repo writing `:base-buildcache` mode=max), and explains why pinning `setup-buildx-action@v4.0.0` didn't help (action pin doesn't change the bundled buildkit version on the catthehacker runner image). Trade-off: dockerfile.base changes pay a full ~3 min rebuild instead of pulling cached layers; unchanged bases short-circuit at the Hub-probe step in `base-decide` and never re-build anyway. Variants don't use registry cache so they're unaffected. Re-enable condition: upstream moby/buildkit fix lands AND a low-risk test run succeeds without 400s. See CHANGELOG v1.15.12 `Unreleased` block for the full diagnostic chain. Manual escape-hatch publish procedure: `docs/manual-host-publish.md`.
+- **Push steps wrap `docker buildx build --push` in a 3-attempt retry loop** (15s, 30s backoff) for transient `registry-1.docker.io` blips — rate limits, brief 5xx, CDN flap. Implemented as inline `shell: bash` steps with `docker buildx build` raw rather than `docker/build-push-action@v7` so the loop is visible and tweakable. Affects the 1 base + 4 variant push steps in `.gitea/workflows/docker-publish-split.yml`; smoke-test builds (`load: true`, no push) are untouched. **This does NOT mask deterministic failures** — a true regression (like the cache-export 400 of 2026-05-23..28) fails all 3 attempts identically and the job still fails. Orthogonal to the cache-export disablement above: cache-export was about a deterministic protocol mismatch, retry is about absorbing genuine transients. Both are belt-and-braces with the `ci-release-watcher` skill's transient-rerun heuristic. If you change the matrix of push steps, keep the retry wrapper consistent across them — the pattern is duplicated rather than factored out because Gitea Actions doesn't support reusable composite shell steps cleanly.
 - **Shell scripts use `set -euo pipefail`** — both entrypoints are strict. Errors in volume chown or SSH permission operations are intentionally suppressed with `|| true`.
 - **MemPalace install path** — installed via `uv tool install` into `/opt/uv-tools/mempalace/`. Both the `mempalace` CLI and the `mempalace-mcp` MCP server binary are shipped as entry points by the mempalace package itself and placed on PATH by uv as shims whose shebangs point at the venv's Python. No hand-rolled wrapper is needed. Do not use `pip install --break-system-packages` — that was the previous approach and has been removed. Do not use `["python3", "-m", "mempalace.mcp_server"]` in `opencode.jsonc` — system Python can't import from the uv venv.
 - **generate-config.py idempotency** — the script MUST never overwrite an existing `opencode.jsonc` or legacy `opencode.json`. Config persists in the `devbox-opencode-config` named volume; accidentally clobbering that file would destroy hand-edits. The smoke test asserts this.
@@ -36,6 +36,12 @@ Next tag push (e.g. v1.15.13) is expected to publish cleanly via Gitea CI again.
 - ~~`catthehacker/ubuntu:act-latest`~~ / ~~act-runner egress~~ — manual publish from host reproduced the same 400, ruling out runner-side network. Not the cause.
 - **Confirmed:** buildkit cache-export protocol (mode=max) hitting Hub-CDN edge rejection. Workaround: don't export cache to registry. Long-term: track moby/buildkit upstream for protocol fix or switch to GHA cache (not portable to Gitea Actions).
 
+### CI: workflow-level retry around `docker buildx build --push`
+
+All five push steps in `.gitea/workflows/docker-publish-split.yml` (1 base + 4 variants) are now wrapped in a 3-attempt retry loop with backoff (15s, 30s) as belt-and-braces against transient `registry-1.docker.io` blips. Replaces the `docker/build-push-action@v7` invocations with `shell: bash` steps that run `docker buildx build --push` directly so the loop is visible and tweakable. Smoke-test build steps (`load: true`, no push) are unchanged — they don't suffer from registry-side flakiness.
+
+Does **not** mask deterministic failures: a true regression (e.g. the cache-export 400 documented above) will fail all 3 attempts identically and the job still fails by design. Belt-and-braces with the workflow-level retry-on-failure rerun heuristic in the `ci-release-watcher` skill, which catches transient-shaped runner-side failures separately. No image-side change.
+
 ---
 
 ## v1.15.12 — 2026-05-28