opencode-devbox/deploy/README.md
joakimp 4560702550 Document the upgrade-ritual for reconciling VM compose files
New releases may add named volumes or bind-mount lines to
docker-compose.yml. The image can't update compose files on the VM —
they're user-owned — so a plain 'docker compose pull && up -d' picks
up the new image but silently misses new mount points.

Example from v1.14.19c → v1.14.20: bash history persistence needs
the devbox-shell-history named volume at /home/developer/.cache/bash.
The v1.14.20 image is configured to write history there either way,
but without the volume mount on the VM, writes land in the container's
writable layer and vanish on every --force-recreate.

Add a 'Upgrading an existing VM to a new release' subsection to
deploy/README.md describing the backup → diff → merge → recreate
ritual, so future upgrades don't quietly drop features the same way.
2026-04-22 10:29:03 +02:00

# Deploy — Host VM setup
Scripts for setting up a fresh Linux VM to host opencode-devbox.
## Files
- **`cloud-init.yml`** — cloud-init user-data template for automated VM provisioning on OpenStack, Proxmox, or any cloud with cloud-init support
- **`setup-host.sh`** — interactive post-install script for VMs that weren't provisioned with cloud-init
- **`setup-openstack-secgroup.sh`** — creates an OpenStack security group with the right rules (SSH, mosh, ICMP)
- **`sync-to-vm.sh`** — syncs local config directories (`~/.aws`, `~/.config/opencode`, etc.) to a remote VM based on which bind mounts are active in its `docker-compose.yml`
## Supported distributions
- **Debian 13 (Trixie)** — recommended (matches opencode-devbox base image)
- **Ubuntu 24.04 LTS** — also works
Other distributions will need manual adaptation.
## Quick start
### Option 1: Cloud-init (automated)
Customize `cloud-init.yml` — replace the SSH public key and optionally the hostname/timezone. Then use it during VM creation:
- **Proxmox**: attach as cloud-init user-data
- **OpenStack**: pass via `--user-data` flag (see full example below)
- **AWS, DigitalOcean, etc.**: paste into the "user data" field
#### Full OpenStack example
Cloud-init only handles guest configuration — flavor, image, network, and security group must be specified explicitly at creation time.
> **Note:** Do not use `--key-name` — the SSH key is configured in `cloud-init.yml` under `ssh_authorized_keys` for the `devbox` user. The `--key-name` flag injects into the image's default user (e.g. `debian`), not the `devbox` user created by cloud-init.
```bash
# List available flavors to choose appropriate sizing
openstack flavor list
# Create the security group first (one-time, see below)
./setup-openstack-secgroup.sh
# Basic — boot from default storage
openstack server create \
  --flavor c4m8 \
  --image Debian-13-Trixie \
  --network my-network \
  --security-group opencode-devbox \
  --user-data cloud-init.yml \
  devbox-vm
```
If your cloud offers NVMe-backed (performance) volumes, boot from one for faster Docker and build I/O:
```bash
# Performance — boot from NVMe volume (40GB, preserved on instance deletion)
openstack server create \
  --flavor c4m8 \
  --network my-network \
  --security-group opencode-devbox \
  --user-data cloud-init.yml \
  --block-device source_type=image,uuid=$(openstack image show Debian-13-Trixie -f value -c id),destination_type=volume,volume_size=40,delete_on_termination=false,boot_index=0,volume_type=performance \
  devbox-vm
```
> **Note:** The inline `volume_type` parameter requires API microversion 2.67+. If the server goes to ERROR state, check your volume quota (`openstack quota show`) and try creating the volume separately:
> ```bash
> openstack volume create --image Debian-13-Trixie --size 40 --type performance --bootable devbox-boot-volume
> openstack server create --flavor c4m8 --volume devbox-boot-volume --network my-network --security-group opencode-devbox --user-data cloud-init.yml devbox-vm
> ```
#### Floating IP
OpenStack doesn't support assigning a floating IP at instance creation time — it's a separate step after the VM is active:
```bash
# Allocate a new floating IP from the external network
openstack floating ip create <external-network>
# Assign it to the VM
openstack server add floating ip devbox-vm <floating-ip>
```
To find your external network name: `openstack network list --external`. If you already have an unassigned floating IP, skip the create step.
The VM boots with Docker installed, firewall configured (or skipped on OpenStack), and your SSH key authorized. Log in as the `devbox` user.
### Console password (optional)
The cloud-init template uses SSH key authentication only — no password is set by default. This is sufficient for normal use since the `devbox` user has passwordless `sudo`.
A password is only needed for:
- **Emergency console access** — logging in via OpenStack Horizon console (noVNC) or Proxmox VNC when SSH is unreachable
- **`su - devbox`** — switching to the devbox user from another account
To enable console access, uncomment the `chpasswd` block in `cloud-init.yml` before deploying:
```yaml
chpasswd:
  expire: false
  users:
    - name: devbox
      password: your-password-here
      type: text
```
For an already-running VM, set a password via SSH:
```bash
sudo passwd devbox
```
### Option 2: Post-install script (manual)
On a fresh Debian/Ubuntu VM:
```bash
curl -fsSL https://gitea.jordbo.se/joakimp/opencode-devbox/raw/branch/main/deploy/setup-host.sh | bash
```
Or clone and run:
```bash
git clone https://gitea.jordbo.se/joakimp/opencode-devbox
cd opencode-devbox/deploy
./setup-host.sh
```
## What gets installed
- Docker Engine (from Docker's official apt repo, not distro's `docker.io`)
- Docker Compose plugin (v2)
- `tmux`, `mosh`, `git`
- `ufw` firewall with SSH (22) and mosh (UDP 60000-61000) allowed — **skipped on OpenStack** (detected automatically; use security groups instead)
- IPv4 DNS preference (works around Docker Hub IPv6 connectivity issues)
## OpenStack security groups
On OpenStack, firewalling is handled by security groups rather than ufw. The `setup-host.sh` script detects OpenStack automatically and skips ufw configuration.
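How `setup-host.sh` detects OpenStack is not shown in this README. As a rough illustration only, one common approach is to read the DMI product name, which Nova guests typically report as `OpenStack Nova` or `OpenStack Compute`. The helper name `is_openstack` and the use of DMI are assumptions, not taken from the script:

```bash
# Hypothetical helper, NOT taken from setup-host.sh.
# Returns 0 when the DMI product name looks like an OpenStack guest.
# The file path is a parameter so the check can be tried off-VM.
is_openstack() {
  local dmi_file="${1:-/sys/class/dmi/id/product_name}"
  [ -r "$dmi_file" ] || return 1
  # Nova guests typically report "OpenStack Nova" or "OpenStack Compute".
  grep -qi '^openstack' "$dmi_file"
}
```

A caller could then branch on it, for example `if is_openstack; then echo "skipping ufw"; fi`.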
To create the required security group:
```bash
./setup-openstack-secgroup.sh
```
This creates a security group named `opencode-devbox` with rules for SSH (TCP 22), mosh (UDP 60000-61000), and ICMP. Apply it to your instance:
```bash
# New instance
openstack server create --security-group opencode-devbox ...
# Existing instance
openstack server add security group <instance-name> opencode-devbox
```
## VM sizing recommendations
| Use case | vCPU | RAM | Disk |
|---|---|---|---|
| Minimum | 2 | 4 GB | 20 GB |
| Recommended | 4 | 8 GB | 40 GB |
| Heavy use (Rust/Python builds, multi-project) | 8 | 16 GB | 80 GB |
## After VM setup
If you uncomment any bind mounts in `docker-compose.yml` (e.g. `~/.aws`, `~/.config/opencode`), create the directories first — Docker creates missing bind mount paths as root-owned, which causes permission issues:
```bash
# Only create directories for mounts you uncomment
mkdir -p ~/.aws # AWS Bedrock SSO
mkdir -p ~/.config/opencode # persistent opencode config
mkdir -p ~/.config/nvim # custom neovim config
mkdir -p ~/.agents/skills # opencode agent skills
```
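If the container was already started once, Docker may have created some of these paths as root. A minimal sketch for spotting that, assuming you pass the same directories you uncommented (`check_mount_dirs` is a hypothetical helper name):

```bash
# Hypothetical helper: report bind-mount source directories that are
# missing or not owned by the current user. Returns non-zero if any
# directory needs attention.
check_mount_dirs() {
  local dir status=0
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "missing: $dir"
      status=1
    elif [ ! -O "$dir" ]; then
      echo "wrong owner: $dir"
      status=1
    fi
  done
  return "$status"
}
```

For example, `check_mount_dirs ~/.aws ~/.config/opencode` prints nothing when both directories exist and belong to you. A directory reported as `wrong owner` can usually be repaired with `sudo chown -R "$(id -u):$(id -g)" <dir>`.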
Named volumes (`devbox-data`, `devbox-uv`, etc.) are managed by Docker and need no pre-creation.

To fetch the compose file and environment template, configure them, and start the container:
```bash
mkdir -p ~/opencode-devbox && cd ~/opencode-devbox
# -f makes curl fail on HTTP errors instead of saving an error page as the file
curl -fsSL https://gitea.jordbo.se/joakimp/opencode-devbox/raw/branch/main/docker-compose.yml -o docker-compose.yml
curl -fsSL https://gitea.jordbo.se/joakimp/opencode-devbox/raw/branch/main/.env.example -o .env
vim .env # configure provider and keys
vim docker-compose.yml # uncomment optional volume mounts
docker compose up -d
docker compose exec -u developer devbox opencode
```
> **AWS Bedrock users:** Uncomment the `~/.aws` volume mount in `docker-compose.yml` before starting. You'll also need to copy your `~/.aws/config` from a machine where SSO is already configured, then authenticate inside the container with `aws sso login`.
### Syncing local config to the VM
After editing `docker-compose.yml` on the VM to uncomment the bind mounts you need, run `sync-to-vm.sh` from your local machine to copy the corresponding directories. The argument is the VM's SSH host (`devbox-affection` in these examples):
```bash
./deploy/sync-to-vm.sh devbox-affection
```
The script reads `docker-compose.yml` on the remote VM, detects which bind mounts are active, and syncs only those directories from your local machine. It also creates the remote directories if they don't exist.
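The detection idea can be illustrated with a short sketch. This is not the logic of `sync-to-vm.sh` itself: the function name `active_bind_mounts` is hypothetical, and it assumes unquoted short-syntax entries whose host side starts with `~/`:

```bash
# Hypothetical sketch, not the logic of sync-to-vm.sh itself.
# Prints the host side of every uncommented short-syntax bind mount
# whose source starts with ~/ (e.g. "- ~/.aws:<container-path>").
active_bind_mounts() {
  local compose_file="${1:-docker-compose.yml}"
  # Commented lines have '#' before the dash, so the anchored pattern
  # skips them. Named volumes are skipped because they lack the ~/ prefix.
  sed -n 's|^[[:space:]]*-[[:space:]]*\(~/[^:]*\):.*|\1|p' "$compose_file"
}
```

Running `active_bind_mounts docker-compose.yml` on a file where only the `~/.aws` mount is uncommented would print just `~/.aws`; each printed path is a candidate for syncing.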
### Upgrading an existing VM to a new release
Each tagged release may add new named volumes or bind-mount lines to `docker-compose.yml`. Pulling a new image via `docker compose pull` grabs the new container behaviour, but compose files on the VM are user-owned and never touched by the image — you have to reconcile them yourself when upgrading across versions.
**Symptom of a missed reconcile:** a new feature quietly doesn't work even though the image is correct. Example from v1.14.19c → v1.14.20: bash history persistence requires the `devbox-shell-history` named volume mounted at `/home/developer/.cache/bash`. The v1.14.20 image writes history to that path either way, but without the volume mount on the VM, writes land in the container's writable layer and vanish on every `--force-recreate`.
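One way to confirm this symptom is to check, from a shell inside the container, whether the path is a real mount point or just a directory in the writable layer. A minimal sketch, assuming a Linux `/proc/self/mountinfo` (field 5 of each line is the mount point); `is_mount_point` is a hypothetical helper:

```bash
# Hypothetical helper: succeed if PATH appears as a mount point in a
# mountinfo file. The file is a parameter so saved output can be checked.
is_mount_point() {
  local path="$1" mountinfo="${2:-/proc/self/mountinfo}"
  awk -v p="$path" '$5 == p { found = 1 } END { exit !found }' "$mountinfo"
}
```

Inside the container, `is_mount_point /home/developer/.cache/bash || echo "history volume is NOT mounted"` would flag the missed reconcile described above.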
**Upgrade ritual:**
```bash
# On the VM: back up the current compose file before changing anything
cd ~/opencode-devbox
cp docker-compose.yml docker-compose.yml.bak-$(date +%Y%m%d-%H%M%S)

# On your local machine: fetch the VM's file and diff it against your checkout
# (adjust the checkout path to match your setup)
scp devbox-affection:~/opencode-devbox/docker-compose.yml /tmp/vm-compose.yml
diff -u /tmp/vm-compose.yml ~/src/src_local/opencode-devbox/docker-compose.yml
```
For each new `volumes:` entry or mount line in the repo version that isn't in your VM's file, add it manually, preserving any local customizations you've made (image variant, read/write flags on bind mounts, etc.). Then, back on the VM:
```bash
docker compose config >/dev/null # verify YAML still parses
docker compose up -d --force-recreate
```
If you maintain the VM's compose file with no local changes, `scp` the repo version over wholesale. If you have customizations (the common case), do the diff-and-merge by hand.
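When the two files have drifted a lot, a full `diff -u` can be noisy. As a rough aid, the sketch below lists only the uncommented lines that exist in the repo version but not in the VM's copy, ignoring indentation. Both function names are hypothetical, and the output still needs a human eye, because local customizations also show up as differences:

```bash
# Hypothetical helpers for narrowing down what a new release added.

# Strip leading whitespace, then drop full-line comments and blank lines.
compose_lines() {
  sed -e 's/^[[:space:]]*//' -e '/^#/d' -e '/^$/d' "$1"
}

# Print lines present in the repo file but absent from the VM file.
repo_only_lines() {
  local vm_file="$1" repo_file="$2" vm_lines
  vm_lines=$(compose_lines "$vm_file")
  # -F fixed strings, -x whole-line match, -v invert: keep repo lines
  # that have no identical line in the VM file.
  compose_lines "$repo_file" | grep -Fxv -e "$vm_lines" || true
}
```

With the files from the ritual above, `repo_only_lines /tmp/vm-compose.yml <path-to-checkout>/docker-compose.yml` prints the candidate lines to merge. Entries that are still commented out in the repo version are deliberately not reported.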
### Shell defaults inside the container
The image ships baked `.bash_aliases` and `.inputrc` in `/etc/skel-devbox/` — quality-of-life defaults (prefix history search on Up/Down arrows, persistent history across container recreates via the `devbox-shell-history` named volume, `[devbox]` prompt marker, sensible aliases). On first container start the entrypoint copies them to `/home/developer/` **only if the target file does not already exist**.
This means:
- Fresh containers get the defaults automatically.
- If you bind-mount your host's `~/.bash_aliases` / `~/.inputrc` (see the commented lines in `docker-compose.yml`), your host versions win.
- If you edit the files inside a running container and store them via a home-dir bind-mount or equivalent, subsequent upgrades never overwrite them.
- To restore the baked defaults any time: `cp /etc/skel-devbox/.bash_aliases ~/` (or delete the file and recreate the container).
- To diff your current config against what the image ships: `diff ~/.bash_aliases /etc/skel-devbox/.bash_aliases`.
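The copy-only-if-absent rule can be pictured with the sketch below. It illustrates the behaviour described above and is not the image's actual entrypoint code; `seed_defaults` is a hypothetical name:

```bash
# Sketch of the described behaviour, not the image's actual entrypoint.
# Copies each baked default from SKEL_DIR into HOME_DIR unless a file
# with that name already exists there.
seed_defaults() {
  local skel_dir="$1" home_dir="$2" name
  for name in .bash_aliases .inputrc; do
    if [ -f "$skel_dir/$name" ] && [ ! -e "$home_dir/$name" ]; then
      cp "$skel_dir/$name" "$home_dir/$name"
    fi
  done
}
```

Called as `seed_defaults /etc/skel-devbox /home/developer`, it would leave an existing `.bash_aliases` untouched and only fill in whichever file is missing, which is why bind-mounted or hand-edited files survive upgrades.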
### Troubleshooting: SSH hangs or "banner exchange" timeouts
If SSH to the VM intermittently fails with `Connection timed out during banner exchange` or pure TCP connect timeouts — especially after the first few successful connects in a short window — the cause is almost certainly your ISP's CGNAT (Carrier-Grade NAT), not the VM.
**Symptoms**
- First 3-4 SSH connects succeed, then subsequent ones fail hard for 20-30 minutes
- `ping` to the VM works perfectly throughout (ICMP isn't tracked the same way)
- `mosh` sessions stay stable once established (UDP, different flow table)
- Happens on residential ISPs (Tele2, Comhem, Telia, most European consumer broadband)
- VM-side logs show SSH is idle — the SYNs never reach it
**Cause**
Residential CGNAT boxes keep a per-subscriber TCP flow table with a small concurrent-flow cap (~4) per destination IP. Once exhausted, new SYNs to that destination are silently dropped until old flows age out (typically 20-30 min after TCP close).
**Fix**
Add SSH connection multiplexing on your client so all SSH sessions (interactive, `scp`, `rsync`, scripts) share a single TCP connection to the VM:
```ssh-config
# ~/.ssh/config
Host <vm-alias>
    HostName <vm-ip>
    User devbox
    IdentityFile ~/.ssh/id_ed25519
    ControlMaster auto
    ControlPath ~/.ssh/cm/%r@%h:%p
    ControlPersist 4h
    ServerAliveInterval 30
    ServerAliveCountMax 6
```
Then create the socket directory:
```bash
mkdir -p ~/.ssh/cm && chmod 700 ~/.ssh/cm
```
All SSH to the VM now multiplexes over a single flow slot, regardless of how many parallel sessions you open. `sync-to-vm.sh` already does this internally for its own rsync/scp calls.
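To check that the settings are actually picked up for your host alias, `ssh -G` prints the effective client configuration without connecting. The small filter below (a hypothetical helper named `mux_settings`) narrows that output to the multiplexing and keepalive options:

```bash
# Hypothetical helper: filter `ssh -G <host>` output down to the
# multiplexing and keepalive settings. Reads from stdin so it also
# works on saved output.
mux_settings() {
  grep -iE '^(controlmaster|controlpath|controlpersist|serveraliveinterval|serveralivecountmax) '
}
```

Use it as `ssh -G <vm-alias> | mux_settings`. Once a session is open, `ssh -O check <vm-alias>` reports whether a master connection is running for that host.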
For a more robust long-term fix (especially if you access the VM from multiple hosts), run a WireGuard tunnel on the VM and route SSH through that — UDP bypasses the TCP flow table entirely.