joakimp cb4971b4a6 Document SSH banner-timeout workaround for residential CGNAT users

Add a Troubleshooting subsection to deploy/README.md describing the
ISP-CGNAT per-destination flow-table exhaustion that manifests as
'Connection timed out during banner exchange' or pure TCP connect
timeouts after the first 3-4 SSH connects.

The fix is SSH ControlMaster/ControlPersist on the client side, which
multiplexes all SSH sessions over one TCP flow and stays within the
CGNAT cap. sync-to-vm.sh already uses this pattern internally; this
note makes it discoverable for users hitting the issue in interactive
or scripted SSH use outside the deploy/ scripts.

2026-04-21 09:04:59 +02:00

Deploy — Host VM setup

Scripts for setting up a fresh Linux VM to host opencode-devbox.

Files

cloud-init.yml — cloud-init user-data template for automated VM provisioning on OpenStack, Proxmox, or any cloud with cloud-init support
setup-host.sh — interactive post-install script for VMs that weren't provisioned with cloud-init
setup-openstack-secgroup.sh — creates an OpenStack security group with the right rules (SSH, mosh, ICMP)
sync-to-vm.sh — syncs local config directories (~/.aws, ~/.config/opencode, etc.) to a remote VM based on which bind mounts are active in its docker-compose.yml

Supported distributions

Debian 13 (Trixie) — recommended (matches opencode-devbox base image)
Ubuntu 24.04 LTS — also works

Other distributions will need manual adaptation.

Quick start

Option 1: Cloud-init (automated)

Customize cloud-init.yml — replace the SSH public key and optionally the hostname/timezone. Then use it during VM creation:

Proxmox: attach as cloud-init user-data
OpenStack: pass via --user-data flag (see full example below)
AWS/DigitalOcean/etc: paste into the "user data" field

Full OpenStack example

Cloud-init only handles guest configuration — flavor, image, network, and security group must be specified explicitly at creation time.

Note: Do not use --key-name. The SSH key is configured in cloud-init.yml under ssh_authorized_keys for the devbox user, whereas --key-name injects the key into the image's default user (e.g. debian), not the devbox user created by cloud-init.

# List available flavors to choose appropriate sizing
openstack flavor list

# Create the security group first (one-time, see below)
./setup-openstack-secgroup.sh

# Basic — boot from default storage
openstack server create \
  --flavor c4m8 \
  --image Debian-13-Trixie \
  --network my-network \
  --security-group opencode-devbox \
  --user-data cloud-init.yml \
  devbox-vm

If your cloud offers NVMe-backed (performance) volumes, boot from one for faster Docker and build I/O:

# Performance — boot from NVMe volume (40GB, preserved on instance deletion)
openstack server create \
  --flavor c4m8 \
  --network my-network \
  --security-group opencode-devbox \
  --user-data cloud-init.yml \
  --block-device source_type=image,uuid=$(openstack image show Debian-13-Trixie -f value -c id),destination_type=volume,volume_size=40,delete_on_termination=false,boot_index=0,volume_type=performance \
  devbox-vm

Note: The inline volume_type parameter requires API microversion 2.67+. If the server ends up in ERROR state, check your volume quota (openstack quota show) and try creating the volume separately, then booting from it:
openstack volume create --image Debian-13-Trixie --size 40 --type performance --bootable devbox-boot-volume
openstack server create --flavor c4m8 --volume devbox-boot-volume --network my-network --security-group opencode-devbox --user-data cloud-init.yml devbox-vm
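
A volume created from an image can take a while to become usable, and creating the server before the volume reports "available" may fail. The following is a minimal sketch of a polling helper under that assumption; wait_for_volume is a hypothetical name and is not part of this repository.

```shell
#!/usr/bin/env bash
# Sketch: poll until the boot volume reports "available".
# wait_for_volume is a hypothetical helper, not shipped with opencode-devbox.
wait_for_volume() {
  local volume="$1" tries="${2:-30}" delay="${3:-10}" status i=0
  while [ "$i" -lt "$tries" ]; do
    # -f value -c status prints only the status column
    status=$(openstack volume show "$volume" -f value -c status)
    case "$status" in
      available) return 0 ;;
      error) echo "volume $volume is in error state" >&2; return 1 ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out waiting for volume $volume" >&2
  return 1
}
```

Usage: run `wait_for_volume devbox-boot-volume` between the two commands above and only create the server if it returns success.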

Floating IP

OpenStack doesn't support assigning a floating IP at instance creation time — it's a separate step after the VM is active:

# Allocate a new floating IP from the external network
openstack floating ip create <external-network>

# Assign it to the VM
openstack server add floating ip devbox-vm <floating-ip>

To find your external network name: openstack network list --external. If you already have an unassigned floating IP, skip the create step.
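
The two steps can be combined by capturing the allocated address from the CLI output. This is a sketch only: attach_floating_ip is a hypothetical helper name, and the external network name is whatever `openstack network list --external` reports in your cloud.

```shell
#!/usr/bin/env bash
# Sketch: allocate a floating IP and attach it to a server in one go.
# attach_floating_ip is a hypothetical helper name.
attach_floating_ip() {
  local server="$1" ext_net="$2" fip
  # -f value -c floating_ip_address prints only the allocated address
  fip=$(openstack floating ip create "$ext_net" -f value -c floating_ip_address) || return 1
  openstack server add floating ip "$server" "$fip" || return 1
  # Print the address so the caller can use it for SSH
  echo "$fip"
}
```

Usage: `attach_floating_ip devbox-vm <external-network>` prints the address that was assigned.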

The VM boots with Docker installed, firewall configured (or skipped on OpenStack), and your SSH key authorized. Log in as the devbox user.

Console password (optional)

The cloud-init template uses SSH key authentication only — no password is set by default. This is sufficient for normal use since the devbox user has passwordless sudo.

A password is only needed for:

Emergency console access — logging in via OpenStack Horizon console (noVNC) or Proxmox VNC when SSH is unreachable
su - devbox — switching to the devbox user from another account

To enable console access, uncomment the chpasswd block in cloud-init.yml before deploying:

chpasswd:
  expire: false
  users:
    - name: devbox
      password: your-password-here
      type: text

For an already-running VM, set a password via SSH:

sudo passwd devbox

Option 2: Post-install script (manual)

On a fresh Debian/Ubuntu VM:

curl -fsSL https://gitea.jordbo.se/joakimp/opencode-devbox/raw/branch/main/deploy/setup-host.sh | bash

Or clone and run:

git clone https://gitea.jordbo.se/joakimp/opencode-devbox
cd opencode-devbox/deploy
./setup-host.sh

What gets installed

Docker Engine (from Docker's official apt repo, not distro's docker.io)
Docker Compose plugin (v2)
tmux, mosh, git
ufw firewall with SSH (22) and mosh (UDP 60000-61000) allowed — skipped on OpenStack (detected automatically; use security groups instead)
IPv4 DNS preference (works around Docker Hub IPv6 connectivity issues)
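
On non-OpenStack hosts, the firewall rules listed above correspond roughly to the ufw commands below. This is a sketch of equivalent manual rules, not an excerpt from setup-host.sh; the function name apply_devbox_firewall is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: manual equivalent of the firewall rules described above.
# apply_devbox_firewall is a hypothetical name; setup-host.sh may differ.
apply_devbox_firewall() {
  sudo ufw allow 22/tcp &&            # SSH
  sudo ufw allow 60000:61000/udp &&   # mosh
  sudo ufw --force enable             # enable without the interactive prompt
}
```

Run `sudo ufw status verbose` afterwards to confirm which rules are active.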

OpenStack security groups

On OpenStack, firewalling is handled by security groups rather than ufw. The setup-host.sh script detects OpenStack automatically and skips ufw configuration.

To create the required security group:

./setup-openstack-secgroup.sh

This creates a security group named opencode-devbox with rules for SSH (TCP 22), mosh (UDP 60000-61000), and ICMP. Apply it to your instance:

# New instance
openstack server create --security-group opencode-devbox ...

# Existing instance
openstack server add security group <instance-name> opencode-devbox
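
For reference, the rules described in this section could be created by hand roughly as follows. This is a sketch, not the contents of setup-openstack-secgroup.sh; create_devbox_secgroup is a hypothetical name, and the rules rely on the CLI defaults for direction and source (ingress from any IPv4 address), which you may want to tighten.

```shell
#!/usr/bin/env bash
# Sketch: create the security group and its three rules manually.
# create_devbox_secgroup is a hypothetical name; the real script may differ.
create_devbox_secgroup() {
  local group="${1:-opencode-devbox}"
  openstack security group create "$group" \
    --description "SSH, mosh and ICMP for opencode-devbox" &&
  openstack security group rule create --protocol tcp --dst-port 22 "$group" &&
  openstack security group rule create --protocol udp --dst-port 60000:61000 "$group" &&
  openstack security group rule create --protocol icmp "$group"
}
```

Usage: `create_devbox_secgroup` with no argument uses the name opencode-devbox.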

VM sizing recommendations

Use case	vCPU	RAM	Disk
Minimum	2	4 GB	20 GB
Recommended	4	8 GB	40 GB
Heavy use (Rust/Python builds, multi-project)	8	16 GB	80 GB
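
To check a flavor against the table, a small helper like the one below can classify a vCPU/RAM/disk combination. It is an illustrative sketch; sizing_tier and its tier labels are hypothetical names, and the thresholds are copied from the table above.

```shell
#!/usr/bin/env bash
# Sketch: classify a VM size against the table above.
# Arguments: vCPU count, RAM in GB, disk in GB (whole numbers).
sizing_tier() {
  local vcpu="$1" ram_gb="$2" disk_gb="$3"
  if [ "$vcpu" -ge 8 ] && [ "$ram_gb" -ge 16 ] && [ "$disk_gb" -ge 80 ]; then
    echo "heavy"
  elif [ "$vcpu" -ge 4 ] && [ "$ram_gb" -ge 8 ] && [ "$disk_gb" -ge 40 ]; then
    echo "recommended"
  elif [ "$vcpu" -ge 2 ] && [ "$ram_gb" -ge 4 ] && [ "$disk_gb" -ge 20 ]; then
    echo "minimum"
  else
    echo "below-minimum"
  fi
}
```

For example, `sizing_tier 4 8 40` prints recommended, while `sizing_tier 2 2 20` prints below-minimum.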

After VM setup

If you uncomment any bind mounts in docker-compose.yml (e.g. ~/.aws, ~/.config/opencode), create the directories first — Docker creates missing bind mount paths as root-owned, which causes permission issues:

# Only create directories for mounts you uncomment
mkdir -p ~/.aws                  # AWS Bedrock SSO
mkdir -p ~/.config/opencode      # persistent opencode config
mkdir -p ~/.config/nvim          # custom neovim config
mkdir -p ~/.agents/skills        # opencode agent skills

Named volumes (devbox-data, devbox-uv, etc.) are managed by Docker and need no pre-creation.
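
If a container was already started before the directories existed, the root-owned paths can be spotted with a quick ownership check. The helper below is a sketch; check_mount_dirs is a hypothetical name, and you pass it whichever bind mount paths you uncommented.

```shell
#!/usr/bin/env bash
# Sketch: report bind mount directories that are missing or not owned
# by the current user. check_mount_dirs is a hypothetical helper.
check_mount_dirs() {
  local dir status=0
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "missing: $dir"
      status=1
    elif [ ! -O "$dir" ]; then
      # -O is true only when the current user owns the path
      echo "not owned by $(id -un): $dir"
      status=1
    fi
  done
  return "$status"
}
```

Usage: `check_mount_dirs ~/.aws ~/.config/opencode`. A directory reported as not owned can be fixed with `sudo chown -R "$(id -un)" <dir>`.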

mkdir -p ~/opencode-devbox && cd ~/opencode-devbox
curl -fsSL https://gitea.jordbo.se/joakimp/opencode-devbox/raw/branch/main/docker-compose.yml -o docker-compose.yml
curl -fsSL https://gitea.jordbo.se/joakimp/opencode-devbox/raw/branch/main/.env.example -o .env
vim .env                                           # configure provider and keys
vim docker-compose.yml                             # uncomment optional volume mounts
docker compose up -d
docker compose exec -u developer devbox opencode

AWS Bedrock users: Uncomment the ~/.aws volume mount in docker-compose.yml before starting. You'll also need to copy your ~/.aws/config from a machine where SSO is already configured, then authenticate inside the container with aws sso login.

Syncing local config to the VM

After editing docker-compose.yml on the VM to uncomment the bind mounts you need, run sync-to-vm.sh from your local machine to copy the corresponding directories:

./deploy/sync-to-vm.sh devbox-affection

The script reads docker-compose.yml on the remote VM, detects which bind mounts are active, and syncs only those directories from your local machine. It also creates the remote directories if they don't exist.
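
To illustrate what "active bind mounts" means here, the sketch below lists the host paths of uncommented bind mounts in a compose file. It is not the logic sync-to-vm.sh actually uses; active_bind_mounts is a hypothetical name, and it assumes short-syntax volume entries such as `- ~/.aws:/some/container/path`.

```shell
#!/usr/bin/env bash
# Sketch: print host paths of uncommented bind mounts in a compose file.
# Hypothetical helper; assumes short-syntax "- <host>:<container>" entries.
active_bind_mounts() {
  local compose_file="$1"
  # Match list items whose source starts with ~/, ./ or / (a host path).
  # Commented entries start with '#', and named volumes start with a
  # plain name, so neither matches.
  sed -n -E 's/^[[:space:]]*-[[:space:]]*"?([~.]?\/[^:"]+):.*$/\1/p' "$compose_file"
}
```

Usage: `active_bind_mounts docker-compose.yml` prints one host path per line, with `~` left unexpanded.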

Troubleshooting: SSH hangs or "banner exchange" timeouts

If SSH to the VM intermittently fails with Connection timed out during banner exchange or pure TCP connect timeouts — especially after the first few successful connects in a short window — the cause is almost certainly your ISP's CGNAT (Carrier-Grade NAT), not the VM.

Symptoms

First 3–4 SSH connects succeed, then subsequent ones fail hard for 20–30 minutes
ping to the VM works perfectly throughout (ICMP isn't tracked the same way)
mosh sessions stay stable once established (UDP, different flow table)
Happens on residential ISPs (Tele2, Comhem, Telia, most European consumer broadband)
VM-side logs show SSH is idle — the SYNs never reach it
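
To confirm the pattern (ICMP works while TCP to port 22 does not), a bare TCP connect test can be compared with ping. The helper below is a sketch with a hypothetical name, tcp_reachable; it relies on bash's /dev/tcp feature and the coreutils timeout command. Each probe opens a TCP flow of its own, so use it sparingly.

```shell
#!/usr/bin/env bash
# Sketch: succeed if a TCP connection to host:port opens within the timeout.
# tcp_reachable is a hypothetical helper; defaults are port 22 and 5 seconds.
tcp_reachable() {
  local host="$1" port="${2:-22}" secs="${3:-5}"
  # bash's /dev/tcp pseudo-device opens a TCP connection;
  # timeout bounds the wait when SYNs are silently dropped.
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

Usage: run `ping -c 3 <vm-ip>` and then `tcp_reachable <vm-ip> && echo "TCP ok" || echo "TCP blocked"`. Ping succeeding while the TCP probe times out matches the symptoms above.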

Cause

Some residential CGNAT boxes keep a per-subscriber TCP flow table with a small cap on concurrent flows (~4) per destination IP. Once the cap is reached, new SYNs to that destination are silently dropped until old flows age out (typically 20–30 min after TCP close).

Fix

Add SSH connection multiplexing on your client so all SSH sessions (interactive, scp, rsync, scripts) share a single TCP connection to the VM:

# ~/.ssh/config
Host <vm-alias>
    HostName <vm-ip>
    User devbox
    IdentityFile ~/.ssh/id_ed25519
    ControlMaster auto
    ControlPath ~/.ssh/cm/%r@%h:%p
    ControlPersist 4h
    ServerAliveInterval 30
    ServerAliveCountMax 6

Then create the socket directory:

mkdir -p ~/.ssh/cm && chmod 700 ~/.ssh/cm

All SSH to the VM now multiplexes over a single flow slot, regardless of how many parallel sessions you open. sync-to-vm.sh already does this internally for its own rsync/scp calls.
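
To verify that multiplexing is in effect, OpenSSH can query the master connection with `ssh -O check`. The wrapper below is a small sketch; mux_status is a hypothetical name, and the host argument is whatever alias you used in ~/.ssh/config.

```shell
#!/usr/bin/env bash
# Sketch: report whether a ControlMaster connection exists for a host alias.
# mux_status is a hypothetical helper around "ssh -O check".
mux_status() {
  local host="$1"
  # "ssh -O check" exits 0 when a master process is running for this host
  if ssh -O check "$host" 2>/dev/null; then
    echo "master running for $host"
  else
    echo "no master for $host"
    return 1
  fi
}
```

Usage: open one session with `ssh <vm-alias>`, then run `mux_status <vm-alias>` from another terminal. `ssh -O exit <vm-alias>` closes the shared connection if you need to reset it.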

For a more robust long-term fix (especially if you access the VM from multiple hosts), run a WireGuard tunnel on the VM and route SSH through that — UDP bypasses the TCP flow table entirely.
