Document SSH banner-timeout workaround for residential CGNAT users

Add a Troubleshooting subsection to deploy/README.md describing the
ISP-CGNAT per-destination flow-table exhaustion that manifests as
'Connection timed out during banner exchange' or pure TCP connect
timeouts after the first 3-4 SSH connects.

The fix is SSH ControlMaster/ControlPersist on the client side, which
multiplexes all SSH sessions over one TCP flow and stays within the
CGNAT cap. sync-to-vm.sh already uses this pattern internally; this
note makes it discoverable for users hitting the issue in interactive
or scripted SSH use outside the deploy/ scripts.
This commit is contained in:
2026-04-21 09:04:59 +02:00
parent 3d632ef02f
commit cb4971b4a6
+43
View File
@@ -197,3 +197,46 @@ After editing `docker-compose.yml` on the VM to uncomment the bind mounts you ne
```
The script reads `docker-compose.yml` on the remote VM, detects which bind mounts are active, and syncs only those directories from your local machine. It also creates the remote directories if they don't exist.
### Troubleshooting: SSH hangs or "banner exchange" timeouts
If SSH to the VM intermittently fails with `Connection timed out during banner exchange` or pure TCP connect timeouts — especially after the first few successful connects in a short window — the cause is almost certainly your ISP's CGNAT (Carrier-Grade NAT), not the VM.
**Symptoms**
- First 34 SSH connects succeed, then subsequent ones fail hard for 2030 minutes
- `ping` to the VM works perfectly throughout (ICMP isn't tracked the same way)
- `mosh` sessions stay stable once established (UDP, different flow table)
- Happens on residential ISPs (Tele2, Comhem, Telia, most European consumer broadband)
- VM-side logs show SSH is idle — the SYNs never reach it
**Cause**
Residential CGNAT boxes keep a per-subscriber TCP flow table with a small concurrent-flow cap (~4) per destination IP. Once exhausted, new SYNs to that destination are silently dropped until old flows age out (typically 2030 min after TCP close).
**Fix**
Add SSH connection multiplexing on your client so all SSH sessions (interactive, `scp`, `rsync`, scripts) share a single TCP connection to the VM:
```ssh-config
# ~/.ssh/config
Host <vm-alias>
HostName <vm-ip>
User devbox
IdentityFile ~/.ssh/id_ed25519
ControlMaster auto
ControlPath ~/.ssh/cm/%r@%h:%p
ControlPersist 4h
ServerAliveInterval 30
ServerAliveCountMax 6
```
Then create the socket directory:
```bash
mkdir -p ~/.ssh/cm && chmod 700 ~/.ssh/cm
```
All SSH to the VM now multiplexes over a single flow slot, regardless of how many parallel sessions you open. `sync-to-vm.sh` already does this internally for its own rsync/scp calls.
For a more robust long-term fix (especially if you access the VM from multiple hosts), run a WireGuard tunnel on the VM and route SSH through that — UDP bypasses the TCP flow table entirely.