Document SSH banner-timeout workaround for residential CGNAT users
Add a Troubleshooting subsection to deploy/README.md describing the ISP-CGNAT per-destination flow-table exhaustion that manifests as 'Connection timed out during banner exchange' or pure TCP connect timeouts after the first 3-4 SSH connects. The fix is SSH ControlMaster/ControlPersist on the client side, which multiplexes all SSH sessions over one TCP flow and stays within the CGNAT cap. sync-to-vm.sh already uses this pattern internally; this note makes it discoverable for users hitting the issue in interactive or scripted SSH use outside the deploy/ scripts.
@@ -197,3 +197,46 @@ After editing `docker-compose.yml` on the VM to uncomment the bind mounts you ne
The script reads `docker-compose.yml` on the remote VM, detects which bind mounts are active, and syncs only those directories from your local machine. It also creates the remote directories if they don't exist.

### Troubleshooting: SSH hangs or "banner exchange" timeouts

If SSH to the VM intermittently fails with `Connection timed out during banner exchange` or with plain TCP connect timeouts, especially after the first few successful connects in a short window, the cause is almost certainly your ISP's CGNAT (Carrier-Grade NAT), not the VM.

**Symptoms**

- The first 3–4 SSH connects succeed, then subsequent ones fail outright for 20–30 minutes
- `ping` to the VM keeps working throughout (ICMP isn't tracked the same way)
- `mosh` sessions stay stable once established (UDP, different flow table)
- Happens on residential ISPs (Tele2, Comhem, Telia, most European consumer broadband)
- VM-side logs show SSH is idle, because the SYNs never reach the VM

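
To check whether a failure matches the symptoms above, it can help to compare ICMP reachability with TCP reachability on the SSH port. The following is a minimal sketch, not part of the deploy scripts: the function names are made up for illustration, and it assumes `ping` plus an `nc` that supports `-z` and `-w` (for example OpenBSD netcat). Each TCP probe is itself a new connection, so run it sparingly. A blocked port 22 is only consistent with CGNAT exhaustion; it does not rule out a firewall or a stopped `sshd`.

```bash
#!/usr/bin/env bash
# Sketch: compare ICMP reachability with TCP reachability on port 22.
# Assumes `ping` and an `nc` with -z (scan only) and -w (timeout in seconds).

# Turn the two probe exit codes into a human-readable verdict.
classify_probe() {
    local ping_rc=$1 tcp_rc=$2
    if [ "$ping_rc" -eq 0 ] && [ "$tcp_rc" -ne 0 ]; then
        echo "ping ok, tcp/22 blocked: consistent with CGNAT flow exhaustion"
    elif [ "$ping_rc" -eq 0 ]; then
        echo "ping ok, tcp/22 ok: no problem right now"
    else
        echo "ping failed: look at general connectivity or the VM first"
    fi
}

# Probe one host and print the verdict (hypothetical helper name).
probe_vm() {
    local host=$1 ping_rc=0 tcp_rc=0
    ping -c 3 "$host" >/dev/null 2>&1 || ping_rc=$?
    nc -z -w 5 "$host" 22 >/dev/null 2>&1 || tcp_rc=$?
    classify_probe "$ping_rc" "$tcp_rc"
}

# Usage: probe_vm <vm-ip>
```
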
**Cause**

Residential CGNAT boxes keep a per-subscriber TCP flow table with a small concurrent-flow cap (~4) per destination IP. Once that cap is reached, new SYNs to the same destination are silently dropped until old flows age out (typically 20–30 min after TCP close).

**Fix**

Enable SSH connection multiplexing on your client so all SSH sessions (interactive, `scp`, `rsync`, scripts) share a single TCP connection to the VM:

```ssh-config
# ~/.ssh/config
Host <vm-alias>
    HostName <vm-ip>
    User devbox
    IdentityFile ~/.ssh/id_ed25519
    # Reuse an existing master connection, or start one if none is running
    ControlMaster auto
    ControlPath ~/.ssh/cm/%r@%h:%p
    # Keep the master open in the background for up to 4h after the last session closes
    ControlPersist 4h
    # Send a keepalive every 30 s; give up after 6 unanswered probes (about 3 min)
    ServerAliveInterval 30
    ServerAliveCountMax 6
```

Then create the socket directory that `ControlPath` points to (ssh does not create it for you):

```bash
mkdir -p ~/.ssh/cm && chmod 700 ~/.ssh/cm
```

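
Once the directory exists, one way to confirm that multiplexing is active is OpenSSH's `-O` control commands. The sketch below wraps `ssh -O check` and `ssh -O exit`; the function names and the `VM_ALIAS` placeholder are illustrative assumptions, not something the deploy scripts provide.

```bash
# Sketch: check or reset the shared master connection for a host alias.
# VM_ALIAS is a placeholder; it must match the Host entry in ~/.ssh/config.
VM_ALIAS="${VM_ALIAS:-vm-alias}"

# Report whether a master connection is currently running for the alias.
cm_check() {
    if ssh -O check "$1" 2>/dev/null; then
        echo "master running for $1"
    else
        echo "no master for $1 (the next ssh will start one)"
    fi
}

# Ask the master to exit, e.g. when the socket is stale after a network change.
cm_reset() {
    ssh -O exit "$1" 2>/dev/null || true
}

# Usage:
#   ssh "$VM_ALIAS" true      # first connect creates the master
#   cm_check "$VM_ALIAS"
#   cm_reset "$VM_ALIAS"
```
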
All SSH traffic to the VM that goes through `<vm-alias>` is now multiplexed over a single flow slot, regardless of how many parallel sessions you open. `sync-to-vm.sh` already does this internally for its own rsync/scp calls.

For a more robust long-term fix (especially if you access the VM from multiple hosts), run a WireGuard tunnel on the VM and route SSH through that. WireGuard runs over UDP, which bypasses the TCP flow table entirely.