Terminal Transport & Resume

How session output reaches clients reliably over flaky links, and how a client resumes a stream after a disconnect without a full history refetch or a screen clear. This complements Protocol (the full wire vocabulary) and Clustering (cross-node routing).

Status: server- and cluster-side resume, the unified keepalive, host-load-aware liveness (CPU + memory-pressure) with soft-drop, per-session backpressure isolation, the cluster data/control lane split, the app's client-side resume, and on-demand screen-state snapshots are implemented. Resume and screen-state are unconditional - Hive does not support mixed-version clusters or old clients (see root AGENTS.md "No backward / wire compatibility"); the whole fleet upgrades together. Remaining work (continuous viewer rendering / adaptive frame-rate, and a dedicated control-plane runtime) is in Roadmap.

The model

Session output is a per-session byte stream published as Output { session_id, seq, data } frames. seq is a per-session monotonic counter assigned exactly once, where the chunk is first emitted (publish_chunk), and it travels end-to-end - through the cluster SessionOutput forwarding and into the client's Output events - so it is comparable everywhere. Clients deduplicate by dropping any frame whose seq they have already applied.
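The client-side dedup rule can be sketched as a high-water mark. This is a minimal illustration, not Hive's client code: the `SeqCursor` type and its methods are hypothetical names, and it assumes frames for one session arrive in seq order, so "already applied" is equivalent to "seq at or below the highest applied".

```rust
/// Client-side dedup cursor for one session (hypothetical type, for illustration).
#[derive(Debug, Default)]
pub struct SeqCursor {
    /// Highest seq applied so far; None before the first frame.
    last_applied: Option<u64>,
}

impl SeqCursor {
    /// Returns true if the frame is new and should be written to the terminal.
    pub fn accept(&mut self, seq: u64) -> bool {
        match self.last_applied {
            // Already applied (replay overlap or duplicate): drop it.
            Some(last) if seq <= last => false,
            _ => {
                self.last_applied = Some(seq);
                true
            }
        }
    }

    /// The cursor a reconnecting client would send as `from_seq`.
    pub fn resume_from(&self) -> Option<u64> {
        self.last_applied
    }
}
```

Because seq is assigned once by the owner and carried unchanged, the same cursor works whether frames arrive from a local session or across the mesh.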

Replay ring (`OutputRing`)

Every live session keeps a bounded in-memory ring of its most recently published frames, keyed by seq (crates/hive-daemon/src/session/mod.rs). It is fed by publish_chunk - the single point where a seq is assigned - so PTY output, SDK output, and injected control frames are all retained uniformly. The ring is bounded by both a byte budget (OUTPUT_RING_MAX_BYTES, 4 MiB) and a frame count (OUTPUT_RING_MAX_FRAMES); the oldest frames are evicted first, so the retained seqs are always a contiguous suffix oldest..=last.

OutputRing::replay_from(from_seq) returns either:

Frames(..) - the buffered frames with seq > from_seq, in order (possibly empty if the caller is already current), or
TooOld { last_seq } - the gap predates the retained window and cannot be replayed.
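The ring's two bounds and the two replay outcomes can be sketched as follows. This is a simplified model under stated assumptions, not the code in session/mod.rs: the field names, the `Replay` enum name, and the convention that seqs start at 1 are all illustrative.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, PartialEq)]
pub struct Frame {
    pub seq: u64,
    pub data: Vec<u8>,
}

#[derive(Debug, PartialEq)]
pub enum Replay {
    Frames(Vec<Frame>),
    TooOld { last_seq: u64 },
}

pub struct OutputRing {
    frames: VecDeque<Frame>,
    bytes: usize,
    max_bytes: usize,
    max_frames: usize,
    /// Last seq ever pushed (0 = nothing yet; assumes seqs start at 1).
    last_seq: u64,
}

impl OutputRing {
    pub fn new(max_bytes: usize, max_frames: usize) -> Self {
        Self { frames: VecDeque::new(), bytes: 0, max_bytes, max_frames, last_seq: 0 }
    }

    /// Append a frame, then evict oldest-first until both budgets hold.
    pub fn push(&mut self, frame: Frame) {
        self.last_seq = frame.seq;
        self.bytes += frame.data.len();
        self.frames.push_back(frame);
        while self.frames.len() > self.max_frames || self.bytes > self.max_bytes {
            match self.frames.pop_front() {
                Some(old) => self.bytes -= old.data.len(),
                None => break,
            }
        }
    }

    pub fn replay_from(&self, from_seq: u64) -> Replay {
        if from_seq >= self.last_seq {
            return Replay::Frames(Vec::new()); // caller is already current
        }
        match self.frames.front() {
            // Replayable only if the frame right after from_seq is still retained.
            Some(oldest) if oldest.seq <= from_seq + 1 => Replay::Frames(
                self.frames.iter().filter(|f| f.seq > from_seq).cloned().collect(),
            ),
            _ => Replay::TooOld { last_seq: self.last_seq },
        }
    }
}
```

Oldest-first eviction is what keeps the retained seqs a contiguous suffix, which in turn lets `replay_from` decide replayability by looking only at the oldest retained frame.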

Resume-from-seq

SubscribeOutput carries an optional from_seq. When present:

The daemon subscribes the client to the live broadcast first (so no frame published during replay is missed).
It then replays the ring from from_seq. Any overlap between the replayed frames and the live stream is harmless - the client dedupes by seq.
If the gap is TooOld, the daemon sends ResyncRequired { session_id, last_seq }; the client re-syncs from scratch via a screen snapshot (GetScreenSnapshot; see Screen-state snapshots) and then resumes live dedup from the snapshot's seq.

This is the bounded, rare fallback - the common reconnect replays a small tail from the ring with no screen clear and no full-history refetch.

A from_seq resume always re-arms delivery, even when an output task is already registered for the session on that connection. Plain attach (from_seq absent) keeps the idempotent fast-path - if a task exists it is left alone - but an explicit cursor is a recovery request and must not be a no-op. The task being replaced is typically alive but wedged: the fan-in shed a frame and marked a gap, then the PTY went idle so the gap never refilled, leaving last_sent_seq frozen. Such a task is not is_finished(), so prune_finished_output_task cannot reap it, and re-acking Attached without re-arming strands the client - its staleness resume loops forever at a frozen seq and only a full reconnect recovers. So when from_seq is present the daemon first runs reset_output_task_for_resume (abort + drop the registered task and its remote subscriber id), then the normal subscribe path replays from the client's cursor and spawns a fresh forwarder. The swap is lossless (replay_from yields only seq > from_seq; the forwarder dedupes seq <= last_sent_seq) and timer-free.

resume_or_attach_local in crates/hive-daemon/src/server/session_handlers.rs implements the local path; the cross-node path mirrors it (below).
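The attach-versus-resume decision described above reduces to a small table. The sketch below is illustrative only: `AttachAction` and `attach_action` are hypothetical names, not the signatures in session_handlers.rs.

```rust
#[derive(Debug, PartialEq)]
pub enum AttachAction {
    /// Plain attach with a task already registered: idempotent fast-path.
    KeepExisting,
    /// No task registered: normal subscribe path spawns a forwarder.
    SpawnForwarder,
    /// Explicit cursor with a task registered: abort and drop it, then
    /// replay from the cursor and spawn a fresh forwarder.
    ResetThenSpawn,
}

/// Note that the decision never consults whether the old task has finished:
/// a wedged task is alive, so only the presence of a cursor forces the reset.
pub fn attach_action(from_seq: Option<u64>, task_registered: bool) -> AttachAction {
    match (from_seq, task_registered) {
        (Some(_), true) => AttachAction::ResetThenSpawn,
        (None, true) => AttachAction::KeepExisting,
        (_, false) => AttachAction::SpawnForwarder,
    }
}
```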

History collapse (VT-rendered scrollback)

A full GetSessionHistory fetch does not return the raw recorded byte stream for a PTY session. Terminal apps that redraw in place - spinners, status bars, Claude Code's status block - emit thousands of cursor-relative redraw frames (\r, cursor-up, erase-line). Replayed verbatim into a fresh xterm, those relative moves no longer land where they did live (a cursor-up at the top of a near-empty replay buffer clamps at row 0), so every historical frame stacks as a fresh stripe of debris and the scrollback turns to garbage. Because the debris is in the persisted byte stream, a hard reconnect replays the same mess at any width.

Instead, the daemon feeds the spliced DB + live byte stream through a headless avt virtual terminal and serializes the reconstructed grid - scrollback and visible screen - back to one clean ANSI line per grid row, bounded to HISTORY_SCROLLBACK_LIMIT (10k rows, matching the client's MAX_PTY_SCROLLBACK). This collapses the redraw frames to the exact grid the live terminal showed, and the clean snapshot is what lands in the client's scrollback, so it survives reconnects. collapse_output_to_screen_lines in crates/hive-daemon/src/session/screen_dump.rs does the rendering; it runs on spawn_blocking (VT parse is CPU work) and applies only to PTY sessions - SDK/JSON sessions and the capped limit = Some(n) fast path are served raw.
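To see why rendering through a virtual terminal collapses redraw frames, consider a toy renderer. It is a deliberately tiny stand-in, not avt and not collapse_output_to_screen_lines: the function name is hypothetical, it understands only '\n', '\r' and ESC [ K (erase to end of line), and it treats '\n' as moving to column 0 of a new row.

```rust
/// Toy stand-in for a virtual terminal (hypothetical helper, illustration only).
/// Returns one string per grid row after all in-place redraws are applied.
pub fn collapse_redraws(input: &str) -> Vec<String> {
    let mut rows: Vec<Vec<char>> = vec![Vec::new()];
    let mut col = 0usize;
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        match c {
            '\n' => {
                rows.push(Vec::new());
                col = 0;
            }
            '\r' => col = 0,
            '\x1b' => {
                // Only CSI K is recognised; any other sequence is not handled.
                if chars.next() == Some('[') && chars.next() == Some('K') {
                    rows.last_mut().unwrap().truncate(col);
                }
            }
            _ => {
                let row = rows.last_mut().unwrap();
                if col < row.len() {
                    row[col] = c; // overwrite in place
                } else {
                    row.push(c);
                }
                col += 1;
            }
        }
    }
    rows.into_iter().map(|r| r.into_iter().collect()).collect()
}
```

Four spinner frames on one row come out as a single final row, which is the property the real VT pass relies on at full terminal fidelity.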

No capability negotiation. Because the whole fleet runs one revision, resume and screen-state are always available - the app always preserves its terminals and re-subscribes with from_seq on reconnect, and always recovers a too-old gap via a screen snapshot. There is no client_caps/server_caps handshake and no legacy wipe-and-refetch path.

Resume repaint nudge

The VT-collapsed snapshot (and a cursor-resume's replayed ring frames) is a static grid. An in-place TUI like Claude Code only repaints on SIGWINCH, so if the reconnecting client lands on the same dimensions the PTY already has, the kernel sends no SIGWINCH and the process never redraws over that static snapshot - the user sees a subtly misaligned screen (box borders, cursor row) until they manually resize. The manual resize "fixed" it only because it changed the size and so delivered a SIGWINCH.

To close this automatically, a from_seq resume subscribe fires SessionManager::nudge_pty_repaint once the fresh output task is armed: a no-op PTY resize (same dimensions - SIGWINCH only, no reflow) that forces the running process to emit one clean full frame, which the just-armed task forwards. It is gated on resume (from_seq.is_some()) so ordinary focus switching does not churn a repaint, and is a no-op for non-PTY sessions and for the owner before any size has propagated. For a peer-owned session the nudge fires on the owner node, where the PTY lives, because the forwarded subscribe runs resume_or_attach_local there. See resume_or_attach_local in crates/hive-daemon/src/server/session_handlers.rs.
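The gating conditions for the nudge can be written as one predicate. This sketch covers only the decision, not the resize itself; `PtySize`, `SessionKind`, and `repaint_nudge` are hypothetical names, not the SessionManager API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PtySize {
    pub cols: u16,
    pub rows: u16,
}

pub enum SessionKind {
    /// `size` is None until a client size has propagated to the owner.
    Pty { size: Option<PtySize> },
    Sdk,
}

/// Returns the size to re-apply as a no-op resize, or None when no nudge fires.
pub fn repaint_nudge(from_seq: Option<u64>, kind: &SessionKind) -> Option<PtySize> {
    if from_seq.is_none() {
        return None; // plain attach or focus switch: do not churn a repaint
    }
    match kind {
        SessionKind::Pty { size } => *size, // None: no size known yet, skip
        SessionKind::Sdk => None,           // non-PTY sessions have nothing to repaint
    }
}
```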

Backpressure isolation

A client that can't keep up never loses the connection and never holes the stream - and shedding is queue-driven, never timer-driven. The per-client output forwarder try_sends each frame into the connection's bounded fan-in; when the fan-in is full it drops that single frame and marks a gap, then replays the gap losslessly from the ring (fill_gap_from_ring) once the consumer drains - resuming exactly where delivery stopped if the fan-in fills mid-replay (GapFill::Partial). It falls back to a one-time screen repaint only when the gap predates the ring. There is deliberately no wall-clock drain budget: a timer that fires late on a starved runtime used to drop frames of perfectly healthy clients and trigger resync storms exactly when the host was weakest; a full queue means real consumer backpressure, and a starved runtime now produces delay, never loss.

The select loop couples the fan-in to the socket: past WRITER_HIGH_WATER frames queued in the per-connection writer it stops draining output, so a backed-up socket pushes shedding to the fan-in - where ring-replay gap repair lives - instead of the writer's silent drop-oldest backstop (which remains only as the memory bound). Control frames bypass the gate. A slow consumer therefore degrades only its own session's fidelity and self-heals; other sessions on the same connection are untouched.
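The shed-then-repair behaviour can be modelled with a bounded std channel standing in for the fan-in. This is a sketch under assumptions: the `Forwarder` type is hypothetical, the ring is passed as a slice holding the retained contiguous suffix, and live frames that arrive while a gap is open are simply skipped because the ring still holds them.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

#[derive(Debug, Clone, PartialEq)]
pub struct Frame {
    pub seq: u64,
    pub data: Vec<u8>,
}

#[derive(Debug, PartialEq)]
pub enum GapFill {
    Complete,
    Partial,
    TooOld,
}

pub struct Forwarder {
    fan_in: SyncSender<Frame>,
    last_sent_seq: u64,
    gap: bool,
}

impl Forwarder {
    pub fn new(fan_in: SyncSender<Frame>) -> Self {
        Self { fan_in, last_sent_seq: 0, gap: false }
    }

    /// Offer one live frame without blocking. Returns false if it was not sent.
    pub fn forward(&mut self, frame: Frame) -> bool {
        if self.gap || frame.seq <= self.last_sent_seq {
            return false; // behind an open gap, or a duplicate
        }
        self.offer(frame)
    }

    /// Replay everything after last_sent_seq from the ring, stopping if the
    /// fan-in fills again so the next call resumes where delivery stopped.
    pub fn fill_gap_from_ring(&mut self, ring: &[Frame]) -> GapFill {
        match ring.first() {
            Some(oldest) if oldest.seq <= self.last_sent_seq + 1 => {}
            _ => return GapFill::TooOld, // gap predates the ring: repaint instead
        }
        for frame in ring {
            if frame.seq <= self.last_sent_seq {
                continue;
            }
            if !self.offer(frame.clone()) {
                return GapFill::Partial;
            }
        }
        self.gap = false;
        GapFill::Complete
    }

    fn offer(&mut self, frame: Frame) -> bool {
        let seq = frame.seq;
        match self.fan_in.try_send(frame) {
            Ok(()) => {
                self.last_sent_seq = seq;
                true
            }
            // Queue full (or consumer gone, treated the same in this sketch):
            // shed this frame and mark the gap. No timer is involved.
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => {
                self.gap = true;
                false
            }
        }
    }
}
```

The only trigger for shedding here is a full queue, which matches the stated design goal that a starved runtime produces delay rather than loss.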

Cross-node resume

When the session is owned by another node, the node the client is connected to forwards from_seq to the owner via the peer SubscribeOutput. The owner replays from its ring before streaming live, sending the missed frames back as SessionOutput. If the owner's ring cannot satisfy the gap it returns SessionResyncRequired { subscriber_id, session_id, last_seq }, which the requesting node surfaces to its local client as ResyncRequired (reusing the in-band control-frame path). The seq is the owner's throughout, so client dedup works identically whether the client is attached locally or across the mesh.

Keepalive

Both directions of every Hive WebSocket follow one shared policy, defined once in crates/hive-protocol/src/keepalive.rs so the daemon, the app, and the CLI cannot drift apart:

Ping interval - 15 s.
Pong timeout - 45 s (three missed pings). This is a staleness threshold, not a teardown deadline: liveness is observational (SSH semantics). No application timer on either side ever closes a connection.

A host under load produces lag and catch-up, exactly like a loaded SSH host. What each side does with a widening pong gap:

Daemon - logs one connection stale warning per episode (and a recovery line when pongs resume). The only teardown it gates is the zombie reap: a connection with no attached sessions that has also stopped answering pings is reaped after ZOMBIE_REAP_TIMEOUT, so claim bookkeeping isn't pinned forever by sockets whose kernel still ACKs but whose application is gone.
App (Tauri) - after two missed pings emits hive:connection-stale driving a pulsing yellow "Connection stalled Ns - waiting..." indicator; hive:connection-fresh clears it the instant traffic resumes. The socket is never closed from the timer.
Browser shim - browsers auto-pong invisibly, so it sends an app-level keepalive message every 15 s instead; the daemon replies KeepaliveAck (guaranteed inbound traffic on an idle session) and the shim tracks the gap since any inbound message, emitting the same stale/fresh events.

Actual connection death is the kernel's job: TCP keepalive (30 s idle, 10 s probes, 3 retries) reaps dead idle sockets, and TCP_USER_TIMEOUT (60 s, set on the daemon listener) bounds the dead-peer-with-in-flight-data case that keepalive cannot see. A slow-but-ACKing peer is held indefinitely. When the kernel kills the socket, the client's reader sees a stream error - the single authoritative hive:disconnected source - and reconnects with buffers and resume cursors intact, replaying only the gap from the ring.

Liveness is intentionally decoupled from output flow: pings ride the control lane, so a client that is merely slow to drain a high-throughput ("flooding") session is never reported stale by mistake.

Host-load-aware liveness

The pong gap is wall-clock, which misfires when the daemon itself is starved: a build pegging every core can keep the daemon's runtime off-CPU for tens of seconds, so it neither sends pings nor reads the pongs that did arrive, and a perfectly healthy client would be reported stale (or zombie-reaped) with a multi-hundred-second "gap". crates/hive-daemon/src/server/liveness.rs corrects this. A watchdog task sleeps for a fixed tick and, on each wake, credits the larger of two stall signals:

Timer lateness - how much later than the tick it actually ran. That lateness is the runtime's scheduling stall, since a CPU-saturated runtime delays the watchdog's own timer by exactly the amount every other task was starved.
Memory-pressure stall - the per-tick delta of Linux PSI some memory total (/proc/pressure/memory). Page-fault stalls under swap thrash hit tasks unevenly: a WS read task can be frozen reclaiming pages while the watchdog, its working set resident, ticks on time - so timer lateness alone misses it. PSI captures the stall regardless of which task ate it. Linux-only; absent elsewhere, crediting falls back to timer lateness with no change. The two are combined with max, not sum, so a stall both signals see is credited once, not doubled.

The stall accumulates into a shared StallTracker, and the staleness check (CreditedDeadline) subtracts any stall accrued since the last pong before reporting a client stale or reaping a session-less zombie. A starved daemon thus never mis-reports a live client, while a genuinely silent peer (no stall) still crosses the threshold normally. The watchdog runs on the data-plane runtime - the one whose CPU stall must be measured; the PSI signal, being host-wide, applies regardless.
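The crediting arithmetic can be sketched as below. It is a simplified model, not the code in liveness.rs: the `StallTracker` here is single-threaded, `is_stale` stands in for CreditedDeadline, and the helper names are hypothetical. The PSI parser relies on the kernel's documented format, where the `some` line of /proc/pressure/memory ends in a cumulative `total=` value in microseconds.

```rust
use std::time::Duration;

/// Staleness threshold from the keepalive policy (three missed 15 s pings).
pub const PONG_TIMEOUT: Duration = Duration::from_secs(45);

/// Extract the cumulative stall time from the `some` line of a PSI file.
pub fn parse_psi_some_total(contents: &str) -> Option<Duration> {
    let line = contents.lines().find(|l| l.starts_with("some "))?;
    let total = line.split_whitespace().find_map(|f| f.strip_prefix("total="))?;
    total.parse::<u64>().ok().map(Duration::from_micros)
}

#[derive(Default)]
pub struct StallTracker {
    total: Duration,
}

impl StallTracker {
    /// Credit one watchdog tick with the larger of the two signals, so a
    /// stall that both signals observe is counted once, not doubled.
    /// `psi_delta` is None where PSI is unavailable (non-Linux hosts).
    pub fn credit_tick(&mut self, timer_lateness: Duration, psi_delta: Option<Duration>) {
        self.total += timer_lateness.max(psi_delta.unwrap_or(Duration::ZERO));
    }

    pub fn total(&self) -> Duration {
        self.total
    }
}

/// A client is stale only if its pong gap, minus the host stall accrued
/// since its last pong, still crosses the threshold.
pub fn is_stale(wall_gap: Duration, stall_at_last_pong: Duration, stall_now: Duration) -> bool {
    let credit = stall_now.saturating_sub(stall_at_last_pong);
    wall_gap.saturating_sub(credit) >= PONG_TIMEOUT
}
```

For example, a 300 s wall-clock gap with 270 s of credited stall leaves a 30 s effective gap, which is below the threshold, while a 60 s gap with no stall is reported stale as usual.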

Host-load-aware peer breaker

The same crediting protects the cluster link, not just client connections. The peer transport (crates/hive-cluster/src/transport.rs) sends Awaitable messages (RPC, claims, resize) with an 8 s timeout and, after six consecutive timeouts to one peer (~48 s), trips a circuit breaker that disconnects it so subsequent sends fast-fail and a clean reconnect resnapshots. But that 8 s measures the local node's ability to hand a message to its own outbound task - not the peer's responsiveness. A node thrashing in swap (a Pi pushed into memory pressure by other services, say) cannot drain its own channels in 8 s, so every send to every peer times out, the breaker trips against all of them at once, and the node partitions itself out of a perfectly healthy mesh.

The daemon injects its StallTracker into the transport (set_stall_signal) so the breaker credits host-load stall before disconnecting, mirroring the client soft-drop: at the threshold it compares the host stall accrued across the streak against the 5 s credit floor, and if the local node was off-CPU or paging for that much of it, the timeouts say nothing about the peer - the connection is held, the streak resets, and a genuinely dead peer is still reaped by the per-peer receiver task and TCP keepalive. A healthy node with a truly unresponsive peer still trips normally. The two outcomes are logged as "holding peer, not partitioning self" (held) vs "disconnecting peer" (genuine trip).
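A minimal sketch of the credited breaker follows. The type and method names are hypothetical, and one detail is an assumption: the stall baseline is taken at the last successful send (or the last verdict), so "stall accrued across the streak" is the tracker total minus that baseline.

```rust
use std::time::Duration;

/// Six consecutive 8 s timeouts (about 48 s) reach the threshold.
pub const BREAKER_THRESHOLD: u32 = 6;
pub const STALL_CREDIT_FLOOR: Duration = Duration::from_secs(5);

#[derive(Debug, PartialEq)]
pub enum BreakerVerdict {
    /// Streak is below the threshold; keep going.
    Counting,
    /// Threshold reached, but the local host was stalled: hold the peer.
    Held,
    /// Threshold reached on a healthy host: disconnect the peer.
    Tripped,
}

#[derive(Default)]
pub struct PeerBreaker {
    consecutive_timeouts: u32,
    /// Tracker total when the current streak began (assumed baseline).
    stall_baseline: Duration,
}

impl PeerBreaker {
    pub fn on_success(&mut self, host_stall_total: Duration) {
        self.consecutive_timeouts = 0;
        self.stall_baseline = host_stall_total;
    }

    /// Record one timed-out awaitable send to this peer.
    pub fn on_timeout(&mut self, host_stall_total: Duration) -> BreakerVerdict {
        self.consecutive_timeouts += 1;
        if self.consecutive_timeouts < BREAKER_THRESHOLD {
            return BreakerVerdict::Counting;
        }
        let stalled = host_stall_total.saturating_sub(self.stall_baseline);
        // Either way the streak starts over from here.
        self.consecutive_timeouts = 0;
        self.stall_baseline = host_stall_total;
        if stalled >= STALL_CREDIT_FLOOR {
            BreakerVerdict::Held
        } else {
            BreakerVerdict::Tripped
        }
    }
}
```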

Nothing slow on the select loop

The per-connection select loop only reads frames, routes them, drains the output fan-in, and ticks pings. Anything that can park - a peer RPC, a replication retry, heavy local work - runs off-loop, so a slow or starved node can never freeze a healthy client's heartbeat and output streaming because some other node is slow. Routing, by message class:

io task (ordered FIFO, dedicated hive-io runtime) - SendInput/SendPtyInput/ResizePty, plus ClaimSession so a claim immediately followed by typing gates in order.
stream task (ordered FIFO) - Attach/Detach/SubscribeOutput/UnsubscribeOutput/OpenSession/CloseSession. Their registries live in a shared per-connection ConnStreams (short lock scopes; the FIFO itself serializes the idempotency checks).
Per-request detached tasks - KillSession, RenameSession, TagSession, SaveSessionFile, GetSessionHistory (its drain fence stays on-loop ahead of the spawn), GetScreenSnapshot, ExportSession, and a node-targeted NewSession (forwards up to NODE_FORWARD_TIMEOUT, 35 s).
request worker (ordered FIFO) - every other request-shaped message (projects, teams, notes, git, worktrees, files, node-targeted diagnostics, TeamWait), preserving the old inline ordering semantics through a cloned writer handle.

Responses ride back through the push channel or the shared writer queue; the loop's only awaits are channel operations and non-blocking writer enqueues.
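The routing table above can be expressed as a single match. The enum below is a reduced, hypothetical model of the client message set (the real vocabulary is in Protocol); treating a non-node-targeted NewSession as "every other request-shaped message" is an assumption drawn from the wording of the list.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ClientMessage {
    SendInput,
    SendPtyInput,
    ResizePty,
    ClaimSession,
    Attach,
    Detach,
    SubscribeOutput,
    UnsubscribeOutput,
    OpenSession,
    CloseSession,
    KillSession,
    RenameSession,
    TagSession,
    SaveSessionFile,
    GetSessionHistory,
    GetScreenSnapshot,
    ExportSession,
    NewSession { node_targeted: bool },
    /// Projects, teams, notes, git, worktrees, files, diagnostics, TeamWait.
    Other,
}

#[derive(Debug, PartialEq)]
pub enum Route {
    IoTask,
    StreamTask,
    Detached,
    RequestWorker,
}

pub fn route(msg: ClientMessage) -> Route {
    use ClientMessage::*;
    match msg {
        // Ordered with input so a claim followed by typing gates in order.
        SendInput | SendPtyInput | ResizePty | ClaimSession => Route::IoTask,
        Attach | Detach | SubscribeOutput | UnsubscribeOutput | OpenSession | CloseSession => {
            Route::StreamTask
        }
        KillSession | RenameSession | TagSession | SaveSessionFile | GetSessionHistory
        | GetScreenSnapshot | ExportSession => Route::Detached,
        NewSession { node_targeted: true } => Route::Detached,
        // Assumption: everything else keeps the ordered request-worker path.
        NewSession { node_targeted: false } | Other => Route::RequestWorker,
    }
}
```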

Android background keepalive

On mobile, a silent client is usually the OS freezing the app process when it is backgrounded - the WebSocket lives in the Tauri/Tokio layer, and a frozen process can no longer answer pings (the daemon holds the connection regardless, but session claims and live streaming work best on a responsive socket). To keep the process responsive across normal backgrounding, the Android app runs a dataSync foreground service (HiveKeepAliveService, injected by scripts/patch-android-keyboard.mjs). While it holds an ongoing notification, Android keeps the process unfrozen, so the Tokio runtime keeps answering pings and output keeps flowing.

The service is driven from useConnectionLifecycle.ts: it starts (via the HiveAndroid JS bridge) whenever the app holds a daemon connection and stops on disconnect. It is a no-op on desktop/iOS, where the bridge is absent.

This is not an unlimited keepalive. Android 14+ budgets dataSync foreground-service runtime, and Doze still throttles the network during long screen-off periods. So it reliably covers app-switching and short screen-off gaps; an overnight-backgrounded phone still drops and recovers through the ordinary foreground resume path (eager reconnect + resume-from-seq).

Input latency under CPU load

A terminal should stay crisp even when the host is CPU-saturated, the way an SSH session does. hived already runs at nice -10 (more favored than sshd), so the cause of felt lag under load was never CPU starvation of the daemon - it was work-per-byte and in-process scheduling. Three changes close the gap:

Dedicated input runtime (hive-io). The per-connection ordered I/O task that applies keystrokes / pty input / resize (server/mod.rs) is spawned on a separate 2-worker tokio runtime built in main.rs (io_handle on AppState), mirroring the control-plane runtime. Input therefore never waits for a data-plane worker that is busy serializing an output flood. A keystroke's path to the PTY is independent of how much output is in flight.
Per-session PTY writer thread. Each session owns a dedicated writer thread (session/pty_backend.rs) that drains an input channel and does one write_all + flush per batch - symmetric with the reader thread. Sending input is now a non-blocking channel send (session/pty.rs); a stalled child backs up only its own thread, never a shared tokio blocking-pool worker, so the old per-write spawn_blocking + 5 s timeout/leak is gone.
Binary output frame. Output is sent as a raw binary WS frame rather than JSON, removing the escape-dense per-byte encode cost on the session's busy owner node - see Binary output frame.
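The per-session writer thread pattern can be sketched with std primitives. This is an illustration rather than the code in session/pty_backend.rs: `spawn_pty_writer` is a hypothetical name, any `Write` implementation stands in for the PTY master, and the batching rule (fold in whatever is already queued) is an assumption.

```rust
use std::io::Write;
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Spawn a writer thread that owns `sink` (standing in for the PTY master).
/// Callers send input through the returned channel, which never blocks.
pub fn spawn_pty_writer<W: Write + Send + 'static>(
    mut sink: W,
) -> (Sender<Vec<u8>>, JoinHandle<W>) {
    let (tx, rx) = channel::<Vec<u8>>();
    let handle = thread::spawn(move || {
        // Block for the first chunk, then fold in whatever else is queued.
        while let Ok(mut batch) = rx.recv() {
            while let Ok(more) = rx.try_recv() {
                batch.extend_from_slice(&more);
            }
            // One write_all + flush per batch. If the child stalls, only
            // this thread blocks; no shared worker pool is involved.
            if sink.write_all(&batch).and_then(|_| sink.flush()).is_err() {
                break;
            }
        }
        sink // returned so the caller can inspect or close it on join
    });
    (tx, handle)
}
```

Dropping the sender ends the thread once the queue is drained, which gives the session a simple shutdown path.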

Cluster request bounding

Forwarded peer requests (ForwardedRequest) carry a hops budget (FORWARD_INITIAL_HOPS), decremented at each re-forward, so a misrouted or phantom-owner request cannot bounce around the mesh indefinitely. A node that receives hops == 0 handles the request terminally.
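The hop budget can be sketched as follows. The value given to FORWARD_INITIAL_HOPS is a placeholder (the real value is not stated here), and the `payload` field, `Disposition` enum, and `dispose` function are hypothetical.

```rust
/// Placeholder value for illustration; the real constant is defined in the codebase.
pub const FORWARD_INITIAL_HOPS: u8 = 3;

#[derive(Debug, PartialEq)]
pub struct ForwardedRequest {
    pub hops: u8,
    pub payload: String,
}

#[derive(Debug, PartialEq)]
pub enum Disposition {
    /// Handle on this node (it owns the target, or the budget ran out).
    Handle(ForwardedRequest),
    /// Re-forward toward the believed owner with one hop spent.
    Forward(ForwardedRequest),
}

pub fn dispose(req: ForwardedRequest, i_am_owner: bool) -> Disposition {
    if i_am_owner || req.hops == 0 {
        return Disposition::Handle(req); // hops == 0 is terminal
    }
    Disposition::Forward(ForwardedRequest { hops: req.hops - 1, ..req })
}
```

With this rule a request is re-forwarded at most FORWARD_INITIAL_HOPS times, regardless of how stale the routing state is.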

Cluster data/control lane split

Each peer link has two outbound mpsc lanes drained by one writer task (crates/hive-cluster/src/transport.rs): a control lane (tx) and a data lane (data_tx). PeerHandle::lane routes by message role - only the high-volume forwarded session output (PeerMessage::SessionOutput) takes the data lane; everything else, including state-replication gossip, RPC, and announcements, takes the control lane. The writer is a biased select that drains the control lane first, so a flood of forwarded output can never delay a state-replication or keepalive frame on the shared peer socket and trip the awaitable-timeout breaker.

The split is deliberately by message role, not DeliveryPolicy: a Lossy keepalive or announce frame must still stay on the control lane, because starving it behind an output flood is exactly what produced false peer-death detection before.
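The lane choice and the control-first drain order can be modelled with two queues. This is a synchronous sketch, not the transport's async writer: the `PeerMessage` variants other than SessionOutput are simplified placeholders, and `PeerLanes` is a hypothetical type standing in for the two mpsc lanes.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq)]
pub enum PeerMessage {
    SessionOutput(Vec<u8>),
    StateReplication,
    Rpc,
    Announce,
    Keepalive,
}

#[derive(Debug, PartialEq)]
pub enum Lane {
    Control,
    Data,
}

/// Route by message role, not by delivery policy: only forwarded session
/// output is high-volume enough to need its own lane.
pub fn lane(msg: &PeerMessage) -> Lane {
    match msg {
        PeerMessage::SessionOutput(_) => Lane::Data,
        _ => Lane::Control,
    }
}

#[derive(Default)]
pub struct PeerLanes {
    control: VecDeque<PeerMessage>,
    data: VecDeque<PeerMessage>,
}

impl PeerLanes {
    pub fn enqueue(&mut self, msg: PeerMessage) {
        match lane(&msg) {
            Lane::Control => self.control.push_back(msg),
            Lane::Data => self.data.push_back(msg),
        }
    }

    /// Biased pick: the control lane is always drained before the data lane.
    pub fn next_to_write(&mut self) -> Option<PeerMessage> {
        self.control.pop_front().or_else(|| self.data.pop_front())
    }
}
```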

Screen-state snapshots (Phase 2)

When a resume gap predates the ring, the client recovers via a screen snapshot instead of a scrollback refetch. On ResyncRequired the client sends GetScreenSnapshot; the daemon reconstructs the session's current grid by feeding the recent live_output buffer through a throwaway avt virtual terminal (built on demand - no steady-state cost) and replies with ScreenSnapshot { seq, cols, rows, dump }, where dump is a self-contained escape-sequence string. The client resets its terminal, writes dump to land exactly on the daemon's current screen, and resumes live dedup from seq. This is O(screen) instead of O(scrollback) and free of the width/debris artifacts of replaying raw bytes at a viewer's own width.

handle_get_screen_snapshot in session_handlers.rs implements it; avt (asciinema virtual terminal, 0.18) is the VT parser.
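On the client side, snapshot recovery amounts to reset, write, and re-seat the dedup cursor. The sketch below is hypothetical throughout: the `Terminal` trait and `ClientView` type are illustrative, a `String` serves as an in-memory terminal, and applying `cols`/`rows` (which a real client would presumably do) is left out.

```rust
pub struct ScreenSnapshot {
    pub seq: u64,
    pub cols: u16,
    pub rows: u16,
    /// Self-contained escape-sequence string that repaints the screen.
    pub dump: String,
}

/// Minimal terminal surface needed for recovery (hypothetical trait).
pub trait Terminal {
    fn reset(&mut self);
    fn write(&mut self, data: &str);
}

/// In-memory stand-in so the sketch is runnable without a real emulator.
impl Terminal for String {
    fn reset(&mut self) {
        self.clear();
    }
    fn write(&mut self, data: &str) {
        self.push_str(data);
    }
}

pub struct ClientView<T: Terminal> {
    pub term: T,
    last_seq: u64,
}

impl<T: Terminal> ClientView<T> {
    pub fn new(term: T) -> Self {
        Self { term, last_seq: 0 }
    }

    /// Live output: apply only frames newer than the cursor.
    pub fn on_output(&mut self, seq: u64, data: &str) {
        if seq > self.last_seq {
            self.term.write(data);
            self.last_seq = seq;
        }
    }

    /// Recovery after ResyncRequired: land on the daemon's current screen,
    /// then dedup live frames against the snapshot's seq.
    pub fn apply_snapshot(&mut self, snap: &ScreenSnapshot) {
        self.term.reset();
        self.term.write(&snap.dump);
        self.last_seq = snap.seq;
    }
}
```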

OS-level protection (systemd)

Userspace can only be so resilient under swap thrash - sshd survives load 30 because its processes are tiny and the kernel schedules them; hived earns the same treatment through its unit file. The systemd template (installed by hive setup and rewritten on every hive.ps1 deploy; sources: crates/hive-client/src/install.rs and scripts/_common.ps1, kept equivalent) sets Nice=-10, CPUWeight=10000, IOWeight=10000, MemoryMin=256M, MemoryLow=256M, OOMScoreAdjust=-900, and Delegate=yes (so hived can split its cgroup into daemon/workload leaves - PTY children can't starve the control plane from inside the same service cgroup). Nodes installed before these directives keep their old unit until the next deploy/setup rewrites it; verify with systemctl show hived -p CPUWeight -p MemoryLow -p Nice.

Verifying under synthetic load

The pass/fail procedure for "host stress must produce lag, never breakage" (stress-ng is an external tool on the test node, not a dependency):

On the daemon host: stress-ng --cpu $(nproc) --vm 2 --vm-bytes 90% --timeout 300s (add --iomix 2 for IO pressure).
From a client with a real attached session, type into a shell and run a flood (yes | head -c 10000000), watching keystroke echo and output flow.
Pass criteria: zero hive:disconnected events and zero daemon "dropping/reaping connection" log lines for connections with sessions; the staleness indicator may appear and must clear on its own; terminal content stays intact (ring replay, no full-screen wipe unless the gap outran the ring, at most one ResyncRequired per session).
Socket-death matrix: kill -STOP the client app (kernel still ACKs) - the daemon must hold the connection indefinitely; pull the network with output in flight - TCP_USER_TIMEOUT errors the socket within ~60 s; an idle session-less zombie is reaped only by the ZOMBIE_REAP_TIMEOUT gate.

Roadmap

Continuous viewer rendering / adaptive frame-rate - extend screen-state beyond on-demand recovery: render all viewers from the authoritative grid (killing width-mismatch debris for good), and for a sustained-slow client switch its stream from raw bytes to periodic screen diffs (Mosh's adaptive frame rate) instead of relying on ring replay.
