Commit ff9923f

Session Durability Checkpoint (#2821)
Working on bug fixes and UX. Streams restarting, fixed lots of bugs, timing issues, concurrency bugs. Get status shipped to the FE to drive "shield" state display. Deal with stale streams. Also big UX changes to the block headers. Specialize the terminal headers to prioritize the connection (sense of place), remove old terminal icon and word "Terminal" from the header. Also drop "Web" and "Preview" labels on web/preview blocks. Added `wsh focusblock` command.
1 parent 26cd7a4 commit ff9923f

56 files changed: +2009 -955 lines changed

.roo/rules/rules.md

Lines changed: 1 addition & 0 deletions
@@ -103,6 +103,7 @@ These files provide step-by-step instructions, code examples, and best practices
 - **Match response length to question complexity** - For simple, direct questions in Ask mode (especially those that can be answered in 1-2 sentences), provide equally brief answers. Save detailed explanations for complex topics or when explicitly requested.
 - **CRITICAL** - useAtomValue and useAtom are React HOOKS. They cannot be used inline in JSX code, they must appear at the top of a component in the hooks area of the react code.
 - for simple functions, we prefer `if (!cond) { return }; functionality;` pattern over `if (cond) { functionality }` because it produces less indentation and is easier to follow.
+- It is now 2026, so if you write new files use 2026 for the copyright year
 
 ### Strict Comment Rules
 
Lines changed: 291 additions & 0 deletions
@@ -0,0 +1,291 @@
# Block Controller Lifecycle

## Overview

Block controllers manage the execution lifecycle of terminal shells, commands, and other interactive processes. **The frontend drives the controller lifecycle** - the backend is reactive, creating and managing controllers in response to frontend requests.

## Controller States

Controllers have three primary states:
- **`init`** - Controller exists but process is not running
- **`running`** - Process is actively running
- **`done`** - Process has exited
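
The three states can be modeled as a string union. The following is a minimal sketch, not code from the repository: the names `ControllerStatus`, `canStart`, and `describeStatus` are hypothetical, and only the three status strings come from this document. `canStart` mirrors the rule stated later under "Check if Start Needed" (start only from `init` or `done`).

```typescript
// Hypothetical type and helper names; only the status strings come from the document.
type ControllerStatus = "init" | "running" | "done";

// A process is started only when none is currently running.
function canStart(status: ControllerStatus): boolean {
    return status === "init" || status === "done";
}

// Human-readable description of each state, matching the list above.
function describeStatus(status: ControllerStatus): string {
    switch (status) {
        case "init":
            return "controller exists, process not running";
        case "running":
            return "process is actively running";
        case "done":
            return "process has exited";
    }
}
```

Using a union rather than a plain `string` lets the compiler flag a misspelled status at the call site.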

## Architecture Components

### Backend: Controller Registry

Location: [`pkg/blockcontroller/blockcontroller.go`](pkg/blockcontroller/blockcontroller.go)

The backend maintains a **global controller registry** that maps blockIds to controller instances:

```go
var (
	controllerRegistry = make(map[string]Controller)
	registryLock       sync.RWMutex
)
```

Controllers implement the [`Controller` interface](pkg/blockcontroller/blockcontroller.go:64):
- `Start(ctx, blockMeta, rtOpts, force)` - Start the controller process
- `Stop(graceful, newStatus)` - Stop the controller process
- `GetRuntimeStatus()` - Get current runtime status
- `SendInput(input)` - Send input (data, signals, terminal size) to the process

### Frontend: View Model

Location: [`frontend/app/view/term/term-model.ts`](frontend/app/view/term/term-model.ts)

The [`TermViewModel`](frontend/app/view/term/term-model.ts:44) manages the frontend side of a terminal block:

**Key Atoms:**
- `shellProcFullStatus` - Holds the current controller status from backend
- `shellProcStatus` - Derived atom for just the status string ("init", "running", "done")
- `isRestarting` - UI state for restart animation

**Event Subscription:**
The constructor subscribes to controller status events (lines 317-324):
```typescript
this.shellProcStatusUnsubFn = waveEventSubscribe({
    eventType: "controllerstatus",
    scope: WOS.makeORef("block", blockId),
    handler: (event) => {
        let bcRTS: BlockControllerRuntimeStatus = event.data;
        this.updateShellProcStatus(bcRTS);
    },
});
```

This creates a **reactive data flow**: backend publishes status updates → frontend receives via WebSocket events → UI updates automatically via Jotai atoms.

## Lifecycle Flow

### 1. Frontend Triggers Controller Creation/Start

**Entry Point:** [`ResyncController()`](pkg/blockcontroller/blockcontroller.go:120) RPC endpoint

The frontend calls this via [`RpcApi.ControllerResyncCommand`](frontend/app/view/term/term-model.ts:661) when:

1. **Manual Restart** - User clicks restart button or presses Enter when process is done
   - Triggered by [`forceRestartController()`](frontend/app/view/term/term-model.ts:652)
   - Passes `forcerestart: true` flag
   - Includes current terminal size (`termsize: { rows, cols }`)

2. **Connection Status Changes** - Connection becomes available/unavailable
   - Monitored by [`TermResyncHandler`](frontend/app/view/term/term.tsx:34) component
   - Watches `connStatus` atom for changes
   - Calls `termRef.current?.resyncController("resync handler")`

3. **Block Meta Changes** - Configuration like controller type or connection changes
   - Happens when block metadata is updated
   - Backend detects changes and triggers resync

### 2. Backend Processes Resync Request

The [`ResyncController()`](pkg/blockcontroller/blockcontroller.go:120) function:

```go
func ResyncController(ctx context.Context, tabId, blockId string,
	rtOpts *waveobj.RuntimeOpts, force bool) error
```

**Steps:**

1. **Get Block Data** - Fetch block metadata from database
2. **Determine Controller Type** - Read `controller` meta key ("shell", "cmd", "tsunami")
3. **Check Existing Controller:**
   - If controller type changed → stop old, create new
   - If connection changed (for shell/cmd) → stop and restart
   - If `force=true` → stop existing
4. **Register Controller** - Add to registry (replaces existing if present)
5. **Check if Start Needed** - If status is "init" or "done":
   - For remote connections: verify connection status first
   - Call `controller.Start(ctx, blockMeta, rtOpts, force)`
6. **Publish Status** - Controller publishes runtime status updates
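
The decision part of these steps (3 and 5) can be sketched as a pure function. This is an illustrative model written in TypeScript for consistency with the frontend examples. It is not the Go implementation, and every name in it (`planResync`, `ExistingController`, `ResyncRequest`, `ResyncPlan`) is hypothetical. The remote-connection check from step 5 is left out because the document does not say what happens when it fails.

```typescript
type Status = "init" | "running" | "done";

// Hypothetical snapshot of what the registry currently holds for a block.
interface ExistingController {
    controllerType: string;
    connName: string;
    status: Status;
}

// Hypothetical view of the request after block metadata has been read.
interface ResyncRequest {
    controllerType: string; // value of the "controller" meta key
    connName: string;
    force: boolean;
}

interface ResyncPlan {
    stopExisting: boolean; // step 3: stop and replace the current controller
    start: boolean; // step 5: call Start on the resulting controller
}

function planResync(existing: ExistingController | null, req: ResyncRequest): ResyncPlan {
    if (existing == null) {
        // Nothing registered yet: a fresh controller begins in "init" and needs a start.
        return { stopExisting: false, start: true };
    }
    const typeChanged = existing.controllerType !== req.controllerType;
    // Per step 3, a connection change only matters for shell/cmd controllers.
    const usesConn = req.controllerType === "shell" || req.controllerType === "cmd";
    const connChanged = usesConn && existing.connName !== req.connName;
    if (typeChanged || connChanged || req.force) {
        // The replacement controller is new, so it also needs a start.
        return { stopExisting: true, start: true };
    }
    // Controller is kept: start only if its process is not running.
    const needsStart = existing.status === "init" || existing.status === "done";
    return { stopExisting: false, start: needsStart };
}
```

With a running shell controller and an unchanged request, the plan is `{ stopExisting: false, start: false }`, which is the no-op case that makes repeated resync calls safe.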

**Important:** Registering a new controller automatically stops any existing controller for that blockId (lines 95-98):
```go
if existingController != nil {
	existingController.Stop(false, Status_Done)
	wstore.DeleteRTInfo(waveobj.MakeORef(waveobj.OType_Block, blockId))
}
```

### 3. Backend Publishes Status Updates

Controllers publish their status via the event system when:
- Process starts
- Process state changes
- Process exits

The status includes:
- `shellprocstatus` - "init", "running", or "done"
- `shellprocconnname` - Connection name being used
- `shellprocexitcode` - Exit code when done
- `version` - Incrementing version number for ordering

### 4. Frontend Receives and Processes Updates

**Status Update Handler** (lines 321-323):
```typescript
handler: (event) => {
    let bcRTS: BlockControllerRuntimeStatus = event.data;
    this.updateShellProcStatus(bcRTS);
}
```

**Status Update Logic** (lines 430-438):
```typescript
updateShellProcStatus(fullStatus: BlockControllerRuntimeStatus) {
    if (fullStatus == null) return;
    const curStatus = globalStore.get(this.shellProcFullStatus);
    // Only update if newer version
    if (curStatus == null || curStatus.version < fullStatus.version) {
        globalStore.set(this.shellProcFullStatus, fullStatus);
    }
}
```

The version check discards any event whose version is not newer than the stored status, so a stale event that arrives late cannot overwrite a more recent one.
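
To make the effect of the version check concrete, here is a self-contained sketch that applies the same comparison without Jotai or the real event system. `StatusHolder` and `RuntimeStatus` are hypothetical stand-ins for the atom and for `BlockControllerRuntimeStatus`; only the `version` and `shellprocstatus` field names come from the document.

```typescript
// Hypothetical, reduced stand-in for BlockControllerRuntimeStatus.
interface RuntimeStatus {
    version: number;
    shellprocstatus: "init" | "running" | "done";
}

// Hypothetical stand-in for the shellProcFullStatus atom.
class StatusHolder {
    private current: RuntimeStatus | null = null;

    // Applies the update only if it is newer than what is stored.
    // Returns true when the update was applied.
    update(next: RuntimeStatus | null): boolean {
        if (next == null) {
            return false;
        }
        if (this.current != null && this.current.version >= next.version) {
            return false; // stale or duplicate event, ignore it
        }
        this.current = next;
        return true;
    }

    get(): RuntimeStatus | null {
        return this.current;
    }
}
```

If version 2 ("running") is delivered before version 1 ("init"), the holder ends up at version 2: the late version 1 event is rejected rather than rolling the status back.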

### 5. UI Updates Reactively

The UI reacts to status changes through Jotai atoms:

**Header Buttons** (lines 263-306):
- Show "Play" icon when status is "init"
- Show "Refresh" icon when status is "running" or "done"
- Display exit code/status icons for cmd controller
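
The first two header rules amount to a small mapping from status to icon. The sketch below assumes placeholder icon identifiers (`"play"`, `"refresh"`) and a hypothetical function name. The real component's icon names and the cmd-controller exit code display are not shown in this document and are not modeled here.

```typescript
type ShellProcStatus = "init" | "running" | "done";

// Placeholder icon identifiers; the actual icon names used by the UI are an assumption.
type HeaderIcon = "play" | "refresh";

// "init" means nothing has run yet, so offer Play; otherwise offer Refresh (restart).
function headerIconFor(status: ShellProcStatus): HeaderIcon {
    return status === "init" ? "play" : "refresh";
}
```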

**Restart Behavior** (lines 631-635 in term.tsx via term-model.ts):
```typescript
const shellProcStatus = globalStore.get(this.shellProcStatus);
if ((shellProcStatus == "done" || shellProcStatus == "init") &&
    keyutil.checkKeyPressed(waveEvent, "Enter")) {
    this.forceRestartController();
    return false;
}
```

Pressing Enter when the process is done/init triggers a restart.

## Input Flow

**Frontend → Backend:**

When the user types in the terminal, data flows through [`sendDataToController()`](frontend/app/view/term/term-model.ts:408):
```typescript
sendDataToController(data: string) {
    const b64data = stringToBase64(data);
    RpcApi.ControllerInputCommand(TabRpcClient, {
        blockid: this.blockId,
        inputdata64: b64data
    });
}
```

This calls the backend [`SendInput()`](pkg/blockcontroller/blockcontroller.go:260) function which forwards to the controller's `SendInput()` method.

The [`BlockInputUnion`](pkg/blockcontroller/blockcontroller.go:48) supports three types of input:
- `inputdata` - Raw terminal input bytes
- `signame` - Signal names (e.g., "SIGTERM", "SIGINT")
- `termsize` - Terminal size changes (rows/cols)
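
A rough TypeScript model of such an input union is sketched below. The field names follow the list above, but the types, the `describeInput` helper, and the choice to treat each field as optional are assumptions. The actual Go struct and its wire encoding (for example the base64 `inputdata64` field used by the frontend) may differ.

```typescript
interface TermSize {
    rows: number;
    cols: number;
}

// Hypothetical model of BlockInputUnion: each field is optional and
// the receiver handles whichever fields are present.
interface BlockInput {
    inputdata?: Uint8Array;
    signame?: string;
    termsize?: TermSize;
}

// Lists the actions a controller would take for a given input value.
function describeInput(input: BlockInput): string[] {
    const actions: string[] = [];
    if (input.inputdata != null && input.inputdata.length > 0) {
        actions.push(`write ${input.inputdata.length} bytes`);
    }
    if (input.signame != null && input.signame !== "") {
        actions.push(`signal ${input.signame}`);
    }
    if (input.termsize != null) {
        actions.push(`resize ${input.termsize.rows}x${input.termsize.cols}`);
    }
    return actions;
}
```

For instance, a value carrying only `termsize: { rows: 24, cols: 80 }` describes a single resize action, while an empty object describes none.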

## Key Design Principles

### 1. Frontend-Driven Architecture

The frontend has full control over controller lifecycle:
- **Creates** controllers by calling ResyncController
- **Restarts** controllers via forcerestart flag
- **Monitors** status via event subscriptions
- **Sends input** via ControllerInput RPC

The backend is reactive: it keeps controller state in the registry, but it doesn't make lifecycle decisions autonomously.

### 2. Idempotent Resync

`ResyncController()` is idempotent when the force flag is not set - calling it multiple times with the same state is safe:
- If controller exists and is running with correct type/connection → no-op
- If configuration changed → replaces controller
- If force flag set → always restarts

This makes it safe to call on various triggers (connection change, focus, etc.).

### 3. Versioned Status Updates

Status includes a monotonically increasing version number:
- Frontend tolerates events that arrive out of order
- Only applies updates with newer versions
- Prevents a stale update from overwriting a newer one

### 4. Automatic Cleanup

When a controller is replaced:
- Old controller is automatically stopped
- Runtime info is cleaned up
- Registry entry is updated atomically

The `registerController()` function handles this automatically (lines 84-99).
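
The replace-and-stop behavior can be illustrated with a small in-memory registry. This is a TypeScript model of the idea, not the Go `registerController()` itself: the class, the `StoppableController` interface, and the `cleanupRuntimeInfo` callback are hypothetical, and the ordering of the map update relative to the stop call is an assumption. The Go code additionally guards the map with `registryLock`, which a single-threaded sketch does not need.

```typescript
// Hypothetical minimal controller surface needed for replacement.
interface StoppableController {
    stop(graceful: boolean, newStatus: string): void;
}

class ControllerRegistry {
    private controllers = new Map<string, StoppableController>();

    // Registers a controller for blockId. If one was already registered,
    // it is stopped (non-gracefully, with status "done") and its runtime
    // info is cleaned up via the supplied callback.
    register(
        blockId: string,
        controller: StoppableController,
        cleanupRuntimeInfo: (blockId: string) => void
    ): void {
        const existing = this.controllers.get(blockId);
        this.controllers.set(blockId, controller);
        if (existing != null) {
            existing.stop(false, "done");
            cleanupRuntimeInfo(blockId);
        }
    }

    get(blockId: string): StoppableController | undefined {
        return this.controllers.get(blockId);
    }
}
```

Because the stop and cleanup happen inside `register`, callers never have to remember to tear down the previous controller themselves.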

## Common Patterns

### Restarting a Controller

```typescript
// In term-model.ts
forceRestartController() {
    this.triggerRestartAtom(); // UI feedback
    const termsize = {
        rows: this.termRef.current?.terminal?.rows,
        cols: this.termRef.current?.terminal?.cols,
    };
    RpcApi.ControllerResyncCommand(TabRpcClient, {
        tabid: globalStore.get(atoms.staticTabId),
        blockid: this.blockId,
        forcerestart: true,
        rtopts: { termsize: termsize },
    });
}
```

### Handling Connection Changes

```typescript
// In term.tsx - TermResyncHandler component
React.useEffect(() => {
    const isConnected = connStatus?.status == "connected";
    const wasConnected = lastConnStatus?.status == "connected";
    if (isConnected == wasConnected && curConnName == lastConnName) {
        return; // No change
    }
    model.termRef.current?.resyncController("resync handler");
    setLastConnStatus(connStatus);
}, [connStatus]);
```

### Monitoring Status

```typescript
// Status is automatically available via atom
const shellProcStatus = jotai.useAtomValue(model.shellProcStatus);

// Use in UI
if (shellProcStatus == "running") {
    // Show running state
} else if (shellProcStatus == "done") {
    // Show restart button
}
```

## Summary

The block controller lifecycle is **frontend-driven and event-reactive**:

1. **Frontend triggers** controller creation/restart via `ControllerResyncCommand` RPC
2. **Backend processes** the request in `ResyncController()`, creating/starting controllers as needed
3. **Backend publishes** status updates via WebSocket events
4. **Frontend receives** status updates and updates Jotai atoms
5. **UI reacts** automatically to atom changes via React components

This architecture gives the frontend full control over when processes start/stop while keeping the backend focused on process management. The event-based status updates create a clean separation of concerns and enable real-time UI updates without polling.

cmd/wsh/cmd/wshcmd-focusblock.go

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
// Copyright 2025, Command Line Inc.
// SPDX-License-Identifier: Apache-2.0

package cmd

import (
	"fmt"
	"os"

	"github.com/spf13/cobra"
	"github.com/wavetermdev/waveterm/pkg/wshrpc"
	"github.com/wavetermdev/waveterm/pkg/wshrpc/wshclient"
)

var focusBlockCmd = &cobra.Command{
	Use:     "focusblock [-b {blockid|blocknum|this}]",
	Short:   "focus a block in the current tab",
	Args:    cobra.NoArgs,
	RunE:    focusBlockRun,
	PreRunE: preRunSetupRpcClient,
}

func init() {
	rootCmd.AddCommand(focusBlockCmd)
}

func focusBlockRun(cmd *cobra.Command, args []string) (rtnErr error) {
	defer func() {
		sendActivity("focusblock", rtnErr == nil)
	}()

	tabId := os.Getenv("WAVETERM_TABID")
	if tabId == "" {
		return fmt.Errorf("no tab id specified (set WAVETERM_TABID environment variable)")
	}

	fullORef, err := resolveBlockArg()
	if err != nil {
		return err
	}

	route := fmt.Sprintf("tab:%s", tabId)
	err = wshclient.SetBlockFocusCommand(RpcClient, fullORef.OID, &wshrpc.RpcOpts{
		Route:   route,
		Timeout: 2000,
	})
	if err != nil {
		return fmt.Errorf("focusing block: %v", err)
	}
	return nil
}

cmd/wsh/cmd/wshcmd-jobdebug.go

Lines changed: 2 additions & 2 deletions
@@ -178,7 +178,7 @@ func jobDebugListRun(cmd *cobra.Command, args []string) error {
 		return nil
 	}
 
-	fmt.Printf("%-36s %-20s %-9s %-10s %-6s %-30s %-8s %-10s %-8s\n", "OID", "Connection", "Connected", "Manager", "Reason", "Cmd", "ExitCode", "Stream", "Attached")
+	fmt.Printf("%-36s %-25s %-9s %-10s %-6s %-30s %-8s %-10s %-8s\n", "OID", "Connection", "Connected", "Manager", "Reason", "Cmd", "ExitCode", "Stream", "Attached")
 	for _, job := range rtnData {
 		connectedStatus := "no"
 		if connectedMap[job.OID] {
@@ -226,7 +226,7 @@ func jobDebugListRun(cmd *cobra.Command, args []string) error {
 			}
 		}
 
-		fmt.Printf("%-36s %-20s %-9s %-10s %-6s %-30s %-8s %-10s %-8s\n",
+		fmt.Printf("%-36s %-25s %-9s %-10s %-6s %-30s %-8s %-10s %-8s\n",
 			job.OID, job.Connection, connectedStatus, job.JobManagerStatus, doneReason, job.Cmd, exitCode, streamStatus, attachedBlock)
 	}
 	return nil
