The previous analysis incorrectly identified gaps on the send side. With
crank buffering, output validity and deferred transmission are achieved:
messages only reach RemoteHandle via the run queue, which means the
originating crank has already committed.
The remaining gaps are on the receive side:
- Done table for exactly-once delivery (deduplication)
- FIFO enforcement for ordering
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Low Risk**
> Documentation-only change that reclassifies protocol property
coverage; no code paths or runtime behavior are modified, but it may
influence engineering priorities if misinterpreted.
>
> **Overview**
> Updates `docs/ken-protocol-assessment.md` to state that crank
buffering *fully achieves* send-side **output validity** and **deferred
transmission**, since messages reach `RemoteHandle` only after the
originating crank commits via the run queue.
>
> Reframes the remaining work as *receive-side* gaps only, adding
concrete failure scenarios and outlining needed mechanisms for
**exactly-once deduplication** (a `Done` table / processed-seq tracking)
and **FIFO ordering** (buffer/reorder out-of-order messages), and
refreshes the summary/progress tables accordingly.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
a7446e5. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
While crank outputs are now buffered until crank commit, when a message reaches `RemoteHandle` for remote transmission, it is persisted and transmitted in quick succession. A crash between persist and transmit could result in the message being retransmitted on recovery (which is fine due to idempotency), but more critically, there's no coordination ensuring the kernel's crank commit happens before network transmission.
-
-**Impact**: If RemoteHandle transmits a message and then the kernel crashes before its crank fully commits, the remote has received a message that the local kernel will "forget" on recovery. This violates output validity.
+**Why output validity is achieved**: When a message destined for a remote reaches `RemoteHandle`, it arrives via the run queue. Items only reach the run queue after the originating crank commits. Therefore, by the time `RemoteHandle` persists and transmits a message, the crank that produced it has already committed. The transmitted message corresponds to committed local state.

-**Mitigation needed**: RemoteHandle should only transmit messages after the originating crank has been fully committed. This requires coordination between the kernel's crank lifecycle and RemoteHandle's transmission timing.
+`RemoteHandle` persists messages to `remotePending` before transmitting for a different reason: to enable retransmission on recovery if the transmission or ACK is lost. This is part of the at-least-once delivery mechanism, not the output validity mechanism.
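The send-side ordering argument can be sketched in TypeScript. This is a minimal model under stated assumptions, not the kernel's implementation: `SendPathModel` and its method names are hypothetical, and plain arrays stand in for the crank buffer, the run queue, `remotePending`, and the network.

```typescript
// Hypothetical model of the send path. Only RemoteHandle and
// remotePending are names from the document; the rest is illustrative.
type OutMessage = { seq: number; body: string };

class SendPathModel {
  private crankBuffer: OutMessage[] = [];    // outputs of the in-flight crank
  readonly runQueue: OutMessage[] = [];      // holds committed outputs only
  readonly remotePending: OutMessage[] = []; // persisted for retransmission
  readonly transmitted: OutMessage[] = [];   // stands in for the network

  // A syscall made during a crank only buffers its output.
  send(msg: OutMessage): void {
    this.crankBuffer.push(msg);
  }

  // Crank commit flushes the buffered outputs to the run queue.
  commitCrank(): void {
    this.runQueue.push(...this.crankBuffer);
    this.crankBuffer = [];
  }

  // A rollback (or a crash mid-crank) discards the buffered outputs.
  abortCrank(): void {
    this.crankBuffer = [];
  }

  // RemoteHandle step: persist to remotePending first, then transmit.
  deliverToRemoteHandle(): void {
    const msg = this.runQueue.shift();
    if (msg === undefined) return;
    this.remotePending.push(msg);
    this.transmitted.push(msg);
  }
}
```

In this model nothing can reach `transmitted` without first passing through `commitCrank()`, which is the property the paragraph above relies on. The `remotePending` copy plays no part in that guarantee; it only exists so a lost transmission or ACK can be retried.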

-#### 2. Done Table / Duplicate Detection (Gap)
+### Remaining Gaps (Receive Side)

-Ken maintains a `Done` table ensuring:
-- Each message delivered to application **at most once**
-- FIFO ordering enforced via `next_ready()` considering seq + sender ID
+The remaining gaps are on the **receive side** of remote messaging:

-We track `highestReceivedSeq` but only for ACK purposes. We don't have explicit duplicate detection for incoming messages. If the remote retransmits a message we already processed (but before we ACKed), we could deliver it twice.
+#### 1. Done Table / Duplicate Detection (Gap)

-#### 3. Output Validity (Improved, but Partial)
+Ken maintains a `Done` table ensuring each message is delivered to the application **at most once**. The `Done` table is updated atomically with the application state at crank commit.

-Ken guarantees outputs could have resulted from failure-free execution because:
-- Outputs are buffered during a turn
-- A crash during processing loses all outputs from that turn
-- Only committed outputs escape to the outside world
+We track `highestReceivedSeq` per remote, but there's a gap in how it interacts with delivery:

-**Improvement**: With crank buffering, kernel-internal outputs (sends to local vats, notifications) are now properly buffered and discarded on rollback. A crash mid-crank no longer results in partial kernel state.
+**Scenario A - Update on receive, before delivery:**
+1. Receive message seq=5 from remote R
+2. Update `highestReceivedSeq` to 5
+3. Add message to run queue for delivery to local vat
+4. Crash before delivery crank commits
+5. On recovery: `highestReceivedSeq=5` suggests we processed it
+6. Remote retransmits seq=5, we ignore it
+7. **Message lost** - vat never received it

-**Remaining gap**: For remote messages, the gap described in #1 above means network transmissions could still escape before the crank is fully committed.
**What's needed**: A `Done` table (or equivalent) that is updated atomically with the delivery crank commit, and checked before delivering incoming messages.
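To make the difference concrete, here is a small TypeScript simulation of Scenario A next to the behavior asked for above. It is a sketch under assumptions: `Durable`, `ScenarioReceiver`, and `crashThenRetransmit` are hypothetical names, a plain object stands in for committed database state, and "atomic" is modeled simply by updating both fields inside one synchronous method.

```typescript
type InMessage = { seq: number; body: string };

// State that survives a crash (assumed to live in the kernel database).
interface Durable {
  highestSeq: number;  // per-remote marker used to reject duplicates
  delivered: string[]; // stands in for committed vat state
}

class ScenarioReceiver {
  private queue: InMessage[] = []; // in-memory only, lost on crash

  constructor(private durable: Durable, private markOnReceive: boolean) {}

  receive(msg: InMessage): void {
    if (msg.seq <= this.durable.highestSeq) return; // looks already processed
    if (this.markOnReceive) this.durable.highestSeq = msg.seq; // Scenario A
    this.queue.push(msg);
  }

  // Delivery crank: the marker is checked and advanced together with the
  // vat state, so both commit or neither does.
  runDeliveryCrank(): void {
    const msg = this.queue.shift();
    if (msg === undefined) return;
    if (!this.markOnReceive) {
      if (msg.seq <= this.durable.highestSeq) return; // duplicate in queue
      this.durable.highestSeq = msg.seq;
    }
    this.durable.delivered.push(msg.body);
  }
}

// Receive seq=5, crash before the delivery crank, then see a retransmission.
function crashThenRetransmit(markOnReceive: boolean): string[] {
  const durable: Durable = { highestSeq: 4, delivered: [] };
  new ScenarioReceiver(durable, markOnReceive).receive({ seq: 5, body: "hello" });
  // Crash: the first receiver and its queue are gone; `durable` remains.
  const recovered = new ScenarioReceiver(durable, markOnReceive);
  recovered.receive({ seq: 5, body: "hello" }); // remote retransmits
  recovered.runDeliveryCrank();
  return durable.delivered;
}
```

With `markOnReceive` set to `true` the retransmission is ignored and the vat never sees the message. With it set to `false` the marker only advances inside the delivery crank, so the retransmission is accepted after the crash and delivered once.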

-Ken atomically checkpoints `(turn, app_state, Q_out, Done)` together at end of turn.
+#### 2. FIFO Enforcement on Receive (Gap)

-**Improvement**: The kernel now uses database savepoints to make crank state changes atomic. The `CrankBuffer` contents are flushed atomically with the crank commit.
+Ken enforces per-sender FIFO ordering via `next_ready()` which only delivers the next expected sequence number.

-**Remaining gap**: RemoteHandle's message persistence is separate from the kernel's crank commit. These two persistence operations are not atomic with respect to each other.
+If messages arrive out of order from the network (e.g., seq 1, 3, 2):
+- We should deliver seq=1
+- Buffer seq=3 until seq=2 arrives
+- Deliver seq=2, then seq=3

-#### 5. FIFO Enforcement on Receive (Gap)
+We don't currently enforce this ordering on the receive side. Out-of-order network delivery could result in out-of-order application delivery.

-Hold out-of-order messages until predecessors processed:
-- Track expected next seq per sender
-- Buffer messages that arrive out of order
-- Deliver in sequence order only
-
-We don't currently enforce FIFO delivery order on the receive side.
+**What's needed**: Track expected next seq per remote sender, buffer out-of-order messages, deliver in sequence order only.

 ### Summary Table

 | Ken Property | Our System | Notes |
 |--------------|------------|-------|
-| Exactly-once delivery | **Partial** | At-least-once with no duplicate detection |
-| Output validity | **Partial** | ✓ for kernel-internal, gap for remote transmission |
- Store `highestProcessedSeq.${remoteId}` updated atomically with delivery crank
+- On receive: if `seq <= highestProcessedSeq`, ACK but don't deliver
+- Simple but requires in-order processing
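A minimal TypeScript sketch of the watermark approach follows. The key format comes from the bullets above; everything else is an assumption made for illustration: `WatermarkStore` is a `Map` standing in for the kernel database, and `classify` and `recordProcessed` are hypothetical helper names.

```typescript
// Hypothetical key-value store standing in for the kernel database.
type WatermarkStore = Map<string, string>;

const watermarkKey = (remoteId: string): string =>
  `highestProcessedSeq.${remoteId}`;

function highestProcessed(kv: WatermarkStore, remoteId: string): number {
  return Number(kv.get(watermarkKey(remoteId)) ?? "0");
}

// Decide what to do with an incoming message. Every message is ACKed;
// only messages above the watermark are delivered.
function classify(
  kv: WatermarkStore,
  remoteId: string,
  seq: number,
): "deliver" | "ack-only" {
  return seq <= highestProcessed(kv, remoteId) ? "ack-only" : "deliver";
}

// Intended to run as part of the delivery crank's commit, so that the
// watermark and the vat state become durable together.
function recordProcessed(kv: WatermarkStore, remoteId: string, seq: number): void {
  if (seq > highestProcessed(kv, remoteId)) {
    kv.set(watermarkKey(remoteId), String(seq));
  }
}
```

The in-order requirement shows up directly here: if seq=3 were recorded before seq=2 had been delivered, `classify` would report seq=2 as `"ack-only"` and it would never reach the vat.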

-### 2. Add Done Table
+**Option B: Explicit Done table**
+- Store `done.${remoteId}.${seq} = true` for each processed message
+- On receive: check Done table before delivering
+- Supports out-of-order processing
+- Requires garbage collection of old entries (after ACK confirms sender discarded)
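The explicit table could look roughly like the TypeScript below. Again this is a sketch under assumptions: `DoneStore` is a `Map` standing in for the kernel database, the helper names are hypothetical, and the caller of `collectDone` is assumed to already know which range the sender has discarded.

```typescript
// Hypothetical key-value store standing in for the kernel database.
type DoneStore = Map<string, string>;

const doneKey = (remoteId: string, seq: number): string =>
  `done.${remoteId}.${seq}`;

// Checked on receive, before handing the message to the vat.
function isDone(kv: DoneStore, remoteId: string, seq: number): boolean {
  return kv.has(doneKey(remoteId, seq));
}

// Intended to run as part of the delivery crank's commit.
function markDone(kv: DoneStore, remoteId: string, seq: number): void {
  kv.set(doneKey(remoteId, seq), "true");
}

// Garbage-collect entries in [fromSeq, throughSeq]. Only safe once the
// sender can no longer retransmit those messages.
function collectDone(
  kv: DoneStore,
  remoteId: string,
  fromSeq: number,
  throughSeq: number,
): number {
  let removed = 0;
  for (let seq = fromSeq; seq <= throughSeq; seq += 1) {
    if (kv.delete(doneKey(remoteId, seq))) removed += 1;
  }
  return removed;
}
```

Because each sequence number has its own entry, processing seq=3 before seq=2 leaves seq=2 deliverable, which is the out-of-order property listed above. The cost is one entry per message until it is collected.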

-Track processed message IDs, deduplicate on receive:
-- Persist `Done` table entries for processed messages
-- On receive, check if message already in `Done` before delivering
-- ACK messages in `Done` without re-delivering
+Either approach requires the processed-message record to be updated **atomically with the delivery crank commit**, so that crash recovery sees consistent state.

-### 3. FIFO Enforcement on Receive
+### 2. FIFO Enforcement on Receive

-Hold out-of-order messages until predecessors processed:
-- Track expected next seq per sender
-- Buffer messages that arrive out of order
-- Deliver in sequence order only
+Buffer and reorder incoming messages:

-## Architectural Implications
+- Track `expectedNextSeq.${remoteId}` per sender
+- On receive: if `seq > expectedNextSeq`, buffer the message
+- When `expectedNextSeq` message arrives, deliver it and any buffered successors
+- Update `expectedNextSeq` as messages are delivered
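The buffering steps above can be sketched as a small in-memory reorder buffer in TypeScript. `ReorderBuffer`, `Envelope`, and `accept` are hypothetical names, sequence numbers are assumed to start at 1 for each remote, and persistence of the buffer and of `expectedNextSeq` is deliberately left out.

```typescript
type Envelope = { seq: number; body: string };

class ReorderBuffer {
  // Next sequence number to deliver, per remote (assumed to start at 1).
  private expectedNextSeq = new Map<string, number>();
  // Messages that arrived ahead of their predecessors, per remote.
  private held = new Map<string, Map<number, Envelope>>();

  // Accept one incoming message and return whatever is now deliverable,
  // in sequence order. Returns an empty array while a gap remains.
  accept(remoteId: string, msg: Envelope): Envelope[] {
    let expected = this.expectedNextSeq.get(remoteId) ?? 1;
    if (msg.seq < expected) return []; // already delivered; left to dedup

    const pending = this.held.get(remoteId) ?? new Map<number, Envelope>();
    this.held.set(remoteId, pending);
    pending.set(msg.seq, msg);

    // Drain the run of consecutive messages starting at `expected`.
    const ready: Envelope[] = [];
    let next = pending.get(expected);
    while (next !== undefined) {
      ready.push(next);
      pending.delete(expected);
      expected += 1;
      next = pending.get(expected);
    }
    this.expectedNextSeq.set(remoteId, expected);
    return ready;
  }
}
```

Feeding it seq 1, 3, 2 from the same remote yields seq 1 immediately, nothing for seq 3, and then seq 2 followed by seq 3 once the gap is filled. A real implementation would also need `expectedNextSeq` to survive a crash, which is where it meets the processed-message record.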

-The crank buffering work has brought us significantly closer to Ken's model:
+This interacts with the Done table: we need to handle the case where we've already processed some messages (from Done table) when determining what's "expected next."

-**Before crank buffering:**
-```
-Kernel Crank:
-process message → syscalls immediately enqueue to run queue
-
-RemoteHandle (independent):
-persist each outgoing message → transmit immediately
-```
+## Architectural Summary

-**After crank buffering:**
+**Send side (achieved with crank buffering):**
 ```
-Kernel Crank:
-process message → syscalls buffer outputs
+Vat Crank:
+vat processes message → syscalls buffer outputs

 Crank Commit (atomic):
-persist(kernel_state) + flush(buffered_outputs to run queue)
+persist(vat_state) + flush(buffered_outputs to run queue)

-RemoteHandle (still independent):
-receive from run queue → persist → transmit immediately
+Later (separate operation):
+run queue delivers to RemoteHandle → persist to remotePending → transmit