| 1 | +# Implementation Plan |
| 2 | + |
| 3 | +- [ ] 1. Define tagging contracts, errors, and configuration |
| 4 | +- [ ] 1.1 Define non-forwardable tag contracts and scope values |
| 5 | + - Introduce a domain representation for non-forwardable scope with `never_forward` and `client_history_only`. |
| 6 | + - Define a fixed-size, content-independent identity representation suitable for bounded in-memory storage. |
| 7 | + - Define a compact tag record structure that can be stored per session without retaining message content. |
| 8 | + - Ensure the domain contracts do not rely on any client-provided metadata for correctness. |
| 9 | + - _Requirements: 1.1, 1.7, 1.8, 14.1_ |
| 10 | + |
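
The contracts in task 1.1 could look roughly like the sketch below. It assumes a SHA-256 digest as the fixed-size identity; the class names `NonForwardableScope`, `MessageIdentity`, and `TagRecord` are hypothetical and are not taken from the codebase.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum


class NonForwardableScope(str, Enum):
    """The two scope values named in the plan."""

    NEVER_FORWARD = "never_forward"
    CLIENT_HISTORY_ONLY = "client_history_only"


# Assumption: identities are SHA-256 digests, so every identity is 32 bytes.
IDENTITY_SIZE = 32


@dataclass(frozen=True)
class MessageIdentity:
    """Fixed-size, content-independent handle for a message."""

    digest: bytes

    def __post_init__(self) -> None:
        if len(self.digest) != IDENTITY_SIZE:
            raise ValueError(f"identity must be {IDENTITY_SIZE} bytes")


@dataclass(frozen=True)
class TagRecord:
    """Compact per-session record: no message content, no client metadata."""

    identity: MessageIdentity
    scope: NonForwardableScope


# Example record built from an arbitrary byte string.
example = TagRecord(
    MessageIdentity(hashlib.sha256(b"example").digest()),
    NonForwardableScope.NEVER_FORWARD,
)
```

Frozen dataclasses keep the records hashable and immutable, which suits the append-only registry described in task 3.1.
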
| 11 | +- [ ] 1.2 Define service interfaces for identity, registry, and enforcement |
| 12 | + - Define an identity computation contract that is deterministic for canonical domain messages and excludes client metadata. |
| 13 | + - Define a registry contract that supports session-scoped append-only tagging and tag lookups. |
| 14 | + - Define an enforcement contract that filters a message list and reports filtered counts while preserving order. |
| 15 | + - Specify fail-closed behavior for indeterminate matches or internal errors (raise before any backend call). |
| 16 | + - _Requirements: 1.2, 1.3, 1.4, 1.9, 1.10, 7.3, 10.1, 12.1_ |
| 17 | + |
| 18 | +- [ ] 1.3 Define error types and API error mapping for enforcement failures |
| 19 | + - Add a domain error for internal enforcement failures that must fail closed without backend calls. |
| 20 | + - Add a client-visible error for “no forwardable content remains” after filtering. |
| 21 | + - Add a client-visible error for “non-forwardable tag capacity exceeded” (bounded memory protection). |
| 22 | + - Ensure error responses are structured and do not leak filtered message content. |
| 23 | + - _Requirements: 5.3, 6.2, 7.3, 10.1, 14.3_ |
| 24 | + |
| 25 | +- [ ] 1.4 Add configuration for per-session non-forwardable tag capacity limit |
| 26 | + - Add a configuration field for maximum stored non-forwardable identities per session. |
| 27 | + - Ensure configuration precedence is respected (CLI > ENV > YAML). |
| 28 | + - Provide a default limit of 10,000 identities when not configured. |
| 29 | + - _Requirements: 14.3, 14.4_ |
| 30 | + |
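
One way to resolve the capacity limit with the stated precedence (CLI > ENV > YAML) and the 10,000 default is sketched below. The environment variable name, the YAML key, and the rejection of non-positive values are assumptions made for illustration.

```python
from typing import Mapping, Optional

DEFAULT_TAG_CAPACITY = 10_000

# Hypothetical names; the real configuration keys may differ.
ENV_VAR = "NON_FORWARDABLE_TAG_CAPACITY"
YAML_KEY = "non_forwardable_tag_capacity"


def resolve_tag_capacity(
    cli_value: Optional[int],
    env: Mapping[str, str],
    yaml_config: Mapping[str, object],
) -> int:
    """Pick the first configured source in CLI > ENV > YAML order."""
    if cli_value is not None:
        raw: object = cli_value
    elif ENV_VAR in env:
        raw = env[ENV_VAR]
    elif YAML_KEY in yaml_config:
        raw = yaml_config[YAML_KEY]
    else:
        return DEFAULT_TAG_CAPACITY

    limit = int(raw)  # raises ValueError for non-numeric strings
    if limit <= 0:
        # Assumption: a zero or negative limit is treated as a config error.
        raise ValueError("tag capacity must be a positive integer")
    return limit
```

Passing the environment and YAML mappings in as arguments keeps the function easy to unit test without touching `os.environ`.
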
| 31 | +- [ ] 2. Implement deterministic message identity |
| 32 | +- [ ] 2.1 Implement identity computation with compaction-stable rules |
| 33 | + - Compute identity from canonical message attributes only (no transport wrappers, no client metadata). |
| 34 | + - Normalize textual fields for hashing without mutating messages (line-ending normalization only). |
| 35 | + - Ensure tool result identities do not depend on tool output content so identities survive history compaction rewrites. |
| 36 | + - Add request-local caching/batching so each message is hashed at most once per request. |
| 37 | + - _Requirements: 1.2, 1.9, 1.10, 1.12, 1.13, 5.2, 9.1_ |
| 38 | + |
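
The identity rules in task 2.1 can be illustrated with the minimal sketch below. It assumes a simplified canonical message with `role`, `content`, and `tool_call_id` fields, SHA-256 as the hash, and that a tool result is identified by its `tool_call_id` alone; all of these are illustrative choices, not the project's actual message contract.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class CanonicalMessage:
    """Hypothetical, simplified canonical message."""

    role: str
    content: str = ""
    tool_call_id: Optional[str] = None
    # Client metadata is carried along but never hashed.
    metadata: Dict[str, Any] = field(default_factory=dict, compare=False)


def _normalize(text: str) -> str:
    # Line-ending normalization only; the message itself is not mutated.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def compute_identity(message: CanonicalMessage) -> bytes:
    if message.role == "tool":
        if not message.tool_call_id:
            # Fail closed: without an id the match would be indeterminate.
            raise ValueError("tool result has no tool_call_id")
        # Output content is excluded so compaction rewrites keep the identity.
        fields = ["tool", message.tool_call_id]
    else:
        fields = [message.role, _normalize(message.content)]
    payload = json.dumps(fields, ensure_ascii=False, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).digest()


class RequestIdentityCache:
    """Request-local cache so each message object is hashed at most once."""

    def __init__(self) -> None:
        # The message is stored with its digest so its id() stays unique.
        self._entries: Dict[int, Tuple[CanonicalMessage, bytes]] = {}

    def identity_for(self, message: CanonicalMessage) -> bytes:
        entry = self._entries.get(id(message))
        if entry is None:
            entry = (message, compute_identity(message))
            self._entries[id(message)] = entry
        return entry[1]
```

Serializing the hashed fields as a JSON array avoids ambiguity between, for example, a role of `"a"` with content `"bc"` and a role of `"ab"` with content `"c"`.
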
| 39 | +- [ ] 2.2 (P) Add unit tests for identity determinism and compaction stability |
| 40 | + - Verify deterministic identities for equivalent canonical messages across repeated computations. |
| 41 | + - Verify identities ignore client metadata/transport-only fields. |
| 42 | + - Verify tool result identity remains stable when tool output content is rewritten by compaction. |
| 43 | + - _Requirements: 1.2, 1.10, 1.12, 1.13_ |
| 44 | + |
| 45 | +- [ ] 3. Implement session-scoped tag registry with bounded memory |
| 46 | +- [ ] 3.1 Implement registry persistence, deduplication, and per-session limit enforcement |
| 47 | + - Persist tags per session as append-only and immutable for the session lifetime. |
| 48 | + - Store only fixed-size identities and compact scope/tag records (no message content retention). |
| 49 | + - Deduplicate repeated tagging of the same identity+scope so stored state does not grow from repeats. |
| 50 | + - Enforce the configured per-session tag capacity limit and fail the request before any backend call when exceeded. |
| 51 | + - _Requirements: 1.1, 1.3, 8.3, 8.4, 14.1, 14.2, 14.3, 14.4_ |
| 52 | + |
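
A minimal in-memory sketch of the registry behavior in task 3.1 follows. The class and error names are hypothetical, identities are assumed to be fixed-size `bytes`, and scopes are plain strings here for brevity.

```python
from typing import Dict, Set, Tuple


class TagCapacityExceededError(Exception):
    """Raised when a session would exceed its tag capacity."""


class InMemoryTagRegistry:
    """Append-only, session-scoped store of (identity, scope) records."""

    def __init__(self, max_tags_per_session: int = 10_000) -> None:
        self._max = max_tags_per_session
        self._tags: Dict[str, Set[Tuple[bytes, str]]] = {}

    def tag(self, session_id: str, identity: bytes, scope: str) -> None:
        records = self._tags.setdefault(session_id, set())
        record = (identity, scope)
        if record in records:
            return  # dedupe: repeats do not grow stored state
        if len(records) >= self._max:
            # The caller is expected to fail the request before any backend call.
            raise TagCapacityExceededError(
                f"session {session_id!r} reached {self._max} tags"
            )
        records.add(record)

    def is_tagged(self, session_id: str, identity: bytes, scope: str) -> bool:
        return (identity, scope) in self._tags.get(session_id, set())

    def count(self, session_id: str) -> int:
        return len(self._tags.get(session_id, set()))

    # Deliberately no remove/clear method: tags are immutable for the session.
```

Checking for a duplicate before checking capacity means re-tagging an already stored identity still succeeds when the session is full.
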
| 53 | +- [ ] 3.2 (P) Add unit tests for registry immutability, dedupe, and limit exceeded behavior |
| 54 | + - Verify tags are monotonic (append-only) and cannot be removed within a session lifetime. |
| 55 | + - Verify re-tagging the same identity+scope does not increase stored state. |
| 56 | + - Verify exceeding the configured limit produces the correct error and prevents backend calls. |
| 57 | + - _Requirements: 1.3, 10.1, 14.2, 14.3_ |
| 58 | + |
| 59 | +- [ ] 4. Implement the non-forwardable enforcer (single filtering policy) |
| 60 | +- [ ] 4.1 Implement scope-aware filtering with preserved order and no content mutation |
| 61 | + - Filter messages recognized as non-forwardable for the session and exclude them from outbound payloads. |
| 62 | + - Always exclude `never_forward` messages regardless of whether they originated from the client or the proxy. |
| 63 | + - Preserve relative ordering of remaining forwardable messages and do not mutate their content. |
| 64 | + - Support filtering across roles/content variants present in canonical message contracts. |
| 65 | + - Ignore client attempts to mark messages as non-forwardable and rely solely on server-tagged recognition within the session. |
| 66 | + - If filtering leaves no forwardable user-provided content, return a client-visible “nothing forwardable” error without backend calls. |
| 67 | + - _Requirements: 1.4, 1.5, 1.6, 1.7, 1.11, 3.2, 4.3, 5.1, 5.2, 5.3, 10.1, 12.1_ |
| 68 | + |
| 69 | +- [ ] 4.2 Implement injected-message provenance boundary and client-history-only semantics |
| 70 | + - Accept an internal provenance boundary that splits client-submitted history from proxy-injected messages for a call. |
| 71 | + - Filter client history against both scopes (`never_forward`, `client_history_only`). |
| 72 | + - Filter injected messages against `never_forward` only so proxy-injected steering remains included for that call. |
| 73 | + - Validate boundary inputs and fail closed on invalid provenance before any backend call. |
| 74 | + - _Requirements: 1.8, 1.11, 4.1, 4.2, 4.4, 7.2, 7.3_ |
| 75 | + |
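
The boundary semantics of task 4.2 can be sketched as below. This assumes the provenance boundary is an index into the message list, with proxy-injected messages appended after the client history; `enforce`, `FilterResult`, and `EnforcementError` are hypothetical names, and the tag set and identity function are passed in to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Any, Callable, Hashable, List, Sequence, Set, Tuple


class EnforcementError(Exception):
    """Internal failure; the caller must not invoke the backend."""


@dataclass(frozen=True)
class FilterResult:
    messages: List[Any]
    filtered_count: int


def enforce(
    messages: Sequence[Any],
    injected_start: int,
    tags: Set[Tuple[Hashable, str]],
    identity_of: Callable[[Any], Hashable],
) -> FilterResult:
    # Fail closed on an invalid provenance boundary.
    if not isinstance(injected_start, int) or not 0 <= injected_start <= len(messages):
        raise EnforcementError("invalid injected-message boundary")

    kept: List[Any] = []
    for index, message in enumerate(messages):
        try:
            identity = identity_of(message)
        except Exception as exc:
            raise EnforcementError("identity computation failed") from exc

        if index >= injected_start:
            # Proxy-injected for this call: only never_forward applies.
            blocked_scopes = ("never_forward",)
        else:
            # Client-submitted history: both scopes apply.
            blocked_scopes = ("never_forward", "client_history_only")

        if any((identity, scope) in tags for scope in blocked_scopes):
            continue
        kept.append(message)  # same object, same relative order

    return FilterResult(kept, len(messages) - len(kept))
```

With this shape, a steering message tagged `client_history_only` is forwarded on the call that injects it and dropped when the client echoes it back in later history.
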
| 76 | +- [ ] 4.3 Implement structured telemetry for filtering decisions |
| 77 | + - Emit a structured log entry when messages are filtered, including correlation id and filtered count. |
| 78 | + - Avoid logging message contents by default while still supporting request-level correlation. |
| 79 | + - Ensure telemetry remains accurate when wire capture is enabled and filtering is applied. |
| 80 | + - _Requirements: 6.1, 6.2, 11.1_ |
| 81 | + |
| 82 | +- [ ] 4.4 (P) Add unit tests for enforcer invariants and fail-closed behavior |
| 83 | + - Verify order preservation and no mutation for remaining messages. |
| 84 | + - Verify `never_forward` and `client_history_only` semantics, including injected-message boundary behavior. |
| 85 | + - Verify invalid boundary provenance and internal lookup errors fail closed without backend calls. |
| 86 | + - _Requirements: 1.4, 1.5, 1.6, 1.8, 4.4, 7.3, 10.1_ |
| 87 | + |
| 88 | +- [ ] 5. Tag non-forwardable messages at the sources (commands, responses, steering) |
| 89 | +- [ ] 5.1 (P) Tag slash commands as never-forward and ensure server-side handling |
| 90 | + - When a client message is identified as a slash command, tag it as `never_forward` for the session. |
| 91 | + - Execute supported commands server-side and return the command response without calling a remote backend. |
| 92 | + - Return an error response for invalid/unsupported commands without calling a remote backend. |
| 93 | + - _Requirements: 2.1, 2.2, 2.3, 2.4, 2.5_ |
| 94 | + |
| 95 | +- [ ] 5.2 (P) Tag server-generated command responses as never-forward |
| 96 | + - Tag the server-generated command response message as `never_forward` for the session at creation time. |
| 97 | + - Ensure recognition does not rely on client-preserved metadata if the client resubmits the response in history. |
| 98 | + - _Requirements: 3.1, 3.2, 3.3_ |
| 99 | + |
| 100 | +- [ ] 5.3 (P) Tag server-injected steering/internal messages as client-history-only |
| 101 | + - When the proxy injects steering/internal messages for a backend workflow, tag them as `client_history_only`. |
| 102 | + - Provide the injected-message provenance boundary for the current backend call so filtering semantics are correct. |
| 103 | + - Ensure client echo of these messages is excluded from future backend payloads within the same session. |
| 104 | + - _Requirements: 4.1, 4.2, 4.4_ |
| 105 | + |
| 106 | +- [ ] 6. Wire enforcement into backend orchestration (single authoritative boundary) |
| 107 | +- [ ] 6.1 Register identity, registry, and enforcer services in staged initialization |
| 108 | + - Register new services with appropriate lifetimes and dependency wiring. |
| 109 | + - Ensure configuration values (tag capacity limit) are available to the registry at runtime. |
| 110 | + - _Requirements: 1.1, 1.2, 1.3, 14.3_ |
| 111 | + |
| 112 | +- [ ] 6.2 Integrate enforcement into backend completion flow before capture and invocation |
| 113 | + - Apply optional history compaction (if enabled) before enforcement so filtering runs on the final outbound message list. |
| 114 | + - Invoke enforcement immediately before backend request translation, outbound wire capture, and backend invocation. |
| 115 | + - Update the canonical request message list to the filtered list so downstream processing and captures are consistent. |
| 116 | + - Ensure no remote backend call occurs without passing through this enforcement boundary. |
| 117 | + - _Requirements: 5.4, 6.3, 7.1, 7.4, 7.6_ |
| 118 | + |
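
The ordering required by task 6.2 (compaction, then enforcement, then translation, capture, and invocation) might be expressed as in the sketch below. Each stage is passed in as a callable; the function name `call_backend` and its parameters are placeholders, not the orchestrator's real interface.

```python
from typing import Any, Callable, List, Optional

Messages = List[Any]


def call_backend(
    messages: Messages,
    *,
    compact: Optional[Callable[[Messages], Messages]],
    enforce: Callable[[Messages], Messages],
    translate: Callable[[Messages], Any],
    capture: Callable[[Any], None],
    invoke: Callable[[Any], Any],
) -> Any:
    # 1. Optional history compaction runs first so filtering sees the final list.
    outbound = compact(messages) if compact is not None else list(messages)

    # 2. Enforcement; any exception here propagates and nothing below runs.
    outbound = enforce(outbound)

    # 3. Translation, wire capture, and invocation all see only the filtered list.
    payload = translate(outbound)
    capture(payload)
    return invoke(payload)
```

Because every later stage consumes the return value of `enforce`, an enforcement failure cannot be bypassed by a downstream step in this flow.
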
| 119 | +- [ ] 6.3 Add integration tests for backend flow filtering and compaction compatibility |
| 120 | + - Verify wire captures and outbound payloads exclude non-forwardable messages. |
| 121 | + - Verify filtering still matches and excludes tagged messages when tool-result history is compacted/rewritten. |
| 122 | + - Verify “no forwardable content” and tag-capacity errors fail before any backend call. |
| 123 | + - _Requirements: 1.12, 1.13, 6.3, 7.1, 7.4, 14.3_ |
| 124 | + |
| 125 | +- [ ]* 6.4 Add property-based tests for identity and filtering invariants (deferrable) |
| 126 | + - Generate diverse canonical message shapes (role/content/tool variants) and assert identity determinism. |
| 127 | + - Assert filtering invariants (order preserved; removed messages are always tagged for the session and scope). |
| 128 | + - _Requirements: 1.2, 1.5, 5.2_ |
| 129 | + |
| 130 | +- [ ] 7. Ensure coverage across entry points and session identity propagation (Option B) |
| 131 | +- [ ] 7.1 (P) Refactor WebSocket-based backend call paths to use the shared orchestrator |
| 132 | + - Remove direct calls to backend adapters from WebSocket features and route via the shared backend-call service. |
| 133 | + - Ensure a session id is resolved/created for the interaction and reused across multiple backend calls in a turn. |
| 134 | + - Confirm the shared enforcement boundary is invoked for these calls. |
| 135 | + - _Requirements: 7.5, 7.6, 8.1, 8.2, 8.3_ |
| 136 | + |
| 137 | +- [ ] 7.2 (P) Refactor internal multi-phase backend workflows to use the shared orchestrator |
| 138 | + - Remove direct calls to backend adapters from internal multi-phase executors and route via the shared backend-call service. |
| 139 | + - Ensure nested backend calls reuse the same session id for the logical interaction. |
| 140 | + - Confirm the shared enforcement boundary is invoked for these calls. |
| 141 | + - _Requirements: 7.5, 7.6, 8.1, 8.2, 8.3_ |
| 142 | + |
| 143 | +- [ ] 7.3 Ensure internal retry/steering workflows propagate session id and injection provenance |
| 144 | + - Reuse the same session id across multiple backend calls made within a single logical interaction. |
| 145 | + - Provide the injected-message provenance boundary for every backend call that appends steering/internal messages. |
| 146 | + - _Requirements: 7.2, 8.2, 8.3_ |
| 147 | + |
| 148 | +- [ ] 7.4 Add integration tests for session scoping and non-leakage across entry points |
| 149 | + - Verify tags are applied only within the resolved session id and do not leak across sessions. |
| 150 | + - Verify a non-HTTP entry point reuses the same session id across multiple backend calls in a single interaction. |
| 151 | + - _Requirements: 8.1, 8.2, 8.3, 8.4_ |
| 152 | + |
| 153 | +- [ ] 8. Remove legacy regex-based enforcement and legacy code paths (alpha finality) |
| 154 | +- [ ] 8.1 Remove regex-based non-forwardable filtering mechanisms and all wiring |
| 155 | + - Delete the legacy regex-based mechanisms that stripped commands and redacted prompts to implement non-forwardable behavior. |
| 156 | + - Remove any remaining “compatibility” toggles or fallback paths that preserve legacy semantics. |
| 157 | + - Ensure the tagging + single-boundary enforcement mechanism is the only remaining implementation. |
| 158 | + - _Requirements: 13.1, 13.2, 13.3_ |
| 159 | + |
| 160 | +- [ ] 8.2 Update tests to remove legacy regex expectations and assert final behavior only |
| 161 | + - Remove or rewrite tests that depended on legacy regex stripping and replace with tagging/enforcement assertions. |
| 162 | + - Add regression coverage that fails if legacy regex-based non-forwardable filtering is reintroduced. |
| 163 | + - _Requirements: 13.1, 13.3_ |
| 164 | + |
| 165 | +- [ ] 8.3 Run the full unit and integration suites for this feature and fix failures |
| 166 | + - Ensure unit tests for identity/registry/enforcer pass. |
| 167 | + - Ensure integration tests for backend flow, compaction compatibility, and entry point coverage pass. |
| 168 | + - _Requirements: 9.1, 10.1, 11.1_ |