|
| 1 | +# Gap Analysis: Backend Stage SOLID Refactoring |
| 2 | + |
| 3 | +> Note: `.kiro/specs/backend-stage-solid-refactoring/spec.json` shows `requirements.approved = false`. |
| 4 | +> This gap analysis proceeds anyway (it can inform requirement revisions), but design should treat requirements as draft until approved. |
| 5 | +
|
| 6 | +## Analysis Summary (3–5 bullets) |
| 7 | + |
| 8 | +- `src/core/app/stages/backend.py` currently combines stage orchestration, backend validation, per-backend initialization shaping, temporary `httpx.AsyncClient` lifecycle, and `static_route` validation; this conflicts with the SRP/OCP targets. |
| 9 | +- The repo already has a centralized backend DI registrar (`src/core/di/registrations/backend.py`) and a real `BackendFactory.ensure_backend()` path, but `BackendStage` still has fallback/manual validation that duplicates initialization logic and adds leak-prone resource handling. |
| 10 | +- Config/DI lifecycle constraints matter: stage validation runs before stage execution (`ApplicationBuilder.validate_stages()`), which makes “delegate to DI services” non-trivial unless config is injected (or validation avoids allocating resources). |
| 11 | +- The largest integration cost is test migration: unit tests heavily assert `BackendStage` implementation details and regression tests pin temporary-client leak behavior. |
| 12 | +- Viable approaches exist (extend existing, new components, hybrid), but the design phase must explicitly decide how validation is sequenced/cleaned up to avoid regressions. |
| 13 | + |
| 14 | +## Document Status |
| 15 | + |
| 16 | +- Framework: followed `.kiro/settings/rules/gap-analysis.md` |
| 17 | +- Context loaded: `.kiro/specs/backend-stage-solid-refactoring/spec.json`, `.kiro/specs/backend-stage-solid-refactoring/requirements.md`, and all `.kiro/steering/*` |
| 18 | +- Investigation method: codebase inspection via `rg`, targeted reads of key modules and tests |
| 19 | + |
| 20 | +## 1) Current State Investigation |
| 21 | + |
| 22 | +### Domain Assets (High-signal) |
| 23 | + |
| 24 | +- **Backend stage (current “god class”)** |
| 25 | + - `src/core/app/stages/backend.py` |
| 26 | + - `execute()` imports connectors, validates `static_route`, calls `src.core.di.registrations.backend.register(...)`, and has a compatibility `BackendService` registration path. |
| 27 | + - `validate()` and `_validate_backend_functionality()` implement backend functional checks and test-environment exceptions. |
| 28 | + - `_manual_backend_validation()` duplicates backend initialization config shaping (Gemini/Anthropic/OpenRouter) and directly instantiates/initializes connectors. |
| 29 | + - `_register_validation_http_client()` and `_cleanup_validation_client()` implement a “temporary validation client” lifecycle and task tracking. |
| 30 | + - `_validate_static_route_backend()` validates that `static_route` backend exists in `backend_registry`. |
| 31 | + |
| 32 | +- **Backend factory (central initialization path, still contains backend-specific branches)** |
| 33 | + - `src/core/services/backend_factory.py` |
| 34 | + - `ensure_backend()` already exists as the single “create + initialize” path used by `BackendStage._validate_backend_functionality()`. |
| 35 | + - Contains backend-specific augmentations for `anthropic`, `openrouter`, and `gemini` (hardcoded `if connector_type == ...` logic). |
| 36 | + |
| 37 | +- **Backend DI registrations (already centralized)** |
| 38 | + - `src/core/di/registrations/backend.py` calls focused registrars under `src/core/di/registrations/_backend/`. |
| 39 | + - There is **no** `src/core/di/registrations/_backend/validation.py` today. |
| 40 | + |
| 41 | +- **Backend discovery/registration** |
| 42 | + - `src/core/services/backend_imports.py` imports `src.connectors` at startup. |
| 43 | + - `src/connectors/__init__.py` auto-imports connector modules via `pkgutil.iter_modules(...)`. |
| 44 | + - `src/core/services/backend_registry.py` holds the registry used by connectors and validation. |
| 45 | + |
| 46 | +- **Config models already depend on the registry** |
| 47 | + - `src/core/config/models/backends.py`: |
| 48 | + - `BackendSettings.__init__()` reads `backend_registry.get_registered_backends()` to populate missing backend config stubs. |
| 49 | + - `static_route` exists on `BackendSettings`, but there is no config-level validation ensuring the backend part is valid. |
| 50 | + |
| 51 | +### Conventions & Architectural Constraints |
| 52 | + |
| 53 | +- Staged initialization: `src/core/app/application_builder.py` calls `validate_stages()` before any stage `execute()`. |
| 54 | +- Global DI bootstrapping: `src/core/di/services.py:get_service_collection()` eagerly registers all registrars (including backend) with `app_config=None`, which registers a default `AppConfig()` instance. |
| 55 | +- Resource cleanup: `ServiceCollection.dispose()` exists and awaits cleanup tasks, but `ApplicationBuilder.validate_stages()` does not dispose resources on validation failure. |
| 56 | + |
| 57 | +### Tests & Regression Pins (important constraints) |
| 58 | + |
| 59 | +- Unit tests: |
| 60 | + - `tests/unit/core/app/stages/test_backend_startup_validation.py` heavily asserts `BackendStage.validate()` and `_validate_backend_functionality()` behavior. |
| 61 | + - `tests/unit/core/app/stages/test_backend_stage_static_route_validation.py` asserts the exact error messaging of `_validate_static_route_backend()`. |
| 62 | +- Regression tests (resource leak prevention): |
| 63 | + - `tests/regression/test_backend_validation_client_leak_regression.py` |
| 64 | + - `tests/regression/test_backend_stage_cleanup_tasks_leak_regression.py` |
| 65 | + - `tests/regression/test_backend_stage_task_tracking_regression.py` |
| 66 | + |
| 67 | +## 2) Requirements Feasibility Analysis |
| 68 | + |
| 69 | +### Requirement-to-Asset Map (with Gaps Tagged) |
| 70 | + |
| 71 | +| Requirement | Existing Assets | Gap Type | Notes / Constraints | |
| 72 | +|---|---|---:|---| |
| 73 | +| R1: Initialization strategies | `BackendFactory.ensure_backend()`; connector `initialize(**kwargs)` signatures | **Missing** | Strategy registry + strategy modules do not exist; factory still contains hardcoded backend-specific branches. | |
| 74 | +| R2: Validation service | `BackendStage.validate()` and helpers | **Missing** | Needs extraction; must address stage validation running before stage execution/DI config injection. | |
| 75 | +| R3: HTTP client manager | `BackendStage._register_validation_http_client()` and `_cleanup_validation_client()`; `ServiceCollection.dispose()` | **Missing** | Manager should own lifecycle; current stage-level cleanup does not run on validation failure, and builder doesn’t dispose on validation errors. | |
| 76 | +| R4: Static route validation at config load | `BackendStage._validate_static_route_backend()`; `BackendSettings` model exists | **Missing / Constraint** | Config load already touches `backend_registry`; best insertion point may be config-model validation, but must ensure registry is populated in all entry points. | |
| 77 | +| R5: BackendStage simplification | `src/core/di/registrations/backend.py` already centralizes DI registration | **Extend** | Stage still contains validation, temporary HTTP client logic, and static route validation; must be removed/moved. | |
| 78 | +| R6: Duplication elimination | `BackendFactory.ensure_backend()` exists | **Constraint** | `_manual_backend_validation()` duplicates init shaping; removing it requires making the “real” factory path always available for validation. | |
| 79 | +| R7: Test migration | Existing unit + regression tests above | **Missing** | Significant refactor required: new service tests, stage tests become delegation-only, regression tests repoint to manager. | |
| 80 | +| R8: Interfaces + DI registration | `src/core/interfaces/*` pattern; DI registrars under `src/core/di/registrations/_backend/` | **Missing** | New Protocol interfaces and `validation.py` registrar required; backend registrar must call it. | |
| 81 | + |
| 82 | +### Key Feasibility Concerns (Design-Phase Decisions Needed) |
| 83 | + |
| 84 | +- **Validation sequencing vs DI**: `ApplicationBuilder.validate_stages()` runs before any stage `execute()`, but many backend validation paths assume `AppConfig` and `httpx.AsyncClient` are resolvable from DI. Today this is partially masked by global DI registering a default `AppConfig()`, which is not the runtime config. |
| 85 | +- **Validation resource cleanup**: if validation allocates an `httpx.AsyncClient` (directly or via DI), there is no guaranteed cleanup path when validation fails (builder doesn’t call `ServiceCollection.dispose()` during validation). |
| 86 | +- **Config-model mismatch risk**: `BackendConfig.api_key` is a `str | None` (normalized in `src/core/config/models/backends.py`), but legacy code paths still treat it like `list[str]` (e.g., stage fallback code). Any refactor that removes fallbacks reduces exposure to this mismatch. |
| 87 | + |
| 88 | +## 3) Implementation Approach Options |
| 89 | + |
| 90 | +### Option A: Extend Existing Components (minimal new concepts) |
| 91 | + |
| 92 | +- Extend `src/core/services/backend_factory.py` to support strategy delegation while keeping the public API (`ensure_backend()`) stable. |
| 93 | +- Gradually move validation out of `BackendStage` into a service but keep the stage-owned validation client until the end. |
| 94 | + |
| 95 | +Trade-offs: |
| 96 | +- ✅ Lower churn in wiring/bootstrapping initially |
| 97 | +- ✅ Minimizes new files early |
| 98 | +- ❌ Risk of keeping `BackendStage` bloated longer |
| 99 | +- ❌ Harder to make unit tests target the “right” abstraction until late |
| 100 | + |
| 101 | +### Option B: Create New Components (clean separation early) |
| 102 | + |
| 103 | +- Add: |
| 104 | + - `src/core/services/backend_validation_service.py` (+ `IBackendValidator`) |
| 105 | + - `src/core/services/validation_http_client_manager.py` (+ `IHttpClientManager`) |
| 106 | + - `src/connectors/strategies/` package (registry + per-backend strategies) |
| 107 | + - `src/core/di/registrations/_backend/validation.py` |
| 108 | +- Refactor `BackendStage` quickly to “import connectors + call backend registrar + delegate validate/cleanup”. |
| 109 | + |
| 110 | +Trade-offs: |
| 111 | +- ✅ Strong SRP boundaries and testability early |
| 112 | +- ✅ Easier to isolate leak-prone lifecycle logic behind a single manager |
| 113 | +- ❌ More integration points to get right up-front (DI + stage validation order) |
| 114 | +- ❌ Requires earlier test migration (unit + regression) |
| 115 | + |
| 116 | +### Option C: Hybrid (strangler fig with explicit validation-lifecycle plan) |
| 117 | + |
| 118 | +- Build the new components (as in Option B), but introduce them behind compatibility adapters: |
| 119 | + - Keep existing `BackendStage.validate()` behavior initially, but internally delegate most logic to `BackendValidationService` (instantiated directly), then later switch to DI resolution once config injection/ordering is solved. |
| 120 | + - Keep existing regression tests passing by providing a compatibility façade for temporary-client creation/cleanup, then repoint tests after behavior is stable. |
| 121 | + |
| 122 | +Trade-offs: |
| 123 | +- ✅ Reduces “big bang” risk and allows incremental verification |
| 124 | +- ✅ Keeps rollout flexible if stage-validation ordering needs adjustment |
| 125 | +- ❌ Requires disciplined migration plan to avoid “two sources of truth” lingering |
| 126 | + |
| 127 | +## 4) Implementation Complexity & Risk |
| 128 | + |
| 129 | +- **Effort**: **M (3–7 days)** — broad surface area (stage, factory, DI registration, connectors strategies, tests + regressions), but clear extraction boundaries exist. |
| 130 | +- **Risk**: **Medium–High** — startup critical path + leak regression sensitivity + validation/DI ordering concerns. |
| 131 | + |
| 132 | +## 5) Research Needed (carry into design phase) |
| 133 | + |
| 134 | +- **Stage validation contract**: confirm whether `validate_stages()` is intended to be “pure/no allocation” or allowed to allocate resources; if allocation is allowed, determine required cleanup hooks. |
| 135 | +- **Config injection**: decide how runtime `AppConfig` becomes authoritative in DI *before* validation (builder change vs “service uses passed config”). |
| 136 | +- **Strategy discovery/registration**: decide how `src/connectors/strategies/` is imported so new strategies register without touching `BackendStage`/`BackendFactory` (once the system is in place). |
| 137 | +- **Environment/API key precedence**: unify how “configured backend” detection works (config vs `*_API_KEY` env vars vs numbered keys) so validation does not drift from runtime behavior. |
| 138 | +- **Static route validation placement**: determine whether to implement in config models (`BackendSettings` validator) vs a dedicated config validator invoked by `ApplicationBuilder.build()` pre-stages. |
| 139 | + |
| 140 | +## Next Steps (Design Phase) |
| 141 | + |
| 142 | +- Use this gap analysis to author a design that explicitly answers: |
| 143 | + - validation sequencing + cleanup strategy |
| 144 | + - where strategies live and how they register |
| 145 | + - where static route validation runs (and how it sees registered backends) |
| 146 | +- Then run `/prompts:kiro-spec-design backend-stage-solid-refactoring` (or `-y` to auto-approve requirements first). |
0 commit comments