fix(BA-4282): Remove kernel row when purge scaling group preventing fk error#8614
fix(BA-4282): Remove kernel row when purge scaling group preventing fk error#8614seedspirit wants to merge 2 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This pull request fixes a foreign key constraint violation (IntegrityError) that occurs when purging a scaling group that has sessions with associated kernels. The issue was that the cascade deletion sequence was attempting to delete sessions before deleting their associated kernels, which violates the fk_kernels_session_id_sessions foreign key constraint (which uses the default RESTRICT behavior).
Changes:
- Added kernel deletion step in the cascade deletion sequence for
purge_scaling_group - Added comprehensive test to verify purging scaling groups with the full FK hierarchy (scaling group → session → kernel + endpoint → route)
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/ai/backend/manager/repositories/scaling_group/db_source/db_source.py |
Added deletion of KernelRow records between endpoint and session deletion to prevent FK constraint violations |
tests/unit/manager/repositories/scaling_group/test_scaling_group_repository.py |
Added test fixtures and test case to verify purging works correctly with the full entity hierarchy including kernels |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| async with db_with_cleanup.begin_readonly_session() as db_sess: | ||
| sg_result = await db_sess.execute( | ||
| sa.select(ScalingGroupRow).where(ScalingGroupRow.name == sgroup_name) | ||
| ) | ||
| assert sg_result.scalar_one_or_none() is None |
There was a problem hiding this comment.
The test only verifies that the scaling group is deleted, but doesn't explicitly verify that all child records (sessions, kernels, endpoints, routes) are also deleted. While the deletion of the scaling group indirectly proves these were deleted (otherwise there would be FK constraint violations), it would be more comprehensive to explicitly verify that each child record type is deleted. Consider adding assertions to check that no sessions, kernels, endpoints, or routes remain after the purge operation.
| """Purges a scaling group and all related sessions and routes using a purger. | ||
|
|
There was a problem hiding this comment.
The cascade delete order documentation was removed from the docstring. While the implementation steps are now documented inline with comments, it would be helpful to keep a high-level mention of the deletion order in the docstring since this is critical behavior that prevents FK constraint violations. Consider adding back a brief note like "Deletes in order: routes, endpoints, kernels, sessions, then scaling group" to help future developers understand the method's behavior at a glance.
| """Purges a scaling group and all related sessions and routes using a purger. | |
| """Purges a scaling group and all related sessions, routes, endpoints, and kernels. | |
| Deletes in order: routes, endpoints, kernels, sessions, then scaling group. |
resolves #8613 (BA-4282)
Checklist: (if applicable)
ai.backend.testdocsdirectory