Skip to content

refactor(BA-4005): migrate direct access pattern of group_config to directory pattern (phase 2)#8207

Draft
kwonkwonn wants to merge 3 commits intomainfrom
refactor/migrate-groupconfig-torepository-pattern-phase2
Draft

refactor(BA-4005): migrate direct access pattern of group_config to directory pattern (phase 2)#8207
kwonkwonn wants to merge 3 commits intomainfrom
refactor/migrate-groupconfig-torepository-pattern-phase2

Conversation

@kwonkwonn
Copy link
Contributor

resolves #8061 (BA-4005)

For better readability and PR's functionality based division, This PR address APi handler modification and tests for it.

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue
  • Installer updates including:
    • Fixtures for db schema changes
    • New mandatory config options
  • Update of end-to-end CLI integration tests in ai.backend.test
  • API server-client counterparts (e.g., manager API -> client SDK)
  • Test case(s) to:
    • Demonstrate the difference of before/after
    • Demonstrate the flow of abstract/conceptual models with a concrete implementation
  • Documentation
    • Contents in the docs directory
    • docstrings in public interfaces and type annotations

Copilot AI review requested due to automatic review settings January 22, 2026 04:22
@github-actions github-actions bot added size:XL 500~ LoC Intern-OKR comp:manager Related to Manager component labels Jan 22, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request refactors the groupconfig API to use the repository/service/processor pattern architecture, which is part of Phase 2 of BA-4005 for better code organization and maintainability in the dotfiles management feature.

Changes:

  • Introduced GroupConfigRepository with database operations for group configuration (dotfiles) management
  • Created GroupConfigService layer with authorization logic and business rules
  • Added GroupConfigProcessors to integrate with the action processing framework
  • Refactored API handlers to delegate to the new service/processor layers instead of direct database access
  • Added comprehensive unit tests for repository, service, and API handler layers

Reviewed changes

Copilot reviewed 22 out of 29 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
src/ai/backend/manager/repositories/group_config/repository.py New repository layer handling dotfile CRUD operations with validation logic
src/ai/backend/manager/repositories/group_config/db_source/db_source.py Database source for group config queries and updates
src/ai/backend/manager/repositories/group_config/types.py Type definitions for group configuration operations
src/ai/backend/manager/repositories/group_config/repositories.py Repository factory class
src/ai/backend/manager/services/group_config/service.py Service layer with authorization and business logic
src/ai/backend/manager/services/group_config/processors.py Processor package for action handling
src/ai/backend/manager/services/group_config/actions/*.py Action classes for dotfile operations
src/ai/backend/manager/api/groupconfig.py Refactored handlers to use processor pattern
src/ai/backend/manager/services/processors.py Integration of group_config processors
src/ai/backend/manager/repositories/repositories.py Integration of group_config repositories
tests/unit/manager/services/group_config/test_service.py Unit tests for service layer (contains critical bugs)
tests/unit/manager/repositories/group_config/test_group_config_repository.py Unit tests for repository layer
tests/unit/manager/api/groupconfig/test_handlers.py Unit tests for API handlers
changes/7956.enhance.md Changelog entry

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +167 to +183
@group_config_repository_resilience.apply()
async def remove_dotfile(self, group_id: uuid.UUID, path: str) -> None:
"""
Remove a dotfile from the group.

Raises:
ProjectNotFound: If group is not found
DotfileNotFound: If dotfile with the path does not exist
"""
result = await self._db_source.get_dotfiles(group_id)

new_dotfiles = [x for x in result.dotfiles if x["path"] != path]
if len(new_dotfiles) == len(result.dotfiles):
raise DotfileNotFound

dotfile_packed = msgpack.packb(new_dotfiles)
await self._db_source.update_dotfiles(group_id, dotfile_packed)
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method has a race condition. It performs a read-modify-write operation where get_dotfiles and update_dotfiles are executed in separate database transactions. Concurrent deletions could lead to data inconsistencies. The entire operation should be wrapped in a single database transaction with appropriate locking (e.g., SELECT FOR UPDATE) or use optimistic concurrency control.

Copilot uses AI. Check for mistakes.
Comment on lines +124 to +125
duplicate = [x for x in result.dotfiles if x["path"] == dotfile.path]
if len(duplicate) > 0:
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicate check can be simplified using any() for better readability and performance. Consider replacing with: if any(x["path"] == dotfile.path for x in result.dotfiles):

Suggested change
duplicate = [x for x in result.dotfiles if x["path"] == dotfile.path]
if len(duplicate) > 0:
if any(x["path"] == dotfile.path for x in result.dotfiles):

Copilot uses AI. Check for mistakes.
Comment on lines +370 to +428
class TestDeleteDotfile:
@pytest.mark.asyncio
async def test_delete_dotfile_success(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_admin.return_value = (group_id, domain_name)
existing_dotfiles = [{"path": ".bashrc", "perm": "644", "data": "# test"}]
mock_repo.get_dotfiles.return_value = (existing_dotfiles, 64000)
action = DeleteDotfileAction(
group_id_or_name=group_id, domain_name=domain_name, path=".bashrc"
)

result = await service.delete_dotfile(action)

assert result.success is True
mock_repo.update_dotfiles.assert_called_once()

@pytest.mark.asyncio
async def test_delete_dotfile_not_found(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_admin.return_value = (group_id, domain_name)
existing_dotfiles = [{"path": ".bashrc", "perm": "644", "data": "# test"}]
mock_repo.get_dotfiles.return_value = (existing_dotfiles, 64000)
action = DeleteDotfileAction(
group_id_or_name=group_id, domain_name=domain_name, path=".zshrc"
)

with pytest.raises(DotfileNotFound):
await service.delete_dotfile(action)

@pytest.mark.asyncio
async def test_delete_dotfile_group_not_found(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_admin.return_value = (group_id, domain_name)
mock_repo.get_dotfiles.side_effect = ProjectNotFound
action = DeleteDotfileAction(
group_id_or_name=group_id, domain_name=domain_name, path=".bashrc"
)

# delete_dotfile converts ProjectNotFound to DotfileNotFound
with pytest.raises(DotfileNotFound):
await service.delete_dotfile(action)
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All tests in this class mock mock_repo.resolve_group_for_admin, but this method doesn't exist in the GroupConfigRepository interface. Tests should mock the actual repository methods: resolve_group_id, get_group_domain (when not superadmin), and remove_dotfile.

Copilot uses AI. Check for mistakes.
Comment on lines +94 to +210
@pytest.mark.asyncio
async def test_create_dotfile_no_leftover_space(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_admin.return_value = (group_id, domain_name)
mock_repo.get_dotfiles.return_value = ([], 0)
action = CreateDotfileAction(
group_id_or_name=group_id,
domain_name=domain_name,
path=".bashrc",
data="# test",
permission="644",
)

with pytest.raises(DotfileCreationFailed, match="No leftover space"):
await service.create_dotfile(action)

@pytest.mark.asyncio
async def test_create_dotfile_limit_reached(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_admin.return_value = (group_id, domain_name)
# Create 100 existing dotfiles
existing_dotfiles = [
{"path": f".file{i}", "perm": "644", "data": "# test"} for i in range(100)
]
mock_repo.get_dotfiles.return_value = (existing_dotfiles, 64000)
action = CreateDotfileAction(
group_id_or_name=group_id,
domain_name=domain_name,
path=".bashrc",
data="# test",
permission="644",
)

with pytest.raises(DotfileCreationFailed, match="limit reached"):
await service.create_dotfile(action)

@pytest.mark.asyncio
async def test_create_dotfile_reserved_path(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_admin.return_value = (group_id, domain_name)
mock_repo.get_dotfiles.return_value = ([], 64000)
action = CreateDotfileAction(
group_id_or_name=group_id,
domain_name=domain_name,
path=".ssh/authorized_keys", # Reserved path
data="# test",
permission="644",
)

with pytest.raises(InvalidAPIParameters, match="reserved"):
await service.create_dotfile(action)

@pytest.mark.asyncio
async def test_create_dotfile_duplicate_path(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_admin.return_value = (group_id, domain_name)
existing_dotfiles = [{"path": ".bashrc", "perm": "644", "data": "# existing"}]
mock_repo.get_dotfiles.return_value = (existing_dotfiles, 64000)
action = CreateDotfileAction(
group_id_or_name=group_id,
domain_name=domain_name,
path=".bashrc",
data="# new",
permission="644",
)

with pytest.raises(DotfileAlreadyExists):
await service.create_dotfile(action)

@pytest.mark.asyncio
async def test_create_dotfile_exceeds_max_size(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_admin.return_value = (group_id, domain_name)
mock_repo.get_dotfiles.return_value = ([], 64000)
# Create large data that exceeds maximum size
large_data = "x" * 70000
action = CreateDotfileAction(
group_id_or_name=group_id,
domain_name=domain_name,
path=".bashrc",
data=large_data,
permission="644",
)

with pytest.raises(DotfileCreationFailed, match="No leftover space"):
await service.create_dotfile(action)

Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All tests in this class mock mock_repo.resolve_group_for_admin, but this method doesn't exist in the GroupConfigRepository interface. Tests should mock the actual repository methods: resolve_group_id, get_group_domain (when not superadmin), and add_dotfile.

Copilot uses AI. Check for mistakes.
Comment on lines +140 to +165
@group_config_repository_resilience.apply()
async def modify_dotfile(self, group_id: uuid.UUID, dotfile: DotfileInput) -> None:
"""
Update an existing dotfile in the group.

Raises:
ProjectNotFound: If group is not found
DotfileNotFound: If dotfile with the path does not exist
DotfileCreationFailed: If updated content exceeds size limit
"""
result = await self._db_source.get_dotfiles(group_id)

new_dotfiles = [x for x in result.dotfiles if x["path"] != dotfile.path]
if len(new_dotfiles) == len(result.dotfiles):
raise DotfileNotFound

new_dotfiles.append({
"path": dotfile.path,
"perm": dotfile.permission,
"data": dotfile.data,
})
dotfile_packed = msgpack.packb(new_dotfiles)
if len(dotfile_packed) > MAXIMUM_DOTFILE_SIZE:
raise DotfileCreationFailed("No leftover space for dotfile storage")

await self._db_source.update_dotfiles(group_id, dotfile_packed)
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method has a race condition. It performs a read-modify-write operation where get_dotfiles and update_dotfiles are executed in separate database transactions. Concurrent updates could lead to lost modifications. The entire operation should be wrapped in a single database transaction with appropriate locking (e.g., SELECT FOR UPDATE) or use optimistic concurrency control.

Copilot uses AI. Check for mistakes.
Comment on lines +78 to +79
mock_repo.resolve_group_for_admin.return_value = (group_id, domain_name)
mock_repo.get_dotfiles.return_value = ([], 64000)
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is mocking mock_repo.resolve_group_for_admin and mock_repo.get_dotfiles, but the actual service implementation calls self._group_config_repository.resolve_group_id(), self._group_config_repository.get_group_domain(), and self._group_config_repository.add_dotfile(). The mocked methods don't match the actual repository interface, which means these tests are not properly testing the service behavior. The tests should mock resolve_group_id, get_group_domain (when user is not superadmin), and add_dotfile instead.

Copilot uses AI. Check for mistakes.
Comment on lines +212 to +251
class TestListDotfiles:
@pytest.mark.asyncio
async def test_list_dotfiles_success(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_user.return_value = (group_id, domain_name)
expected_dotfiles = [
{"path": ".bashrc", "perm": "644", "data": "# bash"},
{"path": ".zshrc", "perm": "644", "data": "# zsh"},
]
mock_repo.get_dotfiles.return_value = (expected_dotfiles, 64000)
action = ListDotfilesAction(group_id_or_name=group_id, domain_name=domain_name)

result = await service.list_dotfiles(action)

assert result.dotfiles == expected_dotfiles
mock_repo.get_dotfiles.assert_called_once_with(group_id)

@pytest.mark.asyncio
async def test_list_dotfiles_empty(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_user.return_value = (group_id, domain_name)
mock_repo.get_dotfiles.return_value = ([], 64000)
action = ListDotfilesAction(group_id_or_name=group_id, domain_name=domain_name)

result = await service.list_dotfiles(action)

assert result.dotfiles == []

Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All tests in this class mock mock_repo.resolve_group_for_user, but the actual repository doesn't have this method. The service layer's _resolve_group_for_user is a private method that calls resolve_group_id, get_group_domain, and check_user_in_group on the repository. Tests should mock these actual repository methods instead.

Copilot uses AI. Check for mistakes.
Comment on lines +253 to +294
class TestGetDotfile:
@pytest.mark.asyncio
async def test_get_dotfile_success(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_user.return_value = (group_id, domain_name)
existing_dotfiles = [
{"path": ".bashrc", "perm": "644", "data": "# bash"},
{"path": ".zshrc", "perm": "644", "data": "# zsh"},
]
mock_repo.get_dotfiles.return_value = (existing_dotfiles, 64000)
action = GetDotfileAction(
group_id_or_name=group_id, domain_name=domain_name, path=".bashrc"
)

result = await service.get_dotfile(action)

assert result.dotfile["path"] == ".bashrc"
assert result.dotfile["data"] == "# bash"

@pytest.mark.asyncio
async def test_get_dotfile_not_found(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_user.return_value = (group_id, domain_name)
existing_dotfiles = [{"path": ".bashrc", "perm": "644", "data": "# bash"}]
mock_repo.get_dotfiles.return_value = (existing_dotfiles, 64000)
action = GetDotfileAction(group_id_or_name=group_id, domain_name=domain_name, path=".zshrc")

with pytest.raises(DotfileNotFound):
await service.get_dotfile(action)

Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All tests in this class mock mock_repo.resolve_group_for_user, but this method doesn't exist in the GroupConfigRepository interface. Tests should mock the actual repository methods: resolve_group_id, get_group_domain, and check_user_in_group.

Copilot uses AI. Check for mistakes.
Comment on lines +296 to +368
class TestUpdateDotfile:
@pytest.mark.asyncio
async def test_update_dotfile_success(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_admin.return_value = (group_id, domain_name)
existing_dotfiles = [{"path": ".bashrc", "perm": "644", "data": "# old"}]
mock_repo.get_dotfiles.return_value = (existing_dotfiles, 64000)
action = UpdateDotfileAction(
group_id_or_name=group_id,
domain_name=domain_name,
path=".bashrc",
data="# new",
permission="755",
)

result = await service.update_dotfile(action)

assert result.group_id == group_id
mock_repo.update_dotfiles.assert_called_once()

@pytest.mark.asyncio
async def test_update_dotfile_not_found(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_admin.return_value = (group_id, domain_name)
existing_dotfiles = [{"path": ".bashrc", "perm": "644", "data": "# old"}]
mock_repo.get_dotfiles.return_value = (existing_dotfiles, 64000)
action = UpdateDotfileAction(
group_id_or_name=group_id,
domain_name=domain_name,
path=".zshrc",
data="# new",
permission="755",
)

with pytest.raises(DotfileNotFound):
await service.update_dotfile(action)

@pytest.mark.asyncio
async def test_update_dotfile_exceeds_max_size(
self,
service: GroupConfigService,
mock_repo: AsyncMock,
group_id: uuid.UUID,
domain_name: str,
user_context: UserData,
) -> None:
mock_repo.resolve_group_for_admin.return_value = (group_id, domain_name)
existing_dotfiles = [{"path": ".bashrc", "perm": "644", "data": "# old"}]
mock_repo.get_dotfiles.return_value = (existing_dotfiles, 64000)
large_data = "x" * 70000
action = UpdateDotfileAction(
group_id_or_name=group_id,
domain_name=domain_name,
path=".bashrc",
data=large_data,
permission="644",
)

with pytest.raises(DotfileCreationFailed, match="No leftover space"):
await service.update_dotfile(action)

Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All tests in this class mock mock_repo.resolve_group_for_admin, but this method doesn't exist in the GroupConfigRepository interface. Tests should mock the actual repository methods: resolve_group_id, get_group_domain (when not superadmin), and modify_dotfile.

Copilot uses AI. Check for mistakes.
Comment on lines +107 to +138
@group_config_repository_resilience.apply()
async def add_dotfile(self, group_id: uuid.UUID, dotfile: DotfileInput) -> None:
"""
Add a new dotfile to the group.

Raises:
ProjectNotFound: If group is not found
DotfileAlreadyExists: If dotfile with same path already exists
DotfileCreationFailed: If no space left or limit reached
"""
result = await self._db_source.get_dotfiles(group_id)

if result.leftover_space == 0:
raise DotfileCreationFailed("No leftover space for dotfile storage")
if len(result.dotfiles) >= 100:
raise DotfileCreationFailed("Dotfile creation limit reached")

duplicate = [x for x in result.dotfiles if x["path"] == dotfile.path]
if len(duplicate) > 0:
raise DotfileAlreadyExists

new_dotfiles = list(result.dotfiles)
new_dotfiles.append({
"path": dotfile.path,
"perm": dotfile.permission,
"data": dotfile.data,
})
dotfile_packed = msgpack.packb(new_dotfiles)
if len(dotfile_packed) > MAXIMUM_DOTFILE_SIZE:
raise DotfileCreationFailed("No leftover space for dotfile storage")

await self._db_source.update_dotfiles(group_id, dotfile_packed)
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method has a race condition. It performs a read-modify-write operation where get_dotfiles and update_dotfiles are executed in separate database transactions. If two concurrent requests try to add dotfiles simultaneously, they could both read the same state, modify it independently, and the second write would overwrite the first, causing data loss. The entire operation should be wrapped in a single database transaction with appropriate locking (e.g., SELECT FOR UPDATE) or use optimistic concurrency control.

Copilot uses AI. Check for mistakes.
kwonkwonn and others added 3 commits January 22, 2026 13:31
- Add GroupConfigRepository with db_source pattern
- Implement resolve_group_id, get_group_domain, get_dotfiles, update_dotfiles
- Add comprehensive repository tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add GroupConfigService with CRUD actions for dotfiles
- Implement processors for list, get, create, update, delete dotfiles
- Add comprehensive service and processor tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace direct database access with GroupConfigService calls
- Simplify handlers by delegating to processors
- Add comprehensive handler unit tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@kwonkwonn kwonkwonn force-pushed the refactor/migrate-groupconfig-torepository-pattern-phase2 branch from 26a8ea8 to d5f87c5 Compare January 22, 2026 04:38
@kwonkwonn kwonkwonn changed the title refactor(BA-4005): add GroupConfigRepository for dotfiles management (Phase 2) refactor(BA-4005): add GroupConfig Repository for dotfiles management (Phase 2) Jan 22, 2026
@kwonkwonn kwonkwonn changed the title refactor(BA-4005): add GroupConfig Repository for dotfiles management (Phase 2) refactor(BA-4005): migrate direct access pattern of group_config to directory pattern (phase 2) Jan 22, 2026
@kwonkwonn kwonkwonn marked this pull request as draft January 22, 2026 05:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp:manager Related to Manager component Intern-OKR size:XL 500~ LoC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

migrate groupconfig to use processors pattern (phase 1)

1 participant