(apigw) update acl role caching key to include namespaces #5140
Open
sujay-hashicorp wants to merge 3 commits into main
A customer observed intermittent API Gateway outages (`Permission denied: lacks service:write`) when gateways with the same name were deployed in different Kubernetes namespaces.
The investigation found cross-namespace ACL resource collisions during reconcile/cleanup, where role/policy/binding-rule state for one gateway could affect another and remove required permissions.
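The collision can be illustrated with a minimal sketch. It is not the code in this PR: `gatewayKey`, `roleCache`, and their methods are hypothetical stand-ins, and only the two name patterns (`managed-gateway-acl-role-<gateway>-<namespace>` and `api-gateway-policy-for-<gateway>-<namespace>`) come from this description. The sketch assumes the cache is keyed by gateway name plus namespace, so removing one gateway's entry cannot touch a same-named gateway in another namespace.

```go
package main

import "fmt"

// gatewayKey is a hypothetical composite cache key (gateway name + namespace).
// The real key type and format used in the PR may differ.
type gatewayKey struct {
	Name      string
	Namespace string
}

// roleName follows the role naming pattern quoted in this PR description.
func roleName(k gatewayKey) string {
	return fmt.Sprintf("managed-gateway-acl-role-%s-%s", k.Name, k.Namespace)
}

// policyName follows the policy naming pattern quoted in this PR description.
func policyName(k gatewayKey) string {
	return fmt.Sprintf("api-gateway-policy-for-%s-%s", k.Name, k.Namespace)
}

// roleCache is a simplified stand-in for the gateway ACL role cache.
type roleCache struct {
	roles map[gatewayKey]string
}

func newRoleCache() *roleCache {
	return &roleCache{roles: make(map[gatewayKey]string)}
}

// add records the ACL role name for one gateway instance.
func (c *roleCache) add(k gatewayKey) {
	c.roles[k] = roleName(k)
}

// remove deletes only the entry for this exact gateway instance and
// reports whether an entry existed.
func (c *roleCache) remove(k gatewayKey) bool {
	if _, ok := c.roles[k]; !ok {
		return false
	}
	delete(c.roles, k)
	return true
}

func main() {
	inConsul := gatewayKey{Name: "api-gateway-test", Namespace: "consul"}
	inTest := gatewayKey{Name: "api-gateway-test", Namespace: "test"}

	cache := newRoleCache()
	cache.add(inConsul)
	cache.add(inTest)

	// Cleaning up the gateway in "test" leaves the one in "consul" intact.
	cache.remove(inTest)

	fmt.Println(cache.roles[inConsul])
	fmt.Println(policyName(inConsul))
}
```

With a key made of the gateway name alone, both gateways would share one map entry, so the second `add` would overwrite the first and a single `remove` would drop the role for both.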
Changes proposed in this PR
- Key the ACL role cache by `gatewayName + namespace` instead of only `gatewayName`.
- Include the namespace in the ACL resource names (`managed-gateway-acl-role-<gateway>-<namespace>` and `api-gateway-policy-for-<gateway>-<namespace>`), and use the same key during cleanup so `RemoveRoleBinding` only deletes resources for the correct gateway instance.
- Update `TestCache_RemoveRoleBinding` to seed/read cache entries using the new namespaced key.
Before Fix
After Fix
How I've tested this PR
- `cd control-plane && go test ./api-gateway/cache -run TestCache_RemoveRoleBinding -count=1`
- `cd control-plane && go test ./api-gateway/cache -count=1`
Test outputs for this branch build
initial token list
post test ns gateway apply token list
How I expect reviewers to test this PR
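- Run `cd control-plane && go test ./api-gateway/cache -count=1`.
- Deploy a gateway named `api-gateway-test` in the `consul` namespace.
- Deploy a gateway with the same name in the `test` namespace.
- Verify that neither gateway loses `service:write` after reconcile events (e.g., route updates / cert rotation).
Checklist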
PCI review checklist
I have documented a clear reason for, and description of, the change I am making.
If applicable, I've documented a plan to revert these changes if they require more than reverting the pull request.
Revert plan: revert this PR to restore previous keying/naming behavior.
If applicable, I've documented the impact of any changes to security controls.
Impact: no new security controls were added or removed; this change only scopes ACL resource identity to prevent cross-namespace collisions.