Content:
AAO now supports multiple account pools, any of which can be defined to have a specific set of AWS Service Quotas on a per Region Basis. Given we only request service quota changes for Accounts in an AccountPool, Service Quotas requests are currently only for non-CCS Accounts. These service quotas are set in the AAO ConfigMap under the accountpool key, example:
apiVersion: v1
kind: ConfigMap
metadata:
name: aws-account-operator-configmap
namespace: aws-account-operator
data:
account-limit: "1234"
accountpool: |
hivei01ue1:
default: true
fm-accountpool:
servicequotas:
default:
L-1216C47A: '750'
L-0EA8095F: '200'
L-69A177A2: '255'
L-0263D0A3: '6'
us-east-1:
L-1216C47A: '760'In the above example, the default AccountPool is unaffected by service quota requests, whereas each account created in the fm-accountpool will have the default set of service quotas applied to all regions supported by it. For us-east-1 specifically, it will set L-1216C47A to 760. Accounts created in the fm-accountpool will have the servicequotas struct we see in the ConfigMap added directly into the Account CR spec - here is a link to the code where that happens.
Here is a simplified example of what the newly created AccountCR will look like at this point:
kind: Account
metadata:
creationTimestamp: "2023-03-21T18:11:04Z"
finalizers:
- finalizer.aws.managed.openshift.io
name: osd-creds-mgmt-lsw9hb
namespace: aws-account-operator
ownerReferences:
- apiVersion: aws.managed.openshift.io/v1alpha1
blockOwnerDeletion: true
controller: true
kind: AccountPool
name: fm-accountpool
spec:
accountPool: fm-accountpool
awsAccountID: ""
regionalServiceQuotas:
default:
L-0263D0A3:
value: 6
L-0EA8095F:
value: 200
L-69A177A2:
value: 255
L-1216C47A:
value: 750
us-east-1:
L-1216C47A: 760Our new Account CR should reconcile as normal, once it reaches the PendingVerification State, that is when we make the Service Quota requests. PendingVerification encapsulates 2 sets of requests to AWS:
- Creating a Support Case with AWS to enable Enterprise Support.
- Make any Service Quota requests as defined in the AccountCR spec.regionalServiceQuotas
To track the state of our service quota requests between reconciles, we load the service quotas in the Account CR spec into the Account CR status. To make life a bit easier tracking state, we build out each region in the .status.regionalServiceQuotas and keep a status field too, link to code and example:
status:
regionalServiceQuotas:
ap-northeast-1:
L-0263D0A3:
status: COMPLETED
value: 6
L-0EA8095F:
status: IN_PROGRESS
value: 200
L-69A177A2:
status: COMPLETED
value: 255
L-1216C47A:
status: COMPLETED
value: 750
ap-northeast-2:
L-0EA8095F:
status: TODO
value: 200
L-69A177A2:
status: TODO
value: 255
L-0263D0A3:
status: TODO
value: 6
L-1216C47A:
status: TODO
value: 750Begin by updating the AAO ConfigMap and then restarting the aws-account-operator pod. Upon the update, the accountpool_validation_controller.go will parse the modified AAO ConfigMap using the GetServiceQuotasFromAccountPool function: [https://github.com/openshift/aws-account-operator/blob/4af362e874f99454c7b69166588c4536728bbfc5/pkg/utils/utils.go#L116-L165]
Subsequently, within the accountpool_validation_controller.go, all accounts associated with an accountpool will be iterated over. It will inspect the account.Spec.RegionalServiceQuotas for any discrepancies compared to the ConfigMap, updating the spec accordingly: [https://github.com/openshift/aws-account-operator/blob/91d7d76294ebd2d9e6da512b0ed2fd7d7cf759b2/controllers/validation/accountpool_validation_controller.go#L126-L182]
The updated account spec will then trigger the account_validation_controller.go, where the handling of service quota requests will occur link to code
In account_types.go you add the quota code if it's not already supported. Below is a list of supported service quotas with their respective codes: [https://github.com/openshift/aws-account-operator/blob/d0c927b56353a0a754253e8b950848315ef595b0/api/v1alpha1/account_types.go#L67-L74]. Additionally, in the same file, you must also define the service code for the supported service quota services: [https://github.com/openshift/aws-account-operator/blob/d0c927b56353a0a754253e8b950848315ef595b0/api/v1alpha1/account_types.go#L78-L82].
In the getServiceCode function, extend the servicesMap by adding additional mappings for the new quota codes you've introduced. Ensure that each quotaCode is associated with the correct service code, enabling the function to provide accurate service code lookups for the supported service quotas: [https://github.com/openshift/aws-account-operator/blob/d0c927b56353a0a754253e8b950848315ef595b0/controllers/account/service_quota.go#L96-L106].
AWS has a maximum limit of 20 Service Quota requests in flight per account at a given time, if you surpass this threshold there is a chance they will blanket deny all in-flight and subsequent requests, which forces us to put the account into a failed state. So avoid this, we batch our requests and apply a limit of 20 in-flight requests - link to code. To aid in quickly getting the different service quotas by their current state, we wrote (this helper function)[https://github.com/openshift/aws-account-operator/blob/7eaa90bb66060cc046a4e37e4f3052ed8a234395/api/v1alpha1/account_types.go#L228-L249]. Once we have our batch of service quotas to request, we need to assume role into the correct region and make the request to AWS. All of the logic around the actual Service Quota request is handled (here in the HandleServiceQuotaRequests function)[https://github.com/openshift/aws-account-operator/blob/7eaa90bb66060cc046a4e37e4f3052ed8a234395/controllers/account/service_quota.go#L19-L91]. In HandleServiceQuotaRequests, we first determine if the service quota request is even needed, if not, we don't want to overload AWS. If it is needed, we check to see if we've already requested this service quota in the past, depending on what AWS returns we update accordingly.
Once all Service Quotas have moved into a COMPLETED state, it is only then we will finally set the Account CR to a Ready state.