A Serverless plugin to implement canary deployments of Lambda functions, making use of the traffic shifting feature in combination with AWS CodeDeploy.
npm i --save-dev @flagsmith/serverless-plugin-canary-deployments
This plugin creates and manages several AWS resources. The IAM user or role deploying your Serverless service needs the following permissions:
cloudformation:CreateStack
cloudformation:UpdateStack
cloudformation:DeleteStack
cloudformation:DescribeStacks
cloudformation:DescribeStackEvents
cloudformation:DescribeStackResource
cloudformation:GetTemplate
cloudformation:ValidateTemplate
codedeploy:CreateApplication
codedeploy:DeleteApplication
codedeploy:GetApplication
codedeploy:CreateDeploymentGroup
codedeploy:DeleteDeploymentGroup
codedeploy:UpdateDeploymentGroup
codedeploy:GetDeploymentGroup
codedeploy:CreateDeployment
codedeploy:GetDeployment
codedeploy:StopDeployment
lambda:CreateAlias
lambda:DeleteAlias
lambda:UpdateAlias
lambda:GetAlias
lambda:GetFunction
lambda:GetFunctionConfiguration
lambda:PublishVersion
lambda:ListVersionsByFunction
lambda:UpdateFunctionCode
lambda:UpdateFunctionConfiguration
lambda:AddPermission
lambda:RemovePermission
iam:CreateRole
iam:DeleteRole
iam:GetRole
iam:PassRole
iam:AttachRolePolicy
iam:DetachRolePolicy
iam:PutRolePolicy
iam:DeleteRolePolicy
cloudwatch:PutMetricAlarm
cloudwatch:DeleteAlarms
cloudwatch:DescribeAlarms
cloudwatch:PutCompositeAlarm
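As a rough sketch, the actions listed above could be assembled into an IAM policy document with a few lines of Node.js. The function name `buildDeployerPolicy` and the wildcard `Resource` are assumptions made for illustration only; in practice you would probably scope `Resource` to your own stacks, functions and roles.

```javascript
// Actions grouped by service prefix, copied from the list above.
const ACTIONS = {
  cloudformation: ['CreateStack', 'UpdateStack', 'DeleteStack', 'DescribeStacks',
    'DescribeStackEvents', 'DescribeStackResource', 'GetTemplate', 'ValidateTemplate'],
  codedeploy: ['CreateApplication', 'DeleteApplication', 'GetApplication',
    'CreateDeploymentGroup', 'DeleteDeploymentGroup', 'UpdateDeploymentGroup',
    'GetDeploymentGroup', 'CreateDeployment', 'GetDeployment', 'StopDeployment'],
  lambda: ['CreateAlias', 'DeleteAlias', 'UpdateAlias', 'GetAlias', 'GetFunction',
    'GetFunctionConfiguration', 'PublishVersion', 'ListVersionsByFunction',
    'UpdateFunctionCode', 'UpdateFunctionConfiguration', 'AddPermission', 'RemovePermission'],
  iam: ['CreateRole', 'DeleteRole', 'GetRole', 'PassRole', 'AttachRolePolicy',
    'DetachRolePolicy', 'PutRolePolicy', 'DeleteRolePolicy'],
  cloudwatch: ['PutMetricAlarm', 'DeleteAlarms', 'DescribeAlarms', 'PutCompositeAlarm'],
};

// Builds one Allow statement per service.
// ASSUMPTION: Resource defaults to '*' for brevity; narrow it for real use.
function buildDeployerPolicy(resource = '*') {
  return {
    Version: '2012-10-17',
    Statement: Object.entries(ACTIONS).map(([service, names]) => ({
      Effect: 'Allow',
      Action: names.map((name) => `${service}:${name}`),
      Resource: resource,
    })),
  };
}

// Print the policy as JSON so it can be reviewed or saved to a file.
console.log(JSON.stringify(buildDeployerPolicy(), null, 2));
```

Running the snippet with `node` prints a JSON document that can be reviewed before attaching it to the deploying user or role.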
Depending on which event sources trigger your Lambda functions, you may also need:
| Event Source | Permissions |
|---|---|
| API Gateway | apigateway:* on your API resources |
| SNS | sns:Subscribe, sns:Unsubscribe |
| S3 | s3:PutBucketNotification |
| CloudWatch Events | events:PutRule, events:PutTargets, events:RemoveTargets, events:DeleteRule |
| CloudWatch Logs | logs:PutSubscriptionFilter, logs:DeleteSubscriptionFilter |
| IoT | iot:CreateTopicRule, iot:ReplaceTopicRule, iot:DeleteTopicRule |
| AppSync | appsync:UpdateDataSource |
| ALB/NLB | elasticloadbalancing:RegisterTargets, elasticloadbalancing:DeregisterTargets |
The plugin creates a CodeDeploy service role with the following managed policies attached:
arn:aws:iam::aws:policy/service-role/AWSCodeDeployRoleForLambdaLimited
arn:aws:iam::aws:policy/AWSLambda_FullAccess
If you provide your own codeDeployRole, ensure it has equivalent permissions. See example-code-deploy-policy.json for reference.
For pre/post traffic hooks to report deployment status, the plugin adds this permission to your Lambda execution role:
codedeploy:PutLifecycleEventHookExecutionStatus
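To illustrate how a hook uses that permission, here is a minimal sketch of a traffic hook handler. The helper names `buildHookStatus` and `createTrafficHook` are hypothetical, and the function that actually calls CodeDeploy is injected so the sketch does not depend on a particular AWS SDK version. The commented wiring at the bottom assumes the AWS SDK for JavaScript v3 package `@aws-sdk/client-codedeploy` is available to your function.

```javascript
// Maps the hook event and the validation outcome to the request
// that CodeDeploy's PutLifecycleEventHookExecutionStatus expects.
function buildHookStatus(event, succeeded) {
  return {
    deploymentId: event.DeploymentId,
    lifecycleEventHookExecutionId: event.LifecycleEventHookExecutionId,
    status: succeeded ? 'Succeeded' : 'Failed',
  };
}

// Wraps a validation function into a Lambda handler.
// `validate(event)` should resolve to true when the new version looks healthy.
// `reportStatus(params)` sends the result to CodeDeploy (injected, see below).
function createTrafficHook(validate, reportStatus) {
  return async function handler(event) {
    let succeeded = false;
    try {
      succeeded = (await validate(event)) === true;
    } catch (err) {
      // A throwing validation is reported as a failed hook.
      console.error('Validation threw an error:', err);
    }
    await reportStatus(buildHookStatus(event, succeeded));
    return succeeded ? 'Validation succeeded' : 'Validation failed';
  };
}

// Possible wiring in hooks.js (ASSUMPTION: AWS SDK v3 is installed; not executed here):
//   const { CodeDeployClient, PutLifecycleEventHookExecutionStatusCommand } =
//     require('@aws-sdk/client-codedeploy');
//   const client = new CodeDeployClient({});
//   const report = (params) =>
//     client.send(new PutLifecycleEventHookExecutionStatusCommand(params));
//   module.exports.pre = createTrafficHook(async () => true, report);
```

Always reporting a status, even when validation throws, avoids leaving the deployment waiting until the hook times out.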
To enable gradual deployments for Lambda functions, your serverless.yml should look like this:
service: canary-deployments
provider:
  name: aws
  runtime: nodejs6.10
  iamRoleStatements:
    - Effect: Allow
      Action:
        - codedeploy:*
      Resource:
        - "*"
plugins:
  - '@flagsmith/serverless-plugin-canary-deployments'
functions:
  hello:
    handler: handler.hello
    events:
      - http: GET hello
    deploymentSettings:
      type: Linear10PercentEvery1Minute
      alias: Live
      preTrafficHook: preHook
      postTrafficHook: postHook
      alarms:
        - FooAlarm # When a string is provided, it expects the alarm Logical ID
        - name: BarAlarm # When an object is provided, it expects the alarm name in the name property
  preHook:
    handler: hooks.pre
  postHook:
    handler: hooks.post
You can see a working example in the example folder.
- type: (required) defines how the traffic will be shifted between Lambda function versions. It must be one of the following:
  - Canary10Percent5Minutes: shifts 10 percent of traffic in the first increment. The remaining 90 percent is deployed five minutes later.
  - Canary10Percent10Minutes: shifts 10 percent of traffic in the first increment. The remaining 90 percent is deployed 10 minutes later.
  - Canary10Percent15Minutes: shifts 10 percent of traffic in the first increment. The remaining 90 percent is deployed 15 minutes later.
  - Canary10Percent30Minutes: shifts 10 percent of traffic in the first increment. The remaining 90 percent is deployed 30 minutes later.
  - Linear10PercentEvery1Minute: shifts 10 percent of traffic every minute until all traffic is shifted.
  - Linear10PercentEvery2Minutes: shifts 10 percent of traffic every two minutes until all traffic is shifted.
  - Linear10PercentEvery3Minutes: shifts 10 percent of traffic every three minutes until all traffic is shifted.
  - Linear10PercentEvery10Minutes: shifts 10 percent of traffic every 10 minutes until all traffic is shifted.
  - AllAtOnce: shifts all the traffic to the new version, useful when you only need to execute the validation hooks.
- alias: (required) name that will be used to create the Lambda function alias.
- preTrafficHook: (optional) validation Lambda function that runs before traffic shifting. It must use the CodeDeploy SDK to notify about this step's success or failure (more info here).
- postTrafficHook: (optional) validation Lambda function that runs after traffic shifting. It must use the CodeDeploy SDK to notify about this step's success or failure (more info here).
- alarms: (optional) list of CloudWatch alarms. If any of them is triggered during the deployment, the associated Lambda function will automatically roll back to the previous version.
- triggerConfigurations: (optional) list of CodeDeploy Triggers. See more details in the CodeDeploy TriggerConfiguration Documentation, or this CodeDeploy notifications guide for example uses.
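To make the `type` values described above more concrete, the following sketch turns a type name into a list of traffic-shifting steps. The function `trafficSchedule` is hypothetical and not part of the plugin. It also assumes that the first increment is applied at minute 0, which the descriptions above imply for the canary types but do not state explicitly for the linear ones.

```javascript
// Parses a deployment type name into steps of the form
// { minute, percent }, where percent is the cumulative share of
// traffic on the new version at that minute.
// ASSUMPTION: the first increment happens at minute 0.
function trafficSchedule(type) {
  if (type === 'AllAtOnce') {
    return [{ minute: 0, percent: 100 }];
  }

  const canary = /^Canary(\d+)Percent(\d+)Minutes$/.exec(type);
  if (canary) {
    const percent = Number(canary[1]);
    const wait = Number(canary[2]);
    // First increment now, everything else after the wait.
    return [{ minute: 0, percent }, { minute: wait, percent: 100 }];
  }

  const linear = /^Linear(\d+)PercentEvery(\d+)Minutes?$/.exec(type);
  if (linear) {
    const step = Number(linear[1]);
    const interval = Number(linear[2]);
    const steps = [];
    // Add equal increments until all traffic has been shifted.
    for (let i = 0; i * step < 100; i++) {
      steps.push({ minute: i * interval, percent: Math.min(100, (i + 1) * step) });
    }
    return steps;
  }

  throw new Error(`Unknown deployment type: ${type}`);
}

console.log(trafficSchedule('Canary10Percent5Minutes'));
console.log(trafficSchedule('Linear10PercentEvery2Minutes'));
```

A schedule like this can help when choosing alarm periods: an alarm that needs several minutes of data will not catch problems during a deployment that finishes sooner.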
You can set default values for all functions in a top-level custom deploymentSettings section. E.g.:
custom:
  deploymentSettings:
    codeDeployRole: some_arn_value
    codeDeployRolePermissionsBoundary: some_arn_value
    stages:
      - dev
      - prod
functions:
  ...
Some values are only available as top-level configurations. They are:
- codeDeployRole: (optional) an ARN specifying an existing IAM role for CodeDeploy. If absent, one will be created for you. See the codeDeploy policy for an example of what is needed.
- codeDeployRolePermissionsBoundary: (optional) an ARN specifying an existing IAM permissions boundary. The boundary is set on the CodeDeploy role that the plugin creates when codeDeployRole is not defined.
- stages: (optional) list of stages where you want to deploy your functions gradually. If not present, gradual deployment applies to all stages.
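The interaction between top-level defaults, per-function settings and `stages` can be pictured with a small sketch. The function `resolveDeploymentSettings` is hypothetical and is not the plugin's actual code. It assumes that function-level values take precedence over the top-level defaults, and it does not cover how the plugin treats functions that have no `deploymentSettings` of their own.

```javascript
// Returns the effective settings for one function, or null when
// gradual deployment is not enabled for the given stage.
// ASSUMPTION: function-level settings override top-level defaults.
function resolveDeploymentSettings(globalSettings, functionSettings, stage) {
  const { stages, ...defaults } = globalSettings || {};

  // When `stages` is present, only the listed stages deploy gradually.
  // When it is absent, every stage does.
  if (Array.isArray(stages) && !stages.includes(stage)) {
    return null;
  }

  return { ...defaults, ...(functionSettings || {}) };
}

const globals = { codeDeployRole: 'some_arn_value', stages: ['dev', 'prod'] };
const hello = { type: 'Linear10PercentEvery1Minute', alias: 'Live' };

console.log(resolveDeploymentSettings(globals, hello, 'prod'));
console.log(resolveDeploymentSettings(globals, hello, 'staging'));
```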
Canary alarms are version-specific CloudWatch alarms that monitor only the new Lambda version during deployment. This solves the "pre-existing alarm" problem where a deployment fails because the previous version's errors keep an alarm in ALARM state.
canaryAlarms creates CloudWatch alarms with the ExecutedVersion dimension, which only monitors metrics from the specific Lambda version being deployed.
functions:
  hello:
    handler: handler.hello
    deploymentSettings:
      type: Canary10Percent5Minutes
      alias: Live
      canaryAlarms:
        - preset: errors # Use preset configuration
canaryAlarms: (optional) list of version-specific CloudWatch alarms to create. Each alarm can be defined in one of the following ways:
Using a preset:
canaryAlarms:
  - preset: errors # Use 'errors' preset
  - preset: errors # Override preset values
    threshold: 5
Using custom configuration:
canaryAlarms:
  - metric: Duration
    threshold: 5000
    comparisonOperator: GreaterThanThreshold
    statistic: Average
The errors preset creates an alarm with these defaults:
| Property | Value |
|---|---|
| metric | Errors |
| namespace | AWS/Lambda |
| statistic | Sum |
| period | 60 |
| evaluationPeriods | 1 |
| datapointsToAlarm | 1 |
| threshold | 1000 |
| comparisonOperator | GreaterThanThreshold |
| treatMissingData | notBreaching |
Each alarm in canaryAlarms must use either a preset or a custom metric configuration:
Preset-based alarm (requires preset):
| Property | Type | Required | Description |
|---|---|---|---|
| preset | string | yes | Preset name (errors) |
Any other property below can be specified to override the preset defaults.
Custom metric alarm (requires metric and threshold):
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
| metric | string | yes | - | CloudWatch metric name |
| threshold | number | yes | - | Alarm threshold value |
| namespace | string | no | AWS/Lambda | CloudWatch namespace |
| comparisonOperator | string | no | GreaterThanThreshold | Comparison operator |
| statistic | string | no | Sum | Metric statistic |
| period | number | no | 60 | Period in seconds |
| evaluationPeriods | number | no | 1 | Number of periods to evaluate |
| datapointsToAlarm | number | no | 1 | Datapoints to trigger alarm |
| treatMissingData | string | no | notBreaching | How to treat missing data |
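The way presets, overrides and defaults combine can be sketched as a small merge function. This is an illustration only, not the plugin's implementation: `resolveCanaryAlarm`, `DEFAULTS` and `PRESETS` are hypothetical names, and the values are copied from the tables above.

```javascript
// Defaults for optional properties, copied from the table above.
const DEFAULTS = {
  namespace: 'AWS/Lambda',
  comparisonOperator: 'GreaterThanThreshold',
  statistic: 'Sum',
  period: 60,
  evaluationPeriods: 1,
  datapointsToAlarm: 1,
  treatMissingData: 'notBreaching',
};

// Values that the errors preset adds on top of the defaults.
const PRESETS = {
  errors: { metric: 'Errors', threshold: 1000 },
};

// Expands one canaryAlarms entry into a full alarm configuration.
function resolveCanaryAlarm(alarm) {
  const { preset, ...overrides } = alarm;

  if (preset !== undefined) {
    if (!Object.prototype.hasOwnProperty.call(PRESETS, preset)) {
      throw new Error(`Unknown preset: ${preset}`);
    }
    // Explicit properties win over preset values, which win over defaults.
    return { ...DEFAULTS, ...PRESETS[preset], ...overrides };
  }

  if (overrides.metric === undefined || overrides.threshold === undefined) {
    throw new Error('A custom canary alarm requires metric and threshold');
  }
  return { ...DEFAULTS, ...overrides };
}

console.log(resolveCanaryAlarm({ preset: 'errors', threshold: 5 }));
console.log(resolveCanaryAlarm({ metric: 'Duration', threshold: 5000, statistic: 'Average' }));
```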
When you configure canaryAlarms, the plugin generates:
- Per-function canary alarms with the ExecutedVersion dimension
- A stack composite alarm that fires if ANY function's canary alarm fires
The composite alarm uses a predictable name: ${service}-${stage}-canary-composite
You can use both alarms and canaryAlarms together:
functions:
  hello:
    deploymentSettings:
      type: Canary10Percent5Minutes
      alias: Live
      alarms:
        - name: my-existing-alarm
      canaryAlarms:
        - preset: errors # Version-specific alarm
The plugin relies on the AWS Lambda traffic shifting feature to balance traffic between versions and on AWS CodeDeploy to update the version weights automatically. It modifies the CloudFormation template generated by Serverless, so that:
- It creates a Lambda function Alias for each function with deployment settings.
- For functions that already have a target alias due to provisioned concurrency or SnapStart configuration, the CodeDeploy-related settings are applied to it.
- It creates a CodeDeploy Application and adds a CodeDeploy DeploymentGroup per Lambda function, according to the specified settings.
- It modifies events that trigger Lambda functions, so that they invoke the newly created alias.
For now, the plugin only works with Lambda functions invoked by:
- API Gateway
- Stream-based events (such as those triggered by Kinesis, DynamoDB Streams or SQS)
- SNS-based events
- S3 events
- CloudWatch Scheduled events
- CloudWatch Logs
- IoT rules
- AppSync DataSources
- ElasticLoadBalancingV2 TargetGroups
More events will be added soon.
ISC © David García