Shut down CloudFormation stack resources over night using AWS Lambda - Sebastian Hesse

A CloudFormation stack evolves over time, and its costs usually grow with it. You probably have more than one stack, too: at least a production and a development stack, and maybe even one development stack per developer. This increases the total costs even further. To avoid paying for idle resources, you can shut down CloudFormation stack resources overnight. AWS Lambda is a good fit for this: you can schedule a Lambda function to stop resources after your working day and start them again before the next one. This blog post describes the steps to set this up and decrease the costs of developer stacks.

Identify the expensive resources

Before you start to decrease costs, you should know where they come from. I already wrote another article about keeping your AWS budget under control and this blog post extends my initial approach. In short, you need to identify the resources and services which make up the biggest part of your bill. A first step is to open your latest bills in the Billing Management Console and look for the most expensive services. In a developer stack you’ll probably see services like EC2 or Fargate, or others like provisioned DynamoDB tables, Kinesis, and more. These services are always on, i.e. they keep running and cost you money until you stop them. So we have to find a way to shut down these resources or at least reduce their costs.
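
If you prefer a script over clicking through the console, the same ranking can be sketched with the Cost Explorer API. This is only a rough sketch and not part of my original setup: the function names `top_services` and `fetch_costs` are made up for illustration, and it assumes Cost Explorer is enabled for your account and that boto3 is available wherever you run it.

```python
from collections import defaultdict


def top_services(response, limit=5):
    """Sum the cost per service across all returned periods, most expensive first."""
    totals = defaultdict(float)
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service = group["Keys"][0]
            totals[service] += float(group["Metrics"]["UnblendedCost"]["Amount"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]


def fetch_costs(start, end, client=None):
    """Query Cost Explorer for monthly costs grouped by service (dates as YYYY-MM-DD)."""
    if client is None:
        import boto3  # imported lazily so the ranking helper also works offline
        client = boto3.client("ce")
    return client.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )
```

Calling `top_services(fetch_costs("2024-01-01", "2024-02-01"))` would then list the services to look at first (the dates are placeholders).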

Consider different pricing models

As a next step, you have to consider that each service uses a different pricing model. Hence, you need a different approach for each resource. For example, EC2 is billed per hour (or per second, depending on the instance type). In order to save costs, you have to shut down the instance completely. In contrast, DynamoDB’s pricing model is based on provisioned read and write capacity and on how much data you’ve stored. Let’s ignore the amount of stored data for now, as developer stacks usually don’t contain a lot of data. Then the costs are mainly driven by the provisioned read and write capacity. Here you can choose between scaling down the provisioned capacities and removing the DynamoDB resources completely (see advantages/disadvantages below). Kinesis is similar: you either decrease the number of shards of your Kinesis stream or you remove the Kinesis resources from your stack.

Basic setup

Now that you know the expensive resources and how to approach your cost savings, it’s time to actually reduce the costs. You could do that manually by shutting down resources after each working day and restarting them before the next one begins. This is tedious, so I suggest automating it. AWS Lambda is a perfect solution here: you can create a Lambda function which shuts down and starts up resources each day. Then a scheduled event based on a cron expression can trigger the function after you’ve finished work and again before you start your working day the next day. Keep in mind that these schedule expressions are evaluated in UTC, so adjust the hours to your time zone. A good cron expression takes your working days into account, for example:

# shutdown schedule: 6pm each weekday
cron(0 18 ? * MON-FRI *)

# startup schedule: 6am each weekday
cron(0 6 ? * MON-FRI *)
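
To get a feeling for how much idle time this schedule removes, here is a small back-of-the-envelope calculation. It only counts hours, not money, because actual savings depend on the pricing model of each service; the function names are made up for this illustration.

```python
HOURS_PER_WEEK = 7 * 24


def weekly_running_hours(start_hour, stop_hour, working_days=5):
    """Hours per week a resource runs if it is started and stopped once per working day."""
    return (stop_hour - start_hour) * working_days


def idle_share(start_hour, stop_hour, working_days=5):
    """Fraction of the week during which the resource is shut down."""
    running = weekly_running_hours(start_hour, stop_hour, working_days)
    return 1 - running / HOURS_PER_WEEK


# Startup at 6am and shutdown at 6pm on weekdays, as in the cron expressions above
running = weekly_running_hours(6, 18)
print(f"running {running} of {HOURS_PER_WEEK} hours, idle share {idle_share(6, 18):.0%}")
```

With the schedule above, a resource runs 60 of 168 hours per week, so it is shut down for roughly 64% of the time.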

Select a shutdown approach

The basic setup sounds easy. You have a CloudFormation stack to manage your resources and a Lambda function to start/stop them. But how do you actually shut down resources considering the different pricing models? And how do you make sure that they’re started again the next day? In principle you can choose between one of the following approaches:

Delete complete stack

The most basic approach is to delete the whole CloudFormation stack and build it up again each day. This is fairly easy but has a few drawbacks:

It can take a long time to shut down and recreate your resources. (An earlier version of this post named CloudFront as an example, where certain initialization steps could take 30 minutes or more. This is no longer true because CloudFront drastically improved the time to create or update a distribution.)
You can lose data. As an example, let’s assume you’re using DynamoDB tables containing data for your test environment. If you remove the tables in the evening, your data is deleted as well. That means you have to restore the data somehow, which usually takes time. Of course, there are backup and recovery options (including automatic database backups) to do that. But they require more work or time, and backups aren’t free either.
You can lose S3 bucket names, because AWS does not guarantee that you can reuse them. You can minimize the chance of running into this by prefixing your bucket names with something unique. But you’re not 100% safe!

Of course, a good reason to use this approach is that it’s easy. However, due to the drawbacks, let’s look at further ways to shut down the resources.
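
For completeness, the delete-and-recreate approach could look like the following sketch of a Lambda handler. It rests on a few assumptions of mine: the event fields `action`, `stackName` and `templateUrl` are hypothetical and would be configured as the constant input of each scheduled rule, the template is stored in S3, and the stack needs the `CAPABILITY_IAM` capability. The optional `cfn` argument only exists so the function can be tried out with a stand-in client.

```python
def handler(event, context, cfn=None):
    """Delete the stack on 'shutdown' and create it again on 'startup'."""
    if cfn is None:
        import boto3  # available in the Lambda Python runtime
        cfn = boto3.client("cloudformation")

    stack_name = event["stackName"]  # hypothetical event field
    action = event["action"]         # hypothetical event field
    if action == "shutdown":
        cfn.delete_stack(StackName=stack_name)
    elif action == "startup":
        cfn.create_stack(
            StackName=stack_name,
            TemplateURL=event["templateUrl"],  # hypothetical event field
            Capabilities=["CAPABILITY_IAM"],   # assumption: the stack creates IAM resources
        )
    else:
        raise ValueError(f"unknown action: {action}")
    return {"stack": stack_name, "action": action}
```

Both calls only start the operation and return immediately, so the function itself stays short-lived even if the stack takes a while to delete or create.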

Use parameters to control resource scaling

Another approach is to use parameters within your CloudFormation template. (It’s similar to using parameters to create multiple stack environments.) For example, you can provide parameters to set the provisioned read or write capacity of your DynamoDB tables. In this case you’d just update your CloudFormation stack with new parameter values. CloudFormation then takes care of scaling down your resources. This also works for resources like Auto Scaling groups or Kinesis streams. See some example code:

AWSTemplateFormatVersion: '2010-09-09'
Description: 'Template with parameters for DynamoDB provisioned table'

Parameters:
  ProvisionedWriteCapacity:
    Description: 'Provisioned write capacity'
    Type: Number
    MinValue: 1
  ProvisionedReadCapacity:
    Description: 'Provisioned read capacity'
    Type: Number
    MinValue: 1

Resources: 
  MyDynamoDBTable: 
    Type: AWS::DynamoDB::Table
    Properties: 
      AttributeDefinitions: 
        - AttributeName: "Id"
          AttributeType: "S"
      KeySchema: 
        - AttributeName: "Id"
          KeyType: "HASH"
      ProvisionedThroughput: 
        ReadCapacityUnits: !Ref ProvisionedReadCapacity
        WriteCapacityUnits: !Ref ProvisionedWriteCapacity
      TableName: "MyTableName"

Now you can use a Lambda function and update the stack by providing new values for your CloudFormation stack parameters. Calling an update on a stack isn’t a big deal but there is a surprise here: since CloudFormation will start/stop resources on your behalf, you need to make sure to set the appropriate permissions in your Lambda function’s role policy as well. I won’t cover this in detail now but you can use CloudTrail to identify the necessary permissions.
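
Here is a minimal sketch of such a Lambda function. It is one possible way to do it, not the exact code from my project: the event fields `stackName` and `parameters` are hypothetical, `CAPABILITY_IAM` is an assumption about the stack, and the optional `cfn` argument only makes it possible to run the handler with a stand-in client. The function reuses the deployed template and keeps every parameter at its previous value unless the event overrides it.

```python
def build_parameters(current_keys, overrides):
    """Keep each existing parameter at its previous value unless it is overridden."""
    params = []
    for key in current_keys:
        if key in overrides:
            params.append({"ParameterKey": key, "ParameterValue": str(overrides[key])})
        else:
            params.append({"ParameterKey": key, "UsePreviousValue": True})
    return params


def handler(event, context, cfn=None):
    if cfn is None:
        import boto3  # available in the Lambda Python runtime
        cfn = boto3.client("cloudformation")

    stack_name = event["stackName"]  # hypothetical event field
    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    current_keys = [p["ParameterKey"] for p in stack.get("Parameters", [])]

    cfn.update_stack(
        StackName=stack_name,
        UsePreviousTemplate=True,  # only the parameter values change
        Parameters=build_parameters(current_keys, event["parameters"]),
        Capabilities=["CAPABILITY_IAM"],  # assumption: the stack creates IAM resources
    )
    return {"stack": stack_name, "updated": sorted(event["parameters"])}
```

The evening rule could then send an input like `{"stackName": "dev-stack", "parameters": {"ProvisionedReadCapacity": 1, "ProvisionedWriteCapacity": 1}}` and the morning rule the regular values. Note that CloudFormation rejects an update that changes nothing, so you may want to catch that error if the function can run twice with the same values.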

The drawback of this approach is that it can’t be used for all resources. For example, if you’re using single EC2 instances in your CloudFormation stack (which sounds like a bad idea in general, but there might be good use cases), you can’t scale them with a parameter like above. In this case you need to use conditions, which are discussed below.

Use conditions to remove/add resources

You can extend the parameter approach by introducing conditions to your CloudFormation template. Conditions are evaluated on the template parameters and can be placed on stack resources. A resource is only created if its condition evaluates to “true”. For example, you could add a parameter like “OverNightShutdown” to your template. Then you can add a condition which checks whether this parameter is “false”. If the check succeeds, the resource is created. Otherwise CloudFormation won’t create the resource or, when the stack is updated for a shutdown, removes it. Here’s an example for using conditions:

AWSTemplateFormatVersion: '2010-09-09'
Description: 'Template with conditions to shut down resources'

Parameters:
  OverNightShutdown:
    Description: 'Indicates if certain resources should be shut down overnight'
    Type: String
    AllowedValues: ['true', 'false']
    Default: 'false'

Conditions:
  # Resources with this condition only exist while the shutdown flag is 'false'
  IsNotShutDown: !Equals [!Ref OverNightShutdown, 'false']

Resources:
  MyDynamoDBTable:
    Type: AWS::DynamoDB::Table
    Condition: IsNotShutDown
    Properties:
      # ...

Unfortunately, this approach also has a few drawbacks:

If you remove a resource, you have to consider the dependencies within your template (or maybe even outside of your template). If you’d like to shut down a resource, you also have to shut down all resources which reference it. This can get tricky, and if you end up removing 90% of the resources due to their dependencies, you might as well remove the whole stack (or not use the condition approach at all).
If you’re using the Serverless Application Model (SAM), this approach originally did not work properly, because resources of type AWS::Serverless::Function didn’t support conditions. This was especially bad in combination with the previous drawback regarding resource dependencies. Update: conditions are now supported, see issue #142 in the SAM GitHub project.
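
To get a first idea of how far the dependencies reach, you can scan your template for references to the resource you want to remove. The sketch below is a simplified illustration with made-up function names. It assumes the template is already loaded as a dictionary in long form (`Ref`, `Fn::GetAtt`), it only finds direct dependents, and it ignores references hidden inside `Fn::Sub` strings or coming from other stacks.

```python
def references(node, target):
    """Return True if a template fragment refers to the logical ID `target`."""
    if isinstance(node, dict):
        if node.get("Ref") == target:
            return True
        get_att = node.get("Fn::GetAtt")
        if isinstance(get_att, list) and get_att and get_att[0] == target:
            return True
        if isinstance(get_att, str) and get_att.split(".")[0] == target:
            return True
        return any(references(value, target) for value in node.values())
    if isinstance(node, list):
        return any(references(item, target) for item in node)
    return False


def dependents(template, target):
    """Logical IDs of resources that reference `target` directly or via DependsOn."""
    found = []
    for logical_id, resource in template.get("Resources", {}).items():
        if logical_id == target:
            continue
        depends_on = resource.get("DependsOn", [])
        if isinstance(depends_on, str):
            depends_on = [depends_on]
        if target in depends_on or references(resource.get("Properties", {}), target):
            found.append(logical_id)
    return found
```

Every resource returned by `dependents(template, "MyDynamoDBTable")` needs the same condition as the table (or another way to cope with the missing table), and the same check then applies to those resources in turn.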

Summary

As you can see, there is no single solution. You might even consider a mix of the approaches, e.g. using both parameters and conditions to shut down instances and reduce provisioned capacities. That’s what we actually did in a recent project. Whatever you choose, please follow this general rule: never manually remove resources that are managed by your stack. You will very likely run into problems!