A CloudFormation stack evolves over time, and its costs usually grow with it. You’ll probably have more than one stack: at least a production and a development stack, and possibly even one development stack per developer. This increases the total costs even more. To counter this, you might want to shut down CloudFormation stack resources overnight. A neat way to do this is to use AWS Lambda functions. You schedule them to run after and before a regular working day to shut down or start up the stack resources. This blog post describes the steps to accomplish such a setup to decrease costs for developer stacks.
Identify the expensive resources
Before you start decreasing costs, you should know where they come from. I already wrote another article about keeping your AWS budget under control, and this blog post extends that initial approach. You need to identify the resources and services which make up the biggest part of your bill. Just head over to your latest bills in the Billing Management Console and look for the most expensive services. In a developer stack you’ll probably see something like EC2 or DynamoDB. These services are always on, i.e. they keep running (and costing money) until you stop them; Kinesis is another example of such a service. So, we’ll have to find a way to shut down those resources or at least reduce their costs through configuration changes.
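If you prefer a script over clicking through the console, the same ranking can be sketched with the Cost Explorer API. This is a minimal sketch, assuming boto3 is installed and your credentials are allowed to call ce:GetCostAndUsage; the function names are made up for illustration.

```python
from collections import defaultdict


def rank_services_by_cost(response):
    """Sum the cost per service over all returned periods, most expensive first."""
    totals = defaultdict(float)
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service = group["Keys"][0]
            amount = group["Metrics"]["UnblendedCost"]["Amount"]
            totals[service] += float(amount)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def fetch_costs_by_service(start, end):
    """Query Cost Explorer for costs grouped by service.

    start and end are dates like "2024-01-01" (end is exclusive).
    """
    import boto3  # imported lazily so the ranking above works without boto3

    client = boto3.client("ce")
    return client.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )
```

Calling rank_services_by_cost(fetch_costs_by_service(start, end)) gives you a list of (service, cost) pairs, and the first few entries are the candidates for an overnight shutdown.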
Consider different pricing models
As a next step, you have to consider that each service uses a different pricing model. Hence, you need a different cost-saving approach for each resource. For example, EC2 is billed by running time (per hour or per second, depending on the instance). In order to save costs, you’d have to shut down the instance completely. In contrast, DynamoDB’s pricing model is based on the provisioned read and write capacity and on how much data you’ve stored. Let’s ignore the amount of stored data for now, as developer stacks usually don’t contain a lot of data. Then the costs are mainly driven by the provisioned read and write capacity. Here you can choose between scaling down the provisioned capacities and removing the DynamoDB resources completely (see advantages/disadvantages below).
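To get a feeling for what a time-based pricing model means for an overnight shutdown, here is a small back-of-the-envelope calculation. The hourly rate and the working hours are made-up numbers for illustration, not real AWS prices.

```python
HOURS_PER_WEEK = 24 * 7


def weekly_cost(hourly_rate, hours_running):
    """Cost of a resource that is billed purely by running time."""
    return hourly_rate * hours_running


def savings_fraction(hours_per_day, days_per_week):
    """Share of the always-on cost you avoid by running only during work hours."""
    hours_running = hours_per_day * days_per_week
    return 1 - hours_running / HOURS_PER_WEEK


# Hypothetical instance at 0.10 USD per hour, needed 10 hours on 5 days a week.
always_on = weekly_cost(0.10, HOURS_PER_WEEK)
work_hours_only = weekly_cost(0.10, 10 * 5)
print(f"always on: {always_on:.2f} USD, work hours only: {work_hours_only:.2f} USD")
print(f"saved: {savings_fraction(10, 5):.0%}")
```

With these assumptions the resource runs 50 instead of 168 hours per week, so roughly 70% of the running-time charges disappear. The calculation only covers charges that depend on running time; storage that stays attached to a stopped instance is still billed.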
Now that you know the expensive resources, you are able to reduce the costs. You could do that manually by shutting down resources after each working day and restarting them before the next one begins. This is tedious, so I suggest you automate it. AWS Lambda is a great fit for this: you can create a Lambda function which shuts down and starts up resources each day. Then, scheduled events based on a cron expression can trigger this function after you’ve finished work and before you start your working day again.
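The scheduled setup described above can be sketched as a single Lambda handler that is triggered by two scheduled rules. This is only a sketch under a few assumptions: each rule passes a constant JSON input like {"action": "stop"} to the function, the times are examples, and shut_down_resources and start_up_resources are hypothetical placeholders for whatever approach you pick.

```python
# Example schedule expressions for the two rules (evaluated in UTC).
SCHEDULES = {
    "stop": "cron(0 18 ? * MON-FRI *)",   # 18:00 on working days
    "start": "cron(0 6 ? * MON-FRI *)",   # 06:00 on working days
}


def resolve_action(event):
    """Read the action from the constant input configured on the rule."""
    action = event.get("action")
    if action not in SCHEDULES:
        raise ValueError(f"unknown action: {action!r}")
    return action


def shut_down_resources():
    # Hypothetical placeholder: put your shutdown approach here.
    print("shutting down stack resources")


def start_up_resources():
    # Hypothetical placeholder: put your startup approach here.
    print("starting up stack resources")


def handler(event, context):
    """Lambda entry point, invoked by both scheduled rules."""
    action = resolve_action(event)
    if action == "stop":
        shut_down_resources()
    else:
        start_up_resources()
    return {"action": action}
```

Using one function with an action in the event keeps the shutdown and startup logic in one place; two separate functions would work just as well.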
Select a shutdown approach
The basic setup sounds easy, but how do you actually shut down resources based on the different pricing models? And how do you make sure that they’re started again the next day? In principle you can choose one of the following approaches:
Delete complete stack
The most basic approach is to delete a whole stack and build it up again each day. This is fairly easy, but comes with a few drawbacks:
- It can take a long time depending on your resources. For example, if you use CloudFront, certain initialization steps can take 30 minutes or more.
- You can lose data. As an example, you’re using DynamoDB tables containing data for a feature you’re implementing. Now you remove the tables in the evening and your data is lost. That means you have to restore the data somehow. Of course, there are options to do that (including automatic database backups), but they mean more work for you and backups aren’t free either.
- You can run into conflicts with S3 bucket names, because AWS does not guarantee that you can reuse a bucket name after deleting the bucket. You can reduce the chance of running into this by prefixing your bucket names with something unique. But you’re not 100% safe!
Of course, a good reason to use this approach is that it’s easy. But due to the drawbacks, let’s look at further ways to shut down such resources.
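For completeness, deleting and recreating a stack could look roughly like this with boto3. It is a sketch, not a full implementation: the client is expected to be boto3.client("cloudformation"), the template URL is something you have to provide, and the CAPABILITY_IAM capability is an assumption that only applies if your template creates IAM resources.

```python
def tear_down(client, stack_name):
    """Evening job: delete the whole stack.

    The call returns right away; the deletion itself continues in the
    background, which matters because it can outlast a Lambda invocation.
    """
    client.delete_stack(StackName=stack_name)


def bring_up(client, stack_name, template_url, parameters=None):
    """Morning job: create the stack again from the same template."""
    return client.create_stack(
        StackName=stack_name,
        TemplateURL=template_url,
        Parameters=parameters or [],
        Capabilities=["CAPABILITY_IAM"],  # assumption, see above
    )
```

Neither function restores any data, so the drawbacks from the list above still apply.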
Use parameters to control resource scaling
Another approach is to use parameters within your CloudFormation template. For example, you can provide parameters to set the provisioned read or write capacity of your DynamoDB tables. In this case you’d just update your CloudFormation stack with new parameter values, and CloudFormation takes care of scaling down your resources. This also works for other resources like Auto Scaling groups or Kinesis streams. The drawback of this approach is that it can’t be used for all resources.
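Such a parameter-only stack update can be sketched with boto3 as follows. The parameter names ReadCapacity and WriteCapacity are hypothetical and have to match whatever your template defines; the client is expected to be boto3.client("cloudformation"), and CAPABILITY_IAM is only needed if your template creates IAM resources.

```python
def build_parameters(all_keys, overrides):
    """Override the given parameters and keep the previous value for the rest."""
    parameters = []
    for key in all_keys:
        if key in overrides:
            parameters.append(
                {"ParameterKey": key, "ParameterValue": str(overrides[key])}
            )
        else:
            parameters.append({"ParameterKey": key, "UsePreviousValue": True})
    return parameters


def scale_stack(client, stack_name, overrides):
    """Update the stack with new parameter values, reusing the current template."""
    stack = client.describe_stacks(StackName=stack_name)["Stacks"][0]
    keys = [p["ParameterKey"] for p in stack.get("Parameters", [])]
    return client.update_stack(
        StackName=stack_name,
        UsePreviousTemplate=True,
        Parameters=build_parameters(keys, overrides),
        Capabilities=["CAPABILITY_IAM"],  # assumption, see above
    )


# Evening: scale_stack(client, "dev-stack", {"ReadCapacity": 1, "WriteCapacity": 1})
# Morning: scale_stack(client, "dev-stack", {"ReadCapacity": 5, "WriteCapacity": 5})
```

Passing UsePreviousValue for all other parameters means the scheduled function doesn’t need to know the rest of the stack configuration.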
Use conditions to remove/add resources
You can extend the parameter approach by introducing conditions to your CloudFormation template. Conditions are evaluated on the template parameters and can be placed on stack resources. A resource is created only if its condition evaluates to “true”. For example, you could add a parameter like “OverNightShutdown” to your template. Then, you can add a condition checking whether this parameter is “false”. If the condition evaluates to true, the resource is created. Otherwise CloudFormation won’t create it or, in case of a shutdown, will even remove it. But this approach also has a few drawbacks:
- If you remove certain resources, you have to consider the dependencies within your template (or maybe even outside of your template). If you’d like to shut down a resource, you also have to shut down all resources which reference it. This can get tricky, and you might even consider removing the whole stack if you have to remove 90% of the resources due to their dependencies.
- If you’re using the Serverless Application Model (SAM), this approach might not work properly, because at the time of writing resources of type AWS::Serverless::Function don’t support conditions yet (support might be added in the coming months). This is especially bad in combination with the previous drawback regarding resource dependencies.
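To make the condition approach more concrete, here is a minimal template sketch, written as a Python dictionary and rendered to JSON so it stays in the same language as the other examples. Only the OverNightShutdown parameter comes from the description above; the condition name IsRunning and the resources DevTable and Placeholder are made up for illustration.

```python
import json


def build_template():
    """Template with one expensive resource that only exists during the day."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "OverNightShutdown": {
                "Type": "String",
                "AllowedValues": ["true", "false"],
                "Default": "false",
            }
        },
        "Conditions": {
            # True as long as the stack is NOT in overnight shutdown mode.
            "IsRunning": {"Fn::Equals": [{"Ref": "OverNightShutdown"}, "false"]}
        },
        "Resources": {
            # Assumption: an unconditional, free resource so the stack never
            # ends up without any resource when the condition is false.
            "Placeholder": {"Type": "AWS::CloudFormation::WaitConditionHandle"},
            "DevTable": {
                "Type": "AWS::DynamoDB::Table",
                "Condition": "IsRunning",
                "Properties": {
                    "AttributeDefinitions": [
                        {"AttributeName": "id", "AttributeType": "S"}
                    ],
                    "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": 1,
                        "WriteCapacityUnits": 1,
                    },
                },
            },
        },
        "Outputs": {
            # Everything referencing the table needs the same condition.
            "DevTableName": {"Condition": "IsRunning", "Value": {"Ref": "DevTable"}}
        },
    }


def render_template():
    """Serialize the template to JSON, which CloudFormation accepts."""
    return json.dumps(build_template(), indent=2)
```

Updating the stack with OverNightShutdown set to “true” removes DevTable and its output, and setting it back to “false” creates them again. The output also illustrates the dependency drawback: it references the table, so it has to carry the same condition.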
As you can see, there is no single “best approach”. In a recent project we decided to use the last approach, with conditions on certain expensive resources. However, you might have reasons not to use it. Whatever you choose, please follow this general rule: never manually remove resources that are managed by your stack. You will run into unforeseeable situations and problems!