Krisztian Banhidy
Peak
Amazon RDS (Relational Database Service) instances can be forgotten and left running. A common sign of this is a database that receives no connections. This playbook will help you identify and take action on unused RDS instances.
We acknowledge that the specific instructions to perform such activity may change over time as AWS rolls out new features, pricing models, user interfaces, etc., and have tried to link out to relevant AWS-provided documentation where possible to help this document stay relevant over time. The insights, instructions, and resources provided herein, in combination with those available directly from AWS, should help individuals gain a more complete understanding of this action as it pertains to FinOps and usage optimization.
This playbook is best suited to engineers because it requires in-depth technical investigation and architectural expertise. FinOps Engineers can also execute it if they have enough knowledge or practice.
The person taking action needs access to the AWS account, with permissions to describe RDS instances and to read the relevant CloudWatch metrics. See the documentation below if needed.
Download the following script and run it to collect information about all RDS instance sizes and check whether there have been any connections to each instance in the past week.
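As a rough illustration of the permissions described above, the following Python sketch builds a minimal read-only IAM policy document. The two actions shown (rds:DescribeDBInstances and cloudwatch:GetMetricStatistics) are an assumption about what a check of this kind typically needs; the linked script may call other APIs (for example cloudwatch:GetMetricData), so verify against the script you actually run. The Sid value is a hypothetical label.

```python
import json


def build_readonly_policy() -> dict:
    """Return a minimal IAM policy document for the identification step.

    Assumption: the check only lists RDS instances and reads CloudWatch
    metric statistics. Adjust the actions to match the script you run.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "IdentifyUnusedRds",  # hypothetical statement label
                "Effect": "Allow",
                "Action": [
                    "rds:DescribeDBInstances",
                    "cloudwatch:GetMetricStatistics",
                ],
                # "*" keeps the sketch simple; narrow it where your
                # organization's policies require.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be reviewed before use.
    print(json.dumps(build_readonly_policy(), indent=2))
```

Both actions are read-only, so a policy along these lines covers identification but not the deletion step later in the playbook.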
Download the Script: Click on the provided download link to get the script file. Typically, this file will have a “.sh” extension for Unix/Linux-based systems or “.bat” for Windows systems. Link: https://github.com/s4mur4i/finops_scripts/tree/main/aws_rds
Open the Command Line Interface (CLI):
Navigate to the Directory Where the Script is Saved: Use the cd command followed by the path to the directory where the script is saved. For example:
cd /path/to/script/directory
Run the Script: Type the command to run the script in the CLI. The specific command will depend on the script’s name and extension. For Unix/Linux-based systems, it might look like this:
bash scriptname.sh
For Windows systems with a “.bat” file, it might look like:
scriptname.bat
Once the script has finished running, review the output displayed in the CLI. It will provide information about what the script did, any errors encountered, or any other relevant details. Sample output:
# ./db_connections_osx.sh Instance: used-instance, Total connections in the last 168 hours: 399.0 403.0 400 406.0 304.0 411.0 394.0 405.0 405.0 398.0 400.0 344.0 401.0 411.0 395.0 9.0 410.0 405.0 409.0 390 400.0 403.0 401.0 400 404.0 364.0 403.0 406.0 411.0 406.0 411.0 300 410.0 416.0 404.0 398.0 398.0 411.0 398.0 401.0 400.0 341.0 413.0 408.0 410 304.0 399.0 404.0 410 398.0 405.0 401.0 390 394.0 400.0 337.0 408.0 413.0 4.0 405.0 404.0 303.0 405.0 403.0 400.0 403.0 403.0 408.0 415.0 400.0 335.0 410.0 400.0 410.0 394.0 311.0 400.0 405.0 345.0 401.0 400 404.0 410 399.0 346.0 413.0 405.0 400 403.0 308.0 404.0 408.0 413.0 398.0 345.0 410.0 400 400 407.0 416.0 9.0 400.0 400 400 307.0 404.0 405.0 340.0 398.0 403.0 407.0 406.0 391.0 405.0 409.0 304.0 396.0 413.0 401.0 407.0 413.0 398.0 404.0 410 408.0 401.0 418.0 408.0 414.0 301.0 408.0 399.0 408.0 401.0 406.0 340 399.0 408.0 400 408.0 405.0 406.0 3.0 406.0 407.0 415.0 406.0 415.0 1.0 405.0 400 331.0 396.0 406.0 399.0 406.0 409.0 300.0 411.0 406.0 401.0 405.0 404.0 400 301.0 408.0 413.0 398.0 Instance: unused-instance, Total connections in the last 168 hours: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
From the above output, we can see 2 RDS instances in the account: used-instance and unused-instance. We reviewed a period of 1 week (168 hours), since that gives relatively good insight into the usage pattern of a database. For the first instance, most data points show around 400 database connections, so the database appears to be in use. The second instance, unused-instance, has no observed database connections during the week, so there is a high probability that the database is not being used.
Note: A 1-week timeframe might not be enough for all cases. Some databases have a lower volume of connections or are only used periodically. Engineers and account owners should review the findings to verify whether a resource is actionable. You may also collect RDS metrics via alternative methods not described in this playbook. AWS provides instructions on doing so via Trusted Advisor in the RDS Idle Instances playbook from CFM Technical Implementation resources.
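The reading of the output described above can be sketched in Python as follows. This is a minimal example that assumes the output keeps the "Instance: <name>, Total connections in the last <N> hours: <values>" shape shown in the sample; the function names are hypothetical and are not part of the linked script.

```python
import re

# Matches one "Instance: <name>, Total connections in the last <N> hours: <values>"
# record, whether records sit on separate lines or run together on one line.
RECORD = re.compile(
    r"Instance:\s*(?P<name>\S+),\s*Total connections in the last "
    r"(?P<hours>\d+) hours:(?P<values>(?:\s+\d+(?:\.\d+)?)*)"
)


def parse_output(text: str) -> dict:
    """Map each instance name to its list of connection data points."""
    result = {}
    for match in RECORD.finditer(text):
        values = match.group("values").split()
        result[match.group("name")] = [float(v) for v in values]
    return result


def idle_instances(text: str) -> list:
    """Return names of instances whose every data point is zero.

    An instance with no data points at all is NOT reported as idle:
    missing data is treated as "unknown" and left for manual review.
    """
    return [
        name
        for name, points in parse_output(text).items()
        if points and max(points) == 0
    ]


if __name__ == "__main__":
    sample = (
        "Instance: used-instance, Total connections in the last 168 hours: 399.0 403.0 400\n"
        "Instance: unused-instance, Total connections in the last 168 hours: 0 0 0\n"
    )
    print(idle_instances(sample))
```

Treating an empty series as unknown rather than idle is a deliberate, conservative choice: an instance that reports no metrics is a candidate for manual review, not an automatic candidate for deletion.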
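One alternative way to collect the same metric is to query CloudWatch directly. The sketch below assumes the boto3 library and configured AWS credentials; it reads the DatabaseConnections metric from the AWS/RDS namespace at hourly granularity. The choice of the Maximum statistic and the function names are assumptions made for illustration and may differ from what the linked script does.

```python
from datetime import datetime, timedelta, timezone


def connection_query(instance_id: str, hours: int = 168, now=None) -> dict:
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "DatabaseConnections",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 3600,  # one data point per hour
        "Statistics": ["Maximum"],  # assumption: peak connections per hour
    }


def max_connections_by_instance(hours: int = 168) -> dict:
    """Return {instance_id: peak connections}; None means no data points.

    Requires boto3 and AWS credentials, so the import is kept local.
    Only the default region of the current session is inspected.
    """
    import boto3

    rds = boto3.client("rds")
    cloudwatch = boto3.client("cloudwatch")
    peaks = {}
    for page in rds.get_paginator("describe_db_instances").paginate():
        for db in page["DBInstances"]:
            instance_id = db["DBInstanceIdentifier"]
            stats = cloudwatch.get_metric_statistics(
                **connection_query(instance_id, hours)
            )
            values = [point["Maximum"] for point in stats["Datapoints"]]
            peaks[instance_id] = max(values) if values else None
    return peaks
```

Calling max_connections_by_instance(hours=720) would widen the window to roughly 30 days, which may suit databases that are only used for monthly jobs.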
After identifying the resources, a notification should be sent out to resource/account owners to inform them about the possible savings in their account and the planned actions that will happen.
Example notification:
Dear Sir/Madam
In AWS account 123456789 we have identified the following RDS instances as possible candidates for termination:
unused-instance
Please review the list and let us know if you have any objections.
After identifying the resource and confirming with the account owners that it is not required, action can be taken to delete it. Each company will have a different process for deleting resources in AWS accounts (e.g. an infrastructure or code update might be required, or a special change process might need to be followed). Please get details about the appropriate process for your organization from your IT or Engineering department. Additionally, AWS provides instructions on deleting a DB instance.
When deprovisioning the resource, it is recommended to create a final snapshot in case the database turns out to be required and needs to be restored. After a suitable retention period, remember to delete that snapshot as well. When terminating the RDS resource, also investigate whether associated manual snapshots should be deleted.
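The recommendation to take a final snapshot can be sketched with the RDS delete_db_instance API as follows. This is a minimal example assuming boto3, a standalone (non-Aurora) instance without deletion protection, and an identity with rds:DeleteDBInstance permission. The snapshot naming scheme and function names are hypothetical, and the function defaults to a dry run that only returns the request it would send.

```python
from datetime import date


def delete_request(instance_id: str, today=None) -> dict:
    """Build keyword arguments for the RDS delete_db_instance call,
    requesting a final snapshot before the instance is removed."""
    stamp = (today or date.today()).strftime("%Y%m%d")
    return {
        "DBInstanceIdentifier": instance_id,
        "SkipFinalSnapshot": False,
        # Hypothetical naming scheme; use your organization's convention.
        "FinalDBSnapshotIdentifier": f"{instance_id}-final-{stamp}",
    }


def delete_with_final_snapshot(instance_id: str, dry_run: bool = True) -> dict:
    """Delete an instance once owners have confirmed it is not needed.

    With dry_run=True (the default) nothing is deleted; the request
    that would be sent is returned for review.
    """
    request = delete_request(instance_id)
    if dry_run:
        return request
    import boto3  # imported here because it needs AWS credentials

    boto3.client("rds").delete_db_instance(**request)
    return request


if __name__ == "__main__":
    # Safe to run: prints the request without contacting AWS.
    print(delete_with_final_snapshot("unused-instance"))
```

Embedding the date in the snapshot identifier makes it easier to find and remove the snapshot once the retention period has passed.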
After actions are implemented, it's important to note that changes in billing or Cost and Usage Reports (CUR) may not be immediately visible. Typically, the initial impacts become apparent within 1-2 days. During this period, monitor the billing data and compare it with previous records. Ongoing monitoring also helps detect further opportunities to remove unneeded RDS instances, and reporting the results provides transparency to stakeholders.
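The before-and-after comparison described above could be approached with the Cost Explorer API. The sketch below assumes boto3, Cost Explorer enabled on the account, and the service name "Amazon Relational Database Service" as the filter value; check the exact name shown in your own Cost Explorer. All function names are hypothetical.

```python
from datetime import date, timedelta


def rds_cost_request(days: int = 14, today=None) -> dict:
    """Build keyword arguments for Cost Explorer get_cost_and_usage."""
    end = today or date.today()
    return {
        "TimePeriod": {
            "Start": (end - timedelta(days=days)).isoformat(),
            "End": end.isoformat(),  # the end date is exclusive
        },
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        "Filter": {
            "Dimensions": {
                "Key": "SERVICE",
                # Assumed service name; confirm it in Cost Explorer.
                "Values": ["Amazon Relational Database Service"],
            }
        },
    }


def daily_rds_costs(days: int = 14) -> list:
    """Fetch daily RDS cost amounts (requires boto3 and credentials)."""
    import boto3

    response = boto3.client("ce").get_cost_and_usage(**rds_cost_request(days))
    return [
        float(day["Total"]["UnblendedCost"]["Amount"])
        for day in response["ResultsByTime"]
    ]


def average_daily_change(before: list, after: list) -> float:
    """Average daily cost after the change minus the average before it."""
    return sum(after) / len(after) - sum(before) / len(before)


if __name__ == "__main__":
    # Illustrative numbers only, not real billing data.
    print(average_daily_change([10.0, 10.0], [4.0, 6.0]))
```

A negative result from average_daily_change indicates lower average daily spend after the cleanup. Because billing data can lag by a day or two, the most recent days are best excluded from the "after" series.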
Key Points to Consider:
FinOps Insights for RDS Removal and Optimization
We’d like to thank the following people for their work on this Playbook:
We’d also like to thank our supporters, Donal Burke, Nicole Boyd, Dusty Bowling, Brian Robbins, and Noel Crowley.