This content was provided as a Professional Contribution through the FinOps Certified Professional program.
Summary: FinOps practitioners can implement gamification to improve accountability and support container optimization. Deploying shared dashboards that rank cost and resource efficiency by application namespace, celebrating the top-performing teams in monthly newsletters, and offering targeted coaching (such as implementing HPA/VPA scaling or start-stop scripts) to under-performers shifts the culture from top-down enforcement to peer-driven ownership. By making cloud usage transparent and comparative, engineers become motivated to proactively optimize their Kubernetes environments without sacrificing reliability.
The migration of our applications from on-premises infrastructure to Kubernetes on AWS was driven by a strategic decision from leadership to capitalize on the cloud’s scalability and agility. After the transition, however, spending went largely uncontrolled, and costs for some applications became excessively high.
This unexpected surge in expenses prompted a crucial shift in focus towards tighter financial control and enhanced resource management.
Here are insights on how we navigated these challenges, gamified the optimization of containers, and implemented FinOps practices to optimize spending while ensuring efficient resource utilization across our cloud operations.
Before we took any FinOps action, our transition to Kubernetes presented several challenges, largely because the platform was new to many DevOps team members. With a primary focus on production stability, the DevOps teams allocated high guaranteed vCPU and memory resources across the board to maintain service availability. The same production-level allocations were then replicated in the development and pre-production environments, significantly increasing costs.
Additionally, there was a lack of attention to unused pods, which continued to consume resources unnecessarily. Moreover, the teams had not implemented Vertical Pod Autoscaler (VPA) and Horizontal Pod Autoscaler (HPA) for autoscaling, missing critical opportunities to optimize resource usage and cost efficiency.
Under pressure from leadership to deliver immediate results, we initially organized meetings with the DevOps teams responsible for the applications that consumed the most resources. During these sessions, we presented our usage analysis and provided targeted recommendations to eliminate idle resources, introduce scheduled start-stop routines, and implement rightsizing strategies. This approach yielded positive outcomes in the short term.
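A rightsizing recommendation of this kind can be sketched in a few lines of Python. The method behind the original usage analysis is not detailed here, so everything in this example is an assumption made for illustration: the 95th-percentile rule, the 20% headroom factor, the sample data, and the function names `percentile` and `recommend_request`.

```python
def percentile(samples, pct):
    """Nearest-rank percentile; pct is an integer between 1 and 100."""
    if not samples:
        raise ValueError("at least one usage sample is required")
    ordered = sorted(samples)
    # Nearest-rank method: the smallest value with at least pct% of
    # the samples at or below it (integer ceiling of pct * n / 100).
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def recommend_request(usage_samples, pct=95, headroom=1.2):
    """Suggest a vCPU (or memory) request: observed percentile plus headroom."""
    return percentile(usage_samples, pct) * headroom


# Hypothetical vCPU usage samples for one container that requests 4 vCPU.
samples = [0.30, 0.35, 0.40, 0.32, 0.38, 0.45, 0.50, 0.41, 0.36, 0.33,
           0.39, 0.42, 0.47, 0.52, 0.60, 0.55, 0.48, 0.44, 0.37, 1.80]
current_request = 4.0
suggested = recommend_request(samples)
print(f"current request: {current_request:.2f} vCPU, "
      f"suggested: {suggested:.2f} vCPU")
```

A percentile-based rule deliberately ignores rare spikes, such as the single 1.80 sample here. Whether that trade-off is acceptable depends on the workload, which is why a recommendation like this is a starting point for discussion with the owning team rather than a value to apply blindly.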
However, once the frequency of our meetings decreased, old habits resurfaced. The teams reverted to demanding high allocations of guaranteed vCPU and memory, justifying these requests with issues encountered during usage spikes or severe system crashes. A significant challenge we faced with this initial strategy was that it did not foster a sense of ongoing accountability among the DevOps teams, leading to repeated resource management issues.
When our initial efforts did not lead to sustainable savings, we decided to adopt a new strategy centered around ‘responsibility’ and ‘gamification’. We began by creating a Power BI dashboard that displayed and compared costs and CPU and memory usage, segmented by application namespace. This allowed all business units to view their costs in relation to their peers. At the end of each month, we compiled these insights into a newsletter that highlighted and celebrated the teams that achieved the most significant cost reductions.
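The ranking logic behind such a comparative view can be illustrated with a short Python sketch. It is not the Power BI model itself: the `NamespaceUsage` fields, the efficiency definition (used divided by requested, averaged over CPU and memory), and the sample figures are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class NamespaceUsage:
    namespace: str
    cost_prev: float       # previous month's cost
    cost_curr: float       # current month's cost
    cpu_requested: float   # average vCPU requested
    cpu_used: float        # average vCPU actually used
    mem_requested: float   # average GiB requested
    mem_used: float        # average GiB actually used


def efficiency(u):
    """Mean of CPU and memory utilisation (used / requested), each capped at 1."""
    cpu = min(u.cpu_used / u.cpu_requested, 1.0) if u.cpu_requested else 0.0
    mem = min(u.mem_used / u.mem_requested, 1.0) if u.mem_requested else 0.0
    return (cpu + mem) / 2


def cost_reduction_pct(u):
    """Month-over-month cost reduction in percent (negative means cost went up)."""
    if u.cost_prev == 0:
        return 0.0
    return (u.cost_prev - u.cost_curr) / u.cost_prev * 100


def leaderboard(rows):
    """Namespaces ordered from largest to smallest cost reduction."""
    return sorted(rows, key=cost_reduction_pct, reverse=True)


# Hypothetical monthly figures for three namespaces.
rows = [
    NamespaceUsage("search", 500.0, 450.0, 8.0, 2.0, 16.0, 8.0),
    NamespaceUsage("batch", 2000.0, 2100.0, 16.0, 4.0, 64.0, 16.0),
    NamespaceUsage("payments", 1000.0, 800.0, 4.0, 2.0, 8.0, 4.0),
]

for position, u in enumerate(leaderboard(rows), start=1):
    print(f"{position}. {u.namespace}: "
          f"{cost_reduction_pct(u):+.1f}% cost reduction, "
          f"{efficiency(u):.0%} efficiency")
```

Ranking on relative reduction rather than absolute cost keeps small and large applications comparable, while the efficiency column shows whether a low bill reflects good sizing or simply a small workload.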
For those teams showing less engagement, we provided specific recommendations and offered additional support. This included arranging coaching sessions with DevOps experts to assist in implementing start and stop scripts for off-peak times, such as nights and weekends. We also helped the highest consuming applications implement Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA) to optimize their resource scaling.
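Two of the coaching topics above lend themselves to a small worked example. The Python sketch below first models an off-hours schedule: the 07:00 to 20:00 weekday window and the function names are assumptions for illustration, not the schedule used by the teams. It then applies the core replica formula from the Kubernetes HPA documentation, desired = ceil(current replicas × current metric / target metric), leaving out the tolerance band, minimum and maximum replica bounds, and stabilization behaviour that a real HPA also applies.

```python
import math
from datetime import datetime


def should_be_running(now, start_hour=7, stop_hour=20):
    """True during weekday working hours; False at night and on weekends."""
    if now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return start_hour <= now.hour < stop_hour


def weekly_uptime_fraction(start_hour=7, stop_hour=20):
    """Share of the 168-hour week during which the workload is up."""
    return 5 * (stop_hour - start_hour) / (7 * 24)


def hpa_desired_replicas(current_replicas, current_metric, target_metric):
    """Core HPA formula only; real HPAs also apply tolerance and min/max bounds."""
    return math.ceil(current_replicas * current_metric / target_metric)


# A non-production workload on a Wednesday morning, Wednesday night and a Saturday.
for moment in (datetime(2024, 1, 3, 10, 0),
               datetime(2024, 1, 3, 23, 0),
               datetime(2024, 1, 6, 10, 0)):
    state = "running" if should_be_running(moment) else "stopped"
    print(f"{moment:%a %H:%M} -> {state}")

print(f"uptime share of the week: {weekly_uptime_fraction():.0%}")

# 4 replicas at 90% average CPU utilisation against a 60% target.
print("desired replicas:", hpa_desired_replicas(4, 90, 60))
```

With these assumed hours, a development or pre-production environment runs for well under half of the week, which is where start-stop routines recover most of their savings. Autoscaling addresses the complementary case of production workloads that must stay up but do not need peak capacity all the time.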
Our revised approach delivered strong results and noticeably improved our resource utilization and efficiency. By integrating gamification and transparent reporting into our processes, we established a stronger FinOps culture across the organization.
This shift led to increased accountability among the DevOps teams, who took greater ownership of their resource management. The teams felt a sense of pride in their actions, motivated by the positive recognition from their peers and the detailed feedback provided. Moreover, this transformation received strong support from leadership, reinforcing the importance of FinOps principles at both the management and team levels.
This collective effort not only improved our FinOps efficiency but also solidified a culture of cost-awareness and proactive management.
Because it was driven by urgency, our first approach did not fully embrace the foundational FinOps principle that ‘Everyone takes ownership of their cloud usage.’ As a result, the improvements were not sustained. Recognizing this, we shifted our strategy to emphasize establishing reference points that teams could compare themselves against, and employing gamification.
We implemented a comparative dashboard that visually encouraged DevOps teams to modify their behaviors by making the data easy to understand and act upon. This simple yet effective tool not only motivated teams but also facilitated immediate feedback on their actions, illustrating the potential future benefits of their efforts.
By simplifying the presentation and communication of data, we ensured that all team members could easily engage with and respond to the insights provided, fostering a more robust and enduring FinOps culture.