How finance, operations, and executives build models to forecast cloud spend and allocate budgets to business units.
You should understand the basics of how cloud works. Specifically, you should know the key compute and storage services of the cloud providers your organization uses, along with their billing and pricing models. You will also need to understand financial processes around forecasting, budgeting, procurement, and allocations.
Depending on the cloud providers your organization is using, you can gain some of this knowledge through training and certifications. Specifically for AWS we recommend the AWS Cloud Practitioner certification, for Google the Google Cloud Platform Fundamentals course, and for Azure, the Azure Fundamentals learning path.
Fundamentally, there is a potential gap between engineers, finance, and procurement: finance has financial reporting responsibilities, procurement has accounting responsibilities, and both need assistance from engineers and leadership to meet these obligations.
In this section we analyze the challenges around cloud forecasting and identify how to overcome them. We also provide examples of how companies of different types and FinOps maturity levels tackle cloud forecasting.
Unfortunately there is no one forecasting method that fits all situations.
Cloud spend is variable, which makes it inherently difficult to predict. Specifically, engineers can start workloads at any time, typically without having to go through a procurement process.
Forecasting cloud-provider consumption as product or service consumption requires specific data and tooling to be consistently available. Billing and reporting from cloud providers are difficult to understand and to explain to traditional finance teams.
Workloads need to be clearly defined whether through tagging or account structures so that cost can be attributed back to them and their owners.
Tagging or labeling is the foundation for telling workloads apart in the cloud, identifying ownership, and attributing costs to teams. Depending on the maturity of the organization, tagging may be manual, backed by automated tag hygiene monitoring, or integrated into CI/CD pipelines with tag-or-terminate policies in place.
Even in a best case scenario where everything taggable has been tagged in the cloud, not all cloud resources support tagging. This means that untaggable costs, like network traffic, need to be apportioned to the workloads responsible for incurring their cost.
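One way to apportion untaggable costs can be sketched as follows. This is a minimal illustration, not a prescribed method: it assumes the shared cost is split in proportion to each workload's tagged spend, and the function name, workload names, and amounts are all hypothetical. Other allocation keys, such as bytes transferred, may fit better when that data is available.

```python
def apportion_shared_cost(tagged_costs, shared_cost):
    """Split an untaggable cost across workloads by their share of tagged spend.

    Assumption: tagged spend is a reasonable proxy for who incurred the shared cost.
    """
    total_tagged = sum(tagged_costs.values())
    if total_tagged == 0:
        raise ValueError("no tagged cost to apportion against")
    return {
        workload: cost + shared_cost * cost / total_tagged
        for workload, cost in tagged_costs.items()
    }


# Hypothetical monthly figures: tagged spend per workload plus untagged network traffic.
tagged = {"checkout": 6000.0, "search": 3000.0, "reporting": 1000.0}
allocated = apportion_shared_cost(tagged, shared_cost=500.0)
```

With these example figures, the 500.0 of network cost is split 300.0, 150.0, and 50.0 across the three workloads, so the allocated totals still sum to the full bill.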
To identify ownership and attribute cost back to teams, additional tags are needed, for example cost center, VP, business unit, department, or owner, which is typically the engineer or automation that launched the workload. Which of these tags your organization uses depends on your tagging standard and organizational structure.
Tags may also change over time, for example when applications are decomposed into microservices or when organizational changes require tags to be renamed. Any system relying on tags needs to handle versioning of tags to follow these changes and represent cost data accurately.
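A tag hygiene check against a tagging standard might look like the sketch below. The required tag keys are an assumed example standard, not one defined in this document, and the function is hypothetical; a real check would read tags from your cloud provider's inventory or billing export.

```python
# Assumed example standard; replace with the tag keys your organization requires.
REQUIRED_TAGS = {"cost_center", "business_unit", "owner"}


def missing_tags(resource_tags):
    """Return the required tag keys that are absent or blank on a resource."""
    present = {key for key, value in resource_tags.items() if value and value.strip()}
    return sorted(REQUIRED_TAGS - present)


# A resource with an owner but an empty cost center and no business unit.
gaps = missing_tags({"owner": "alice", "cost_center": ""})
```

A report built from a check like this can feed tag hygiene monitoring, or act as a gate in a CI/CD pipeline before a workload is launched.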
A key capability of FinOps is to enable communication between executives, finance, business, and engineers. FinOps practitioners need to strive to build a culture of communication to enable fast and high quality decision making.
A common challenge in cloud forecasting related to communication is that the people working on a forecast are not being included in decisions that substantially impact the forecast. This includes project scope changes that affect cloud spend.
Finance will have specific requirements of when forecasting is due and how frequently forecast updates are needed. Most common is an annual forecast that is due close to the end of the fiscal year of the company. Intermediate forecasts may be necessary to update budgets based on business drivers.
Depending on the maturity of an organization, specific prediction models will be easier to implement, for example trend based forecasting versus driver based forecasting. Finance will also have requirements around forecast granularity and frequency depending on their fiscal reporting requirements.
Accuracy challenges include identifying workloads that perform substantially over or under forecast when compared to actuals; for driver based forecasting, identifying why workloads scale differently from their drivers; and layering in discounts, optimizations, and prepayments.
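Comparing forecast to actuals per workload can be sketched as a simple variance check. The 10% threshold, the function name, and the sample figures are assumptions for illustration; each organization will set its own tolerance.

```python
def flag_variances(forecast, actuals, threshold=0.10):
    """Return workloads whose actual spend deviates from forecast by more than threshold.

    Variance is (actual - forecast) / forecast, so positive means over forecast.
    """
    flagged = {}
    for workload, planned in forecast.items():
        if planned == 0:
            continue  # no meaningful ratio for a zero forecast
        variance = (actuals.get(workload, 0.0) - planned) / planned
        if abs(variance) > threshold:
            flagged[workload] = variance
    return flagged


# Hypothetical monthly figures per workload.
forecast = {"checkout": 10000.0, "search": 5000.0, "reporting": 2000.0}
actuals = {"checkout": 12500.0, "search": 5100.0, "reporting": 1500.0}
outliers = flag_variances(forecast, actuals)
```

Here `checkout` is flagged at 25% over forecast and `reporting` at 25% under, while `search` stays within tolerance. The flagged list is a starting point for conversations with workload owners, not an explanation in itself.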
Cloud spend materiality defines where the organization focuses their resources. Lack of cloud forecasting accuracy will not be addressed until it has become a larger problem and has executive attention and sponsorship.
Workloads that do not exist in the cloud yet, or substantial additions to existing workloads such as high availability, disaster recovery, or new persistence layers like databases, require manual estimation of the new costs.
All major cloud providers offer web-based cost calculators that allow you to model workloads that do not yet exist in the cloud and obtain a cost estimate. However, the estimate is only as good as the model behind it. Typically the best person to build the model is the engineer who is going to launch the new workload, as they have in-depth subject matter knowledge.
The challenge here is that the engineer may not have a perfect view of how the actual cloud workload will look once it is launched. Common mistakes are to forget to model a specific aspect of the workload like data transfer, or to overprovision compute resources as utilization in the cloud is not yet known.
An iterative approach is recommended where the engineer revises the initial model and shares the updated estimates with the forecasting team so they can update the numbers in the forecast and layer in the new estimate.
Once a forecast is created, FinOps can add value by configuring budget alerts in AWS or spending quotas in Azure to support accountability of actuals versus budget.
Analyze your cloud cost to identify wasted resources before you forecast. Review your usage data for improvements that can be made to your infrastructure. This gives you an accurate baseline to forecast from.
FinOps & Technology training (e.g. cheaper services replacing more expensive ones)
Here are common FinOps roles and their responsibilities and expectations as they relate to building accurate cloud forecasts.
Are the primary sponsors for process improvements around cloud usage. FinOps needs their understanding, buy-in, and support so that improvements can trickle down the organizational hierarchy.
Is the main consumer of forecasts and will drive frequency, granularity, and quality requirements around forecasting.
Has established processes that need to be extended to cloud services and prepayment products such as reservations, savings plans, and committed use discounts.
Are engineers tasked with day to day operations in the cloud across all business units. They are responsible for implementing requirements from the FinOps team or Cloud Center of Excellence (CCoE) around governance, efficiency, and security.
Are engineers focusing on visibility and reporting in the cloud across all business units. They are responsible for building actionable, accurate, consistent, near real-time insights for engineers, leadership, and finance based on requirements from the FinOps team or CCoE.
Gathers requirements for FinOps processes and practices, gets buy-in from executives, and communicates requirements and deliverables to engineering leaders.
Communicate FinOps processes and practices to engineers, provide training opportunities, validate that processes are followed, and reward positive outcomes.
Are the front-line executors of FinOps processes and practices. Finance relies on them for quality tagging so that cost attribution is accurate.
Uses historic trends to forecast future spend. Ideally this takes seasonality into consideration. Seasonality can include annual peaks during holidays but also daily peaks when more people are using a service during specific hours of the day.
Trend based forecasting will not be able to capture out-of-band events such as launching a new product or feature, launching in a new country, or the effect of TV commercials on consumer behavior.
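A very simple form of trend based forecasting with seasonality is sketched below. It is a seasonal-naive projection with a growth factor, chosen here for brevity and not the only or best method: each future period repeats the same period of the latest season, scaled by season-over-season growth. The function name and the quarterly figures are hypothetical.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Project spend by repeating the latest season, scaled by season-over-season growth."""
    if len(history) < 2 * season_length:
        raise ValueError("need at least two full seasons of history")
    latest = history[-season_length:]
    previous = history[-2 * season_length:-season_length]
    growth = sum(latest) / sum(previous)  # e.g. 1.1 means 10% growth per season
    forecast = []
    for step in range(horizon):
        seasons_ahead = step // season_length + 1
        forecast.append(latest[step % season_length] * growth ** seasons_ahead)
    return forecast


# Two hypothetical years of quarterly spend with a peak in the fourth quarter.
quarterly_spend = [100.0, 120.0, 100.0, 140.0, 110.0, 132.0, 110.0, 154.0]
next_year = seasonal_naive_forecast(quarterly_spend, season_length=4, horizon=4)
```

With this sample history the second year is 10% above the first, so the projection keeps the fourth-quarter peak and lifts every quarter by roughly 10%. As noted above, a model like this cannot anticipate out-of-band events such as launches or marketing campaigns.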
Uses Key Performance Indicators (KPIs) to forecast the effect on business results. KPIs can be things like active accounts, widgets sold, or ad impressions. The business forecasts the KPIs, factoring in organic growth, like more people on the Internet, and inorganic growth, like new launches and marketing efforts. Cloud workloads that scale with a specific business KPI are forecast by applying the KPI growth to actual spend.
Driver based forecasting will not be able to forecast workloads that don’t exist in the cloud yet but are planned to be launched in the future.
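Applying KPI growth to actual spend can be sketched as follows. The sketch assumes spend scales linearly with the driver, which real workloads often do not; the function name, the active-accounts KPI, and the figures are hypothetical.

```python
def driver_based_forecast(actual_spend, actual_kpi, forecast_kpis):
    """Scale current spend by the ratio of each forecast KPI value to the current KPI.

    Assumption: cost per KPI unit stays constant (linear scaling).
    """
    if actual_kpi <= 0:
        raise ValueError("actual_kpi must be positive")
    return [actual_spend * kpi / actual_kpi for kpi in forecast_kpis]


# Hypothetical: 50,000 of monthly spend at 1,000,000 active accounts,
# with the business forecasting 1.1M and then 1.25M active accounts.
projected = driver_based_forecast(50000.0, 1_000_000, [1_100_000, 1_250_000])
```

Under the linear assumption this projects 55,000 and 62,500. When actuals diverge from a projection like this, that is a signal to investigate why the workload scales differently from its driver.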
Predicts the next month, quarter, and year. It allows companies to adjust their plans based on shifts in the business such as economic changes, with COVID as one example. As the economy changed, a rolling forecast would be adjusted to reflect that change and allow the company to alter its plans with the new data.
Predicts the fiscal year only, with no adjustments.
Are planned cloud workloads that currently do not yet exist in the cloud. Their cost needs to be estimated by engineers and layered into trend or driver based forecasting to get a complete picture of future cloud spend. Special projects can also be costs that will not materialize on the cloud bill like licensing fees, professional services, or small workloads running on other cloud providers where automation isn’t feasible.
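Layering special projects onto a baseline forecast can be sketched as below. The data shape (a start month index plus a list of monthly estimates) and all names and figures are assumptions for illustration; the baseline would come from a trend or driver based forecast.

```python
def layer_special_projects(baseline, special_projects):
    """Add estimated special-project costs onto a monthly baseline forecast.

    Estimates that extend past the end of the baseline horizon are ignored.
    """
    total = list(baseline)  # copy so the baseline itself is not modified
    for project in special_projects:
        for offset, cost in enumerate(project["monthly_costs"]):
            month = project["start_month"] + offset
            if month < len(total):
                total[month] += cost
    return total


# Hypothetical four-month baseline and one planned project starting in month index 2.
baseline = [100, 110, 120, 130]
projects = [{"name": "dr-site", "start_month": 2, "monthly_costs": [20, 20, 20]}]
layered = layer_special_projects(baseline, projects)
```

For this input the layered forecast is `[100, 110, 140, 150]`. Keeping project estimates separate from the baseline makes it easier to replace them when engineers revise their models.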
Forecasting tools are helpful because they apply sophisticated algorithms to your usage and cost data.
The FinOps Foundation extends a huge thank you to the members of the Special Interest Group that broke ground on this documentation:
If we’ve missed anyone, let us know. We thank you all for your contributions.
Terms that were previously in this appendix can be found on the FinOps Terminology page.