The Complexity of AWS Billing
When AWS spend is small, organizations often choose to ignore AWS bill complexity. Why spend a dollar on bill management to save a dime in spend? Surely a simple spreadsheet would suffice.
Not so fast.
Manual spreadsheet analysis of AWS Cost and Usage Reports (CUR) fails after the AWS footprint goes beyond select workloads and minimal spend.
All this detail is captured—somewhere—in your AWS bill, but you need more than just a spreadsheet to take a CUR data dump and deliver actionable insights. AWS billing is too complex for that.
Keeping your (AWS) balance
AWS billing complexity obscures the key to cloud optimization: striking a balance between cost-effectiveness and performance risk.
The balancing act isn’t new, as it also holds for on-prem IT. But there’s a crucial difference.
Own a datacenter, and spend is locked into a depreciation cycle. Unless you are so inefficient you must go out and build more capacity, there isn’t any variability to on-prem datacenter spend.
Not so for cloud solutions.
AWS spend reflects every imbalance between cost-effectiveness and performance risk. Over-rotate on AWS cost-effectiveness and fail to meet datacenter SLAs? You lose that balance and pay for the privilege of course-correcting.
An AWS-hosted datacenter does not have fixed costs. Spend is based on usage, and there are no operational restrictions to capacity—the only limit is your budget. If predictability in datacenter spend is the goal, a cloud center of excellence (CCoE) or cloud procurement team must identify levers to control costs (e.g., right-sizing, utilizing Reserved Instances, turning off wasted resources). CURs hide those levers in millions of rows of data.
Measuring cloud migration success
A cloud-first, or cloud-smart, strategy must articulate the benefits of moving to the cloud and the benefits of moving from an on-premises (on-prem) model. An AWS bill only tells part of the migration story.
An EC2 m5.large instance should, through the eyes of corporate finance, be twinned with decommissioned on-prem servers. A CCoE must communicate that this isn’t an overnight process. To avoid disrupting vital business or product services, it is an exercise in ramping down datacenter costs while simultaneously ramping up cloud spend. A step function, not a singular event.
The figure below shows the costing curve for a datacenter migration. Understanding where you are in this migration journey needs more than just an AWS bill—you also need on-prem spend and the costs associated with execution (“the migration bubble”).
AWS billing can’t tell the story of increased cloud spend paired with a step-function drop in existing datacenter spend.
SKUs and payment options
Efforts to manually analyze AWS bills are swamped by too much detail (hello, per-second billing) and too many services (~300k AWS SKUs). This paradox of choice makes it difficult to pick the right option for each workload (e.g., refactoring an application for ECS or Lambda instead of using EC2), or to ensure that others are making the same choices.
Dispersed cloud procurement makes it hard to track resources and their purchase terms. EC2 alone has three types of RIs to choose from (standard, convertible, and scheduled), each bought with one of three payment options (all upfront, partial upfront, or no upfront).
When organizations fail to unpack the complexity of a CUR, they often treat the cloud bill as a single cost center and spread it evenly across all cloud users. There is no faster way to promote cloud waste than making everyone pay the same for it, regardless of usage.
Billing file sizes and handling large amounts of data
There can be millions of AWS billing line items. With resource-level detail and hourly granularity switched on, a CUR carries a line for every resource for every hour it runs, plus lines for serverless usage, even when the underlying charges are metered per second. That’s a lot of information to analyze for a single compute resource, let alone a pool of 10,000 EC2 instances.
This mass of data leads many organizations to analyze their AWS bill only once a month, even though a CUR is updated multiple times a day.
Organizations looking for usage trends, and wanting to react to them, need to look at cloud bills more frequently than once a month—30 days is a long time to pay for EC2 instances with less than 5% utilization (a.k.a. zombie instances) or unattached EBS volumes that are costing money but are not being used for anything.
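As a rough illustration of the check described above, the sketch below flags zombie instances and unattached EBS volumes from data you have already collected. The record shapes, field names, hourly rates, and the 730-hours-per-month figure are assumptions made for this example, not an AWS API; only the 5% threshold comes from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold from the text: under 5% average utilization counts as a zombie.
ZOMBIE_CPU_THRESHOLD = 5.0


@dataclass
class Instance:
    instance_id: str
    avg_cpu_percent: float  # average CPU over the review window (assumed metric)
    hourly_cost: float      # hypothetical hourly rate in USD


@dataclass
class Volume:
    volume_id: str
    attached_to: Optional[str]  # None means the volume is unattached
    monthly_cost: float         # hypothetical monthly cost in USD


def find_zombies(instances):
    """Return instances whose average CPU is below the zombie threshold."""
    return [i for i in instances if i.avg_cpu_percent < ZOMBIE_CPU_THRESHOLD]


def find_unattached(volumes):
    """Return volumes that are not attached to any instance."""
    return [v for v in volumes if v.attached_to is None]


def monthly_waste(instances, volumes, hours_per_month=730):
    """Estimate what a month of zombies and unattached volumes costs."""
    zombie_cost = sum(i.hourly_cost * hours_per_month for i in find_zombies(instances))
    volume_cost = sum(v.monthly_cost for v in find_unattached(volumes))
    return round(zombie_cost + volume_cost, 2)
```

Run daily rather than monthly, a check like this surfaces the waste while it is still a few dollars instead of a full billing cycle.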
Managing and storing an AWS bill takes more than downloading a CSV and opening Excel. For starters, a CUR with more than a million rows runs into Excel’s hard limit of 1,048,576 rows per worksheet.
(To help make storage and updates easier, part of setting up your account for CUR includes creating an S3 bucket to hold the CUR files. Accessing the CUR is then a matter of syncing with the S3 bucket, ideally with a tool that can make full use of the data.)
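One way around the row limit is to stream the report instead of opening it. The sketch below is a minimal example that assumes a CSV-format CUR already synced from the S3 bucket and the column names `lineItem/ProductCode` and `lineItem/UnblendedCost`; check both against your own report definition, since the columns present depend on how the CUR was configured.

```python
import csv
from collections import defaultdict
from decimal import Decimal


def cost_by_product(lines):
    """Total unblended cost per product code, reading one row at a time.

    `lines` is any iterable of CSV text lines (an open file, for example),
    so the whole report never has to fit in memory.
    """
    totals = defaultdict(Decimal)
    for row in csv.DictReader(lines):
        # Treat a blank cost cell as zero rather than failing on it.
        cost = row.get("lineItem/UnblendedCost") or "0"
        totals[row["lineItem/ProductCode"]] += Decimal(cost)
    return dict(totals)
```

For a gzip-compressed report, the same function can be fed from `gzip.open(path, "rt", newline="")`, where `path` is wherever you synced the file.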
Reserved Instance (RI) discounts
RIs can save upwards of 70% over on-demand pricing, but the devil is in the detail. A promised saving isn’t an actual saving when RIs are left unused, or when workloads keep running on on-demand resources that no RI covers.
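The gap between promised and realized savings comes down to simple arithmetic. The sketch below assumes a simplified model in which an RI is paid for every hour of its term but only saves money in the hours it is actually used. The rates are hypothetical and the function names are ours.

```python
def break_even_utilization(on_demand_rate, ri_effective_rate):
    """Fraction of hours an RI must be used before it beats on-demand."""
    return ri_effective_rate / on_demand_rate


def realized_savings(on_demand_rate, ri_effective_rate, utilization):
    """Savings versus buying only the used hours on demand.

    A negative result means the RI cost more than on-demand would have.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    on_demand_equivalent = on_demand_rate * utilization
    return 1 - ri_effective_rate / on_demand_equivalent


if __name__ == "__main__":
    # Hypothetical rates: $0.10/hour on demand, $0.03/hour effective RI rate.
    for used in (1.0, 0.5, 0.2):
        saving = realized_savings(0.10, 0.03, used)
        print(f"utilization {used:.0%}: realized savings {saving:.0%}")
```

With these made-up rates, a 70% headline discount shrinks to roughly 40% at half utilization and turns into a loss once utilization falls below the 30% break-even point.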
When it comes to billing, the trickiness with RIs and Savings Plans (SPs) lies in how they’re billed. When an RI or SP is applied to a resource, that resource’s usage will often show a cost of zero, while the cost of the RI/SP itself appears as a separate line item. While this makes cash accounting easy, it isn’t a truly accurate allocation. That requires amortization (see below), as well as an understanding of the CUR file.
Manual management of RIs isn’t scalable. As usage patterns change, you can modify or exchange RIs at no cost. Fail to keep track of RI usage, and you lose the opportunity to make informed RI exchange decisions.
AWS now offers on-demand, spot, RI, and Savings Plans pricing for cloud resources. Getting the most out of each requires understanding expiration dates and the changing demands of each workload.
AWS regularly releases updates to its pricing structure. SPs were new as of November 2019. A manually updated spreadsheet to manage cloud spend needs to integrate this, and every subsequent change, to provide current analysis. That’s a big ask.
One challenge for IT and DevOps management is to match revenues with their overall AWS costs at any given point in time. But upfront charges, like the fees to use RIs, can be challenging to account for with default AWS billing features.
Every RI has its own start and end date. With hundreds if not thousands of RIs, organizations struggle to amortize, report on, and visualize RI charges and their accounting treatment over a one- or three-year term.
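To make the amortization idea concrete, here is a minimal sketch that spreads an upfront RI fee evenly per day across its term and reports the share falling inside a reporting period. Straight-line daily amortization is an assumption chosen for illustration. Your finance team's accounting treatment may differ, and the fee and dates below are invented.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


def amortized_cost(upfront_fee, ri_start, ri_end, period_start, period_end):
    """Straight-line, per-day share of an upfront fee that falls in a period.

    All date ranges are half-open: the end date itself is excluded.
    """
    total_days = (ri_end - ri_start).days
    if total_days <= 0:
        raise ValueError("ri_end must be after ri_start")
    # Only the days where the RI term and the reporting period overlap count.
    overlap_start = max(ri_start, period_start)
    overlap_end = min(ri_end, period_end)
    overlap_days = max((overlap_end - overlap_start).days, 0)
    share = Decimal(upfront_fee) * overlap_days / total_days
    return share.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # Hypothetical one-year RI bought mid-month for $730 upfront.
    january = amortized_cost(
        "730.00", date(2025, 1, 15), date(2026, 1, 15),
        date(2025, 1, 1), date(2025, 2, 1),
    )
    print(january)
```

Because each RI carries its own start and end date, the same function can be applied per RI and summed per month to build the report that a cash view of the bill cannot provide.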
Chargeback and showback
AWS bills are light on organizational detail. To understand usage, and recover costs with chargeback, you need to tag AWS resources so that bill line items map to unique environments, business units, teams, or products. Without the right tagging, chargeback and showback fall back on indefensible even-spread or headcount allocations. The agility, speed, and flexibility of AWS aren’t reflected in these static allocations, and end users push back against them (“How much? I’m an engineer, I don’t deal with guesstimates”).
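A tag-based showback can be sketched in a few lines once CUR rows are available as dictionaries. The example assumes a user-defined cost allocation tag named `Team`, which would surface in the CUR as the column `resourceTags/user:Team`. The tag key is hypothetical, so substitute whatever tagging scheme your organization uses.

```python
from collections import defaultdict
from decimal import Decimal

TAG_COLUMN = "resourceTags/user:Team"   # hypothetical tag key "Team"
COST_COLUMN = "lineItem/UnblendedCost"


def showback(rows, tag_column=TAG_COLUMN):
    """Total cost per tag value, with missing tags grouped as 'untagged'."""
    totals = defaultdict(Decimal)
    for row in rows:
        owner = (row.get(tag_column) or "").strip() or "untagged"
        totals[owner] += Decimal(row[COST_COLUMN])
    return dict(totals)


def tag_coverage(totals):
    """Fraction of spend that carries a usable tag (0 when there is no spend)."""
    grand_total = sum(totals.values(), Decimal("0"))
    if grand_total == 0:
        return Decimal("0")
    return 1 - totals.get("untagged", Decimal("0")) / grand_total
```

The untagged bucket is the part of the bill that still has to be allocated by even spread or headcount, so tracking coverage over time shows how defensible the chargeback numbers are.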
You need a tool to manage AWS bill complexity
Cloud replaces a world of predictable computing expenses with variable costs based on load, amortization of Reserved Instances, provider price cuts, and poor procurement transparency.
The complexity of AWS billing isn’t going away. It’s a sobering thought: your AWS bill will never be as understandable as it is today.