
How to Manage AI Spend in 2026: 5 Takeaways from FinOps X

Navanita Devi

Head of Marketing

Updated June 16, 2026


FinOps X 2026 brought together 2,500+ practitioners, engineers, and finance leaders last week. A lot was discussed. But one number cut through everything else.

Two years ago, 31% of FinOps teams were managing AI spend. Today that number is 98%.

That shift didn’t happen because AI became a boardroom priority. It happened because the invoices arrived and nobody was ready for them.

This is an organizational problem that FinOps teams are now being handed. And understanding why AI costs are so hard to manage is the first step to actually getting on top of them.

1. AI Costs Don’t Follow the Rules You Already Know

Cloud cost management has a playbook. You provision infrastructure, monitor utilization, right-size instances, and set budgets based on relatively predictable growth curves. It’s not easy, but the feedback loop is understood.

AI spend breaks almost every assumption in that playbook.

The fundamental unit is the token, not the compute hour or the gigabyte of storage. Token consumption scales with behavior: how often a model is called, how long the prompts are, which model tier is being used, whether outputs are cached or regenerated. A product team ships a new feature. Usage spikes. The bill looks nothing like the forecast. No one is sure why.
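To make the arithmetic concrete, here is a minimal sketch of how per-call token counts turn into a monthly bill. The `ModelPrice` type, the `monthly_cost` function, and every price and volume below are hypothetical, chosen only for illustration and not taken from any provider’s price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """USD per one million tokens. The values used below are hypothetical."""
    input_per_million: float
    output_per_million: float


def monthly_cost(calls_per_day: int, input_tokens: int, output_tokens: int,
                 price: ModelPrice, days: int = 30) -> float:
    """Estimate monthly spend for one feature from average per-call token counts."""
    per_call = (input_tokens * price.input_per_million
                + output_tokens * price.output_per_million) / 1_000_000
    return calls_per_day * days * per_call


# Hypothetical price tiers, for illustration only.
small = ModelPrice(input_per_million=0.50, output_per_million=1.50)
large = ModelPrice(input_per_million=5.00, output_per_million=15.00)

# Forecast: 10k calls/day, short prompts, small model tier.
forecast = monthly_cost(10_000, 500, 200, small)

# After launch: 4x the calls, 4x the prompt length, larger model tier.
actual = monthly_cost(40_000, 2_000, 200, large)

print(f"forecast ${forecast:,.0f}/month, actual ${actual:,.0f}/month")
```

With these made-up inputs the forecast is about $165 a month and the post-launch figure is about $15,600. No single change looks dramatic on its own, but call volume, prompt length, and model tier multiply together.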

This is the core challenge. Traditional cloud monitoring tools aren’t built to track this. Traditional budgeting assumptions don’t hold. And the teams building AI features are often moving too fast to think about cost attribution until finance asks the question.

2. The “Flat Rate” Era Is Over

A big part of why AI spend crept up on so many organizations is that early adoption was cheap and predictable. Many companies started with flat-rate API subscriptions or bundled enterprise agreements. The bill was fixed. Nobody watched it closely.

Then usage scaled, those agreements ended, and teams moved to consumption-based pricing. The cost model changed completely, and the visibility needed to catch the shift early didn’t exist.

Gartner forecasts $2.59 trillion in global AI spending in 2026. The organizations contributing to that number aren’t all AI-native startups. They’re enterprises that started with a few LLM integrations and now have dozens of teams running production AI workloads, each generating costs that nobody centrally owns.

3. Attribution Is the Hardest Part

Ask most FinOps or finance teams where their AI spend is going and they’ll give you a number but not a breakdown. The spend shows up as a vendor line item (OpenAI, Anthropic, AWS Bedrock, Azure OpenAI), but mapping that back to a product, a team, or a business unit is a different problem entirely.

This matters because without attribution, you can’t prioritize. You don’t know if the spend is generating value or burning budget on low-impact use cases. You can’t have a rational conversation about whether to optimize or invest more.

The teams getting this right are the ones treating AI spend the same way mature cloud teams treat compute: tagging from day one, building allocation policies before scale, and making cost visibility a condition of deployment.
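As a rough sketch of what tag-based allocation can look like, assume each usage record carries a cost and an optional set of tags. The record shape, the `team` tag name, and the `UNTAGGED` bucket are assumptions made for this example, not any vendor’s billing export format.

```python
from collections import defaultdict


def allocate_by_tag(records, tag="team"):
    """Sum cost per tag value; spend with no tag lands in an UNTAGGED bucket."""
    totals = defaultdict(float)
    for record in records:
        owner = record.get("tags", {}).get(tag) or "UNTAGGED"
        totals[owner] += record["cost_usd"]
    return dict(totals)


def untagged_share(totals):
    """Fraction of total spend that cannot be attributed to any owner."""
    total = sum(totals.values())
    return totals.get("UNTAGGED", 0.0) / total if total else 0.0


# Hypothetical usage records, for illustration only.
records = [
    {"provider": "provider-a", "cost_usd": 120.0, "tags": {"team": "search"}},
    {"provider": "provider-a", "cost_usd": 80.0, "tags": {"team": "support"}},
    {"provider": "provider-b", "cost_usd": 200.0, "tags": {}},
]

totals = allocate_by_tag(records)
print(totals)
print(f"untagged share: {untagged_share(totals):.0%}")
```

Tracking the untagged share as its own number is a deliberate choice here: it gives a simple measure of how much spend still has no owner, which is the gap tagging from day one is meant to close.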

4. Engineering and Finance Are Still Talking Past Each Other

AI budgets are typically set by engineering or product leadership, based on projected usage. The same teams then blow past them when real usage outruns the projection. Finance finds out later.

The purchasing motion for AI is often decentralized in ways that cloud infrastructure procurement usually isn’t: individual teams spin up API keys, pick model tiers, and build integrations on their own. By the time a centralized view exists, the spend pattern is already established and hard to change.

The organizations managing this well have done one thing differently: they brought finance into the conversation before production deployment, not after. That means getting finance comfortable with token economics, model pricing tiers, and usage-based forecasting, which is a real investment, but one that pays off when scale hits.

5. The Optimization Opportunity Is Real, But Sequenced Wrong at Most Companies

There’s genuine money to be saved on AI spend. Model selection, prompt optimization, caching strategies, and right-sizing inference infrastructure can meaningfully reduce costs without touching product quality. Some teams are reporting 30-40% reductions once they get serious about it.

But most organizations are trying to optimize before they have visibility. They’re skipping straight to “how do we spend less” without answering “where is the money going and why.” That’s the wrong sequence.

Visibility → attribution → optimization. In that order. Every time.

If you don’t know which teams are driving spend, which models are being used for which use cases, and which calls are redundant or cacheable, you’re guessing. And in a cost category that’s growing this fast, guessing is expensive.
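One way to start answering the “redundant or cacheable” question is to measure how many logged calls are exact repeats of an earlier request. The sketch below assumes a call log with `model` and `prompt` fields (a hypothetical shape) and treats two prompts as identical after collapsing whitespace and lowercasing, a normalization that will not suit every use case.

```python
import hashlib
from collections import Counter


def request_key(model: str, prompt: str) -> str:
    """Hash the model name plus a whitespace- and case-normalized prompt."""
    normalized = " ".join(prompt.split()).lower()
    return hashlib.sha256(f"{model}\n{normalized}".encode("utf-8")).hexdigest()


def cacheable_fraction(calls) -> float:
    """Share of calls that repeat an earlier (model, prompt) pair."""
    counts = Counter(request_key(c["model"], c["prompt"]) for c in calls)
    total = sum(counts.values())
    repeats = sum(n - 1 for n in counts.values())
    return repeats / total if total else 0.0


# Hypothetical call log, for illustration only.
calls = [
    {"model": "model-small", "prompt": "Summarize our refund policy"},
    {"model": "model-small", "prompt": "summarize  our refund policy"},
    {"model": "model-small", "prompt": "Draft a welcome email"},
    {"model": "model-large", "prompt": "Summarize our refund policy"},
]

print(f"repeat share: {cacheable_fraction(calls):.0%}")
```

This counts exact repeats only, so it gives a lower bound on what a cache could absorb. Near-duplicate prompts would need a different technique.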

Where to Start

If your organization is somewhere in the “we know AI spend is growing but we don’t have a clear picture” phase, the most useful thing you can do right now is build a simple inventory:

Which AI services are you paying for?
Which teams are using them?
Are costs tagged and attributable at the feature or product level?
Do you have alerting in place for unexpected spikes?

That’s not a sophisticated FinOps program. That’s the baseline. Most teams don’t have it yet, and that’s the gap worth closing before everything else.
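For the last item on that list, spike alerting does not need to be elaborate to be useful. Here is a minimal sketch that flags any day whose spend exceeds a multiple of the trailing average. The 7-day window, the 1.5x threshold, and the sample figures are assumptions you would tune to your own data.

```python
from statistics import mean


def spike_alerts(daily_spend, window=7, threshold=1.5):
    """Return (day_index, spend, baseline) for days above threshold x trailing mean."""
    alerts = []
    for i in range(window, len(daily_spend)):
        baseline = mean(daily_spend[i - window:i])
        if baseline > 0 and daily_spend[i] > threshold * baseline:
            alerts.append((i, daily_spend[i], baseline))
    return alerts


# Hypothetical daily AI spend in USD, for illustration only.
spend = [100, 102, 98, 101, 99, 100, 100, 103, 240, 110]

alerts = spike_alerts(spend)
for day, amount, baseline in alerts:
    print(f"day {day}: ${amount} vs trailing average ${baseline:.2f}")
```

On this sample only day 8 is flagged. Day 9 is not, partly because the spike itself has already raised the trailing average, which is one reason a production alert might prefer a median or a longer window.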

At Usage.ai, we work with teams who are starting to feel this pressure on their cloud and AI costs. If you want to understand your AWS baseline before the AI cost layer gets more complex, our free AWS Savings Calculator is a good place to start.
