
Azure in May 2026: The Reserved Instance Retirement Clock Is Ticking and Agents Get a New AI Stack

Navanita Devi

Head of Marketing

Originally Published on June 1, 2026

Updated June 2, 2026


Executive Summary

May 2026 was dominated by one high-stakes FinOps announcement and a steady stream of developer and AI platform updates. Three things need attention now:

Azure Reserved VM Instances for at least 14 legacy VM series will no longer be available for new purchase or renewal starting July 1, 2026. Teams with expiring RIs on Dv2, DSv2, Dv3, Dsv3, Ev3, Esv3, A-series, and related families face a hard deadline: expired reservations revert to pay-as-you-go with no auto-migration. For a mid-size deployment, the cost impact can reach an estimated $84,000 per year.
Azure Functions Durable Task Scheduler Consumption SKU reached general availability with pay-per-use pricing for durable workflows and AI agent orchestrations. No idle capacity costs. This changes the cost model for teams running long-running agent pipelines on Azure Functions.
Microsoft Foundry expanded its model catalog with DeepSeek V4 Flash and DeepSeek V4 Pro, and added Vercel AI SDK support for TypeScript. Both reduce the cost of building AI applications on Azure for teams already using these toolchains.

For FinOps teams, the July 1 Reserved Instance retirement is the most urgent item. Auto-renew will not take effect for reservations on affected series, so the reservation discount ends at expiry even where auto-renew is enabled. Action is required before July 1.

All pricing figures in this document require verification at azure.microsoft.com/pricing before acting on them.

Azure Reserved VM Instances Retirement: July 1, 2026 Is a Hard Deadline

This is the highest-priority announcement from May 2026 for any team running Azure VMs on commitment-based pricing. Microsoft confirmed on May 4, 2026 that Azure Reserved VM Instances for select legacy VM series will no longer be available for new purchase or renewal starting July 1, 2026.

Which VM Series Are Affected?

Microsoft has not published a single consolidated list in one blog post, but the retirement affects at least 14 older VM families and possibly as many as 18. Based on Microsoft Learn documentation and the Azure Retail Prices API, the following series are confirmed or strongly indicated as affected:

VM Series	Generation	Affected RI Terms	Recommended Migration Target
A-series (Standard_A0-A7)	Legacy	1-year and 3-year	Bsv2-series or Dv5-series
Basic A-series (Basic_A0-A4)	Legacy	1-year and 3-year	Bsv2-series
Dv2 and DSv2	Gen 2 Intel Haswell	1-year and 3-year	Dv5 or Dsv5-series
Dv3 and Dsv3	Gen 3 Intel Broadwell	1-year	Dv5 or Dsv5-series
Ev3 and Esv3	Gen 3 Intel Broadwell	1-year	Ev5 or Esv5-series
HBv2	AMD EPYC 7002	1-year and 3-year (ended April 2, 2026)	HBv3, HBv4, HBv5, or HX-series
NP-series	Xilinx FPGAs	1-year and 3-year (ended April 2, 2026)	GPU series RIs or Azure Savings Plan

Verify the complete list at learn.microsoft.com/azure/cost-management-billing/reservations/manage-legacy-vm-reservations-after-july-1-2026 – the list is subject to change.

What Happens If You Do Nothing Before July 1?

If a reservation on an affected series expires on or after July 1, 2026, it will not renew. Even if auto-renew is enabled in Cost Management, the reservation simply ceases to exist and the underlying VMs revert immediately to pay-as-you-go rates. There is no grace period.

The cost impact is material. Reserved Instances can discount VM compute by up to 72% relative to pay-as-you-go, which means the pay-as-you-go rate can be more than three times the reserved rate. A team running 20 D8s v3 VMs with active RIs faces an estimated $84,000 annual cost increase once those RIs expire without renewal. For Dv3 or Ev3 families on 3-year terms, the exposure is larger. Automated renewal scripts that target affected series by API will also break after July 1, as the reservation purchase API will reject purchases for retired series.
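The $84,000 estimate above can be sanity-checked with simple arithmetic. The Python sketch below assumes round-the-clock usage (8,760 hours per year). The function names are hypothetical, and the hourly rates in the last call are placeholders rather than published Azure prices.

```python
HOURS_PER_YEAR = 8760  # assumption: every VM runs 24x7 for the whole year


def annual_exposure(vm_count, payg_hourly, reserved_hourly, hours=HOURS_PER_YEAR):
    """Extra yearly spend when reserved VMs fall back to pay-as-you-go."""
    return vm_count * (payg_hourly - reserved_hourly) * hours


def implied_hourly_gap(annual_increase, vm_count, hours=HOURS_PER_YEAR):
    """Per-VM hourly rate gap implied by a quoted annual increase."""
    return annual_increase / vm_count / hours


# The article's example: 20 VMs and an $84,000 annual increase.
per_vm_per_year = 84_000 / 20
gap = implied_hourly_gap(84_000, 20)
print(f"${per_vm_per_year:,.0f} per VM per year, about ${gap:.2f} per VM-hour")

# Placeholder rates (NOT real prices): substitute figures from the pricing page.
example = annual_exposure(vm_count=20, payg_hourly=1.00, reserved_hourly=0.50)
print(f"Exposure at placeholder rates: ${example:,.0f} per year")
```

Running the same function against your own reservation inventory, with verified rates for your region and operating system, gives a figure you can take to a budget review.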

Three Options for Teams With Affected Reservations

Option 1 – Trade in for Azure Savings Plan for Compute: Trade existing RIs on affected series for Azure Savings Plan for Compute before expiry. Savings Plans provide flexibility across VM families and regions without tying to a specific instance type. Savings Plans save 17-65% versus pay-as-you-go depending on term and upfront payment (verify at azure.microsoft.com/pricing/offers/reservations – rates change). This is Microsoft’s primary recommendation for teams that expect workload evolution.
Option 2 – Migrate workloads to newer VM series and purchase new RIs: Migrate Dv3 to Dv5, Ev3 to Ev5, and Dv2 to Dv5 before July 1, then purchase new RIs on the supported series. Newer series provide better price-performance, and RIs remain available for current-generation families. Test performance on the new series before migrating production workloads.
Option 3 – Do nothing and accept pay-as-you-go: Only viable if the workload is being decommissioned before the RI expires or shortly after. For any long-running production workload, this option results in the largest cost increase.

Microsoft recommends the trade-in path for most customers. The trade-in returns prorated credit toward an Azure Savings Plan for Compute. Verify trade-in mechanics at learn.microsoft.com/azure/cost-management-billing/reservations/exchange-and-refund-azure-reservations – terms change.
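To see what the 17-65% Savings Plan range quoted above means in dollars, the following Python sketch applies both ends of the band to a pay-as-you-go bill. The $10,000 monthly figure is a placeholder and the function name is hypothetical; actual discounts depend on term, payment option, and VM family.

```python
def savings_plan_band(payg_monthly, low=0.17, high=0.65):
    """Return (worst_case, best_case) monthly cost under the quoted discount band."""
    if not 0 <= low <= high < 1:
        raise ValueError("discounts must satisfy 0 <= low <= high < 1")
    worst_case = payg_monthly * (1 - low)   # smallest discount in the band
    best_case = payg_monthly * (1 - high)   # largest discount in the band
    return worst_case, best_case


# Placeholder bill, not a real quote.
worst, best = savings_plan_band(10_000)
print(f"Monthly cost between ${best:,.0f} and ${worst:,.0f}")
```

The width of that band is the reason to confirm the exact rate for your term and family before trading in.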

For a breakdown of how Azure Savings Plans and Reserved VM Instances compare in savings rate, flexibility, and risk profile, the Usage.ai guide on commitment-based discounts across Azure covers each mechanism and when to use which.

AI and Developer Tools: Azure Functions GA, Microsoft Foundry Expands, and DeepSeek V4 Arrives

Azure Functions Durable Task Scheduler Consumption SKU: Generally Available

Azure Functions Durable Task Scheduler Consumption SKU reached general availability in early May 2026. The Consumption SKU uses pay-per-use pricing for durable workflows and AI agent orchestrations, with no idle capacity costs.

Previous Durable Task Scheduler tiers required pre-provisioned capacity, meaning teams paid for idle capacity between workflow executions. The Consumption SKU eliminates this: billing accrues only when tasks are actively executing. This is directly relevant for AI agent pipelines with bursty execution patterns – workflows that run intensively for short windows and idle for extended periods.

Cost consideration: the Consumption SKU trades predictable flat-rate cost for variable per-execution billing. For workflows with very high and consistent throughput, a provisioned tier may still be cheaper. For low-to-medium frequency agentic workflows, the Consumption SKU should reduce costs materially. Model your execution frequency and duration before choosing the tier. Verify pricing at azure.microsoft.com/pricing/details/functions – rates change.
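The tier decision above reduces to a break-even calculation. The Python sketch below assumes a simplified pricing shape: a flat monthly fee for provisioned capacity versus a single charge per million executions on Consumption. Real Durable Task Scheduler billing may use different meters, the function names are hypothetical, and all prices shown are placeholders.

```python
def consumption_cost(executions, price_per_million):
    """Monthly Consumption cost under the simplified per-execution model."""
    return executions / 1_000_000 * price_per_million


def break_even_executions(provisioned_monthly, price_per_million):
    """Monthly executions at which both tiers cost the same."""
    if price_per_million <= 0:
        raise ValueError("price_per_million must be positive")
    return provisioned_monthly / price_per_million * 1_000_000


def cheaper_tier(executions, provisioned_monthly, price_per_million):
    """Name the cheaper tier for a given monthly execution count."""
    if consumption_cost(executions, price_per_million) < provisioned_monthly:
        return "consumption"
    return "provisioned"


# Placeholder prices, not published rates.
threshold = break_even_executions(provisioned_monthly=500.0, price_per_million=100.0)
print(f"Break-even at {threshold:,.0f} executions per month")
print(cheaper_tier(1_000_000, 500.0, 100.0))
```

Feeding in execution counts pulled from Azure Monitor, rather than estimates, makes the comparison reflect your actual burst pattern.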

DeepSeek V4 Flash and DeepSeek V4 Pro Now Available in Microsoft Foundry

DeepSeek V4 Flash and DeepSeek V4 Pro entered the Microsoft Foundry model catalog in early May 2026. Both models are available through the standard Microsoft Foundry API.

DeepSeek V4 Flash – optimized for speed and cost efficiency on high-volume inference tasks. Targets classification, extraction, and lightweight reasoning workloads where response latency and per-token cost are the primary optimization dimensions.
DeepSeek V4 Pro – the higher-capability variant for complex reasoning and multi-step tasks. Positioned as a cost-competitive alternative to OpenAI GPT-4o class models for use cases that demand high output quality but remain price-sensitive.

Cost implication: DeepSeek models typically carry lower per-token pricing than equivalent-capability OpenAI models. Teams evaluating model costs for production workloads should benchmark DeepSeek V4 Flash against GPT-4o mini for routine tasks before defaulting to higher-priced models. Verify current pricing at azure.microsoft.com/pricing/details/cognitive-services/openai-service – rates change.
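A per-token cost comparison like the one suggested above can be sketched in a few lines of Python. The model names and prices below are made up for illustration and are not Foundry rates; the class and function names are also hypothetical. Replace the numbers with verified pricing and your own token profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    input_per_million: float   # USD per 1M input tokens (placeholder)
    output_per_million: float  # USD per 1M output tokens (placeholder)


def monthly_cost(price, requests, input_tokens, output_tokens):
    """Estimated monthly spend for a fixed per-request token profile."""
    per_request = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    ) / 1_000_000
    return requests * per_request


# Hypothetical candidates with made-up prices.
candidates = [
    ModelPrice("model-a", input_per_million=1.00, output_per_million=4.00),
    ModelPrice("model-b", input_per_million=0.25, output_per_million=1.00),
]
for p in candidates:
    cost = monthly_cost(p, requests=1_000_000, input_tokens=800, output_tokens=200)
    print(f"{p.name}: ${cost:,.2f} per month")
```

The cost figure is only half of the comparison; it belongs next to a quality score measured on your own task distribution.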

Microsoft Foundry: Vercel AI SDK Support for TypeScript

Microsoft Foundry added native support for the Vercel AI SDK in TypeScript on May 29, 2026. TypeScript developers can now route AI calls through Foundry services using the same Vercel AI SDK patterns already established in their web application stacks. This removes the need for custom middleware to bridge Vercel-hosted workflows to Azure AI infrastructure.

The integration is relevant for teams building AI features into Next.js or other Vercel-hosted applications that need the enterprise-grade security, governance, and cost controls available through Microsoft Foundry. Verify at ai.azure.com – feature availability may vary by region.

Microsoft Foundry: Bring Your Own AI Gateway – Generally Available

Bring Your Own AI Gateway in Foundry Agent Service reached general availability on April 30, 2026, so its billing implications apply throughout May. Teams can now route Foundry agent traffic through a custom AI gateway – whether Azure API Management, a third-party gateway, or a self-hosted solution – while maintaining Foundry orchestration and governance.

Cost implication: routing through Azure API Management adds APIM transaction costs on top of Foundry inference costs. If your gateway is self-hosted, the compute cost of the gateway itself should be modeled against the Foundry native routing cost. For most enterprise teams, using Azure APIM as the AI gateway consolidates auth, rate limiting, and cost attribution under one managed service. Verify APIM pricing at azure.microsoft.com/pricing/details/api-management – rates change.

Compute and Networking: NSG and UDR Scale Limits Raised, AKS Improvements, VPN Updates

NSG and UDR Scale Limits Increased

Microsoft raised NSG (Network Security Group) and UDR (User Defined Route) scale limits in May 2026. Teams running large-scale deployments that previously hit NSG rule count or UDR table size ceilings can now operate within higher bounds without architecture workarounds.

Additionally, Network Watcher gained a rule impact analyzer, allowing teams to assess the effect of NSG rule changes before applying them in production. This reduces the risk of misconfigured rules causing unintended traffic drops or security gaps during maintenance windows.

Azure Front Door: WebSocket Support Generally Available

Azure Front Door now supports WebSocket connections, reaching general availability in May 2026. Teams running real-time applications – collaborative tools, live dashboards, chat interfaces, and AI agent streaming responses – can now route WebSocket traffic through Azure Front Door’s global load balancing, DDoS protection, and WAF capabilities without deploying a separate ingress layer for WebSocket-specific traffic.

Cost consideration: WebSocket connections billed through Front Door accrue connection-time charges in addition to data transfer charges. For applications with many long-lived WebSocket connections, model the connection-time cost against the alternative of self-managing WebSocket ingress on Application Gateway or AKS. Verify Front Door WebSocket pricing at azure.microsoft.com/pricing/details/frontdoor – rates change.

AKS: Application Insights Auto-Instrumentation and Backup via CLI

Two AKS updates reached general availability in May 2026:

Application Insights auto-instrumentation for AKS – automatically configures tracing, metrics, and log collection for containerized workloads without requiring manual agent deployment or SDK integration. Teams gain observability on AKS without the per-pod sidecar overhead of the previous manual instrumentation approach.
AKS backup configuration via a single Azure CLI command – simplifies the previously multi-step backup setup process. Teams can now configure AKS backup in one CLI call, reducing the operational complexity of protecting Kubernetes workloads in production.

The Application Insights auto-instrumentation update is relevant to cost: removing manual sidecar instrumentation reduces per-pod compute overhead. For clusters with hundreds of pods, this can meaningfully reduce total node utilization. Verify Application Insights pricing at azure.microsoft.com/pricing/details/monitor – rates change.

VPN Gateway: S2S Certificate Authentication and P2S User Group IP Pools

Azure VPN Gateway received two updates in May 2026:

Site-to-Site (S2S) certificate authentication – teams can now use certificate-based authentication for S2S VPN tunnels, removing the reliance on pre-shared keys for environments with strict security requirements.
Point-to-Site (P2S) user group-specific IP pools – allows administrators to assign distinct IP address ranges to different user groups on P2S VPN connections, improving traffic segmentation and enabling policy-based access control per user group.

Both updates are security enhancements with no direct cost change. VPN Gateway billing continues at the same SKU-based hourly rate plus data transfer charges. Verify at azure.microsoft.com/pricing/details/vpn-gateway – rates change.

Storage and Data: Blob Storage SDK for Rust, Storage Mover Updates, Azure Files, and Cosmos DB

Azure Blob Storage SDK for Rust Now Available

Microsoft released an official Azure Blob Storage SDK for Rust in May 2026. Teams building high-performance, low-overhead storage clients in Rust can now use first-party Azure SDK tooling rather than relying on community-maintained crates or REST API wrappers. Relevant for systems engineering teams building data pipeline infrastructure, custom ingestion tooling, or edge compute components in Rust.

Azure Storage Mover: Blob-to-Blob Transfers and Scheduling Enhancements

Azure Storage Mover received two updates in May 2026:

Blob-to-blob transfer support – enables managed migration between Azure Blob Storage accounts, including cross-region and cross-subscription transfers with integrated monitoring and incremental sync.
Scheduling enhancements – allows data migration jobs to be scheduled for off-peak windows, reducing the risk of storage migration traffic competing with production workloads.

Cost implication: cross-region blob transfers incur data transfer egress charges. Use Storage Mover’s scheduling feature to run migrations during off-peak hours and monitor transfer volume against your egress budget before triggering large migrations. Verify Azure data transfer pricing at azure.microsoft.com/pricing/details/bandwidth – rates change.

Azure Files: Entra-Only Identity Support

Azure Files now supports Entra-only identity authentication, removing the dependency on Active Directory domain services for Azure Files access control in environments that have fully migrated to Entra ID. This simplifies identity management for teams running cloud-native environments without on-premises AD infrastructure, and reduces the operational overhead of maintaining AD domain join requirements for Azure Files shares.

Azure NetApp Files: Cache Volumes and Object REST API

Azure NetApp Files added two capabilities in May 2026:

Cache volumes – provides a local caching layer for frequently accessed data, reducing read latency and lowering the cost of repeated reads from high-latency storage tiers.
Object REST API – enables object-storage-style access to NetApp Files volumes through a REST interface, allowing applications designed for S3-compatible storage to access ANF volumes without code changes.

Cache volumes are a cost-reduction feature for read-heavy ANF workloads. Frequently accessed data served from cache avoids repeated reads from the backing NetApp volume, which reduces both latency and potential capacity charges for tiered ANF configurations. Verify ANF pricing at azure.microsoft.com/pricing/details/netapp – rates change.

Cosmos DB: LangChain and LangGraph Integration Expands

Azure Cosmos DB expanded its integration support for LangChain and LangGraph in May 2026, simplifying the pattern of using Cosmos DB as a vector store and memory backend for LLM-powered applications. The integration enables teams to use Cosmos DB as both the operational database and the AI context store in a single service, without requiring a separate vector database.

Cost implication: using Cosmos DB as a vector store adds RU consumption from vector indexing and similarity search operations on top of standard document read and write costs. Run a load test that models your expected vector query pattern before committing to Cosmos DB provisioned throughput for an AI workload. Verify Cosmos DB pricing at azure.microsoft.com/pricing/details/cosmos-db – rates change.
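The load test recommended above produces per-operation request charges, which can then be turned into a provisioned throughput estimate. The Python sketch below shows that arithmetic. The function name, the traffic figures, the per-operation RU charges, and the 20% headroom are all assumptions for illustration; use the request charges your own test reports.

```python
def required_rus(reads_per_s, writes_per_s, vector_queries_per_s,
                 ru_per_read, ru_per_write, ru_per_vector_query,
                 headroom=0.2):
    """Peak RU/s to provision, with a safety margin on top of measured load."""
    base = (
        reads_per_s * ru_per_read
        + writes_per_s * ru_per_write
        + vector_queries_per_s * ru_per_vector_query
    )
    return base * (1 + headroom)


# Placeholder traffic and RU charges, not measured values.
estimate = required_rus(
    reads_per_s=200, writes_per_s=50, vector_queries_per_s=20,
    ru_per_read=1, ru_per_write=10, ru_per_vector_query=50,
)
print(f"Provision roughly {estimate:,.0f} RU/s")
```

In this placeholder profile the vector queries are the smallest share of traffic but the largest share of RU consumption, which is the pattern worth checking for in a real workload.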

Security and Compliance: Secure Boot Certificate Expiry, Event Grid Updates, and France e-Invoicing

Secure Boot Certificates Expiring June and October 2026: Action Required for Managed Environments

Secure Boot certificates issued in 2011 expire in June and October 2026. Microsoft announced on May 19, 2026 that updates are being rolled out. For most Azure VMs running in-support Windows versions, certificate updates arrive automatically. For organizations with managed updates, certificates must be deployed manually using Microsoft Intune (recommended), registry keys, Windows Configuration System, or Group Policy.

For Azure-hosted Windows VMs: check whether your VMs receive automatic updates or fall under a managed update policy. Managed environments that have not deployed the updated certificates before the June 2026 expiry date will experience Secure Boot failures on affected devices. Verify your device status using the Microsoft Intune monitoring guide at support.microsoft.com/topic/monitoring-secure-boot-certificate-status – timelines subject to change.

Event Grid: Subscription Identifiers and Reliability Updates

Azure Event Grid received subscription identifier improvements and reliability updates in May 2026. Subscription identifiers allow event subscriptions to be tagged and tracked more precisely, enabling better cost allocation for high-volume eventing workloads across teams or applications. The reliability updates address edge cases in event delivery retry behavior. Verify Event Grid pricing at azure.microsoft.com/pricing/details/event-grid – rates change.

France e-Invoicing: Tax ID Submission Required Before September 1, 2026

Microsoft partners and customers transacting with Microsoft France must submit Tax ID details before September 1, 2026, when France e-Invoicing requirements take effect. Required details include a valid French VAT ID, SIREN (9 digits), and SIRET (14 digits). Missing or invalid Tax ID details may result in invoices being rejected by French government systems, blocking e-Invoice issuance. This is a compliance requirement, not a pricing change. Verify submission requirements at learn.microsoft.com/partner-center/announcements/2026-may.
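Before submitting, it can help to run a basic format check on the identifiers. The Python sketch below checks the digit counts stated above and additionally applies the Luhn checksum that SIREN and SIRET numbers normally satisfy, plus the usual derivation of the French VAT key from the SIREN. The checksum and VAT rules are not from Microsoft's announcement, some entities are exceptions to them, and the function names are hypothetical, so treat a failure as a prompt to double-check rather than a definitive rejection.

```python
import re


def luhn_ok(digits):
    """Luhn checksum: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def valid_siren(value):
    """SIREN: exactly 9 digits that pass the Luhn check."""
    return bool(re.fullmatch(r"[0-9]{9}", value)) and luhn_ok(value)


def valid_siret(value):
    """SIRET: exactly 14 digits, Luhn-valid, starting with a valid SIREN."""
    return (
        bool(re.fullmatch(r"[0-9]{14}", value))
        and luhn_ok(value)
        and valid_siren(value[:9])
    )


def expected_vat_id(siren):
    """French VAT ID in its common numeric form: FR + 2-digit key + SIREN."""
    key = (12 + 3 * (int(siren) % 97)) % 97
    return f"FR{key:02d}{siren}"
```

A check like this catches transposed or missing digits early, before an invoice is rejected downstream.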

FinOps and Cost Management: What May 2026 Means for Azure Spend

The July 1 RI Retirement Is the Most Urgent Azure Cost Action of the Year

No other Azure change in 2026 has a harder deadline with a higher per-account cost impact than the July 1 Reserved VM Instance retirement. The five-step action plan for FinOps teams:

Step 1 – Audit: In the Azure portal, go to Reservations and filter for Virtual Machines. Identify all RIs associated with the affected VM series (A-series, Basic A-series, Dv2, DSv2, Dv3, Dsv3, Ev3, Esv3, HBv2, NP-series, and related families).
Step 2 – Expiry dates: For each affected RI, note the expiry date. Any reservation expiring on or after July 1, 2026 will not renew. The nearer the expiry date to July 1, the more urgent the action.
Step 3 – Choose a path: Trade in for Azure Savings Plan for Compute (Microsoft primary recommendation for flexibility) or migrate workloads to Dv5, Ev5, or equivalent current-generation series and purchase new RIs.
Step 4 – Model the cost: Use the Azure Pricing Calculator to compare your current RI rate against the Savings Plan rate or the new RI rate on the target series. Confirm the expected savings coverage before transacting.
Step 5 – Execute by June 30: Do not wait until July 1, when the portal begins blocking purchases on affected series. Complete trade-ins or new purchases by the end of June. Verify at azure.microsoft.com/pricing/offers/reservations – rates change.
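Steps 1 and 2 above can be partly automated once the reservation list is exported from the portal. The Python sketch below flags reservations whose SKU belongs to an affected series and whose expiry falls on or after July 1, 2026. The record layout, the function names, and the SKU-name patterns are assumptions for illustration; the patterns cover only some of the listed families and should be checked against Microsoft's published list.

```python
import re
from datetime import date

CUTOFF = date(2026, 7, 1)

# Illustrative patterns only: SKU-name shapes are assumptions, not an official list.
AFFECTED_PATTERNS = {
    "A-series": re.compile(r"^Standard_A[0-7]$"),
    "Basic A-series": re.compile(r"^Basic_A[0-4]$"),
    "Dv2": re.compile(r"^Standard_D\d+_v2$"),
    "DSv2": re.compile(r"^Standard_DS\d+_v2$"),
    "Dv3": re.compile(r"^Standard_D\d+_v3$"),
    "Dsv3": re.compile(r"^Standard_D\d+s_v3$"),
    "Ev3": re.compile(r"^Standard_E\d+_v3$"),
    "Esv3": re.compile(r"^Standard_E\d+s_v3$"),
}


def affected_family(sku):
    """Return the affected family name for a SKU, or None if it is not matched."""
    for family, pattern in AFFECTED_PATTERNS.items():
        if pattern.match(sku):
            return family
    return None


def at_risk(reservations):
    """Reservations on affected series expiring on or after the cutoff, soonest first."""
    flagged = []
    for r in reservations:
        family = affected_family(r["sku"])
        if family and r["expiry"] >= CUTOFF:
            flagged.append({**r, "family": family})
    return sorted(flagged, key=lambda r: r["expiry"])


# Hypothetical inventory rows standing in for a portal export.
inventory = [
    {"name": "ri-web", "sku": "Standard_D8s_v3", "expiry": date(2026, 8, 15)},
    {"name": "ri-api", "sku": "Standard_D8s_v5", "expiry": date(2026, 8, 15)},
]
for r in at_risk(inventory):
    print(r["name"], r["family"], r["expiry"].isoformat())
```

Sorting by expiry puts the reservations that need a decision first at the top of the list.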

For teams evaluating whether to trade into Azure Savings Plans or purchase new RIs on current-generation series, the Usage.ai guide on Azure Cost Optimization covers the full decision framework including Azure Hybrid Benefit stacking, Reserved VM Instances, and Savings Plans comparison.

Durable Task Scheduler Consumption SKU: Update Your Azure Functions Cost Model

The Durable Task Scheduler Consumption SKU GA changes the cost model for any team running Azure Functions-based durable workflows or AI agent orchestrations. If your current cost model assumed provisioned Durable Task Scheduler pricing, the Consumption SKU introduces variable billing that scales down during low-activity periods. Pull your current Durable Task Scheduler execution metrics from Azure Monitor, model the Consumption SKU cost against your actual execution patterns, and update your monthly forecast accordingly. Verify pricing at azure.microsoft.com/pricing/details/functions – rates change.

DeepSeek V4 in Foundry: Model Cost Benchmarking Before Production

DeepSeek V4 Flash is positioned as a cost-efficient alternative to mid-tier OpenAI models for high-volume inference. Before switching production workloads, benchmark output quality on your actual task distribution. DeepSeek models may deliver equivalent quality at lower cost for structured extraction, classification, and summarization tasks, while more complex reasoning tasks may require V4 Pro or a higher-priced model. Never switch models in production based on pricing alone. Verify current Foundry model pricing at azure.microsoft.com/pricing/details/cognitive-services/openai-service – rates change.

For teams running Azure database services alongside AI workloads and looking to optimize commitment coverage, the Usage.ai guide on Azure Database Savings Plans covers which database services qualify for commitment discounts and when Savings Plans outperform Reserved Instances.
