GCP June 2026 Updates: CUD Sharing Changes by Default, Cloud Run MCP Reaches GA, and Gemini 3.1 Pro Enters Preview

Vishal Sharma

Senior Content Strategist

Originally Published on July 1, 2026

Updated July 1, 2026

Executive Summary

June 2026 wasn’t packed with major launches, but it delivered one of the year’s biggest Google Cloud FinOps changes alongside important updates across Cloud Run, Vertex AI, Gemini, and Compute Engine. Three things require action:

Google Cloud changed the default scope for resource-based Committed Use Discounts (CUDs) on June 16, 2026. New billing accounts now default to billing-account scope with CUD sharing enabled, while existing accounts without active commitments also switched automatically. Existing accounts with active commitments remain unchanged. This is the biggest Google Cloud commitment management change of the year. Teams that rely on project-scoped resource-based CUDs should audit their current scope before purchasing new commitments.
Cloud Run Remote MCP Server reached general availability, making it easier to build and deploy MCP-compatible AI agents on Google Cloud through a standard interface. GA status removes the preview limitations and makes the server eligible for production use.
Gemini 3.1 Pro entered preview on Vertex AI and Gemini Enterprise, giving teams a more capable reasoning model for complex AI workloads and another option to evaluate before production deployments. Teams running Gemini 2.5 Pro for complex reasoning tasks should benchmark 3.1 Pro before the next commitment renewal cycle.

Let’s have a look at the Google Cloud updates for June 2026.

What the June 16 CUD Scope Default Change Means for Your GCP Commitment Strategy

This is the highest-priority FinOps item of the month. On June 16, 2026, Google Cloud changed the default scope for resource-based Committed Use Discounts. The change was confirmed in the official Compute Engine release notes.

What Changed and Who Is Affected

Account Type	Before June 16	After June 16	Action Required
New billing accounts (created June 16+)	Project scope (default)	Billing account scope (CUD sharing enabled)	None – sharing is automatic
Existing accounts: no active commitments on June 16	Project scope	Billing account scope (auto-switched)	Audit if project isolation was intentional
Existing accounts: active commitments on June 16	Project scope	Unchanged – remains project scope	No change until commitments expire

When CUD sharing is enabled at the billing-account level, a single commitment covers eligible resource usage in any project linked to that billing account. Previously, resource-based CUDs only applied within the project where they were purchased unless sharing was explicitly enabled. The change improves coverage utilization for organizations with workloads spread across multiple projects.
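
The table above reduces to a two-question rule: was the billing account created on or after June 16, and if not, did it hold active commitments on that date? The Python sketch below encodes that rule for illustration only. The names `Scope`, `default_cud_scope`, and the parameters are hypothetical and do not correspond to any Google Cloud API.

```python
from datetime import date
from enum import Enum

# Date of the default-scope change described in this article.
CHANGE_DATE = date(2026, 6, 16)


class Scope(Enum):
    PROJECT = "project"
    BILLING_ACCOUNT = "billing-account"


def default_cud_scope(account_created: date, had_active_commitments: bool) -> Scope:
    """Return the default resource-based CUD scope per the table above.

    had_active_commitments: whether the account held active commitments
    on June 16, 2026 (ignored for accounts created on or after that date).
    """
    if account_created >= CHANGE_DATE:
        # New accounts default to billing-account scope with sharing enabled.
        return Scope.BILLING_ACCOUNT
    if had_active_commitments:
        # Existing accounts with active commitments were left unchanged.
        return Scope.PROJECT
    # Existing accounts without active commitments were auto-switched.
    return Scope.BILLING_ACCOUNT


print(default_cud_scope(date(2025, 1, 10), had_active_commitments=True).value)
```

An existing account with active commitments is the only case that stays at project scope, which is why the audit matters most for accounts that had no commitments on the change date.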

The risk: organizations that intentionally used project-scoped CUDs for chargeback or cost isolation purposes may now have commitments applying across projects they did not intend to subsidize. Audit your commitment scope in the Google Cloud Console under Billing before purchasing new commitments.

For a full breakdown of how GCP resource-based and spend-based CUDs work, and when each type delivers the best discount, see the Usage.ai guide on GCP Committed Use Discounts.

AI Models: Gemini 3.1 Pro in Preview and Claude Opus 4.8 on Agent Platform

Gemini 3.1 Pro: Now in Preview on Vertex AI

Gemini 3.1 Pro launched in preview on Vertex AI and Gemini Enterprise in June 2026. Google described it as a noticeably smarter and more capable baseline for complex problem-solving. It is available in preview via the Gemini API in Google AI Studio, Android Studio, Google Antigravity, and Gemini CLI.

Cost context: Gemini 3.1 Pro is currently in preview. Teams should benchmark its performance and pricing against their existing Vertex AI models before moving production workloads. Preview pricing may differ from GA pricing. Verify at cloud.google.com/vertex-ai/generative-ai/pricing.

Claude Opus 4.8 Now Available on Gemini Enterprise Agent Platform

Anthropic’s Claude Opus 4.8 became available on the Gemini Enterprise Agent Platform in June 2026, per the official Google Cloud blog. It is designed for agentic coding workflows requiring extensive refactoring, dependency tracking, and long-horizon task execution. Teams running complex coding agent pipelines on GCP now have Opus 4.8 as an option alongside the Gemini model family.

Compute and Containers: Cloud Run MCP GA, C4D Hyperdisk, G4 Fractional GPUs, and Flex-Start VMs

Cloud Run Remote MCP Server: Now Generally Available

The Cloud Run remote MCP server reached general availability in June 2026. Agents and AI applications can now deploy with Cloud Run and interact through a standard MCP interface in production. This makes Cloud Run a first-class backend for agentic architectures, enabling teams to build MCP-compatible agent backends without managing custom server infrastructure.

Cloud Run also received three other updates in June: ephemeral disk (volumes that persist only for the duration of an instance) entered Preview; custom CPU and concurrency targets using scaling controls entered Preview; and startup CPU boost and session affinity both reached GA.

Compute Engine: C4D Hyperdisk HA, G4 Fractional GPUs, and AI Zones GA

Three Compute Engine updates reached GA in June:

C4D machine series now supports Hyperdisk Balanced High Availability disks, enabling synchronous dual-zone replication for mission-critical workloads on C4D without instance family constraints.
G4 accelerator-optimized machine series now supports fractional GPUs – teams can attach less than one full GPU to a VM, reducing the cost floor for small inference workloads that do not need a full GPU allocation.
AI Zones reached GA. AI Zones are specialized Compute Engine zones with high-density GPU and TPU availability designed for AI training and inference workloads requiring sustained accelerator access.

Hyperdisk Balanced High Availability: Throughput Doubled to 2,400 MiB/s

Google doubled the maximum throughput for Hyperdisk Balanced High Availability disks from 1,200 MiB/s to 2,400 MiB/s. This benefits high-throughput database workloads running on HA-replicated block storage. The increase is available at the same pricing tier – it is a capacity improvement, not a new pricing tier.
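
To put the new ceiling in perspective, the sketch below computes the theoretical minimum time to stream a dataset at the old and new throughput limits. It assumes the disk sustains its maximum throughput for the whole transfer, which real workloads rarely achieve. The 1 TiB dataset size and the function name are illustrative assumptions, not figures from Google.

```python
MIB_PER_GIB = 1024


def min_transfer_seconds(size_gib: float, throughput_mib_s: float) -> float:
    """Lower bound on transfer time if the disk runs at full throughput."""
    if throughput_mib_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gib * MIB_PER_GIB / throughput_mib_s


# Hypothetical 1 TiB (1,024 GiB) dataset, compared at both limits.
dataset_gib = 1024
for limit_mib_s in (1200, 2400):
    seconds = min_transfer_seconds(dataset_gib, limit_mib_s)
    print(f"{limit_mib_s} MiB/s: about {seconds / 60:.1f} minutes")
```

Doubling the limit halves this lower bound, but the benefit only materializes if the workload was throughput-bound at the old ceiling.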

Flex-Start VMs for Managed Instance Groups: Generally Available

Flex-start VMs in managed instance groups (MIGs) reached GA. Teams can now gradually create Flex-start VMs in a MIG as capacity becomes available, rather than waiting for full capacity before any VMs are created. Flex-start VMs run for up to seven days and provide access to high-demand GPU resources at potentially lower cost than standard on-demand instances, depending on capacity availability. This is relevant for batch AI training jobs that need GPU capacity on a best-effort basis without paying full on-demand or reservation rates.

Security: Several CVEs Patched and Google SecOps Updates

Security Bulletins: GCP-2026-036, GCP-2026-031, GCP-2026-032, GCP-2026-040

Google published four security bulletins during June 2026, covering Compute Engine infrastructure and Cloud Service Mesh:

GCP-2026-036 (CVE-2025-10263): Bypass of translation stages or GPT protections in some Arm core families. Addressed in Compute Engine infrastructure.
GCP-2026-031 and GCP-2026-032: AMD firmware vulnerabilities (CVE-2025-61971, CVE-2025-61972, CVE-2024-36315 and CVE-2025-54518) affecting SEV-SNP guests and Zen 2 microarchitecture processors. Addressed.
GCP-2026-040: Vulnerability affecting Cloud Service Mesh. Patches applied to versions 1.29, 1.28, and 1.27.

These are infrastructure-level patches. No customer action is required unless teams run self-managed Cloud Service Mesh. Teams on managed Cloud Service Mesh are patched automatically.

Google SecOps: Case Search in UDM, IoC Matching, and Apigee Emulator Security Fix

Google SecOps SIEM Search was updated on June 12 to allow security analysts to search cases and case history alongside UDM events, streamlining incident response workflows. A new Non-prioritized IoC Matching detection category was added on June 13 using IoC feeds and threat intelligence – this is a SecOps Enterprise Plus feature. The Apigee Emulator released version 2.0.1 on June 24 as a security hotfix addressing 10 Netty networking library vulnerabilities. Teams running Apigee Emulator 2.0.0 should update to 2.0.1.
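
For teams that track emulator versions in an inventory, the update check can be sketched as a simple version comparison. This is a minimal sketch that assumes plain dotted numeric version strings such as "2.0.0"; the function names are hypothetical, and how the installed version is discovered is out of scope.

```python
# Patched release named in this article.
PATCHED_EMULATOR_VERSION = "2.0.1"


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2.0.1' into (2, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def needs_update(installed: str, patched: str = PATCHED_EMULATOR_VERSION) -> bool:
    """True if the installed version is older than the patched release."""
    return parse_version(installed) < parse_version(patched)


for installed in ("2.0.0", "2.0.1"):
    print(installed, "needs update:", needs_update(installed))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "2.0.10" as older than "2.0.2".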

Figure: GCP June 2026 timeline divided into four sections, showing model launches (Gemini 3.1 Pro, Claude Opus 4.8) in the first two weeks, the June 16 CUD scope default change as the highlighted FinOps anchor event, and compute and serverless updates including Cloud Run MCP GA, G4 fractional GPUs, and AI Zones GA in the second half of the month.

FinOps and Cost Management: What June 2026 Means for GCP Spend

CUD Scope Change: Three Things to Do This Week

The June 16 CUD scope default change calls for three immediate actions from teams that hold or plan to purchase resource-based CUDs:

Check your current scope: In the Google Cloud Console, go to Billing, then Commitments. For each active resource-based commitment, confirm whether it is now at billing-account scope or still at project scope.
Audit cross-project coverage: If your account auto-switched to billing-account scope, verify that commitments are not now covering projects you did not intend to subsidize. This matters for teams that use project-level chargeback or have separate cost centers per project.
Update future purchase strategy: For new commitments, billing-account scope is now the default. If project isolation is intentional for a specific commitment, you must explicitly set project scope at purchase time. The default no longer protects project boundaries.
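
The first two steps can be sketched as a small audit over commitment records. This is a minimal sketch under stated assumptions: the `Commitment` record, its field names, and the sample data are hypothetical and hand-built for illustration, not the schema of any Google Cloud billing export or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commitment:
    name: str     # hypothetical commitment identifier
    project: str  # project where the commitment was purchased
    scope: str    # "project" or "billing-account"


def flag_unintended_sharing(commitments, isolated_projects):
    """Return commitments bought in projects meant to stay isolated
    that are now shared at billing-account scope."""
    return [
        c for c in commitments
        if c.scope == "billing-account" and c.project in isolated_projects
    ]


# Hand-built sample records; replace with data gathered from the Console.
commitments = [
    Commitment("cud-analytics", "analytics-prod", "billing-account"),
    Commitment("cud-web", "web-prod", "project"),
    Commitment("cud-shared", "platform", "billing-account"),
]
isolated = {"analytics-prod", "web-prod"}  # projects with their own chargeback

for c in flag_unintended_sharing(commitments, isolated):
    print(f"Review {c.name}: purchased in {c.project}, now shared account-wide")
```

Keeping the list of intentionally isolated projects explicit also documents the chargeback policy that the old default used to enforce implicitly.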

For teams building GCP commitment strategy and evaluating how resource-based versus spend-based CUDs interact with the new billing-account scope default, the Usage.ai guide on GCP Cost Optimization covers the full framework.

G4 Fractional GPUs: Lower Cost Floor for Small Inference Workloads

The GA of fractional GPU support on G4 instances reduces the minimum cost of running small inference workloads on accelerator-optimized VMs. Previously, attaching a GPU required a full GPU allocation, which overprovisioned compute for workloads that only needed a fraction of GPU capacity. With fractional GPUs, teams can right-size GPU allocations to match actual inference demand, reducing per-inference cost for low-throughput workloads. Check current G4 fractional GPU pricing at cloud.google.com/compute/vm-instance-pricing.
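
The right-sizing argument is simple arithmetic, sketched below. The hourly prices and request volume are placeholders invented for illustration, not G4 pricing, and the sketch assumes the fractional allocation can still serve the full request volume. Substitute figures from the pricing page before drawing conclusions.

```python
def cost_per_1k_inferences(hourly_price_usd: float, inferences_per_hour: float) -> float:
    """Cost in USD of serving 1,000 inferences at a steady hourly volume."""
    if inferences_per_hour <= 0:
        raise ValueError("inferences_per_hour must be positive")
    return hourly_price_usd * 1000 / inferences_per_hour


# Placeholder values, NOT real G4 prices.
FULL_GPU_HOURLY_USD = 2.00
HALF_GPU_HOURLY_USD = 1.00
DEMAND_PER_HOUR = 10_000  # assumed to fit within the fractional allocation

full = cost_per_1k_inferences(FULL_GPU_HOURLY_USD, DEMAND_PER_HOUR)
half = cost_per_1k_inferences(HALF_GPU_HOURLY_USD, DEMAND_PER_HOUR)
print(f"Full GPU: ${full:.2f} per 1k inferences")
print(f"Half GPU: ${half:.2f} per 1k inferences")
```

The saving comes entirely from not paying for idle capacity, so it disappears once demand grows past what the fractional allocation can serve.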

Flex-Start VMs: On-Demand GPU Access Without Reservation

Flex-start VMs at GA provide access to high-demand GPU capacity at a discounted price without requiring a reservation. For teams running occasional or batch AI training jobs, flex-start VMs offer a cost-effective middle ground between on-demand (full price, guaranteed) and Spot VMs (deepest discount, preemptible). They run for up to seven days and provision capacity on a best-effort basis. The GA status means they are now suitable for production batch workloads where some scheduling flexibility is acceptable.

For teams evaluating the full spectrum of GCP compute pricing options for AI workloads including Spot, Flex-start, resource-based CUDs, and spend-based CUDs, the Usage.ai guide on Google Cloud Compute Pricing covers the decision framework.
