How is Google Cloud Compute pricing calculated?

Google Cloud Compute pricing is based on machine type, vCPU and memory configuration, region, operating system, runtime duration, and selected discount model. Instances are billed per second, with costs varying by location and configuration. Additional savings may apply through Sustained Use Discounts or Committed Use Discounts.

What are Committed Use Discounts (CUDs) in Google Cloud?

Committed Use Discounts (CUDs) allow customers to commit to a specific amount of vCPU and memory usage for one or three years in exchange for discounted rates. If actual usage falls below the committed level, customers still pay for the full commitment.

How much can you save with Google Cloud Committed Use Discounts?

Google Cloud Committed Use Discounts can reduce Compute Engine costs by up to approximately 57% compared to on-demand pricing, depending on machine type and commitment term. Actual savings depend on commitment coverage and utilization rate.

What is commitment coverage in Google Cloud Compute pricing?

Commitment coverage is the percentage of your baseline compute usage protected by Committed Use Discounts. Higher coverage can increase savings but also increases risk if usage declines below the committed level.

What happens if you underutilize a Google Cloud commitment?

If actual compute usage falls below the committed level, you still pay for the full Committed Use Discount amount. Underutilization increases your effective blended cost per vCPU and reduces realized savings.

Is Google Cloud cheaper than AWS for compute?

Google Cloud and AWS offer competitive compute pricing with similar per-second billing and long-term commitment discounts. Actual cost differences depend on machine type, region, commitment structure, utilization rate, and coverage strategy rather than list price alone.

Does Google Cloud offer a free trial for Compute Engine?

Yes. Google Cloud offers a free trial that includes $300 in credits for new customers, which can be used to test Compute Engine and other services. This allows teams to explore Google Cloud Compute pricing before committing to production workloads.

Google Cloud Compute Engine Pricing Guide: Costs, CUDs & Optimization Strategy

Debesh Singh

Engineering and Chief of Staff

Originally Published on April 15, 2026

Updated April 29, 2026

25 min read

Google Cloud Compute pricing determines how much you pay to run virtual machines in Google Cloud. At its core, pricing is based on machine type, region, operating system, and how long your workloads run. According to Google Cloud’s official pricing documentation, Compute Engine instances are billed per second, with rates varying by configuration and location.

On the surface, Google Cloud Compute pricing appears straightforward. You choose a machine, deploy it, and pay for what you use. But as usage scales, pricing becomes more nuanced. Discounts are automatically applied for sustained usage, and deeper savings are available through Committed Use Discounts (CUDs), which require a one- or three-year commitment.

This is where complexity begins. While committed pricing can significantly reduce Google Cloud compute costs compared to on-demand rates, it also introduces financial exposure if your usage drops below the committed level. Many teams understand the pricing models individually but struggle to determine the right commitment strategy for their workloads.

In this guide, we’ll break down how Google Cloud Compute pricing works, what factors influence your bill, and how to approach commitment decisions in a way that balances savings with flexibility.

How Does Google Cloud Compute Engine Pricing Work?

Compute Engine charges per second with a one-minute minimum on most instance types. You pay for provisioned capacity, not utilized capacity, which means a VM running at 5% CPU costs the same as one at 95%. This makes rightsizing the first lever to pull before any discount program enters the picture.
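As a rough illustration of per-second billing with a one-minute minimum, the sketch below computes the charge for a single VM run. The function name and the 60-second default are illustrative assumptions, and the hourly rate is the n2-standard-4 on-demand figure quoted later in this guide.

```python
def billed_cost(runtime_seconds: float, hourly_rate: float,
                minimum_seconds: int = 60) -> float:
    """Charge for one VM run: billed per second, subject to a minimum.

    CPU utilization does not appear anywhere in the formula, because
    billing is for provisioned capacity, not utilized capacity.
    """
    billable_seconds = max(runtime_seconds, minimum_seconds)
    return billable_seconds * hourly_rate / 3600


# A 20-second run is billed as a full minute; a 90-second run as 90 seconds.
N2_STANDARD_4_ON_DEMAND = 0.1942  # $/hr, us-central1, from this guide
print(f"20 s run: ${billed_cost(20, N2_STANDARD_4_ON_DEMAND):.6f}")
print(f"90 s run: ${billed_cost(90, N2_STANDARD_4_ON_DEMAND):.6f}")
print(f"1 h run:  ${billed_cost(3600, N2_STANDARD_4_ON_DEMAND):.4f}")
```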

The five billing variables:

vCPU and Memory are charged separately at per-resource rates that vary by machine family. Custom machine types allow you to adjust the vCPU-to-memory ratio, but carry a 5% premium over standard commitment prices when run under a resource-based CUD.

Region is the second variable. Prices in us-central1 (Iowa) are consistently among the lowest. europe-west4 (Netherlands) runs approximately 10–15% higher, and asia-northeast1 (Tokyo) is higher still. Region selection is a real cost variable that compounds at scale.

OS licensing is zero cost for Linux. Windows Server, RHEL, and SUSE add per-vCPU/hour premiums that can exceed underlying compute cost for smaller instance types. For Windows fleets, always calculate total cost including OS licensing before comparing machine types.

Attached resources. Persistent disks, Hyperdisks, GPUs, TPUs, and Local SSDs are billed separately from compute. For AI/ML workloads, GPU cost often dominates the bill.

Network egress. Traffic leaving a zone or region, or flowing to another cloud or an on-premises destination, is charged per GB. VM-to-VM traffic within the same zone is free.

What Are the Google Cloud Compute Engine Machine Families?

GCP’s current general-purpose machine series includes C3, C3D, C4, C4A, C4D, E2, N1, N2, N2D, N4, N4D, and N4A. Here are the key categories and their 2026 pricing positioning:

E2 Series: Cost-Optimized General Purpose

E2 is the most cost-efficient family for standard web, application, and dev/test workloads. The e2-standard-4 (4 vCPU / 16 GB) in us-central1 is priced as follows:

On-demand: $0.134/hr (~$97.84/mo)
1-year CUD: $0.0844/hr (~$61.64/mo); save 37%
3-year CUD: $0.0603/hr (~$44.03/mo); save 55%

N2 Series: Balanced Performance

N2 runs on Intel Ice Lake / Cascade Lake and is the step up from E2 for workloads that need higher sustained compute throughput. The n2-standard-4 (4 vCPU / 16 GB) in us-central1 is priced as follows:

On-demand: $0.1942/hr (~$141.79/mo)
1-year CUD: $0.1224/hr (~$89.33/mo); save 37%
3-year CUD: $0.0874/hr (~$63.81/mo); save 55%

N4 / C4: Current Generation (2024–2026)

This generation covers four distinct series, each targeting a different price-performance profile:

N4 is powered by 5th-gen Intel Xeon (Emerald Rapids), scales up to 80 vCPUs and 640 GB DDR5, and is the best all-round current-gen choice for general-purpose workloads.
C4 runs 6th-gen Intel Xeon (Granite Rapids) at a sustained all-core turbo of 3.9 GHz, supports up to 200 Gbps per-VM networking and 2.2 TB DDR5 memory, and is best suited for compute-dense applications.
N4A is powered by Google’s Axion processor (Arm Neoverse N3) and is engineered as the most efficient Arm-based series for containerized workloads and GKE deployments.
N4D runs AMD EPYC Turin, reaches up to 96 vCPUs and 768 GB DDR5 memory, and includes dynamic resource management for improved host utilization.

Memory-Optimized: M1, M2, M3, M4

This family is purpose-built for SAP HANA, large in-memory databases, and OLAP workloads where memory capacity matters more than vCPU count.

M4 supports up to 6 TB of memory and M2 up to 12 TB, making them the highest-memory options on Compute Engine.
3-year resource-based CUDs deliver greater than 60% savings on this family, the deepest discount available on GCP.
M1 and M2 qualify for SUDs of up to 30% if you prefer not to commit upfront.
Flexible (spend-based) CUDs for memory-optimized VMs are available on 3-year terms only; a 1-year flexible commitment provides zero discount on this family, which is a common sizing mistake.

Compute-Optimized: C2

C2 is designed for HPC, gaming servers, and compute-intensive single-threaded applications where raw per-core performance matters more than memory density. It carries a higher per-vCPU price than E2 and N2, and it is eligible for SUDs, though it does not offer the same CUD discount depth as the general-purpose families.

Accelerator-Optimized: A2, A3, G2, A4

The accelerator-optimized family is the right choice for any massively parallelized CUDA workload, like ML training, inference, and HPC. Here are two things to consider before committing:

A4X Max bare metal instances offer up to 144 vCPUs, 960 GB of memory, and 4 NVIDIA B300 GPUs per instance, making them the highest-density option on the platform.
CUD discounts on GPU instances are significantly shallower than on general-purpose compute. A g2-standard-4 with an NVIDIA L4 GPU saves only about 8% on a 1-year CUD and 11% on a 3-year CUD, compared to 37% and 55% for an E2 or N2 instance of equivalent vCPU count. GPU instances are also not eligible for compute flexible CUDs.

For GPU workloads, driving utilization above 80% delivers more cost reduction than commitment discounts alone.

Note: Verify at Google Cloud Pricing as rates keep changing.

Reference Pricing Table: Selected Instance Types, us-central1, Linux, April 2026

Instance	vCPU	RAM	On-Demand/hr	1-Yr CUD/hr	3-Yr CUD/hr	Spot/hr
e2-standard-4	4	16 GB	$0.1340	$0.0844	$0.0603	$0.0617
n2-standard-4	4	16 GB	$0.1942	$0.1224	$0.0874	$0.0543
n1-standard-4	4	15 GB	$0.1900	~$0.1197	~$0.0855	varies
g2-standard-4 (NVIDIA L4 GPU)	4	16 GB	$0.7070	$0.6525	$0.6261	$0.6188
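The discount percentages quoted throughout this guide can be re-derived from the hourly rates in the table. The sketch below does that for the three instance types with complete rows; the dictionary layout and function name are illustrative, and the rates are copied from the table, so they should be re-checked against live pricing.

```python
# Hourly rates ($/hr), us-central1, Linux, as listed in the reference table.
RATES = {
    "e2-standard-4": {"on_demand": 0.1340, "cud_1y": 0.0844, "cud_3y": 0.0603, "spot": 0.0617},
    "n2-standard-4": {"on_demand": 0.1942, "cud_1y": 0.1224, "cud_3y": 0.0874, "spot": 0.0543},
    "g2-standard-4": {"on_demand": 0.7070, "cud_1y": 0.6525, "cud_3y": 0.6261, "spot": 0.6188},
}


def discount_pct(on_demand: float, discounted: float) -> int:
    """Discount versus on-demand, rounded to a whole percent."""
    return round((1 - discounted / on_demand) * 100)


for instance, r in RATES.items():
    summary = ", ".join(
        f"{model}: {discount_pct(r['on_demand'], r[model])}%"
        for model in ("cud_1y", "cud_3y", "spot")
    )
    print(f"{instance}: {summary}")
```

The GPU row makes the contrast visible: the same commitment terms that take more than half off an N2 instance move the g2-standard-4 rate by roughly a tenth.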

How Much Does Google Cloud GPU Compute Cost?

GPU pricing on GCP operates on a fundamentally different cost structure from general-purpose compute. The accelerator cost dominates, and CUD discounts are far shallower than on CPU instances.

Current GPU instance pricing, us-central1, Linux

Instance	GPU	vCPU	RAM	On-Demand/hr	1-Yr CUD/hr	3-Yr CUD/hr	Spot/hr
g2-standard-4	1× NVIDIA L4	4	16 GB	$0.7070	$0.6525	$0.6261	$0.6188
g2-standard-8	1× NVIDIA L4	8	32 GB	~$1.1172	~$1.030	~$0.988	~$0.976

(Verify at cloud.google.com/compute/all-pricing — rates change. Sourced April 2026.)

Four things stand out versus general-purpose pricing:

CUD discounts are minimal on GPU instances. The g2-standard-4 sees only 8% savings on a 1-year CUD and 11% on a 3-year CUD. Compare that to 37% and 55% respectively for n2-standard-4. The reason is GPU hardware costs dominate the instance price, and Google prices GPU commitment discounts conservatively.
Spot discounts on GPU instances are also limited. The g2-standard-4 drops only 12% on Spot versus 72% for n2-standard-4. For ML training batch jobs, Spot GPUs still make sense over long runs, but the economics are tighter than CPU Spot.
GPUs are not eligible for compute flexible CUDs. Only vCPUs, memory, and Local SSD resources qualify for flexible (spend-based) commitments. GPU usage must be covered by resource-based CUDs or run on-demand / Spot.
Utilization is the real lever for GPU cost control. A GPU running at 20% utilization between training runs costs the same as one at 100%. For inference workloads, right-sizing from an A2/G2 to a smaller accelerator type or batching requests to drive GPU utilization above 80% typically recovers more cost than commitment discounts alone.
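To see why utilization outweighs commitment discounts on GPU instances, one simple approach is to divide the hourly rate by the fraction of time the GPU does useful work. The helper below is a hypothetical sketch using the g2-standard-4 rates from the table above; "utilization" here is a simplified stand-in for whatever busy-time metric you track.

```python
def cost_per_useful_gpu_hour(hourly_rate: float, utilization: float) -> float:
    """Effective price of one hour of useful GPU work.

    utilization is the fraction of billed time the GPU is actually busy.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_rate / utilization


ON_DEMAND = 0.7070  # g2-standard-4 on-demand, $/hr
CUD_3Y = 0.6261     # g2-standard-4 3-year CUD, $/hr

print(f"On-demand at 20% busy:  ${cost_per_useful_gpu_hour(ON_DEMAND, 0.20):.2f}")
print(f"3-year CUD at 20% busy: ${cost_per_useful_gpu_hour(CUD_3Y, 0.20):.2f}")
print(f"On-demand at 80% busy:  ${cost_per_useful_gpu_hour(ON_DEMAND, 0.80):.2f}")
```

Under these assumptions, a 3-year commitment at 20% utilization still costs several times more per useful hour than on-demand capacity kept 80% busy.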

What Is the Free Tier for Google Cloud Compute?

New customers receive $300 in free credits for Compute Engine during the first 90 days. The always-free tier includes:

One non-preemptible e2-micro VM per month in us-west1, us-central1, or us-east1
5 GB of snapshot storage in select regions
1 GB of network egress from North America per month

The e2-micro free tier covers continuous operation within its hour allowance, which is usually sufficient for a small personal project or testing environment. Any VM usage outside these allowances is billed at standard rates from the first second.

What Are Sustained Use Discounts (SUDs) and How Do They Work?

Sustained Use Discounts are automatic. Google calculates SUDs at the end of each billing month and applies credits to qualifying resources. There is no action required from you to receive them.

How SUDs calculate

Compute Engine applies SUD credits incrementally as a resource crosses usage thresholds at 25%, 50%, 75%, and 100% of the billing month. If you run a VM for the full month, the incremental discounts combine to a maximum net discount of up to 30% off the on-demand rate. The exact cap depends on the machine series, so confirm it for the series you run.
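The incremental mechanism can be sketched as a tiered calculation. The tier multipliers below (each successive quarter of the month billed at 100%, 80%, 60%, and 40% of the base rate) are an assumption chosen because they reproduce the 30% full-month figure; they are not taken from this article, and other machine series may use a different schedule.

```python
# ASSUMPTION: fraction of the base rate charged in each quarter of the month.
TIER_MULTIPLIERS = (1.0, 0.8, 0.6, 0.4)


def sud_effective_discount(fraction_of_month: float,
                           tiers: tuple = TIER_MULTIPLIERS) -> float:
    """Net discount versus on-demand for a VM that runs part of the month."""
    if not 0 <= fraction_of_month <= 1:
        raise ValueError("fraction_of_month must be between 0 and 1")
    if fraction_of_month == 0:
        return 0.0
    tier_width = 1 / len(tiers)
    billed = 0.0
    for index, multiplier in enumerate(tiers):
        # Portion of the run that falls inside this usage tier.
        used_in_tier = min(max(fraction_of_month - index * tier_width, 0.0), tier_width)
        billed += used_in_tier * multiplier
    return 1 - billed / fraction_of_month


for share in (0.25, 0.50, 0.75, 1.00):
    print(f"{share:.0%} of month -> {sud_effective_discount(share):.0%} net discount")
```

Under this schedule a VM that runs only half the month nets about 10%, which is why SUDs reward continuous operation rather than bursty usage.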

Which machine series are eligible

SUDs apply to vCPU and memory for N1, N2, N2D, C2, M1, and M2 machine types, as well as GPUs attached to N1 instances. E2, C3, C4, N4, and all accelerator-optimized series are not eligible.

(Verify at cloud.google.com/compute/docs/sustained-use-discounts as eligibility can change.)

The critical rule: SUDs and CUDs cannot be combined

SUDs do not apply to resources already covered by CUDs. CUDs take precedence. For any workload running full-time on an eligible machine series, a 3-year CUD delivers 55% savings versus the 30% SUD cap. For predictable, steady-state workloads, the CUD is always the better choice.

SUD vs. CUD at scale: what the gap actually costs

A team running 10 × n2-standard-4 instances 24/7 on SUD saves approximately 26–30%, recovering roughly $370–$424/month versus on-demand. The same fleet under a 3-year CUD saves approximately $778/month. The numbers at scale:

10 instances over 36 months: $12,000+ left on the table by relying on SUDs alone
100 instances over 36 months: $120,000+ recoverable by switching to commitments

[SCREENSHOT: GCP Billing Console — Credits tab showing SUD credits line item for a project running N2 instances continuously]


What Are Google Cloud Committed Use Discounts (CUDs) and Which Type Should You Use?

CUDs are the highest-leverage discount mechanism for predictable workloads. GCP offers two types: resource-based CUDs and compute flexible CUDs (spend-based).

Resource-Based CUDs: Commit to Specific Resources in a Region

Resource-based commitments are purchased for specific machine series resources, such as vCPUs, memory, GPUs, or Local SSD, in a specific region and project, for a 1-year or 3-year term.

Discount levels include:

Up to 55% off for most machine types on a 3-year resource-based CUD; up to 70% for memory-optimized types.
1-year resource-based CUDs deliver approximately 37% savings on general-purpose families.

Key constraints:

Region-locked. A us-central1 commitment cannot cover a europe-west4 VM.
Does not apply to preemptible VMs, N1 shared-core types, or extended memory.
Custom machine types incur a 5% premium over standard commitment prices.
You pay for committed resources whether you run them or not.

Flexible (Spend-Based) CUDs: Commit to Hourly Spend Across Services

Compute flexible CUDs are spend-based commitments where you commit to a minimum hourly spend on eligible services. Expanded coverage now includes Compute Engine, GKE, and Cloud Run across all Cloud Billing accounts following the 2025–2026 rollout. All billing accounts were automatically migrated to the new model.

Flexible CUDs offer lower discount percentages than resource-based CUDs but eliminate machine-type and regional lock-in. They are the better choice for teams with mixed or shifting workload composition.

Flexible CUD constraints:

Cannot be used for Spot VMs or preemptible VMs.
Memory-optimized VMs are eligible only on 3-year flexible commitments; a 1-year flexible commitment provides no discount for memory-optimized usage.
GPUs are not eligible for compute flexible CUDs; only vCPUs, memory, and Local SSD resources qualify.

Resource-Based vs. Flexible CUD: Decision Framework

Factor	Resource-Based CUD	Flexible (Spend-Based) CUD
Discount ceiling	Up to 55–70%	Lower % but broader coverage
Scope	Specific region, machine series	Compute Engine + GKE + Cloud Run
Workload fit	Stable, predictable VM fleet	Mixed or shifting workload mix
Memory-optimized	1-year or 3-year	3-year only
GPU coverage	Yes (with attachment)	No
Flexibility	Region and type locked	Broader application
Risk if underutilized	Pay for unused commitment	Pay for unused commitment

Use resource-based CUDs when: Your fleet stays in the same region and on the same machine series for 1–3 years. Common for production databases, Kafka clusters, and dedicated application-tier services.

Use flexible CUDs when: You run mixed workloads across Compute Engine and GKE, your instance mix shifts quarterly, or you want CUD coverage to follow your spend rather than your resource configuration.

What Is the Real Cost of CUD Underutilization?

When you purchase a CUD directly from Google and your usage falls below the committed level, you pay for the full commitment anyway. There is no refund, no credit, and no cashback. The discount you were counting on becomes a sunk cost for capacity you never ran.

A real scenario: A team commits to 50 × n2-standard-4 (1-year, us-central1) at the start of a product cycle. Six months in, they containerize 20 of those workloads and migrate to GKE Autopilot. Those 20 VMs are gone. The commitment remains. For the remaining 6 months, they pay approximately $0.1224/hr × 20 instances × 720 hours/month = $1,763/month for compute they no longer run. Over the 6-month tail: $10,578 in wasted commitment spend.
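The arithmetic in this scenario can be reproduced with a small helper. The function name is illustrative; the inputs are the figures from the scenario (1-year CUD rate, 20 idle instances, 720 hours per month, 6 months remaining).

```python
def stranded_commitment_cost(committed_rate_per_hour: float,
                             idle_instances: int,
                             hours_per_month: int,
                             months_remaining: int) -> tuple:
    """Return (monthly, total) spend on committed capacity that no longer runs."""
    monthly = committed_rate_per_hour * idle_instances * hours_per_month
    return monthly, monthly * months_remaining


monthly, total = stranded_commitment_cost(0.1224, 20, 720, 6)
print(f"Wasted per month: ${monthly:,.2f}")
print(f"Wasted over the remaining term: ${total:,.2f}")
```

The unrounded total lands a few dollars below the figure quoted above, which multiplies the rounded monthly amount by six.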

At scale: On a $100K/month GCP Compute bill with 60% on CUDs:

Remaining $40K/month runs on-demand
At 3-year CUD rates (55% savings), that $40K of usage would cost ~$18K
Monthly overpay from incomplete coverage: $22K
Over 6 months of manual optimization delay: $132,000 in recoverable savings left on the table
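A minimal sketch of the coverage-gap arithmetic, assuming the uncovered portion of the bill would qualify for a flat CUD discount. The function and key names are illustrative; the inputs are the figures from the example above.

```python
def coverage_gap(monthly_bill: float, covered_share: float,
                 cud_discount: float, months: int) -> dict:
    """Estimate the overpay from running eligible usage on-demand."""
    on_demand_spend = monthly_bill * (1 - covered_share)
    cost_if_covered = on_demand_spend * (1 - cud_discount)
    monthly_overpay = on_demand_spend - cost_if_covered
    return {
        "on_demand_spend": on_demand_spend,
        "cost_if_covered": cost_if_covered,
        "monthly_overpay": monthly_overpay,
        "overpay_over_period": monthly_overpay * months,
    }


gap = coverage_gap(monthly_bill=100_000, covered_share=0.60,
                   cud_discount=0.55, months=6)
for name, value in gap.items():
    print(f"{name}: ${value:,.0f}")
```

In practice only part of the uncovered spend is stable enough to commit, so this figure is best read as an upper bound.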

The three CUD failure modes:

Over-commitment: Committing to more vCPUs/memory/spend than actual baseline usage.
Wrong-region commitment: Committing to us-central1 while workloads migrate to europe-west4.
Wrong-machine-series commitment: Committing to N2 while migrating to E2 or N4.

All three result in paying for CUDs that don’t cover live usage, with no recourse from Google.

Not sure how much of your GCP Compute spend is unprotected by CUDs? Usage.ai’s 15-minute free savings analysis identifies your coverage gap, sizes your commitment baseline, and shows the exact dollar recovery available, at zero cost and zero commitment.

Book a Free Savings Analysis → usage.ai

How to Choose Between On-Demand, CUD, SUD, and Spot VMs

Google Cloud Billing Model Comparison

Model	Max Discount	Term	Interruptible	Best For
On-Demand	0%	None	No	Unpredictable or variable workloads
Sustained Use (SUD)	Up to 30%	Auto-applied monthly	No	N1, N2, N2D, C2 running 25%+ of month
1-Year Resource CUD	~37%	12 months	No	Stable, steady-state production VMs
3-Year Resource CUD	Up to 55–70%	36 months	No	Long-running, predictable workloads
Flexible (Spend) CUD	Variable	12–36 months	No	Mixed Compute Engine + GKE + Cloud Run
Spot VM	Up to 91%	None	Yes	Batch, ML training, fault-tolerant jobs

Choose on-demand when: Workloads are new, variable, seasonal, or expected to change significantly within 12 months. Appropriate for burst capacity above your baseline commitment.

Choose SUD when: Running N1, N2, N2D, or C2 instances for most of the month without the certainty needed for a multi-year commitment. SUDs apply automatically with zero action.

Choose a 1-year resource CUD when: You have 12+ months of usage data showing a consistent baseline of specific instance types in a specific region, and that baseline is unlikely to shift by more than 20%.

Choose a 3-year resource CUD when: The above applies and you are confident in a 3-year architecture trajectory. The incremental discount over a 1-year CUD (37% → 55%) is significant at scale, but the commitment window triples, and so does the period in which a workload change can strand the commitment.

Choose flexible CUDs when: Your instance mix includes GKE and Cloud Run, or shifts regularly. The lower ceiling on discounts is the tradeoff for reduced lock-in risk.

Choose Spot VMs when: The workload tolerates interruption. ML training jobs, data pipeline batch steps, CI/CD runners, and video rendering are ideal. Never use Spot for stateful, latency-sensitive production services. Spot VMs cannot receive CUDs or SUDs.

Worked Example: What Does 50 × n2-standard-4 Cost at Every Billing Model?

Assume 50 × n2-standard-4 instances in us-central1 on Linux, running 730 hours/month (a full month).

Billing Model	Rate/hr	Monthly Cost	Annual Cost	Annual Saving vs. On-Demand
On-Demand	$0.1942	$7,089	$85,071	—
SUD (full month, ~26% net)	~$0.1437	~$5,245	~$62,943	~$22,128
1-Year Resource CUD	$0.1224	$4,467	$53,604	~$31,467
3-Year Resource CUD	$0.0874	$3,190	$38,286	~$46,785

At 500 instances, the 3-year CUD vs. on-demand gap is approximately $467,850/year, before any rightsizing. At 3 years, that single commitment decision is worth $1.4 million on a 500-instance fleet.
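The table can be reproduced, and rescaled to any fleet size, with a few lines of arithmetic. The sketch below uses the hourly rates from the table and the 730-hour month; the names are illustrative, and the results differ from the table by a few dollars because the table rounds intermediate values.

```python
HOURS_PER_MONTH = 730  # full-month assumption used in the worked example

# n2-standard-4 hourly rates ($/hr), us-central1, Linux, from the table above.
RATES = {"on_demand": 0.1942, "sud": 0.1437, "cud_1y": 0.1224, "cud_3y": 0.0874}


def annual_fleet_cost(instances: int, hourly_rate: float) -> float:
    """Annual cost of a fleet running every hour of every month."""
    return instances * hourly_rate * HOURS_PER_MONTH * 12


def annual_saving(instances: int, model: str) -> float:
    """Annual saving of a billing model versus on-demand."""
    baseline = annual_fleet_cost(instances, RATES["on_demand"])
    return baseline - annual_fleet_cost(instances, RATES[model])


for model, rate in RATES.items():
    cost = annual_fleet_cost(50, rate)
    print(f"{model:10s} ${cost:>10,.0f}/yr  saves ${annual_saving(50, model):>9,.0f}")
```

Changing the instance count to 500 scales every figure by ten, which is where the roughly $1.4 million three-year gap comes from.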

How Do Spot VMs and Preemptible VMs Work?

Spot prices can change up to once per day and provide discounts of up to 91% off on-demand pricing for many machine types, GPUs, TPUs, and Local SSDs. Compute Engine can reclaim Spot VMs at any time, so they are only recommended for fault-tolerant, interruptible applications.

Spot VM pricing examples (us-central1, Linux):

n2-standard-4 Spot: $0.0543/hr (~$39.64/mo) vs $0.1942 on-demand — 72% discount
e2-standard-4 Spot: $0.0617/hr (~$45.07/mo) vs $0.134 on-demand — 54% discount
g2-standard-4 (NVIDIA L4) Spot: $0.6188/hr vs $0.707 on-demand — only 12% discount

(Verify at cloud.google.com/compute/all-pricing as spot prices fluctuate.)

GPU Spot discounts are significantly lower than CPU Spot. For ML training, Spot is still economical for long batch runs on CPU-type instances, but the economics are tighter on accelerator-optimized families.

Stacking rule: Spot VMs cannot receive CUDs or SUDs. Spot is a standalone pricing tier with no commitment overlay possible.

What Additional Costs Does Compute Engine Carry Beyond Instance Pricing?

Instance pricing is only part of your GCP Compute bill. Five additional cost categories consistently catch teams off guard, especially at scale:

Persistent Disk and Hyperdisk Storage

Storage is billed per GB per month, entirely separate from compute, and adds up fast across large fleets:

Standard persistent disk (us-central1): ~$0.040/GB/month
PD SSD (us-central1): ~$0.170/GB/month

A 100 GB boot disk on every instance in a 500-instance fleet adds $2,000–$8,500/month in storage charges before a single compute hour is counted.
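The fleet-level storage figure follows directly from the per-GB rates. A minimal sketch, with an illustrative function name and the approximate us-central1 rates listed above:

```python
def monthly_disk_cost(instances: int, disk_gb: int, rate_per_gb_month: float) -> float:
    """Monthly storage charge for one boot disk of disk_gb on every instance."""
    return instances * disk_gb * rate_per_gb_month


STANDARD_PD = 0.040  # $/GB/month, approximate
PD_SSD = 0.170       # $/GB/month, approximate

print(f"Standard PD: ${monthly_disk_cost(500, 100, STANDARD_PD):,.0f}/month")
print(f"PD SSD:      ${monthly_disk_cost(500, 100, PD_SSD):,.0f}/month")
```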

Network Egress

Egress is the most commonly overlooked line item on large GCP bills. The rates vary significantly depending on where the traffic is going:

GCP to the internet (North America): ~$0.085/GB for the first 1 TB/month, with decreasing tiers after
Between GCP regions: ~$0.01–$0.02/GB
Intra-zone VM-to-VM traffic: free

Egress spikes typically trace to two causes: uncompressed data transfers or misconfigured CDN routing. Both are fixable, but only if you’re watching the line item.

OS Licensing

Linux VMs carry no OS licensing cost. Windows changes the math significantly:

Windows Server adds ~$0.04/vCPU/hour on top of compute
On a fleet of 50 × n2-standard-4 Windows instances, that adds approximately $5,840/month in licensing overhead, before any compute or storage charges
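The licensing overhead above is a per-vCPU calculation, so it scales with instance size as well as fleet size. A minimal sketch, assuming the approximate $0.04/vCPU/hour premium quoted above and a 730-hour month; the function name is illustrative.

```python
def windows_license_monthly(instances: int, vcpus_per_instance: int,
                            rate_per_vcpu_hour: float = 0.04,
                            hours_per_month: int = 730) -> float:
    """Monthly Windows Server licensing cost on top of compute."""
    return instances * vcpus_per_instance * rate_per_vcpu_hour * hours_per_month


# 50 x n2-standard-4 (4 vCPUs each)
print(f"${windows_license_monthly(50, 4):,.0f}/month in licensing")
```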

If you’re comparing machine types or evaluating a migration, always calculate the total cost including OS licensing, not just the compute rate.

GPU and TPU

GPU instances are priced differently from CPU instances. The accelerator dominates the cost, not the vCPU or memory:

A g2-standard-4 with one NVIDIA L4 GPU runs approximately $0.707/hr on-demand, versus $0.134–$0.194/hr for a 4-vCPU E2 or N2 instance
A4X instances with NVIDIA B300 GPUs run significantly higher

A GPU sitting at 20% utilization between ML training runs is one of the highest per-unit waste sources in cloud infrastructure. Driving utilization above 80% recovers more cost than any commitment discount on this family.

Sole-Tenant Nodes

Sole-tenant nodes carry a tenancy premium on top of the underlying vCPU and memory cost. The tenancy premium itself is eligible for CUDs, even when the underlying compute resources are already covered by a separate resource-based commitment.

How to Optimize Google Cloud Compute Engine Costs: Step-by-Step

Prerequisites

Before purchasing any CUDs, complete the following:

Minimum 30 days of billing data in Cloud Billing or BigQuery export
Identify which project or billing account holds the majority of Compute Engine spend
Map instances to workload categories: always-on production, batch, dev/test, variable

Estimated time: 2–3 hours for analysis; CUD purchases take 5–10 minutes per commitment once sized.

Step 1: Export Billing Data to BigQuery

The GCP Cost Table report gives per-SKU breakdowns but limited filtering. For serious CUD sizing, export billing data to BigQuery; the standard usage cost export writes to a table named gcp_billing_export_v1_<BILLING_ACCOUNT_ID>. Query 90 days of usage per region and machine series before committing.

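-- Find your N2 vCPU baseline by region (past 90 days).
-- Replace project.dataset with the project and dataset that hold your billing export.
SELECT
  location.region AS region,
  -- amount_in_pricing_units is in vCPU-hours; usage.amount is in seconds for core SKUs
  SUM(usage.amount_in_pricing_units) / COUNT(DISTINCT DATE(usage_start_time)) AS avg_daily_vcpu_hours
FROM `project.dataset.gcp_billing_export_v1_*`
WHERE service.description = 'Compute Engine'
  AND sku.description LIKE '%N2 Instance Core%'
  -- exclude Spot/preemptible SKUs (assumed to contain 'Preemptible'), which CUDs cannot cover
  AND sku.description NOT LIKE '%Preemptible%'
  AND usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY)
GROUP BY location.region
ORDER BY avg_daily_vcpu_hours DESC;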

Step 2: Identify Your Baseline vs. Variable Usage

Your CUD target is your floor, i.e. the minimum vCPU/memory usage you run every hour, every day. Not your average. Not your peak. Your floor.

Here’s a practical method: Take the 10th percentile (P10) of hourly vCPU usage over 90 days, per region and machine series. Committing to your median or average is how teams end up with stranded CUDs when workloads dip.
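One way to compute the P10 floor from an hourly usage series is the nearest-rank percentile, sketched below. The sample data is invented purely for illustration, and the function name is hypothetical; in practice the series would come from your billing export, one value per hour for a given region and machine series.

```python
import math
import statistics


def percentile_floor(hourly_vcpus: list, pct: int = 10) -> float:
    """Nearest-rank percentile: the value at or below which pct% of hours fall."""
    if not hourly_vcpus:
        raise ValueError("need at least one hourly sample")
    ordered = sorted(hourly_vcpus)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Invented sample: 100 hourly readings of vCPUs in use.
usage = [200] * 20 + [240] * 60 + [320] * 20

print("P10 floor:", percentile_floor(usage))
print("Median:   ", statistics.median(usage))
print("Mean:     ", statistics.mean(usage))
```

With this sample, committing to the mean or the median would leave the commitment underused during every one of the low hours, while committing to the P10 value keeps it fully used in all of them.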

Step 3: Choose Resource-Based vs. Flexible CUDs

Apply the comparison table from the section above. For a dedicated VM fleet on a single machine series in a stable region: resource-based CUD. For a mixed Compute Engine + GKE environment or a regularly changing instance mix: flexible CUD.

Step 4: Purchase Commitments in GCP Console

Navigate to: Compute Engine → Committed Use Discounts → Purchase Commitment

Select: Region, machine series (for resource-based), term (1 or 3 years), and quantity at your P10 baseline.

For flexible CUDs: Navigate to Billing → Commitments and set hourly spend targets per eligible service.

Auto-renewal is available for both types — enable if your baseline is likely to persist beyond the initial term.

Step 5: Monitor Utilization Monthly

A CUD at 95%+ utilization is working correctly. A CUD at 70% utilization means 30% of your committed spend covers nothing. Set a monthly review in the CUD Analysis Report under the GCP console Billing section.
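Utilization translates directly into an effective rate: the committed price divided by the share of the commitment that actually covers running usage. The sketch below, with illustrative function names and the n2-standard-4 rates from this guide, also derives the break-even utilization below which a commitment costs more than staying on-demand.

```python
def effective_rate(cud_rate: float, utilization: float) -> float:
    """Price per hour of usage actually covered, given commitment utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return cud_rate / utilization


def break_even_utilization(cud_rate: float, on_demand_rate: float) -> float:
    """Utilization at which the effective CUD rate equals the on-demand rate."""
    return cud_rate / on_demand_rate


ON_DEMAND, CUD_1Y, CUD_3Y = 0.1942, 0.1224, 0.0874  # $/hr, n2-standard-4

for utilization in (0.95, 0.70):
    print(f"3-year CUD at {utilization:.0%}: ${effective_rate(CUD_3Y, utilization):.4f}/hr effective")
print(f"1-year break-even utilization: {break_even_utilization(CUD_1Y, ON_DEMAND):.0%}")
print(f"3-year break-even utilization: {break_even_utilization(CUD_3Y, ON_DEMAND):.0%}")
```

At these rates a 70%-utilized 3-year commitment is still cheaper than on-demand, but much of the headline discount is gone, which is the signal the monthly review is meant to catch.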

Step 6: Rightsize and Renew Based on Current Baseline

Before any CUD renewal, re-run the baseline analysis using the most recent 90 days. Teams that migrate to newer machine series (N2 → E2 for cost reduction, N2 → N4 for performance) need updated commitments to match the new architecture. Never auto-renew without re-validating the commitment quantity.

How Does Automated CUD Management Close the Coverage Gap?

The industry benchmark is 6–9 months for teams optimizing GCP CUDs manually to reach full coverage. During that window, every on-demand hour is an overpay against what a CUD rate would have cost.

On a $100K/month GCP compute bill running 40% uncovered:

$40K/month running on-demand
Covered at 3-year CUD rates, that same compute costs ~$18K
Monthly overpay: $22K
Six-month delay: $132,000 in recoverable savings

Usage.ai’s CoPilot identifies baseline usage across Compute Engine, GKE Standard, GKE Autopilot, and Cloud Run, sizes commitments to your actual P10 floor, and purchases and manages commitments automatically. The platform operates at billing-layer access only with no infrastructure changes, no code changes, and a 30-minute setup.

Usage.ai GCP Product Suite:

Product	Covers	Savings Range
Usage Flex Compute Engine CUD	Compute Engine VMs, GKE (Std/Autopilot), Cloud Run, Sole-tenant nodes	28–46%
Usage Flex GKE Autopilot CUD	GKE Autopilot clusters, GKE Standard resources	20–46%
Usage Flex Cloud SQL CUD	Cloud SQL PostgreSQL, MySQL, SQL Server	25–52%

Usage.ai’s CoPilot provides automated cash-back rebates for any underutilization of commitments purchased through the platform. Google does not offer refunds on directly purchased CUDs. This is the difference between taking on commitment risk yourself and having it insured.

The fee structure is a percentage of realized savings only. Zero fee if Usage.ai saves nothing.

See exactly how much your GCP Compute Engine fleet is overpaying. 30-minute setup. Billing-layer access only. No infrastructure changes.


How Does Google Cloud Compute Pricing Compare to AWS and Azure?

For multi-cloud teams, the commitment structure comparison across providers is a real decision variable.

Dimension	GCP CUDs	AWS Savings Plans	Azure Reserved VMs
Commitment types	Resource-based + Flexible (spend-based)	Compute SP, EC2 Instance SP	Reserved VM Instances
Term options	1-year, 3-year	1-year, 3-year	1-year, 3-year
Max discount	Up to 70% (memory-optimized, 3-yr)	Up to ~66% (EC2 SP, 3-yr)	Up to ~72% (3-yr, all upfront)
Automatic discount equivalent	SUD: up to 30% (no action)	None	None
Spot equivalent	Spot VMs (up to 91%)	Spot Instances (up to 90%)	Azure Spot (up to 90%)
Underutilization protection (native)	None	None	None
Underutilization protection (Usage.ai)	Cashback guarantee	Cashback guarantee	Cashback guarantee

GCP’s SUD mechanism is unique among the three major clouds. It provides automatic discounts for resources that are simply running, without any commitment. This makes GCP the most forgiving environment for variable workloads, while CUDs remain the highest-value path for predictable ones.

For a full multi-cloud pricing comparison, see: AWS vs Azure vs GCP: Cloud Pricing Comparison Across Top Services

What Changed in GCP Compute Engine Pricing in 2026?

Flexible CUD expansion: Expanded coverage for compute flexible CUDs became available to all Cloud Billing accounts, with automatic migration to the new spend-based model. The list of eligible SKUs now includes Compute Engine, GKE, and Cloud Run. This is the most significant structural change to GCP commitment purchasing in several years; teams with flexible CUDs now see automatic coverage of containerized workloads previously handled separately.

New machine series GA: N4 (Intel Emerald Rapids), C4 (Intel Granite Rapids), C4A (Google Axion/Arm), and N4D (AMD EPYC Turin) reached GA or expanded preview in 2024–2026. These offer higher density and better price-performance than N2/C2 for containerized and microservices deployments.

M4 CUD availability: Resource-based CUDs became available for M4 machine types (up to 6 TB memory), opening the highest-memory GCP instances to 3-year commitment discounts exceeding 60%.

Pricing Calculator: Google’s pricing calculator at calculator.cloud.google.com reflects current rates and supports side-by-side billing model comparison. Use it to validate worked examples before committing.

Every day without full CUD coverage = $6–12K in recoverable GCP spend. Usage.ai’s GCP CoPilot sizes, purchases, and manages your commitments automatically with cashback protection on any underutilization. Only pay when you save.


Setup in 30 minutes. No infrastructure changes.

Frequently Asked Questions

1. How much does Google Cloud Compute Engine cost per hour?

Pricing depends on machine type, region, and OS. An e2-standard-4 (4 vCPU / 16 GB) in us-central1 costs $0.134/hr on-demand, dropping to $0.0844/hr on a 1-year CUD (37% savings) and $0.0603/hr on a 3-year CUD (55% savings). An n2-standard-4 costs $0.1942/hr on-demand. GPU instances start significantly higher: a g2-standard-4 (NVIDIA L4) runs approximately $0.707/hr.

2. Is Google Compute Engine free?

Not in general. Google’s Always Free tier includes one non-preemptible e2-micro VM per month in us-west1, us-central1, or us-east1. New customers also receive $300 in credits usable for 90 days. Any VM usage beyond these allowances is billed at standard rates from the first second of use.

3. What is the difference between a CUD and a SUD on GCP?

A Sustained Use Discount (SUD) is automatic: you get up to 30% off when you run a qualifying VM (N1, N2, N2D, C2, M1, M2) for more than 25% of a billing month, with no purchase required. A Committed Use Discount (CUD) requires a 1- or 3-year purchase commitment in exchange for 37–70% savings depending on machine type and term. CUDs deliver higher savings but carry commitment risk. The two cannot be combined; CUDs take precedence when both would otherwise apply.

4. What happens if I over-commit on a GCP CUD?

You pay for the full commitment regardless of actual usage. Google does not offer refunds, credits, or cashback on unused CUDs purchased directly. If your fleet shrinks mid-term and CUD utilization drops to 70%, you are paying for 30% of committed capacity you no longer run. This underutilization risk is the primary reason teams use automated CUD management with cashback protection rather than self-managing commitments directly.

5. What are Spot VMs and when should I use them?

Spot VMs are Compute Engine instances available at discounts of up to 91% off on-demand pricing. Google can reclaim Spot VMs at any time, making them suitable only for fault-tolerant, interruptible workloads: ML training batch jobs, data pipeline steps, CI/CD runners, and video rendering. Spot VMs cannot receive CUDs or SUDs and must be treated as a standalone pricing tier. Spot prices vary by machine type and region and can change up to once per day.

6. How do I calculate the right CUD size without over-committing?

Export 90 days of billing data to BigQuery. Calculate hourly vCPU and memory usage per region and machine series. Target the 10th percentile (P10) of the hourly distribution as your CUD quantity. This is your true floor. Committing to your median or average results in over-commitment during workload downturns, which Google will not refund. Review and resize commitments at every renewal.

7. Can GCP CUDs cover GKE and Cloud Run in addition to Compute Engine?

Yes. Flexible (spend-based) CUDs now cover Compute Engine, GKE (Standard and Autopilot), and Cloud Run under the expanded eligibility model rolled out to all Cloud Billing accounts in 2025–2026. Resource-based CUDs cover only specific Compute Engine VM resources. If workloads span GKE and Compute Engine, flexible CUDs reduce per-service commitment complexity at the cost of slightly lower discount percentages.

8. What is the Google Cloud Pricing Calculator and how do I use it?

The Google Cloud Pricing Calculator at calculator.cloud.google.com lets you select any service, machine type, region, OS, and billing model (on-demand, 1-year CUD, or 3-year CUD) to get an estimated monthly and annual cost before committing. It supports multi-service estimates, so you can model compute, storage, and networking together. Enter your expected instance count, hours per month, and storage needs to get the totals. For commitment decisions, run both 1-year and 3-year CUD scenarios and compare break-even timelines against your expected workload stability before purchasing. Always cross-reference results against live pricing at cloud.google.com/compute/all-pricing.
