Gotchas

The 8 hidden costs of cloud monitoring

Verified April 2026

Vendor pricing pages do not lie. They are also not complete. Here are eight specific charges that consistently turn budgeted estimates into surprise invoices.

TL;DR

The median actual monitoring invoice runs 37 to 97 percent above the initial list-pricing estimate. The reasons are consistent: custom metric cardinality, log indexing on top of ingest, retention defaults, APM span indexing, container counting, high-water mark host accounting, migration overhead, and tool sprawl across overlapping platforms.

Eight charges that do not appear on the pricing page

Hidden cost

Custom metrics explosion

Kubernetes labels generate metric series in the hundreds of thousands.

A typical Kubernetes deployment carries five labels (pod, namespace, container, version, region). Each unique combination of label values is a separate metric time series. A 50-host cluster with moderate microservice count routinely produces 500,000+ custom metric series. At Datadog's list rate of $0.05/100/mo above the included 100/host, that is roughly $2,000/month of custom metric overage on top of the base infrastructure bill.
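
The way label combinations turn into billable series can be sketched in a few lines. This is a minimal illustration, not any vendor's counting logic: the function name `series_per_metric`, the metric name `http.requests`, and the sample label values are all hypothetical.

```python
from collections import defaultdict

def series_per_metric(samples):
    """Count distinct label-value combinations seen for each metric name."""
    seen = defaultdict(set)
    for metric, labels in samples:
        # Sort label items so the same label set always maps to one key.
        seen[metric].add(tuple(sorted(labels.items())))
    return {metric: len(combos) for metric, combos in seen.items()}

# Hypothetical samples using the five labels named above.
base = {"namespace": "prod", "container": "app", "region": "eu"}
samples = [
    ("http.requests", {**base, "pod": "api-1", "version": "v1"}),
    ("http.requests", {**base, "pod": "api-2", "version": "v1"}),
    ("http.requests", {**base, "pod": "api-1", "version": "v2"}),
    ("http.requests", {**base, "pod": "api-1", "version": "v1"}),  # repeat, same series
]

counts = series_per_metric(samples)
print(counts)
```

Every new pod name or version string adds a series, which is why short-lived pods and frequent deploys push the count up even when the number of metric names stays flat.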

Typical impact

10 to 30 percent of total bill, surprise overages most common

Hidden cost

Log indexing on top of ingestion

Two meters fire for every gigabyte of logs you send.

Datadog charges $0.10/GB to ingest logs and an additional $1.70 per million indexed events to make them searchable. At an average of 1 million events per gigabyte (about 1 KB per event), a 100 GB/day log volume costs roughly $300/month in ingestion and $5,100/month in indexing over a 30-day month. Smaller events mean more events per gigabyte and a proportionally larger indexing charge. Splunk and Elastic apply analogous splits between ingest and search-tier storage.
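
The two meters can be estimated separately. The sketch below uses the list rates quoted above and assumes a 30-day month; the function name `monthly_log_cost` and the two event densities are illustrative assumptions, chosen to show how strongly the indexing line depends on average event size.

```python
def monthly_log_cost(gb_per_day, events_per_gb, days=30,
                     ingest_per_gb=0.10, index_per_million_events=1.70):
    """Return (ingest_cost, indexing_cost) in dollars for one month."""
    gb = gb_per_day * days
    ingest = gb * ingest_per_gb
    # Indexing is metered per event, so it scales with event density.
    indexing = gb * events_per_gb / 1_000_000 * index_per_million_events
    return ingest, indexing

# Assumed densities: ~1 KB events versus ~250-byte events.
for density in (1_000_000, 4_000_000):
    ingest, indexing = monthly_log_cost(100, density)
    print(f"{density:>9,} events/GB: ingest ${ingest:,.0f}, indexing ${indexing:,.0f}")
```

Ingestion cost is the same in both runs; only the indexing line moves. Measuring your own average event size is therefore the first input to get right.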

Typical impact

20 to 50 percent of total bill for log-heavy stacks

Hidden cost

High-water mark host counting

Autoscaling spikes during peak traffic raise the bill for the whole month.

Datadog and several peers bill on hourly host counts across the month rather than on the steady-state count. A scale-out event during a marketing campaign or a backfill batch job can raise the billable host count for the entire billing month, not just for the hours the extra hosts ran. Teams running event-driven workloads often discover that their effective host count is two to three times the steady-state baseline.
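
A rough way to see the effect is to simulate a month of hourly counts. The sketch below assumes a scheme that discards the top 1 percent of hours and bills on the highest remaining hourly count; that rule, the function name `billable_hosts`, and the host numbers are assumptions for illustration, so check your own contract for the exact method.

```python
def billable_hosts(hourly_counts, discard_fraction=0.01):
    """Highest hourly host count after discarding the top fraction of hours."""
    ordered = sorted(hourly_counts)
    drop = int(len(ordered) * discard_fraction)  # number of peak hours ignored
    kept = ordered[: len(ordered) - drop]
    return kept[-1]

HOURS = 720  # 30-day month
baseline, peak = 50, 150

short_spike = [peak] * 5 + [baseline] * (HOURS - 5)   # 5-hour burst
long_spike = [peak] * 24 + [baseline] * (HOURS - 24)  # one-day campaign

print("5-hour spike bills as:", billable_hosts(short_spike))
print("24-hour spike bills as:", billable_hosts(long_spike))
print("24-hour spike, true average:", round(sum(long_spike) / HOURS, 1))
```

Under these assumptions a burst shorter than the discarded window is free, while a single day at peak sets the billable count for the month even though the true average barely moves.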

Typical impact

10 to 25 percent above expected

Hidden cost

Retention defaults

Past 15 days, retention multiplies log indexing cost roughly 1.5x to 4x.

Default log retention is typically 15 days. Moving to 30, 60, or 90 days multiplies indexed-event storage cost by approximately 1.5x, 2.5x, and 4x respectively. Most teams discover this when an incident or compliance audit forces them to query historical data and the next invoice doubles.
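
The multipliers above can be applied to an existing indexing line to preview a retention change before making it. This is a sketch only: the multiplier table restates the approximate figures in the paragraph, and the $5,100 baseline and the function name `indexing_cost_at_retention` are hypothetical.

```python
# Approximate multipliers relative to the 15-day default, as quoted above.
RETENTION_MULTIPLIER = {15: 1.0, 30: 1.5, 60: 2.5, 90: 4.0}

def indexing_cost_at_retention(base_cost_15d, retention_days):
    """Scale a 15-day indexing cost to another retention tier."""
    if retention_days not in RETENTION_MULTIPLIER:
        raise ValueError(f"no multiplier known for {retention_days} days")
    return base_cost_15d * RETENTION_MULTIPLIER[retention_days]

base = 5_100  # hypothetical monthly indexing cost at 15-day retention
projection = {days: indexing_cost_at_retention(base, days)
              for days in RETENTION_MULTIPLIER}
print(projection)
```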

Typical impact

1.5x to 4x multiplier on log line items

Hidden cost

APM span indexing overage

1M indexed spans per APM host included; busy services produce more.

Datadog APM includes 1 million indexed spans per APM host per month. Above that, teams pay roughly $1.70/million. A single busy microservice on a single host can generate millions of spans daily. Trace sampling at 5 to 10 percent restores predictability without meaningful loss of fidelity for most workloads.
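
The effect of sampling on the overage line can be estimated as follows. The sketch uses the included allowance and the rough rate quoted above, assumes a 30-day month, and assumes sampling reduces indexed spans proportionally; the function name `apm_span_overage` and the 3 million spans/day workload are hypothetical.

```python
def apm_span_overage(hosts, spans_per_host_per_day, sample_rate, days=30,
                     included_per_host=1_000_000, price_per_million=1.70):
    """Monthly overage in dollars for indexed spans above the included amount."""
    indexed = hosts * spans_per_host_per_day * days * sample_rate
    included = hosts * included_per_host
    overage = max(0.0, indexed - included)
    return overage / 1_000_000 * price_per_million

# Hypothetical workload: one APM host emitting 3 million spans per day.
full = apm_span_overage(1, 3_000_000, sample_rate=1.0)
sampled = apm_span_overage(1, 3_000_000, sample_rate=0.10)
print(f"100% indexed: ${full:,.2f}/month, 10% sampled: ${sampled:,.2f}/month")
```

Multiply the per-host result by the number of busy services to see how the line grows; the included allowance scales with hosts, but span volume scales with traffic.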

Typical impact

5 to 20 percent overage on APM line

Hidden cost

Container counting

Misconfigured agents bill every pod as a fractional or full host.

Datadog counts containers as fractional hosts. Misconfigured agents, sidecars, and DaemonSets can cause every pod to be billed as a separate host, multiplying the apparent host count tenfold. Audits of Kubernetes monitoring deployments routinely uncover this misconfiguration.

Typical impact

Up to 10x cost multiplication when triggered

Hidden cost

Migration and switching costs

Six to twelve weeks of engineering time across dashboards, alerts, runbooks, and retraining.

Switching observability vendors is rarely a flip of a config file. Dashboards must be rewritten, alerts re-authored, runbooks updated, on-call teams retrained. A realistic budget for a 100-host migration is 6 to 12 weeks of engineering time across multiple roles, which at loaded engineering cost is $30,000 to $80,000 one-off.

Typical impact

One-off cost, often justifies staying or accelerates leaving

Hidden cost

Observability tool sprawl

Multiple paid platforms covering overlapping signal categories.

A common pattern is Datadog for infra, PagerDuty for alerts, Splunk for logs, Sentry for errors, plus a homegrown dashboard. Total spend across overlapping tools commonly runs two to three times what a consolidated single vendor would cost. Consolidation is one of the largest discrete levers available, particularly when combined with the per-vendor cost-reduction strategies elsewhere on this site.

Typical impact

20 to 40 percent saving on total observability spend

Audit yourself

A 10-point cost audit checklist

Run this once a quarter. Most teams find at least one substantial saving in under an hour.

01. Pull a 12-month invoice trend. Identify month-over-month growth that exceeds infra growth.
02. Categorise spend: infra, APM, logs, custom metrics, RUM, synthetics, seats. Most teams find one category at 40 percent of total.
03. Audit custom metric cardinality. List the top 10 highest-cardinality metrics and the labels generating them.
04. Audit log volume by source. Drop or sample any log source above 10 GB/day that is not actively queried.
05. Audit retention setting per pipeline. Match retention to actual query patterns.
06. Audit APM trace sampling. Default to 10 percent unless there is a documented reason for 100 percent.
07. Audit container monitoring configuration. Confirm container-as-host billing is intentional.
08. Audit dev/staging monitoring. Confirm only production is on the paid tier.
09. Audit overlapping vendors. Identify any signal type covered by two or more paid platforms.
10. Schedule a quarterly review. Cost growth that outpaces infra growth is a leading indicator of trouble.
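
The first and last checklist items both compare spend growth with infra growth, which is easy to automate once invoices and host counts sit side by side. A minimal sketch, assuming monthly totals are already exported; the function name `months_outpacing_infra`, the tolerance, and the sample figures are hypothetical.

```python
def months_outpacing_infra(invoices, hosts, tolerance=0.0):
    """Return (month_index, invoice_growth, host_growth) where spend growth
    exceeds host growth by more than `tolerance`."""
    flagged = []
    for i in range(1, len(invoices)):
        invoice_growth = invoices[i] / invoices[i - 1] - 1
        host_growth = hosts[i] / hosts[i - 1] - 1
        if invoice_growth > host_growth + tolerance:
            flagged.append((i, round(invoice_growth, 3), round(host_growth, 3)))
    return flagged

# Hypothetical four months of invoice totals and average host counts.
invoices = [10_000, 10_200, 11_500, 11_600]
hosts = [100, 102, 103, 104]

flagged = months_outpacing_infra(invoices, hosts, tolerance=0.02)
print(flagged)
```

A flagged month is a prompt to open that invoice and find which line item moved, not a diagnosis in itself.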

Frequently asked

Why is my Datadog bill so high?

The most common causes, in order of frequency: custom metrics generated from high-cardinality Kubernetes labels, log indexing charges on top of ingestion, retention upgrades beyond the 15-day default, APM span indexing above the included 1M per host, and container counting that bills each pod as a fractional or full host.

What are the hidden costs of observability tools?

Vendor pricing pages do not typically itemise custom-metric cardinality charges, log indexing on top of ingestion, retention multipliers, APM span overages, high-water mark hourly host counting, container-to-host conversion, migration costs, or tool sprawl across overlapping platforms.

How much higher than list price are real bills?

Independent and vendor research consistently puts the median actual invoice at 37 to 97 percent above the initial list-pricing estimate that buyers receive during the sales cycle. The variance is driven by the eight mechanisms documented on this page.

How do I audit my current monitoring spend?

Pull twelve months of invoices, categorise spend by signal type, identify the single largest line item, and audit the configuration that drives it (custom metric cardinality, log volume, retention setting, APM sampling rate). Repeat quarterly. Most teams find one obvious lever inside the first hour.