Drift Monitor

Aggregate dashboards catch the obvious. Drift Monitor catches the slow, distributional changes that aggregate charts miss — a small but steady increase in tool-call retries, a quiet shift in response length, a gradually rising refusal rate.

What “drift” means here

Drift Monitor compares the distribution of a metric in a recent window against a baseline distribution of the same metric. When the recent distribution diverges far enough from the baseline, the metric is flagged. The composite drift score combines the movement across all five tracked metrics.
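
To make the idea of comparing two distributions concrete, here is a minimal sketch in Python. It bins both windows over shared edges and measures the gap with the Jensen-Shannon divergence. The choice of statistic, the bin count, and the function names are all assumptions made for illustration; this page does not say which measure Drift Monitor uses internally.

```python
import bisect
import math


def histogram(values: list[float], edges: list[float]) -> list[float]:
    """Return the share of values in each bin; out-of-range values are clamped."""
    if not values:
        raise ValueError("cannot build a histogram from an empty window")
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = bisect.bisect_right(edges, v) - 1
        idx = max(0, min(idx, len(counts) - 1))  # clamp into first/last bin
        counts[idx] += 1
    return [c / len(values) for c in counts]


def js_divergence(p: list[float], q: list[float]) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x: list[float], y: list[float]) -> float:
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def drift(baseline: list[float], recent: list[float], bins: int = 10) -> float:
    """Illustrative drift score: bin edges come from the baseline window."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid zero-width bins for constant data
    edges = [lo + i * width for i in range(bins + 1)]
    return js_divergence(histogram(baseline, edges), histogram(recent, edges))
```

A score of 0 means the two windows fall into the same bins in the same proportions; values closer to 1 mean the recent window has moved away from the baseline.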

The five metrics

Metric	What it captures
Latency distribution	Wall-clock duration of each call.
Token-count distribution	Total tokens (input + output) per call.
Cost distribution	Cost per call.
Error rate	Share of errored calls in the window.
Output-length distribution	Character or token count of completions.

Each metric contributes to the composite score with a configurable weight.
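
One plausible way to combine weighted per-metric scores is a weighted mean, sketched below. The metric keys, the assumption that each per-metric score is already on a comparable scale, and the aggregation formula itself are hypothetical; the page only states that the weights are configurable.

```python
def composite_score(
    metric_scores: dict[str, float], weights: dict[str, float]
) -> float:
    """Weighted mean of per-metric drift scores (assumed aggregation)."""
    total_weight = sum(weights[name] for name in metric_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(score * weights[name] for name, score in metric_scores.items())
    return weighted / total_weight


# Hypothetical metric keys mirroring the five metrics in the table above.
scores = {
    "latency": 0.2,
    "token_count": 0.1,
    "cost": 0.1,
    "error_rate": 0.5,
    "output_length": 0.1,
}
equal_weights = {name: 1.0 for name in scores}
error_heavy = {**equal_weights, "error_rate": 5.0}
```

With equal weights the composite is the plain average of the five scores. Raising the weight on `error_rate` pulls the composite toward that metric, which is the practical effect of tuning weights.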

Running a drift check

Drift snapshots can run on a schedule or be triggered ad hoc through the API.

POST /api/drift/compute
{
  "app_id": "...",
  "baseline_window": { "days": 30 },
  "recent_window": { "days": 7 }
}

Each computation produces a DriftSnapshot and one DriftResult per metric. The snapshot powers the timeline chart; the per-metric results power the distribution histograms.
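
For an ad hoc trigger from a script, the request above can be built with the Python standard library, as in this sketch. The base URL, the bearer-token header, and the helper name are assumptions; check your deployment for the actual host and authentication scheme. Only the path and the body fields come from the example request.

```python
import json
import urllib.request


def build_drift_request(
    base_url: str,
    api_key: str,
    app_id: str,
    baseline_days: int = 30,
    recent_days: int = 7,
) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/drift/compute."""
    payload = {
        "app_id": app_id,
        "baseline_window": {"days": baseline_days},
        "recent_window": {"days": recent_days},
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/drift/compute",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; replace with whatever your instance expects.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder host, key, and app id for illustration only.
request = build_drift_request("https://example.invalid/", "API_KEY", "my-app")
# To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the payload easy to inspect or log before anything goes over the network.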

Reading the results

Open the timeline

The timeline plots the composite score over time. Spikes are the moments worth investigating.

Drill into one snapshot

Click a snapshot. The detail view splits the score into the five per-metric contributions.

Compare distributions

For each metric, the histogram overlays the baseline and recent distributions so the shift is visible.

Jump to the runs

From the snapshot detail, link out to the affected runs in LLM Calls for ground-truth inspection.

When to wire drift in

After a model swap

Compare the 7 days before the swap against the 7 days after it to confirm the new model behaves consistently.

After a prompt change

Use the deployment timestamp as the boundary between baseline and recent.
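
The boundary idea can be illustrated client-side with a short sketch that partitions metric values around a deployment timestamp. The record shape (timestamp, value), the function name, and the 7-day default span are assumptions for illustration; the request example earlier on this page expresses windows in days, and this page does not show how an explicit boundary is passed to the API.

```python
from datetime import datetime, timedelta, timezone


def split_at_deployment(
    runs: list[tuple[datetime, float]],
    deployed_at: datetime,
    span: timedelta = timedelta(days=7),
) -> tuple[list[float], list[float]]:
    """Return (baseline, recent) values on either side of the deployment."""
    baseline = [v for ts, v in runs if deployed_at - span <= ts < deployed_at]
    recent = [v for ts, v in runs if deployed_at <= ts < deployed_at + span]
    return baseline, recent


# Hypothetical latency samples around a deployment on 2026-01-10 (UTC).
deployed = datetime(2026, 1, 10, tzinfo=timezone.utc)
runs = [
    (deployed - timedelta(days=8), 1.0),  # too old, excluded
    (deployed - timedelta(days=1), 2.0),  # baseline
    (deployed, 3.0),                      # recent (boundary is inclusive)
    (deployed + timedelta(days=6), 4.0),  # recent
    (deployed + timedelta(days=7), 5.0),  # past the recent span, excluded
]
baseline, recent = split_at_deployment(runs, deployed)
```

Treating the deployment instant as the first moment of the recent window keeps a call from landing in both windows.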

As a regression backstop

Schedule daily drift checks against a 30-day baseline. Spikes flag silent regressions.

Before regulatory reviews

Auditors care about stability. Drift evidence is one of the artefacts they ask for.

Related resources

Alerts

Trigger alerts when the composite score breaches a threshold.

Compliance heatmap

Drift evidence feeds into the operational-control views.

Last modified on June 3, 2026
