Feature Request: px.distribution_drift, px.model_disagreement, px.quantile_evolution — ML Monitoring Visualization Primitives
Hi Plotly maintainers,
I'd like to propose three new high-level functions for plotly.express that address a gap in ML observability tooling. They are general-purpose visualization primitives, not domain-specific ones, and they cover patterns that recur in production ML workflows.
I've built working prototypes with full test coverage and a live interactive demo. Happy to take these through the full contribution process.
1. px.distribution_drift(reference, current, ...)
What it does: Compares two distributions (e.g. training vs. live inference data) with overlapping normalized histograms and a scalar KL-divergence annotation. When drift exceeds a configurable threshold, the current distribution is highlighted in a warning color.
    fig = px.distribution_drift(
        reference,                  # array-like: baseline samples
        current,                    # array-like: current samples
        bins=50,
        divergence_threshold=0.1,
        reference_name="Reference",
        current_name="Current",
        title=None,
        template=None,
    )
Gap it fills: px.histogram with barmode="overlay" gets you close, but there's no built-in divergence scoring, threshold annotation, or drift-aware color logic. This is a one-liner for a very common ML monitoring pattern.
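To make the divergence scoring concrete, here is a minimal sketch of how the score and the color switch could be computed with the standard library only. It is not the prototype's code: the helper names, the KL(current || reference) direction, the additive smoothing constant `eps`, and the two hex colors are all assumptions made for illustration.

```python
import math


def shared_bin_probs(reference, current, bins=50, eps=1e-9):
    """Bin both samples on the same edges and return smoothed probabilities.

    Assumes both inputs are non-empty. eps is a hypothetical smoothing term
    that keeps empty bins from producing log(0) or division by zero.
    """
    lo = min(min(reference), min(current))
    hi = max(max(reference), max(current))
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def probs(samples):
        counts = [0] * bins
        for v in samples:
            idx = min(int((v - lo) / width), bins - 1)  # clamp v == hi into last bin
            counts[idx] += 1
        total = len(samples) + eps * bins
        return [(c + eps) / total for c in counts]

    return probs(reference), probs(current)


def kl_divergence(p, q):
    """Discrete KL(p || q) in nats for two probability lists of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def drift_color(divergence, threshold=0.1, normal="#636EFA", warning="#EF553B"):
    """Pick the warning color once the score exceeds the threshold (colors are illustrative)."""
    return warning if divergence > threshold else normal


reference = [i / 100 for i in range(100)]
current = [0.5 + i / 100 for i in range(100)]  # same shape, shifted by 0.5
ref_p, cur_p = shared_bin_probs(reference, current, bins=10)
score = kl_divergence(cur_p, ref_p)
color = drift_color(score, threshold=0.1)
```

Binning both samples on shared edges matters here: if each histogram chose its own bins, the two probability vectors would not be comparable bin by bin.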
2. px.model_disagreement(x, y, predictions, ...)
What it does: Scatter plot of samples in a 2-D reduced feature space (UMAP/t-SNE/PCA), colored by ensemble variance. Samples above a disagreement threshold are fully opaque; others are dimmed. Includes a marginal histogram of variance scores.
    fig = px.model_disagreement(
        x,                  # dim 1 of reduced space (per sample)
        y,                  # dim 2 of reduced space
        predictions,        # shape (n_samples, n_models): ensemble preds
        threshold=0.05,
        colorscale="Viridis",
        title=None,
        template=None,
    )
Gap it fills: px.scatter with color= handles the spatial encoding, but computing ensemble variance, dimming low-uncertainty samples, and combining with a marginal histogram requires significant boilerplate. This surfaces a critical active-learning and model-audit pattern as a single call.
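As a rough illustration of the boilerplate involved, the per-sample variance and the opacity rule could look like the sketch below. The function names, the choice of population variance, and the dimmed opacity of 0.2 are assumptions for this example; the prototype may differ.

```python
from statistics import pvariance


def ensemble_variance(predictions):
    """Variance across models for each sample.

    predictions is a sequence of rows, one row per sample and one value per
    model, i.e. shape (n_samples, n_models). Population variance is an
    assumption; sample variance would work the same way.
    """
    return [pvariance(row) for row in predictions]


def disagreement_opacity(variances, threshold=0.05, dimmed=0.2):
    """Fully opaque above the threshold, dimmed otherwise (dimmed value is illustrative)."""
    return [1.0 if v > threshold else dimmed for v in variances]


predictions = [
    [0.10, 0.10, 0.10],  # models agree
    [0.00, 0.50, 1.00],  # models disagree strongly
]
variances = ensemble_variance(predictions)
opacities = disagreement_opacity(variances, threshold=0.05)
```

The resulting lists map directly onto existing scatter attributes: `variances` can drive `marker.color`, and `opacities` can drive `marker.opacity`, which accepts a per-point array.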
3. px.quantile_evolution(timestamps, p50, *, p10, p90, p25, p75, ...)
What it does: Layered ribbon/band chart tracking P10–P90 and P25–P75 bands with P50 as a bold center line, over time or cohorts. Automatically annotates timestamps where the spread exceeds a volatility threshold.
    fig = px.quantile_evolution(
        timestamps,          # x-axis: ISO strings, labels, or numbers
        p50=median_values,   # required: median (center line)
        p10=p10_values,      # optional: outer lower band
        p90=p90_values,      # optional: outer upper band
        p25=p25_values,      # optional: IQR lower bound
        p75=p75_values,      # optional: IQR upper bound
        mean=mean_values,
        show_mean=False,
        volatility_multiplier=1.5,
        title=None,
        template=None,
    )
Gap it fills: Building fill-between ribbon plots requires chaining 5+ go.Scatter traces with careful fill="tonexty" sequencing. This is error-prone and undiscoverable. A single px call makes this pattern accessible to the full data science audience.
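The sequencing problem can be sketched as follows. This is not the prototype's implementation: `quantile_band_traces` is a hypothetical helper, the fill colors are illustrative, and the traces are built as plain dicts so the example runs without plotly installed.

```python
def quantile_band_traces(x, p50, p10=None, p90=None, p25=None, p75=None):
    """Build scatter trace dicts for two optional ribbons plus the median line."""
    traces = []

    def add_band(lower, upper, name, fillcolor):
        # Order matters: fill="tonexty" on the upper edge fills down to the
        # trace added immediately before it, so the lower edge goes first.
        hidden = {"width": 0}
        traces.append({"type": "scatter", "x": x, "y": lower, "mode": "lines",
                       "line": hidden, "showlegend": False, "hoverinfo": "skip"})
        traces.append({"type": "scatter", "x": x, "y": upper, "mode": "lines",
                       "line": hidden, "fill": "tonexty", "fillcolor": fillcolor,
                       "name": name})

    # The wider band is drawn first so the narrower one layers on top of it.
    if p10 is not None and p90 is not None:
        add_band(p10, p90, "P10–P90", "rgba(99, 110, 250, 0.15)")
    if p25 is not None and p75 is not None:
        add_band(p25, p75, "P25–P75", "rgba(99, 110, 250, 0.30)")

    # The median goes last so it sits above both ribbons.
    traces.append({"type": "scatter", "x": x, "y": p50, "mode": "lines",
                   "line": {"width": 3}, "name": "P50"})
    return traces


traces = quantile_band_traces(
    x=[1, 2, 3],
    p50=[5, 6, 5],
    p10=[1, 2, 1], p90=[9, 10, 9],
    p25=[3, 4, 3], p75=[7, 8, 7],
)
```

With plotly available, the list can be passed to `go.Figure(data=traces)`. Inserting any other trace between a lower and an upper edge would silently change what `tonexty` fills against, which is the error-prone part a single px call would hide.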
Why these belong in plotly.express
- Each returns a go.Figure, so the result stays composable and customizable after the fact
- They accept lists, numpy arrays, and pandas Series via a thin _to_list wrapper
- They use only existing trace types (go.Histogram, go.Scatter); no new trace types are needed
- They follow the existing px design philosophy: one call, sensible defaults, everything overridable
Working prototype
Zero dependencies beyond plotly itself. 12-test suite passes on Python 3.9–3.12.
I'm ready to adapt the implementation to Plotly's internal conventions, add API docs, write tests in the tests/test_core/test_px/ format, and address any design feedback on the signatures.
Thank you for considering it!