Calculating Weighted Cost Deltas for Multi-Table Queries

Weighted cost deltas rescale each per-table optimizer cost by cardinality, I/O tier, join algorithm, and execution frequency so that a regression on a 50M-row fact table is not diluted by trivial offsets on lookup tables. This runbook documents the exact weighting formula, a production Python implementation with structured logging and tracing, and the fallback protocol to run during a live incident. It is the multi-table refinement of the aggregate comparator in Tracking Cost Deltas Across Baseline Versions, and it consumes the engine-normalized cost vectors defined in Cost Estimation Mapping Across PostgreSQL and MySQL.

Symptom Identification and Production Thresholds

Aggregate cost tracking produces false negatives on wide join trees because a large shift on one relation is netted against small opposite moves on others. Treat the following as breach conditions — each is a hard numeric trigger, not a guideline:

Latency/cost divergence. p95 query latency rises by more than 15% while the raw aggregate optimizer cost delta stays below 5%. The two numbers should move together; a gap this wide means the aggregate is masking a localized regression.
Join-order regression on large relations. The plan introduces a nested-loop join whose inner relation has an estimated row count above 1,000,000. A nested loop that rescans a million-row inner side without a selective index is a near-certain latency cliff.
Dominant-table cost collapse. A relation that previously accounted for more than 40% of estimated plan cost drops below 10% of the total between two baseline versions — cost did not disappear, it migrated to a more expensive access path elsewhere.
Weighted-delta breach. The normalized weighted delta $\Delta WC > 0.12$ (12%) and the absolute weighted cost exceeds the per-workload operational floor. The dual gate suppresses noise from low-traffic queries while still catching high-impact shifts.
Buffer-pool pressure correlated with deploy. shared_buffers eviction rate climbs more than 25% within 15 minutes of a plan-version rollout on the affected fingerprint.

When two or more of these fire together, escalate to the weighted calculation immediately to isolate the offending table and operator pair.
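The escalation rule above can be sketched as a small evaluator. This is a minimal sketch, not part of the reference implementation: the `FingerprintSignals` container and its field names are hypothetical, the operational floor is assumed to be supplied per workload, and "stays below 5%" is read here as an absolute value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FingerprintSignals:
    """Hypothetical per-fingerprint measurements; all names are illustrative."""
    p95_latency_change: float      # fractional change, 0.18 means +18%
    aggregate_cost_delta: float    # raw aggregate optimizer cost delta
    nested_loop_inner_rows: int    # largest inner-side estimate under a nested loop, 0 if none
    dominant_share_before: float   # dominant relation's share of plan cost in the baseline
    dominant_share_after: float    # the same relation's share in the candidate plan
    weighted_delta: float          # normalized weighted delta
    weighted_cost: float           # absolute weighted cost of the candidate plan
    operational_floor: float       # per-workload floor, assumed to be configured elsewhere
    eviction_rate_change: float    # fractional change within 15 minutes of rollout


def fired_conditions(s: FingerprintSignals) -> list[str]:
    """Return the names of the breach conditions that fire, in runbook order."""
    fired: list[str] = []
    if s.p95_latency_change > 0.15 and abs(s.aggregate_cost_delta) < 0.05:
        fired.append("latency_cost_divergence")
    if s.nested_loop_inner_rows > 1_000_000:
        fired.append("join_order_regression")
    if s.dominant_share_before > 0.40 and s.dominant_share_after < 0.10:
        fired.append("dominant_table_collapse")
    if s.weighted_delta > 0.12 and s.weighted_cost > s.operational_floor:
        fired.append("weighted_delta_breach")
    if s.eviction_rate_change > 0.25:
        fired.append("buffer_pool_pressure")
    return fired


def should_escalate(s: FingerprintSignals) -> bool:
    """Escalate to the weighted calculation when two or more conditions fire."""
    return len(fired_conditions(s)) >= 2
```

Returning the names rather than a bare count keeps the escalation decision explainable in an incident channel.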

Root Cause Analysis

Optimizers cost CPU, I/O, and memory with per-engine heuristics. A 20% increase on a small dimension table is operationally irrelevant next to a 5% increase on a fact table, yet unweighted aggregation gives every unit of cost the same weight regardless of which relation it comes from. Four failure domains recur.

Cardinality estimation drift. Statistics go stale after a bulk load or partition-pruning change and the planner re-costs joins against wrong row estimates. Confirm the drift before re-baselining:

SQL

-- PostgreSQL: relations whose stats are stale relative to write volume
SELECT relname, n_live_tup, n_mod_since_analyze,
       round(100.0 * n_mod_since_analyze / GREATEST(n_live_tup, 1), 1) AS pct_churn,
       last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE n_mod_since_analyze > 0
ORDER BY pct_churn DESC
LIMIT 20;

Join-algorithm transition. A switch from hash join to merge join (or to nested loop) redistributes cost across operators without changing the logical result set, so the aggregate barely moves. Diff the operator mix directly with psql:

BASH

psql -qAtX -d prod_replica -c \
  "EXPLAIN (FORMAT JSON) $(cat query.sql)" \
  | jq -r '.. | objects | select(."Node Type") | ."Node Type"' | sort | uniq -c

Any change in the operator histogram between baseline and candidate is a lead; correlate it with Detecting Join Type Shifts in Execution Plans.
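The same histogram comparison can be done programmatically when both plans are already captured as JSON. The sketch below assumes PostgreSQL `EXPLAIN (FORMAT JSON)` output; the function names `operator_histogram` and `histogram_diff` are illustrative, not part of any library.

```python
import json
from collections import Counter


def operator_histogram(plan_json: str) -> Counter:
    """Count "Node Type" occurrences across every node of one plan."""
    counts: Counter = Counter()

    def walk(node: dict) -> None:
        counts[node.get("Node Type", "Unknown")] += 1
        for child in node.get("Plans", []):
            walk(child)

    walk(json.loads(plan_json)[0]["Plan"])
    return counts


def histogram_diff(baseline: Counter, candidate: Counter) -> dict[str, int]:
    """Return candidate-minus-baseline counts for operators whose count changed."""
    operators = set(baseline) | set(candidate)
    return {
        op: candidate[op] - baseline[op]
        for op in sorted(operators)
        if candidate[op] != baseline[op]
    }
```

An empty diff means the operator mix is unchanged; a positive entry for `Nested Loop` paired with a negative one for `Hash Join` is the pattern described above.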

I/O tier / cost-multiplier mismatch. A storage migration changes real disk latency but the weighting map still carries the old io_cost_multiplier, producing a false negative. Check the planner’s cost knob against measured latency:

SQL

SELECT name, setting, unit FROM pg_settings
WHERE name IN ('random_page_cost', 'seq_page_cost', 'effective_io_concurrency');
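One coarse way to compare the knobs with measured latency is to compare ratios. The sketch below is an assumption-laden heuristic, not a calibration method: the measured latencies are hypothetical inputs from your own storage benchmark, the `tolerance` default is an arbitrary illustration, and planner page costs also bake in caching assumptions, so treat a mismatch only as a prompt to re-check `io_cost_multiplier`.

```python
def planner_io_ratio(settings: dict[str, str]) -> float:
    """Ratio of random to sequential page cost, from pg_settings name -> setting."""
    return float(settings["random_page_cost"]) / float(settings["seq_page_cost"])


def io_knob_mismatch(
    settings: dict[str, str],
    measured_random_ms: float,
    measured_seq_ms: float,
    tolerance: float = 0.5,  # hypothetical: allow 50% relative disagreement
) -> bool:
    """True when the planner's random/sequential ratio is far from the measured one."""
    measured_ratio = measured_random_ms / measured_seq_ms
    relative_gap = abs(planner_io_ratio(settings) - measured_ratio) / measured_ratio
    return relative_gap > tolerance
```

For instance, a planner ratio of 4.0 against a measured ratio near 1.25 would be reported as a mismatch, which is the situation to expect after moving from spinning disks to flash without retuning.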

Parameter sniffing / plan-cache reuse. A cached plan built for one parameter range is reused for a skewed range, so the weighted delta flags only for specific bind values. Segment baselines by parameter quartile rather than aggregating; the normalization prerequisite for this lives in Normalizing Query Plans for Cross-Engine Comparison.
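Segmenting by parameter quartile can be sketched with the standard library. This is a minimal illustration that assumes a single numeric bind value per execution; the function names and the `(bind_value, weighted_cost)` sample shape are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import quantiles


def quartile_cutpoints(values: list[float]) -> list[float]:
    """Three cut points splitting the observed bind values into quartiles."""
    return quantiles(values, n=4)  # needs at least two observations


def quartile_of(value: float, cuts: list[float]) -> int:
    """Return 1..4 for the quartile a bind value falls into."""
    return bisect_right(cuts, value) + 1


def segment_by_quartile(samples: list[tuple[float, float]]) -> dict[int, list[float]]:
    """Group (bind_value, weighted_cost) samples into per-quartile cost lists."""
    cuts = quartile_cutpoints([value for value, _ in samples])
    buckets: dict[int, list[float]] = defaultdict(list)
    for value, cost in samples:
        buckets[quartile_of(value, cuts)].append(cost)
    return dict(buckets)
```

Each bucket then gets its own baseline, so a plan that only regresses for the skewed top quartile is compared against other top-quartile executions instead of being averaged away.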

Weighting Methodology and Formula

The weighted cost delta normalizes raw optimizer cost against operational impact through a deterministic pipeline:

Extract per-table costs. Parse the plan to isolate Total Cost, Plan Rows, and Node Type for each relation access node.
Apply dynamic multipliers:
- W_cardinality = log10(estimated_rows + 1)
- W_io = io_cost_multiplier — storage tier: NVMe = 1.0, SSD = 1.5, HDD = 3.0
- W_join = join_complexity_factor — Nested Loop = 2.0, Hash Join = 1.2, Merge Join = 1.0
- W_freq = historical_executions_24h / total_query_executions_24h
Sum weighted cost: $WC = \sum (Cost_{i} \times W_{cardinality,i} \times W_{io,i} \times W_{join,i} \times W_{freq,i})$
Compute delta: $\Delta WC = (WC_{current} - WC_{baseline}) / WC_{baseline}$

A regression is flagged when $\Delta WC > 0.12$ and the absolute WC_current clears the workload’s operational floor. Because every weight derives from historical row estimates, execution counts, and static tier configuration, never from live execution timings, the calculation is deterministic and reproducible across replicas, exactly like the aggregate comparator it extends.
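A small worked example shows why the weighting matters. The numbers below are invented for illustration (a 50M-row fact table and a 100-row lookup table on the same tier and join algorithm), and `OPERATIONAL_FLOOR` is a hypothetical per-workload value; the weights themselves come from the multiplier list above.

```python
from math import log10

# Weights copied from the multiplier list above.
IO = {"NVMe": 1.0, "SSD": 1.5, "HDD": 3.0}
JOIN = {"Nested Loop": 2.0, "Hash Join": 1.2, "Merge Join": 1.0}


def weighted_cost(tables) -> float:
    """Sum cost * W_cardinality * W_io * W_join * W_freq over
    (cost, estimated_rows, io_tier, join_type, exec_frequency) tuples."""
    return sum(
        cost * log10(rows + 1) * IO[tier] * JOIN[join] * freq
        for cost, rows, tier, join, freq in tables
    )


def relative_delta(baseline: float, current: float) -> float:
    return (current - baseline) / baseline


# Illustrative plans: the fact table gets 20% costlier, the lookup table cheaper.
baseline = [(1000.0, 50_000_000, "NVMe", "Hash Join", 1.0),
            (800.0, 100, "NVMe", "Hash Join", 1.0)]
current = [(1200.0, 50_000_000, "NVMe", "Hash Join", 1.0),
           (620.0, 100, "NVMe", "Hash Join", 1.0)]

aggregate_delta = relative_delta(sum(t[0] for t in baseline),
                                 sum(t[0] for t in current))
weighted_delta = relative_delta(weighted_cost(baseline), weighted_cost(current))

OPERATIONAL_FLOOR = 5_000.0  # hypothetical per-workload floor
is_regression = weighted_delta > 0.12 and weighted_cost(current) > OPERATIONAL_FLOOR

print(f"aggregate={aggregate_delta:.3f} weighted={weighted_delta:.3f} "
      f"regression={is_regression}")
```

With these inputs the unweighted aggregate moves by roughly 1%, well under any alerting threshold, while the weighted delta lands just above the 12% gate because the cardinality weight keeps the fact table's increase from being netted against the lookup table's savings.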

Step-by-Step Remediation and Python Automation

The reference implementation ingests plan JSON, extracts per-table nodes, applies the weights, and emits a routing decision with structured logs and OpenTelemetry spans so a flagged delta is traceable end to end in your observability stack. It applies only the $\Delta WC$ threshold; the operational-floor gate is left to the caller.

PYTHON

import json
from dataclasses import dataclass, field
from math import log10

import structlog
from opentelemetry import trace

log = structlog.get_logger("costdelta.weighted")
tracer = trace.get_tracer("costdelta.weighted")

IO_MULTIPLIERS = {"NVMe": 1.0, "SSD": 1.5, "HDD": 3.0}
JOIN_COMPLEXITY = {
    "Nested Loop": 2.0, "Hash Join": 1.2, "Merge Join": 1.0,
    "Index Scan": 0.8, "Index Only Scan": 0.7, "Seq Scan": 1.5,
}
DELTA_THRESHOLD = 0.12          # 12% weighted-delta gate
OFFENDER_RATIO = 1.15           # per-table weighted-cost growth that names an offender


@dataclass(frozen=True)
class TableNode:
    name: str
    raw_cost: float
    estimated_rows: int
    operator_type: str
    io_tier: str = "SSD"
    exec_frequency: float = 1.0


@dataclass
class WeightedDeltaResult:
    query_fingerprint: str
    baseline_wc: float
    current_wc: float
    delta_pct: float
    routing_flag: str
    offending_tables: list[str] = field(default_factory=list)


def _table_weight(n: TableNode) -> float:
    return (
        log10(n.estimated_rows + 1)
        * IO_MULTIPLIERS.get(n.io_tier, 1.5)
        * JOIN_COMPLEXITY.get(n.operator_type, 1.0)
        * max(n.exec_frequency, 0.01)
    )


def calculate_weighted_cost(nodes: list[TableNode]) -> float:
    return sum(n.raw_cost * _table_weight(n) for n in nodes)


def compute_weighted_delta(
    fingerprint: str,
    baseline: list[TableNode],
    current: list[TableNode],
    threshold: float = DELTA_THRESHOLD,
) -> WeightedDeltaResult:
    with tracer.start_as_current_span("compute_weighted_delta") as span:
        span.set_attribute("query.fingerprint", fingerprint)
        baseline_wc = calculate_weighted_cost(baseline)
        current_wc = calculate_weighted_cost(current)

        if baseline_wc == 0.0:
            log.warning("empty_baseline", fingerprint=fingerprint)
            return WeightedDeltaResult(fingerprint, 0.0, current_wc, 0.0, "STABLE")

        delta = (current_wc - baseline_wc) / baseline_wc
        flag = "REGRESSION" if delta > threshold else "DRIFT" if delta > 0.05 else "STABLE"

        offenders: list[str] = []
        if flag == "REGRESSION":
            by_name = {n.name: n for n in baseline}
            for c in current:
                b = by_name.get(c.name)
                if b is None:
                    continue
                bw, cw = b.raw_cost * _table_weight(b), c.raw_cost * _table_weight(c)
                if bw > 0 and cw > bw * OFFENDER_RATIO:
                    offenders.append(c.name)

        span.set_attribute("costdelta.pct", round(delta, 4))
        span.set_attribute("costdelta.flag", flag)
        log.info("weighted_delta", fingerprint=fingerprint, delta_pct=round(delta, 4),
                 flag=flag, offenders=offenders)
        return WeightedDeltaResult(fingerprint, baseline_wc, current_wc, delta, flag, offenders)


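JOIN_NODE_TYPES = frozenset({"Nested Loop", "Hash Join", "Merge Join"})


def parse_explain_json(plan_json: str) -> list[TableNode]:
    """Flatten PostgreSQL EXPLAIN (FORMAT JSON); root is plan[0]['Plan'].

    Join nodes carry no "Relation Name", so each relation inherits the label
    of its nearest enclosing join; otherwise W_join would never see a join
    algorithm. A relation outside any join keeps its own scan type.
    """
    root = json.loads(plan_json)[0].get("Plan", {})
    nodes: list[TableNode] = []

    def walk(node: dict, enclosing_join: str = "") -> None:
        node_type = node.get("Node Type", "Unknown")
        if node_type in JOIN_NODE_TYPES:
            enclosing_join = node_type  # applies to every relation beneath it
        if "Relation Name" in node:
            nodes.append(TableNode(
                name=node["Relation Name"],
                raw_cost=float(node.get("Total Cost", 0.0)),
                estimated_rows=int(node.get("Plan Rows", 0)),
                operator_type=enclosing_join or node_type,
            ))
        for child in node.get("Plans", []):
            walk(child, enclosing_join)

    walk(root)
    return nodes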

Running the comparator against a real fact-table regression emits a single structured line naming the offender:

TEXT

2026-07-04T18:02:11Z [info] weighted_delta fingerprint=a1f3…9c delta_pct=0.184 flag=REGRESSION offenders=['orders_fact']

When routing_flag == "REGRESSION", execute the safe fallback chain:

Isolate. Read offending_tables to get the exact relation(s) driving the delta.
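Validate on a replica. Run EXPLAIN (ANALYZE, BUFFERS) on a read replica or staging copy to confirm estimated-vs-actual divergence. ANALYZE executes the query, so never run it against the primary.
Override narrowly. Apply a targeted, per-query override to pin the baseline join order: an optimizer hint (native in MySQL; PostgreSQL needs an extension such as pg_hint_plan), or a SQL profile or plan guide on engines that provide them. Do not set enable_seqscan = off or change any other planner setting server-wide.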
Watch. Track p95 latency and buffer hit ratio for 15 minutes post-override.
Roll back. If latency degrades further, revert the hint and run a targeted ANALYZE on the offending relation.
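The watch and roll-back steps reduce to a comparison of two latency windows. The sketch below assumes you can export raw latency samples in milliseconds for the 15 minutes before and after the override; `p95` and `should_roll_back` are illustrative helpers, and buffer hit ratio would need a separate check.

```python
from statistics import quantiles


def p95(samples: list[float]) -> float:
    """95th percentile of a sample window (needs at least two samples)."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return quantiles(samples, n=100)[94]


def should_roll_back(pre_override_ms: list[float], post_override_ms: list[float]) -> bool:
    """True when p95 latency after the override is worse than before it."""
    return p95(post_override_ms) > p95(pre_override_ms)
```

A stricter variant could require the post-override p95 to improve by some margin before the hint is kept; the plain comparison here only encodes "latency degrades further".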

The implementation records the delta and routing flag as the OpenTelemetry span attributes costdelta.pct and costdelta.flag; correlate them with query latency using the OpenTelemetry database semantic conventions.

Verification Checklist

Work through each item after applying a fix; every box must be checked before the fingerprint returns to STABLE:

[ ] $\Delta WC$ for the affected fingerprint has fallen back under 0.12 on the last three captures.
[ ] offending_tables is empty on the most recent comparison.
[ ] EXPLAIN (ANALYZE, BUFFERS) on the replica shows estimated rows within 2× of actual for every relation in the join.
[ ] p95 latency for the query pattern is within 5% of the pre-regression baseline.
[ ] IO_MULTIPLIERS matches the current storage tier of each relation’s tablespace.
[ ] Any temporary session override (hint, SET) has been reverted and the fix is durable across a fresh connection.
[ ] The new plan version is promoted as the anchored baseline only after two clean capture cycles.
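The measurable items on this checklist can be gated in code. This sketch covers only the four numeric checks; the tier mapping, override cleanup, and baseline promotion still need a human or a separate job. `VerificationInputs` and its fields are hypothetical, and "within 5%" is interpreted as no more than 5% above the pre-regression p95.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerificationInputs:
    """Hypothetical bundle of post-fix measurements for one fingerprint."""
    recent_deltas: list[float]                     # weighted deltas, most recent last
    offending_tables: list[str]                    # from the latest comparison
    row_estimates: dict[str, tuple[float, float]]  # relation -> (estimated, actual)
    p95_ms: float
    baseline_p95_ms: float


def estimate_within_factor(estimated: float, actual: float, factor: float = 2.0) -> bool:
    """True when estimated and actual rows differ by at most `factor` in either direction."""
    low, high = sorted((max(estimated, 1.0), max(actual, 1.0)))
    return high <= low * factor


def failed_checks(v: VerificationInputs) -> list[str]:
    """Return the names of checklist items that do not pass yet."""
    failed: list[str] = []
    last_three = v.recent_deltas[-3:]
    if len(last_three) < 3 or any(d >= 0.12 for d in last_three):
        failed.append("weighted_delta_last_three")
    if v.offending_tables:
        failed.append("offending_tables_empty")
    if not all(estimate_within_factor(e, a) for e, a in v.row_estimates.values()):
        failed.append("row_estimates_within_2x")
    if v.p95_ms > v.baseline_p95_ms * 1.05:
        failed.append("p95_within_5pct")
    return failed
```

An empty result means the numeric items pass and the remaining manual items can be worked through.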

Compatibility and Engine-Specific Notes

The weighting shape is engine-agnostic, but the fields you feed it differ per engine. Map them before wiring the parser.

Concern	PostgreSQL	MySQL 8.x	Distributed SQL (CockroachDB / Yugabyte)
Per-node cost field	`Total Cost` (`EXPLAIN (FORMAT JSON)`)	per-table `cost_info.read_cost` + `cost_info.eval_cost`; `cost_info.query_cost` covers the whole query block (`EXPLAIN FORMAT=JSON`)	Cost / `estimated row count` in `EXPLAIN (VERBOSE)`
Row estimate	`Plan Rows`	`rows_examined_per_scan`	`estimated row count`
Relation identifier	`Relation Name`	`table_name`	`table` (may be range-partitioned)
Join operator label	`Nested Loop` / `Hash Join` / `Merge Join`	`nested_loop` array; hash join appears as `using_join_buffer: "hash join"` from 8.0.18 (Block Nested Loop before that)	`hash-join` / `lookup-join` / `merge-join`
I/O weight caveat	`random_page_cost` reflects tier	cost unit is not disk-time calibrated; tune `W_io` manually	per-range replica placement makes `W_io` node-dependent

MySQL exposes no Merge Join operator, so fold its equivalents into the Hash Join weight. On distributed engines a single relation can span nodes on different storage tiers, so derive W_io per range rather than per table. Ground the cross-engine cost mapping in Cost Estimation Mapping Across PostgreSQL and MySQL and its latency-anchoring companion, Mapping EXPLAIN Costs to Real-World Latency Metrics.
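Normalizing join labels before looking up W_join might look like the sketch below. It strips case and punctuation so spelling variants of the same label collapse together, and it folds any MySQL merge-style label into Hash Join as described above. Treating lookup joins and block nested loops as Nested Loop is an assumption of this sketch, not something the weighting table specifies, and `canonical_join` and `join_weight` are illustrative names.

```python
import re

# Join weights from the weighting formula.
JOIN_COMPLEXITY = {"Nested Loop": 2.0, "Hash Join": 1.2, "Merge Join": 1.0}

# Keys are labels reduced to lowercase letters only.
_CANONICAL = {
    "nestedloop": "Nested Loop",
    "blocknestedloop": "Nested Loop",  # assumption: weight like a nested loop
    "lookupjoin": "Nested Loop",       # assumption: index-driven nested loop
    "hashjoin": "Hash Join",
    "mergejoin": "Merge Join",
}


def canonical_join(label: str, engine: str) -> str:
    """Map an engine-specific join label onto a JOIN_COMPLEXITY key."""
    key = re.sub(r"[^a-z]", "", label.lower())
    canonical = _CANONICAL.get(key)
    if canonical is None:
        raise ValueError(f"unrecognized join label: {label!r}")
    if engine == "mysql" and canonical == "Merge Join":
        return "Hash Join"  # MySQL has no merge join; fold into the hash weight
    return canonical


def join_weight(label: str, engine: str) -> float:
    return JOIN_COMPLEXITY[canonical_join(label, engine)]
```

Raising on unknown labels is deliberate: silently defaulting to 1.0 would hide a new operator type exactly when the weighting needs to notice it.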

← Back to Tracking Cost Deltas Across Baseline Versions (parent topic)
Correlate cost with structure: Detecting Join Type Shifts in Execution Plans
Correlate cost with access paths: Monitoring Index Usage Changes for Regression Signals
Suppress noisy weighted deltas: Tuning Thresholds for False Positive Reduction
Wider context: Regression Detection & Rule Engines
