Plan Hashing Algorithms for SQL Engines

The plan hashing stage is the deterministic fingerprinting layer that turns a normalized execution plan into a single stable identifier every downstream stage can key on, without ever evaluating performance itself.

Execution plans are volatile by nature: row estimates drift, timings jitter, buffer counters change on every run, and minor engine patches reorder JSON keys. A regression pipeline cannot compare two plans byte-for-byte and expect a meaningful answer. This stage exists to collapse that volatility into a canonical form and emit a 64-character SHA-256 fingerprint plus a compact metadata envelope. It performs no threshold math, no baseline lookup, and no CI gating — those are separate responsibilities, isolated so a fault in fingerprinting can never corrupt a gate verdict. This page defines the stage’s input and output contracts, a runnable async implementation, the numeric service-level objectives it must hold, and the failure modes you will actually page on. It is a component of the Core Architecture & Baselining Fundamentals reference architecture.

Architectural boundaries

Strict isolation is what makes a fingerprint trustworthy. This stage consumes a normalized plan tree produced upstream and emits an immutable (plan_hash, metadata) pair to storage and the regression queue. It sits after normalization and before evaluation, and it holds no state between payloads.

The stage does not parse raw EXPLAIN text from a live database — that collection work belongs to the Automated EXPLAIN Capture & Storage Workflows pipeline, and the cross-dialect flattening happens in normalizing query plans for cross-engine comparison. By the time a payload reaches the hasher, operator names are already mapped to a unified vocabulary and cost fields are already engine-normalized by Cost Estimation Mapping Across PostgreSQL and MySQL. The hasher’s only job is canonical serialization and cryptographic digest.

The contract at each boundary is explicit:

Ingress: a structured JSON payload carrying a normalized plan tree, an engine identifier, an engine version, and a schema version. Payloads lacking a Node Type root or version metadata are rejected synchronously before any hashing runs.
Processing: normalization and hashing execute in a pure, stateless context. Metrics and traces are emitted through async side channels so the hot path stays deterministic and CPU-bound.
Egress: the (plan_hash, metadata) envelope routes to immutable storage and is published to the regression queue. On failure the payload routes to a dead-letter queue (DLQ) with an explicit error code — ERR_MALFORMED_JSON, ERR_MISSING_ROOT, or ERR_VERSION_DRIFT — never a silent fallback that would poison baseline history.

Because the emitted identifier is a pure function of plan structure — not of timings or cost magnitudes — the same logical plan hashes identically across replicas, across restarts, and across minor engine patches. That stability is the precondition the downstream regression threshold logic relies on to attach historical telemetry to the right baseline window.

Deterministic routing and schema enforcement

Every payload is validated against a strict field contract before it is admitted. The canonical JSON Schema pins the accepted engines, forbids unknown top-level fields, and requires the version metadata that governs which normalization ruleset applies:

JSON

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "PlanHashingIngress",
  "type": "object",
  "additionalProperties": false,
  "required": ["engine", "engine_version", "schema_version", "plan"],
  "properties": {
    "engine": { "enum": ["postgresql", "mysql"] },
    "engine_version": { "type": "string", "pattern": "^\\d+\\.\\d+(\\.\\d+)?$" },
    "schema_version": { "type": "string", "pattern": "^v\\d+$" },
    "plan": {
      "type": "object",
      "required": ["Node Type"],
      "properties": { "Node Type": { "type": "string", "minLength": 1 } }
    }
  }
}
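
The same constraints can be mirrored in standard-library Python, which is convenient in unit tests where a full JSON Schema validator is not installed. This is a minimal sketch rather than the stage's validator: the function name `check_ingress` and the wording of the violation messages are hypothetical.

```python
import re
from typing import Any

ALLOWED_ENGINES = ("postgresql", "mysql")
REQUIRED_FIELDS = {"engine", "engine_version", "schema_version", "plan"}
# Patterns copied from the schema; fullmatch() keeps a trailing newline from slipping through.
ENGINE_VERSION_RE = re.compile(r"^\d+\.\d+(\.\d+)?$")
SCHEMA_VERSION_RE = re.compile(r"^v\d+$")


def check_ingress(payload: Any) -> list[str]:
    """Return contract violations; an empty list means the payload is admissible."""
    if not isinstance(payload, dict):
        return ["payload is not an object"]
    problems = [f"missing required field: {name}"
                for name in sorted(REQUIRED_FIELDS - set(payload))]
    # additionalProperties: false means every top-level field must be a known one.
    problems += [f"unknown top-level field: {name}"
                 for name in sorted(set(payload) - REQUIRED_FIELDS, key=str)]
    if "engine" in payload and payload["engine"] not in ALLOWED_ENGINES:
        problems.append("engine must be postgresql or mysql")
    for field, pattern in (("engine_version", ENGINE_VERSION_RE),
                           ("schema_version", SCHEMA_VERSION_RE)):
        value = payload.get(field)
        if field in payload and not (isinstance(value, str) and pattern.fullmatch(value)):
            problems.append(f"{field} does not match {pattern.pattern}")
    if "plan" in payload:
        plan = payload["plan"]
        node_type = plan.get("Node Type") if isinstance(plan, dict) else None
        if not (isinstance(node_type, str) and node_type):
            problems.append("plan must be an object with a non-empty 'Node Type'")
    return problems
```

Returning every violation at once, instead of raising on the first, makes a rejected payload easier to triage from a single DLQ row.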

Canonicalization is the heart of determinism. Before a digest is computed the tree passes through five deterministic transforms, each of which removes a documented source of hash drift:

Strip volatile keys. Fields such as Actual Rows, Actual Total Time, Buffers, Shared Hit Blocks, and Sort Method carry per-run noise and are dropped entirely.
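Canonicalize operators. Dialect-specific node names (Hash Join vs hash_join, Seq Scan vs table_scan) are mapped to a single vocabulary so an equivalent plan on either engine yields one identifier. Upstream normalization should already have done this, so the pass acts as an idempotent safeguard: a name that is already canonical maps to itself.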
Sort commutative children. Inputs to order-independent operators (a hash join’s two sides, a union’s branches) are sorted lexicographically by their serialized form, so input ordering cannot change the hash.
Serialize canonically. The cleaned tree is emitted as UTF-8 JSON with sorted keys and no incidental whitespace (separators=(",", ":")), pinning a single byte representation.
Digest. SHA-256 over the canonical bytes yields the 64-hex-character plan_hash.
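
The five transforms can be traced end to end on a toy plan. The sketch below is a condensed illustration, not the reference worker: the lookup tables are abbreviated, the set of order-independent operators is an assumption, and the two captures are invented so that they differ only in per-run counters, key order, dialect spelling, and child order.

```python
import hashlib
import json

# Abbreviated tables for the example only.
VOLATILE = {"Actual Rows", "Actual Total Time"}
OPERATORS = {"Hash Join": "hash_join", "Seq Scan": "seq_scan", "table_scan": "seq_scan"}
COMMUTATIVE = {"hash_join"}  # assumption: which operators are order-independent


def canonicalize(node):
    if isinstance(node, list):
        return [canonicalize(item) for item in node]
    if not isinstance(node, dict):
        return node
    # 1. Strip volatile keys (and recurse into what remains).
    out = {k: canonicalize(v) for k, v in node.items() if k not in VOLATILE}
    # 2. Canonicalize the operator name.
    name = out.get("Node Type")
    if isinstance(name, str):
        name = out["Node Type"] = OPERATORS.get(name, name.lower())
    # 3. Sort children of order-independent operators by their serialized form.
    if isinstance(name, str) and name in COMMUTATIVE and isinstance(out.get("Plans"), list):
        out["Plans"] = sorted(out["Plans"], key=lambda c: json.dumps(c, sort_keys=True))
    return out


def fingerprint(plan):
    # 4. Serialize canonically: sorted keys, no incidental whitespace, UTF-8.
    data = json.dumps(canonicalize(plan), sort_keys=True, separators=(",", ":")).encode("utf-8")
    # 5. Digest.
    return data, hashlib.sha256(data).hexdigest()


run_a = {"Node Type": "Hash Join", "Actual Rows": 120, "Plans": [
    {"Node Type": "Seq Scan", "Relation Name": "orders", "Actual Total Time": 3.2},
    {"Node Type": "Seq Scan", "Relation Name": "customers", "Actual Total Time": 1.1},
]}
run_b = {"Plans": [
    {"Relation Name": "customers", "Node Type": "table_scan", "Actual Total Time": 0.9},
    {"Relation Name": "orders", "Node Type": "Seq Scan", "Actual Total Time": 4.0},
], "Actual Rows": 118, "Node Type": "Hash Join"}

bytes_a, hash_a = fingerprint(run_a)
bytes_b, hash_b = fingerprint(run_b)
print(bytes_a.decode("utf-8"))
print(hash_a == hash_b, len(hash_a))
```

Both captures reduce to the same canonical bytes, so the two digests are equal and each is 64 hexadecimal characters long.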

Routing of the resulting envelope is formula-driven, not ad hoc. The storage partition is derived from the fingerprint itself so writes distribute uniformly and idempotently:

Partition key: partition = int(plan_hash[:4], 16) % PLAN_HASH_SHARD_COUNT — the first 16 bits of the digest fan out across a fixed ring (default 256), giving even distribution independent of query text.
Namespace key: namespace = f"{engine}/{engine_version}/", for example postgresql/17.5/ or mysql/8.4.5/ when the full version string is used, so a version upgrade opens a new namespace instead of colliding with prior baselines.
Idempotent write: storage upserts are conditional on plan_hash; a digest that already exists is a silent no-op, so replaying a payload never duplicates metadata.
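
The three routing rules are small enough to check by hand. The sketch below uses an in-memory dict as a stand-in for the fingerprint store and a made-up digest string; the helper names `partition_for`, `namespace_key`, and `write_once` are hypothetical.

```python
def partition_for(plan_hash: str, shard_count: int = 256) -> int:
    """First 16 bits of the digest, reduced onto the shard ring."""
    return int(plan_hash[:4], 16) % shard_count


def namespace_key(engine: str, engine_version: str) -> str:
    return f"{engine}/{engine_version}/"


def write_once(store: dict, plan_hash: str, metadata: dict) -> bool:
    """Insert only when the fingerprint is absent; a replay is a no-op."""
    if plan_hash in store:
        return False
    store[plan_hash] = metadata
    return True


example_hash = "abcd" + "0" * 60  # stand-in digest, not a real SHA-256 output
store: dict = {}
meta = {
    "partition": partition_for(example_hash),
    "namespace": namespace_key("mysql", "8.4.5"),
}
first = write_once(store, example_hash, meta)
replayed = write_once(store, example_hash, meta)
print(meta, first, replayed, len(store))
# {'partition': 205, 'namespace': 'mysql/8.4.5/'} True False 1
```

Here 0xabcd is 43981, and 43981 % 256 is 205. With a ring of 256 only the low 8 bits of the 16-bit prefix affect the result, but reading 16 bits leaves room to grow the ring without changing the formula.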

This is the identifier every other stage keys on: the same SHA-256 fingerprinting approach is the join key the regression evaluator uses to fetch a baseline, so any non-determinism here multiplies into false positives everywhere downstream.

Production-ready implementation

The reference worker is asyncio-native. Validation and hashing are pure and CPU-bound; storage and DLQ writes are awaited through asyncpg; observability is emitted through structlog and OpenTelemetry without blocking the digest path. The heavy normalize_plan recursion is offloaded to a thread executor so a large plan tree cannot stall the event loop. Publishing to the regression queue is not shown in the listing.

PYTHON

import asyncio
import hashlib
import json
from typing import Any

import asyncpg
import structlog
from opentelemetry import trace
from opentelemetry.metrics import get_meter

log = structlog.get_logger("plan_hashing")
tracer = trace.get_tracer("plan_hashing")
meter = get_meter("plan_hashing")

HASH_TOTAL = meter.create_counter("plan_hash_total")
HASH_REJECTED = meter.create_counter("plan_hash_rejected_total")
NORMALIZE_LATENCY = meter.create_histogram("plan_normalize_latency_ms")

SHARD_COUNT = 256

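VOLATILE_KEYS = frozenset({
    # Per-run measurements.
    "Actual Rows", "Actual Total Time", "Actual Loops", "Actual Startup Time",
    "Execution Time", "Planning Time", "Buffers", "Sort Method",
    "Shared Hit Blocks", "Shared Read Blocks", "Temp Written Blocks",
    # Planner estimates drift with statistics; the hash must ignore cost magnitudes.
    # Key names assume PostgreSQL-style fields survive upstream normalization.
    "Plan Rows", "Plan Width", "Startup Cost", "Total Cost",
})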
OPERATOR_MAP = {
    "Hash Join": "hash_join", "Merge Join": "merge_join",
    "Nested Loop": "nested_loop", "Seq Scan": "seq_scan",
    "Index Scan": "index_scan", "Index Only Scan": "index_only_scan",
    "table_scan": "seq_scan",
}


class RejectedPayload(Exception):
    def __init__(self, code: str) -> None:
        self.code = code
        super().__init__(code)


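# Operators whose inputs can be reordered without changing the plan's meaning.
# Assumed set: PostgreSQL reports UNION ALL branches as Append. Align it with
# the unified operator vocabulary before relying on it.
COMMUTATIVE_OPERATORS = frozenset({"hash_join", "append"})


def _sort_key(node: Any) -> str:
    """Order children by their fully serialized form, never by a partial field."""
    return json.dumps(node, sort_keys=True, separators=(",", ":"))


def normalize_plan(node: Any) -> Any:
    """Pure, deterministic: strip volatility, canonicalize, sort commutative children."""
    if isinstance(node, dict):
        cleaned = {k: v for k, v in node.items() if k not in VOLATILE_KEYS}
        node_type = cleaned.get("Node Type")
        if isinstance(node_type, str):
            node_type = cleaned["Node Type"] = OPERATOR_MAP.get(node_type, node_type.lower())
        result = {k: normalize_plan(v) for k, v in cleaned.items()}
        # Reorder children only where input order carries no meaning.
        if (
            isinstance(node_type, str)
            and node_type in COMMUTATIVE_OPERATORS
            and isinstance(result.get("Plans"), list)
        ):
            result["Plans"] = sorted(result["Plans"], key=_sort_key)
        return result
    if isinstance(node, list):
        # Preserve order: lists such as sort keys are order-sensitive.
        return [normalize_plan(item) for item in node]
    return node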


def compute_hash(plan: dict[str, Any]) -> str:
    canonical = normalize_plan(plan)
    canonical_bytes = json.dumps(
        canonical, sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    return hashlib.sha256(canonical_bytes).hexdigest()


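def validate(payload: dict[str, Any]) -> None:
    """Reject payloads that break the ingress contract before any hashing runs."""
    if not isinstance(payload.get("plan"), dict):
        raise RejectedPayload("ERR_MALFORMED_JSON")
    # The schema pins the accepted engines, and routing later reads payload["engine"].
    # Mapping an unknown engine to ERR_MALFORMED_JSON is a choice among the three codes.
    if payload.get("engine") not in ("postgresql", "mysql"):
        raise RejectedPayload("ERR_MALFORMED_JSON")
    if not payload["plan"].get("Node Type"):
        raise RejectedPayload("ERR_MISSING_ROOT")
    if not payload.get("engine_version") or not payload.get("schema_version"):
        raise RejectedPayload("ERR_VERSION_DRIFT")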


async def hash_and_route(payload: dict[str, Any], pool: asyncpg.Pool) -> str:
    with tracer.start_as_current_span("hash_and_route") as span:
        try:
            validate(payload)
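        except RejectedPayload as exc:
            # Count rejected payloads in the total as well, so that
            # plan_hash_rejected_total / plan_hash_total is a true rate.
            HASH_TOTAL.add(1, {"engine": str(payload.get("engine", "unknown"))})
            HASH_REJECTED.add(1, {"code": exc.code})
            await log.awarning("payload_rejected", code=exc.code)
            await route_to_dlq(payload, exc.code, pool)
            raise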

        engine = payload["engine"]
        version = payload["engine_version"]

        loop = asyncio.get_running_loop()
        start = loop.time()
        plan_hash = await loop.run_in_executor(None, compute_hash, payload["plan"])
        NORMALIZE_LATENCY.record((loop.time() - start) * 1000.0)

        partition = int(plan_hash[:4], 16) % SHARD_COUNT
        namespace = f"{engine}/{version}/"
        span.set_attribute("plan_hash", plan_hash)
        span.set_attribute("partition", partition)

        # Idempotent conditional upsert keyed on the fingerprint.
        await pool.execute(
            """
            INSERT INTO plan_fingerprints
                (plan_hash, partition, namespace, engine, engine_version, schema_version)
            VALUES ($1, $2, $3, $4, $5, $6)
            ON CONFLICT (plan_hash) DO NOTHING
            """,
            plan_hash, partition, namespace, engine, version, payload["schema_version"],
        )
        HASH_TOTAL.add(1, {"engine": engine})
        return plan_hash


async def route_to_dlq(payload: dict[str, Any], code: str, pool: asyncpg.Pool) -> None:
    await pool.execute(
        "INSERT INTO plan_hash_dlq (raw_payload, error_code) VALUES ($1, $2)",
        json.dumps(payload), code,
    )

The DLQ path is deliberate. A payload that fails validation is never guessed at: it is persisted verbatim with its error code so a parser regression is auditable and replayable once the normalization ruleset is fixed. The recursion runs in a thread executor because normalizing a large plan tree from a wide analytical query is a long synchronous call that would otherwise block the event loop. The worker thread still contends for the GIL, so this adds no CPU parallelism, but the interpreter can switch between threads and the loop keeps servicing I/O, which protects ingestion throughput.

Threshold table

The stage is CPU-bound and stateless, so its service-level objectives are about latency, rejection rate, and hash stability rather than data freshness. These are the numbers the on-call rotation gates on:

Metric	Pass	Warn	Block
`plan_normalize_latency_ms` p95	≤ 8 ms	8–25 ms	> 25 ms
`plan_normalize_latency_ms` p99	≤ 20 ms	20–60 ms	> 60 ms
Rejection rate (`plan_hash_rejected_total` / `plan_hash_total`)	< 0.5%	0.5–2%	> 2%
DLQ depth (payloads)	< 100	100–1000	> 1000
Hash-stability replay mismatch rate	0%	—	> 0%

Hash stability is a hard zero: replaying yesterday’s captured plans through the current worker must reproduce every fingerprint exactly. Any non-zero mismatch means a normalization change silently rewrote history and must block deploy. The alert rules below encode the block bands for p95 latency, rejection rate, and replay mismatch; the p99 and DLQ-depth bands follow the same pattern:

YAML

groups:
  - name: plan-hashing-slo
    rules:
      - alert: PlanHashNormalizeLatencyHigh
        expr: histogram_quantile(0.95, sum by (le) (rate(plan_normalize_latency_ms_bucket[5m]))) > 25
        for: 10m
        labels: { severity: page }
        annotations:
          summary: "Plan hashing p95 normalize latency above 25ms"
      - alert: PlanHashRejectionRateHigh
        expr: |
          sum(rate(plan_hash_rejected_total[5m]))
            / sum(rate(plan_hash_total[5m])) > 0.02
        for: 10m
        labels: { severity: page }
        annotations:
          summary: "Plan hashing rejection rate above 2% — upstream schema drift"
      - alert: PlanHashReplayMismatch
        expr: plan_hash_replay_mismatch_total > 0
        for: 0m
        labels: { severity: page }
        annotations:
          summary: "Fingerprint drift — a normalization change rewrote existing hashes"
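
The bands in the threshold table can also be evaluated in code, for example in a deploy gate that reads the current metric values. The following is a sketch under stated assumptions: the metric keys in `BANDS` are hypothetical labels for the table rows, and the inclusive or exclusive boundaries are read from the table's ≤ and < signs.

```python
def classify(value: float, pass_limit: float, block_limit: float,
             pass_inclusive: bool = True) -> str:
    """Map one reading onto the pass / warn / block bands."""
    if value > block_limit:
        return "block"
    if value < pass_limit or (pass_inclusive and value == pass_limit):
        return "pass"
    return "warn"


# metric: (pass_limit, block_limit, pass_inclusive), transcribed from the table.
BANDS = {
    "normalize_latency_p95_ms": (8, 25, True),
    "normalize_latency_p99_ms": (20, 60, True),
    "rejection_rate_pct": (0.5, 2, False),
    "dlq_depth": (100, 1000, False),
    "replay_mismatch_pct": (0, 0, True),  # hard zero: anything above 0 blocks
}

SEVERITY = {"pass": 0, "warn": 1, "block": 2}


def evaluate(readings: dict[str, float]) -> dict[str, str]:
    """Classify each supplied reading against its band."""
    return {name: classify(value, *BANDS[name]) for name, value in readings.items()}


def overall(results: dict[str, str]) -> str:
    """The worst individual verdict decides the gate."""
    return max(results.values(), key=SEVERITY.__getitem__, default="pass")


results = evaluate({
    "normalize_latency_p95_ms": 12.0,
    "rejection_rate_pct": 0.2,
    "dlq_depth": 40,
    "replay_mismatch_pct": 0.0,
})
print(results, overall(results))
```

In this example the p95 reading of 12 ms falls in the warn band and every other reading passes, so the overall verdict is warn.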

Failure scenarios and root cause analysis

Fingerprint drift after a normalization change. Symptom: a sudden spike of never-before-seen plan_hash values with no query changes, and the regression evaluator loses baselines for stable queries. Root cause: an edit to VOLATILE_KEYS, OPERATOR_MAP, or serialization settings altered the canonical bytes. Diagnose by replaying an archived corpus: python -m plan_hashing.replay --corpus s3://baselines/replay/ --expect-stable should report zero mismatches. Mitigation: gate every normalization change behind the replay harness in CI, and version the ruleset via schema_version so old fingerprints are recomputable rather than orphaned.
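
The core of such a replay check is a loop that recomputes each archived fingerprint and collects the disagreements. The sketch below is not the plan_hashing.replay module: it assumes a JSON Lines corpus whose records carry `plan` and `plan_hash` fields, and it uses a stand-in hash function so that it runs on its own.

```python
import hashlib
import json
from typing import Callable, Iterable


def replay_mismatches(lines: Iterable[str],
                      compute_hash: Callable[[dict], str]) -> list[dict]:
    """Recompute every archived fingerprint and report the ones that changed."""
    mismatches = []
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines in the corpus file
        record = json.loads(line)
        actual = compute_hash(record["plan"])
        if actual != record["plan_hash"]:
            mismatches.append(
                {"line": line_no, "expected": record["plan_hash"], "actual": actual}
            )
    return mismatches


def stand_in_hash(plan: dict) -> str:
    """Placeholder for the worker's compute_hash, used only to make the demo runnable."""
    data = json.dumps(plan, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


plans = [{"Node Type": "seq_scan"}, {"Node Type": "index_scan"}]
corpus = [json.dumps({"plan": p, "plan_hash": stand_in_hash(p)}) for p in plans]
corpus.append(json.dumps({"plan": plans[0], "plan_hash": "0" * 64}))  # simulated drift

report = replay_mismatches(corpus, stand_in_hash)
print(len(report), report[0]["line"])
```

A CI gate would fail the build whenever the returned list is non-empty, which matches the hard-zero band for replay mismatches.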

Non-deterministic ordering of commutative children. Symptom: two runs of an identical query on the same engine yield different hashes intermittently. Root cause: the plan emits a hash join’s inputs in optimizer-dependent order and the commutative-child sort is missing or keyed on an unstable field. Diagnose by hashing the same normalized tree twice, once with the children of order-independent operators shuffled: assert compute_hash(shuffle(tree)) == compute_hash(tree), where shuffle stands for a test helper that returns a reordered copy (random.shuffle itself permutes in place and returns None). Mitigation: ensure the lexicographic sort keys on the fully serialized child (as in normalize_plan), not on a partial field like operator name alone.
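
A reordering helper for that check might look like the following sketch. The names `reorder_children` and `is_order_independent` are hypothetical, and the helper reorders every Plans list it finds, so it should be pointed at trees built only from order-independent operators. The two hash functions at the bottom are stand-ins that make the example runnable.

```python
import copy
import json
import random
from typing import Any, Callable


def reorder_children(tree: Any, reorder: Callable[[list], list]) -> Any:
    """Return a deep copy of the tree with `reorder` applied to every Plans list."""
    root = copy.deepcopy(tree)
    stack = [root]
    while stack:
        current = stack.pop()
        children = current.get("Plans") if isinstance(current, dict) else None
        if isinstance(children, list):
            current["Plans"] = reorder(children)
            stack.extend(current["Plans"])
    return root


def is_order_independent(compute_hash: Callable[[Any], str], tree: Any,
                         trials: int = 20, seed: int = 0) -> bool:
    """True when reversed and randomly shuffled copies all hash like the original."""
    rng = random.Random(seed)
    expected = compute_hash(tree)
    variants = [reorder_children(tree, lambda xs: xs[::-1])]
    variants += [reorder_children(tree, lambda xs: rng.sample(xs, len(xs)))
                 for _ in range(trials)]
    return all(compute_hash(variant) == expected for variant in variants)


def sorted_form(node: Any) -> Any:
    """Stand-in canonicalizer: sorts each Plans list by its serialized children."""
    if isinstance(node, dict):
        out = {k: sorted_form(v) for k, v in node.items()}
        if isinstance(out.get("Plans"), list):
            out["Plans"] = sorted(out["Plans"], key=lambda c: json.dumps(c, sort_keys=True))
        return out
    if isinstance(node, list):
        return [sorted_form(item) for item in node]
    return node


tree = {"Node Type": "hash_join", "Plans": [
    {"Node Type": "seq_scan", "Relation Name": "orders"},
    {"Node Type": "seq_scan", "Relation Name": "customers"},
]}
stable = is_order_independent(lambda t: json.dumps(sorted_form(t), sort_keys=True), tree)
unstable = is_order_independent(lambda t: json.dumps(t, sort_keys=True), tree)
print(stable, unstable)
```

The reversed copy is always included alongside the random shuffles, so a two-child join is guaranteed to be tested in both orders regardless of the seed.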

Malformed or truncated upstream JSON. Symptom: rising ERR_MALFORMED_JSON in the DLQ and a climbing rejection rate. Root cause: the capture layer truncated a large plan, or a proxy re-encoded UTF-8. Diagnose with SELECT error_code, count(*) FROM plan_hash_dlq WHERE created_at > now() - interval '1 hour' GROUP BY 1. Mitigation: enforce a max-payload guard and content-length check at capture, and replay DLQ rows once the upstream bug is fixed — never hand-edit fingerprints.

Version-drift namespace explosion. Symptom: storage cost climbs and baseline coverage drops after a fleet upgrade, because adjacent releases such as mysql/8.4.4/ and mysql/8.4.5/ are treated as unrelated namespaces. Root cause: version differences that do not change plan structure are still opening fresh namespaces. Diagnose by comparing fingerprint sets across adjacent versions for identical queries. Mitigation: route on engine/major.minor only (truncate the patch component) so cost-model-only patches do not fragment history, and reserve full-version namespacing for structural optimizer changes. PostgreSQL needs separate handling: since version 10 its releases are numbered major.minor (17.4 and 17.5 are both minor releases of major version 17), so there is no patch component to truncate, and collapsing them means namespacing on the major version alone.
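
The truncation itself is a one-line change to how the namespace is built. A minimal sketch, assuming the granularity values `major_minor` and `full` from the configuration reference; the function name `namespace_for` is hypothetical.

```python
def namespace_for(engine: str, engine_version: str,
                  granularity: str = "major_minor") -> str:
    """Build the storage namespace, optionally dropping the patch component."""
    parts = engine_version.split(".")
    if granularity == "full":
        kept = parts
    elif granularity == "major_minor":
        kept = parts[:2]
    else:
        raise ValueError(f"unknown namespace granularity: {granularity!r}")
    return f"{engine}/{'.'.join(kept)}/"


print(namespace_for("mysql", "8.4.4"))           # mysql/8.4/
print(namespace_for("mysql", "8.4.5"))           # mysql/8.4/
print(namespace_for("mysql", "8.4.5", "full"))   # mysql/8.4.5/
print(namespace_for("postgresql", "17.5"))       # postgresql/17.5/
```

A two-part version string such as 17.5 passes through unchanged under `major_minor`, because there is no third component to drop.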

Executor saturation on wide plans. Symptom: plan_normalize_latency_ms p99 breaches 60 ms during analytical batch windows. Root cause: very deep plan trees monopolize the default thread pool. Diagnose by correlating latency spikes with plan node counts in traces. Mitigation: size a dedicated ThreadPoolExecutor for normalization (see the configuration reference) and cap accepted plan depth, quarantining pathological trees to the DLQ for offline hashing.
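
Both mitigations fit in a small wrapper around the hashing call. This is a sketch under assumptions: `PlanTooDeep` is a hypothetical exception that the caller would translate into a DLQ write, depth is counted as the number of nested plan nodes along the longest Plans chain, and the lambda in `main` stands in for the real hash function.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


class PlanTooDeep(Exception):
    """Hypothetical signal that a plan should be quarantined instead of hashed."""


def plan_depth(plan: dict) -> int:
    """Depth of the plan tree, counted iteratively so the guard cannot overflow the stack."""
    deepest, stack = 0, [(plan, 1)]
    while stack:
        node, depth = stack.pop()
        deepest = max(deepest, depth)
        children = node.get("Plans")
        if isinstance(children, list):
            stack.extend((c, depth + 1) for c in children if isinstance(c, dict))
    return deepest


async def hash_off_loop(plan: dict, compute_hash: Callable[[dict], str],
                        executor: ThreadPoolExecutor, max_depth: int = 64) -> str:
    """Reject over-deep plans, then hash on the dedicated pool instead of the default one."""
    if plan_depth(plan) > max_depth:
        raise PlanTooDeep(f"plan depth exceeds {max_depth}")
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, compute_hash, plan)


async def main() -> None:
    plan = {"Node Type": "hash_join", "Plans": [{"Node Type": "seq_scan"}]}
    # A named, bounded pool keeps normalization from competing with other
    # users of the event loop's default executor.
    with ThreadPoolExecutor(max_workers=4, thread_name_prefix="plan-normalize") as executor:
        digest = await hash_off_loop(plan, lambda p: "stand-in-digest", executor)
        print(digest)


if __name__ == "__main__":
    asyncio.run(main())
```

Checking depth before submitting the work means a pathological tree never occupies a worker thread at all.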

Configuration reference

Key tuning knobs, supplied as environment variables at worker start:

Variable	Default	Purpose
`PLAN_HASH_SHARD_COUNT`	`256`	Ring size for the `int(hash[:4],16) % N` partition key
`PLAN_HASH_MAX_DEPTH`	`64`	Maximum plan tree depth; deeper trees quarantine to the DLQ
`PLAN_HASH_MAX_BYTES`	`1048576`	Ingress payload ceiling; larger payloads reject as `ERR_MALFORMED_JSON`
`PLAN_HASH_EXECUTOR_WORKERS`	`4`	Dedicated normalization thread pool size
`PLAN_HASH_POOL_MIN`	`4`	`asyncpg` min connections to the fingerprint store
`PLAN_HASH_POOL_MAX`	`16`	`asyncpg` max connections to the fingerprint store
`PLAN_HASH_NAMESPACE_GRANULARITY`	`major_minor`	`major_minor` or `full` version namespacing
`PLAN_HASH_DLQ_TABLE`	`plan_hash_dlq`	Table for rejected and quarantined payloads
`PLAN_HASH_OTEL_EXPORTER_ENDPOINT`	—	OTLP collector endpoint for traces and metrics
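
Reading these variables once at worker start, with validation, keeps a bad value from surfacing later as a routing bug. The loader below is an illustrative sketch: the `HashingConfig` class and `load_config` function are hypothetical, while the variable names and defaults are taken from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class HashingConfig:
    shard_count: int
    max_depth: int
    max_bytes: int
    executor_workers: int
    pool_min: int
    pool_max: int
    namespace_granularity: str
    dlq_table: str
    otel_endpoint: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> HashingConfig:
    def positive_int(name: str, default: int) -> int:
        value = int(env.get(name, default))
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value}")
        return value

    granularity = env.get("PLAN_HASH_NAMESPACE_GRANULARITY", "major_minor")
    if granularity not in ("major_minor", "full"):
        raise ValueError(f"unsupported namespace granularity: {granularity!r}")

    config = HashingConfig(
        shard_count=positive_int("PLAN_HASH_SHARD_COUNT", 256),
        max_depth=positive_int("PLAN_HASH_MAX_DEPTH", 64),
        max_bytes=positive_int("PLAN_HASH_MAX_BYTES", 1048576),
        executor_workers=positive_int("PLAN_HASH_EXECUTOR_WORKERS", 4),
        pool_min=positive_int("PLAN_HASH_POOL_MIN", 4),
        pool_max=positive_int("PLAN_HASH_POOL_MAX", 16),
        namespace_granularity=granularity,
        dlq_table=env.get("PLAN_HASH_DLQ_TABLE", "plan_hash_dlq"),
        otel_endpoint=env.get("PLAN_HASH_OTEL_EXPORTER_ENDPOINT") or None,
    )
    if config.pool_min > config.pool_max:
        raise ValueError("PLAN_HASH_POOL_MIN must not exceed PLAN_HASH_POOL_MAX")
    return config


defaults = load_config({})
print(defaults.shard_count, defaults.namespace_granularity, defaults.otel_endpoint)
```

Passing the environment as a mapping keeps the loader testable: a test can supply a plain dict instead of mutating os.environ.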

Two safe-fallback protocols are non-negotiable. First, a payload that fails validation routes to the DLQ with an explicit error code rather than being hashed as raw bytes — a fingerprint must always correspond to a canonicalized tree, never to accidental input noise. Second, hashing itself is versioned: the active normalization ruleset is pinned by schema_version, so a change to canonicalization produces a new logical namespace instead of silently mutating identifiers already in storage. Authoritative field definitions for the raw plan inputs live in the PostgreSQL EXPLAIN documentation and the MySQL EXPLAIN output format.

Related

How to Generate Deterministic Query Plan Hashes in Python — the step-by-step canonicalization runbook behind this stage.
Cost Estimation Mapping Across PostgreSQL and MySQL — the upstream stage that normalizes cost vectors before hashing.
Defining Regression Thresholds for Query Plans — the downstream consumer that keys baselines on the fingerprint.
Normalizing Query Plans for Cross-Engine Comparison — the capture-side flattening this stage assumes as input.
Security Boundaries for Baseline Data Storage — how the immutable fingerprint store is protected at rest.
