8 Decoding Algorithms
After completing this chapter, you will be able to:
- Formulate surface code decoding as a minimum weight perfect matching problem
- Implement the Union-Find decoder and analyze its complexity
- Apply Belief Propagation to quantum LDPC codes
- Use Ordered Statistics Decoding (OSD) as post-processing for BP
- Compare decoder performance in terms of threshold and speed
This chapter assumes familiarity with:
- Surface code structure and syndromes (Chapter 4)
- Belief Propagation algorithm (Chapter 2)
- Basic graph algorithms (shortest paths, matching)
Decoding is the process of identifying and correcting errors from syndrome measurements. For quantum codes, decoding must account for the unique structure of Pauli errors and the CSS constraint. This chapter presents the main decoding algorithms used in practice.
8.1 MWPM for Surface Code
Minimum Weight Perfect Matching (MWPM) is the standard decoder for the surface code, offering excellent performance with manageable complexity.
8.1.1 Matching Problem Formulation
The surface code syndrome consists of defects (stabilizers that anticommute with errors). The decoding problem is to find the most likely error chain consistent with the observed syndrome.
Given a set of defects \(D = \{d_1, ..., d_{2m}\}\):
1. Partition \(D\) into \(m\) pairs
2. For each pair \((d_i, d_j)\), find a path (error chain) connecting them
3. Minimize the total weight (sum of path lengths)
The decoder output is the union of all paths.
Under the independent error model, where errors occur independently on each qubit with probability \(p\):
- The MWPM decoder returns the single most likely error consistent with the syndrome
- Minimizing total weight coincides with maximizing likelihood, since each additional flipped qubit costs a factor \(p/(1-p) < 1\)
- This most-likely-error rule is near-optimal at practical error rates (true optimality would compare whole error classes, summing over degenerate errors)
8.2 Example: Defect Pairing
Consider defects at positions (0,0), (0,3), (3,0), (3,3):
- Option A: pair (0,0)-(0,3) and (3,0)-(3,3), total weight = 6
- Option B: pair (0,0)-(3,0) and (0,3)-(3,3), total weight = 6
- Option C: pair (0,0)-(3,3) and (0,3)-(3,0), total weight = 12
MWPM chooses a minimum weight pairing. Here Options A and B tie at weight 6, so either is a valid output; such ties are broken arbitrarily. The brute-force sketch below illustrates the search.
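For intuition, here is a brute-force sketch that enumerates every pairing of the defects above; this is feasible only for a handful of defects, and min_weight_pairing is an illustrative helper, not a library routine:

def min_weight_pairing(defects):
    """Brute-force MWPM over a small defect list (illustration only)."""
    def manhattan(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    def pairings(points):
        # Recursively pair the first point with every remaining point
        if not points:
            yield []
            return
        first, rest = points[0], points[1:]
        for i, other in enumerate(rest):
            for tail in pairings(rest[:i] + rest[i + 1:]):
                yield [(first, other)] + tail

    return min(
        (sum(manhattan(a, b) for a, b in p), p)
        for p in pairings(list(defects))
    )

weight, pairs = min_weight_pairing([(0, 0), (0, 3), (3, 0), (3, 3)])
print(weight, pairs)  # 6 [((0, 0), (0, 3)), ((3, 0), (3, 3))]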
8.2.1 Blossom Algorithm
The Edmonds Blossom algorithm solves MWPM in polynomial time.
Input: Complete graph with \(2m\) vertices and edge weights
Output: Minimum weight perfect matching
- Initialize empty matching \(M = \emptyset\)
- While there exists an unmatched vertex:
- Grow alternating tree from unmatched vertex
- Find augmenting paths, contracting odd cycles (“blossoms”) as needed
- Augment matching along shortest augmenting path
- Return \(M\)
Complexity: \(O(m^3)\) for general graphs, \(O(m^2)\) for planar graphs
For the surface code:
- The sparse, planar matching graph enables faster (roughly \(O(m^2)\)) matching
- Efficient implementations (Kolmogorov’s Blossom V) handle millions of defects in real time
- The PyMatching library provides an optimized Python/C++ implementation
8.2.2 Code Example: PyMatching MWPM Decoder
from pymatching import Matching
import numpy as np

def mwpm_decode_surface_code(H, syndrome):
    """
    MWPM decoder using the PyMatching library.

    Args:
        H: Parity-check matrix (scipy sparse), one row per check
        syndrome: Binary syndrome vector, one entry per check

    Returns:
        error: Estimated binary error pattern
    """
    # Build the matching graph from the parity-check matrix
    decoder = Matching(H)
    # Decode the syndrome to an estimated error pattern
    error = decoder.decode(syndrome)
    return error
# Example: Distance-5 surface code
def create_surface_code_parity_check(distance):
    """
    Create a vertex-check matrix for a distance-d surface-code-like lattice.

    Qubits live on the edges of a distance x distance grid and each check
    acts on the edges incident to one vertex, so every qubit participates
    in exactly two checks (as matching decoders require). This is a
    simplified construction for illustration; real implementations use
    the full surface code geometry with boundaries.
    """
    from scipy.sparse import lil_matrix, csr_matrix

    d = distance
    n_horizontal = d * (d - 1)   # edges (i, j) -- (i, j+1)
    n_vertical = (d - 1) * d     # edges (i, j) -- (i+1, j)
    n_qubits = n_horizontal + n_vertical
    n_checks = d * d             # one check per vertex

    H = lil_matrix((n_checks, n_qubits), dtype=np.uint8)
    vertex = lambda i, j: i * d + j

    edge = 0
    for i in range(d):           # horizontal edges
        for j in range(d - 1):
            H[vertex(i, j), edge] = 1
            H[vertex(i, j + 1), edge] = 1
            edge += 1
    for i in range(d - 1):       # vertical edges
        for j in range(d):
            H[vertex(i, j), edge] = 1
            H[vertex(i + 1, j), edge] = 1
            edge += 1
    return csr_matrix(H)
def demo_mwpm_decoder():
    """Demonstrate MWPM decoding on the toy lattice above."""
    distance = 5
    H = create_surface_code_parity_check(distance)

    # Simulate a syndrome from random errors (5% error rate)
    np.random.seed(42)
    n_qubits = H.shape[1]
    error = (np.random.random(n_qubits) < 0.05).astype(np.uint8)
    syndrome = (H @ error) % 2

    print(f"Distance-{distance} lattice:")
    print(f"  Qubits: {n_qubits}")
    print(f"  Checks: {H.shape[0]}")
    print(f"  Syndrome weight: {np.sum(syndrome)}")

    # Decode using MWPM
    decoded_error = mwpm_decode_surface_code(H, syndrome)

    # The residual (true error + correction, mod 2) must have zero
    # syndrome; a nonzero residual would be a potential logical operator
    residual = (error + decoded_error) % 2
    residual_syndrome = (H @ residual) % 2
    print(f"  Residual syndrome zero: {'Yes' if not np.any(residual_syndrome) else 'No'}")
    print(f"  Exact correction: {'Yes' if np.sum(residual) == 0 else 'No'}")

PyMatching is the state-of-the-art MWPM decoder for surface codes:
- 10-100x faster than naive Blossom implementations
- Handles weighted matching with LLR-derived edge weights
- Provides matching statistics for debugging
- Used in major quantum computing simulations
8.2.3 Worked Example: MWPM Decoding Step-by-Step
Let us walk through a complete MWPM decoding example on a small surface code patch.
Step 1: Identify syndrome defects. Consider a distance-3 surface code after one error cycle. The syndrome measurement reveals 4 defects at coordinates:
- \(D_1 = (2, 3)\)
- \(D_2 = (2, 7)\)
- \(D_3 = (5, 3)\)
- \(D_4 = (5, 7)\)
Step 2: Construct the complete graph. Create a weighted graph where vertices are defects and boundary nodes. Edge weights are the Manhattan distances between locations:
| | \(D_1\) | \(D_2\) | \(D_3\) | \(D_4\) |
|---|---|---|---|---|
| \(D_1\) | — | 4 | 3 | 7 |
| \(D_2\) | 4 | — | 7 | 3 |
| \(D_3\) | 3 | 7 | — | 4 |
| \(D_4\) | 7 | 3 | 4 | — |
Table: Distance matrix between defects. Edge weights are Manhattan distances.
Step 3: Find minimum weight perfect matching. We evaluate the possible pairings:
- Option A: \((D_1, D_2) + (D_3, D_4)\): weight = \(4 + 4 = 8\)
- Option B: \((D_1, D_3) + (D_2, D_4)\): weight = \(3 + 3 = 6\)
- Option C: \((D_1, D_4) + (D_2, D_3)\): weight = \(7 + 7 = 14\)
Option B has minimum weight. MWPM selects pairs \((D_1, D_3)\) and \((D_2, D_4)\).
Step 4: Construct correction chains. For each matched pair, construct a minimum-weight path on the lattice:
- Path \(D_1 \rightarrow D_3\): vertical chain from \((2,3)\) to \((5,3)\) (length 3)
- Path \(D_2 \rightarrow D_4\): vertical chain from \((2,7)\) to \((5,7)\) (length 3)
Step 5: Apply correction. Apply X corrections along these paths. If the original error chains were also vertical (as is likely from the defect pattern), the correction cancels the error successfully.
8.2.4 Geometric Matching
Surface code matching exploits the lattice geometry.
The distance between two defects \((x_1, y_1)\) and \((x_2, y_2)\) is:
\[ d((x_1, y_1), (x_2, y_2)) = |x_1 - x_2| + |y_1 - y_2| \]
This is the Manhattan (L1) distance on the grid.
8.3 Example: Weighted Matching
For defects on a \(100 \times 100\) lattice:
- Defects at (10, 20) and (10, 50): distance = 30
- Defects at (50, 50) and (80, 80): distance = 60
The matching algorithm chooses pairs minimizing total distance.
If the matching decoder chooses a correction \(C\) and the true error is \(E\):
- The residual \(EC\) always has trivial syndrome
- A logical error occurs if \(EC\) is a logical operator
- For the surface code, this means \(EC\) contains a non-contractible loop (or a boundary-to-boundary chain on the planar code)
MWPM minimizes the probability of such logical errors.
8.3.1 Practical Considerations
Real implementations add several optimizations.
- Weighted edges: Use log-likelihood ratios instead of geometric distance
- Pachner moves: Exploit topological properties for speedup
- Batch processing: Decode multiple syndromes in parallel
- Approximate matching: For very large codes, use heuristics
8.4 Example: LLR-Weighted Matching
Instead of geometric distance, use the negative log-likelihood of the defect pair being connected:
\[ w_{ij} = -\log P(d_i \leftrightarrow d_j) \]

For depolarizing noise with rate \(p\), a connecting path of length \(\ell\) has probability proportional to \((p/(1-p))^{\ell}\), so
\[ w_{ij} \propto d((x_i, y_i), (x_j, y_j)) \cdot \log\frac{1-p}{p} \]
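A one-line numeric check of this weight formula, using the defect pair from the example above (edge_weight is an illustrative helper):

import numpy as np

def edge_weight(pos_i, pos_j, p):
    """MWPM edge weight: Manhattan distance times log((1-p)/p)."""
    d = abs(pos_i[0] - pos_j[0]) + abs(pos_i[1] - pos_j[1])
    return d * np.log((1 - p) / p)

print(edge_weight((10, 20), (10, 50), p=0.01))  # 30 * log(99), about 137.9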
8.5 Union-Find Decoder
The Union-Find decoder offers near-linear complexity with comparable performance to MWPM.
8.5.1 Data Structure Motivation
The key insight is that we can use Union-Find (disjoint set union) data structure to efficiently track connected components of errors.
Union-Find maintains a partition of vertices into disjoint sets:
- find(v): Returns the representative of the set containing \(v\)
- union(a, b): Merges the sets containing \(a\) and \(b\)
With path compression and union by rank: nearly \(O(1)\) per operation.
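A minimal standalone sketch of this data structure (the decoder code in Section 8.6.1 embeds the same two operations):

class DSU:
    """Disjoint set union with path compression and union by rank."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, v):
        # Path compression: point visited nodes closer to the root
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] < self.rank[rb]:  # union by rank: attach the
            ra, rb = rb, ra                # shallower tree under the deeper
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1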
Input: Syndrome defects on the lattice
Output: Error correction
- Create Union-Find with each defect as separate component
- For each stabilizer measurement round:
- If defect appears: create new component
- If defect disappears: mark component as “neutralized”
- If same defect persists: union with previous occurrence
- For each neutralized component, grow a minimum spanning tree
- Apply correction along spanning tree boundaries
Complexity: \(O(n \alpha(n))\) where \(\alpha\) is inverse Ackermann
8.5.2 Delfosse-Nickerson Construction
Delfosse and Nickerson developed the Union-Find decoder specifically for surface codes.
Standard Union-Find doesn’t account for code degeneracy:
- Different error chains can have the same syndrome
- The decoder should pick a minimum-weight representative
The degeneracy-aware variant:
- Tracks the boundaries of syndrome components
- Applies corrections once the boundaries grow large enough
For surface code with distance \(d\) and error rate \(p\):
- Logical error rate: \(p_L = O(p^{(d+1)/2})\)
- Threshold close to MWPM’s (about \(10\%\) for independent errors with perfect measurements)
- Complexity: \(O(n \alpha(n))\) vs MWPM’s polynomial (up to cubic) matching cost
For large codes, Union-Find is 100x faster than MWPM.
8.6 Example: UF vs MWPM
For a \(d = 101\) surface code with \(n \approx 10^4\) qubits:
- MWPM: ~1 second per syndrome
- Union-Find: ~10 ms per syndrome

For real-time decoding at a 1 MHz cycle rate:
- MWPM: cannot keep up
- Union-Find: easily meets timing requirements
Union-Find is now the preferred decoder for large-scale surface code simulations and may be essential for practical devices. The main limitation: slightly worse performance than MWPM near threshold.
8.6.1 Code Example: Union-Find Decoder
import numpy as np

class UnionFindDecoder:
    """Union-Find decoder for surface codes (simplified)."""

    def __init__(self, H, lattice_distance):
        """
        Initialize decoder.

        Args:
            H: Parity-check matrix (scipy sparse)
            lattice_distance: Distance d of surface code
        """
        self.H = H
        self.d = lattice_distance
        self.n_qubits = H.shape[1]
        # Union-Find data structures
        self.parent = {}      # defect_id -> parent_id
        self.rank = {}        # defect_id -> rank for union by rank
        self.defect_pos = {}  # defect_id -> (x, y) position

    def _find(self, x):
        """Find with path compression."""
        if x not in self.parent:
            self.parent[x] = x
            self.rank[x] = 0
            return x
        if self.parent[x] != x:
            self.parent[x] = self._find(self.parent[x])
        return self.parent[x]

    def _union(self, x, y):
        """Union by rank."""
        rx, ry = self._find(x), self._find(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:
            self.parent[rx] = ry
        elif self.rank[rx] > self.rank[ry]:
            self.parent[ry] = rx
        else:
            self.parent[ry] = rx
            self.rank[rx] += 1
    def decode(self, syndrome, defect_positions):
        """
        Decode syndrome to error pattern.

        Args:
            syndrome: Binary syndrome vector (unused in this toy version,
                which works directly from the defect positions)
            defect_positions: List of (x, y) positions for each defect

        Returns:
            error_pattern: Binary vector of estimated errors
        """
        # Initialize each defect as its own cluster
        defects = list(range(len(defect_positions)))
        self.parent = {i: i for i in defects}
        self.rank = {i: 0 for i in defects}
        self.defect_pos = {i: defect_positions[i] for i in defects}

        # Defects still awaiting a partner
        active = set(defects)
        corrections = []

        # Greedy closest-pair merging: a stand-in for the simultaneous
        # cluster growth of real implementations (see the worked example)
        while len(active) > 1:
            min_dist = float('inf')
            closest_pair = None
            for i in active:
                for j in active:
                    if i < j:
                        dist = self._manhattan_distance(
                            self.defect_pos[i], self.defect_pos[j]
                        )
                        if dist < min_dist:
                            min_dist = dist
                            closest_pair = (i, j)
            if closest_pair is None:
                break
            i, j = closest_pair
            corrections.append((self.defect_pos[i], self.defect_pos[j]))
            self._union(i, j)
            # The merged cluster holds an even number of defects: it is
            # neutral, so both defects annihilate and stop participating
            active.discard(i)
            active.discard(j)

        return self._corrections_to_errors(corrections)

    def _manhattan_distance(self, pos1, pos2):
        """Manhattan distance on lattice."""
        return abs(pos1[0] - pos2[0]) + abs(pos1[1] - pos2[1])

    def _corrections_to_errors(self, corrections):
        """Convert correction paths to an error vector."""
        error = np.zeros(self.n_qubits, dtype=np.uint8)
        # Simplified: a full implementation traces a shortest path
        # between each pair and flips the qubits along it
        for pos1, pos2 in corrections:
            pass
        return error
def demo_union_find():
    """Demonstrate the Union-Find decoder skeleton."""
    # Mock syndrome with 4 defects
    syndrome = np.array([1, 0, 0, 1], dtype=np.uint8)
    defect_positions = [(2, 3), (2, 7), (5, 3), (5, 7)]
    # Note: a real implementation needs the full lattice and H matrix
    print("Union-Find decoder initialized")
    print(f"Defect positions: {defect_positions}")

This simplified Union-Find implementation demonstrates the key concepts:
- Disjoint-set data structure for tracking connected components
- Growing regions from defects until they merge
- Path compression for near-constant-time operations
- Production implementations additionally handle lattice geometry, boundaries, and weighted growth
8.6.2 Growth-Based Decoding
The key operation in Union-Find is growing correction regions.
For each defect, maintain a growing region (Union-Find cluster):
1. Initialize each defect as a singleton region
2. Repeatedly add boundary edges to regions
3. When two odd regions merge, they neutralize each other
4. Continue until all defects are resolved
This is equivalent to computing minimum spanning forests.
8.6.3 Worked Example: Union-Find Growth Process
Let us trace the Union-Find decoder on the same syndrome as the MWPM example.
Step 1: Initialize regions. We have 4 defects at the same locations. Each starts as its own region:
- Region 1: \(\{D_1\}\) at \((2, 3)\)
- Region 2: \(\{D_2\}\) at \((2, 7)\)
- Region 3: \(\{D_3\}\) at \((5, 3)\)
- Region 4: \(\{D_4\}\) at \((5, 7)\)
Step 2: First growth iteration. Each region expands by one unit in all directions:
- Region 1 adds edges to \((1,3), (3,3), (2,2), (2,4)\)
- Region 2 adds edges to \((1,7), (3,7), (2,6), (2,8)\)
- Region 3 adds edges to \((4,3), (6,3), (5,2), (5,4)\)
- Region 4 adds edges to \((4,7), (6,7), (5,6), (5,8)\)
No regions touch yet, so continue growing.
Step 3: Second growth iteration. Each region expands by another unit. The closest pairs are \(D_1, D_3\) and \(D_2, D_4\) (Manhattan distance 3), so their boundaries touch first:
- Region 1 reaches \((4, 3)\) while Region 3 reaches \((3, 3)\): the regions overlap
- Region 2 reaches \((4, 7)\) while Region 4 reaches \((3, 7)\): the regions overlap

(Real implementations grow clusters in half-edge steps, so the distance-3 pairs merge at total growth 3, strictly before the distance-4 pairs \(D_1, D_2\) and \(D_3, D_4\) would meet at total growth 4.)

Step 4: Merge and cancel. When Region 1 and Region 3 merge, they form a single cluster containing two defects. Since both were odd-weight regions, the merged cluster is neutral and stops growing: the correction is the path from \(D_1\) to \(D_3\). Regions 2 and 4 annihilate in the same way.

Step 5: Final result. The decoder identifies two clusters, simulated in the sketch below:
- Cluster A: \(\{D_1, D_3\}\) \(\rightarrow\) vertical correction path from \((2,3)\) to \((5,3)\)
- Cluster B: \(\{D_2, D_4\}\) \(\rightarrow\) vertical correction path from \((2,7)\) to \((5,7)\)

This matches the MWPM solution (Option B) but was computed in near-linear time rather than with an expensive matching routine.
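As a sanity check, here is a small hypothetical simulation of this growth schedule; it grows all clusters uniformly in half-unit steps and merges two clusters once the sum of their radii covers the Manhattan distance between their seed defects (layout and names are illustrative):

from itertools import combinations

# The four defects from the worked example
defects = {1: (2, 3), 2: (2, 7), 3: (5, 3), 4: (5, 7)}
parent = {d: d for d in defects}

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path compression
        x = parent[x]
    return x

radius = 0.0
merges = []
while len({find(d) for d in defects}) > 2:  # stop at 2 neutral clusters
    radius += 0.5
    for a, b in combinations(defects, 2):
        (xa, ya), (xb, yb) = defects[a], defects[b]
        dist = abs(xa - xb) + abs(ya - yb)
        if find(a) != find(b) and 2 * radius >= dist:
            parent[find(a)] = find(b)
            merges.append((a, b, radius))

for a, b, r in merges:
    print(f"D{a} merges with D{b} at growth radius {r}")
# Prints: D1-D3 and D2-D4 merge at radius 1.5 (distance 3), before the
# distance-4 pairs would merge at radius 2.0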
8.7 BP for Quantum LDPC
Belief Propagation (BP) extends naturally to quantum LDPC codes, though with additional complications.
8.7.1 Quantum BP Challenges
Quantum decoding requires handling both X and Z errors simultaneously.
For CSS codes:
- X-type stabilizers detect Z errors, and Z-type stabilizers detect X errors
- The two syndromes can be decoded separately with two classical BP runs (exactly optimal only when X and Z errors are uncorrelated; depolarizing noise correlates them through Y errors)
For non-CSS codes:
- Stabilizers mix X and Z components, so the two error types cannot be separated
- Need joint decoding over quaternary (4-state) messages
Complexity increases significantly for non-CSS.
For CSS codes, decode X and Z independently, as sketched below:
1. Run BP on the X-type syndrome with the \(H_X\) matrix to estimate the Z errors
2. Run BP on the Z-type syndrome with the \(H_Z\) matrix to estimate the X errors
3. Combine both corrections into a full Pauli recovery

This works because each error type triggers only the opposite type of stabilizer.
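A minimal sketch of this two-sided CSS decoding, assuming some classical syndrome decoder bp_decode(H, syndrome) is available (any of the BP variants in this chapter would do; the function name is illustrative):

def decode_css(Hx, Hz, syndrome_x, syndrome_z, bp_decode):
    """Decode a CSS code with two independent classical BP runs."""
    # Z errors anticommute with X-type stabilizers: decode them with Hx
    z_correction = bp_decode(Hx, syndrome_x)
    # X errors anticommute with Z-type stabilizers: decode them with Hz
    x_correction = bp_decode(Hz, syndrome_z)
    return x_correction, z_correction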
8.7.2 Hard-Decision vs Soft-Decision BP
Different variants trade off complexity and performance.
- Hard-decision BP (HD-BP): Use binary messages (0/1)
- Soft-decision BP (SD-BP): Use LLR messages (real numbers)
- Quantized BP: Use finite-precision LLRs (e.g., 4 bits)
HD-BP is fastest but has error floor issues. SD-BP has best performance but highest complexity.
8.8 Example: LLR Messages
For error probability \(p\):
- LLR for qubit \(i\): \(L_i = \log(P(x_i = 1)/P(x_i = 0))\), so a positive LLR means an error is likely
- For a channel that flips each qubit with probability \(p\): \(L = \log(p/(1-p))\), negative whenever \(p < 1/2\)
Messages combine via:
- Variable node: sum of incoming LLRs (a product of likelihoods)
- Check node: tanh combination for SD-BP, parity for HD-BP
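For concreteness, a sketch of the syndrome-adapted check-node update under the tanh rule; check_node_update is an illustrative helper, and the sign flip accounts for the check's syndrome bit:

import numpy as np

def check_node_update(incoming, syndrome_bit):
    """Tanh-rule check-node update for syndrome BP (sketch)."""
    # Outgoing LLR = (1 - 2s) * 2 * atanh( prod_k tanh(m_k / 2) )
    sign = -1.0 if syndrome_bit else 1.0
    prod = np.prod(np.tanh(np.asarray(incoming) / 2.0))
    prod = np.clip(prod, -0.999999, 0.999999)  # numerical safety
    return sign * 2.0 * np.arctanh(prod)

print(check_node_update([1.2, -0.7, 2.0], syndrome_bit=1))  # approx +0.28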
Input: Parity-check matrix H, syndrome s
Output: Estimated error e
- Initialize messages from variable nodes to check nodes
- For max_iterations:
- Check nodes process incoming messages
- Variable nodes process incoming messages
- Compute hard decisions from messages
- If \(H e^T = s\) (syndrome satisfied), return e
- Return hard decision (may not match syndrome)
Complexity: \(O(n \cdot \text{max\_iterations} \cdot d_v \cdot d_c)\)
8.8.1 Syndrome BP for Degenerate Codes
For degenerate codes, standard BP doesn’t account for equivalent errors.
Syndrome BP decodes directly in syndrome space:
1. Instead of a specific error e, find a coset representative
2. Account for stabilizer degeneracy
3. The output is always a valid correction
More complex but handles degeneracy correctly.
Challenges for quantum LDPC BP:
- Short cycles in the Tanner graph (4-cycles are common)
- Degeneracy and logical operators
- High check degree (unlike the surface code’s degree 4)

Solutions:
- Careful code design (girth optimization)
- Syndrome BP or variants
- Post-processing with OSD
8.9 OSD and BP+OSD Post-processing
Ordered Statistics Decoding (OSD) provides near-ML performance with low additional complexity.
8.9.1 OSD Principle
OSD uses the most reliable positions to reduce the search space.
Given syndrome s and a reliability ordering of the bits:
1. Order variables by reliability \(|L_i|\)
2. Select the \(k\) least reliable positions
3. Enumerate all \(2^k\) assignments of these positions
4. For each assignment, solve for the remaining positions over GF(2)
5. Return the most likely syndrome-consistent error estimate

(Classical OSD is often stated dually, re-encoding from the most reliable basis; the least-reliable-enumeration form above is the one used throughout this chapter.)
For \(k \ll n\), this is much faster than full ML decoding.
Input: Syndrome s, LLRs, reliability ordering \(\pi\)
Output: Near-ML error estimate

- Partition positions into reliable (\(R\)) and unreliable (\(U\))
- For each \(x \in \{0,1\}^{|U|}\):
- Set \(e_U = x\)
- Solve \(H e^T = s\) for \(e_R\) using Gaussian elimination over GF(2)
- If a solution exists, compute its likelihood
- Return the most likely solution
Complexity: \(O(n^3)\) for the initial Gaussian elimination, plus \(O(2^k)\) candidate evaluations
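The simplified OSD code in Section 8.10.3 skips this linear solve, so here is a standalone sketch of the Gaussian elimination step over GF(2); solve_gf2 is an illustrative helper, not a library routine:

import numpy as np

def solve_gf2(H_free, rhs):
    """Solve H_free @ e = rhs over GF(2); return a solution or None."""
    # Augmented matrix [H_free | rhs], reduced by XOR row operations
    A = np.concatenate([H_free % 2, (rhs % 2).reshape(-1, 1)], axis=1).astype(np.int8)
    n_rows, n_cols = A.shape
    pivots = []
    row = 0
    for col in range(n_cols - 1):
        pivot_rows = np.nonzero(A[row:, col])[0]
        if len(pivot_rows) == 0:
            continue
        A[[row, row + pivot_rows[0]]] = A[[row + pivot_rows[0], row]]
        for r in range(n_rows):
            if r != row and A[r, col]:
                A[r] ^= A[row]
        pivots.append(col)
        row += 1
    if np.any(A[row:, -1]):  # leftover nonzero rhs: inconsistent system
        return None
    e = np.zeros(n_cols - 1, dtype=np.int8)
    for r, col in enumerate(pivots):  # free variables default to 0
        e[col] = A[r, -1]
    return e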
8.9.2 BP+OSD Combination
BP+OSD combines BP’s speed with OSD’s accuracy.
- Run BP to get initial estimate \(\hat{e}\)
- Identify \(k\) least reliable positions where BP failed
- Apply OSD to these positions
- Return OSD-corrected estimate
For small \(k\) (e.g., \(k = 10-20\)), this achieves near-ML performance with minimal overhead.
For classical LDPC codes near threshold:
- BP alone: ~0.5 dB from capacity
- BP+OSD(k=10): ~0.1 dB from capacity
- BP+OSD(k=20): matches ML decoding
The key is choosing \(k\) small enough for efficiency but large enough to capture most errors.
8.10 Example: BP+OSD Parameters
For a [1000, 500] LDPC code at BER = \(10^{-5}\):
- BP only: may have an error floor at \(10^{-7}\)
- BP+OSD(10): error floor moves to \(10^{-9}\)
- BP+OSD(20): error floor at \(10^{-12}\) (ML-like)
Complexity increase: factor of \(2^{10} = 1024\) for k=10
BP+OSD is particularly valuable for quantum LDPC codes:
- BP may have error floors from short cycles
- OSD post-processing eliminates these floors
- For moderate \(k\), the overhead is acceptable
8.10.1 Worked Example: BP Failure and OSD Recovery
This example shows how BP can fail on a code with short cycles, and how OSD post-processing recovers.
Setup: Small LDPC code with short cycles. Consider a \([6, 2]\) LDPC code whose parity-check matrix contains several 4-cycles: \[ H = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 \end{pmatrix} \]
This code has short cycles that confuse BP.
Step 1: Generate a challenging syndrome. Let the true error be \(\mathbf{e} = (0, 0, 1, 0, 1, 1)\): three flipped bits at error rate \(p = 0.05\), an atypically heavy pattern that stresses the decoder.

Compute the syndrome: \(\mathbf{s} = H \mathbf{e}^T = (1, 0, 0, 0)^T\)
Step 2: Run BP (fails to converge). After 20 iterations, BP’s hard decision (the sign pattern of its APP LLRs, listed in Step 3) is: \(\hat{\mathbf{e}}_{\text{BP}} = (1, 0, 1, 1, 0, 1)\)

Check the syndrome of BP’s estimate: \(H \hat{\mathbf{e}}_{\text{BP}}^T = (0, 0, 0, 0)^T \neq \mathbf{s}\)

BP failed! The short cycles pulled its estimate toward a codeword, so it does not even match the syndrome.
Step 3: Identify unreliable positions. BP’s APP LLRs, with the convention \(L_i = \log(P(e_i=1)/P(e_i=0))\) so that positive means an error is likely: \(\mathbf{L}_{\text{app}} = (+0.3, -0.8, +0.5, +0.2, -0.1, +0.4)\)

Order by reliability (smallest \(|L|\) = least reliable):
1. Position 5: \(L = -0.1\) (least reliable)
2. Position 4: \(L = +0.2\)
3. Position 1: \(L = +0.3\)
4. Position 6: \(L = +0.4\)
5. Position 3: \(L = +0.5\)
6. Position 2: \(L = -0.8\) (most reliable)

Choose the \(k = 3\) least reliable positions: \(\{5, 4, 1\}\)
Step 4: Apply OSD (enumerate \(2^3 = 8\) possibilities). For each pattern on positions (5, 4, 1), solve the four check equations for positions (6, 3, 2) over GF(2). Check 2 is consistent only when \(e_5 = 1\), so half the patterns admit no solution:

| Pattern (5,4,1) | Solve for (6,3,2) | Syndrome match? | LLR sum |
|---|---|---|---|
| (0,0,0) | - | No | - |
| (0,0,1) | - | No | - |
| (0,1,0) | - | No | - |
| (0,1,1) | - | No | - |
| (1,0,0) | (1,1,0) | Yes | +1.1 |
| (1,0,1) | (1,1,1) | Yes | +0.1 |
| (1,1,0) | (0,0,1) | Yes | -1.9 |
| (1,1,1) | (0,0,0) | Yes | -1.3 |

Here the LLR sum of a candidate \(e\) is \(\sum_i (2 e_i - 1) L_i\); larger means more likely under the convention above.
Step 5: Select best solution. The pattern \((1, 0, 0)\) on the unreliable positions, with solution \((1, 1, 0)\) on positions (6, 3, 2), gives:
- Full estimate: \((0, 0, 1, 0, 1, 1)\)
- Syndrome: \((1, 0, 0, 0)^T = \mathbf{s}\) ✓
- LLR sum: \(+1.1\) (largest among the valid solutions)
Step 6: Verify correction. OSD estimate: \(\hat{\mathbf{e}}_{\text{OSD}} = (0, 0, 1, 0, 1, 1)\) True error: \(\mathbf{e} = (0, 0, 1, 0, 1, 1)\)

OSD succeeded! Here the estimate reproduces the true error exactly; in general, success only requires the difference between estimate and error to be a stabilizer rather than a logical operator.
BP failed because the short cycles in this code caused information to circulate indefinitely, preventing convergence. OSD doesn’t care about cycles—it simply searches the solution space systematically. By focusing on the least reliable positions (where BP was most uncertain), OSD only needs to explore a tiny fraction of the full \(2^n\) possibilities. This is why BP+OSD is powerful for qLDPC codes: BP handles most cases fast, OSD fixes the hard cases.
8.10.2 Adaptive OSD
Adaptive variants adjust \(k\) based on runtime conditions.
Instead of a fixed \(k\):
- Use larger \(k\) for harder syndromes
- Use smaller \(k\) for easier syndromes
- Stop when confidence is high enough
This balances complexity and accuracy dynamically.
Average complexity can be much lower than worst case:
- Most syndromes are easy (BP succeeds)
- Only hard syndromes need large \(k\)
- Adaptive OSD can be 10x faster on average
Critical for real-time decoding applications.
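A sketch of this adaptive escalation, assuming the BPDecoder and OSDPostProcessor classes defined in the next section (the order schedule is illustrative):

import numpy as np

def adaptive_bp_osd(H, syndrome, llrs, orders=(5, 10, 15)):
    """Escalate the OSD order only on hard syndromes (sketch)."""
    bp = BPDecoder(H)
    estimate, _, converged = bp.decode(syndrome)
    if converged:
        return estimate            # easy syndrome: BP alone suffices
    for k in orders:               # hard syndrome: escalate the OSD order
        osd = OSDPostProcessor(H, order=k)
        estimate = osd.post_process(syndrome, estimate, llrs)
        if np.all((H @ estimate) % 2 == syndrome):
            return estimate        # stop once a valid correction is found
    return estimate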
8.10.3 Code Example: BP+OSD Decoder
import numpy as np
from typing import Tuple

class BPDecoder:
    """Hard-decision (Gallager-style) syndrome BP decoder for LDPC codes."""

    def __init__(self, H, max_iter=50):
        self.H = np.asarray(H, dtype=np.int8)
        self.n = self.H.shape[1]
        self.n_check = self.H.shape[0]
        self.max_iter = max_iter
        # Pre-compute Tanner graph adjacency
        self.check_to_var = [[] for _ in range(self.n_check)]
        self.var_to_check = [[] for _ in range(self.n)]
        for j in range(self.n_check):
            for i in range(self.n):
                if self.H[j, i] == 1:
                    self.check_to_var[j].append(i)
                    self.var_to_check[i].append(j)

    def decode(self, syndrome) -> Tuple[np.ndarray, int, bool]:
        """Decode a syndrome; returns (estimate, iterations, converged)."""
        # Variable-to-check messages, initialized to 0 ("no error")
        m_vc = np.zeros((self.n, self.n_check), dtype=np.int8)
        for iteration in range(self.max_iter):
            # Check-to-variable: syndrome bit XOR the other incoming messages
            m_cv = np.zeros((self.n_check, self.n), dtype=np.int8)
            for j in range(self.n_check):
                for i in self.check_to_var[j]:
                    others = [m_vc[k, j] for k in self.check_to_var[j] if k != i]
                    m_cv[j, i] = (syndrome[j] + sum(others)) % 2
            # Variable-to-check: abandon the "no error" prior only when
            # every other neighboring check votes for an error
            new_m_vc = np.zeros_like(m_vc)
            for i in range(self.n):
                for j in self.var_to_check[i]:
                    others = [m_cv[k, i] for k in self.var_to_check[i] if k != j]
                    if others and all(v == 1 for v in others):
                        new_m_vc[i, j] = 1
            m_vc = new_m_vc
            # Tentative hard decision: majority vote of incoming check messages
            decision = np.zeros(self.n, dtype=np.int8)
            for i in range(self.n):
                votes = [m_cv[j, i] for j in self.var_to_check[i]]
                decision[i] = 1 if 2 * sum(votes) > len(votes) else 0
            if np.all((self.H @ decision) % 2 == syndrome):
                return decision, iteration + 1, True
        # Not converged: fall back to the all-zero estimate and let
        # post-processing rebuild a correction from the reliabilities
        return np.zeros(self.n, dtype=np.int8), self.max_iter, False
class OSDPostProcessor:
    """OSD post-processor for BP failures (simplified order-k search)."""

    def __init__(self, H, order=10):
        self.H = np.asarray(H, dtype=np.int8)
        self.order = order
        self.n = self.H.shape[1]

    def post_process(self, syndrome, bp_estimate, llrs) -> np.ndarray:
        """
        Apply OSD to a BP estimate.

        Note: full OSD solves for the reliable positions by Gaussian
        elimination over GF(2) (see Section 8.9.1); this simplified
        variant keeps BP's values on the reliable positions and only
        enumerates the unreliable ones.

        Args:
            syndrome: Binary syndrome
            bp_estimate: BP decoder's error estimate
            llrs: Log-likelihood ratios, L_i = log P(e_i=1)/P(e_i=0)

        Returns:
            OSD-corrected estimate
        """
        # Find the `order` least reliable positions (smallest |LLR|)
        unreliable_idx = np.argsort(np.abs(llrs))[:self.order]

        # Test all 2^order assignments on the unreliable positions
        best_solution = None
        best_llr = float('-inf')
        for mask in range(1 << self.order):
            trial = bp_estimate.copy()
            for i, idx in enumerate(unreliable_idx):
                trial[idx] = (mask >> i) & 1
            # Keep only syndrome-consistent candidates
            if np.all((self.H @ trial) % 2 == syndrome):
                # Score by signed LLR sum (larger = more likely)
                trial_llr = sum(
                    llrs[i] if trial[i] == 1 else -llrs[i]
                    for i in range(self.n)
                )
                if trial_llr > best_llr:
                    best_llr = trial_llr
                    best_solution = trial
        return bp_estimate if best_solution is None else best_solution
def bp_osd_decode(H, syndrome, llrs, bp_max_iter=50, osd_order=10):
    """
    Combined BP+OSD decoding.

    Args:
        H: Parity-check matrix
        syndrome: Binary syndrome
        llrs: Log-likelihood ratios
        bp_max_iter: Maximum BP iterations
        osd_order: OSD search order

    Returns:
        Estimated error pattern
    """
    # Step 1: Run BP
    bp_decoder = BPDecoder(H, max_iter=bp_max_iter)
    bp_estimate, _, converged = bp_decoder.decode(syndrome)
    if converged:
        return bp_estimate
    # Step 2: Apply OSD post-processing only when BP fails
    osd = OSDPostProcessor(H, order=osd_order)
    return osd.post_process(syndrome, bp_estimate, llrs)
# Demonstration
if __name__ == "__main__":
    # The code, syndrome, and LLRs from the worked example above
    H = np.array([
        [1, 1, 1, 0, 0, 0],
        [1, 1, 0, 1, 0, 0],
        [0, 0, 1, 1, 1, 0],
        [0, 0, 0, 1, 1, 1],
    ], dtype=np.int8)
    syndrome = np.array([1, 0, 0, 0], dtype=np.int8)
    llrs = np.array([0.3, -0.8, 0.5, 0.2, -0.1, 0.4], dtype=np.float32)
    # (This simplified OSD may return a different syndrome-consistent
    # estimate than the full Gaussian-elimination OSD traced above.)
    result = bp_osd_decode(H, syndrome, llrs, osd_order=3)
    print(f"BP+OSD result: {result}")

This BP+OSD implementation demonstrates the key concepts:
- BP handles most decoding cases efficiently
- OSD post-processes only when BP fails
- The search order controls the complexity-accuracy trade-off
- For qLDPC codes, BP+OSD eliminates error floors from short cycles
| Decoder | Complexity | Performance | Best For |
|---|---|---|---|
| MWPM | High | Near-optimal | Surface code |
| Union-Find | Low | Near-MWPM | Large surface codes |
| BP | Low | Error floors | LDPC codes |
| BP+OSD | Medium | Near-ML | qLDPC codes |
Table: Comparison of decoding algorithms for quantum error correction codes.
The choice of decoder depends on the code and application:
- Surface code: MWPM or Union-Find
- Hypergraph product codes: BP+OSD
- Quantum Tanner codes: BP with careful design
Research continues on decoders optimized for specific codes.
Decoding remains an active research area. The goal is always to achieve the best error correction performance within the constraints of real-time operation on quantum hardware.
8.11 Key Takeaways
- MWPM decoder: Optimal for surface codes under independent errors, maps defects to minimum-weight perfect matching problem using Blossom algorithm
- Union-Find decoder: Near-linear complexity (\(O(n \alpha(n))\)) with comparable threshold to MWPM, essential for large-scale decoding
- BP for qLDPC: Extends classical BP to quantum codes, struggles with short cycles and degeneracy in quantum LDPC codes
- BP+OSD: Combines BP’s speed with OSD’s accuracy, eliminates error floors by enumerating small error patterns
- Decoder selection: MWPM/Union-Find for surface codes, BP+OSD for qLDPC codes, GNN for scalable learning-based approaches
- Real-time constraints: A 1 MHz cycle rate requires decoding in under 1 \(\mu\)s per round, favoring Union-Find and neural decoders over MWPM
8.12 Bridge to Chapter 7
This chapter focused on classical decoding algorithms adapted for quantum codes. Chapter 7 explores a fundamentally different approach: using neural networks to learn decoders from data. Neural decoders can exploit patterns in errors that traditional algorithms miss, and with GPU acceleration, they can achieve the throughput needed for million-qubit machines.
We will see how GNN architectures respect the code’s graph structure while scaling to thousands of qubits, and how hybrid approaches combine neural and classical decoders for optimal performance.
8.13 Exercises
MWPM Decoding. Consider minimum-weight perfect matching for surface code decoding.
Explain how syndrome defects (violated stabilizers) are represented as nodes in the matching graph.
Why must the matching be “perfect”? What does this correspond to physically?
If there are \(s\) syndrome defects, what is the time complexity of MWPM using Edmonds’ algorithm? Why is this a concern for real-time decoding?
Union-Find Decoder. The Union-Find decoder offers near-linear complexity.
Describe the basic idea of growing clusters from syndrome defects until they merge.
What is the time complexity of Union-Find decoding? How does this compare to MWPM?
Union-Find is slightly suboptimal compared to MWPM. In what regime (low/high error rate) is this difference most significant?
BP+OSD for qLDPC. Belief Propagation with Ordered Statistics Decoding is used for quantum LDPC codes.
Why does standard BP often fail on quantum codes with short cycles?
Explain the role of OSD as a post-processing step when BP fails to converge.
What is the trade-off between OSD order and decoding complexity? How does this affect the choice of OSD order in practice?