
QUBO Formulation

Encoding Real Problems as Energy Minimization

The Art of QUBO Construction

Building a QUBO is the hardest part of quantum optimization. The solver is generic — give it any QUBO matrix and it'll find low-energy states. Your job is to construct a matrix where low-energy states correspond to good solutions for your specific problem.

Every QUBO has three components:

  • Objective terms — reward what you want (e.g., informative features)
  • Penalty terms — discourage what you don't want (e.g., redundant features)
  • Constraint terms — enforce requirements (e.g., select exactly K features)

Feature Selection QUBO

    Problem: Given N features and a target variable, select the best K features for classification.

    Term 1: Relevance (reward informative features)

    Mutual Information (MI) measures how much knowing feature i tells you about the target. High MI = valuable feature.

    E_relevance = -alpha * SUM_i [ MI(feature_i, target) * x_i ]

    This adds a negative value to the diagonal: Q[i,i] -= alpha * MI_i. Features with high MI get more negative diagonals, making them cheaper to select.
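
    As a concrete sketch (not the course's code): scikit-learn's mutual_info_classif is one way to estimate MI. The toy data and the value of alpha below are illustrative.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import mutual_info_classif

    # Toy data for illustration: 200 samples, N = 8 candidate features.
    X, y = make_classification(n_samples=200, n_features=8, random_state=0)
    N = X.shape[1]
    alpha = 1.0                                      # relevance weight (illustrative)

    Q = np.zeros((N, N))
    mi = mutual_info_classif(X, y, random_state=0)   # estimated MI(feature_i, target)
    Q[np.diag_indices(N)] -= alpha * mi              # high-MI features: cheaper to select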

    Term 2: Redundancy (penalize correlated pairs)

    If two features are highly correlated, they carry the same information. Selecting both wastes a slot.

    E_redundancy = beta * SUM_{i<j} [ |corr(feature_i, feature_j)| * x_i * x_j ]

    This adds positive values to off-diagonals: Q[i,j] += beta * |corr_ij|. Correlated pairs become expensive to select together.
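
    Continuing the sketch above, the off-diagonals can be filled from the absolute feature correlations (beta is again an illustrative value):

    beta = 0.5                                       # redundancy weight (illustrative)
    corr = np.abs(np.corrcoef(X, rowvar=False))      # |corr| between feature columns
    for i in range(N):
        for j in range(i + 1, N):
            Q[i, j] += beta * corr[i, j]             # correlated pairs cost more together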

    Term 3: Cardinality (select exactly K)

    We want exactly K features. The soft constraint (sum - K)^2 penalizes deviation:

    E_cardinality = gamma * (SUM_i[x_i] - K)^2

    Expanding this quadratic:

    = gamma * (SUM_i[x_i])^2 - 2*gamma*K * SUM_i[x_i] + gamma*K^2
    = gamma * SUM_i[x_i^2] + gamma * SUM_{i≠j}[x_i * x_j] - 2*gamma*K * SUM_i[x_i] + const

    Since x_i is binary, x_i^2 = x_i. So:

  • Diagonal contribution: Q[i,i] += gamma * (1 - 2K)
  • Off-diagonal contribution: Q[i,j] += 2 * gamma
  • The constant term (gamma*K^2) doesn't affect which solution is optimal, so we ignore it.
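
    A quick brute-force check (small N, illustrative values) confirms that these coefficients reproduce gamma * (SUM_i[x_i] - K)^2 once the dropped constant gamma*K^2 is added back:

    import itertools
    import numpy as np

    N, K, gamma = 4, 2, 2.0
    Qc = np.zeros((N, N))
    Qc[np.diag_indices(N)] += gamma * (1 - 2 * K)    # diagonal contribution
    Qc[np.triu_indices(N, k=1)] += 2 * gamma         # off-diagonal contribution

    for bits in itertools.product([0, 1], repeat=N):
        x = np.array(bits)
        direct = gamma * (x.sum() - K) ** 2          # soft constraint, evaluated directly
        via_qubo = x @ Qc @ x + gamma * K ** 2       # QUBO energy + dropped constant
        assert np.isclose(direct, via_qubo)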

    Combined QUBO

    Q[i,i] = -alpha * MI_i + gamma * (1 - 2K)           # diagonal
    Q[i,j] = beta * |corr_ij| + 2 * gamma                # off-diagonal (i < j)

    Three hyperparameters control the balance:

  • alpha (relevance weight): higher = stronger pull toward high-MI features
  • beta (redundancy weight): higher = stronger repulsion between correlated features
  • gamma (cardinality weight): higher = stricter enforcement of selecting exactly K features
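
    Putting the three terms together, a minimal builder might look like this (the function name and defaults are mine, not the course's):

    import numpy as np
    from sklearn.feature_selection import mutual_info_classif

    def build_feature_selection_qubo(X, y, K, alpha=1.0, beta=0.5, gamma=2.0):
        """Assemble the upper-triangular Q from the three terms above (a sketch)."""
        N = X.shape[1]
        mi = mutual_info_classif(X, y, random_state=0)
        corr = np.abs(np.corrcoef(X, rowvar=False))
        Q = np.zeros((N, N))
        for i in range(N):
            Q[i, i] = -alpha * mi[i] + gamma * (1 - 2 * K)   # diagonal
            for j in range(i + 1, N):
                Q[i, j] = beta * corr[i, j] + 2 * gamma      # off-diagonal (i < j)
        return Q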

Graph Partitioning QUBO

    Problem: Split N nodes into two balanced communities, maximizing intra-community edges.

    Term 1: Edge Reward

    For each edge (i,j) with weight w, we want both nodes in the same community. If x_i = x_j (both 0 or both 1), we reward it:

    Same community:  [x_i == x_j]  =  1 - x_i - x_j + 2*x_i*x_j    (equals 1 when x_i = x_j, else 0)

    So: E_edge = -w * (1 - x_i - x_j + 2*x_i*x_j)

  • Diagonal: Q[i,i] += w, Q[j,j] += w
  • Off-diagonal: Q[i,j] += -2w
  • The constant term (-w per edge) doesn't affect which solution is optimal, so we ignore it.

    Dense edges get strong negative off-diagonals: the solver prefers to keep those nodes together.

    Term 2: Balance Constraint

    Without a balance constraint, the solver would put all nodes in one community (trivially maximizes intra-community edges). The constraint (SUM_i[x_i] - N/2)^2 penalizes imbalanced splits.

    Same expansion as the cardinality constraint above, with K = N/2.
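
    A sketch of the full partition QUBO, taking edges as (i, j, w) triples (names and defaults are illustrative):

    import numpy as np

    def build_partition_qubo(edges, N, penalty=1.0):
        """Balanced two-way partition QUBO; edges is a list of (i, j, w) triples."""
        Q = np.zeros((N, N))
        # Term 1: edge reward, from expanding -w * (1 - x_i - x_j + 2*x_i*x_j).
        for i, j, w in edges:
            Q[i, i] += w
            Q[j, j] += w
            Q[min(i, j), max(i, j)] += -2 * w
        # Term 2: balance, the cardinality expansion with K = N/2.
        K = N / 2
        Q[np.diag_indices(N)] += penalty * (1 - 2 * K)
        Q[np.triu_indices(N, k=1)] += 2 * penalty
        return Q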

    The Tradeoff

    Balance penalty vs edge density creates an interesting tradeoff. High penalty → perfectly balanced but may cut important edges. Low penalty → preserves communities but allows imbalance.

    Weight Selection

    Choosing alpha, beta, gamma (or penalty) is problem-dependent. Some heuristics:

  • Start with alpha=1, beta=0.5, gamma=2 for feature selection
  • Start with penalty=1 for graph partitioning
  • If too few features selected → increase gamma or increase alpha
  • If correlated features both selected → increase beta
  • If graph partition is imbalanced → increase penalty
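
    For small instances you can check a weight setting directly: enumerate every bitstring, take the lowest-energy one, and see how many variables it selects. A throwaway helper for that (mine, not the course's):

    import itertools
    import numpy as np

    def brute_force_minimum(Q):
        """Exhaustively find the lowest-energy bitstring (feasible only for small N)."""
        N = Q.shape[0]
        best_x, best_e = None, np.inf
        for bits in itertools.product([0, 1], repeat=N):
            x = np.array(bits)
            e = x @ Q @ x
            if e < best_e:
                best_x, best_e = x, e
        return best_x, best_e

    # e.g. if brute_force_minimum(Q)[0].sum() != K, adjust gamma per the heuristics above.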

    The sensitivity analysis in Module 4 will show how these weights affect solution quality.

    Key Insight

    The QUBO framework is general-purpose. Any problem you can express as "select a subset of binary variables to minimize a quadratic function" can be encoded. Other examples:

  • Traveling salesman: which edges to include in the tour
  • Job scheduling: which time slot for each job
  • Portfolio optimization: which stocks to include
  • Protein folding: which configuration minimizes free energy

    The QUBO is the bridge between your domain problem and the quantum (or classical) solver.

    This is chapter 2 of Quantum Optimization for AI.

    Get the full hands-on course for $100 and build the complete system. Your projects become your portfolio.
