
QUBO Formulation

Encoding Real Problems as Energy Minimization

The Art of QUBO Construction

Building a QUBO is the hardest part of quantum optimization. The solver is generic — give it any QUBO matrix and it'll find low-energy states. Your job is to construct a matrix where low-energy states correspond to good solutions for your specific problem.

Every QUBO has three components:

  • Objective terms — reward what you want (e.g., informative features)
  • Penalty terms — discourage what you don't want (e.g., redundant features)
  • Constraint terms — enforce requirements (e.g., select exactly K features)

Feature Selection QUBO

    Problem: Given N features and a target variable, select the best K features for classification.

    Term 1: Relevance (reward informative features)

    Mutual Information (MI) measures how much knowing feature i tells you about the target. High MI = valuable feature.

    E_relevance = -alpha * SUM_i [ MI(feature_i, target) * x_i ]

    This adds a negative value to the diagonal: Q[i,i] -= alpha * MI_i. Features with high MI get more negative diagonals, making them cheaper to select.
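
    As a concrete sketch (not the course's code): scikit-learn's mutual_info_classif is one way to estimate MI. The toy data and the value of alpha below are illustrative.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import mutual_info_classif

    # Toy data for illustration: 200 samples, N = 8 candidate features.
    X, y = make_classification(n_samples=200, n_features=8, random_state=0)
    N = X.shape[1]
    alpha = 1.0                                      # relevance weight (illustrative)

    Q = np.zeros((N, N))
    mi = mutual_info_classif(X, y, random_state=0)   # estimated MI(feature_i, target)
    Q[np.diag_indices(N)] -= alpha * mi              # high-MI features: cheaper to select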

    Term 2: Redundancy (penalize correlated pairs)

    If two features are highly correlated, they carry the same information. Selecting both wastes a slot.

    E_redundancy = beta * SUM_{i<j} [ |corr(feature_i, feature_j)| * x_i * x_j ]

    This adds positive values to off-diagonals: Q[i,j] += beta * |corr_ij|. Correlated pairs become expensive to select together.
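
    Continuing the sketch above, the off-diagonals can be filled from the absolute feature correlations (beta is again an illustrative value):

    beta = 0.5                                       # redundancy weight (illustrative)
    corr = np.abs(np.corrcoef(X, rowvar=False))      # |corr| between feature columns
    for i in range(N):
        for j in range(i + 1, N):
            Q[i, j] += beta * corr[i, j]             # correlated pairs cost more together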

    Term 3: Cardinality (select exactly K)

    We want exactly K features. The soft constraint (sum - K)^2 penalizes deviation:

    E_cardinality = gamma * (SUM_i[x_i] - K)^2

    Expanding this quadratic:

    = gamma * (SUM_i[x_i])^2 - 2*gamma*K * SUM_i[x_i] + gamma*K^2
    = gamma * SUM_i[x_i^2] + gamma * SUM_{i≠j}[x_i * x_j] - 2*gamma*K * SUM_i[x_i] + const

    Since x_i is binary, x_i^2 = x_i. So:

  • Diagonal contribution: Q[i,i] += gamma * (1 - 2K)
  • Off-diagonal contribution: Q[i,j] += 2 * gamma
  • The constant term (gamma*K^2) doesn't affect which solution is optimal, so we ignore it.
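
    A quick brute-force check (small N, illustrative values) confirms that these coefficients reproduce gamma * (SUM_i[x_i] - K)^2 once the dropped constant gamma*K^2 is added back:

    import itertools
    import numpy as np

    N, K, gamma = 4, 2, 2.0
    Qc = np.zeros((N, N))
    Qc[np.diag_indices(N)] += gamma * (1 - 2 * K)    # diagonal contribution
    Qc[np.triu_indices(N, k=1)] += 2 * gamma         # off-diagonal contribution

    for bits in itertools.product([0, 1], repeat=N):
        x = np.array(bits)
        direct = gamma * (x.sum() - K) ** 2          # soft constraint, evaluated directly
        via_qubo = x @ Qc @ x + gamma * K ** 2       # QUBO energy + dropped constant
        assert np.isclose(direct, via_qubo)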

    Combined QUBO

    Q[i,i] = -alpha * MI_i + gamma * (1 - 2K)           # diagonal
    Q[i,j] = beta * |corr_ij| + 2 * gamma                # off-diagonal (i < j)

    Three hyperparameters control the balance:

  • alpha (relevance weight): higher = stronger pull toward high-MI features
  • beta (redundancy weight): higher = stronger repulsion between correlated features
  • gamma (cardinality weight): higher = stricter enforcement of selecting exactly K features
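
    Putting the three terms together, a minimal builder might look like this (the function name and defaults are mine, not the course's):

    import numpy as np
    from sklearn.feature_selection import mutual_info_classif

    def build_feature_selection_qubo(X, y, K, alpha=1.0, beta=0.5, gamma=2.0):
        """Assemble the upper-triangular Q from the three terms above (a sketch)."""
        N = X.shape[1]
        mi = mutual_info_classif(X, y, random_state=0)
        corr = np.abs(np.corrcoef(X, rowvar=False))
        Q = np.zeros((N, N))
        for i in range(N):
            Q[i, i] = -alpha * mi[i] + gamma * (1 - 2 * K)   # diagonal
            for j in range(i + 1, N):
                Q[i, j] = beta * corr[i, j] + 2 * gamma      # off-diagonal (i < j)
        return Q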

Graph Partitioning QUBO

    Problem: Split N nodes into two balanced communities, maximizing intra-community edges.

    Term 1: Edge Reward

    For each edge (i,j) with weight w, we want both nodes in the same community. If x_i = x_j (both 0 or both 1), we reward it:

    Same community:  [x_i == x_j]  =  1 - x_i - x_j + 2*x_i*x_j    (equals 1 when x_i = x_j, else 0)

    So: E_edge = -w * (1 - x_i - x_j + 2*x_i*x_j)

  • Diagonal: Q[i,i] += w, Q[j,j] += w
  • Off-diagonal: Q[i,j] += -2w
  • The constant term (-w per edge) doesn't affect which solution is optimal, so we ignore it.

    Dense edges get strong negative off-diagonals: the solver prefers to keep those nodes together.

    Term 2: Balance Constraint

    Without a balance constraint, the solver would put all nodes in one community (trivially maximizes intra-community edges). The constraint (SUM_i[x_i] - N/2)^2 penalizes imbalanced splits.

    Same expansion as the cardinality constraint above, with K = N/2.
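
    A sketch of the full partition QUBO, taking edges as (i, j, w) triples (names and defaults are illustrative):

    import numpy as np

    def build_partition_qubo(edges, N, penalty=1.0):
        """Balanced two-way partition QUBO; edges is a list of (i, j, w) triples."""
        Q = np.zeros((N, N))
        # Term 1: edge reward, from expanding -w * (1 - x_i - x_j + 2*x_i*x_j).
        for i, j, w in edges:
            Q[i, i] += w
            Q[j, j] += w
            Q[min(i, j), max(i, j)] += -2 * w
        # Term 2: balance, the cardinality expansion with K = N/2.
        K = N / 2
        Q[np.diag_indices(N)] += penalty * (1 - 2 * K)
        Q[np.triu_indices(N, k=1)] += 2 * penalty
        return Q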

    The Tradeoff

    Balance penalty vs edge density creates an interesting tradeoff. High penalty → perfectly balanced but may cut important edges. Low penalty → preserves communities but allows imbalance.

    Weight Selection

    Choosing alpha, beta, gamma (or penalty) is problem-dependent. Some heuristics:

  • Start with alpha=1, beta=0.5, gamma=2 for feature selection
  • Start with penalty=1 for graph partitioning
  • If too few features selected → increase gamma or increase alpha
  • If correlated features both selected → increase beta
  • If graph partition is imbalanced → increase penalty
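
    For small instances you can check a weight setting directly: enumerate every bitstring, take the lowest-energy one, and see how many variables it selects. A throwaway helper for that (mine, not the course's):

    import itertools
    import numpy as np

    def brute_force_minimum(Q):
        """Exhaustively find the lowest-energy bitstring (feasible only for small N)."""
        N = Q.shape[0]
        best_x, best_e = None, np.inf
        for bits in itertools.product([0, 1], repeat=N):
            x = np.array(bits)
            e = x @ Q @ x
            if e < best_e:
                best_x, best_e = x, e
        return best_x, best_e

    # e.g. if brute_force_minimum(Q)[0].sum() != K, adjust gamma per the heuristics above.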

    The sensitivity analysis in Module 4 will show how these weights affect solution quality.

    Key Insight

    The QUBO framework is general-purpose. Any problem you can express as "select a subset of binary variables to minimize a quadratic function" can be encoded. Other examples:

  • Traveling salesman: which edges to include in the tour
  • Job scheduling: which time slot for each job
  • Portfolio optimization: which stocks to include
  • Protein folding: which configuration minimizes free energy

    The QUBO is the bridge between your domain problem and the quantum (or classical) solver.

    This is chapter 2 of Quantum Optimization for AI.

    Get the full hands-on course for $100 and build the complete system. Your projects become your portfolio.
