QUBO Formulation
Encoding Real Problems as Energy Minimization
The Art of QUBO Construction
Building a QUBO is the hardest part of quantum optimization. The solver is generic — give it any QUBO matrix and it'll find low-energy states. Your job is to construct a matrix where low-energy states correspond to good solutions for your specific problem.
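What "find low-energy states" means can be made concrete with a few lines of code. Below is a minimal sketch (assuming NumPy; the function names `qubo_energy` and `brute_force_min` are illustrative, not part of any solver library) that evaluates the QUBO energy E = x^T Q x and exhaustively finds the lowest-energy bitstring, which is only feasible for small N:

```python
import itertools
import numpy as np

def qubo_energy(Q, x):
    """Energy of binary vector x under QUBO matrix Q: E = x^T Q x."""
    x = np.asarray(x)
    return float(x @ Q @ x)

def brute_force_min(Q):
    """Exhaustively find the lowest-energy bitstring (2^N evaluations)."""
    n = Q.shape[0]
    best_x, best_e = None, np.inf
    for bits in itertools.product([0, 1], repeat=n):
        e = qubo_energy(Q, bits)
        if e < best_e:
            best_x, best_e = bits, e
    return best_x, best_e

# Toy QUBO: negative diagonals reward selecting each variable alone,
# a positive off-diagonal makes selecting both together expensive.
Q = np.array([[-2.0, 3.0],
              [0.0, -1.0]])
print(brute_force_min(Q))  # → ((1, 0), -2.0)
```

Real solvers replace the exhaustive loop with annealing or heuristics, but the energy function they minimize is exactly this one.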
Every QUBO has three components: diagonal entries Q[i,i] (linear terms, the cost or reward of selecting variable i on its own), off-diagonal entries Q[i,j] (pairwise interactions between variables), and a constant offset that shifts every energy equally without changing which solution is optimal. The two worked examples below show how each component gets filled in.
Feature Selection QUBO
Problem: Given N features and a target variable, select the best K features for classification.
Term 1: Relevance (reward informative features)
Mutual Information (MI) measures how much knowing feature i tells you about the target. High MI = valuable feature.
E_relevance = -alpha * SUM_i [ MI(feature_i, target) * x_i ]

This adds a negative value to the diagonal: Q[i,i] -= alpha * MI_i. Features with high MI get more negative diagonals, making them cheaper to select.
Term 2: Redundancy (penalize correlated pairs)
If two features are highly correlated, they carry the same information. Selecting both wastes a slot.
E_redundancy = beta * SUM_{i<j} [ |corr(feature_i, feature_j)| * x_i * x_j ]

This adds positive values to off-diagonals: Q[i,j] += beta * |corr_ij|. Correlated pairs become expensive to select together.
Term 3: Cardinality (select exactly K)
We want exactly K features. The soft constraint (sum - K)^2 penalizes deviation:
E_cardinality = gamma * (SUM_i[x_i] - K)^2

Expanding this quadratic:

= gamma * (SUM_i[x_i])^2 - 2*gamma*K * SUM_i[x_i] + gamma*K^2
= gamma * SUM_i[x_i^2] + gamma * SUM_{i≠j}[x_i * x_j] - 2*gamma*K * SUM_i[x_i] + const

Since x_i is binary, x_i^2 = x_i. So:

= gamma * (1 - 2K) * SUM_i[x_i] + gamma * SUM_{i≠j}[x_i * x_j] + const
The constant term (gamma*K^2) doesn't affect which solution is optimal, so we ignore it.
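The expansion is easy to check numerically. A small self-contained sketch (pure Python; the function names are illustrative) confirming that the expanded polynomial agrees with (sum - K)^2 on every bitstring:

```python
import itertools

def penalty_direct(x, K, gamma=1.0):
    """The constraint in its original form: gamma * (sum_i x_i - K)^2."""
    return gamma * (sum(x) - K) ** 2

def penalty_expanded(x, K, gamma=1.0):
    """The expanded form: gamma*(1-2K)*sum_i x_i + gamma*sum_{i!=j} x_i x_j + gamma*K^2."""
    n = len(x)
    linear = gamma * (1 - 2 * K) * sum(x)
    quad = gamma * sum(x[i] * x[j] for i in range(n) for j in range(n) if i != j)
    return linear + quad + gamma * K ** 2

# The two forms agree on all 16 bitstrings of length 4 with K = 2.
for bits in itertools.product([0, 1], repeat=4):
    assert penalty_direct(bits, 2) == penalty_expanded(bits, 2)
```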
Combined QUBO
Q[i,i] = -alpha * MI_i + gamma * (1 - 2K)    # diagonal
Q[i,j] = beta * |corr_ij| + 2 * gamma        # off-diagonal (i < j)

Three hyperparameters control the balance: alpha sets how strongly relevance (high MI) is rewarded, beta sets how strongly redundant (correlated) pairs are penalized, and gamma sets how strictly the select-exactly-K constraint is enforced.
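Putting the combined formulas into code is mechanical. A minimal sketch (assuming NumPy; `feature_selection_qubo` is an illustrative name, and the MI scores and correlation matrix are made-up numbers standing in for values you would compute from your dataset):

```python
import itertools
import numpy as np

def feature_selection_qubo(mi, corr, K, alpha=1.0, beta=1.0, gamma=2.0):
    """Assemble the upper-triangular QUBO for selecting exactly K features.

    mi   : length-N array of mutual information scores MI(feature_i, target)
    corr : N x N matrix of pairwise feature correlations
    """
    n = len(mi)
    Q = np.zeros((n, n))
    for i in range(n):
        Q[i, i] = -alpha * mi[i] + gamma * (1 - 2 * K)    # relevance + cardinality
        for j in range(i + 1, n):
            Q[i, j] = beta * abs(corr[i, j]) + 2 * gamma  # redundancy + cardinality
    return Q

# Illustrative numbers: features 0 and 1 are informative but nearly duplicates.
mi = np.array([0.8, 0.7, 0.1, 0.6])
corr = np.eye(4)
corr[0, 1] = corr[1, 0] = 0.9
Q = feature_selection_qubo(mi, corr, K=2)

best = min(itertools.product([0, 1], repeat=4),
           key=lambda x: float(np.array(x) @ Q @ np.array(x)))
print(best)  # → (1, 0, 0, 1)
```

With these numbers the minimum-energy state selects features 0 and 3: the redundancy term makes the highly correlated pair {0, 1} more expensive than pairing feature 0 with the slightly weaker but independent feature 3.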
Graph Partitioning QUBO
Problem: Split N nodes into two balanced communities, maximizing intra-community edges.
Term 1: Edge Reward
For each edge (i,j) with weight w, we want both nodes in the same community. If x_i = x_j (both 0 or both 1), we reward it:
Same community: [x_i == x_j] = 1 - x_i - x_j + 2*x_i*x_j

So: E_edge = -w * (1 - x_i - x_j + 2*x_i*x_j)
High-weight edges get strong negative off-diagonals: the solver prefers to keep those nodes together.
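The same-community identity and the resulting matrix entries can be verified directly. A short sketch (pure Python; `edge_terms` is an illustrative helper name) checking the indicator on all four bit combinations and reading off the QUBO contributions of one edge:

```python
import itertools

# The same-community indicator is 1 exactly when x_i == x_j:
for xi, xj in itertools.product([0, 1], repeat=2):
    assert 1 - xi - xj + 2 * xi * xj == int(xi == xj)

def edge_terms(w):
    """QUBO contributions of E_edge = -w*(1 - xi - xj + 2*xi*xj):
    +w on each endpoint's diagonal, -2w on the off-diagonal,
    and a constant -w that can be dropped."""
    return w, -2 * w, -w

print(edge_terms(1.0))  # → (1.0, -2.0, -1.0)
```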
Term 2: Balance Constraint
Without a balance constraint, the solver would put all nodes in one community (trivially maximizes intra-community edges). The constraint (SUM_i[x_i] - N/2)^2 penalizes imbalanced splits.
Same expansion as the cardinality constraint above, with K = N/2.
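Both terms together fit in a short builder. A minimal sketch (assuming NumPy; `partition_qubo` and the toy graph are illustrative, not from any library) that assembles the full partitioning QUBO and brute-forces a six-node graph made of two triangles joined by one weak bridge edge:

```python
import itertools
import numpy as np

def partition_qubo(n, edges, penalty=1.0):
    """QUBO for balanced 2-way partitioning.

    edges   : list of (i, j, w) undirected weighted edges
    penalty : weight on the (sum_i x_i - n/2)^2 balance term
    """
    Q = np.zeros((n, n))
    # Edge reward: E_edge = -w*(1 - xi - xj + 2*xi*xj), constants dropped.
    for i, j, w in edges:
        i, j = min(i, j), max(i, j)
        Q[i, i] += w
        Q[j, j] += w
        Q[i, j] += -2 * w
    # Balance: same expansion as the cardinality constraint, K = n/2.
    K = n / 2
    for i in range(n):
        Q[i, i] += penalty * (1 - 2 * K)
        for j in range(i + 1, n):
            Q[i, j] += 2 * penalty
    return Q

# Two triangles connected by one weak bridge edge (2, 3).
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 0.1)]
Q = partition_qubo(6, edges, penalty=1.0)
best = min(itertools.product([0, 1], repeat=6),
           key=lambda x: float(np.array(x) @ Q @ np.array(x)))
print(best)  # expect one triangle per community, cutting only the bridge
```

The minimum-energy states put each triangle in its own community (the labeling of the two communities is arbitrary, so the solution and its complement are degenerate).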
The Tradeoff
Balance penalty vs edge density creates an interesting tradeoff. High penalty → perfectly balanced but may cut important edges. Low penalty → preserves communities but allows imbalance.
Weight Selection
Choosing alpha, beta, gamma (or the balance penalty) is problem-dependent. A common starting point is to make the constraint weight comfortably larger than the largest objective term, so that violating the constraint is never worth the objective gain, and then tune downward if the hard penalty distorts solution quality.
The sensitivity analysis in Module 4 will show how these weights affect solution quality.
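The starting-point rule above can be sketched in code. This is an illustrative heuristic, not a method specified in the text: `heuristic_gamma` (a hypothetical name) upper-bounds the single-variable objective terms and scales the bound by a safety margin, so that dropping or adding one feature can never pay for a constraint violation:

```python
import numpy as np

def heuristic_gamma(mi, corr, alpha=1.0, beta=1.0, margin=1.5):
    """Rule-of-thumb starting value for the cardinality penalty.

    The largest energy a single extra feature could save is roughly
    alpha * max(MI); the beta term adds slack for the redundancy
    interactions it might also touch. Tune downward from here.
    """
    off_diag = np.abs(corr)[~np.eye(len(mi), dtype=bool)]
    max_gain = alpha * float(np.max(mi)) + beta * float(off_diag.max())
    return margin * max_gain

mi = np.array([0.5, 0.2])
corr = np.array([[1.0, 0.3],
                 [0.3, 1.0]])
print(heuristic_gamma(mi, corr))  # → 1.2000000000000002
```

Treat the result as a first guess to be refined empirically, not a guarantee.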
Key Insight
The QUBO framework is general-purpose. Any problem you can express as "select a subset of binary variables to minimize a quadratic function" can be encoded. Classic examples include max-cut, number partitioning, portfolio selection, and (with more encoding effort) routing problems such as the traveling salesman problem.
The QUBO is the bridge between your domain problem and the quantum (or classical) solver.
This is chapter 2 of Quantum Optimization for AI.