Annealing Solver

Simulated Annealing & Solution Analysis

How Simulated Annealing Works

Simulated annealing (SA) is a metaheuristic inspired by the physical process of cooling metal. When metal is heated and slowly cooled (annealed), atoms settle into a low-energy crystalline structure. SA applies the same principle to optimization.

The Algorithm

1. Start with a random binary state x
2. Set temperature T = T_start (high)
3. Repeat for num_steps:
   a. Pick a random variable i
   b. Flip it: x_i = 1 - x_i
   c. Compute energy change: dE = E(new) - E(old)
   d. If dE <= 0: accept (it's an improvement)
   e. If dE > 0: accept with probability exp(-dE / T)
   f. Cool: T = T * cooling_rate
4. Return the final state and its energy
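The steps above can be sketched as a minimal Python annealer. This is a toy for clarity, not a production solver: it recomputes the full QUBO energy x^T Q x after every flip, where a real implementation would compute dE incrementally from row i of Q.

```python
import math
import random

def simulated_annealing(Q, num_steps=1000, T_start=10.0, T_end=0.01, seed=0):
    """Minimize x^T Q x over binary vectors x with geometric cooling."""
    rng = random.Random(seed)
    n = len(Q)
    x = [rng.randint(0, 1) for _ in range(n)]       # 1. random start

    def energy(state):
        return sum(Q[i][j] * state[i] * state[j]
                   for i in range(n) for j in range(n))

    E = energy(x)
    T = T_start                                     # 2. start hot
    cooling_rate = (T_end / T_start) ** (1.0 / num_steps)
    for _ in range(num_steps):                      # 3. anneal
        i = rng.randrange(n)                        # 3a. pick a variable
        x[i] = 1 - x[i]                             # 3b. flip it
        dE = energy(x) - E                          # 3c. energy change
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            E += dE                                 # 3d/3e. accept
        else:
            x[i] = 1 - x[i]                         # reject: undo the flip
        T *= cooling_rate                           # 3f. cool
    return x, E                                     # 4. final state, energy

# Tiny 2-variable example: the off-diagonal +2 penalizes picking both
Q = [[-1, 2], [0, -1]]
state, energy_found = simulated_annealing(Q)
print(state, energy_found)  # exactly one variable selected, energy -1
```

At the final temperature only improving flips are accepted, so the returned state is a local minimum of the QUBO.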

Why It Works

The acceptance probability exp(-dE / T) is the key insight:

  • High temperature: exp(-dE / T) ≈ 1 for moderate dE. Almost any move is accepted. The solver explores the landscape freely, jumping over barriers.
  • Low temperature: exp(-dE / T) ≈ 0 for positive dE. Only improvements are accepted. The solver exploits the current region, settling into the nearest minimum.
  • Goldilocks zone: During cooling, the solver transitions from exploration to exploitation. If the schedule is right, it discovers good regions early and refines them late.
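To make the two regimes concrete, here is the acceptance probability for a moderately uphill move (dE = 1) at the two ends of a typical schedule:

```python
import math

dE = 1.0  # a moderately uphill move

print(math.exp(-dE / 10.0))   # high T = 10:   ~0.90, almost always accepted
print(math.exp(-dE / 0.01))   # low T = 0.01:  ~4e-44, effectively never accepted
```

The same move goes from near-certain acceptance to effectively impossible as the solver cools, which is exactly the exploration-to-exploitation handoff.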

     Temperature Schedule

    The geometric cooling schedule multiplies temperature by a constant factor each step:

    cooling_rate = (T_end / T_start) ^ (1 / num_steps)
    T_next = T_current * cooling_rate

    This produces exponential decay: fast cooling at high temperatures (where there's less to gain from exploration) and slow cooling at low temperatures (where precision matters).
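A quick sanity check of the schedule with the parameters below: after exactly num_steps multiplications, the temperature lands on T_end up to floating-point error.

```python
T_start, T_end, num_steps = 10.0, 0.01, 1000
cooling_rate = (T_end / T_start) ** (1.0 / num_steps)  # ~0.9931 per step

T = T_start
for _ in range(num_steps):
    T *= cooling_rate

print(cooling_rate, T)  # T ends at 0.01, up to floating-point error
```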

    Key parameters:

  • T_start = 10: High enough to accept most moves initially
  • T_end = 0.01: Low enough to settle into a minimum
  • num_steps = 1000: More steps = finer cooling = better solutions (with diminishing returns)
  • num_reads = 100: Independent runs from different random starts

     Multiple Reads

    SA is stochastic — different starting points lead to different solutions. Running multiple independent chains (num_reads) and keeping the best increases your chance of finding the global optimum.

    The energy distribution across reads tells you about the landscape:

  • Tight distribution (all reads find similar energy): the landscape has few minima, SA converges reliably
  • Wide distribution (reads find very different energies): the landscape is rugged, more reads help
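The multi-read pattern can be sketched by reusing a bare-bones annealer on a toy 3-variable QUBO (the numbers are hypothetical, chosen only for illustration); in practice you would pass num_reads to your solver instead of looping by hand:

```python
import math
import random

def anneal_once(Q, seed, num_steps=500, T_start=10.0, T_end=0.01):
    """One independent SA read; returns the final energy."""
    rng = random.Random(seed)
    n = len(Q)
    x = [rng.randint(0, 1) for _ in range(n)]

    def energy():
        return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

    E = energy()
    T = T_start
    rate = (T_end / T_start) ** (1.0 / num_steps)
    for _ in range(num_steps):
        i = rng.randrange(n)
        x[i] = 1 - x[i]                 # propose a flip
        dE = energy() - E
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            E += dE                     # accept
        else:
            x[i] = 1 - x[i]             # reject: undo
        T *= rate
    return E

Q = [[-2, 1, 0], [0, -2, 1], [0, 0, -2]]              # toy QUBO
energies = [anneal_once(Q, seed) for seed in range(100)]  # num_reads = 100
best, worst = min(energies), max(energies)
print(f"best={best}, worst={worst}, spread={worst - best}")
```

A tight spread here means the reads agree; on a rugged landscape the spread widens, and that is your signal to add reads.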

     Simulated vs Quantum Annealing

     Property            | Simulated Annealing                             | Quantum Annealing
     --------------------|-------------------------------------------------|------------------------------
     Escape mechanism    | Thermal fluctuation                             | Quantum tunneling
     Hardware            | Any CPU                                         | D-Wave quantum processor
     Problem size        | No hard limit                                   | ~5000 qubits (with embedding)
     Speed               | O(num_steps × num_reads)                        | Fixed anneal time (~20 µs)
     Guarantee           | Converges to optimal as T → 0 (infinitely slow) | No formal guarantee
     Practical advantage | Well understood, reliable                       | Debated; active research

    The honest assessment: for problems under ~100 variables, well-tuned SA is competitive with or better than current quantum hardware. The QUBO framework is the valuable part — it works with both backends.

    Analyzing Solutions

    After running the solver, you should check:

  • Best energy: Is it significantly better than random? Than greedy?
  • Selected variables: Do they make domain sense? (e.g., are the selected features actually informative?)
  • Consistency: Do multiple reads converge to similar solutions?
  • Sensitivity: How much do results change when you tweak QUBO weights?

     Baseline Comparisons

    Always compare against simple baselines:

  • Random selection: Pick K variables uniformly at random. This is the "no intelligence" baseline.
  • Greedy selection: Iteratively pick the variable that most improves the objective. This is the "local intelligence" baseline.
  • SA selection: The annealing result. Should beat both, especially when interactions (off-diagonal terms) matter.

     Greedy fails when features interact: it might greedily pick two correlated features because each is individually good, missing that together they're redundant. SA considers the full QUBO structure.
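The failure mode above can be demonstrated with a small sketch. The QUBO below is contrived for illustration: variable 0 has the best diagonal term (-3) but a +3 interaction penalty with both others, so the optimal pair is {1, 2}.

```python
import random

def qubo_energy(Q, x):
    n = len(Q)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def random_baseline(Q, K, trials=1000, seed=0):
    """Best energy over `trials` uniformly random K-subsets."""
    rng = random.Random(seed)
    n = len(Q)
    best = float("inf")
    for _ in range(trials):
        picks = set(rng.sample(range(n), K))
        x = [1 if i in picks else 0 for i in range(n)]
        best = min(best, qubo_energy(Q, x))
    return best

def greedy_baseline(Q, K):
    """Repeatedly add the single variable that lowers the energy most."""
    n = len(Q)
    x = [0] * n
    for _ in range(K):
        best_i, best_E = None, float("inf")
        for i in range(n):
            if x[i] == 0:
                x[i] = 1               # try adding variable i
                E = qubo_energy(Q, x)
                x[i] = 0               # undo the trial
                if E < best_E:
                    best_i, best_E = i, E
        x[best_i] = 1                  # commit the best single addition
    return qubo_energy(Q, x)

Q = [[-3, 3, 3], [0, -2.5, 0], [0, 0, -2.5]]
print(greedy_baseline(Q, K=2))   # -2.5: greedy grabs variable 0 and gets stuck
print(random_baseline(Q, K=2))   # -5.0: even random sampling finds {1, 2}
```

Here greedy underperforms even the random baseline because the off-diagonal penalty turns its best first pick into a trap; with no interactions, greedy would match the optimum.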

    Convergence Analysis

    Plot best energy vs num_reads and best energy vs num_steps:

  • If the curve is still improving → increase that parameter
  • If the curve has plateaued → you've found the practical limit
  • The elbow point is your sweet spot for compute budget
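A minimal sketch of finding that elbow numerically, given a parameter sweep (the sweep values below are hypothetical, for illustration only):

```python
def plateau_point(param_values, best_energies, tol=1e-3):
    """Return the first parameter value after which the best energy
    stops improving by more than tol (the 'elbow')."""
    for k in range(1, len(best_energies)):
        if best_energies[k - 1] - best_energies[k] <= tol:
            return param_values[k - 1]
    return param_values[-1]

# Hypothetical sweep: best energy found vs num_steps
steps = [100, 200, 400, 800, 1600]
best = [-3.1, -3.8, -4.0, -4.0, -4.0]
print(plateau_point(steps, best))  # → 400: doubling past 400 steps no longer helps
```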

     What's Next

    In Module 5, you'll build an interactive dashboard that makes this analysis visual — sliders for parameters, charts for energy distributions, side-by-side comparisons with baselines.

    This is chapter 4 of Quantum Optimization for AI.

    Get the full hands-on course for $100 and build the complete system. Your projects become your portfolio.
