ML Detection
Beyond Statistics — Learning What Normal Looks Like
Why ML for Anomaly Detection?
Statistical methods (z-scores, moving averages) make assumptions about your data: that it is roughly Gaussian, that it is stationary, and that a fixed threshold can separate normal from anomalous. Real production metrics routinely violate all three.
ML-based detectors learn what "normal" looks like from the data itself. They handle:

- skewed or multimodal distributions that break the Gaussian assumption
- non-stationary baselines with trend and seasonality
- metrics where no single fixed threshold separates normal from anomalous
Isolation Forest
The isolation forest is one of the most elegant algorithms in anomaly detection. The key insight: anomalies are easy to isolate.
Imagine randomly drawing a vertical line to split your data points into two groups. Normal points are clustered together — it takes many random splits to isolate one from the crowd. But an outlier sits far from the cluster and can be isolated in just a few splits.
```
Normal point:    needs 8-12 splits to isolate (deep in the tree)
Anomalous point: needs 2-4 splits to isolate (near the root)
```

The algorithm builds a forest of random trees, measures the average path length needed to isolate each point, and converts that path length into a score:

```
score = 2^(-avgPath / expectedPath)
```

Scores range from 0 to 1: values near 1 mean the point was isolated in very few splits (anomalous), while values around 0.5 mean it took an average number of splits (normal).
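The conversion from path length to score can be sketched in a few lines. The function names here (`expectedPathLength`, `isolationScore`) are illustrative, not the course's API; `n` is the subsample size each tree was built on.

```ts
// Expected path length of an unsuccessful BST search over n points:
// c(n) = 2 * H(n - 1) - 2 * (n - 1) / n, with H(k) ≈ ln(k) + Euler's constant.
function expectedPathLength(n: number): number {
  if (n <= 1) return 0;
  const eulerGamma = 0.5772156649;
  const harmonic = Math.log(n - 1) + eulerGamma;
  return 2 * harmonic - (2 * (n - 1)) / n;
}

// score = 2^(-avgPath / expectedPath); near 1 = anomalous, near 0.5 = normal.
function isolationScore(avgPath: number, n: number): number {
  return Math.pow(2, -avgPath / expectedPathLength(n));
}
```

With the common subsample size of 256, a point isolated in about 3 splits scores roughly 0.8, while one needing about 10 splits scores close to 0.5, matching the intuition above.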
```ts
const result = detectWithIsolationForest("api_latency_search", points);
// result.anomalies: points with score > 0.6
// result.scoresPerPoint: score for every timestamp
```

The beauty: zero distributional assumptions. It works on Gaussian data, bimodal data, skewed data, and data with complex temporal patterns. The only parameter that matters is the number of trees (100 is standard).
Autoencoders
An autoencoder takes a different approach: learn to reconstruct normal patterns, then flag anything with high reconstruction error.
The architecture (simplified):

```
input window → encoder (compress) → bottleneck → decoder (reconstruct) → output window
```

When trained on normal data:

- normal windows reconstruct with low error, because the network has seen those patterns before
- anomalous windows reconstruct with high error, because the compressed representation can't capture them
```ts
const result = detectWithAutoencoder("api_latency_search", points, 12);
// Uses 12-point sliding windows
// Anomalies: windows with error > mean_error + 2*std_error
```

The windowed approach is the key advantage over point-based methods. A value of 150ms might be normal in isolation, but a sudden jump from a stable 80ms to 150ms within one window creates a pattern the autoencoder fails to reconstruct. This catches gradual drift that point-based z-scores miss.
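The window-then-threshold step might look like the following sketch. The real detector reconstructs windows with a trained network; here `reconstruct` is pluggable, and a stand-in that reconstructs each window as its own mean is already enough to penalize sudden jumps inside a window. All names are illustrative.

```ts
// Slide a fixed-size window across the series.
function slidingWindows(points: number[], size: number): number[][] {
  const out: number[][] = [];
  for (let i = 0; i + size <= points.length; i++) {
    out.push(points.slice(i, i + size));
  }
  return out;
}

// Mean squared error between a window and its reconstruction.
function reconstructionError(win: number[], recon: number[]): number {
  return win.reduce((s, v, i) => s + (v - recon[i]) ** 2, 0) / win.length;
}

// Flag window indices whose error exceeds mean + 2 * std, as in the snippet above.
function flagAnomalousWindows(
  points: number[],
  size: number,
  reconstruct: (w: number[]) => number[],
): number[] {
  const windows = slidingWindows(points, size);
  const errors = windows.map((w) => reconstructionError(w, reconstruct(w)));
  const mean = errors.reduce((a, b) => a + b, 0) / errors.length;
  const std = Math.sqrt(
    errors.reduce((s, e) => s + (e - mean) ** 2, 0) / errors.length,
  );
  return errors.flatMap((e, i) => (e > mean + 2 * std ? [i] : []));
}
```

Feeding in a series that sits at a stable 80ms and then jumps to 150ms flags only windows that overlap the jump; the all-80ms windows reconstruct perfectly.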
Trade-off: Autoencoders need a window size parameter. Too small (3 points) and they miss slow changes. Too large (48 points) and they're insensitive to short anomalies. 12 points (12 hours at hourly resolution) is a reasonable starting point.
Ensemble Scoring
No single detector catches every type of anomaly:
| Detector | Best At | Worst At |
|---|---|---|
| Z-score | Sudden spikes | Gradual drift, non-Gaussian data |
| Isolation Forest | Point outliers | Temporal pattern anomalies |
| Autoencoder | Shape/pattern anomalies | Very brief spikes |
The ensemble combines all three with weighted voting:

```
score = 0.40 * iforest_score + 0.35 * autoencoder_score + 0.25 * zscore_score
```

A point that triggers only one detector (score ~0.35) probably isn't worth alerting on — it could be a quirk of that particular method. But a point that triggers two or three detectors (score ~0.65+) is almost certainly a real anomaly.
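As a sketch, assuming each detector has already produced a normalized score in [0, 1] for the same point, the weighted vote is a one-liner (the type and function names are illustrative):

```ts
interface DetectorScores {
  iforest: number;     // isolation forest score in [0, 1]
  autoencoder: number; // reconstruction-error score in [0, 1]
  zscore: number;      // z-score-based score in [0, 1]
}

// Weights from the formula above: 0.40 / 0.35 / 0.25.
function ensembleScore(s: DetectorScores): number {
  return 0.4 * s.iforest + 0.35 * s.autoencoder + 0.25 * s.zscore;
}
```

A single detector firing at 0.9 while the others stay near zero lands below 0.5, while all three firing together keeps the combined score in the alerting range.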
This is the same principle behind Random Forests, boosting, and other ensemble methods in ML: combining weak learners produces a strong learner. In anomaly detection, combining specialized detectors produces a robust detection system.
Confidence Calibration
Raw anomaly scores are not probabilities. An isolation forest score of 0.7 doesn't mean "70% chance of being anomalous." The mapping from raw score to actual probability varies by:

- the metric being monitored
- the detector that produced the score
- how the underlying data drifts over time
Platt scaling fixes this with a logistic calibration function:

```
calibrated = 1 / (1 + exp(a * rawScore + b))
```

Parameters a and b are fitted to historical data: for each score range, what fraction turned out to be true anomalies? After calibration, a score of 0.7 genuinely means "70% likely to be a real anomaly."
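A minimal sketch of the calibration step, with placeholder values for a and b (a real deployment fits them on labeled history; note that a must be negative for higher raw scores to map to higher probabilities):

```ts
// Logistic (Platt) calibration: calibrated = 1 / (1 + exp(a * rawScore + b)).
// The defaults below are illustrative placeholders, not fitted values.
function plattCalibrate(rawScore: number, a: number = -4, b: number = 2): number {
  return 1 / (1 + Math.exp(a * rawScore + b));
}
```

Whatever the fitted values, the output always lies strictly between 0 and 1, and with a < 0 it increases monotonically in the raw score.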
```ts
const calibrated = calibrateBatch(anomalies, 0.3);
// Filters out anomalies below 0.3 calibrated confidence
// Sorts by calibrated confidence (highest first)
```

This makes alert thresholds meaningful. You can tell the on-call team: "alerts above 0.6 calibrated confidence are correct 85% of the time."
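One plausible shape for the batch step (a sketch with illustrative types, not the confirmed course API): calibrate each raw score, drop anything below the confidence floor, and sort descending.

```ts
interface Anomaly {
  timestamp: number;
  rawScore: number;
}

interface CalibratedAnomaly extends Anomaly {
  confidence: number; // calibrated probability in (0, 1)
}

// Calibrate, filter by minimum confidence, sort highest-confidence first.
function calibrateBatch(
  anomalies: Anomaly[],
  minConfidence: number,
  calibrate: (raw: number) => number,
): CalibratedAnomaly[] {
  return anomalies
    .map((a) => ({ ...a, confidence: calibrate(a.rawScore) }))
    .filter((a) => a.confidence >= minConfidence)
    .sort((x, y) => y.confidence - x.confidence);
}
```

Passing the calibration function in keeps the batch logic independent of which fitted parameters are in use.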
Choosing Detection Methods
For a new monitoring deployment, start with:

- the three-detector ensemble with the default 0.40 / 0.35 / 0.25 weights
- a 12-point autoencoder window
- a calibrated confidence threshold around 0.6 for paging alerts
Revisit the weights and thresholds monthly as your data evolves.
This is chapter 3 of AI Anomaly Detection.
Get the full hands-on course — free during early access. Build the complete system. Your projects become your portfolio.