Time Series & Forecasting

When Order Matters

What Makes Time Series Special

Most data science treats rows as independent — shuffling them doesn't change the analysis. Time series data is fundamentally different: order matters. Today's stock price depends on yesterday's. This month's sales relate to last month's.

This dependency changes everything: how you split data, how you validate models, and which algorithms work.
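One way to see this dependency is to measure how strongly each value correlates with the one before it (the lag-1 autocorrelation), then shuffle the rows and measure again. The sketch below uses a made-up series and a hypothetical helper named `autocorrelation`; it is an illustration, not a prescribed diagnostic.

```python
import random


def autocorrelation(values, lag=1):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    numer = sum(
        (values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


# Hypothetical series: steady growth plus a small repeating weekly pattern.
ordered = [0.5 * t + (t % 7) for t in range(100)]

shuffled = ordered[:]
random.Random(0).shuffle(shuffled)  # fixed seed so the run is repeatable

acf_ordered = autocorrelation(ordered)
acf_shuffled = autocorrelation(shuffled)
print(f"lag-1 autocorrelation, original order: {acf_ordered:.2f}")
print(f"lag-1 autocorrelation, shuffled:       {acf_shuffled:.2f}")
```

In the original order the value is close to 1; after shuffling it falls toward 0. The numbers in each row are identical, so the information lost is entirely in the ordering.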

The Components of Time Series

A time series is commonly decomposed into four components: trend, seasonality, cyclical variation, and residual.

Example — Monthly retail sales:

Trend: Gradually increasing 3% per year

Seasonality: Spikes in November-December (holidays), dip in January

Cyclical: Dips during recessions (2008, 2020)

Residual: Random variation (weather, viral products, supply disruptions)

Understanding these components tells you what's predictable (trend + seasonality) and what's noise (residual).
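A minimal sketch of an additive decomposition, assuming the series is the sum of its parts and the seasonal period is known. The function names and the quarterly data are hypothetical, and the cyclical component is not separated here: in this simple approach it ends up inside the trend estimate.

```python
def centered_moving_average(values, period):
    """Trend estimate from a centered moving average over one full period."""
    half = period // 2
    trend = [None] * len(values)  # None where the window does not fit
    for t in range(half, len(values) - half):
        window = values[t - half : t + half + 1]
        if period % 2:  # odd period: plain centered window
            total = sum(window)
        else:  # even period: half weight on the two end points
            total = sum(window) - 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    return trend


def decompose_additive(values, period):
    """Split values into trend, seasonal, and residual lists."""
    trend = centered_moving_average(values, period)
    # Average the detrended values by position within the cycle.
    buckets = [[] for _ in range(period)]
    for t, (v, tr) in enumerate(zip(values, trend)):
        if tr is not None:
            buckets[t % period].append(v - tr)
    seasonal_means = [sum(b) / len(b) for b in buckets]
    offset = sum(seasonal_means) / period  # center the pattern on zero
    seasonal = [seasonal_means[t % period] - offset for t in range(len(values))]
    residual = [
        None if tr is None else v - tr - s
        for v, tr, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, residual


# Hypothetical quarterly sales: linear growth plus a repeating pattern.
pattern = [-2, 1, 3, -2]
sales = [100 + 2 * t + pattern[t % 4] for t in range(16)]
trend, seasonal, residual = decompose_additive(sales, period=4)
print("trend:   ", trend[2:6])
print("seasonal:", seasonal[:4])
```

Because the example data contains no noise, the residual is essentially zero wherever the trend is defined. On real data the residual is what remains after the predictable parts are removed.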

Stationarity: The Foundation

A time series is stationary when its statistical properties (mean, variance, and autocorrelation structure) don't change over time. Many classical forecasting methods, including the ARMA family, assume stationarity.

Property	Stationary	Non-Stationary
Mean	Constant over time	Trending up or down
Variance	Constant over time	Growing or shrinking
Autocorrelation	Depends only on lag	Changes over time

How to make data stationary:

Differencing: Subtract the previous value from each observation. One round removes a linear trend; a curved trend may need a second round.

Log transform: Stabilizes growing variance.

Seasonal differencing: Subtract value from same season last year.
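The three transformations above can be sketched with one small helper. The function name `difference` and all three example series are made up for illustration.

```python
import math


def difference(values, lag=1):
    """Return values[t] - values[t - lag]; the result is `lag` items shorter."""
    return [values[t] - values[t - lag] for t in range(lag, len(values))]


# Hypothetical series with a linear trend: first differences are constant.
trending = [10, 13, 16, 19, 22]
print(difference(trending))  # [3, 3, 3, 3]

# Hypothetical series growing 10% per step: taking logs turns the
# multiplicative growth into a constant additive step of log(1.1).
growing = [100 * 1.1 ** t for t in range(5)]
log_steps = difference([math.log(v) for v in growing])
print([round(s, 4) for s in log_steps])

# Hypothetical quarterly series (period 4): seasonal differencing compares
# each value with the same quarter one year earlier.
quarterly = [20, 35, 50, 25, 24, 39, 54, 29]
print(difference(quarterly, lag=4))  # [4, 4, 4, 4]
```

Each differencing pass shortens the series by `lag` observations, which matters for short histories.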

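The Augmented Dickey-Fuller (ADF) test checks statistically for a unit root, a common form of non-stationarity. Its null hypothesis is that the series is non-stationary, so a p-value below 0.05 is evidence that the series is stationary. A larger p-value means the test could not rule out non-stationarity.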

Moving Averages: The Simplest Forecast

A moving average smooths out noise by averaging the last N values. Simple, but powerful for understanding trends.

Window	Effect	Use Case
7-day	Removes daily noise	Daily metrics dashboard
30-day	Shows monthly trend	Monthly business reviews
365-day	Shows year-over-year trend	Strategic planning

Exponential Moving Average (EMA) weights recent observations more heavily, so it usually responds faster to changes than a simple MA with a similar amount of smoothing.
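A minimal sketch of both averages on a made-up metric that jumps from 10 to 20. The function names, the window of 5, and the smoothing factor `alpha=0.5` are illustrative choices, not recommended defaults.

```python
def simple_moving_average(values, window):
    """Average of the last `window` values; None until the window is full."""
    out = []
    for t in range(len(values)):
        if t + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[t + 1 - window : t + 1]) / window)
    return out


def exponential_moving_average(values, alpha):
    """EMA with smoothing factor alpha in (0, 1], seeded with the first value."""
    ema = [values[0]]
    for v in values[1:]:
        ema.append(alpha * v + (1 - alpha) * ema[-1])
    return ema


# Hypothetical metric that jumps from 10 to 20 halfway through.
series = [10] * 5 + [20] * 5
sma = simple_moving_average(series, window=5)
ema = exponential_moving_average(series, alpha=0.5)
print(sma)
print([round(v, 2) for v in ema])
```

One step after the jump the 5-point simple average has moved to 12 while this EMA has moved to 15. A smaller `alpha` would smooth more and react more slowly, so the comparison depends on the parameters chosen.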

ARIMA: The Classical Workhorse

ARIMA (AutoRegressive Integrated Moving Average) is the standard classical forecasting model. Despite the intimidating name, it combines three simple ideas:

AR (AutoRegressive): Future values depend on past values. "If sales were high last month, they'll likely be high this month."

I (Integrated): Differencing to achieve stationarity. "Look at changes, not absolute values."

MA (Moving Average): Future values depend on past forecast errors. "If we over-predicted last month, adjust down."

ARIMA(p, d, q):

p = number of past values to use (AR order)

d = number of times to difference (Integration order)

q = number of past errors to use (MA order)

SARIMA adds seasonal components: SARIMA(p,d,q)(P,D,Q,s), where s is the seasonal period (for example, 12 for monthly data with a yearly pattern, or 7 for daily data with a weekly pattern).
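To make the AR and I parts concrete, here is a rough sketch of an ARIMA(1,1,0)-style forecast: difference once (d=1), fit a single autoregressive coefficient on the changes (p=1), and skip the MA term (q=0). It uses ordinary least squares for simplicity, whereas statistical libraries typically fit ARIMA by maximum likelihood. The function names and data are hypothetical.

```python
def fit_ar1(values):
    """Least-squares fit of values[t] = c + phi * values[t - 1]."""
    x = values[:-1]
    y = values[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast_arima_110(levels, steps):
    """Forecast `steps` ahead using AR(1) on the first differences."""
    changes = [b - a for a, b in zip(levels, levels[1:])]  # the "I" step
    c, phi = fit_ar1(changes)                              # the "AR" step
    forecasts = []
    last_level, last_change = levels[-1], changes[-1]
    for _ in range(steps):
        last_change = c + phi * last_change    # predict the next change
        last_level = last_level + last_change  # add it back to get a level
        forecasts.append(last_level)
    return forecasts


# Hypothetical sales levels whose growth slows down over time.
levels = [100, 108, 113, 116.5, 119.25, 121.625]
print(forecast_arima_110(levels, steps=2))
```

The example data was constructed so that each change equals 1 plus half the previous change, which the fit recovers. Real series are noisy, and the MA term exists to model what the AR term leaves unexplained.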

When ARIMA Works Well

Single variable forecasting (no external features)

Clear trend and seasonality

Historical patterns are expected to continue

Short-to-medium forecast horizons (days to months)

When ARIMA Fails

Multiple interacting variables (use multivariate models)

Regime changes (COVID, new competitor, policy changes)

Very long horizons (uncertainty compounds)

Non-linear patterns (use gradient boosting or neural networks)

Anomaly Detection in Time Series

Finding unusual values in time series data is one of the highest-value applications:

Method	How It Works	Best For
Z-score	Flag values > 3 std devs from mean	Stationary data with normal distribution
IQR	Flag values below Q1 − 1.5×IQR or above Q3 + 1.5×IQR	Skewed or non-normal data
Moving average bands	Flag values outside MA ± k×std	Data with trends or seasonality
Isolation forest	ML method that isolates outliers	Multivariate time series
Prophet anomaly detection	Flag values outside the forecast's uncertainty interval	Business metrics with multiple seasonalities
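The first three methods in the table can be sketched with the standard library alone. The function names, thresholds, and sample data below are illustrative assumptions. The band detector uses only the values before each point, which is one of several reasonable ways to define the window.

```python
import statistics


def zscore_anomalies(values, threshold=3.0):
    """Indices of values more than `threshold` std devs from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [i for i, v in enumerate(values) if abs(v - mean) > threshold * std]


def iqr_anomalies(values, k=1.5):
    """Indices of values below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


def rolling_band_anomalies(values, window=5, k=3.0):
    """Indices of values outside mean +/- k*std of the preceding window."""
    flagged = []
    for t in range(window, len(values)):
        recent = values[t - window : t]
        mean = statistics.fmean(recent)
        std = statistics.pstdev(recent)
        if abs(values[t] - mean) > k * std:
            flagged.append(t)
    return flagged


# Hypothetical flat metric with one spike at index 12.
flat = [50, 52, 49, 51, 48, 50, 53, 47, 51, 50,
        49, 52, 120, 50, 48, 51, 49, 50, 52, 51]
print(zscore_anomalies(flat))  # [12]
print(iqr_anomalies(flat))     # [12]

# Hypothetical trending metric with a spike at index 12.
trending = [float(t) for t in range(20)]
trending[12] += 20
print(rolling_band_anomalies(trending))  # [12]
```

Note that a large outlier inflates the mean and standard deviation used by the z-score itself, which is one reason the IQR rule is often preferred on messy data.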

Business applications:

Server monitoring: CPU spike detection

Financial: Unusual transaction volumes

IoT: Sensor malfunction detection

Marketing: Traffic anomalies (bot attacks, viral content)

Operations: Supply chain disruptions

Forecasting vs Prediction

These terms are often confused:

	Forecasting	Prediction
Input	Historical values of the same variable	Features/variables that describe the target
Question	"What will sales be next month?"	"Will this customer churn?"
Time	Always about the future	Can be current or future
Method	ARIMA, Prophet, exponential smoothing	Regression, random forest, neural networks
Key challenge	Uncertainty grows with horizon	Feature quality and relevance

Validation: Never Shuffle Time Series

The golden rule of time series: never randomly split data. Always split chronologically.

Method	How	When
Train/test split	Train on Jan-Oct, test on Nov-Dec	Simple, one-shot evaluation
Walk-forward	Train on expanding window, test on next period	More robust, multiple evaluations
Time series cross-validation	Multiple walk-forward splits	Best estimate of true performance

Random splits leak future information into the training set. A model might learn that "sales drop after December" by seeing January data during training — information it shouldn't have.
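A chronological walk-forward split can be sketched as a small generator. The function name and parameters are hypothetical; libraries such as scikit-learn offer a comparable utility (`TimeSeriesSplit`), whose exact behavior may differ from this sketch.

```python
def walk_forward_splits(n_samples, initial_train, test_size):
    """Yield (train_indices, test_indices) with an expanding training window."""
    train_end = initial_train
    while train_end + test_size <= n_samples:
        train = list(range(0, train_end))
        test = list(range(train_end, train_end + test_size))
        yield train, test
        train_end += test_size  # fold the tested period into the next train set


# Hypothetical setup: 12 monthly observations, 6 months of initial training
# data, and 2 months held out for each evaluation.
for train, test in walk_forward_splits(n_samples=12, initial_train=6, test_size=2):
    print(f"train on months 0-{train[-1]}, test on months {test[0]}-{test[-1]}")
```

Every test index is later than every training index in the same split, so no future observation can leak into training.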

Key Takeaways

Time series data has order — treating rows as independent is wrong

Decompose into trend, seasonality, cyclical, and residual components to understand what's predictable

Many classical methods assume stationarity; differencing and log transforms help achieve it

Moving averages are simple but powerful for trend analysis

ARIMA is the classical workhorse for single-variable forecasting

Anomaly detection in time series is a high-value business application

Never randomly split time series data — use chronological splits

This is chapter 4 of Data Science for AI.
