
Time Series & Forecasting

When Order Matters

What Makes Time Series Special

Most data science treats rows as independent — shuffling them doesn't change the analysis. Time series data is fundamentally different: order matters. Today's stock price depends on yesterday's. This month's sales relate to last month's.

This dependency changes everything: how you split data, how you validate models, and which algorithms work.

The Components of Time Series

A time series is commonly decomposed into four components: trend, seasonality, cyclical variation, and residual.


Example — Monthly retail sales:

  • Trend: Gradually increasing 3% per year
  • Seasonality: Spikes in November-December (holidays), dip in January
  • Cyclical: Dips during recessions (2008, 2020)
  • Residual: Random variation (weather, viral products, supply disruptions)

Understanding these components tells you what's predictable (trend + seasonality) and what's noise (residual).
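The decomposition can be sketched in a few lines. This is a minimal illustration using only the Python standard library, assuming an additive model (series = trend + seasonality + residual) and a known seasonal period. The function names and the synthetic data are hypothetical, and the cyclical component is not separated out here; it ends up in the trend estimate.

```python
import math
import random

def centered_moving_average(values, period):
    """Estimate the trend by averaging over one full season around each point.

    For an even period the two end points get half weight so the window stays
    centered. Positions without a full window are left as None.
    """
    half = period // 2
    trend = [None] * len(values)
    for t in range(half, len(values) - half):
        window = values[t - half : t + half + 1]
        if period % 2 == 0:
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend

def decompose_additive(values, period):
    """Split a series into (trend, seasonal, residual) under an additive model."""
    trend = centered_moving_average(values, period)
    # Average the detrended values by their position within the season.
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(values, trend)):
        if tr is not None:
            buckets[t % period].append(y - tr)
    seasonal_means = [sum(b) / len(b) for b in buckets]
    # Center the seasonal effects so they sum to zero over one season.
    offset = sum(seasonal_means) / period
    seasonal = [seasonal_means[t % period] - offset for t in range(len(values))]
    residual = [None if tr is None else y - tr - s
                for y, tr, s in zip(values, trend, seasonal)]
    return trend, seasonal, residual

# Hypothetical monthly series: linear trend + yearly cycle + small random noise.
rng = random.Random(0)
series = [100 + 0.5 * t + 10 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 1)
          for t in range(48)]
trend, seasonal, residual = decompose_additive(series, period=12)
print(round(trend[24], 1), round(seasonal[3], 1))
```

With the noise removed, the sketch recovers the trend and the seasonal pattern of this synthetic series almost exactly, which is a quick way to sanity-check the logic before trying it on real data.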

Stationarity: The Foundation

A time series is stationary when its statistical properties (mean, variance, and autocorrelation structure) don't change over time. Many classical forecasting methods, including the AR and MA parts of ARIMA, assume stationarity.

Property | Stationary | Non-Stationary
Mean | Constant over time | Trending up or down
Variance | Constant over time | Growing or shrinking
Autocorrelation | Depends only on lag | Changes over time

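How to make data stationary:

  • Differencing: Subtract the previous value from each value. Removes a linear trend; difference again if a trend remains.
  • Log transform: Stabilizes variance that grows with the level of the series. Requires positive values.
  • Seasonal differencing: Subtract the value from the same season one cycle earlier (for example, the same month last year).

The Augmented Dickey-Fuller (ADF) test checks for stationarity statistically. Its null hypothesis is that the series has a unit root (is non-stationary), so a p-value < 0.05 rejects that hypothesis and is usually read as evidence that the series is stationary.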
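The three transformations are short enough to write by hand. The sketch below uses only the standard library. The helper names and the toy series are made up for illustration, and the ADF test itself is not implemented here.

```python
import math

def difference(values, lag=1):
    """Return values[t] - values[t - lag]; the result is `lag` items shorter."""
    return [values[t] - values[t - lag] for t in range(lag, len(values))]

def log_transform(values):
    """Natural log of each value; only valid for strictly positive data."""
    return [math.log(v) for v in values]

# Linear trend: first differencing leaves a constant.
trend_series = [10 + 2 * t for t in range(8)]
print(difference(trend_series))            # [2, 2, 2, 2, 2, 2, 2]

# Repeating 4-step pattern on top of a trend: seasonal differencing (lag=4)
# removes the pattern and leaves the growth per cycle.
pattern = [5, 1, 3, 7]
seasonal_series = [pattern[t % 4] + t for t in range(12)]
print(difference(seasonal_series, lag=4))  # [4, 4, 4, 4, 4, 4, 4, 4]

# Exponential growth: log, then difference, gives a roughly constant growth rate.
growth_series = [100 * 1.1 ** t for t in range(6)]
print(difference(log_transform(growth_series)))
```

Each call shortens the series by `lag` points, so forecasts made on differenced data have to be cumulatively summed back to the original scale.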

Moving Averages: The Simplest Forecast

A moving average smooths out noise by averaging the last N values. Simple, but powerful for understanding trends.

Window | Effect | Use Case
7-day | Removes daily noise | Daily metrics dashboard
30-day | Shows monthly trend | Monthly business reviews
365-day | Shows year-over-year trend | Strategic planning

Exponential Moving Average (EMA) weights recent observations more heavily, with weights that decay exponentially into the past. It typically starts responding to a change sooner than a simple MA of similar span, which makes it a better fit when recent behavior matters most.
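To make the difference concrete, here is a minimal standard-library sketch of both averages applied to a series with a sudden level shift. The smoothing factor alpha = 2 / (span + 1) is one common convention and is an assumption here, as are the function names and the toy data.

```python
def simple_moving_average(values, window):
    """Mean of the last `window` values at each step; None until the window fills."""
    out = []
    running = 0.0
    for t, v in enumerate(values):
        running += v
        if t >= window:
            running -= values[t - window]  # drop the value that left the window
        out.append(running / window if t >= window - 1 else None)
    return out

def exponential_moving_average(values, span):
    """EMA with alpha = 2 / (span + 1), seeded with the first observation."""
    alpha = 2.0 / (span + 1)
    out = [float(values[0])]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

# Hypothetical metric that jumps from 10 to 20 at index 5.
step = [10] * 5 + [20] * 5
sma = simple_moving_average(step, window=5)
ema = exponential_moving_average(step, span=5)
for t in range(4, 10):
    print(t, sma[t], round(ema[t], 2))
```

On this series the EMA moves further than the simple average on the first point after the jump, but the simple average reaches the new level exactly once the old values leave its window, while the EMA only approaches it gradually.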

ARIMA: The Classical Workhorse

ARIMA (AutoRegressive Integrated Moving Average) is a standard classical forecasting model. Despite the intimidating name, it combines three simple ideas:

  • AR (AutoRegressive): Future values depend on past values. "If sales were high last month, they'll likely be high this month."
  • I (Integrated): Differencing to achieve stationarity. "Look at changes, not absolute values."
  • MA (Moving Average): Future values depend on past forecast errors. "If we over-predicted last month, adjust down."

The three orders are written ARIMA(p, d, q):

  • p = number of past values to use (AR order)
  • d = number of times to difference (Integration order)
  • q = number of past errors to use (MA order)

SARIMA adds seasonal components: SARIMA(p,d,q)(P,D,Q,s), where P, D, and Q are the seasonal counterparts of p, d, and q, and s is the seasonal period (for example, 12 for monthly data with a yearly cycle, or 7 for daily data with a weekly cycle).
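The AR and I ideas can be shown with a hand-rolled ARIMA(1,1,0): difference once, fit an AR(1) model to the changes by least squares, then add the predicted changes back onto the last level. This is a teaching sketch under simplifying assumptions, not a full ARIMA implementation. It has no MA term and no seasonal part, and the function names and data are hypothetical. In practice a library implementation (for example the ARIMA class in statsmodels) would estimate all parts jointly.

```python
def fit_ar1(values):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev, x_next = values[:-1], values[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi

def forecast_arima_110(values, steps):
    """Forecast `steps` ahead: difference once (d=1), AR(1) on the changes (p=1)."""
    diffs = [values[t] - values[t - 1] for t in range(1, len(values))]
    c, phi = fit_ar1(diffs)
    last_level, last_diff = values[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff   # AR step on the differenced series
        last_level += last_diff           # integrate back to the original scale
        forecasts.append(last_level)
    return forecasts

# Hypothetical series whose month-over-month change decays toward 2.
series = [100.0]
change = 10.0
for _ in range(10):
    series.append(series[-1] + change)
    change = 1 + 0.5 * change

forecasts = forecast_arima_110(series, steps=3)
print([round(f, 3) for f in forecasts])
```

Because each forecast feeds the next one, any error in the fitted coefficients carries forward, which is one reason uncertainty grows with the horizon.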

When ARIMA Works Well

  • Single variable forecasting (no external features)
  • Clear trend and seasonality
  • Historical patterns will continue
  • Short-to-medium forecast horizons (days to months)

When ARIMA Fails

  • Multiple interacting variables (use multivariate models)
  • Regime changes (COVID, new competitor, policy changes)
  • Very long horizons (uncertainty compounds)
  • Non-linear patterns (use gradient boosting or neural networks)

Anomaly Detection in Time Series

Finding unusual values in time series data is one of the highest-value applications:

Method | How It Works | Best For
Z-score | Flag values more than 3 standard deviations from the mean | Stationary data with a roughly normal distribution
IQR | Flag values below Q1 − 1.5×IQR or above Q3 + 1.5×IQR | Non-normal distributions (robust to extreme values)
Moving average bands | Flag values outside MA ± k×std | Data with trends or seasonality
Isolation forest | ML method that isolates outliers | Multivariate time series
Prophet anomaly detection | Flag values outside the forecast's uncertainty interval | Business metrics with multiple seasonalities
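The first three methods in the table can be sketched with the standard library alone. The function names, thresholds, and sample data below are illustrative assumptions. The rolling version compares each point with the mean and standard deviation of the preceding window, which is one simple way to build moving average bands.

```python
import statistics

def zscore_anomalies(values, threshold=3.0):
    """Indices whose distance from the global mean exceeds threshold * std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]

def iqr_anomalies(values, k=1.5):
    """Indices below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]

def rolling_band_anomalies(values, window, k=3.0):
    """Indices outside mean +/- k*std of the preceding `window` values."""
    flagged = []
    for t in range(window, len(values)):
        history = values[t - window : t]
        mean = statistics.fmean(history)
        std = statistics.pstdev(history)
        if std > 0 and abs(values[t] - mean) > k * std:
            flagged.append(t)
    return flagged

# Hypothetical trending metric with a small zigzag and one spike at index 30.
trending = [float(t) + (0.5 if t % 2 else -0.5) for t in range(40)]
trending[30] += 15

print(zscore_anomalies(trending))                   # global z-score
print(rolling_band_anomalies(trending, window=10))  # local bands

# Hypothetical flat metric with one extreme value at the end.
flat = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
print(iqr_anomalies(flat))
```

On the trending series the global z-score finds nothing, because the trend inflates the overall standard deviation, while the rolling band flags the spike. That is the reason the table recommends band methods for data with trends.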

Business applications:

  • Server monitoring: CPU spike detection
  • Financial: Unusual transaction volumes
  • IoT: Sensor malfunction detection
  • Marketing: Traffic anomalies (bot attacks, viral content)
  • Operations: Supply chain disruptions

Forecasting vs Prediction

These terms are often confused:

Aspect | Forecasting | Prediction
Input | Historical values of the same variable | Features/variables that describe the target
Question | "What will sales be next month?" | "Will this customer churn?"
Time | Always about the future | Can be current or future
Method | ARIMA, Prophet, exponential smoothing | Regression, random forest, neural networks
Key challenge | Uncertainty grows with horizon | Feature quality and relevance

Validation: Never Shuffle Time Series

The golden rule of time series: never randomly split data. Always split chronologically.

Method | How | When
Train/test split | Train on Jan-Oct, test on Nov-Dec | Simple, one-shot evaluation
Walk-forward | Train on expanding window, test on next period | More robust, multiple evaluations
Time series cross-validation | Multiple walk-forward splits | Best estimate of true performance

Random splits leak future information into the training set. A model might learn that "sales drop after December" by seeing January data during training, which is information it shouldn't have.
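A walk-forward split is easy to write by hand. The sketch below is a minimal standard-library version with an expanding training window. The function names, the sample sales figures, and the naive "repeat the last value" forecast are all assumptions chosen to keep the example short.

```python
def walk_forward_splits(n_samples, initial_train, horizon):
    """Yield (train_indices, test_indices) with an expanding training window.

    Every test block starts right after its training block, so no future
    observation is ever visible during training.
    """
    start = initial_train
    while start + horizon <= n_samples:
        yield list(range(0, start)), list(range(start, start + horizon))
        start += horizon

def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical monthly sales, oldest first.
sales = [120, 125, 123, 130, 134, 133, 140, 145, 143, 150, 155, 160]

fold_errors = []
for train_idx, test_idx in walk_forward_splits(len(sales), initial_train=6, horizon=2):
    last_seen = sales[train_idx[-1]]            # naive forecast: repeat last training value
    predictions = [last_seen] * len(test_idx)
    actuals = [sales[i] for i in test_idx]
    fold_errors.append(mean_absolute_error(actuals, predictions))

print(fold_errors)
print(sum(fold_errors) / len(fold_errors))      # average error across folds
```

Averaging the error over several folds gives a steadier estimate than a single train/test split, and the naive forecast doubles as a baseline that any real model should beat.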

Key Takeaways

  • Time series data has order; treating rows as independent is wrong
  • Decompose into trend + seasonality + residual to understand what's predictable
  • Many classical methods assume stationarity; differencing is the usual way to achieve it
  • Moving averages are simple but powerful for trend analysis
  • ARIMA is the classical workhorse for single-variable forecasting
  • Anomaly detection in time series is a high-value business application
  • Never randomly split time series data; use chronological splits

This is chapter 4 of Data Science for AI.
