Predictive Analytics in Business

📊 Unit 5 – Time Series Analysis & Forecasting

Time series analysis and forecasting are crucial tools for understanding and predicting patterns in sequential data. These techniques help businesses and researchers extract insights from historical observations, identify trends and seasonality, and make informed predictions about future values. From data preparation to advanced modeling approaches, time series analysis encompasses a wide range of methods. Key concepts include stationarity, autocorrelation, and decomposition, while popular models like ARIMA and exponential smoothing form the foundation for accurate forecasting in various real-world applications.

Key Concepts

  • Time series data consists of observations collected at regular intervals over time (hourly, daily, monthly, yearly)
  • Stationarity assumes the statistical properties of the data remain constant over time
    • Constant mean, variance, and autocorrelation structure
  • Autocorrelation measures the correlation between a variable and its lagged values
  • Partial autocorrelation measures the correlation between a variable and its lagged values after removing the effect of intermediate lags
  • White noise is a series of uncorrelated random variables with zero mean and constant variance
  • Differencing transforms non-stationary data into stationary data by computing the differences between consecutive observations
  • Decomposition separates a time series into its components: trend, seasonality, and residuals
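Differencing and autocorrelation can be computed directly from their definitions. The sketch below uses a hypothetical trending series (illustrative values, not from the text) to show that a trend induces strong lag-1 autocorrelation, and that first differencing removes it:

```python
def difference(series, lag=1):
    """First-order differencing (lag=1); a larger lag gives seasonal differencing."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def autocorr(series, lag=1):
    """Sample autocorrelation at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i - lag] - mean) for i in range(lag, n))
    return cov / var

# Hypothetical monthly observations with a clear upward trend
y = [100, 104, 109, 115, 122, 130, 139, 149]

d = difference(y)       # [4, 5, 6, 7, 8, 9, 10]
r1 = autocorr(y)        # high: the trend makes consecutive values move together
```

Strong lag-1 autocorrelation like this is one visual cue that a series is non-stationary and may need differencing before modeling.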

Data Preparation

  • Handling missing values through interpolation, forward-filling, or backward-filling
  • Dealing with outliers using statistical methods (Z-score, IQR) or domain knowledge
  • Transforming data to stabilize variance using logarithmic, square root, or Box-Cox transformations
  • Scaling data to a common range (min-max scaling) or standardizing to zero mean and unit variance
  • Creating lagged variables to capture the effect of past observations on the current observation
  • Splitting data into training, validation, and testing sets for model development and evaluation
  • Resampling data to change the frequency of observations (upsampling or downsampling)
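Several of the preparation steps above (interpolation, forward-filling, resampling) map directly onto pandas operations. A minimal sketch, using hypothetical daily sales values with a two-day gap:

```python
import pandas as pd

# Hypothetical daily sales with two missing days (illustrative values)
idx = pd.date_range("2024-01-01", periods=6, freq="D")
sales = pd.Series([10.0, 12.0, None, None, 18.0, 20.0], index=idx)

filled = sales.interpolate()   # linear interpolation across the gap: 14.0, 16.0
ffilled = sales.ffill()        # forward-fill: carry the last observed value (12.0)

# Downsample the interpolated daily series to 2-day means
two_day = filled.resample("2D").mean()
```

Which fill strategy is appropriate depends on the domain: forward-filling suits quantities that persist until changed (e.g., a posted price), while interpolation suits smoothly varying measurements.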

Trend and Seasonality

  • Trend represents the long-term increase or decrease in the data over time
    • Can be linear, exponential, or polynomial
  • Seasonality refers to regular, periodic fluctuations in the data
    • Can be additive (constant amplitude) or multiplicative (amplitude varies with the level of the series)
  • Detecting trend using visual inspection, moving averages, or regression analysis
  • Identifying seasonality using visual inspection, autocorrelation plots, or Fourier analysis
  • Removing trend and seasonality to obtain stationary residuals
    • Differencing, detrending, or seasonal adjustment techniques (STL decomposition, SEATS)
  • Modeling trend and seasonality explicitly in time series models (SARIMA, Holt-Winters)
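A centered moving average over one full seasonal cycle is a simple way to estimate the trend, since averaging across a whole cycle cancels out the seasonal ups and downs. A sketch with a hypothetical quarterly series (linear trend plus additive seasonality; values are illustrative):

```python
import pandas as pd

# Hypothetical quarterly series: rising trend + repeating seasonal pattern
y = pd.Series([12.0, 8.0, 14.0, 10.0, 16.0, 12.0, 18.0, 14.0],
              index=pd.period_range("2022Q1", periods=8, freq="Q"))

# Averaging over window=4 (one full year of quarters) smooths out the
# seasonal swings, leaving an estimate of the trend
trend = y.rolling(window=4, center=True).mean()

# Subtracting the trend isolates the seasonal + residual component
detrended = y - trend
```

The surviving trend values (11, 12, 13, 14, 15) rise steadily, confirming that the seasonal fluctuations have been averaged out.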

Time Series Models

  • Autoregressive (AR) models predict future values based on a linear combination of past values
    • Order p determines the number of lagged values used
  • Moving Average (MA) models predict future values based on a linear combination of past forecast errors
    • Order q determines the number of lagged errors used
  • Autoregressive Moving Average (ARMA) models combine AR and MA components
    • Suitable for stationary data
  • Autoregressive Integrated Moving Average (ARIMA) models extend ARMA to handle non-stationary data
    • Differencing order d determines the number of times the data is differenced to achieve stationarity
  • Seasonal ARIMA (SARIMA) models incorporate seasonal AR, MA, and differencing terms
  • Exponential Smoothing (ES) models use weighted averages of past observations to forecast future values
    • Simple ES, Holt's linear trend ES, Holt-Winters' seasonal ES
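Simple exponential smoothing is compact enough to implement from its recurrence, level_t = α·y_t + (1 − α)·level_{t−1}, where the one-step-ahead forecast is just the latest level. A minimal sketch with hypothetical demand values:

```python
def simple_exp_smoothing(y, alpha):
    """Simple exponential smoothing.

    Recursively updates the level with weight alpha on the newest
    observation; returns the one-step-ahead forecast (the final level).
    """
    level = y[0]  # initialize the level at the first observation
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return level

# Hypothetical weekly demand (illustrative values)
demand = [20, 22, 21, 25, 24, 27]
forecast = simple_exp_smoothing(demand, alpha=0.5)  # 25.25
```

A larger α makes the forecast react faster to recent observations; a smaller α smooths more aggressively. Holt's and Holt-Winters' methods extend this recurrence with trend and seasonal components.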

Forecasting Techniques

  • Recursive forecasting uses the model to predict one step ahead, then updates the model with the actual value before predicting the next step
  • Direct forecasting trains separate models for each forecast horizon
  • Rolling forecasting uses a fixed window of historical data to train the model and updates the window as new observations become available
  • Ensemble forecasting combines predictions from multiple models to improve accuracy and robustness
    • Simple averaging, weighted averaging, or stacking
  • Hierarchical forecasting reconciles forecasts at different levels of aggregation (bottom-up, top-down, or middle-out approaches)
  • Forecast combination methods (Bates-Granger, Newbold-Granger) assign weights to individual forecasts based on their historical performance
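Ensemble combination of point forecasts reduces, in its simplest forms, to a (possibly weighted) average. A sketch using hypothetical one-step-ahead predictions from three models:

```python
def combine_forecasts(forecasts, weights=None):
    """Combine point forecasts from several models.

    With no weights given, falls back to simple averaging. In practice the
    weights might come from historical accuracy, e.g. inverse-error weighting
    in the spirit of Bates-Granger.
    """
    if weights is None:
        weights = [1 / len(forecasts)] * len(forecasts)
    return sum(w * f for w, f in zip(weights, forecasts))

# Hypothetical forecasts from three models for the same horizon
preds = [102.0, 98.0, 106.0]

simple = combine_forecasts(preds)                              # 102.0
weighted = combine_forecasts(preds, weights=[0.5, 0.3, 0.2])   # 101.6
```

Even this naive averaging often beats the best individual model, because the models' errors partially cancel.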

Model Evaluation

  • Splitting data into training and testing sets to assess model performance on unseen data
  • Cross-validation techniques (rolling origin, time series cross-validation) for time-dependent data
  • Evaluation metrics for point forecasts
    • Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE)
  • Evaluation metrics for probabilistic forecasts
    • Coverage probability, prediction intervals, Continuous Ranked Probability Score (CRPS)
  • Residual diagnostics to check model assumptions
    • Ljung-Box test for autocorrelation, Jarque-Bera test for normality, Engle's ARCH test for heteroscedasticity
  • Comparing models using information criteria (AIC, BIC) or forecast accuracy measures
  • Backtesting to evaluate model performance on historical data
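The point-forecast metrics above follow directly from their formulas. A minimal sketch with hypothetical actuals and predictions:

```python
def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: penalizes large errors more than MAE."""
    return (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)) ** 0.5

def mape(actual, predicted):
    """Mean Absolute Percentage Error (%); undefined when an actual is zero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical test-set actuals vs. model predictions
y_true = [100, 110, 120, 130]
y_pred = [102, 108, 125, 128]

print(mae(y_true, y_pred))   # 2.75
```

MAPE is scale-free, which makes it convenient for comparing series of different magnitudes, but it breaks down near zero-valued actuals and asymmetrically penalizes over- vs. under-forecasts.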

Real-World Applications

  • Demand forecasting in supply chain management (inventory optimization, production planning)
  • Sales forecasting in retail and e-commerce (promotional planning, pricing strategies)
  • Energy demand forecasting for utilities (electricity, gas, water)
  • Traffic volume forecasting for transportation planning and management
  • Financial market forecasting (stock prices, exchange rates, volatility)
  • Economic forecasting (GDP growth, inflation, unemployment rates)
  • Disease incidence and prevalence forecasting in healthcare (epidemic modeling)

Advanced Topics

  • Multivariate time series models (Vector Autoregression, Vector Error Correction)
  • State space models and Kalman filtering for dynamic systems
  • Bayesian time series models (Bayesian structural time series, dynamic linear models)
  • Neural network-based models (Recurrent Neural Networks, Long Short-Term Memory, Gated Recurrent Units)
  • Deep learning architectures for time series (Temporal Convolutional Networks, Transformers)
  • Functional time series models for high-dimensional data (functional autoregressive models)
  • Nonlinear time series models (Threshold AR, Smooth Transition AR, Markov Switching models)
  • Forecasting with exogenous variables (ARIMAX, SARIMAX, regression with ARIMA errors)


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.
