Time-series forecasting of hourly energy consumption using deep learning and traditional ML models.
Production-ready pipeline for energy demand forecasting across 10 U.S. regions (145k+ hourly observations, 2004-2018). Implements LSTM, GRU, TFT, and gradient-boosted models with GPU acceleration.
```bash
# Setup
python -m venv .venv
.venv\Scripts\activate    # Windows; source .venv/bin/activate on Linux/Mac
pip install -r requirements.txt

# Train a model (~5 minutes)
python scripts/train_lstm.py --mode train_test --epochs 50
```

Outputs: checkpoint in `checkpoints/`, figures in `figures/`.
| Model | RMSE (MW) | MAPE (%) | Training Time | Use Case |
|---|---|---|---|---|
| GRU | 1928 | 4.79 | 10-15 min | GPU available |
| LSTM | 1778 | 4.17 | 10-15 min | Alternative to GRU |
| XGBoost/LightGBM | ~900 | ~2.5 | 2-5 min | CPU only, fast iterations |
| TFT | ~700-1000 | ~2.0-3.0 | 30-60 min | Maximum accuracy |
Full analysis: Model Comparison Guide
This project evolved through several stages, starting with fundamental analysis and progressively adopting more sophisticated models.
The initial phase focused on understanding the dataset. We identified strong seasonal, weekly, and daily patterns in energy consumption.
Figure: Average energy consumption by hour of the day and day of the week, revealing clear cyclical demand.
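The hourly and weekly profiles in the figure come from simple groupby aggregations. A minimal sketch on synthetic data (the column name `PJME_MW` matches one region in the dataset; the series here is a stand-in, not the real data):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for an hourly demand series (the real data spans 2004-2018)
idx = pd.date_range("2017-01-01", periods=24 * 7 * 8, freq="h")
rng = np.random.default_rng(0)
load = 30000 + 5000 * np.sin(2 * np.pi * idx.hour / 24) + rng.normal(0, 500, len(idx))
df = pd.DataFrame({"PJME_MW": load}, index=idx)

# Average consumption by hour of day and by day of week
hourly_profile = df.groupby(df.index.hour)["PJME_MW"].mean()
weekly_profile = df.groupby(df.index.dayofweek)["PJME_MW"].mean()
print(hourly_profile.idxmax())  # hour with the highest average demand
```

Plotting these two profiles side by side reproduces the cyclical pattern the figure shows.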
With a good understanding of the data, we established baseline performance using traditional machine learning models like LightGBM and XGBoost. These models performed well, confirming that engineered features like time lags and rolling averages were highly predictive.
Figure: Feature importance from a LightGBM model, highlighting the predictive power of temporal and lag features.
The predictions from these shallow models were quite accurate and served as a strong benchmark for more advanced techniques.
Figure: Prediction accuracy of the baseline LightGBM model.
To better capture long-term temporal dependencies, we moved to deep learning. GRU and LSTM networks learn temporal patterns directly from the time-series data rather than relying solely on hand-engineered features. The GRU model, in particular, offered a good balance of performance and training efficiency.
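The recurrent models follow the standard sequence-to-one pattern: a window of past hours goes in, the next-hour forecast comes out. A minimal sketch (layer sizes here are illustrative, not the tuned 512-unit, 3-layer configuration described below):

```python
import torch
import torch.nn as nn

class GRUForecaster(nn.Module):
    """Sequence-to-one forecaster: a window of past hours in, next-hour demand out."""
    def __init__(self, n_features: int, hidden_size: int = 64,
                 num_layers: int = 2, dropout: float = 0.4):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden_size, num_layers,
                          batch_first=True, dropout=dropout)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):  # x: (batch, lookback, n_features)
        out, _ = self.gru(x)
        return self.head(out[:, -1, :])  # predict from the last time step's hidden state

model = GRUForecaster(n_features=8)
window = torch.randn(4, 336, 8)  # batch of 4 two-week (336 h) windows
print(model(window).shape)  # torch.Size([4, 1])
```

Swapping `nn.GRU` for `nn.LSTM` gives the LSTM variant with no other changes to the forward pass.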
Finally, we integrated the Temporal Fusion Transformer (TFT), a state-of-the-art forecasting model. The TFT architecture is designed to handle multi-horizon forecasting with high accuracy, representing the cutting edge of what is possible with this dataset.
Guides:
- Getting Started - Installation and first training
- LSTM/GRU Guide - Detailed training instructions
- Model Comparison - Choosing the right model
- Data Guide - Dataset structure and features
- Visualisation Guide - Interpreting results
Reference:
- Complete Documentation - Full documentation index
- Analysis Report - Exploratory data analysis
Train default model:

```bash
python scripts/train_lstm.py --mode train_test --epochs 50
```

Train optimised model:

```powershell
# PowerShell (Windows)
python scripts/train_lstm.py `
  --mode train_test `
  --hidden_size 512 --num_layers 3 `
  --lookback 336 --batch_size 32 `
  --epochs 100
```

```bash
# Bash (Linux/Mac)
python scripts/train_lstm.py \
  --mode train_test \
  --hidden_size 512 --num_layers 3 \
  --lookback 336 --batch_size 32 \
  --epochs 100
```

Test saved model:

```bash
python scripts/train_lstm.py --mode test \
  --checkpoint_path checkpoints/lstm_best_PJME_MW.pt \
  --hidden_size 512 --num_layers 3 --lookback 336
```

Generate visualisations:

```bash
python scripts/generate_figures.py
```

GRU Model (GPU-optimised, 336 h lookback, 512 hidden units, 3 layers):
| Metric | Value |
|---|---|
| RMSE | 1927.85 MW |
| MAE | 1448.76 MW |
| MAPE | 4.79% |
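These metrics follow the standard definitions: RMSE and MAE in the series' own units (MW), MAPE as a percentage. A small sketch:

```python
import numpy as np

def forecast_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """RMSE/MAE in the series' units (MW here), MAPE as a percentage."""
    err = y_pred - y_true
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "mape": float(np.mean(np.abs(err / y_true)) * 100),
    }

m = forecast_metrics(np.array([30000.0, 32000.0]), np.array([30300.0, 31600.0]))
print(m)  # rmse ≈ 353.6, mae = 350.0, mape ≈ 1.125
```

Note that MAPE weights errors relative to the true value, so it penalises misses during low-demand hours more heavily than RMSE does.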
Why GRU over LSTM:
- 15-20% faster training (simpler architecture)
- Better generalisation on time-series data
- Less prone to overfitting
- Comparable accuracy, lower computational cost
Configuration:
- Lookback: 336 hours (2 weeks) to capture weekly patterns
- Hidden size: 512 units for complex pattern learning
- Layers: 3 for hierarchical feature extraction
- Batch size: 32 for better generalisation (noisier gradients)
- Dropout: 0.4 to prevent overfitting
- Temporal features: hour, day-of-week, month
- Lag features: 1h, 24h, 168h (weekly)
- Rolling statistics: 24h mean/std for trend capture
- Normalisation: StandardScaler on all features
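Taken together, the feature set listed above can be sketched as follows (the column name `load` is a placeholder for the region's `*_MW` series; the real pipeline lives in `src/feature_engineering.py`):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

def build_features(series: pd.Series) -> pd.DataFrame:
    """Temporal, lag, and rolling features as listed above."""
    df = pd.DataFrame({"load": series})
    df["hour"] = df.index.hour
    df["dayofweek"] = df.index.dayofweek
    df["month"] = df.index.month
    for lag in (1, 24, 168):           # 1 h, 1 day, 1 week
        df[f"lag_{lag}"] = df["load"].shift(lag)
    df["roll24_mean"] = df["load"].rolling(24).mean()
    df["roll24_std"] = df["load"].rolling(24).std()
    return df.dropna()                 # drop rows whose lags reach before the data

idx = pd.date_range("2018-01-01", periods=24 * 14, freq="h")
feats = build_features(pd.Series(np.arange(len(idx), dtype=float), index=idx))
scaled = StandardScaler().fit_transform(feats)  # zero mean, unit variance per column
print(feats.shape, scaled.shape)
```

The `dropna()` at the end means the first week of each series is consumed by the 168 h lag; the scaler should be fit on the training split only to avoid leakage.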
```
EnergyConsumption/
├── docs/                      # Complete documentation
├── scripts/
│   ├── train_lstm.py          # LSTM/GRU training (recommended)
│   ├── train_tft.py           # TFT training (advanced)
│   └── generate_figures.py
├── src/
│   ├── data_loader.py
│   ├── feature_engineering.py
│   ├── modeling.py
│   └── plotting.py
├── notebooks/
│   └── exploration.ipynb      # EDA and analysis
├── figures/                   # Auto-generated visualisations
└── checkpoints/               # Model checkpoints
```
- Python 3.9+
- CUDA-capable GPU (recommended, 10-15 min training vs 2+ hours on CPU)
- 8GB+ RAM
- 2GB+ storage
- Core: NumPy, Pandas, SciPy
- Visualisation: Matplotlib, Seaborn, Plotly
- Traditional ML: scikit-learn, XGBoost, LightGBM, CatBoost
- Deep Learning: PyTorch, PyTorch Lightning, PyTorch Forecasting
- Getting Started Guide - Installation and setup
- Model Comparison - Choosing models
- LSTM/GRU Guide - Training and tuning
- Complete Documentation - Full reference
MIT Licence - See LICENSE for details
Quick Links: Getting Started | Model Comparison | LSTM Guide | Full Docs






