This project implements an end-to-end deep learning pipeline for stock market time series forecasting using LSTM and Transformer models in PyTorch.
The goal is not trading or profit prediction, but to:
- Model temporal dependencies in financial time series
- Compare deep learning models against strong baselines
- Analyze model behavior, limitations, and failure modes
This project follows FAANG-level ML engineering practices, including time-aware splits, baselines, evaluation, and error analysis.
Given historical stock price data (OHLCV), predict the next-day closing price using past observations.
Key constraints:
- Time-aware training (no data leakage)
- Fair comparison with naïve baselines
- Focus on model reliability and interpretability
- Source: Yahoo Finance (`yfinance`)
- Stock: AAPL (Apple Inc.)
- Frequency: Daily
- Time Range: 2015-01-01 → 2024-12-31
- Features: Open, High, Low, Close, Volume, daily return, log return
- Target: next-day closing price
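The return features and next-day target can be derived directly from the Close column. A minimal sketch (the function name `add_return_features` is illustrative, not the repo's actual API):

```python
import numpy as np
import pandas as pd

def add_return_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add daily return, log return, and the next-day close target.
    Assumes `df` has a 'Close' column; drops rows made NaN by shifting."""
    out = df.copy()
    out["return"] = out["Close"].pct_change()        # (C_t - C_{t-1}) / C_{t-1}
    out["log_return"] = np.log(out["Close"]).diff()  # ln(C_t / C_{t-1})
    out["target"] = out["Close"].shift(-1)           # next-day closing price
    return out.dropna()
```

Log returns are used because they are additive across time and tend to stabilize variance relative to raw prices.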
- Verified time continuity and data integrity
- Identified non-stationarity and volatility regimes
- Observed trend, seasonality, and regime shifts
- Engineered returns and log-returns to stabilize variance
- Time-aware train/validation/test split:
- Train: 2015–2020
- Validation: 2021–2022
- Test: 2023–2024
- Feature scaling using `StandardScaler` (fit on train only)
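The leakage-free split and scaling described above can be sketched as follows, assuming the DataFrame has a `DatetimeIndex` (the helper name is illustrative):

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

def time_split_and_scale(df: pd.DataFrame, feature_cols):
    """Chronological split by year; the scaler is fit on the training
    years only, so no test-set statistics leak into training."""
    train = df.loc["2015":"2020"]
    val = df.loc["2021":"2022"]
    test = df.loc["2023":"2024"]
    scaler = StandardScaler().fit(train[feature_cols])  # train only
    return (scaler.transform(train[feature_cols]),
            scaler.transform(val[feature_cols]),
            scaler.transform(test[feature_cols]),
            scaler)
```

Fitting the scaler on the full series would encode future means and variances into the training inputs, a subtle but common form of leakage in time series work.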
Converted the time series into supervised learning format:
- Input window: 60 trading days
- Forecast horizon: 1 day
- X → (batch, 60, num_features)
- y → (batch, 1)
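The windowing step above can be sketched with a simple sliding loop (assuming the target column already holds the next-day close, so `y[i]` aligns with the last row of `X[i]`):

```python
import numpy as np

def make_windows(features: np.ndarray, target: np.ndarray, window: int = 60):
    """Convert a (T, num_features) series into supervised pairs:
    X[i] = features[i : i+window], y[i] = target[i+window-1]."""
    X, y = [], []
    for i in range(len(features) - window + 1):
        X.append(features[i:i + window])
        y.append(target[i + window - 1])
    return np.stack(X), np.array(y).reshape(-1, 1)
```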
Strong baselines were implemented to justify model complexity:
- Naive forecast: predicts the last observed value
- Moving average forecast: mean of the last 5 timesteps
These baselines provide a realistic lower bound for performance.
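Both baselines are a few lines of NumPy. A sketch (predictions for position `t` use only values up to `t-1`, matching the no-leakage constraint):

```python
import numpy as np

def naive_forecast(series: np.ndarray) -> np.ndarray:
    """Predict each next value as the last observed value."""
    return series[:-1]  # prediction for series[1:]

def moving_average_forecast(series: np.ndarray, k: int = 5) -> np.ndarray:
    """Predict each next value as the mean of the previous k values."""
    return np.array([series[i - k:i].mean() for i in range(k, len(series))])
```

Because daily prices are close to a random walk, the naive forecast is a surprisingly strong bound; any deep model that cannot beat it on the same test set is not adding value.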
- 2-layer LSTM
- Hidden size: 64
- Dropout for regularization
- Uses final hidden state for prediction
Motivation:
LSTMs handle medium-range temporal dependencies and mitigate vanishing gradients.
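The LSTM described above (2 layers, hidden size 64, dropout, final hidden state feeding the prediction head) can be sketched in PyTorch as follows; the class name and exact dropout rate are illustrative:

```python
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    """2-layer LSTM; the final hidden state feeds a linear head."""
    def __init__(self, num_features: int, hidden_size: int = 64,
                 dropout: float = 0.2):
        super().__init__()
        self.lstm = nn.LSTM(num_features, hidden_size, num_layers=2,
                            batch_first=True, dropout=dropout)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):              # x: (batch, 60, num_features)
        _, (h_n, _) = self.lstm(x)     # h_n: (num_layers, batch, hidden)
        return self.head(h_n[-1])      # (batch, 1)
```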
- Encoder-only Transformer
- Positional encoding
- Multi-head self-attention
- Causal sequence modeling
Motivation:
Transformers capture long-range dependencies via attention, but require more data and regularization.
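A minimal encoder-only Transformer matching the description above (positional encoding, multi-head self-attention, causal masking), sketched with illustrative hyperparameters:

```python
import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """Standard sinusoidal positional encoding, added to the inputs."""
    def __init__(self, d_model: int, max_len: int = 500):
        super().__init__()
        pos = torch.arange(max_len).unsqueeze(1)
        div = torch.exp(torch.arange(0, d_model, 2)
                        * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer("pe", pe)

    def forward(self, x):              # x: (batch, seq, d_model)
        return x + self.pe[:x.size(1)]

class TransformerForecaster(nn.Module):
    """Encoder-only Transformer; a causal mask prevents attending to the
    future, and the last position's representation predicts the target."""
    def __init__(self, num_features: int, d_model: int = 64,
                 nhead: int = 4, num_layers: int = 2):
        super().__init__()
        self.proj = nn.Linear(num_features, d_model)
        self.pos = PositionalEncoding(d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):              # x: (batch, seq, num_features)
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        h = self.encoder(self.pos(self.proj(x)), mask=mask)
        return self.head(h[:, -1])     # (batch, 1)
```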
- Framework: PyTorch
- Optimizer: Adam
- Loss: Mean Squared Error (MSE)
- Early stopping via validation monitoring
- GPU support when available
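The training setup above (Adam, MSE, early stopping on validation loss, GPU when available) can be sketched as a generic loop; the patience and epoch counts are illustrative defaults:

```python
import copy
import torch

def train_with_early_stopping(model, train_loader, val_loader,
                              epochs: int = 100, patience: int = 10,
                              lr: float = 1e-3):
    """Adam + MSE; stop when validation loss has not improved for
    `patience` epochs, then restore the best weights."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    best, best_state, bad = float("inf"), None, 0
    for _ in range(epochs):
        model.train()
        for X, y in train_loader:
            X, y = X.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(X), y).backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            val = sum(loss_fn(model(X.to(device)), y.to(device)).item()
                      for X, y in val_loader) / len(val_loader)
        if val < best:
            best, best_state, bad = val, copy.deepcopy(model.state_dict()), 0
        else:
            bad += 1
            if bad >= patience:
                break
    model.load_state_dict(best_state)
    return model, best
```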
- Mean Absolute Error (MAE)
- Root Mean Squared Error (RMSE)
- Same test set for baselines and DL models
- Visual comparison of predictions vs actuals
- Residual analysis during high-volatility periods
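The two reported metrics are straightforward to compute over the shared test set:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

def rmse(y_true, y_pred):
    """Root Mean Squared Error; penalizes large errors more than MAE."""
    return np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))
```

Reporting both is informative here: a large RMSE/MAE gap signals that errors are concentrated in a few extreme days, which is exactly the high-volatility failure mode examined in the residual analysis.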
- Naive baseline performs strongly during stable periods
- LSTM improves consistency during trending regimes
- Transformer shows potential for longer horizons but is data-hungry
- All models struggle during extreme volatility (earnings, macro events)
- Large errors correlate with:
- Earnings announcements
- Market regime shifts
- Sudden volatility spikes
- Highlights inherent uncertainty in financial time series
- Stock prices are inherently noisy and non-stationary
- No external features (news, macro indicators)
- Single-stock modeling (extendable to panel data)
- Multi-step forecasting
- Multi-stock panel forecasting with symbol embeddings
- Probabilistic forecasting (prediction intervals)
- Volatility-aware loss functions
This project is for educational and research purposes only.
It is not intended for financial trading or investment decisions.