📈 Stock Closing Price Prediction

Machine learning system combining LSTM neural networks and BERT-based sentiment analysis for stock market prediction.

🎯 Project Overview

This project implements two complementary approaches for stock market prediction:

LSTM Model: Predicts stock returns with an LSTM trained on lag-based features normalized by MinMaxScaler
NLP Sentiment Analysis: Extracts stock-related news data, builds BERT-based embeddings, and trains a sentiment-driven impact model

🏆 Key Results

LSTM Model: Achieved test MAE of 0.002 through lag-based features and MinMaxScaler normalization
NLP Model: Achieved 90% validation accuracy and increased simulated trading return by 10%
Trading Strategy: Outperformed the buy-and-hold benchmark in simulation using sentiment-driven predictions

🚀 Technologies Used

Deep Learning: TensorFlow/Keras LSTM architecture with dropout regularization
NLP: BERT-based embeddings (FinBERT) for financial text analysis
Data Processing: MinMaxScaler normalization, lag-based feature engineering
Trading Simulation: Backtesting that accounts for transaction costs

📊 Model Details

🧠 LSTM Model

Architecture: 3-layer LSTM (50 units each) with 20% dropout regularization
Input Features: 2-day lag sequences of AAPL and S&P 500 closing prices
Normalization: MinMaxScaler, which rescales each feature to a fixed range ([0, 1] by default) before training
Performance: Test MAE of 0.002
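
As a rough illustration of the input pipeline described above, the sketch below builds 2-day lag windows from two closing-price series and applies min-max scaling. It uses plain Python instead of scikit-learn's MinMaxScaler, and the function names (min_max_scale, make_lag_windows) and toy prices are hypothetical, not taken from the project code.

```python
def min_max_scale(values, lo=None, hi=None):
    """Rescale values to [0, 1] using the given (or observed) min and max."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def make_lag_windows(stock, index, lag=2):
    """Pair each target day with the previous `lag` days of both series.

    Returns (X, y): X[i] is a list of `lag` [stock, index] rows, and y[i]
    is the stock value on the day right after that window.
    """
    if len(stock) != len(index):
        raise ValueError("series must have the same length")
    X, y = [], []
    for t in range(lag, len(stock)):
        window = [[stock[k], index[k]] for k in range(t - lag, t)]
        X.append(window)
        y.append(stock[t])
    return X, y


# Toy closing prices (made-up numbers, for illustration only).
aapl = [100.0, 102.0, 101.0, 105.0, 110.0]
sp500 = [4000.0, 4020.0, 4010.0, 4050.0, 4100.0]

aapl_scaled = min_max_scale(aapl)
sp500_scaled = min_max_scale(sp500)
X, y = make_lag_windows(aapl_scaled, sp500_scaled, lag=2)

print(len(X), "samples, each of shape", (len(X[0]), len(X[0][0])))
```

In a real experiment the scaling bounds would normally be taken from the training split only and passed in through lo and hi, so that test-period prices do not leak into training.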

🤖 NLP Sentiment Model

Text Processing: BERT-based embeddings using FinBERT (financial domain-specific)
Sentiment Analysis: Multi-dimensional sentiment scoring (polarity, subjectivity)
Classification: Ensemble of Random Forest and Logistic Regression
Performance: 90% validation accuracy for next-day stock movement prediction
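
The README does not say how the Random Forest and Logistic Regression outputs are combined. One common option is soft voting, where the predicted probabilities are averaged and then thresholded. The sketch below shows that idea in plain Python; soft_vote and the example probabilities are hypothetical.

```python
def soft_vote(prob_sets, weights=None, threshold=0.5):
    """Average per-model P(up) values and threshold the result.

    prob_sets: one list of P(up) per model, all of the same length.
    Returns (averaged probabilities, 0/1 predictions).
    """
    n_models = len(prob_sets)
    if weights is None:
        weights = [1.0] * n_models
    total = sum(weights)
    n_samples = len(prob_sets[0])
    averaged = []
    for i in range(n_samples):
        score = sum(w * probs[i] for w, probs in zip(weights, prob_sets))
        averaged.append(score / total)
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels


# Hypothetical P(up) outputs from the two classifiers on three days.
rf_probs = [0.9, 0.4, 0.3]
lr_probs = [0.7, 0.8, 0.1]

avg, pred = soft_vote([rf_probs, lr_probs])
print(avg, pred)
```

With scikit-learn, a VotingClassifier with voting="soft" provides the same kind of combination for fitted estimators.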

💰 Trading Strategy

Signal Generation: Sentiment-driven stock direction predictions
Position Sizing: Binary and probability-weighted strategies
Performance: 10% improvement in simulated trading returns vs buy-and-hold
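
To make the two position-sizing modes concrete, here is a minimal backtest sketch. It assumes a long-or-flat strategy, a cost proportional to the fraction of the portfolio traded, and no intraday position drift; the project's actual simulation in utils/trading_simulation.py may differ. The function names and toy numbers are hypothetical.

```python
def backtest(daily_returns, up_probs, mode="binary", cost=0.005, capital=100000.0):
    """Simulate a long-or-flat strategy driven by predicted P(up).

    mode="binary": fully invested when P(up) >= 0.5, otherwise in cash.
    mode="weighted": invested fraction equals P(up).
    `cost` is charged on the fraction of the portfolio that is traded.
    """
    position = 0.0
    for ret, p in zip(daily_returns, up_probs):
        if mode == "binary":
            target = 1.0 if p >= 0.5 else 0.0
        else:
            target = p
        # Pay transaction costs for moving from the old to the new position.
        capital *= 1.0 - cost * abs(target - position)
        # Earn the day's return on the invested fraction.
        capital *= 1.0 + target * ret
        position = target
    return capital


def buy_and_hold(daily_returns, capital=100000.0):
    """Benchmark: stay fully invested for the whole period."""
    for ret in daily_returns:
        capital *= 1.0 + ret
    return capital


# Made-up daily returns and predicted probabilities of an up move.
rets = [0.01, -0.02, 0.015]
probs = [0.8, 0.3, 0.6]

print("binary:      ", round(backtest(rets, probs, mode="binary"), 2))
print("weighted:    ", round(backtest(rets, probs, mode="weighted"), 2))
print("buy and hold:", round(buy_and_hold(rets), 2))
```

The probability-weighted mode trades smaller amounts more often, so its result is more sensitive to the cost parameter than the binary mode.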

🚀 Quick Start

# Install dependencies
pip install -r requirements.txt

# Note: Large data files are excluded from Git due to size limits
# See data/README.md for download instructions

# Run complete pipeline
python run_all_models.py

# Or run individual models
python run_lstm_model.py          # LSTM model
python run_nlp_model.py           # NLP sentiment model

📁 Project Structure

Stock_Closing_Price_Prediction/
├── Stock_Closing_Price_Prediction.ipynb  # LSTM model implementation
├── NLP_Sentiment_Analysis.ipynb          # BERT-based sentiment analysis
├── run_lstm_model.py                     # LSTM execution script
├── run_nlp_model.py                      # NLP execution script
├── run_all_models.py                     # Master execution script
├── data/                                 # Dataset files
│   ├── apple_news_data.csv              # Apple news (2024-2025)
│   ├── apple_prices.csv                 # Apple stock prices
│   ├── stock_prices.csv                 # Historical stock data (1990-2017)
│   └── SP500.csv                        # S&P 500 index data
├── utils/                               # Utility modules
│   ├── data_preprocessing.py            # Data processing utilities
│   ├── model_evaluation.py              # Performance evaluation
│   ├── trading_simulation.py            # Trading strategy simulation
│   └── visualization.py                 # Visualization utilities
└── requirements.txt                     # Python dependencies

📈 Results

Model Performance

LSTM Model: Test MAE of 0.002
NLP Model: 90% validation accuracy for stock direction prediction
Trading Strategy: 10% improvement over buy-and-hold benchmark
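
For reference, the two headline metrics can be computed as in the sketch below. MAE is expressed in the units of the prediction target, so a value of 0.002 means something different for scaled prices than for raw returns; the README does not state which applies. The helper names and sample values here are illustrative only.

```python
def mean_absolute_error(y_true, y_pred):
    """Mean of |actual - predicted| over all samples."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def direction_accuracy(y_true, y_pred):
    """Share of samples where the predicted sign (up vs. not up) matches."""
    hits = sum((t > 0) == (p > 0) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Made-up actual and predicted daily returns.
actual = [0.010, -0.004, 0.002, -0.006]
predicted = [0.008, -0.001, -0.001, -0.005]

mae = mean_absolute_error(actual, predicted)
acc = direction_accuracy(actual, predicted)
print(f"MAE={mae:.5f}, direction accuracy={acc:.2f}")
```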

Key Findings

Sentiment features contribute significantly to prediction accuracy
LSTM architecture effectively captures temporal patterns in stock prices
Combined approach outperforms individual technical or fundamental analysis

🔬 Research Methodology

📚 Literature Review

Technical Analysis: LSTM networks for time series forecasting
Fundamental Analysis: NLP sentiment analysis in finance
Behavioral Finance: News impact on market movements
Quantitative Trading: Risk-adjusted performance metrics

🧪 Experimental Design

Data Collection: Multi-source financial and news data
Preprocessing: Data cleaning and normalization (MinMaxScaler for price features)
Feature Engineering: Technical + fundamental feature fusion
Model Development: Deep learning + machine learning ensemble
Validation: Time-series aware cross-validation
Backtesting: Realistic trading simulation
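
The "time-series aware cross-validation" step in the list above can be sketched as an expanding-window (walk-forward) split, where every training fold ends before its test fold begins. This is an assumed scheme, since the README does not describe the exact splitting rule, and walk_forward_splits is a hypothetical helper.

```python
def walk_forward_splits(n_samples, n_splits=3):
    """Yield (train_indices, test_indices) with training always before testing."""
    fold = n_samples // (n_splits + 1)
    if fold == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_end = fold * k
        # The last fold absorbs any leftover samples.
        test_end = n_samples if k == n_splits else train_end + fold
        yield list(range(train_end)), list(range(train_end, test_end))


splits = list(walk_forward_splits(10, n_splits=3))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

scikit-learn's TimeSeriesSplit is similar in spirit, although it distributes leftover samples differently.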

📊 Statistical Validation

Significance Testing: Bootstrap confidence intervals
Robustness Checks: Out-of-sample validation
Risk Analysis: VaR, CVaR, maximum drawdown
Benchmarking: Comparison with market indices
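
The risk and significance measures listed above can be illustrated with a short standard-library sketch. It uses historical (empirical) VaR and CVaR and a percentile bootstrap for the mean return; these are common choices but are assumptions here, as the README does not specify the estimators. All function names and sample data are hypothetical.

```python
import random


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def historical_var(returns, alpha=0.95):
    """Empirical loss quantile at level alpha (positive number = loss)."""
    losses = sorted(-r for r in returns)
    idx = min(int(alpha * len(losses)), len(losses) - 1)
    return losses[idx]


def historical_cvar(returns, alpha=0.95):
    """Average loss over the periods at or beyond the VaR threshold."""
    var = historical_var(returns, alpha)
    tail = [-r for r in returns if -r >= var]
    return sum(tail) / len(tail)


def bootstrap_mean_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_resamples))
    lo = means[int((1 - level) / 2 * n_resamples)]
    hi = means[min(int((1 + level) / 2 * n_resamples), n_resamples - 1)]
    return lo, hi


# Made-up daily strategy returns and an equity curve.
returns = [-0.05, -0.02, 0.01, 0.02, 0.03, 0.0, -0.01, 0.015, 0.005, -0.03]
equity = [100.0, 120.0, 90.0, 130.0]

print("max drawdown:", max_drawdown(equity))
print("VaR(80%):", historical_var(returns, alpha=0.8))
print("CVaR(80%):", historical_cvar(returns, alpha=0.8))
print("95% CI for mean return:", bootstrap_mean_ci(returns))
```

If the confidence interval for the mean excess return over the benchmark includes zero, the outperformance is not distinguishable from noise at that level.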

🛠️ Advanced Configuration

🔧 LSTM Model Parameters

python run_lstm_model.py \
    --ticker AAPL \
    --epochs 50 \
    --batch_size 64 \
    --sequence_length 5 \
    --test_split 0.15 \
    --save_model
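
For readers adapting the script, the flags above could be declared with argparse roughly as follows. The flag names come from the command shown; the types and default values are guesses made for illustration, and the real run_lstm_model.py may define them differently.

```python
import argparse


def build_parser():
    # Flag names mirror the command above; types and defaults are assumptions.
    parser = argparse.ArgumentParser(description="Train the LSTM model (sketch)")
    parser.add_argument("--ticker", type=str, default="AAPL")
    parser.add_argument("--epochs", type=int, default=50)
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--sequence_length", type=int, default=5)
    parser.add_argument("--test_split", type=float, default=0.15)
    parser.add_argument("--save_model", action="store_true")
    return parser


# Parse an explicit argument list so the example runs without a command line.
args = build_parser().parse_args(["--ticker", "AAPL", "--epochs", "10", "--save_model"])
print(vars(args))
```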

🤖 NLP Model Parameters

python run_nlp_model.py \
    --model_name "ProsusAI/finbert" \
    --test_split 0.25 \
    --confidence_threshold 0.8 \
    --initial_capital 100000 \
    --transaction_cost 0.005

🚀 Master Pipeline Options

python run_all_models.py \
    --output_dir institutional_results \
    --quick \
    --skip_lstm  # or --skip_nlp

Repository: nuglifeleoji/Stock-Prediction-with-NLP
