Skip to content

Machine learning solution for predicting CO2 emissions using economic and social indicators. Leverages World Bank data to identify key factors influencing emissions. Helps policymakers make data-driven decisions for climate action and sustainable development.

Notifications You must be signed in to change notification settings

MykeShale/Emmissions-Predictor-AI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CO2 Emissions Prediction Model for SDG 13: Climate Action 🌍

Project Overview

This project implements a machine learning model to predict CO2 emissions based on various economic and social factors, contributing to the United Nations Sustainable Development Goal 13: Climate Action. The model uses real-world data from the World Bank API to analyze and predict CO2 emissions per capita based on GDP, energy consumption, and population metrics.

Problem Statement

Climate change is one of the most pressing challenges of our time, and understanding the factors that contribute to CO2 emissions is crucial for developing effective mitigation strategies. This project aims to:

  • Predict CO2 emissions based on key economic and social indicators
  • Identify the most significant factors influencing emissions
  • Provide insights for policymakers and researchers working on climate action
  • Generate visualizations for better understanding of emission patterns

Technical Implementation

The project uses the following technologies and approaches:

  • Programming Language: Python 3.x
  • Key Libraries:
    • scikit-learn for machine learning
    • pandas for data manipulation
    • matplotlib and seaborn for visualization
    • requests for API data fetching
    • joblib for model persistence
  • Machine Learning Model: Random Forest Regressor
  • Data Source: World Bank API (with fallback to synthetic data)

Installation and Setup

Prerequisites

  • Python 3.x
  • pip (Python package installer)
  • Git (for cloning the repository)

Step-by-Step Installation

  1. Clone the repository:
git clone [repository-url]
cd [repository-name]
  1. Create a virtual environment (recommended):
# Windows
python -m venv venv
.\venv\Scripts\activate

# Linux/Mac
python3 -m venv venv
source venv/bin/activate
  1. Install required packages:
pip install -r requirements.txt

Usage

Running the Model

  1. Train and evaluate the model:
python src/models/co2_emissions_prediction.py

This will:

  • Fetch data from the World Bank API (or use sample data if API is unavailable)
  • Train the Random Forest model
  • Evaluate model performance
  • Save the trained model to output/co2_emissions_model.joblib

Generating Visualizations

To generate all visualizations:

python src/visualizations/generate_visualizations.py

This will create visualizations in the pitch_deck_assets directory.

Project Structure

Emmissions Predictor AI/
├── .git/
├── src/
│   ├── visualizations/
│   │   └── generate_visualizations.py    # Visualization generation script
│   └── models/
│       └── co2_emissions_prediction.py   # Main prediction model
├── pitch_deck_assets/                    # Generated visualizations
├── output/                               # Model outputs and saved models
├── pitch_deck.md                         # Project pitch documentation
├── requirements.txt                      # Python dependencies
├── project_report.md                     # Detailed project report
└── README.md                            # Project documentation

Model Performance

The model achieves the following metrics:

  • Mean Absolute Error (MAE): Measures the average magnitude of errors
  • Mean Squared Error (MSE): Penalizes larger errors more heavily
  • R-squared (R²): Indicates the proportion of variance explained by the model

Expected Performance Metrics

  • MAE: ~50-55
  • MSE: ~4000-4500
  • R²: ~0.95-1.00

Data Sources

The model uses data from the World Bank API, including:

  • CO2 emissions per capita
  • GDP per capita
  • Energy consumption
  • Population statistics

If the API is unavailable, the model falls back to synthetic data for demonstration purposes.

Ethical Considerations

  1. Data Bias: The model may be biased towards countries with better data collection systems
  2. Fairness: Predictions should be interpreted in the context of each country's development stage
  3. Transparency: Model decisions are explainable through feature importance analysis
  4. Data Privacy: All data used is publicly available and aggregated at the country level

Future Improvements

  1. Integration of more data sources:
    • Renewable energy production
    • Industrial activity metrics
    • Climate policy indicators
  2. Implementation of additional machine learning models:
    • Time series analysis
    • Deep learning approaches
  3. Development of a web interface for real-time predictions
  4. Addition of time-series analysis for trend prediction
  5. Implementation of model versioning and tracking

Troubleshooting

Common issues and solutions:

  1. API Connection Issues: If the World Bank API is unavailable, the model will automatically use sample data
  2. Package Installation Errors: Ensure you're using Python 3.x and try updating pip:
    python -m pip install --upgrade pip
  3. Memory Issues: If you encounter memory errors, try reducing the dataset size or using a machine with more RAM

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • United Nations Sustainable Development Goals
  • World Bank for providing the data API
  • Python open-source community for the amazing tools and libraries

Contact

For questions or support, please open an issue in the repository.

About

Machine learning solution for predicting CO2 emissions using economic and social indicators. Leverages World Bank data to identify key factors influencing emissions. Helps policymakers make data-driven decisions for climate action and sustainable development.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages