This project implements a machine learning model to predict CO2 emissions based on various economic and social factors, contributing to the United Nations Sustainable Development Goal 13: Climate Action. The model uses real-world data from the World Bank API to analyze and predict CO2 emissions per capita based on GDP, energy consumption, and population metrics.
Climate change is one of the most pressing challenges of our time, and understanding the factors that contribute to CO2 emissions is crucial for developing effective mitigation strategies. This project aims to:
- Predict CO2 emissions based on key economic and social indicators
- Identify the most significant factors influencing emissions
- Provide insights for policymakers and researchers working on climate action
- Generate visualizations for better understanding of emission patterns
The project uses the following technologies and approaches:
- Programming Language: Python 3.x
- Key Libraries:
  - scikit-learn for machine learning
  - pandas for data manipulation
  - matplotlib and seaborn for visualization
  - requests for API data fetching
  - joblib for model persistence
- Machine Learning Model: Random Forest Regressor
- Data Source: World Bank API (with fallback to synthetic data)
- Python 3.x
- pip (Python package installer)
- Git (for cloning the repository)
- Clone the repository:
git clone [repository-url]
cd [repository-name]
- Create a virtual environment (recommended):
# Windows
python -m venv venv
.\venv\Scripts\activate
# Linux/Mac
python3 -m venv venv
source venv/bin/activate
- Install required packages:
pip install -r requirements.txt
- Train and evaluate the model:
python src/models/co2_emissions_prediction.py
This will:
- Fetch data from the World Bank API (or fall back to synthetic sample data if the API is unavailable)
- Train the Random Forest model
- Evaluate model performance
- Save the trained model to output/co2_emissions_model.joblib
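The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of `co2_emissions_prediction.py`: the helper names (`make_synthetic_data`, `train_and_evaluate`), the feature columns, the 80/20 split, and the made-up relationship between features and emissions are all assumptions. It trains on generated data so it runs without network access.

```python
import os

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def make_synthetic_data(n_samples=500, seed=42):
    """Hypothetical stand-in for the World Bank data (not the project's generator)."""
    rng = np.random.default_rng(seed)
    gdp = rng.uniform(500, 60_000, n_samples)      # GDP per capita
    energy = rng.uniform(100, 8_000, n_samples)    # energy use per capita
    population = rng.uniform(1e5, 1e9, n_samples)  # total population
    # Made-up linear relationship plus noise, purely for demonstration.
    co2 = 0.0001 * gdp + 0.002 * energy + rng.normal(0, 0.5, n_samples)
    return np.column_stack([gdp, energy, population]), co2


def train_and_evaluate(model_path, seed=42):
    """Train a Random Forest, score it on a held-out split, and save it."""
    X, y = make_synthetic_data(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)
    metrics = {
        "mae": mean_absolute_error(y_test, y_pred),
        "mse": mean_squared_error(y_test, y_pred),
        "r2": r2_score(y_test, y_pred),
    }

    # Create the target directory if needed, then persist the fitted model.
    os.makedirs(os.path.dirname(model_path) or ".", exist_ok=True)
    joblib.dump(model, model_path)
    return metrics
```

Calling `train_and_evaluate("output/co2_emissions_model.joblib")` returns the metrics as a dictionary, and the saved model can later be restored with `joblib.load(...)` and used via `.predict(...)` on rows with the same column order.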
To generate all visualizations:
python src/visualizations/generate_visualizations.py
This will create visualizations in the pitch_deck_assets directory.
Emmissions Predictor AI/
├── .git/
├── src/
│ ├── visualizations/
│ │ └── generate_visualizations.py # Visualization generation script
│ └── models/
│ └── co2_emissions_prediction.py # Main prediction model
├── pitch_deck_assets/ # Generated visualizations
├── output/ # Model outputs and saved models
├── pitch_deck.md # Project pitch documentation
├── requirements.txt # Python dependencies
├── project_report.md # Detailed project report
└── README.md # Project documentation
The model is evaluated with the following metrics:
- Mean Absolute Error (MAE): Measures the average magnitude of errors
- Mean Squared Error (MSE): Penalizes larger errors more heavily
- R-squared (R²): Indicates the proportion of variance explained by the model
Typical results:
- MAE: ~50-55
- MSE: ~4000-4500
- R²: ~0.95-1.00
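To make the three metrics concrete, here is a small dependency-free sketch that computes them by hand. The function name `regression_metrics` and the sample numbers are made up for illustration; the project itself presumably relies on the equivalent scikit-learn functions.

```python
def regression_metrics(y_true, y_pred):
    """Return MAE, MSE and R² for two equal-length lists of numbers."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n  # average error magnitude
    mse = sum(e * e for e in errors) / n   # squaring weights large errors more

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variation
    if ss_tot == 0:
        raise ValueError("R² is undefined when all true values are identical")
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "mse": mse, "r2": r2}


# Made-up values, purely for illustration.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.0, 9.0]
print(regression_metrics(actual, predicted))
```

Note that MAE is expressed in the units of the target and MSE in those units squared, so the two are not directly comparable with each other.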
The model uses data from the World Bank API, including:
- CO2 emissions per capita
- GDP per capita
- Energy consumption
- Population statistics
If the API is unavailable, the model falls back to synthetic data for demonstration purposes.
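The fetch-with-fallback behaviour could look something like the sketch below. It is an illustration under several assumptions: the function names, the synthetic value range, and the choice of a single indicator per call are hypothetical, and it uses `urllib` from the standard library instead of `requests` only to stay dependency-free. `NY.GDP.PCAP.CD` is the World Bank code for GDP per capita (current US$); which indicator codes the project actually requests is not stated here.

```python
import http.client
import json
import random
import urllib.request

API_URL = (
    "https://api.worldbank.org/v2/country/all/indicator/{indicator}"
    "?format=json&date={year}&per_page=500"
)


def http_get_json(url, timeout=10):
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def parse_records(payload):
    """Map ISO3 country code -> value, skipping missing observations."""
    # A successful response is a two-element list: [paging metadata, records].
    if not isinstance(payload, list) or len(payload) < 2 or payload[1] is None:
        raise ValueError("unexpected World Bank response shape")
    return {
        row["countryiso3code"]: float(row["value"])
        for row in payload[1]
        if row.get("value") is not None and row.get("countryiso3code")
    }


def synthetic_values(n_countries=50, seed=0):
    """Hypothetical fallback: random values keyed by fake country codes."""
    rng = random.Random(seed)
    return {f"C{i:02d}": rng.uniform(500.0, 60000.0) for i in range(n_countries)}


def load_indicator(indicator, year, get_json=http_get_json):
    """Return (values, source), where source is 'api' or 'synthetic'."""
    url = API_URL.format(indicator=indicator, year=year)
    try:
        values = parse_records(get_json(url))
        if not values:
            raise ValueError("no observations returned")
        return values, "api"
    except (OSError, ValueError, KeyError, TypeError, http.client.HTTPException):
        # Network failure, bad JSON, or an unexpected payload: use demo data.
        return synthetic_values(), "synthetic"
```

Returning the source alongside the values makes it easy to log or display whether a given run used real or synthetic data, which matters when interpreting the resulting metrics.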
- Data Bias: The model may be biased towards countries with better data collection systems
- Fairness: Predictions should be interpreted in the context of each country's development stage
- Transparency: Model decisions are explainable through feature importance analysis
- Data Privacy: All data used is publicly available and aggregated at the country level
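The feature importance analysis mentioned under Transparency can be illustrated with a short sketch. The feature names and the generated data are hypothetical; the target is deliberately constructed so that the second feature dominates and the third has no effect.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical feature names; the project's actual column names may differ.
FEATURES = ["gdp_per_capita", "energy_use_per_capita", "population"]

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, len(FEATURES)))
# Made-up target: driven mostly by the second column, not at all by the third.
y = 1.0 * X[:, 0] + 5.0 * X[:, 1] + rng.normal(0.0, 0.1, size=300)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances are non-negative and sum to 1 across features.
ranked = sorted(
    zip(FEATURES, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances describe what the fitted forest relied on, not causal effects, and they can be skewed by correlated inputs; `sklearn.inspection.permutation_importance` on held-out data is a common cross-check.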
- Integration of more data sources:
  - Renewable energy production
  - Industrial activity metrics
  - Climate policy indicators
- Implementation of additional machine learning models:
  - Time series analysis
  - Deep learning approaches
- Development of a web interface for real-time predictions
- Addition of time-series analysis for trend prediction
- Implementation of model versioning and tracking
Common issues and solutions:
- API Connection Issues: If the World Bank API is unavailable, the model automatically falls back to synthetic sample data
- Package Installation Errors: Ensure you're using Python 3.x and try updating pip:
python -m pip install --upgrade pip
- Memory Issues: If you encounter memory errors, try reducing the dataset size or using a machine with more RAM
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- United Nations Sustainable Development Goals
- World Bank for providing the data API
- Python open-source community for the amazing tools and libraries
For questions or support, please open an issue in the repository.