🚗 Car Price Prediction

A Machine Learning system for predicting car prices using advanced regression models and custom evaluation metrics.

Features • Installation • Usage • Models • NewMetric • Project Structure

📋 Overview

This project implements a comprehensive car price prediction system using machine learning techniques. It features a custom evaluation metric called NewMetric designed specifically for car price prediction, along with multiple regression models for comparison.

Key Highlights

🎯 Custom NewMetric for specialized car price evaluation
🤖 Multiple ML Models including Gradient Boosting, Random Forest, and more
📊 Visual Analytics with detailed comparison charts
💾 Model Persistence for easy deployment and reuse
🔄 Interactive Prediction system for real-time price estimation

✨ Features

Feature	Description
Model Training	Train multiple regression models and compare their performance
NewMetric Evaluation	Custom metric combining MAE, RMSE, and Relative Error
Model Comparison	Side-by-side comparison of 5 different ML algorithms
Visualization	Generate publication-ready charts and graphs
Model Export	Save trained models for production use
Batch Prediction	Predict prices for multiple cars from Excel files
Interactive CLI	User-friendly command-line interface

🛠 Installation

Prerequisites

Python 3.8 or higher
pip package manager

Step 1: Clone the Repository

git clone https://github.com/DaneshCode/CarPricePrediction.git
cd CarPricePrediction

Step 2: Install Dependencies

pip install pandas numpy scikit-learn matplotlib openpyxl

Required Libraries

Library	Version	Purpose
`pandas`	≥1.3.0	Data manipulation and analysis
`numpy`	≥1.20.0	Numerical computing
`scikit-learn`	≥1.0.0	Machine learning algorithms
`matplotlib`	≥3.4.0	Data visualization
`openpyxl`	≥3.0.0	Excel file support

🚀 Usage

1. Training the Model

To train and build the final prediction model:

python car_price_prediction.py

What this does:

Loads data from data.xlsx
Trains a Gradient Boosting model with optimized parameters
Evaluates performance using NewMetric
Saves the trained model to car_price_model.pkl
Generates visualization charts

Output Files:

car_price_model.pkl - Trained model file
final_model_results.png - Prediction vs Actual charts
final_feature_importance.png - Feature importance visualization

2. Comparing Multiple Models

To compare different ML algorithms:

python Comparison_of_models.py

Models Compared:

Linear Regression
Ridge Regression
Lasso Regression
Random Forest Regressor
Gradient Boosting Regressor

Output Files:

model_results.png - Model comparison charts
feature_importance.png - Feature importance for best model
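The comparison loop can be sketched roughly as follows. This is an illustration, not the repository's script: the data is synthetic, the hyperparameters are scikit-learn defaults or arbitrary choices, and only MAPE is computed to keep the example short.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for data.xlsx (assumption): four features in [0, 1]
# and a strictly positive price with a mild feature interaction.
rng = np.random.default_rng(0)
X = rng.random((300, 4))
y = 1e9 * (0.5 + 0.6 * X[:, 0] * X[:, 1] - 0.2 * X[:, 2]) + rng.normal(0, 1e7, 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The five algorithm families listed above; settings here are illustrative.
models = {
    "Linear Regression": LinearRegression(),
    "Ridge Regression": Ridge(alpha=1.0),
    "Lasso Regression": Lasso(alpha=1.0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
}

# Fit each model on the same split and score it on the held-out rows.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = mean_absolute_percentage_error(y_test, model.predict(X_test))

# Print models from lowest to highest error.
for name, mape in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name:<20} MAPE = {mape:.2%}")
```

Using one shared train/test split for every model keeps the scores comparable; the ranking on real data may differ from what this synthetic example produces.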

3. Using the Trained Model

To make predictions with the saved model:

python use_model.py

Available Options:

Option	Description
`1`	Display list of features
`2`	Interactive prediction (enter values manually)
`3`	Batch prediction from Excel file
`4`	Exit

4. Programmatic Usage

from use_model import load_model, predict_price

# Load the trained model
model, feature_names = load_model("car_price_model.pkl")

# Prepare feature values (normalized between 0 and 1)
feature_values = {
    "کیلومتر_نرمال": 0.3,
    "سال_نرمال": 0.8,
    # ... other features
}

# Get prediction
predicted_price = predict_price(model, feature_names, feature_values)
print(f"Predicted Price: {predicted_price:,.0f} Toman")

5. Batch Prediction from Excel

from use_model import load_model, predict_from_excel

# Load model
model, feature_names = load_model()

# Predict for all cars in Excel file
results = predict_from_excel(
    model,
    feature_names,
    excel_path="new_cars.xlsx",
    output_path="predictions.xlsx"
)

🤖 Models

Supported Algorithms

Model	Description	Best For
Gradient Boosting	Ensemble of weak learners	⭐ Best overall performance
Random Forest	Ensemble of decision trees	Robust to overfitting
Ridge Regression	L2 regularized linear	When features are correlated
Lasso Regression	L1 regularized linear	Feature selection
Linear Regression	Basic linear model	Baseline comparison

Final Model Configuration

The production model uses Gradient Boosting Regressor with optimized parameters:

GradientBoostingRegressor(
    n_estimators=200,
    learning_rate=0.1,
    max_depth=5,
    min_samples_split=5,
    min_samples_leaf=2,
    subsample=0.8,
    random_state=42
)
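As a rough sketch of how this configuration might be trained and persisted: the example below fits the same estimator on synthetic data and round-trips it through pickle. The data, the placeholder feature names, and the dictionary layout of the saved file are assumptions made for illustration; the actual format of car_price_model.pkl is not documented here.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for data.xlsx (assumption): three features in [0, 1].
rng = np.random.default_rng(42)
X = rng.random((300, 3))
y = 1e9 * (0.5 + 0.5 * X[:, 1] - 0.3 * X[:, 0]) + rng.normal(0, 1e7, 300)
feature_names = ["feature_a", "feature_b", "feature_c"]  # hypothetical names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Same hyperparameters as the configuration shown above.
model = GradientBoostingRegressor(
    n_estimators=200,
    learning_rate=0.1,
    max_depth=5,
    min_samples_split=5,
    min_samples_leaf=2,
    subsample=0.8,
    random_state=42,
)
model.fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # R^2 on the held-out rows

# Save the model together with its feature names, then load it back.
# The dict layout is an assumption, not necessarily what the project writes.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo_model.pkl"
    with path.open("wb") as f:
        pickle.dump({"model": model, "feature_names": feature_names}, f)
    with path.open("rb") as f:
        restored = pickle.load(f)

same = np.allclose(model.predict(X_test), restored["model"].predict(X_test))
print(f"R^2 on held-out data: {r2:.3f}; predictions survive pickling: {same}")
```

Pickle files should only be loaded from trusted sources, and a model is safest to unpickle with the same scikit-learn version that saved it.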

🎯 NewMetric

What is NewMetric?

NewMetric is a custom evaluation metric designed specifically for car price prediction. It combines multiple error measures to provide a comprehensive assessment of model performance.

Formula

$$\text{NewMetric} = 0.4 \times \text{MAE}_{norm} + 0.4 \times \text{RMSE}_{norm} + 0.2 \times \text{RelativeError}$$

Where:

MAE_norm = MAE / Mean Price (Normalized Mean Absolute Error)
RMSE_norm = RMSE / Mean Price (Normalized Root Mean Square Error)
RelativeError = Mean of |Actual - Predicted| / Actual
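A minimal sketch of the formula in plain Python follows. The function name `new_metric` is chosen for this example and may not match the name used in the project's scripts; the inputs are the three sample predictions shown in this README.

```python
import math


def new_metric(actual, predicted):
    """Return 0.4 * MAE_norm + 0.4 * RMSE_norm + 0.2 * RelativeError."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal length")

    n = len(actual)
    abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    mean_price = sum(actual) / n

    mae = sum(abs_errors) / n
    rmse = math.sqrt(sum(e * e for e in abs_errors) / n)
    # Relative error divides by each actual price, so prices must be non-zero.
    relative_error = sum(e / a for e, a in zip(abs_errors, actual)) / n

    return 0.4 * mae / mean_price + 0.4 * rmse / mean_price + 0.2 * relative_error


# Prices in Toman, taken from the sample predictions in this README.
actual = [1_200_000_000, 850_000_000, 500_000_000]
predicted = [1_180_000_000, 870_000_000, 450_000_000]

score = new_metric(actual, predicted)
print(f"NewMetric: {score:.4f}")
```

For these three cars the score comes out at roughly 0.039. Because all three terms are ratios, the metric does not depend on the currency unit.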

Interpretation

NewMetric Value	Performance
< 0.10	🟢 Excellent
0.10 - 0.15	🟡 Good
0.15 - 0.20	🟠 Average
> 0.20	🔴 Needs Improvement

Note: Lower values indicate better performance.
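The table can be turned into a small lookup helper. The sketch below is illustrative only: the function name is hypothetical, and because the table does not say which band the boundary values 0.10, 0.15, and 0.20 belong to, the code assumes 0.10 counts as Good and each upper bound is inclusive (consistent with "> 0.20" for the last band).

```python
def rate_new_metric(value):
    """Map a NewMetric value to the performance label from the table above."""
    if value < 0:
        raise ValueError("NewMetric cannot be negative")
    if value < 0.10:
        return "Excellent"
    if value <= 0.15:  # assumption: 0.10 and 0.15 both count as Good
        return "Good"
    if value <= 0.20:  # assumption: 0.20 counts as Average
        return "Average"
    return "Needs Improvement"


for value in (0.039, 0.12, 0.18, 0.25):
    print(f"{value:.3f} -> {rate_new_metric(value)}")
```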

📁 Project Structure

CarPricePrediction/
│
├── 📄 car_price_prediction.py    # Main training script with final model
├── 📄 Comparison_of_models.py    # Model comparison and evaluation
├── 📄 use_model.py               # Inference and prediction utilities
│
├── 📊 data.xlsx                  # Training dataset (required)
├── 🤖 car_price_model.pkl        # Saved model (generated)
│
├── 📈 final_model_results.png    # Prediction charts (generated)
├── 📈 final_feature_importance.png
├── 📈 model_results.png
├── 📈 feature_importance.png
│
├── 📖 README.md                  # English documentation
└── 📖 README_FA.md               # Persian documentation

File Descriptions

File	Purpose
`car_price_prediction.py`	Trains the final Gradient Boosting model, evaluates it, and saves it for production use
`Comparison_of_models.py`	Compares 5 different ML models using NewMetric and traditional metrics
`use_model.py`	Provides utilities for loading saved models and making predictions
`data.xlsx`	Excel file containing training data with normalized features
`car_price_model.pkl`	Serialized trained model for deployment

📊 Data Format

The input Excel file (data.xlsx) should contain:

Required Columns

Column	Type	Description
`قیمت`	Numeric	Target variable (price in Toman)
`*_نرمال`	Numeric (0-1)	Normalized feature columns

Example Features

کیلومتر_نرمال - Normalized mileage
سال_نرمال - Normalized year
رنگ_نرمال - Normalized color encoding
And more...
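One way to check a dataset against this layout is sketched below. The helper name is hypothetical, the validation rules are an assumption based on the column description above, and a tiny in-memory table stands in for the Excel file.

```python
import pandas as pd

TARGET = "قیمت"            # price column (Toman)
FEATURE_SUFFIX = "_نرمال"  # suffix shared by the normalized feature columns


def split_features_and_target(df):
    """Return (features, target) after checking the expected layout."""
    if TARGET not in df.columns:
        raise KeyError(f"missing target column: {TARGET}")

    feature_cols = [c for c in df.columns if str(c).endswith(FEATURE_SUFFIX)]
    if not feature_cols:
        raise ValueError("no normalized feature columns found")

    # between() is inclusive on both ends; NaN values also fail this check.
    out_of_range = [c for c in feature_cols if not df[c].between(0, 1).all()]
    if out_of_range:
        raise ValueError(f"columns outside [0, 1]: {out_of_range}")

    return df[feature_cols], df[TARGET]


# In-memory stand-in; with the real file use: df = pd.read_excel("data.xlsx")
df = pd.DataFrame(
    {
        "کیلومتر_نرمال": [0.30, 0.75],
        "سال_نرمال": [0.80, 0.40],
        "قیمت": [1_200_000_000, 500_000_000],
    }
)

X, y = split_features_and_target(df)
print(list(X.columns), y.tolist())
```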

📈 Output Examples

Model Comparison Chart

The system generates comparison charts showing:

NewMetric scores for all models
MAPE (Mean Absolute Percentage Error)
Actual vs Predicted scatter plot
Error distribution histogram

Sample Predictions

✅ Actual: 1,200,000,000 | Predicted: 1,180,000,000 | Error: 1.7%
✅ Actual:   850,000,000 | Predicted:   870,000,000 | Error: 2.4%
⚠️ Actual:   500,000,000 | Predicted:   450,000,000 | Error: 10.0%
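The error column is the absolute difference as a percentage of the actual price. A short sketch that recomputes it for the three rows above (the helper name is chosen for this example):

```python
# (actual, predicted) pairs in Toman, copied from the sample output above.
samples = [
    (1_200_000_000, 1_180_000_000),
    (850_000_000, 870_000_000),
    (500_000_000, 450_000_000),
]


def percent_error(actual, predicted):
    """Absolute error as a percentage of the actual price."""
    return abs(actual - predicted) / actual * 100


errors = [round(percent_error(a, p), 1) for a, p in samples]

for (a, p), err in zip(samples, errors):
    print(f"Actual: {a:>13,} | Predicted: {p:>13,} | Error: {err}%")
```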

🔧 Troubleshooting

Common Issues

Issue	Solution
Model file not found	Run `car_price_prediction.py` first to generate the model
Missing features warning	Make sure the column names in your data match the model's expected features (option `1` in `use_model.py` lists them)
Memory error	Reduce dataset size or use a machine with more RAM
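When a missing-features warning appears, comparing the expected feature names with the ones you supply usually pinpoints the problem. The helper below is a hypothetical sketch and not part of use_model.py; it accepts any list of expected names, such as the second value returned by load_model.

```python
def check_features(expected, provided):
    """Return (missing, unexpected) feature names.

    expected: names the model was trained on, in order.
    provided: names you are about to pass in (dict keys or column names).
    """
    provided_names = set(provided)
    missing = [name for name in expected if name not in provided_names]
    unexpected = sorted(provided_names - set(expected))
    return missing, unexpected


# Illustrative names only; one expected feature is absent and one is extra.
expected = ["کیلومتر_نرمال", "سال_نرمال", "رنگ_نرمال"]
provided = {"کیلومتر_نرمال": 0.3, "سال_نرمال": 0.8, "extra_column": 1.0}

missing, unexpected = check_features(expected, provided)
print("Missing:", missing)
print("Unexpected:", unexpected)
```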

Font Issues (Persian Display)

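If Persian text doesn't display correctly in charts, make sure a font that includes Persian glyphs is installed, then tell matplotlib to use it before plotting:

import matplotlib.pyplot as plt

plt.rcParams["font.family"] = "DejaVu Sans"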

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

👥 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

📬 Contact

For questions or support, please open an issue on GitHub.

Made with ❤️ for the Car Industry

⭐ Star this repo if you find it helpful!
