Car price prediction using Linear Regression with real-world Pakistani car data.

Gurupriyan666/Pakistani-Car-Price-Prediction-Linear-Regression

Pakistani-Car-Price-Prediction-Linear-Regression

📘 Pakistani Car Price Prediction using Linear Regression

📌 Project Overview

This project demonstrates how Machine Learning can be learned through research and applied to a real-world problem. Linear Regression is used to predict car prices based on Pakistani car market data.

The goal of this project is to understand Linear Regression practically, analyze its performance, visualize predictions, and identify its limitations.

🎯 Objective

  • Learn Machine Learning through research.

  • Apply Linear Regression in a real project.

  • Predict car prices using real-world data.

  • Analyze model behavior and limitations.

  • Connect ML theory with real results.

🧠 Machine Learning Algorithm

Linear Regression

Linear Regression is a supervised machine learning algorithm used to predict continuous values by learning linear relationships between input features and the target variable.
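To make the definition concrete, here is a minimal, dependency-free sketch of fitting a line to one feature with ordinary least squares. The function names and the numbers are illustrative assumptions, not code or data from this project.

```python
# Minimal sketch: ordinary least squares for a single feature, standard library only.
# The numbers below are invented for illustration, not taken from the dataset.

def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimising squared error for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Apply the fitted line to a new input value."""
    return slope * x + intercept


# Hypothetical data: manufacturing year vs. price in arbitrary units.
# The relationship is exactly linear here (price rises 1.0 per year).
years = [2010, 2012, 2014, 2016, 2018]
prices = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_simple_linear_regression(years, prices)
predicted_2020 = predict(2020, slope, intercept)
```

With several features the same idea applies, but the slope becomes a vector of coefficients that is usually solved with a linear algebra routine.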

📂 Dataset

The dataset contains Pakistani car information including:

  • Car Model

  • Manufacturing Year

  • Fuel Type

  • Car Price
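As a rough sketch of how rows with these four fields might be loaded and cleaned, the example below parses CSV text and drops rows whose year or price is missing or malformed. The column names (`model`, `year`, `fuel`, `price`) and every value are assumptions made for illustration; the real file's headers and contents may differ.

```python
import csv
import io

# Invented sample rows. The column names are assumptions, not the real file's headers.
RAW_CSV = """model,year,fuel,price
Model A,2015,Petrol,2500000
Model B,2018,Petrol,4200000
Model C,2010,CNG,
Model D,abc,Petrol,1500000
"""


def load_clean_rows(text):
    """Parse CSV text and keep only rows with an integer year and a numeric price."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        try:
            rows.append({
                "model": record["model"].strip(),
                "year": int(record["year"]),
                "fuel": record["fuel"].strip(),
                "price": float(record["price"]),
            })
        except ValueError:
            # Skip rows with missing or malformed numeric fields
            continue
    return rows


cars = load_clean_rows(RAW_CSV)
```

Here the third row (empty price) and the fourth row (non-numeric year) are discarded, leaving two usable records.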

⚙️ Project Workflow

  • Data Loading & Cleaning

  • Exploratory Data Analysis

  • Feature Encoding

  • Train-Test Split

  • Linear Regression Model Training

  • Prediction Generation

  • Model Evaluation

  • Visualization

  • Insight Extraction
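The encoding, split, training, prediction, and evaluation steps above can be sketched end to end with NumPy. This is not the project's actual code: the toy records, the 25% hold-out, the fixed seed, and the choice of `np.linalg.lstsq` as the solver are all assumptions made to keep the example small and reproducible.

```python
import numpy as np

# Invented toy records: (year, fuel type, price). Not the project's data.
records = [
    (2010, "Petrol", 10.0), (2012, "Petrol", 12.0), (2014, "Petrol", 14.0),
    (2016, "Petrol", 16.0), (2018, "Petrol", 18.0), (2020, "Petrol", 20.0),
    (2010, "Diesel", 13.0), (2012, "Diesel", 15.0), (2014, "Diesel", 17.0),
    (2016, "Diesel", 19.0), (2018, "Diesel", 21.0), (2020, "Diesel", 23.0),
]

years = np.array([r[0] for r in records], dtype=float)
fuels = [r[1] for r in records]
prices = np.array([r[2] for r in records], dtype=float)

# Feature encoding: one indicator column per fuel type. There is no separate
# intercept column, so each indicator acts as that fuel type's own baseline.
fuel_types = sorted(set(fuels))  # ["Diesel", "Petrol"]
one_hot = np.array([[1.0 if f == t else 0.0 for t in fuel_types] for f in fuels])
# Subtract a fixed reference year so the feature values stay small
X = np.column_stack([years - 2015.0, one_hot])

# Train-test split: shuffle row indices with a fixed seed, hold out 25%
rng = np.random.default_rng(0)
order = rng.permutation(len(records))
n_test = len(records) // 4
test_idx, train_idx = order[:n_test], order[n_test:]

# Model training: least-squares solution of X_train @ weights = y_train
weights, *_ = np.linalg.lstsq(X[train_idx], prices[train_idx], rcond=None)

# Prediction and evaluation on the held-out rows
predictions = X[test_idx] @ weights
errors = prices[test_idx] - predictions
mae = float(np.mean(np.abs(errors)))
rmse = float(np.sqrt(np.mean(errors ** 2)))
```

Because the toy data is exactly linear, the held-out error is essentially zero. Real market data will not behave this way, which is why the evaluation and residual analysis steps matter.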

📊 Visualizations Used

  • Price Distribution

  • Price vs Year

  • Average Price by Fuel Type

  • Actual vs Predicted Prices

  • Residual Plot

🔍 Key Insights

  1. Linear Regression captures general pricing trends.

  2. High-priced cars are under-predicted.

  3. Low-priced cars are sometimes over-predicted.

  4. Manufacturing year is the most influential feature.

  5. Data skewness and outliers affect model accuracy.

  6. Linear Regression works best as a baseline model.

⚠️ Limitations

  1. The linearity assumption restricts performance.

  2. The model is sensitive to outliers.

  3. The feature set is limited.

  4. The choice of categorical encoding affects predictions.

🚀 Future Improvements

  • Add mileage, engine size, transmission.

  • Use One-Hot Encoding.

  • Apply Polynomial Regression.

  • Compare with Random Forest and XGBoost.
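As a sketch of the Polynomial Regression idea, the example below fits a straight line and a quadratic to the same curved data and compares their errors. The data is invented, `np.polyfit` is just one convenient way to do this, and the error is measured on the training points only, so it says nothing about how either model would generalise.

```python
import numpy as np

# Invented data with a curved trend: price = 5 + 0.5 * (years since 2010)^2
years_since_2010 = np.arange(7, dtype=float)
prices = 5.0 + 0.5 * years_since_2010 ** 2


def rmse_for_degree(degree):
    """Fit a polynomial of the given degree and return its RMSE on the same points."""
    coeffs = np.polyfit(years_since_2010, prices, degree)
    fitted = np.polyval(coeffs, years_since_2010)
    return float(np.sqrt(np.mean((prices - fitted) ** 2)))


rmse_linear = rmse_for_degree(1)     # straight line cannot follow the curve
rmse_quadratic = rmse_for_degree(2)  # quadratic matches the generating formula
```

Polynomial regression is still linear in its coefficients, so it keeps the interpretability of a linear model while allowing curved relationships between year and price.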

🧪 Learning Outcome

This project helped me understand:

  • Machine Learning workflow

  • Linear Regression theory and practice

  • Data preprocessing importance

  • Feature engineering

  • Model evaluation

  • Visualization interpretation

  • Real-world ML challenges

🏁 Conclusion

This project demonstrates learning Machine Learning through research and applying it to a real-world car price prediction problem using Linear Regression.

🧑‍💻 Author

Gurupriyan K, Data Analyst