A web-based application that predicts house prices in Pune using machine learning and provides AI-powered location insights through an interactive map interface.
- Project Overview
- Features
- Technologies
- Project Structure
- Installation
- Usage
- API Endpoints
- Model Details
- Contributing
Project Overview
This project is a comprehensive house price prediction system specifically designed for the Pune real estate market. It combines a machine learning model trained on local property data with a Flask-based web API and interactive HTML frontends. Users can input property details to get price predictions and explore locations on a map to receive AI-generated summaries of real estate trends and insights.
Features
- Price Prediction: Predict house prices based on features like area, BHK, bathrooms, furnishing status, property age, and distances to amenities (school, hospital, metro).
- Interactive Map: Click on locations in Pune to get AI-powered real estate summaries including nearby localities, average prices, rental estimates, and development trends.
- Reverse Geocoding: Automatically converts coordinates to readable addresses using OpenStreetMap.
- AI-Powered Insights: Integrates Google Gemini AI for generating detailed location summaries.
- Responsive Web Interface: Clean, user-friendly HTML frontends for both prediction and map exploration.
Technologies
- Backend: Flask, Flask-CORS
- Machine Learning: scikit-learn, RandomForestRegressor
- Data Processing: pandas, numpy
- AI Integration: Google Generative AI (Gemini)
- Geocoding: OpenStreetMap Nominatim API
- Frontend: HTML, CSS, JavaScript, Leaflet.js, Bootstrap
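The Nominatim entry in the list above refers to OpenStreetMap's public reverse-geocoding service. As a rough sketch of how a latitude/longitude pair from a map click could be turned into a lookup, the snippet below builds the request URL and, separately, fetches the readable address. The function names, the `jsonv2` format choice, and the User-Agent value are illustrative assumptions and are not taken from app_with_reverse_geocoding.py.

```python
import json
import urllib.parse
import urllib.request

NOMINATIM_REVERSE_URL = "https://nominatim.openstreetmap.org/reverse"


def build_reverse_url(lat: float, lng: float) -> str:
    """Build a Nominatim reverse-geocoding URL for one coordinate pair."""
    if not (-90 <= lat <= 90 and -180 <= lng <= 180):
        raise ValueError("coordinates out of range")
    # Nominatim names the longitude parameter "lon", not "lng".
    query = urllib.parse.urlencode({"format": "jsonv2", "lat": lat, "lon": lng})
    return f"{NOMINATIM_REVERSE_URL}?{query}"


def reverse_geocode(lat: float, lng: float, timeout: float = 10.0) -> str:
    """Fetch the readable address for a coordinate pair (performs a network call)."""
    request = urllib.request.Request(
        build_reverse_url(lat, lng),
        # Nominatim's usage policy asks clients to identify themselves;
        # this User-Agent value is a placeholder.
        headers={"User-Agent": "house-price-prediction-pune-example"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # "display_name" holds the full formatted address in Nominatim responses.
    return payload.get("display_name", "")


# Approximate centre of Pune, used only as sample input.
url = build_reverse_url(18.5204, 73.8567)
print(url)
```

Only `build_reverse_url` runs when the snippet is executed; `reverse_geocode` contacts the public service and should be called sparingly.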
Project Structure
- app_with_reverse_geocoding.py: Main Flask application with prediction and summarization endpoints
- train_model.py: Script to train the RandomForest model on property data
- house_price_model.pkl: Trained machine learning model (generated by train_model.py)
- latestnewdataset.csv: Training dataset (property listings with features and prices)
- index.html: Main prediction interface
- solomagicwand.html: Interactive map for location summaries
- requirements.txt: Python dependencies
Installation
- Clone the repository:
    git clone https://github.com/yourusername/house-price-prediction-pune.git
    cd house-price-prediction-pune
- Create and activate a virtual environment:
    python -m venv venv
    venv\Scripts\activate        # On Windows
    # source venv/bin/activate   # On macOS/Linux
- Install dependencies:
    pip install -r requirements.txt
- Set up Google Gemini API key:
  - Obtain an API key from Google AI Studio
  - Replace "API_KEY" in app_with_reverse_geocoding.py with your actual API key
- Train the model (optional, if you want to retrain):
    python train_model.py
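For the API key step above, an alternative to pasting the key into the source file is to read it from an environment variable, which keeps the key out of version control. The sketch below is not what app_with_reverse_geocoding.py currently does; the variable name `GEMINI_API_KEY` and the helper name are assumptions chosen for illustration.

```python
import os


def load_gemini_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var, "").strip()
    # Reject both a missing value and the unedited "API_KEY" placeholder.
    if not key or key == "API_KEY":
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Gemini API key"
        )
    return key
```

The variable would be set in the shell before starting the server, for example with `set GEMINI_API_KEY=...` on Windows or `export GEMINI_API_KEY=...` on macOS/Linux.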
Usage
- Start the Flask server:
    python app_with_reverse_geocoding.py
  The server will run on http://localhost:5000
- Open index.html in your browser for the price prediction interface:
  - Fill in property details (area, BHK, bathrooms, etc.)
  - Click "Predict Price" to get an estimated price in lakhs
- Open solomagicwand.html for the interactive map:
  - Click the "Magic Wand" button to activate location selection
  - Click anywhere on the map to get an AI-generated real estate summary for that location
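The prediction page sends the property details to the Flask server, and the same call could be made from a script. The sketch below uses the request-body field names documented in this README. The `/predict` route name, the sample values, the distance units, and the reading of "Area" as a locality name are assumptions that should be checked against app_with_reverse_geocoding.py.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"
PREDICT_ROUTE = "/predict"  # assumed route name; confirm in the Flask app

TEXT_FIELDS = ("Area", "Furnishing")
NUMERIC_FIELDS = (
    "Area_sqft", "BHK", "Bathrooms", "Age",
    "Distance_School", "Distance_Hospital", "Distance_Metro",
)


def build_payload(**fields) -> dict:
    """Validate the inputs and return a request body with the documented keys."""
    expected = TEXT_FIELDS + NUMERIC_FIELDS
    missing = [name for name in expected if name not in fields]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    for name in NUMERIC_FIELDS:
        value = fields[name]
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be a number")
    return {name: fields[name] for name in expected}


def predict_price(payload: dict, base_url: str = BASE_URL) -> float:
    """POST the payload to the running server and return the predicted price."""
    request = urllib.request.Request(
        base_url + PREDICT_ROUTE,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["predictedPrice"]


# Hypothetical sample listing; the values are made up for illustration.
example = build_payload(
    Area="Baner", Area_sqft=950, BHK=2, Bathrooms=2,
    Furnishing="Semi-Furnished", Age=5,
    Distance_School=1.2, Distance_Hospital=2.5, Distance_Metro=3.0,
)
print(json.dumps(example, indent=2))
```

With the server running, `predict_price(example)` would return the estimate in lakhs; the snippet itself only builds and prints the payload.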
API Endpoints
Predicts house price based on input features.
Request Body:
{
  "Area": "string",
  "Area_sqft": number,
  "BHK": number,
  "Bathrooms": number,
  "Furnishing": "string",
  "Age": number,
  "Distance_School": number,
  "Distance_Hospital": number,
  "Distance_Metro": number
}
Response:
{
  "predictedPrice": number
}
Provides an AI-generated real estate summary for the given coordinates.
Request Body:
{
  "lat": number,
  "lng": number
}
Response:
{
  "summary": "string"
}
Model Details
- Algorithm: RandomForestRegressor with 200 estimators
- Features: Area, Area (sq.ft.), BHK, Bathrooms, Furnishing Status, Age of Property, Distance to School/Hospital/Metro
- Target: Price in Lakhs
- Preprocessing: One-hot encoding for categorical features, standard scaling for numerical features
- Training Data: latestnewdataset.csv (local Pune property listings)
- Evaluation: R² score and RMSE calculated during training
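To make the two evaluation metrics concrete, here is a plain-Python sketch of their definitions applied to a handful of made-up prices in lakhs. It illustrates what R² and RMSE measure and is not the code in train_model.py; the function names and numbers are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: typical size of a prediction error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(actual, predicted):
    """Share of the variance in the actual prices explained by the model."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Made-up held-out prices and predictions, in lakhs.
actual = [50.0, 75.0, 100.0, 125.0]
predicted = [55.0, 70.0, 105.0, 120.0]

print(rmse(actual, predicted))                  # 5.0
print(round(r_squared(actual, predicted), 3))   # 0.968
```

Here every prediction is off by 5 lakhs, so the RMSE is 5.0, while R² stays close to 1 because those errors are small relative to the spread of the actual prices.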
Note: Model accuracy depends on the quality and recency of the training data. The dataset used to train this model was not fully accurate, so its predictions are imprecise; the model can be retrained or fine-tuned if a more accurate dataset becomes available. For production use, consider updating the dataset with current market data.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.