This project simulates a realistic SaaS product analytics workflow that analyzes how users interact with different product features over time.
The system provides:
- Feature usage metrics
- User engagement analytics
- RFM-based behavioral scoring
- KMeans usage segmentation
- A polished multi-page Streamlit dashboard
This project is designed for Business Analyst / Product Analyst / Data Analyst roles and is fully runnable locally.
Modern digital products generate large volumes of event-level data (clicks, searches, feature interactions). However, companies often struggle with:
- Understanding which features are actually used
- Identifying "power users" vs. "at-risk" users
- Measuring adoption and engagement
- Making data-driven product decisions
This project addresses these problems by simulating:
- Feature-level adoption
- Usage trends
- RFM-based engagement scoring
- Usage-based user segmentation
- Monitoring active vs inactive users
Product Managers, Growth Teams, and Analysts use dashboards like this to:
- Prioritize the roadmap
- Improve user retention
- Identify adoption gaps
- Personalize engagement or marketing campaigns
The synthetic dataset simulates clickstream-like product usage:
| Column | Description |
|---|---|
| user_id | Unique user ID |
| signup_date | When the user joined |
| event_date | Date of feature usage |
| feature_name | Feature used (Search, Dashboard, API, etc.) |
| events_count | Number of actions for that feature/day |
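
As a rough illustration of the raw schema above, the sketch below generates a few rows using only the standard library. It is not the project's `generate_synthetic_data.py`; the function name `generate_events`, the three-item feature list, the date window, and the per-user activity probability are all assumptions made for the example.

```python
import random
from datetime import date, timedelta

# Feature names mentioned in the table; the project's real list may be longer.
FEATURES = ["Search", "Dashboard", "API"]


def generate_events(n_users=5, n_days=30, start=date(2024, 1, 1), seed=42):
    """Return a list of row dicts with the five raw columns (hypothetical generator)."""
    rng = random.Random(seed)  # seeded so repeated runs give identical data
    end = start + timedelta(days=n_days)
    rows = []
    for user_id in range(1, n_users + 1):
        # Each user signs up on a random day inside the window.
        signup = start + timedelta(days=rng.randrange(n_days))
        # Assumed per-user activity level: chance of using a feature on a given day.
        activity = rng.uniform(0.05, 0.6)
        day = signup
        while day < end:
            for feature in FEATURES:
                if rng.random() < activity:
                    # One row per user, day, and feature.
                    rows.append({
                        "user_id": user_id,
                        "signup_date": signup,
                        "event_date": day,
                        "feature_name": feature,
                        "events_count": rng.randint(1, 10),
                    })
            day += timedelta(days=1)
    return rows


events = generate_events()
print(len(events), "rows generated")
```

Varying the activity level per user is what later gives the segmentation step something to separate: heavy and light users end up with visibly different event histories.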

The processed per-user table contains the following columns:

| Column | Description |
|---|---|
| last_event_date | Last active day |
| active_days | Number of days the user engaged with the product |
| total_events | Total feature interactions |
| recency | Days since last activity |
| frequency | Number of active days |
| monetary | Usage intensity score |
| cluster | KMeans behavioral segment |
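
The per-user columns above can be derived from the raw event rows with a single aggregation pass. The sketch below is a minimal standard-library version, not the project's `build_rfm.py`. Two assumptions are made: recency is measured against an explicit `snapshot_date` argument, and `monetary` is taken to be the total event count, since the README only describes it as a usage intensity score.

```python
from collections import defaultdict
from datetime import date


def build_rfm(events, snapshot_date):
    """Aggregate event rows into one RFM record per user (illustrative helper)."""
    active_dates = defaultdict(set)   # user_id -> set of dates with activity
    event_totals = defaultdict(int)   # user_id -> summed events_count
    for row in events:
        active_dates[row["user_id"]].add(row["event_date"])
        event_totals[row["user_id"]] += row["events_count"]

    rfm = {}
    for user_id, dates in active_dates.items():
        last = max(dates)
        rfm[user_id] = {
            "last_event_date": last,
            "active_days": len(dates),
            "total_events": event_totals[user_id],
            "recency": (snapshot_date - last).days,  # days since last activity
            "frequency": len(dates),                 # number of active days
            "monetary": event_totals[user_id],       # assumption: intensity = total events
        }
    return rfm


sample_events = [
    {"user_id": 1, "event_date": date(2024, 1, 1), "feature_name": "Search", "events_count": 3},
    {"user_id": 1, "event_date": date(2024, 1, 1), "feature_name": "API", "events_count": 2},
    {"user_id": 1, "event_date": date(2024, 1, 5), "feature_name": "Dashboard", "events_count": 4},
    {"user_id": 2, "event_date": date(2024, 1, 2), "feature_name": "Search", "events_count": 1},
]
rfm = build_rfm(sample_events, snapshot_date=date(2024, 1, 10))
print(rfm[1]["recency"], rfm[1]["frequency"], rfm[1]["monetary"])  # 5 2 9
```

User 1 has two rows on the same day, which count as one active day but contribute both event counts to the total.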
Synthetic Feature Usage Data
↓
Preprocessing (clean, transform, aggregate)
↓
RFM Feature Builder (recency / frequency / monetary)
↓
KMeans Segmentation Model (Power, Regular, At-Risk, Dormant)
↓
Streamlit Multi-Page Dashboard
↓
Insights for Product & Business Decision-Making
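
To make the segmentation step of the pipeline concrete, here is a dependency-free sketch of k-means (Lloyd's algorithm) applied to RFM triples. The saved model file `kmeans_rfm.pkl` suggests the project uses a library implementation, so this is only an illustration of the idea. The function `kmeans`, the fixed initialization from the first `k` points, and the toy data are assumptions. The features are left unscaled for readability; k-means is distance-based, so real RFM values would normally be standardized first.

```python
import math


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's algorithm; initial centroids are the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Toy (recency, frequency, monetary) rows: two active users and two inactive ones.
rfm_points = [
    (1, 20, 200),
    (30, 2, 10),
    (2, 18, 180),
    (28, 3, 15),
]
labels, centroids = kmeans(rfm_points, k=2)
print(labels)     # [0, 1, 0, 1]
print(centroids)  # [[1.5, 19.0, 190.0], [29.0, 2.5, 12.5]]
```

The cluster numbers carry no meaning on their own; which number ends up representing the active users depends on the initialization.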
Folder Structure
Product-Feature-Usage-Intelligence/
│
├── app/
│   ├── Home.py
│   └── pages/
│       ├── 1_Overview.py
│       ├── 2_Feature_Usage.py
│       └── 3_RFM_Segments.py
│
├── data/
│   ├── raw/
│   └── processed/
│
├── models/
│   └── kmeans_rfm.pkl
│
├── scripts/
│   ├── generate_synthetic_data.py
│   ├── preprocess_data.py
│   ├── build_rfm.py
│   └── train_model.py
│
├── src/
│   ├── preprocessing.py
│   ├── rfm.py
│   └── viz.py
│
├── tests/
│   └── test_predict.py
│
├── screenshots/
│   ├── overview.png
│   ├── feature_usage.png
│   └── rfm_segments.png
│
├── .gitignore
├── requirements.txt
└── README.md
git clone https://github.com/girishshenoy16/Product-Feature-Usage-Intelligence
cd Product-Feature-Usage-Intelligence
python -m venv venv
venv\Scripts\activate
python.exe -m pip install --upgrade pip
pip install -r requirements.txt
python scripts/generate_synthetic_data.py
python scripts/preprocess_data.py
python scripts/build_rfm.py
python scripts/train_model.py
python src/evaluate_model.py
pytest -v
streamlit run app/Home.py
Your dashboard opens at: http://localhost:8501
Simple project overview and navigation.
Shows high-level KPIs:
- Total users
- Avg events per user
- Avg active days
- RFM summary per cluster
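
The KPIs listed above reduce to a few averages over the per-user table. The following sketch shows one way to compute them with the standard library; the helper names `overview_kpis` and `rfm_summary_by_cluster` and the three sample users are invented for illustration and are not taken from the dashboard code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-user rows using the processed column names.
users = [
    {"user_id": 1, "total_events": 120, "active_days": 20,
     "recency": 1, "frequency": 20, "monetary": 120, "cluster": 0},
    {"user_id": 2, "total_events": 80, "active_days": 10,
     "recency": 3, "frequency": 10, "monetary": 80, "cluster": 0},
    {"user_id": 3, "total_events": 10, "active_days": 3,
     "recency": 40, "frequency": 3, "monetary": 10, "cluster": 1},
]


def overview_kpis(users):
    """Headline numbers: user count and per-user averages."""
    return {
        "total_users": len(users),
        "avg_events_per_user": mean(u["total_events"] for u in users),
        "avg_active_days": mean(u["active_days"] for u in users),
    }


def rfm_summary_by_cluster(users):
    """Mean recency, frequency, and monetary value for each cluster."""
    groups = defaultdict(list)
    for u in users:
        groups[u["cluster"]].append(u)
    return {
        cluster: {col: mean(u[col] for u in members)
                  for col in ("recency", "frequency", "monetary")}
        for cluster, members in sorted(groups.items())
    }


kpis = overview_kpis(users)
summary = rfm_summary_by_cluster(users)
print(kpis)
print(summary)
```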
Includes:
- Total usage per feature
- Interactive filters (date range, feature selection)
- Feature-wise time-series trends
- 3D RFM scatter plot (Recency-Frequency-Monetary)
- Cluster distribution bar chart
- Insightful cluster interpretation
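
The feature usage items in this list (totals, filters, and time-series trends) come down to filtered aggregations over the raw event rows. Below is a small standard-library sketch of that logic, independent of Streamlit; `usage_per_feature`, `daily_trend`, and the sample rows are hypothetical names and data chosen for the example.

```python
from collections import Counter
from datetime import date


def usage_per_feature(events, start, end, features=None):
    """Total events per feature within [start, end], optionally limited to some features."""
    totals = Counter()
    for row in events:
        if not (start <= row["event_date"] <= end):
            continue  # outside the selected date range
        if features is not None and row["feature_name"] not in features:
            continue  # feature not selected in the filter
        totals[row["feature_name"]] += row["events_count"]
    return totals


def daily_trend(events, feature):
    """Date-sorted (event_date, total events) pairs for one feature."""
    trend = Counter()
    for row in events:
        if row["feature_name"] == feature:
            trend[row["event_date"]] += row["events_count"]
    return sorted(trend.items())


sample = [
    {"event_date": date(2024, 1, 1), "feature_name": "Search", "events_count": 3},
    {"event_date": date(2024, 1, 1), "feature_name": "Dashboard", "events_count": 5},
    {"event_date": date(2024, 1, 2), "feature_name": "Search", "events_count": 2},
    {"event_date": date(2024, 1, 3), "feature_name": "API", "events_count": 1},
    {"event_date": date(2024, 1, 3), "feature_name": "Dashboard", "events_count": 4},
]

first_two_days = usage_per_feature(sample, date(2024, 1, 1), date(2024, 1, 2))
search_only = usage_per_feature(sample, date(2024, 1, 1), date(2024, 1, 3), {"Search"})
dashboard_trend = daily_trend(sample, "Dashboard")
print(first_two_days, search_only, dashboard_trend)
```

In a dashboard, the `start`, `end`, and `features` arguments would typically be bound to the date range and feature selection widgets.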
Some example insights (your numbers may differ):
- "Dashboard" is the most used feature.
- API and Integrations have lower adoption (good candidates for UX improvement).
- Power users show low recency (they were active recently) and high frequency/monetary values.
- At-risk users show high recency (many days since last activity) with declining frequency.
The KMeans model separates users into four segments:
- Cluster 0 → Power Users
- Cluster 1 → Regular Users
- Cluster 2 → At-Risk Users
- Cluster 3 → Dormant Users
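
KMeans assigns cluster numbers arbitrarily, so a number-to-name mapping like the one above generally has to be derived from the cluster centroids after training. The README does not say how the project does this. The sketch below shows one possible rule, offered purely as an assumption: rank clusters by mean recency (most recently active first, ties broken by higher frequency) and hand out the four names in that order. The function `name_clusters` and the centroid values are invented for the example.

```python
SEGMENT_NAMES = ["Power Users", "Regular Users", "At-Risk Users", "Dormant Users"]


def name_clusters(centroids):
    """Map cluster IDs to segment names by ranking centroids (assumed rule).

    centroids: {cluster_id: {"recency": ..., "frequency": ..., "monetary": ...}}
    Expects exactly as many clusters as there are segment names.
    """
    if len(centroids) != len(SEGMENT_NAMES):
        raise ValueError("expected one centroid per segment name")
    # Lowest recency first; ties broken by higher frequency.
    ordered = sorted(
        centroids,
        key=lambda c: (centroids[c]["recency"], -centroids[c]["frequency"]),
    )
    return {cluster_id: SEGMENT_NAMES[rank] for rank, cluster_id in enumerate(ordered)}


# Made-up centroids whose IDs are deliberately not in "power to dormant" order.
example_centroids = {
    0: {"recency": 25.0, "frequency": 4.0, "monetary": 30.0},
    1: {"recency": 2.0, "frequency": 22.0, "monetary": 400.0},
    2: {"recency": 60.0, "frequency": 1.0, "monetary": 5.0},
    3: {"recency": 8.0, "frequency": 10.0, "monetary": 120.0},
}
segment_of = name_clusters(example_centroids)
print(segment_of)
```

Deriving the names from centroids keeps the labels stable even if retraining shuffles the cluster numbers.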
These are directly usable for:
- Re-engagement campaigns
- Feature onboarding
- Product roadmap planning
To expand this into a more advanced product analytics suite:
- Churn prediction model
- Cohort retention heatmaps
- Real-time usage ingestion (Kafka → DB → Dashboard)
- Feature correlation matrix (which features drive stickiness?)
- User journey funnel visualization
- Deploy dashboard to Streamlit Cloud / Render / AWS


