Skip to content

itsmawna/End-to-End-BI-Solution-Sales

Repository files navigation

Building an End-to-End Business Intelligence Solution for Market and Sales Insights

From ETL to Predictive Analytics

SQL Server SSIS SSAS Python Power BI ETL Data Warehouse OLAP Machine Learning Customer Segmentation Predictive Analytics Data Visualization Recommender System


In today’s data-driven economy, information is power.
Organizations that know how to collect, process, and analyze their data are better equipped to make timely and strategic decisions.

This project aimed to design and implement a complete Business Intelligence (BI) solution exploring opportunities for market positioning in the sales of products and related services.

By integrating SQL Server, SSIS, SSAS, Power BI, and Python, we developed a system that transforms raw datasets into actionable insights, combining data warehousing, analytics, and visualization into one cohesive ecosystem.


Project Objectives

The main goal was to create a BI platform that supports data-driven decision-making.

Specific objectives:

  • Segment customers based on demographics and purchasing behavior
  • Identify high-performing and underperforming products
  • Explore sales trends and seasonality
  • Forecast future sales using predictive models
  • Recommend products tailored to customer preferences

Data Overview

We used three datasets from the Sales Product and Customer Insight Repository by SvbStan on Kaggle.
These datasets served as the foundation for building our BI model and analytical environment.

Customer Dataset

Contains demographic information such as gender, age, and location.

Attributes: customer_ID, first_name, last_name, gender, date_of_birth, email, phone_number, signup_date, address, city, state, zip_code

Sample:

Customer Dataset Sample

Figure 1: Sample from Customer Dataset


Product Dataset

Includes product details such as brand, category, price, and stock availability.

Attributes: product_id, product_name, category, price_per_unit, brand, product_description

Sample:

Product Dataset Sample

Figure 2: Sample from Product Dataset


Transaction Dataset

Links customers and products through purchases, providing transaction-level details.

Attributes: purchase_id, customer_id, product_id, purchase_date, quantity, total_amount

Sample:

Transaction Dataset Sample

Figure 3: Sample from Transaction Dataset


System Architecture

Data flows through a series of specialized layers, each designed to perform a specific role in the transformation from unprocessed data to practical insights.

System Architecture

Figure 4: Global Architecture of the BI Solution


ETL Process (Extract, Transform, Load)

Extraction

The extraction phase centralizes raw data from multiple CSVs into SQL Server staging tables. Using an SSIS project, data is loaded via Flat File Sources, validated for type and integrity, and prepared for transformation, ensuring consistent, high-quality input.

ETL Extraction Workflow

Figure 5: ETL Extraction Workflow


Transformation

Following validation, the transformation phase cleansed, enriched, and standardized the data. This involved implementing a star schema (including dimension and fact tables), pre-calculating KPIs like profit, and using SSIS components to automate the process, resulting in analysis-ready data.


Loading

The final load phase populated the Data Warehouse using the optimized star schema. The fact_sales table, linked to dimCustomer, dimProduct, and dimDate, now contains pre-calculated KPIs, enabling fast analytics and Power BI reporting.

Star Schema of Data Warehouse

Figure 7: Star Schema of the Data Warehouse


Analytical Modeling (SSAS)

The BI system's analytical capabilities were implemented by creating an OLAP cube in SSAS. This cube structures the data warehouse information, incorporating key hierarchies within the Customer, Product, and Date dimensions to facilitate intuitive multidimensional exploration and aggregated analysis for end-users.


Machine Learning Integration

Customer Segmentation (K-Means)

K-Means clustering segmented customers by purchase frequency and total spend, identifying two core groups: low-spending occasional buyers and high-value loyal customers.

Customer Segmentation

Figure 9: Customer Segmentation Visualization

Predictive Sales Forecasting (SARIMA)

To anticipate future demand, we built a sales forecast using a SARIMA model, which accounts for both seasonal patterns and underlying trends in the historical data.

SARIMA Sales Forecast

Figure 10: Sales Forecast Using SARIMA

Product Recommendation (SVD)

To personalize product suggestions, we developed a recommender model using Singular Value Decomposition (SVD) that leverages individual purchase history and collective customer patterns.


Data Visualization (Power BI)

Customer Insights Dashboard

The dashboard provides a segmented view of customer behavior, highlighting key performance indicators such as total customer count, regional expenditure, and customer loyalty metrics.

Customer Insights Dashboard

Figure 12: Customer Insights Dashboard

Product and Sales Performance Dashboard

The product and sales dashboard provides visibility into key financial metrics, visualizing revenue, profit, and growth trends. It identifies leading categories and regions to help managers track performance against objectives.

Product and Sales Performance Dashboard

Figure 13: Product and Sales Performance Dashboard


Read More & Feedback

For a more detailed explanation of this project, including insights, methodology, and additional visuals, you can read the full article on Medium:

Building an End-to-End Business Intelligence Solution for Market and Sales Insights

We welcome your comments, questions, and feedback. Your insights and suggestions are highly appreciated!