A data analytics project analyzing Business Development (BD) and Sales representative pairing performance to optimize lead routing and maximize conversion rates and revenue.
Explore the complete analysis through three interactive dashboards:
- Dashboard 1: Exploratory Data Analysis & Data Validation
- Dashboard 2: Performance Methodology & Analysis Results
- Dashboard 3: Business Recommendations & $600K+ ARR Impact
This project analyzes 2,200+ opportunities across 18 BD reps and 23 Sales reps (413 unique pairings) to identify optimal BD-Sales team pairings and provide data-driven routing recommendations.
Not all BD-Sales pairings perform equally. Routing leads at random results in:
- Suboptimal win rates
- Longer sales cycles
- Higher early deal death rates
- Inconsistent pipeline health
Analyze historical pairing performance to:
- Identify high-performing BD-Sales combinations
- Flag at-risk pairings to avoid
- Provide actionable routing recommendations
- Quantify expected business impact in ARR
Expected ARR Improvement: $600K+ (annual recurring revenue lift from optimized routing)
- 18 BDs analyzed with varying pairing performance ranges
- Top 3 BDs show $100K+ ARR improvement potential each
- A 124-point performance-score swing between the best and worst routing decisions for some BDs
- $0 implementation cost - pure routing logic optimization
```
sales-handoff-quality-analysis/
│
├── data/
│   └── opportunities.csv               # Simulated opportunity data
│
├── analysis/
│   ├── pair_metrics.csv                # BD-Sales pair metrics
│   ├── performance_scores.csv          # Performance scoring results
│   ├── bd_pairing_recommendations.csv  # Top/bottom 5 recommendations
│   └── routing_impact_analysis.csv     # ARR impact calculations
│
├── visualizations/
│   ├── output/                         # Generated PNG visualizations
│   └── *.py                            # Visualization scripts
│
├── iterations/                         # Project evolution
│   ├── v1_initial_approach.py
│   ├── v2_added_confidence.py
│   ├── v3_baseline_comparison.py
│   └── notes.txt
│
├── exploratory_data_analysis.py        # EDA and data validation
├── data_generation.py                  # Simulated data generation
├── metric_calculation.py               # Calculate pairing metrics
├── performance_scoring.py              # Score and classify pairings
│
├── dashboard_index.html                # Interactive dashboard landing page
├── dashboard_1_eda.html                # EDA dashboard
├── dashboard_2_methodology.html        # Methodology & analysis dashboard
├── dashboard_3_recommendations.html    # Business recommendations dashboard
│
├── METHODOLOGY.md                      # Detailed methodology doc
├── README.md                           # Project documentation
├── requirements.txt                    # Python dependencies
└── .gitignore                          # Git ignore file
```
- Python 3.8+: Core analysis language
- pandas: Data manipulation and analysis
- numpy: Numerical computing
- matplotlib: Data visualization
- seaborn: Statistical visualizations
- HTML/CSS: Interactive dashboards
- Clone the repository

```bash
git clone https://github.com/yourusername/sales-handoff-quality-analysis.git
cd sales-handoff-quality-analysis
```

- Install dependencies

```bash
pip install -r requirements.txt
```

- Run the analysis pipeline

```bash
python data_generation.py
python exploratory_data_analysis.py
python metric_calculation.py
python performance_scoring.py
python visualizations/run_all_visualizations.py
```

- View the dashboards by opening `dashboard_index.html` in your browser
- Simulated 2,200 opportunities spanning 2 years
- 18 BD reps × 23 Sales reps (414 possible combinations; 413 unique pairings observed)
- Tracked: Win rate, early death rate, stale pipeline rate, deal size
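The generator below is a minimal, hypothetical sketch of that setup; the project's real logic lives in `data_generation.py`, and every column name and distribution here is an illustrative stand-in.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 2_200  # simulated opportunities

bd_reps = [f"BD_{i:02d}" for i in range(1, 19)]      # 18 BD reps
sales_reps = [f"AE_{i:02d}" for i in range(1, 24)]   # 23 Sales reps

opps = pd.DataFrame({
    "bd_rep": rng.choice(bd_reps, size=n),
    "sales_rep": rng.choice(sales_reps, size=n),
    # Created dates spread across a 2-year window
    "created": pd.Timestamp("2024-01-01")
               + pd.to_timedelta(rng.integers(0, 730, size=n), unit="D"),
    "deal_value": rng.lognormal(mean=9.5, sigma=0.6, size=n).round(2),
    "stage": rng.choice(["won", "lost", "open"], size=n, p=[0.3, 0.5, 0.2]),
})
opps.to_csv("data/opportunities.csv", index=False)  # assumes data/ exists
```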
Four key metrics calculated for each pairing:
- Win Rate: % of decided opportunities that closed won
- Early Death Rate: % of lost deals that died within 14 days
- Stale Pipeline Rate: % of open deals inactive for 90+ days
- Average Deal Size: Mean deal value
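A minimal pandas sketch of that calculation; the column names (`stage`, `days_to_close`, `days_inactive`, `deal_value`) are assumed stand-ins for the real schema used by `metric_calculation.py`.

```python
import pandas as pd

opps = pd.read_csv("data/opportunities.csv")

def pair_summary(g: pd.DataFrame) -> pd.Series:
    decided = g[g["stage"] != "open"]
    lost = g[g["stage"] == "lost"]
    still_open = g[g["stage"] == "open"]
    return pd.Series({
        "opportunities": len(g),
        # % of decided opportunities that closed won
        "win_rate": (decided["stage"] == "won").mean() if len(decided) else 0.0,
        # % of lost deals that died within 14 days
        "early_death_rate": (lost["days_to_close"] <= 14).mean() if len(lost) else 0.0,
        # % of open deals inactive for 90+ days
        "stale_rate": (still_open["days_inactive"] >= 90).mean() if len(still_open) else 0.0,
        "avg_deal_size": g["deal_value"].mean(),
    })

pair_metrics = (
    opps.groupby(["bd_rep", "sales_rep"]).apply(pair_summary).reset_index()
)
```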
- Baseline Comparison: Each pair compared to that BD's average (controls for lead quality)
- Percentage Deviation: Relative performance vs baseline
- Weighted Scoring: Equal 25% weight per metric
- Confidence Multiplier: Scales with opportunity count, reaching full confidence at 7 opportunities (see the sketch below)
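Put together, the scoring step might look like the sketch below; the column names are assumed, and the canonical implementation is `performance_scoring.py`.

```python
import pandas as pd

pair_metrics = pd.read_csv("analysis/pair_metrics.csv")

# Percentage deviation of each pairing from its BD's own baseline
# (a real implementation would guard against zero baselines).
for metric in ["win_rate", "early_death_rate", "stale_rate", "avg_deal_size"]:
    baseline = pair_metrics.groupby("bd_rep")[metric].transform("mean")
    pair_metrics[f"{metric}_dev"] = (pair_metrics[metric] - baseline) / baseline

# Equal 25% weight per metric; the two "lower is better" metrics are
# subtracted so a positive score always means better than baseline.
raw_score = 0.25 * (
    pair_metrics["win_rate_dev"]
    - pair_metrics["early_death_rate_dev"]
    - pair_metrics["stale_rate_dev"]
    + pair_metrics["avg_deal_size_dev"]
)

# Confidence multiplier: linear in opportunity count, full confidence at 7
confidence = (pair_metrics["opportunities"] / 7).clip(upper=1.0)
pair_metrics["performance_score"] = raw_score * confidence
```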
Percentile-based classification:
- High Performer: Top 25% (75th percentile+)
- Above Average: 50th-75th percentile
- Average: 25th-50th percentile
- Below Average: 10th-25th percentile
- At-Risk: Bottom 10%
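A sketch of how that classification could be implemented once scores exist:

```python
import numpy as np
import pandas as pd

def classify(scores: pd.Series) -> pd.Series:
    """Bucket performance scores at the 10th/25th/50th/75th percentiles."""
    p10, p25, p50, p75 = np.percentile(scores, [10, 25, 50, 75])
    bins = [-np.inf, p10, p25, p50, p75, np.inf]
    labels = ["At-Risk", "Below Average", "Average", "Above Average", "High Performer"]
    return pd.cut(scores, bins=bins, labels=labels)
```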
- Opportunity Distribution: Shows confidence levels across pairings
- Data Quality Validation: Confirms dataset is ready for analysis
- Performance Score Distribution: Percentile thresholds and classification zones
- Top/Bottom Performers: Best and worst BD-Sales combinations
- Performance Heatmap: Complete pairing matrix
- Metric Contributions: What drives high vs low performance
- BD Pairing Recommendations: Performance range for each BD
- Routing Decision Matrix: Green (route) vs red (avoid) lookup table
- ARR Impact Analysis: Quantified revenue improvement from optimization
Current state (unoptimized routing):
- Total ARR: ~$1.2M
- No systematic pairing optimization
- Performance left on the table

Optimized state (data-driven routing):
- Total ARR: ~$1.8M
- +$600K ARR improvement
- 50% ARR increase
- Data analysis and manipulation (pandas, numpy)
- Statistical analysis and percentile-based classification
- Data visualization (matplotlib, seaborn)
- Python scripting and automation
- HTML/CSS dashboard creation
- Problem identification and scoping
- Metric definition and KPI selection
- Data-driven recommendation development
- Stakeholder communication and storytelling
- ROI analysis and business case building
- Baseline methodology selection
- Confidence interval application
- Performance classification systems
- Business impact quantification
Problem: How do we fairly compare pairings when different BDs pass different quality leads?
Initial Approach: Compare all pairings to company-wide average
- Issue: Didn't account for BD lead quality differences
- Enterprise BDs naturally have lower win rates than small-business (SB) BDs
Solution: Compare each pairing to that BD's average (BD baseline)
- Controls for inherent lead quality from each BD
- Makes comparisons fair and actionable for routing decisions
Impact: This was the most critical methodological decision; it fundamentally changed the validity of the analysis.
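A minimal side-by-side of the two baselines, assuming a `pair_metrics` table with `bd_rep` and `win_rate` columns:

```python
import pandas as pd

pair_metrics = pd.read_csv("analysis/pair_metrics.csv")

# v1 (discarded): deviation from the company-wide average penalizes
# Enterprise BDs whose leads naturally convert less often.
dev_company = pair_metrics["win_rate"] - pair_metrics["win_rate"].mean()

# v3 (final): deviation from each BD's own average isolates the effect
# of the pairing itself from the quality of the leads that BD passes.
bd_baseline = pair_metrics.groupby("bd_rep")["win_rate"].transform("mean")
dev_bd = pair_metrics["win_rate"] - bd_baseline
```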
Problem: When averaging all BD pairs' performance scores, results were close to zero
Initial Reaction: Thought there was a bug in the calculation
Realization: This is mathematically correct by design!
- We compare each pair to that BD's average
- Deviations naturally balance out (some above, some below)
- This is a feature, not a bug
Solution:
- Documented this clearly in methodology
- Added explanation box to visualization
- Changed interpretation: "This measures pairing consistency, not absolute BD quality"
Lesson Learned: What looks like an error might be correct behavior - understand your methodology deeply
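A toy example with made-up numbers shows why the deviations cancel:

```python
import pandas as pd

# One hypothetical BD with three pairings
win_rates = pd.Series([0.250, 0.375, 0.500])
deviations = win_rates - win_rates.mean()  # [-0.125, 0.000, +0.125]
print(deviations.sum())                    # 0.0 (deviations from a mean always cancel)
```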
Problem: How many opportunities needed for reliable pairing analysis?
Trial & Error:
- Tried threshold of 3 opps → Too noisy, unreliable results
- Tried threshold of 10 opps → Lost too many pairings, insufficient data
- Tried threshold of 5 opps → Still some instability
Solution: Two-tier approach
- Minimum threshold: 3 opportunities (43% confidence)
- Full confidence: 7 opportunities (75th percentile in dataset)
- Apply a confidence multiplier that scales linearly between those thresholds (sketched below)
Impact: Balances statistical reliability with data availability
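A sketch of the two-tier multiplier using those thresholds:

```python
def confidence_multiplier(n_opps: int, min_opps: int = 3, full_opps: int = 7) -> float:
    """Two-tier confidence: drop tiny samples, scale linearly to full confidence."""
    if n_opps < min_opps:
        return 0.0  # below the minimum threshold, the pairing is not scored
    return min(n_opps / full_opps, 1.0)

confidence_multiplier(3)   # ~0.43 (minimum threshold)
confidence_multiplier(7)   # 1.0 (full confidence)
confidence_multiplier(12)  # 1.0 (capped)
```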
Problem: Initial visualizations tried to show everything at once and were overwhelming
Examples of cuts made:
- Sales Rep frequency analysis → Too complex for main dashboard
- Confidence vs Performance scatter → Interesting but not essential
- Multiple classification views → Chose most intuitive representation
Solution:
- Focus on storytelling: Each dashboard has a clear narrative
- Progressive disclosure: Start simple (EDA), build to complex (recommendations)
- Keep "cut" visualizations in analysis folder for reference
Lesson Learned: Less is more - every chart must earn its place in the story
Problem: How do we estimate revenue impact when not all opportunities are closed?
Initial thought: Only count won deals
- Issue: Ignores the opportunity cost of poor routing
Solution: Calculate potential ARR based on win rate × deal value × volume
- Current: Actual win rate × avg deal × opportunities
- Optimized: Best pairing win rate × avg deal × opportunities
- Difference = ARR improvement potential
Assumption: Average deal size remains constant across pairings (reasonable for this analysis)
Impact: Provides compelling business case in dollar terms
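The arithmetic, with illustrative inputs chosen to match the headline figures (the real values come from `analysis/routing_impact_analysis.csv`):

```python
# Illustrative inputs only
win_rate_current = 0.25    # blended win rate under unoptimized routing
win_rate_optimal = 0.375   # blended win rate if leads go to each BD's best pairings
avg_deal = 8_000           # assumed constant across pairings
opportunities = 600        # opportunities routed per year

arr_current = win_rate_current * avg_deal * opportunities  # $1.2M
arr_optimal = win_rate_optimal * avg_deal * opportunities  # $1.8M
print(f"ARR lift: ${arr_optimal - arr_current:,.0f}")      # ARR lift: $600,000
```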
- Real-time dashboard with Plotly/Dash
- Machine learning model for pairing prediction
- Time-series analysis of pairing performance trends
- A/B test framework for routing validation
- Integration with CRM systems (Salesforce, HubSpot)
- Automated monitoring and alerting
- Sales rep specialization analysis (which BDs work best with each rep)
Ain
Data Analyst | Marketing Analyst | Revenue Operations
- Portfolio: []
- LinkedIn: https://www.linkedin.com/in/qurat-ul-ain-236b60133/
- GitHub: @Ain-spec
- Email: quratulain_izhar@hotmail.com
- METHODOLOGY.md: Detailed technical methodology
This is a portfolio project, but feedback and suggestions are welcome! Feel free to:
- Open an issue for discussion
- Fork the repo and experiment
- Share how you've adapted this methodology for your use case
Have questions about the methodology or want to discuss similar projects?
Reach out via LinkedIn or email
Last Updated: January 2026