Background: PowerCo is a major gas and electricity utility that supplies corporate, SME (Small & Medium Enterprise), and residential customers. The liberalization of the European energy market has led to significant customer churn, especially in the SME segment. PowerCo has partnered with BCG to help diagnose why its SME customers are churning.
A fair hypothesis is that price changes affect customer churn. It is therefore helpful to know which customers are more (or less) likely to churn at their current price, and a good predictive model could help with that. The head of the SME division is considering a 20% discount, which is thought to be large enough to dissuade almost anyone from churning.
Task1: Formulate the hypothesis as a data science problem and lay out the major steps needed to test this hypothesis.
Task2: Perform some exploratory data analysis. Look into the data types, descriptive statistics, specific parameters, and variable distributions. Test the hypothesis that price sensitivity is, to some extent, correlated with churn.
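A first pass at this exploratory step might look like the sketch below. It is only an illustration: the table is randomly generated toy data, and every column name (price_off_peak_var, price_off_peak_fix, cons_12m, churn) is an assumption standing in for whatever the real client and price files contain.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the client data; all column names and values are assumptions.
rng = np.random.default_rng(0)
n = 500
clients = pd.DataFrame({
    "id": np.arange(n),
    "price_off_peak_var": rng.normal(0.14, 0.02, n),  # hypothetical energy price
    "price_off_peak_fix": rng.normal(43.0, 5.0, n),   # hypothetical power price
    "cons_12m": rng.lognormal(9.0, 1.0, n),           # hypothetical consumption
    "churn": rng.binomial(1, 0.1, n),                 # 1 = client churned
})


def summarize(df, target="churn"):
    """Collect dtypes, descriptive statistics, churn rate and correlations with the target."""
    numeric = df.select_dtypes("number").drop(columns=["id"])
    return {
        "dtypes": df.dtypes,
        "stats": df.describe(),
        "churn_rate": df[target].mean(),
        "corr_with_churn": numeric.corr()[target].drop(target).sort_values(),
    }


summary = summarize(clients)
print(summary["dtypes"])
print(summary["stats"])
print(f"Churn rate: {summary['churn_rate']:.1%}")
print(summary["corr_with_churn"])
```

A linear correlation close to zero would not by itself rule out price sensitivity, so it is probably worth also comparing the price distributions of churned and retained customers.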
Task3: Build on the feature “the difference between off-peak prices in December and January the preceding year”, then train a Random Forest classifier and evaluate the results in an appropriate manner. Also test the suggestion of offering the 20% discount to the customers likely to churn.
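One possible shape for this task is sketched below with scikit-learn on randomly generated toy data. Several things here are assumptions rather than part of the brief: the column names (price_date, price_off_peak_var, cons_12m), the reading of the feature as December minus January of the same preceding year, the hypothetical revenue figures, and the simplification that every customer who receives the discount stays.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split


def offpeak_dec_jan_diff(prices):
    """Per-client December minus January off-peak price (one calendar year assumed)."""
    monthly = (prices.assign(month=prices["price_date"].dt.month)
                     .groupby(["id", "month"])["price_off_peak_var"].mean()
                     .unstack("month"))
    return (monthly[12] - monthly[1]).rename("offpeak_diff_dec_jan")


def revenue_with_discount(revenue, churned, churn_prob, threshold, discount=0.20):
    """Revenue if clients at or above `threshold` get the discount and (assumed) all stay."""
    targeted = churn_prob >= threshold
    kept = np.where(targeted, revenue * (1 - discount), revenue * (1 - churned))
    return kept.sum()


# Toy monthly price history: 12 rows per client (all values are made up).
rng = np.random.default_rng(42)
n = 400
months = pd.date_range("2015-01-01", periods=12, freq="MS")
prices = pd.DataFrame({
    "id": np.repeat(np.arange(n), 12),
    "price_date": np.tile(months.to_numpy(), n),
    "price_off_peak_var": rng.normal(0.14, 0.02, n * 12),
})

features = offpeak_dec_jan_diff(prices).to_frame()
features["cons_12m"] = rng.lognormal(9.0, 1.0, n)  # hypothetical second feature

# Toy labels: churn is made slightly more likely when the price rose over the year.
logit = -2.0 + 40.0 * features["offpeak_diff_dec_jan"].to_numpy()
churn = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

X_train, X_test, y_train, y_test = train_test_split(
    features, churn, test_size=0.25, stratify=churn, random_state=42)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
churn_prob = model.predict_proba(X_test)[:, 1]
pred = (churn_prob >= 0.5).astype(int)

# Churn is imbalanced, so precision and recall say more than plain accuracy.
print("precision:", precision_score(y_test, pred, zero_division=0))
print("recall:   ", recall_score(y_test, pred, zero_division=0))

# Compare revenue without any discount against a few targeting thresholds.
revenue = rng.lognormal(7.0, 0.5, len(y_test))  # hypothetical annual revenue per client
print("no discount:", round(revenue_with_discount(revenue, y_test, churn_prob, 1.1), 2))
for t in (0.3, 0.5, 0.7):
    print(f"threshold {t}:", round(revenue_with_discount(revenue, y_test, churn_prob, t), 2))
```

Passing a threshold above 1 targets nobody and so gives the no-discount baseline. Whether the discount pays off depends on how many false positives the model flags, since each of those customers would have stayed at full price.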