Telecom Customer Churn Prediction (Machine Learning)
- Richmond Yeboah
- Sep 11, 2024

This project explores a telecom company's customer data to uncover patterns in customer behaviour and the likely reasons customers churn, so that the company can design retention strategies. A machine learning model is also built to predict churn, which lets the company offer tailored solutions to at-risk customers before they leave.
The success criterion for this project is a highly accurate and interpretable machine learning model. Meeting it requires carefully chosen features, sound model selection, and rigorous evaluation. Ultimately, the project aims to give companies a practical tool for reducing customer churn and supporting sustainable growth.
Three datasets were sourced from different places. Two of them were concatenated, separated into features and a target variable, and then split into training and evaluation sets, with 80% of the data used for training and the remaining 20% for evaluation. Because the goal is a robust machine learning model, five models were trained, including RandomForest, DecisionTree, GradientBoosting, and SupportVector classifiers.
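The split-and-train step can be sketched with scikit-learn as follows. This is a minimal illustration, not the notebook's code: the data is a synthetic stand-in generated with make_classification, and the stratified split, the random_state values, and the default hyperparameters are all assumptions. Only the four classifiers named above are included.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the concatenated telecom data (assumption: the real
# features and churn target come from the project's own datasets).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.75, 0.25], random_state=42
)

# 80% training / 20% evaluation. Stratifying on y (an assumption) keeps the
# churn ratio similar in both sets.
X_train, X_eval, y_train, y_eval = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The four classifiers named in the text, with default settings.
models = {
    "RandomForest": RandomForestClassifier(random_state=42),
    "DecisionTree": DecisionTreeClassifier(random_state=42),
    "GradientBoosting": GradientBoostingClassifier(random_state=42),
    "SupportVector": SVC(random_state=42),
}

# Fit each model on the training set and score it on the evaluation set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_eval, model.predict(X_eval))

for name, score in scores.items():
    print(f"{name}: F1 = {score:.3f}")
```

Keeping the models in a dictionary makes it easy to loop over them for fitting, scoring, and later plotting.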
The target column was imbalanced, so the dataset was balanced using both RandomSampler and SMOTE; the latter performed better. Feature selection was applied to improve model performance, and hyperparameters were tuned using RandomizedSearchCV.
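To make the idea behind SMOTE concrete, here is a simplified pure-Python sketch of its core step: each synthetic minority sample is placed at a random point on the line segment between a real minority sample and one of its nearest minority neighbours. The function name smote_sketch, the toy data, and the parameter values are hypothetical. This is not the implementation used in the project; in practice a library version such as SMOTE from imbalanced-learn would normally be used.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (hypothetical): 8 majority (non-churn) rows, 3 minority (churn) rows.
majority = [(float(i), float(i % 3)) for i in range(8)]
minority = [(1.0, 5.0), (2.0, 6.0), (3.0, 5.5)]

# Generate just enough synthetic churn rows to match the majority class.
new_points = smote_sketch(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints "8 8"
```

Because every synthetic point is an interpolation between two real minority points, the new samples stay inside the region the minority class already occupies. Plain random oversampling, by contrast, only duplicates existing rows.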
Model performance was assessed using the F1 score and the ROC curve with its AUC. The F1 score is the harmonic mean of precision and recall, so it is high only when both are high; optimizing for it therefore favours a model that balances the two rather than trading one away for the other.
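A small worked example shows how the harmonic mean behaves. The labels and predictions below are made up for illustration, and the helper precision_recall_f1 is a hypothetical name, not a function from the project.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up example: 4 churners (1) and 6 non-churners (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# prints "precision=0.667 recall=0.500 f1=0.571"
```

Here the F1 score (0.571) is below the arithmetic mean of precision and recall (0.583). The harmonic mean is pulled toward the smaller of the two values, which is why a model cannot achieve a high F1 score by excelling at only one of them.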
The ROC curve visually summarizes a classifier's ability to distinguish between classes across all thresholds, and the AUC (Area Under the ROC Curve) condenses it into one number, with a higher AUC indicating better separation. The ROC curves of all the models were plotted on a single graph, each with its AUC score, to compare how the models perform.
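The "across all thresholds" idea can be sketched in a few lines of plain Python: sweep the decision threshold over every distinct predicted score, record the false-positive and true-positive rates at each step, and integrate the resulting curve with the trapezoidal rule. The function names roc_points and auc and the four-row toy data are hypothetical. A library routine would normally be used instead of this hand-rolled version.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs obtained by
    lowering the threshold through every distinct score."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy example: true labels and a model's predicted churn scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
print(points)
print(f"AUC = {auc(points):.2f}")  # prints "AUC = 0.75"
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. In this toy case one non-churner (score 0.4) outranks one churner (score 0.35), which is what pulls the value down to 0.75.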
Tools and Languages Used: SQL, Python, Power BI, VS Code (Jupyter Notebook)
Link to code on GitHub: https://github.com/richmond-yeboah/Telecom-Customer-Churn-Prediction/blob/main/Notebooks/main.ipynb
Link to Power BI dashboard: https://app.powerbi.com/view?r=eyJrIjoiZjBlMDQwZDktODlhOS00ZDZjLTk1ZTUtNTk2OTUzNTNiMDQyIiwidCI6IjFjZTU4MjFjLTE5NDItNDczMy1hNmRjLTBmYzNhODJiNzRkYiJ9
Link to project article on Medium: https://medium.com/@richmondyeboah299/telco-customer-churn-prediction-0fdb61ed68f9