Predicting Churn


Customer churn, also known as customer attrition, occurs when a company’s customers stop buying or interacting with its services. Churn prediction is the process of identifying which customers are most likely to leave a service or cancel their subscription. Many firms rely on this forecasting method because acquiring new customers is often many times more expensive than retaining existing ones. Once you have identified the customers who are on the verge of canceling, you should know exactly which marketing activity to apply to each individual client to increase the chances that they stay.

A customer’s decision to churn is influenced by a variety of factors. A new rival may have entered the market with better rates, or the service the customer receives may not be up to par, and so on. As a result, there is no one-size-fits-all explanation for why a customer wants to leave, since several variables are at play.

Why Is Customer Churn Analysis Important?

1. Cost savings and an increase in profits

The ultimate purpose of a customer churn study is to boost earnings by reducing customer attrition. Retaining existing customers saves a lot of cost. Also, the longer customers keep using the product or service, the more sales and profits grow over time.

2. Understanding the reasons and improving

Churn analysis gives businesses an opportunity to understand why consumers are leaving and whether the issue lies with the products, services, delivery methods, etc. Acting on these insights will not only reduce customer churn but also result in a better overall product or service, which will bring in better growth.

Building a system for customer churn analysis is cumbersome. An important way to save time and money is to use machine learning to predict attrition and the factors leading to it. Ultimately, you need to build an end-to-end predictive data pipeline that provides the required data pre-processing and cleansing, selects the most relevant features, and offers a quick analysis of the variables present in the data.

Our approach to addressing these issues is to select a unified platform that offers these capabilities. ANAI provides an all-in-one platform that brings together data and AI and allows the different personas in your organization to come together and collaborate in a single workspace. Other important advantages of the ANAI Platform include the ability to:

  • Handle the data with the required data engineering and feature engineering within a few clicks.
  • Generate insightful visualizations to derive hidden patterns within any data set.
  • Build and train ML models with ease on the pre-processed data, with more than 300 unique models at your disposal.
  • Quickly and easily generate explanations on your model’s output to detect data and model inconsistencies such as drifts and biases to build and deploy accountable and responsible AI systems.

Cleanse and Explore the data

In this implementation, we will use ANAI’s Data Engineering pipeline on the Telecom Churn Analysis data. Using ANAI, any user can process their data to make it ready for model building and other ML applications. The goal of this step is to transform the data into a form that the ML model can interpret easily, which in turn makes it easier for the model to make predictions. To be ML-ready, however, the data needs to meet certain requirements, so it undergoes pre-processing steps such as data analysis, wrangling, transformation, encoding, etc.

ANAI’s Data Engineering pipeline automates data pre-processing. It provides more than 100 data ingestion methods for flexibility when importing data, conducts a thorough pre-analysis of the data to understand its features and distribution, and then proceeds to data wrangling, where missing and duplicate values are dealt with. For feature engineering, ANAI applies automated feature engineering to detect the features that affect the prediction the most, and then summarizes the data with a final health score.

Steps taken by ANAI for Data Pre-Processing, Engineering and Analysis:

  1. Start by creating a new project and adding details such as the project name and purpose.
  2. Select the data source to which your data belongs and import it into the ANAI platform.
  3. Prepare the data set by selecting the data, the fields that should be considered, the target variable, and finally the type of ML task to be performed.
  4. Save the project. In the Overview section, get a summary of the data with a health score and understand inconsistencies within the data, such as missing and duplicate values.
  5. Within the Profile section, you can see various statistics for every feature, which can be saved as a PDF if required.
  6. The Interactions section provides a detailed analysis by displaying univariate and bivariate analyses of features, which helps users understand the data even better and draw key inferences from it.
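
To make the kind of inconsistencies surfaced in the Overview step more concrete, here is a minimal sketch in plain Python that counts missing cells and duplicated rows. It is not ANAI’s implementation and does not reproduce its health score; the function name `summarize_quality` and the sample rows are hypothetical and only serve as an illustration.

```python
import csv
import io

# Hypothetical sample rows; not taken from the Telecom Churn data set.
SAMPLE_CSV = """State,Account Length,Day Mins,Churn
KS,128,265.1,False
OH,107,,False
KS,128,265.1,False
NJ,137,243.4,True
"""

def summarize_quality(rows):
    """Count missing cells per column and fully duplicated rows."""
    missing = {}
    seen = set()
    duplicates = 0
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing[column] = missing.get(column, 0) + 1
        key = tuple(row.items())
        if key in seen:
            duplicates += 1  # an identical row was already encountered
        else:
            seen.add(key)
    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarize_quality(rows)
print(summary)
```

A summary like this shows which columns need imputation and how many rows can be dropped before any model is trained.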

Exploratory data analysis lets businesses understand and resolve the events that trigger customer churn. One such method is to analyze features and their correlations with the target variable. ANAI allows you to visualize the data for churn analysis in order to examine KPIs (Key Performance Indicators) such as pricing points, customer usage, etc.

Various types of visualizations, from histograms to pair and scatter plots, can be generated to extract deeper meaning from the data using ANAI. Some of them have been used below for the analysis of Telecom Churn data:

The chart shows the feature “Account Length” as a histogram of value counts, with the data segmented by the “Area Code” feature using green, red, and blue. The distribution is approximately normal, and the majority of the data lies in the 50-150 range.

The chart above shows the feature “State” as a histogram of value counts, segmented by the feature “Intl Calls” (International Calls), which makes it possible to determine which state has the most international calls. WV has the highest number of calls while also having a high number of international calls.

This bivariate plot shows the feature variables Day Charge (charge on calls) and Day Mins (number of minutes spent on calls). They are positively correlated: as the minutes spent on calls increase, the charge on calls increases.
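
The strength of such a relationship can be quantified with the Pearson correlation coefficient. The sketch below computes it in plain Python; the `pearson` helper and the sample values are illustrative assumptions and are not taken from the Telecom Churn data set or from ANAI.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up values for illustration only.
day_mins = [120.0, 180.5, 95.2, 240.3, 160.0]
day_charge = [20.1, 31.0, 16.5, 40.2, 27.6]

r = pearson(day_mins, day_charge)
print(f"Pearson r = {r:.3f}")
```

A value close to +1 indicates a strong positive linear relationship. When two features are this strongly correlated, one of them usually adds little information for a model beyond the other.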

The box plot above shows the relation between the “State” and “Day Calls” variables. It helps us understand the calls made during the day in different states. It also depicts the quartiles and outliers of the calls made by each state as a whole.

The box plot above shows the relation between the “State” and “Night Calls” variables. It helps us understand the calls made at night in different states. It also depicts the quartiles and outliers of the calls made by each state as a whole.

This ternary diagram depicts the relations among Day, Eve (Evening), and Night Calls so that their dependencies can be studied. The data points are also segmented by the number of international calls made. It helps us understand the proportions of the three values, each ranging from 0 to 1, with marker colors varying according to international calls.

The 3-D scatter plot shows the relations between the features Day, Eve (Evening), and Night Charges. The points are distinguished based on international calls made. You can study the relation of one variable with another in 3-D space. Also, a fourth dimension of the data can be represented using the color of the markers (here, international calls).

Building the model

The next step in implementing the churn analysis model is to create an ML pipeline to build, train, and tune a suitable model that can predict churn accurately. ANAI’s automated model building tool helps users select the best models for their use case and gives a comparative analysis of the different models’ performance. It also automatically tunes the hyperparameters for the best accuracy on the test data set.

A classifier can be built easily by instantiating the classification module within the ANAI package and specifying the data, the target variable, and the models to be trained. The same process can also be carried out in ANAI’s app.

After some time, when the training and tuning of every specified model is finished, a report on the performance of all the models is generated with their accuracies. Here, the LightGBM classifier gave the best cross-validated accuracy, 95.911%.
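
For readers unfamiliar with the term, the following is a minimal sketch of how a cross-validated accuracy is computed. It uses plain Python and a deliberately simple single-feature threshold rule as a stand-in for a real model; it is not ANAI’s API and not LightGBM. All function names and the toy values are assumptions made for illustration.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def fit_threshold(xs, ys):
    """Pick the cut-off on one feature that maximizes training accuracy."""
    best_t, best_acc = None, -1.0
    for t in sorted(set(xs)):
        acc = sum((x >= t) == y for x, y in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def cross_validated_accuracy(xs, ys, k):
    """Average held-out accuracy over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        t = fit_threshold(train_x, train_y)
        correct = sum((xs[i] >= t) == ys[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)

# Made-up toy data: minutes of day-time calls and whether the customer churned.
day_mins = [110, 260, 130, 240, 150, 280, 130, 220, 110, 300]
churned = [False, True, False, True, False, True, False, True, False, True]

cv_acc = cross_validated_accuracy(day_mins, churned, k=5)
print(f"5-fold cross-validated accuracy: {cv_acc:.1%}")
```

Each fold is held out once while the rule is fitted on the remaining rows, so the reported figure reflects performance on data the model has not seen. On this toy data the rule fits every training split perfectly but still misclassifies one held-out customer, which is exactly the gap cross-validation is meant to expose.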

Models can be selected based on their simplicity and interpretability, depending on the use case, so that every stakeholder can understand how the model works. The most accurate model performs best at predicting which customers will churn first. Such customers can be analyzed more deeply to understand the reasons behind their departure, in order to retain them or to prevent future dissatisfaction.

Explain the Predictions

ANAI’s eXplainable AI-based solutions allow businesses to dig deeper into the model’s results and open up the black box of ML models. ANAI provides various eXplainable AI techniques such as SHAP, LOFO, CERTIFAI, etc. to explain the model’s output, so that every stakeholder can properly understand the reasoning behind a model’s prediction and the model creators can quickly detect biases and other discrepancies beforehand.
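
As an illustration of the idea behind one of these techniques, the sketch below implements a bare-bones version of LOFO (Leave One Feature Out) importance: a model is scored with all features, then re-scored with each feature removed, and the drop in score is taken as that feature’s importance. This is not ANAI’s implementation. The nearest-centroid classifier is a stand-in model, scoring is done on the training rows for brevity (a real analysis would use cross-validation), and the data values are made up.

```python
def centroid(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest_centroid_accuracy(X, y):
    """Fit one centroid per class and return accuracy on the same rows."""
    pos = centroid([row for row, label in zip(X, y) if label])
    neg = centroid([row for row, label in zip(X, y) if not label])
    correct = sum(
        (sq_dist(row, pos) < sq_dist(row, neg)) == label
        for row, label in zip(X, y)
    )
    return correct / len(X)

def lofo_importance(X, y, feature_names):
    """Score drop observed when each feature is left out in turn."""
    baseline = nearest_centroid_accuracy(X, y)
    importances = {}
    for j, name in enumerate(feature_names):
        reduced = [row[:j] + row[j + 1:] for row in X]  # drop column j
        importances[name] = baseline - nearest_centroid_accuracy(reduced, y)
    return importances

# Made-up toy rows: [day_mins, night_calls] and whether the customer churned.
X = [[300, 90], [280, 110], [260, 100], [120, 95], [140, 105], [100, 100]]
y = [True, True, True, False, False, False]

importances = lofo_importance(X, y, ["day_mins", "night_calls"])
print(importances)
```

In this toy example, removing `day_mins` halves the accuracy while removing `night_calls` changes nothing, so only the first feature carries signal. A ranking of this kind lets stakeholders see which inputs a prediction actually depends on.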

Business Impact

For some businesses, an acceptable churn rate is below 10%, while other business types may see one-fifth to one-third of their customers churn. If a business starts losing more than 30% of its customers, the metric deserves a deeper look, because such attrition can grow and might reach 50% if not taken care of.
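
To check a business against these levels, the churn rate for a period is commonly computed as the number of customers lost divided by the number of customers at the start of the period. The sketch below applies that definition; the function name, the constant, and the customer counts are hypothetical.

```python
def churn_rate(customers_at_start, customers_lost):
    """Fraction of the starting customer base lost during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start

REVIEW_THRESHOLD = 0.30  # the 30% level mentioned in the text

# Hypothetical figures for one quarter.
rate = churn_rate(customers_at_start=2000, customers_lost=640)
needs_review = rate > REVIEW_THRESHOLD
print(f"Churn rate: {rate:.1%}, needs review: {needs_review}")
```

Tracking this figure period over period shows whether attrition is stable or drifting toward the levels that call for a deeper investigation.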

Customer Experience Management is becoming increasingly essential for establishing a unique competitive edge. In fact, it is one of the most critical strategies for staying ahead of the competition. Understanding and evaluating churn is one of the finest ways to improve customer experience management.

According to a widely known rule, customer acquisition costs five times more than customer retention, and increasing customer retention rates by 5% can boost profit by 25% to 90%. The probability of selling to a loyal customer is around 70%, but only 5%-20% for a newly added customer. These figures underline the relevance of a good customer churn analysis.

Churn analysis assists firms in developing subscriber profiles and different prediction models for identifying churn causes. As a result, service providers may take a variety of corrective actions, such as initiating discounts and offers to retain consumers, controlling any service delivery gaps to prevent churn, and enhancing the overall customer experience.

According to Pega, only 28% of respondents were uncomfortable with a business that uses AI. On the other hand, 63% of respondents would be more than willing to share their data if the service is truly valued.

Managing attrition and enhancing customer experience management lets companies gain a competitive edge. Analyzing churn enables a company to better understand customer behavior and, as a result, create services that guarantee churn is handled properly.


In this Telecom Churn case study, we discussed how churn data analysis can be carried out on the ANAI platform, with a detailed explanation of the procedures, from ingesting complex data, to data analysis and feature engineering, to building and tuning ML models. We also discussed the importance of customer churn analysis and the challenges involved. To reduce the churn percentage, various factors in different industries can be analyzed using the visualizations, but there are also other common approaches, such as increasing customer usage by providing step-by-step instructions and detailed tutorials, insight pricing, targeting the correct audience, etc.

Using ANAI, companies can make quick decisions and focus more on the business side, implementing solutions that eliminate the problems causing churn on the go, using advanced ML techniques and insightful exploratory data analysis without needing ML expertise. ANAI keeps simplicity, efficiency, and accuracy in mind to provide feature descriptions, detailed analysis, and accurate predictions. ANAI also helps users understand model results easily and assists in building models that are trustworthy, fair, and robust.

To implement such solutions or to get a personalized solution for your niche use case, contact us at or visit
