
Building ML Models that score without bias

Credit scoring at financial institutions has seen tremendous automation in recent years, and it has proven very useful in lending. Highly accurate ML models are now deployed to assess applicants at massive scale, with far greater speed and precision than traditional methods.

These capabilities help financial institutions attract strong prospects, save considerable money and time in the underwriting process, and appear more technologically advanced than their competitors.

But as promising as it sounds, applying AI in such a sensitive area comes with one downside: biases can be introduced into the ML pipeline, and they might hinder the adoption of ML in credit scoring altogether. Such biases can come from the historical data used for training, from incorrect feature selection, and from the algorithms themselves.

These biases need to be removed before the models are deployed in the real world. Failing to do so can lead to customer dissatisfaction and to regulatory bodies showing up at your doorstep. Moreover, deploying biased models and systems allows those biases to persist and propagate, producing solutions that serve an ever less fair and less diverse customer base.

How does ANAI help with the biases?

Explainable AI (XAI) has recently emerged as a useful way to decode the inner workings of an ML model and understand the reasoning behind its decisions. It looks behind the curtain to see how the model arrives at its predictions, helping its makers understand how the model works and helping build user trust in it.

For this case study, we used the German credit scoring data set to evaluate how biases arise within models, generate explanations to detect where those biases come from, and eliminate them using various techniques. We applied three of the most commonly used explanatory techniques, interpreted their charts to detect possible biases, and discuss some methods to remove them.

The data set contains 1,000 rows with 20 features, including age, sex, marital status, and current income, that can be used to determine whether an applicant is creditworthy. Right away, it is clear that some of these features might cause biased learning and should be removed immediately, even though they have historically been used to determine creditworthiness.

Age, sex, and marital status are features that financial institutions have traditionally used to predict whether a person will repay a loan. But making ML predictions with such bias-inducing features leads the model to rely on them more than on the features that are actually informative. For example, a female applicant could be rejected simply because women have historically taken out fewer bank loans, so the bank has less data on them. Such features create discrimination based on patterns learned from the past: first, they lead to bad outcomes, and second, they outright discriminate on the basis of age, gender, or other attributes considered discriminatory by modern standards.
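The most direct remedy is to strip protected attributes from the data before training. The sketch below is a minimal illustration of that idea, not ANAI's actual pipeline; the feature names mirror the data set described above.

```python
# Minimal sketch: remove protected attributes from applicant records
# before training. Feature names here follow the German credit data set.
PROTECTED = {"age", "sex_marital"}

def drop_protected(rows):
    """Return copies of applicant rows without the protected attributes."""
    return [{k: v for k, v in row.items() if k not in PROTECTED}
            for row in rows]

applicant = {"account_balance": 3, "duration": 24, "age": 42, "sex_marital": 2}
cleaned = drop_protected([applicant])[0]
# cleaned keeps only the legitimate features: account_balance, duration
```

Dropping columns alone does not guarantee fairness (proxies such as occupation can still encode the protected attribute), which is why the XAI checks below remain necessary.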

ANAI’s XAI results

We first imported the data into the ANAI platform and trained several models on it. The chart below shows the resulting leaderboard for all the models; the accuracy scores are low because the data set is small. That is acceptable here, since our objective was to detect biases within the data set, not to train a production-quality model.

[Chart: AutoML model leaderboard]

SHAP charts

ANAI employs SHAP (SHapley Additive exPlanations) as one of its XAI methods to decode the inner workings of a model. SHAP values quantify the contribution that each feature makes to a prediction. The chart below plots all the features and their SHAP values:

[Chart: SHAP summary plot]

From the chart above, the feature "account_balance" contributes the most to the prediction, both positively and negatively. Other feature dependencies can be read similarly. Two features we have to take care of are "age" and "sex_marital": the model is predicting on the basis of these discriminatory features, so they can either be removed from the data outright or corrected to give equal representation or weight to every group within them.

SHAP values for some of the features within the data can be seen below:

Here, the dependency and SHAP values for individual features can be inspected and compared. Each chart shows how the acceptance probability of an applicant changes as the feature value changes. Vertical dispersion at a single value indicates interaction effects with other features; SHAP automatically selects another feature for coloring to make these interactions easier to see.
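The quantity behind these charts is the Shapley value: a feature's marginal contribution to the prediction, averaged over all subsets of the other features, which SHAP approximates efficiently. A small exact computation on a toy additive scoring model (illustrative weights, not fitted from the data) shows the idea:

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values: each feature's weighted average marginal
    contribution over all feature subsets (the quantity SHAP estimates)."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for r in range(n):
            for subset in combinations(others, r):
                s = frozenset(subset)
                w = factorial(r) * factorial(n - r - 1) / factorial(n)
                total += w * (value(s | {i}) - value(s))
        phi[i] = total
    return phi

# Toy additive scoring model: each present feature adds a fixed weight.
weights = {"account_balance": 0.5, "duration": 0.3, "age": 0.2}
score = lambda s: sum(weights[f] for f in s)
phi = shapley_values(list(weights), score)
# For an additive model, each Shapley value equals the feature's own weight,
# and the values sum to the full model's score.
```

Real models are not additive, which is exactly why the Shapley averaging over subsets (and SHAP's approximation of it) is needed.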


LIME charts

LIME (Local Interpretable Model-agnostic Explanations) fits a simple, interpretable surrogate model in the neighborhood of a single prediction. On a small enough scale, even a highly complex, non-linear model behaves approximately linearly, so LIME fits a linear model to perturbed samples around the instance and generates the explanation from that linear representation.

[Chart: LIME explanation]

LIME charts show the correlation between features and the prediction, with positive contributions in green and negative ones in red: green pushes the prediction towards a class (a good score, in this case) and red pushes it away. This makes them simple and intuitive to read. The "age" feature appears in green (fourth from the top), meaning the model relies on it to assign the final score, which suggests the model is discriminating on the basis of "age", as well as "sex_marital".
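LIME's core trick, sampling around an instance, weighting samples by closeness, and fitting a weighted linear surrogate, can be sketched in one dimension (a toy illustration of the principle, not the lime library's API):

```python
import math
import random

def lime_1d(model, x0, n_samples=500, width=0.5):
    """LIME's core idea in 1D: perturb around x0, weight samples by
    proximity, and fit a locally weighted linear surrogate model."""
    rng = random.Random(0)
    xs = [x0 + rng.gauss(0, width) for _ in range(n_samples)]
    ys = [model(x) for x in xs]
    ws = [math.exp(-((x - x0) ** 2) / width**2) for x in xs]  # proximity kernel
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    b = (sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
         / sum(w * (x - mx) ** 2 for w, x in zip(ws, xs)))
    a = my - b * mx
    return a, b  # local intercept and slope: the "explanation"

# A non-linear model, f(x) = x^2; near x0 = 1 its local slope is about 2.
a, b = lime_1d(lambda x: x * x, 1.0)
```

The sign and magnitude of the local slope play the role of the green/red bars in the LIME charts above.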


LOFO charts

LOFO (Leave One Feature Out) is another interesting way to gain insight into a model's behavior. As the name suggests, the method removes one feature at a time, re-evaluates the model, and measures how much performance drops; the size of the drop indicates how important that feature is to the model's output. Heavy reliance on a discriminatory feature indicates that the model is biased, having learned to trust that feature during training.

[Chart: LOFO feature importance]

The chart above suggests that the "duration" feature pushes predictions towards acceptance, while "sex_marital" and "age" show a negative relation with the prediction, again revealing the model's reliance on these features.
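The LOFO procedure itself is a simple loop. Below is a minimal sketch; the per-subset validation accuracies are illustrative stand-ins for actually retraining and scoring a model on each feature subset:

```python
def lofo_importance(features, evaluate):
    """Leave One Feature Out: a feature's importance is the drop in
    validation score when the model is evaluated without that feature."""
    full = evaluate(frozenset(features))
    return {f: full - evaluate(frozenset(features) - {f}) for f in features}

# Hypothetical validation accuracies per feature subset (stand-ins for
# retraining a model on each subset; the numbers are illustrative only).
accuracy = {
    frozenset({"account_balance", "duration", "age"}): 0.78,
    frozenset({"duration", "age"}): 0.70,             # without account_balance
    frozenset({"account_balance", "age"}): 0.74,      # without duration
    frozenset({"account_balance", "duration"}): 0.77, # without age
}
imp = lofo_importance(["account_balance", "duration", "age"], accuracy.get)
# Largest drop -> the feature the model relies on most (account_balance here).
```

A near-zero drop for a protected feature like "age" is the desirable outcome; a large drop would flag exactly the over-reliance discussed above.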

ANAI offers over 25 such methods, helping businesses create explanations for every use case and audience. The techniques shown above are the basic ones we selected for display; many more are available on the platform, each with its own unique features.

How to remove such biases after detection

Various methods for detecting and eliminating biases are being developed, and many are already available. Once the XAI results reveal bias, the affected features within a data set can be corrected, and the improved, fairer data can be used to retrain the model.
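After retraining, it is worth auditing the new model's decisions with a simple group-fairness metric. One common check is disparate impact, the ratio of favorable-outcome rates between groups; the sketch below uses hypothetical approval records, not results from this case study:

```python
from collections import defaultdict

def approval_rates(records):
    """records: (group, approved) pairs -> approval rate per group."""
    ok, total = defaultdict(int), defaultdict(int)
    for group, approved in records:
        total[group] += 1
        ok[group] += int(bool(approved))
    return {g: ok[g] / total[g] for g in total}

def disparate_impact(records, unprivileged, privileged):
    """Ratio of approval rates between groups; values below 0.8 violate
    the common 'four-fifths' rule of thumb and flag potential bias."""
    rates = approval_rates(records)
    return rates[unprivileged] / rates[privileged]

# Hypothetical post-retraining audit: approval decisions by group.
records = [("female", 1), ("female", 0), ("female", 0), ("female", 0),
           ("male", 1), ("male", 1), ("male", 1), ("male", 0)]
di = disparate_impact(records, "female", "male")  # 0.25 / 0.75
```

Here the ratio is well under 0.8, so this hypothetical model would still warrant correction before deployment.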

Instead of features that could have been good indicators, such as the amount of money in the bank or payment statuses, the model leaned on discriminatory features that happened to help during prediction, creating bias and leading to discrimination against some customers. ANAI showed this clearly through the XAI charts it generated.

Detecting and eliminating such features, then retraining or rebuilding a bias-free, fair model, should always be the next step, and it is also the next item in our development pipeline. We plan to incorporate various bias removers into the ANAI framework, or build our own proprietary bias detection and elimination method, to provide a complete explanation-and-bias-removal solution.

Open-source toolkits such as Aequitas, AI Fairness 360, and What-If give data scientists ready-made tooling for detecting bias in machine learning models. They support informed, equitable decisions about developing and deploying predictive tools without worrying about biases creeping into the system.

Business Impact 

  • More accuracy and flexibility

AI-based credit scoring systems take into account a range of real-time factors, from income levels and credit history to work experience and transaction analysis. As a result, they provide an individualized credit score assessment for each user, giving many more people access to finance.

  • Better customer segmentation without the bias

Traditional customer analysis and scoring worked on a limited set of data, so they did not give the lender enough context about a customer, their spending habits, or other factors. Today, with big data and other sources such as the internet, vast amounts of data are available on any given applicant to analyze and score. ML models are built to digest such large data sets and can be leveraged here. One downside of using ML models was the introduction of bias, but that problem can be mitigated with XAI techniques such as those shown above.

  • Faster processes

Conventional approval processes for a loan or credit card were slow. AI has automated these processes end to end, cutting the time dramatically while maintaining, and in some cases improving, quality.


Bias-free credit scoring has become critically important when deploying ML-based algorithms for such consequential predictions. Models should study the features that actually indicate whether a person is likely to default, not the standards our unconscious minds have formed over the centuries. They should rely on the features that lead to good predictions rather than learn biases from the past and repeat them, which leads us nowhere.

We showed how ANAI can explain predictions within a few clicks, and we saw the types of XAI charts it can generate. We are currently building features that complete and enhance our XAI offering, such as automatic detection and rectification of biased features within the same platform.

To implement such solutions or to get a personalized solution for your niche use case, contact us at or visit


Want to get started?

Connect with us to get a free demo