Tools we love – Fairlearn


In this edition, we are going to review Fairlearn, an open-source and community-driven project to help data scientists and decision-makers improve the fairness of artificial intelligence (AI) systems.

We all know how AI has transformed modern life: self-driving cars, machines that outplay humans at games like Chess and Go, and now even coding copilots that write code from requirements stated in plain English. Everything from your Google search queries to the next video recommended on YouTube is made possible by these machine-learning-based systems.

Yet these systems do not always work as we expect. Most notably, a number of incidents have highlighted the potential for AI systems to treat people unfairly. Indeed, the fairness of AI systems is one of the key concerns facing society as AI plays an increasingly important role in our daily lives.

Let’s look at some examples where machine learning has failed:

  • An early version of Google Photos labeled a photo of a Black user and their friend as ‘Gorillas’, highlighting the racial bias that ML systems can exhibit.

  • Google Translate has shown gender bias for certain professions over the years. For example, ‘she is a doctor and he is a nurse’ translated into another language would come back as ‘he is a doctor and she is a nurse’. The issue was resolved for most languages a few years after it was reported.

In the book Deep Learning for Coders, the authors highlight three common ethical issues in tech, with specific examples:

  1. Lack of recourse processes – Arkansas’s buggy healthcare algorithms left patients stranded.
  2. Feedback loops – YouTube’s recommendation system, optimizing for more watch hours, helped unleash a conspiracy-theory boom. Feedback loops can occur when your model controls the next round of data you get; the data that comes back quickly becomes flawed by the software itself.
  3. Bias – When a traditionally African-American name was searched on Google, ads for criminal background checks were displayed; see Latanya Sweeney’s study ‘Discrimination in Online Ad Delivery’.

The Fairlearn project doesn’t aim to be a silver bullet that solves every fairness issue in machine learning. Instead, it aspires to provide two things:

  • A Python library for fairness assessment and improvement (fairness metrics, mitigation algorithms, plotting, etc.)
  • Educational resources covering organizational and technical processes for unfairness mitigation (comprehensive user guide, detailed case studies, Jupyter notebooks, white papers, etc.)

There are many types of harms (see, e.g., the keynote by K. Crawford at NeurIPS 2017). The Fairlearn library mainly addresses two types of fairness-related harms:

  • Allocation harms can occur when AI systems extend or withhold opportunities, resources, or information. Some of the key applications are in hiring, school admissions, and lending.
  • Quality-of-service harms can occur when a system does not work as well for one person as it does for another, even if no opportunities, resources, or information are extended or withheld. Examples include varying accuracy in face recognition, document search, or product recommendation.
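To make the second kind of harm concrete, here is a minimal sketch of how Fairlearn’s MetricFrame can surface a quality-of-service gap by breaking a metric down per group. The tiny labels, predictions, and ‘sex’ column below are made up purely for illustration:

    import numpy as np
    from sklearn.metrics import accuracy_score
    from fairlearn.metrics import MetricFrame, selection_rate

    # Toy labels, predictions, and sensitive feature (illustrative only)
    y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
    y_pred = np.array([1, 0, 1, 0, 0, 0, 1, 0])
    sex = np.array(["F", "F", "F", "F", "M", "M", "M", "M"])

    mf = MetricFrame(
        metrics={"accuracy": accuracy_score, "selection rate": selection_rate},
        y_true=y_true,
        y_pred=y_pred,
        sensitive_features=sex,
    )
    print(mf.overall)       # metrics over the whole dataset
    print(mf.by_group)      # the same metrics, broken down per group
    print(mf.difference())  # largest gap between groups, per metric

A large per-group gap in accuracy here would be exactly the kind of quality-of-service harm described above.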

The Fairlearn project is most useful for assessing fairness in classification predictors and applying mitigation techniques. It is easy to install as a Python package, and it comes with multiple metrics and mitigation techniques that help data scientists identify issues.
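On the mitigation side, here is a similar sketch (after pip install fairlearn) using the ExponentiatedGradient reduction to retrain a classifier subject to a demographic-parity constraint. The dataset is randomly generated and deliberately biased, just so there is something to mitigate:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from fairlearn.reductions import ExponentiatedGradient, DemographicParity

    # Synthetic, deliberately biased data (illustrative only)
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))                   # three numeric features
    sex = rng.choice(["F", "M"], size=200)          # sensitive feature
    y = ((X[:, 0] + (sex == "M")) > 0).astype(int)  # labels leak the group

    # Wrap a standard estimator in a fairness constraint and retrain
    mitigator = ExponentiatedGradient(
        LogisticRegression(),
        constraints=DemographicParity(),
    )
    mitigator.fit(X, y, sensitive_features=sex)
    y_pred_mitigated = mitigator.predict(X)

The mitigated predictions can then be fed back into MetricFrame from the previous snippet to check whether the selection-rate gap between groups has actually shrunk.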

It is an open-source project used by thousands of data scientists and organizations to evaluate the fairness of their solutions. Fairlearn is released under the MIT license and has 997 stars on GitHub at the time of writing this article.

About the author:

Kurian Benoy is a SE-Data Scientist at AOT Technologies. He is a Kaggle Expert with an interest in working on data science problems. He was a Google Code-in mentor for TensorFlow in 2019 and has contributed to open-source organizations like Keras, DVC, and Swathanthra Malayalam Computing.
In his free time, he likes bird watching and takes an interest in learning more about world history.