Machine learning bias

2022 was a banner year for artificial intelligence (AI): developments like ChatGPT and other tools built on large language models (LLMs) altered the AI landscape by improving public access to AI tools.

However, the benefits of AI tools remain dampened by machine learning bias.

Bias in machine learning occurs when machine learning models (e.g., ones assisting with facial recognition, programmatic advertising, or product recommendations) are trained on poor or incomplete datasets and, as a result, reach systematically skewed conclusions. Machine learning algorithms are only as good as their data, after all, and datasets all too often reflect existing social prejudices or faulty assumptions.

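Successful ML models are both low-bias and low-variance. The bias-variance tradeoff describes the tension between the two: making a model more flexible tends to lower its bias but raise its variance, so the goal is a model that is neither underfitted nor overfitted. Underfitting, for example, will produce low variance, high bias, and a high error rate, while overfitting will produce low bias on the training data but high variance on new data.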
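The underfitting and overfitting behavior described above can be illustrated numerically. The following is a minimal sketch, not a definitive method: it assumes a sine curve as the "true" relationship, Gaussian noise, and polynomial models of different degrees, and the function name `bias_variance` and all parameter values are hypothetical choices made for illustration. It refits each model on many resampled training sets and estimates squared bias and variance from the spread of predictions.

```python
import numpy as np

def bias_variance(degree, n_trials=200, n_train=25, noise=0.3, seed=0):
    """Estimate squared bias and variance of a polynomial fit of a given degree.

    Assumed setup (illustrative only): the true function is sin(2*pi*x) on [0, 1]
    and each training set is a fresh noisy sample of it.
    """
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.05, 0.95, 50)
    true_y = np.sin(2 * np.pi * x_test)

    preds = np.empty((n_trials, x_test.size))
    for t in range(n_trials):
        # Draw a new noisy training set and fit the polynomial to it.
        x = rng.uniform(0.0, 1.0, n_train)
        y = np.sin(2 * np.pi * x) + rng.normal(0.0, noise, n_train)
        coeffs = np.polyfit(x, y, degree)
        preds[t] = np.polyval(coeffs, x_test)

    # Bias: how far the average prediction is from the truth.
    bias_sq = np.mean((preds.mean(axis=0) - true_y) ** 2)
    # Variance: how much predictions change from one training set to the next.
    variance = np.mean(preds.var(axis=0))
    return bias_sq, variance

# Degree 1 is too rigid (underfit); degree 9 is very flexible (prone to overfit).
results = {d: bias_variance(d) for d in (1, 3, 9)}
for degree, (bias_sq, variance) in results.items():
    print(f"degree={degree}: bias^2={bias_sq:.4f}, variance={variance:.4f}")
```

Under these assumptions, the straight-line model should show the largest squared bias and the degree-9 model the largest variance, which is the tradeoff in miniature.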

The Conversation, a leading publisher of research-based news, found instances of ageism and sexism in AI-generated images: “For non-specialised job titles, Midjourney returned images of only younger men and women. For specialised roles, both younger and older people were shown – but the older people were always men. … There were also notable differences in how men and women were presented. For example, women were younger and wrinkle-free, while men were ‘allowed’ to have wrinkles.”

In the broadest sense, machine learning bias refers to any systematic departure of a model's output from the true value, and so it can encompass a large number of issues. However, marginalized groups – including people of color, women, and people with disabilities – are particularly at risk, and the consequences of machine learning bias can be disastrous. Per a 2023 report from the IBM Data and AI Team, “[C]omputer-aided diagnosis (CAD) systems have been found to return lower accuracy results for black patients than white patients.”
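A disparity like the one in the quote above is typically surfaced by breaking a model's accuracy down by group rather than reporting a single overall figure. The sketch below shows one simple way to do that; the function names `accuracy_by_group` and `accuracy_gap`, the group labels, and the toy data are all hypothetical and are not drawn from the IBM report or any real system.

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return the fraction of correct predictions for each group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}

def accuracy_gap(per_group):
    """Difference between the best-served and worst-served group."""
    return max(per_group.values()) - min(per_group.values())

# Toy, made-up data: 1 = condition present, 0 = condition absent.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = accuracy_gap(per_group)
print(per_group, gap)
```

In this toy data the overall accuracy of 75% hides the fact that every error falls on group B, which is why a per-group view is a reasonable first check. Real audits would also need larger samples and additional metrics, such as false negative rates.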

On December 15, 2023, an article published in JAMA Network Open, “Guiding Principles to Address the Impact of Algorithm Bias on Racial and Ethnic Disparities in Health and Health Care,” set out the need to prevent bias in ML models, and its authors proposed a framework:

“Five principles should guide these efforts:

(1) promote health and health care equity during all phases of the health care algorithm life cycle;

(2) ensure health care algorithms and their use are transparent and explainable;

(3) authentically engage patients and communities during all phases of the health care algorithm life cycle and earn trustworthiness;

(4) explicitly identify health care algorithmic fairness issues and trade-offs; and

(5) establish accountability for equity and fairness in outcomes from health care algorithms.”

Removing machine learning bias from health care algorithms has many benefits, including the following:

  • Equal healthcare access for all patients in need
  • Reduction and elimination of harmful outcomes related to faulty datasets
  • Recovered patient trust in care providers and in the ML models built to improve their care
  • Reduction of health disparities (described by the Centers for Disease Control and Prevention (CDC) as “preventable differences in the burden of disease, injury, violence, or in opportunities to achieve optimal health experienced by socially disadvantaged racial, ethnic, and other population groups, and communities”)