A growing body of research suggests that artificial intelligence (AI) and machine learning (ML) technologies are mirroring racial biases in society, raising concerns about the unintended harm these biases cause. Studies have found gender and race bias in face-recognition software and job recruitment technology. Now, as the medical community increasingly relies on AI and ML to diagnose and treat patients, scientists and researchers are concerned that these technologies may also be racially biased.

In 2019, a team of researchers analyzed an algorithm widely used in health care to identify and help millions of patients with complex health needs. The study found that the algorithm displayed significant racial bias: it frequently assessed Black patients who were considerably sicker than white patients as needing the same level of care, with the result that sicker Black patients did not receive the care they needed.

Researchers found that the bias occurred because the algorithm predicted health care costs rather than illness. It did not account for unequal access to care, which means that less money is spent caring for Black patients than for white patients with the same level of need.

"We're currently working with a group of health systems, insurers, tech companies, and government agencies to explore the extent of these biases," Dr. Ziad Obermeyer, the Blue Cross of California Distinguished Associate Professor of Health Policy and Management at the UC Berkeley School of Public Health and lead researcher in the study, told The Plug.
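To see how a cost proxy can produce this pattern, consider a minimal Python sketch. It is not the algorithm or the data from the 2019 study: every name and number in it (the `Patient` class, the 30% spending gap, the ten-patient groups) is a made-up assumption chosen only to illustrate the mechanism. Two groups have identical health needs, but one generates less spending per condition, so ranking patients by cost flags fewer of its members for extra care.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    group: str
    chronic_conditions: int  # hypothetical stand-in for true health need


# Assumption for illustration only: group B generates 30% less spending
# than group A at the same level of illness (unequal access to care).
SPEND_PER_CONDITION = 1000
ACCESS_PERCENT = {"A": 100, "B": 70}


def expected_cost(p: Patient) -> int:
    """Proxy label: dollars spent, not how sick the patient is."""
    return SPEND_PER_CONDITION * p.chronic_conditions * ACCESS_PERCENT[p.group] // 100


def true_need(p: Patient) -> int:
    """Target we actually care about: the patient's health need."""
    return p.chronic_conditions


def flag_top(patients, score, k):
    """Flag the k highest-scoring patients for an extra-care program."""
    return sorted(patients, key=score, reverse=True)[:k]


def count_group(flagged, group):
    return sum(1 for p in flagged if p.group == group)


def sickness_threshold(flagged, group):
    """Fewest conditions among flagged patients in a group."""
    return min(p.chronic_conditions for p in flagged if p.group == group)


# Both groups have exactly the same distribution of illness (1 to 10 conditions).
cohort = [Patient(g, n) for g in ("A", "B") for n in range(1, 11)]

by_cost = flag_top(cohort, expected_cost, k=6)
by_need = flag_top(cohort, true_need, k=6)

print("Flagged by cost:", count_group(by_cost, "A"), "from A,", count_group(by_cost, "B"), "from B")
print("Flagged by need:", count_group(by_need, "A"), "from A,", count_group(by_need, "B"), "from B")
```

In this toy cohort, ranking by cost flags four patients from group A and two from group B, and a group B patient needs at least nine conditions to be flagged, compared with seven for group A. Ranking by need flags three patients from each group.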

Dr. Ziad Obermeyer

"What we're finding is that biases similar to the one we found, where an algorithm predicts a proxy measure, like cost, that is biased relative to what we really want to be predicting, like health, are widespread in live algorithms," Dr. Obermeyer said.

Racial biases have also been identified in technology that treats and identifies skin conditions.

In 2018, Dr. Adewole Adamson, a board-certified dermatologist and Assistant Professor in the Department of Internal Medicine at Dell Medical School at the University of Texas at Austin, published a study finding that ML software developed to distinguish between images of benign and malignant moles could greatly assist dermatologists in diagnosing and treating skin diseases, improving patient care.

But Dr. Adamson highlighted that, if not developed with inclusivity in mind, this technology could instead exacerbate health care disparities in dermatology. Studies have found that, on average, Black patients diagnosed with melanoma have a five-year survival rate of 67%, while white patients have a survival rate of 92%.

"The algorithms that are currently being developed to diagnose disease are not representative of the various skin types we see," Dr. Adamson told The Plug. "In patients in the United States, they are primarily developed using light skin or skin from people that identify as white, and so they will do a poor job."

Dr. Adewole Adamson

"It's a problem with the datasets. They are basically just recapitulating the disparities that are already out there," Dr. Adamson said.

Dr. Ravi B. Parikh, Assistant Professor in the Department of Medical Ethics and Health Policy and Medicine at the University of Pennsylvania and Staff Physician at the Corporal Michael J. Crescenz VA Medical Center, has been studying racial bias in artificial intelligence and has identified two solutions to the problem.

"Coming up with better ways to detect bias is the first step because then we can come up with ways to mitigate [bias]," Dr. Parikh said.

"The second solution is to make sure that the data that we're using is as unbiased as possible." Because health care systems frequently rely on their own data, which could be biased by virtue of the patients they serve, using standardized datasets that are built with diverse representation is a key part of the solution, according to Dr. Parikh.

Dr. Ravi Parikh

"When you're deploying an algorithm, you need to see, depending on what this algorithm is for, are people of color being misdiagnosed? Are people of color getting fewer resources?" Dr. Adamson said. "What is most amazing about AI is just how scalable it is."

"The ability to purchase computers to use all of this data to render a decision is incredible, but it is also somewhat dangerous," Dr. Adamson continued. "It's both the benefit and the harm because you can magnify harm so much more quickly than you could if it's a single provider or a single health system."

But many researchers and doctors, including those who spoke to The Plug, also believe that AI and ML have the potential to reduce racial biases in medicine and create a fairer and more equitable health care system, as long as these systems are provided with the right set of data.

"The central issue for me is aligning algorithms to target the right quantities, not biased proxies," Dr. Obermeyer said. His team recently trained an algorithm to interpret X-rays of the knee based on patients' reports of pain, as opposed to radiologists' reports of how the knee looked to them.

"What we found was that radiologists missed causes of pain that were disproportionately affecting Black patients because the way medical knowledge is produced draws largely on certain populations," Dr. Obermeyer said. "So algorithms can play a role in building back up a new science that is more fair and equitable, and accurately represents the experience of diverse patients."

The scalability of AI and ML technology in health care can be harmful if deployed with racially biased data, but the right set of data could significantly change the way providers treat underserved communities who may otherwise not receive the appropriate care.

First, however, a robust tracking system must be created to identify algorithms and track their performance over time, so that bias can be caught as it happens and health care outcomes for diverse populations can ultimately improve, according to Dr. Parikh.

"The worst-case scenario is that we make a biased doctor feel good because the algorithm is convincing them that they are not biased," Dr. Parikh said. "The best-case scenario is that we use good data to produce an evidence-based accurate prediction that counters a bias from a doctor who is willing to listen to the algorithm."
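As a rough illustration of what tracking an algorithm's performance over time could involve, the Python sketch below computes how often a hypothetical diagnostic algorithm misses true cases in each patient group, period by period, and flags periods where the gap between groups is large. It is not a system described by Dr. Parikh or anyone quoted here; the record format, the group and period labels, the sample counts, and the 0.1 gap threshold are all assumptions made for the example.

```python
from collections import defaultdict


def false_negative_rates(records):
    """Share of truly positive patients the algorithm missed,
    keyed by (period, group).

    records: iterable of (period, group, predicted_positive, actually_positive)
    """
    positives = defaultdict(int)
    missed = defaultdict(int)
    for period, group, predicted, actual in records:
        if actual:
            positives[(period, group)] += 1
            if not predicted:
                missed[(period, group)] += 1
    return {key: missed[key] / positives[key] for key in positives}


def flag_gaps(rates, max_gap=0.1):
    """Return the periods in which the miss rate differs between
    groups by more than max_gap (an arbitrary, hypothetical threshold)."""
    by_period = defaultdict(dict)
    for (period, group), rate in rates.items():
        by_period[period][group] = rate
    return sorted(
        period
        for period, groups in by_period.items()
        if max(groups.values()) - min(groups.values()) > max_gap
    )


def make_records(period, group, n_positive, n_missed):
    """Build made-up records: n_positive true cases, n_missed of them missed."""
    return [(period, group, i >= n_missed, True) for i in range(n_positive)]


# Entirely synthetic data: performance is equal in Q1 and diverges in Q2.
records = (
    make_records("Q1", "A", 4, 1)
    + make_records("Q1", "B", 4, 1)
    + make_records("Q2", "A", 4, 1)
    + make_records("Q2", "B", 4, 3)
    + [("Q2", "A", False, False)]  # true negatives do not affect the miss rate
)

rates = false_negative_rates(records)
flagged_periods = flag_gaps(rates)
print(rates)
print("Periods with a large gap between groups:", flagged_periods)
```

The miss rate is only one of several measures such a tracker could follow; which one matters depends, as Dr. Adamson notes above, on what the algorithm is for.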