Artificial intelligence is beginning to meet (and sometimes even exceed) physician assessments in various clinical situations. A.I. can now diagnose skin cancer like dermatologists, seizures like neurologists, and diabetic retinopathy like ophthalmologists. Algorithms are being developed to predict which patients will develop diarrhea or end up in intensive care, and the FDA recently approved the first machine-learning algorithm to measure the amount of blood flowing through the heart, a tedious calculation traditionally performed by cardiologists.
That's enough for doctors like me to wonder why we spent a decade in medical training learning the art of diagnosis and treatment.
There are many questions about whether A.I. actually works in medicine, and where it works: can it detect pneumonia, detect cancer, predict death? But these questions focus on the technical, not the ethical. And in a health system riddled with inequality, we must ask ourselves: Could the use of A.I. in medicine worsen health disparities?
There are at least three reasons to believe it might.
The first is a training problem. A.I. must learn to diagnose disease from large data sets, and if those data do not include enough patients from a particular background, it will not be as reliable for them. Evidence from other fields suggests that this is not just a theoretical concern. A recent study found that some facial recognition programs incorrectly classified less than 1% of light-skinned men but more than a third of dark-skinned women. What happens when we rely on such algorithms to diagnose melanoma on light versus dark skin?
Medicine has long struggled to include enough women and minorities in research, despite knowing that they have different risk factors for and manifestations of disease. Many genetic studies suffer from a lack of black patients, leading to erroneous conclusions. Women often experience different symptoms when having a heart attack, causing delays in treatment. The most widely used cardiovascular risk score, developed from data on predominantly white patients, may be less accurate for minorities.
Will using A.I. to tell us who might have a stroke, or which patients will benefit from a clinical trial, codify these concerns into algorithms that prove less effective for underrepresented groups?
Second, because A.I. is trained on real-world data, it risks incorporating, entrenching and perpetuating the economic and social biases that contribute to health disparities in the first place. Again, evidence from other fields is instructive. A.I. programs used to help judges predict which criminals are most likely to reoffend have shown racial biases, as have those designed to help child protective services decide which calls require further investigation.
In medicine, unchecked A.I. could create self-fulfilling prophecies that confirm our pre-existing biases, especially when used for conditions with complex trade-offs and high degrees of uncertainty. If, for example, poorer patients fare worse after an organ transplant or after chemotherapy for terminal cancer, machine-learning algorithms may conclude that these patients are less likely to benefit from additional treatment – and advise against it.
Finally, even ostensibly fair, neutral A.I. risks aggravating disparities if its implementation has disproportionate effects on certain groups. Consider a program that helps physicians decide whether a patient should go home or to a rehabilitation center after knee surgery. It is an uncertain decision, but one with real consequences: the evidence suggests that discharge to an institution is associated with higher costs and a higher risk of readmission. If an algorithm incorporates residence in a low-income neighborhood as a marker of inadequate social support, it may recommend that minority patients go to nursing homes instead of receiving physical therapy at home. Worse still, a program designed to maximize efficiency or reduce medical costs could discourage interventions on these patients altogether.
To a certain extent, all these problems already exist in medicine. American health care has always struggled with income- and race-based inequalities rooted in various forms of bias. The risk with A.I. is that these biases become automated and invisible – that we begin to accept the wisdom of machines over the wisdom of our own clinical and moral intuition. Many A.I. programs are black boxes: we do not know exactly what is going on inside them or why they produce the results they do. But we may increasingly be expected to follow their recommendations.
In my practice, I have often seen how a tool can quickly become a crutch – an excuse to hand decision-making off to someone or something else. Medical students struggling to interpret an electrocardiogram inevitably glance at the computer-generated reading at the top of the sheet. I am often swayed by the report provided next to a chest X-ray or CT scan. As automation becomes ubiquitous, will we notice that the spell checker autocorrected "they're" to "there" when we meant "their"?
Yet A.I. still holds tremendous potential for improving medicine. It could make care more efficient, more accurate and – if properly deployed – more equitable. But to fulfill this promise, we must be aware of the potential for bias and guard against it. This means that the output of algorithms and their downstream consequences must be monitored regularly. In some cases, this will require counter-bias algorithms that seek out and correct subtle, systematic discrimination.
But more fundamentally, it means recognizing that it is humans, not machines, who are responsible for patient care. It is our job to make sure we use A.I. as another tool at our disposal, not the other way around.
Dhruv Khullar (@DhruvKhullar) is a physician at NewYork-Presbyterian Hospital, an assistant professor in the Departments of Medicine and Health Policy at Weill Cornell Medicine, and director of policy dissemination at the Center for the Study of Physician Practice and Leadership.