Faced with symptoms like a runny nose and a cough, some people might assume they have a common cold. It doesn’t seem to warrant a doctor’s visit, so they turn to Google and WebMD for reassurance. Now, with advancements in AI, some might be tempted to switch from “Dr. Google” to “Dr. ChatGPT,” but can OpenAI’s AI-powered chatbot provide accurate medical advice?
Researchers from Western University set out to answer that question and explore whether ChatGPT is capable of becoming a reliable resource in health care and medical education. The study, led by Schulich School of Medicine & Dentistry assistant professor Dr. Amrit Kirpalani, was recently published in PLOS One and found that ChatGPT was only 49 per cent accurate when it came to providing the right diagnosis.
While the researchers determined ChatGPT is not yet ready to be used as a reliable diagnostic tool for complicated cases, their study did find it was able to synthesize complex medical topics in an easy-to-understand manner, which could help instructors and health-care providers deliver medical information in a digestible format.
“To me, the most relevant finding is that ChatGPT delivered its answers in a very simple and easy-to-understand way,” said Kirpalani. “I think that’s important because you can see the potential for it to be used as a great tool to help people learn and understand medical cases – but it can also be very convincing even when it’s wrong.”
The study asked ChatGPT to diagnose 150 cases from Medscape Clinical Challenges, which are designed to test the diagnostic skills of health-care professionals. Medscape is a public platform with many complex cases, where clinicians vote on what they think is the right answer. The research team, which included third-year medical students Ali Hadi, Edward Tran and Branavan Nagarajan, created prompts asking ChatGPT to choose the correct diagnosis in a multiple-choice format and to provide a rationale.
The chatbot was given information including patients’ histories, physical examination results, and laboratory or imaging test results. The researchers found it struggled with interpreting test results and sometimes overlooked critical information that was relevant to the diagnosis. However, the chatbot was helpful in providing next diagnostic steps and making medical information more accessible.
More research needed to ‘use AI responsibly’
It is clear from the study that further research and advancements are needed before AI can be used as another tool to help with medical diagnoses. And as new AI models advance and improve, Kirpalani emphasizes the importance of AI literacy.
“AI literacy is important for patients, for providers, for educators and for students because we need to understand how we can use AI responsibly and how it can be applied and leveraged for health care and medical education purposes.” – Dr. Amrit Kirpalani, Schulich School of Medicine & Dentistry assistant professor
Regardless of the accuracy of these online resources, Kirpalani stressed the need to evaluate and double-check responses from the internet against reliable, peer-reviewed sources.
“I would say we’re maybe already at the point where we need guidance around prompt engineering – where they structure an instruction that can be interpreted and understood by a generative AI model,” said Kirpalani.
“We are going to need a lot of oversight on how it’s being used to ensure patient safety and to make sure [this kind of AI technology] will be thoughtfully rolled out.”