Why It Matters
Peter Lee, PhD, may call himself a “techie,” but when he talks about artificial intelligence (AI), he starts from the perspective of a patient or family member. When asked what excites him most about AI’s potential to improve health care, the co-author of the book The AI Revolution in Medicine: GPT-4 and Beyond expressed enthusiasm about how AI can help people feel more empowered in their interactions with the health care system. In the following interview with the Institute for Healthcare Improvement (IHI), Lee discusses his personal and professional experiences with GPT-4, an AI-powered large language model that understands and generates human-like text, built by the Microsoft-backed company OpenAI. Lee is Corporate Vice President of Research and Incubations at Microsoft and will be a keynote speaker at the IHI Forum (December 10–13, 2023).
On using AI to feel more empowered as a patient and family member
My favorite example of using AI [to feel more empowered] is using GPT-4 to decode information from the health care system. I use it to understand the explanation of benefits notice I receive from our employer-provided health insurance every time someone in my family has a procedure or has something done. Just the other month, I had my annual physical exam, and I got the PDF of my CBC labs four days later in an email. It’s just a pile of numbers. Being able to give this kind of information to GPT-4 and say, “Explain this to me. Is there something I need to pay attention to? Are there questions I should ask my doctor?” is incredibly empowering.
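For readers who want to try this pattern themselves, here is a minimal sketch of what such a prompt might look like in code, assuming the OpenAI Python SDK; the model name, the placeholder lab values, and the prompt wording are illustrative assumptions, not details from the interview, and real lab reports contain protected health information that would need appropriate safeguards.

```python
# Minimal sketch: ask GPT-4 to explain lab results in plain language.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY environment variable.
# The lab values below are placeholders for illustration only.
from openai import OpenAI

client = OpenAI()

cbc_results = """
WBC: 6.2 x10^3/uL
Hemoglobin: 13.1 g/dL
Platelets: 210 x10^3/uL
"""

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model name
    messages=[
        {
            "role": "user",
            "content": (
                "Explain these CBC lab results to me in plain language. "
                "Is there something I need to pay attention to? "
                "Are there questions I should ask my doctor?\n" + cbc_results
            ),
        }
    ],
)
print(response.choices[0].message.content)
```

The prompt mirrors the questions Lee describes asking; the model’s answer is a starting point for a conversation with a clinician, not a substitute for one.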
On using AI to save time
A lot of my focus is on the power of generative AI to make the day-to-day working lives of doctors and nurses better by reducing their paperwork and clerical work. I’m very excited to see not just our company but other companies also starting to put generative AI into clinical note-taking systems, into [electronic health record] patient communication systems. It’s great to talk to doctors and nurses and hear them say, “Wow, this is making it easier to avoid bringing work home at night.”
On mistaken assumptions about generative AI
Generative AI systems [which help users generate new content based on a variety of inputs] are reasoning engines. They’re not computers in the traditional sense. In other words, if you think of a computer as a machine that does perfect memory recall and perfect calculation, these things aren’t computers. So, we can get into trouble in health care delivery if we assume they are computers, and that misunderstanding can lead to mistakes. Ethan Mollick, an Associate Professor at the Wharton School of the University of Pennsylvania, has said, “It's almost best to think of [ChatGPT] like a person, like an intern that you have working for you” and not as a computer.
On whether we assume too much or too little about AI’s potential in health care
A generative AI system like GPT-4 is both smarter than anyone you’ve met and dumber than anyone you’ve met. I think we both assume too much and too little about its potential in health care. If you are a doctor, you see the initial presentation of a patient, you get the initial lab test results, and from those you form a differential diagnosis. You can also give GPT-4 that initial presentation and those initial labs and ask it to produce a differential diagnosis, but that differential diagnosis shouldn't automatically be assumed to be better than one you would get from a human being. So, there is a need in that situation to have a human in the loop to look over the results and verify them.
Another approach is to have the human doctor or nurse form his or her own differential diagnosis and then give that to GPT-4 and say, “Here’s what I'm thinking. Can you look at this and critique it?” While a generative AI system like GPT-4 in some ways isn’t as good as human beings at generating content, we’re finding it can be good at critique, improvement, or summarization.
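The critique workflow Lee describes could be sketched as follows, again assuming the OpenAI Python SDK; the case summary, the sample differential, and the prompt wording are hypothetical stand-ins.

```python
# Hypothetical sketch of the "critique" pattern: the clinician supplies
# their own differential diagnosis and asks GPT-4 to review it, rather
# than asking the model to diagnose from scratch.
from openai import OpenAI

client = OpenAI()

case_summary = (
    "58-year-old presenting with acute chest pain; "
    "normal troponin, unremarkable chest x-ray."
)  # placeholder case for illustration
clinician_ddx = ["unstable angina", "GERD", "musculoskeletal pain"]

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model name
    messages=[
        {
            "role": "system",
            "content": "You are assisting a physician who remains "
                       "responsible for all clinical decisions.",
        },
        {
            "role": "user",
            "content": (
                f"Case: {case_summary}\n"
                f"Here is my differential diagnosis: {', '.join(clinician_ddx)}.\n"
                "Can you look at this and critique it? "
                "Note anything important I may have missed."
            ),
        },
    ],
)
print(response.choices[0].message.content)
```

The design choice matters here: the clinician’s own reasoning is the input, so the human stays in the loop and the model acts as a reviewer rather than the decision-maker.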
GPT-4 has biases just like human beings, but it is amazingly good at spotting biases in things that it reads or in decisions that other AI systems or human beings make. So, you see this dichotomy where it can be good at some things but still faulty at others. I like to think of GPT-4 as a tool that amplifies our own intelligence. It helps us improve the things we are good at, but it can also exhibit some of our same faults. Understanding the ins and outs of those dichotomies right now is so important.
On using generative AI to communicate with empathy
[My team and I] were given early access to GPT-4 at the end of the summer of 2022. We were investigating the potential applications and impact, both positive and negative, that GPT-4 might have in medicine and health care. At the beginning, we focused on determining how smart this thing could be. We saw it as a medical knowledge bot. Could it pass the US medical licensing exam? Could it pass the NCLEX-RN (National Council Licensure Examination for Registered Nurses) exam? Could it read and review and answer questions about a medical research paper?
Later, in November of 2022, OpenAI released ChatGPT using an earlier, smaller AI model called GPT-3.5. Our research labs started receiving emails from doctors and nurses around the world saying, “This is great stuff. We’re using it to write after-visit summaries and our patients really like it.” That was unexpected. Since then, we’ve come to learn that communicating with patients is one of the biggest daily challenges clinicians and administrators face.
One of my colleagues had a friend who unfortunately was diagnosed with late-stage pancreatic cancer. Knowing his personal relationship with this patient, the specialist asked my colleague to talk to her about her treatment options. My colleague was really struggling, so he explained to GPT-4, “Here’s the situation. I don’t know what to tell her.”
The concrete advice on how to approach the situation was remarkable. And then, at the end of the conversation, my colleague told GPT-4, “Thank you for this advice. It’s very helpful.” And GPT-4 responded, “You’re welcome, but what about you? Are you getting everything you need? How are you feeling and holding up through this?”
Since then, several academic medical centers have done research on [empathy and GPT-4]. A paper in JAMA Internal Medicine measured how empathetic and technically correct GPT-4’s answers were in response to patients' questions. They found that GPT-4’s answers were about as correct as human doctors’ answers, but the GPT-4 answers were judged more empathetic almost ten times as often. There have also been initial studies comparing GPT-4-authored notes to doctors’ [communications to patients]. The overriding response has been that the GPT-4 notes are more “human.” Now, of course, we know they’re not more human. What’s happening is that a doctor or a nurse is so busy and overburdened with work that they have to blast through these communications, whereas GPT-4 can take the time to add those extra personal touches — “Congratulations on becoming a grandparent last month” or “Best wishes on your wedding next month” — that are so highly valued in the delivery of health care. We never expected a year ago that this would be a use of generative AI.
On issues of bias and inequity
As a technologist, my dream is that we would put the world’s best and most empowering medical information into everybody’s hands equally and that doing so would allow our world and our society to take a major step toward more equitable access to health care. I’ve come to realize, however, that using technology to address the issue of health equity won’t be that easy. Technology isn’t a silver bullet. These systems do exhibit biases. The question is whether we can repair those biases. My own view is that these systems are reflections of humans, and that, therefore, they are inherently biased just like we are. But I find it hopeful that, though GPT-4 exhibits many biases, it also understands the concept of bias and understands that biases are harmful. For example, if you present it with a scenario involving medical informed consent and biased decision-making, it very reliably spots biases exhibited by AI or human beings. At the end of the day, AI is just a tool. My hope is that it can become a powerful tool for combating biases, unfairness, and lack of equity.
On using generative AI to make a challenging situation a little easier
Earlier this year, my father passed away after a very lengthy illness. He was ill for almost two years. Over that time, he was seeing a primary care physician and two specialists. He needed help from me and my two sisters, but we all lived hundreds of miles away. As happens in many families, we had this complex situation we were trying to figure out long distance. In the process, tensions flared up between me and my two sisters, and our relationships started to fray. We would schedule that golden 15-minute phone conversation with one of the specialists, and we would squabble about how best to use the time. So we gave all the lab test results, notes, the list of medications, and latest observations to GPT-4, explained the situation, and said, “We’re going to have a 15-minute phone call with the specialist. What would be the best way to use that time? What should we talk about? What are the best questions to ask?” It brought the temperature down and helped us feel more in control of the situation.
Editor’s note: This interview has been edited for length and clarity.