An Opinion on ChatGPT in Health Care: Written by Humans Only
Jens Kleesiek, Yonghui Wu, Gregor Stiglic, Jan Egger, Jiang Bian
DOI: https://doi.org/10.2967/jnumed.123.265687
2023-01-01
Abstract: ChatGPT, created by OpenAI, has taken the world by storm. Its user base has grown even faster than that of the previous record holder, TikTok, reaching 100 million users just 2 months after launch. Textual content, presentations, and even source code are already being generated using ChatGPT. Many publications have been issued, and meanwhile, ChatGPT has been banned as an author by many publishing companies for reasons such as plagiarism and incorrect or inaccurate information (1,2), whereas others argue for its benefits, such as the ability to write more coherent sentences than nonnative speakers (3). But that does not stop people from all walks of health care from using it.
ChatGPT is powered by a generative pretrained transformer (GPT-3.5), which is a large language model (LLM) with 175 billion parameters (4). LLMs originate in natural language processing, where they model the probability distribution of a sequence of words or of the next word in a sequence. Recent studies report that LLMs are foundation models, in which a single model can be adapted to solve a wide range of different natural language-processing tasks because of its few-shot learning, zero-shot learning, and transfer learning abilities (5). The conversational artificial intelligence (AI) ability is achieved using LLM-based prompt learning (6). To alleviate toxic responses and integrate human ethics, ChatGPT applies reinforcement learning from human feedback to align the LLM to follow human instructions (7). These breakthroughs in natural language processing give ChatGPT a conversational AI ability so good that it has surprised the world. Even within OpenAI, ChatGPT has been a surprise. AI chatbots are not new, but many previous attempts have not caused the sensation that ChatGPT has. Meta’s BlenderBot was a disappointment.
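The description above of a language model as a probability distribution over the next word in a sequence can be made concrete with a toy sketch. The code below is not how GPT-3.5 works: it assumes a simple bigram model (each word depends only on the previous word), and the corpus and function names are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_distribution(counts, prev):
    """Return the estimated P(next | prev) as a dict (empty if prev is unseen)."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in followers.items()}


def sequence_probability(counts, tokens):
    """Chain rule under the bigram assumption, conditioned on the first token."""
    prob = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        prob *= next_word_distribution(counts, prev).get(nxt, 0.0)
    return prob


# Hypothetical toy corpus; a real LLM is trained on vastly more text.
corpus = (
    "the patient has a fever . "
    "the patient has a cough . "
    "the doctor has a plan ."
).split()
model = train_bigram(corpus)

print(next_word_distribution(model, "the"))
print(sequence_probability(model, "the patient has a cough".split()))
```

In this toy corpus, "the" is followed by "patient" twice and by "doctor" once, so the sketch assigns "patient" a probability of 2/3. An LLM replaces these raw counts with a neural network conditioned on a much longer context, but the output is still a distribution over possible next words, which is one reason a fluent continuation is not necessarily a factually correct one.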
What may be different for ChatGPT, beyond the unknown technologies, is OpenAI’s goal of creating artificial general intelligence to match human-level intellect (8). ChatGPT certainly is not an artificial general intelligence, but it looks like one because of the breadth and depth of the knowledge it demonstrates through conversations. Even though many are excited on first use, disillusionment often sets in over time, for several reasons. First, ChatGPT gives wrong answers and is prone to confabulation (“a memory error defined as the production of fabricated, distorted, or misinterpreted memories about oneself or the world” (9)). This is exacerbated by the fact that we set different standards for communication among humans than for communication between humans and computers: the belief is that a computer will not make mistakes. Moreover, many users’ expectations are wrong, especially for medical interactions. The program was trained and designed for conversation, not for diagnostic support or treatment recommendations. Yet questions arise as to whether ChatGPT is a medical product and who is liable, even though ChatGPT always generates a disclaimer that it is not a health-care professional licensed to give medical advice. This is a typical case of intended use versus actual use as described in the medical device regulation.
We argue that there is a difference between general-purpose conversational AI, in which the focus is on conversational ability such as readability, and medical AI, in which the focus is on health facts about flesh-and-blood humans. Stating a fake fact in elegant words is amusing (that is why many ChatGPT users are tricking this conversational AI), but providing a wrong fact in medical AI is dangerous; indeed, it would make ChatGPT a medical device if it should turn out that doctors are actually using it to diagnose and treat their patients.
Nevertheless, philosophically, asking ChatGPT for health-related information (to inform health decision making) is not much different from asking Dr. Google, which has long been criticized for not just giving but spreading medical misinformation (10). Here again, we see not only the gap between intended use and actual use but also the constant push and pull between the expectations of the developers and those of the end users. As with any potentially disruptive technology, such use can be seen as either a threat or an opportunity. Many articles are optimistic, pointing to the potential symbiosis, the modern centaur: a combination of humans and computers leading to a beneficial augmentation of our capabilities. But pessimistic views also need to be discussed. Take the global positioning system, for example. Because of this technology, many young people are no longer able to navigate with a compass and map. Of course, one could argue that map reading is no longer required as a basic skill. But that is certainly not the case with language. If we as humans lose the ability to communicate, debate, and think critically, then we are taking a step backward, leading to devolution. The question remains: what is the actual use of ChatGPT, despite all the hype during the last few months? Of course, it can be used to generate simple text and to produce code snippets (but often with errors). It can even quickly analyze a research topic and generate an academic paper, again with frequent errors that may go unnoticed even by reviewers and editors of scientific journals (11).
Received Mar. 7, 2023; revision accepted Mar. 14, 2023. For correspondence or reprints, contact Jens Kleesiek (jens.kleesiek@uk-essen.de). Published online Apr. 13, 2023. COPYRIGHT 2023 by the Society of Nuclear Medicine and Molecular Imaging.