Abstract: Recent advancements in natural language processing (NLP) have catalyzed the development of models capable of generating coherent and contextually relevant responses. Such models are used in a wide range of applications, including chatbots, expert systems, question-answering systems, and language translation systems. Large Language Models (LLMs), exemplified by OpenAI's Generative Pretrained Transformer (GPT), have significantly transformed the NLP landscape by generating text that is both contextually appropriate and semantically rich. This evolution marks a shift towards more sophisticated and intuitive language understanding and generation within the field. GPT-based models are trained on vast datasets, enabling them to learn patterns akin to human writing styles and to deliver insightful responses to intricate questions. These models excel at condensing text, extending incomplete passages, crafting imaginative narratives, and emulating conversational exchanges. However, GPT LLMs are not without their challenges, including ethical dilemmas and a propensity for disseminating misinformation. Additionally, deploying these models at a practical scale requires a substantial investment in training and computational resources, which raises concerns about their sustainability. ChatGPT, a variant rooted in transformer-based architectures, combines a self-attention mechanism over data sequences with reinforcement learning from human feedback (RLHF). Self-attention enables the model to capture long-range dependencies, which helps it generate contextually appropriate outputs. Although ChatGPT marks a significant leap forward in NLP technology, there remains a lack of comprehensive discourse on its architecture, efficacy, and inherent constraints.
Therefore, this survey aims to elucidate the ChatGPT model, offering an in-depth exploration of its foundational structure and operational efficacy. We examine ChatGPT's architecture and training methodology, alongside a critical analysis of its capabilities in language generation. Our investigation reveals ChatGPT's remarkable aptitude for producing text indistinguishable from human writing, whilst also acknowledging its limitations and susceptibility to bias. This analysis is intended to provide a clearer understanding of ChatGPT, fostering a nuanced appreciation of its contributions and challenges within the broader NLP field. We also explore the ethical and societal implications of this technology and discuss the future of NLP and AI. Our study provides insights into the inner workings of ChatGPT and sheds light on the potential of LLMs for shaping the future of technology and society. The approach presented as Eco-GPT, a three-level cascade (GPT-J, J1-G, GPT-4), achieves 73% and 60% cost savings on the CaseHold and CoQA datasets, outperforming GPT-4.
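
The abstract names a three-level cascade (GPT-J, J1-G, GPT-4) but does not say how a query moves from one level to the next. The following is a minimal sketch of one common way such a cascade can save cost, assuming each level returns an answer together with a confidence score and the query is escalated only when that score falls below a threshold. The `Tier` class, the `cascade` function, the stub answer functions, the per-query costs, and the threshold value are all hypothetical placeholders for illustration; they are not taken from the paper and do not call any real model API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tier:
    """One level of the cascade (hypothetical structure)."""
    name: str
    cost: float  # assumed per-query cost in arbitrary units
    answer: Callable[[str], Tuple[str, float]]  # returns (answer, confidence in [0, 1])


def cascade(query: str, tiers: List[Tier], threshold: float = 0.8) -> Tuple[str, str, float]:
    """Try tiers from cheapest to most expensive.

    Returns (answer, name of the tier that answered, total cost spent).
    The last tier always answers, whatever its confidence.
    """
    spent = 0.0
    for i, tier in enumerate(tiers):
        text, confidence = tier.answer(query)
        spent += tier.cost
        is_last = i == len(tiers) - 1
        if confidence >= threshold or is_last:
            return text, tier.name, spent
    raise ValueError("tiers must not be empty")


# Stub models: confidence depends only on query length (purely illustrative).
def small(q: str) -> Tuple[str, float]:
    return "small:" + q, 0.9 if len(q) < 10 else 0.3


def medium(q: str) -> Tuple[str, float]:
    return "medium:" + q, 0.9 if len(q) < 30 else 0.4


def large(q: str) -> Tuple[str, float]:
    return "large:" + q, 1.0


# Tier names follow the abstract; the costs are made-up numbers.
TIERS = [
    Tier("GPT-J", 1.0, small),
    Tier("J1-G", 5.0, medium),
    Tier("GPT-4", 30.0, large),
]

if __name__ == "__main__":
    for q in ["hi", "a" * 20, "a" * 40]:
        _, tier_name, spent = cascade(q, TIERS)
        print(f"{len(q):>2} chars -> answered by {tier_name}, cost {spent}")
```

Under these assumptions, savings come from the queries that stop at a cheaper level; a query that escalates all the way pays for every level it passed through, so the threshold trades cost against answer quality.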

Evaluating ChatGPT on Nuclear Domain-Specific Data

A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets

Can ChatGPT Replace Traditional KBQA Models? An In-depth Analysis of the Question Answering Performance of the GPT LLM Family

Evaluating Quality of Answers for Retrieval-Augmented Generation: A Strong LLM Is All You Need

Demystifying ChatGPT: An In-depth Survey of OpenAI's Robust Large Language Models

A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity

Improving accuracy of GPT-3/4 results on biomedical data using a retrieval-augmented language model

ChatGPT Alternative Solutions: Large Language Models Survey

Evaluating Large Language Models on a Highly-specialized Topic, Radiation Oncology Physics

Extending the Frontier of ChatGPT: Code Generation and Debugging

ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities

Advancing Question-Answering in Ophthalmology with Retrieval Augmented Generations (RAG): Benchmarking Open-source and Proprietary Large Language Models

Investigating ChatGPT's Potential to Assist in Requirements Elicitation Processes

An Assessment of ChatGPT on Log Data

GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information

Quality of Answers of Generative Large Language Models vs Peer Patients for Interpreting Lab Test Results for Lay Patients: Evaluation Study

Aggregated Knowledge Model: Enhancing Domain-Specific QA with Fine-Tuned and Retrieval-Augmented Generation Models

DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation

Evaluating Large Language Models for Document-grounded Response Generation in Information-Seeking Dialogues

From Answers to Insights: Unveiling the Strengths and Limitations of ChatGPT and Biomedical Knowledge Graphs

ChatGPT in Nuclear Medicine Education