Programming with AI: Evaluating ChatGPT, Gemini, AlphaCode, and GitHub Copilot for Programmers

Md Kamrul Siam, Huanying Gu, Jerry Q. Cheng
2024-11-14
Abstract: Our everyday lives now rely heavily on large language models (LLMs) powered by artificial intelligence (AI). Like regular users, programmers are also benefiting from the newest LLMs. In response to the critical role that AI models play in modern software development, this study presents a thorough evaluation of leading programming assistants, including ChatGPT, Gemini (Bard AI), AlphaCode, and GitHub Copilot. The evaluation covers natural language processing and code generation accuracy across programming languages such as Java, Python, and C++. The results highlight the strengths and weaknesses of these assistants and the importance of further modifications to increase the reliability and accuracy of the latest popular models. Although these AI assistants demonstrate substantial progress in language understanding and code generation, they also raise questions of ethics and responsible usage that warrant discussion. Over time, developing more refined AI technology, informed by an understanding of these models' features and their implications, is essential for achieving advanced solutions in various fields. This study offers a comparison of different LLMs and provides essential feedback on the rapidly changing area of AI models. It also emphasizes the need for ethical development practices to realize the full potential of AI models.
Software Engineering, Artificial Intelligence
What problem does this paper attempt to address?
This paper evaluates how well current leading programming assistants perform on code generation and natural language processing tasks. Specifically, it conducts a comprehensive evaluation of large language models (LLMs) including ChatGPT, Gemini (Bard AI), AlphaCode, and GitHub Copilot. The evaluation measures natural language processing and code generation accuracy on tasks in several programming languages, such as Java, Python, and C++. From these results, the paper identifies the strengths and weaknesses of each model and argues that further improvements are needed to make the latest popular models more reliable and accurate. The paper also notes the substantial progress these AI assistants have made in language understanding and code generation, and it calls for ethical consideration and responsible use as more refined AI technologies are developed for advanced solutions in various fields.