Comparison of Open-Source and Proprietary LLMs for Machine Reading Comprehension: A Practical Analysis for Industrial Applications

Mahaman Sanoussi Yahaya Alassan, Jessica López Espejel, Merieme Bouhandi, Walid Dahhane, El Hassane Ettifouri

2024-12-07

Abstract: Large Language Models (LLMs) have recently demonstrated remarkable performance in various Natural Language Processing (NLP) applications, such as sentiment analysis, content generation, and personalized recommendations. Despite their impressive capabilities, there remains a significant need for systematic studies of the practical application of LLMs in industrial settings, and of the specific requirements and challenges of deploying them in these contexts. This need is particularly critical for Machine Reading Comprehension (MRC), where factual, concise, and accurate responses are required. To date, most MRC systems rely on Small Language Models (SLMs) or Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM) networks. This trend is evident in the SQuAD2.0 rankings on the Papers with Code table. This article presents a comparative analysis of open-source LLMs and proprietary models on this task, aiming to identify lightweight, open-source alternatives that offer performance comparable to proprietary models.

Computation and Language

What problem does this paper attempt to address?

This paper addresses how to select and deploy large language models (LLMs) for machine reading comprehension (MRC) tasks in industrial applications. Specifically, it compares the performance of open-source and proprietary LLMs on MRC, aiming to identify lightweight open-source alternatives that can match proprietary models in resource-constrained environments. This involves evaluating different models on key metrics such as accuracy, efficiency, and scalability, and analyzing the trade-offs these models present in actual deployment. Through this research, the authors hope to guide industry decision-makers toward more informed choices when integrating MRC capabilities into their workflows while overcoming operational challenges.
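
The summary above does not say which accuracy metrics the paper uses. As an illustration only, SQuAD-style benchmarks commonly score MRC answers with exact match (EM) and token-level F1 after light text normalization. The sketch below assumes that convention; the function names `normalize_answer`, `exact_match`, and `token_f1` are hypothetical and are not taken from the paper.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # Assumed SQuAD2.0-style handling: an empty reference marks an
    # unanswerable question, so only an empty prediction scores 1.0.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a verbose answer such as "in Paris, France" scored against the reference "Paris" gets an EM of 0 but a partial F1, which is one reason generative LLM outputs are often reported with both scores.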

Evaluating the Efficacy of Open-Source LLMs in Enterprise-Specific RAG Systems: A Comparative Study of Performance and Scalability

SMLT-MUGC: Small, Medium, and Large Texts -- Machine versus User-Generated Content Detection and Comparison

Comparative Analysis of Open-Source Language Models in Summarizing Medical Text Data

"Which LLM should I use?": Evaluating LLMs for tasks performed by Undergraduate Computer Science Students

Scaling Down to Scale Up: A Cost-Benefit Analysis of Replacing OpenAI's LLM with Open Source SLMs in Production

Survey of different Large Language Model Architectures: Trends, Benchmarks, and Challenges

Investigating LLM Applications in E-Commerce

Beyond Metrics: Evaluating LLMs' Effectiveness in Culturally Nuanced, Low-Resource Real-World Scenarios

The Model Arena for Cross-lingual Sentiment Analysis: A Comparative Study in the Era of Large Language Models

Studying LLM Performance on Closed- and Open-source Data

An energy-based comparative analysis of common approaches to text classification in the Legal domain

M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering

Open, Closed, or Small Language Models for Text Classification?

Several categories of Large Language Models (LLMs): A Short Survey

Open-Source LLMs for Text Annotation: A Practical Guide for Model Setting and Fine-Tuning

Can Large Language Models Be an Alternative to Human Evaluations?

LLMs with Industrial Lens: Deciphering the Challenges and Prospects -- A Survey

Sentiment Analysis in the Era of Large Language Models: A Reality Check

REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs

SLM-Mod: Small Language Models Surpass LLMs at Content Moderation