Abstract: Large language models (LLMs), trained on diverse data, effectively acquire a breadth of information across various domains. However, their computational complexity, cost, and lack of transparency hinder their direct application for specialised tasks. In fields such as clinical research, acquiring expert annotations or prior knowledge about predictive models is often costly and time-consuming. This study proposes using LLMs to elicit expert prior distributions for predictive models. This approach also provides an alternative to in-context learning, where language models are tasked with making predictions directly. We compare LLM-elicited and uninformative priors, evaluate whether LLMs truthfully generate parameter distributions, and propose a model selection strategy for in-context learning and prior elicitation. Our findings show that LLM-elicited prior parameter distributions significantly reduce predictive error compared to uninformative priors in low-data settings. Applied to clinical problems, this translates to fewer required biological samples, lowering cost and resources. Prior elicitation also consistently outperforms and proves more reliable than in-context learning at a lower cost, making it a preferred alternative in our setting. We demonstrate the utility of this method across various use cases, including clinical applications. For infection prediction, using LLM-elicited priors reduced the number of labels required to achieve the same accuracy as an uninformative prior by 55%, reaching that accuracy 200 days earlier in the study.


### What problem does this paper attempt to solve?

This paper addresses the difficulty of obtaining expert prior knowledge for predictive modeling, and in particular how large language models (LLMs) can be used to improve model performance when data is scarce. The research focuses on the following aspects:

1. **Reducing data requirements and costs**: In fields such as clinical research, labeled data is expensive and time-consuming to obtain. Using LLMs to generate expert prior distributions significantly reduces prediction error when only a small amount of data is available, which reduces the number of biological samples required and lowers cost and resource consumption.
2. **Replacing in-context learning**: In-context learning lets language models make predictions directly, but it suffers from computational complexity, high cost, and low transparency. The paper instead elicits prior distributions from language models and compares this approach with in-context learning in terms of reliability and effectiveness.
3. **Verifying whether language models can accurately generate parameter distributions**: The research examines whether language models truthfully generate parameter distributions for predictive models, and proposes a model selection strategy for choosing between in-context learning and prior elicitation.
4. **Exploring the Bayesian inference ability of language models**: By providing training examples and extracting the language models' internal posterior predictive distributions, the research team tests whether language models perform Bayesian inference and how consistent their performance is across tasks.
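The effect described in point 1 can be illustrated with a minimal sketch. It assumes a one-parameter linear model `y = theta * x + noise` with known noise and a Gaussian prior, which is a simplification chosen for its closed-form update and is not the paper's model. The data, the true parameter, and the "elicited" prior values are hypothetical placeholders; in the paper's setting the informative prior would come from an LLM rather than being set by hand.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    mean: float
    std: float


def posterior(prior, xs, ys, noise_std=1.0):
    """Conjugate update for y = theta * x + N(0, noise_std^2) noise."""
    prior_precision = 1.0 / prior.std ** 2
    data_precision = sum(x * x for x in xs) / noise_std ** 2
    post_precision = prior_precision + data_precision
    weighted_sum = (prior.mean * prior_precision
                    + sum(x * y for x, y in zip(xs, ys)) / noise_std ** 2)
    return Gaussian(weighted_sum / post_precision, post_precision ** -0.5)


# Hypothetical low-data setting: two noisy observations, true slope 2.0.
TRUE_THETA = 2.0
xs, ys = [1.0, 2.0], [3.0, 3.0]

uninformative = Gaussian(mean=0.0, std=100.0)  # very wide prior
elicited = Gaussian(mean=1.9, std=0.5)         # placeholder for an LLM-elicited prior

post_uninf = posterior(uninformative, xs, ys)
post_elic = posterior(elicited, xs, ys)

err_uninf = abs(post_uninf.mean - TRUE_THETA)
err_elic = abs(post_elic.mean - TRUE_THETA)
print(f"uninformative: mean={post_uninf.mean:.3f}, error={err_uninf:.3f}")
print(f"elicited:      mean={post_elic.mean:.3f}, error={err_elic:.3f}")
```

With only two observations, the posterior under the wide prior follows the noisy data almost entirely, while the informative prior pulls the estimate toward a plausible value. The benefit depends on the prior being roughly right; a confidently wrong prior would increase the error instead.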
### Main contributions of the paper

- **Proposing a method of using large language models to generate expert prior distributions**: This method can significantly improve the performance of predictive models, especially when data is scarce.
- **Comparing the effects of prior distributions generated by language models and uninformative prior distributions**: The results show that priors generated by language models significantly reduce prediction error and require less labeled data to achieve the same accuracy.
- **Proposing methods for extracting context-prior and posterior distributions from language models**: This enables researchers to gain a deeper understanding of how language models behave across tasks and to evaluate whether they truly perform Bayesian inference.
- **Applying Bayes factors for model selection**: Comparing prior elicitation with in-context learning, the research finds that prior elicitation is the better choice in all tested tasks, especially given its consistency and lower cost.

### Formula summary

- **Prior distribution**:
  \[ p(\theta | M, T)=\sum_{k = 1}^{K}\pi_k N(\theta | \mu_k,\sigma_k^2) \]
  where \((\mu_k,\sigma_k)\sim p_{M,T}(\mu,\sigma | I_k)\), \(\pi_k\sim \text{Dir}(1)\)
- **Posterior predictive distribution**:
  \[ p(y | \tilde{x}, D)=\int_{\Theta}\sum_{k = 1}^{K}p(y | \tilde{x},\theta)p(\theta | D, I_k)p(I_k)d\theta \]
- **Bayes factor**:
  \[ BF(\alpha_0,\alpha_1; D)=\frac{p(D | \alpha_0)}{p(D | \alpha_1)}=\frac{\int_{\Theta}p(\theta_0 | \alpha_0)p(D | \theta_0,\alpha_0)d\theta_0}{\int_{\Theta}p(\theta_1 | \alpha_1)p(D | \theta_1,\alpha_1)d\theta_1} \]

Together, these methods and formulas show how large language models can be used to improve predictive modeling, especially in data-scarce and cost-sensitive settings.
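As a rough illustration of how the mixture prior and the Bayes factor fit together, the sketch below computes marginal likelihoods in closed form. It assumes a one-parameter linear model `y = theta * x + noise` with known Gaussian noise, which is an assumption made here for tractability and not the paper's model. It also compares an elicited prior against an uninformative one, whereas the paper applies the Bayes factor to prior elicitation versus in-context learning. The component values `(mu_k, sigma_k)` and the data are hypothetical. Equal mixture weights are used because the marginal likelihood is linear in the weights, so averaging over a symmetric Dirichlet gives weights of `1/K`.

```python
import math


def log_marginal_gaussian(xs, ys, mean, std, noise_std=1.0):
    """log p(D | theta ~ N(mean, std^2)) for y = theta * x + Gaussian noise.

    Uses the closed form for y ~ N(mean * x, noise^2 * I + std^2 * x x^T).
    """
    n = len(xs)
    sxx = sum(x * x for x in xs)
    resid = [y - mean * x for x, y in zip(xs, ys)]
    rr = sum(r * r for r in resid)
    xr = sum(x * r for x, r in zip(xs, resid))
    noise_var, prior_var = noise_std ** 2, std ** 2
    # Matrix determinant lemma and Sherman-Morrison for the rank-one covariance.
    log_det = n * math.log(noise_var) + math.log1p(prior_var * sxx / noise_var)
    quad = (rr - prior_var * xr ** 2 / (noise_var + prior_var * sxx)) / noise_var
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)


def log_marginal_mixture(xs, ys, components, noise_std=1.0):
    """log p(D | mixture prior) with equal weights 1/K, via log-sum-exp."""
    logs = [log_marginal_gaussian(xs, ys, m, s, noise_std) for m, s in components]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs) / len(logs))


def bayes_factor(log_ml_0, log_ml_1):
    """BF(alpha_0, alpha_1; D) = p(D | alpha_0) / p(D | alpha_1)."""
    return math.exp(log_ml_0 - log_ml_1)


# Hypothetical data and priors, for illustration only.
xs, ys = [1.0, 2.0, 3.0], [2.1, 3.9, 6.2]
elicited = [(1.8, 0.5), (2.2, 0.5)]  # placeholder (mu_k, sigma_k) pairs
uninformative = [(0.0, 100.0)]       # single very wide component

log_ml_elicited = log_marginal_mixture(xs, ys, elicited)
log_ml_uninformative = log_marginal_mixture(xs, ys, uninformative)
bf = bayes_factor(log_ml_elicited, log_ml_uninformative)
print(f"BF(elicited, uninformative) = {bf:.1f}")
```

A Bayes factor above 1 favours the first prior. Here the wide prior is penalised for spreading its mass over parameter values the data rule out, which is why marginal likelihood can serve as a model selection criterion without a held-out set.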

Using Large Language Models for Expert Prior Elicitation in Predictive Modelling

Eliciting the Priors of Large Language Models using Iterated In-Context Learning

Improving Clinical Expertise in Large Language Models Using Electronic Medical Records

Large Language Model Prompting Techniques for Advancement in Clinical Medicine

Probabilistic Medical Predictions of Large Language Models

LLM-Select: Feature Selection with Large Language Models

Is larger always better? Evaluating and prompting large language models for non-generative medical tasks

Clinical Risk Prediction Using Language Models: Benefits And Considerations

Large language models in solving clinical dilemmas - advantages and drawbacks

Prompting Large Language Models for Zero-Shot Clinical Prediction with Structured Longitudinal Electronic Health Record Data

Large language models encode clinical knowledge

XAI4LLM. Let Machine Learning Models and LLMs Collaborate for Enhanced In-Context Learning in Healthcare

Leveraging Large Language Models for Medical Information Extraction and Query Generation

Can Large Language Models Provide Emergency Medical Help Where There Is No Ambulance? A Comparative Study on Large Language Model Understanding of Emergency Medical Scenarios in Resource-Constrained Settings

Efficient Reinforcement Learning with Large Language Model Priors

Can Large Language Models Replace Data Scientists in Clinical Research?

Benchmarking Large Language Models in Evidence-Based Medicine

Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation

Black Box Warning: Large Language Models and the Future of Infectious Diseases Consultation

An Active Inference Strategy for Prompting Reliable Responses from Large Language Models in Medical Practice

Keeping Experts in the Loop: Expert-Guided Optimization for Clinical Data Classification using Large Language Models