biorecap: an R package for summarizing bioRxiv preprints with a local LLM

Stephen D. Turner
2024-08-21
Abstract:The establishment of bioRxiv facilitated the rapid adoption of preprints in the life sciences, accelerating the dissemination of new research findings. However, the sheer volume of preprints published daily can be overwhelming, making it challenging for researchers to stay updated on the latest developments. Here, I introduce biorecap, an R package that retrieves and summarizes bioRxiv preprints using a large language model (LLM) running locally on nearly any commodity laptop. biorecap leverages the ollamar package to interface with the Ollama server and API endpoints, allowing users to prompt any local LLM available through Ollama. The package follows tidyverse conventions, enabling users to pipe the output of one function as input to another. Additionally, biorecap provides a single wrapper function that generates a timestamped CSV file and HTML report containing short summaries of recent preprints published in user-configurable subject areas. By combining the strengths of LLMs with the flexibility and security of local execution, biorecap represents an advancement in the tools available for managing the information overload in modern scientific research. The biorecap R package is available on GitHub at <a class="link-external link-https" href="https://github.com/stephenturner/biorecap" rel="external noopener nofollow">this https URL</a> under an open-source (MIT) license.
Other Quantitative Biology
What problem does this paper attempt to address?