Omega — harnessing the power of large language models for bioimage analysis

Loïc A. Royer
DOI: https://doi.org/10.1038/s41592-024-02310-w
IF: 48
2024-06-11
Nature Methods
Abstract: We introduce Omega, a large language model (LLM)-based [1] conversational agent implemented as a napari [2] plug-in capable of performing image processing tasks, analyzing images to gather insights, correcting its own coding mistakes, and conducting follow-up quantifications and analyses (https://github.com/royerlab/napari-chatgpt). For instance, a user can instruct Omega to "segment cell nuclei in the selected image on the napari viewer," then "count the number of segmented nuclei" and finally "return a table that lists the nuclei, their positions and areas" (Fig. 1 and Supplementary Videos 1 and 2). Moreover, Omega can provide advice and instructions on various image processing and analysis topics. A user can ask Omega to create a "step-by-step plan to segment nuclei in an image," and Omega will generate a detailed strategy (Supplementary Video 3). The user can then interactively apply these steps, make changes in response to the outcomes, and ask follow-up questions to complete the task (Supplementary Video 3). Omega can correct its own coding mistakes (Supplementary Video 9), download files from the web (Supplementary Video 10), perform web searches (Supplementary Video 11), execute arbitrary Python code (Supplementary Video 12), control and query the state as well as contents of the napari viewer (Supplementary Video 13), make napari widgets (Supplementary Videos 4 and 5) and query the parameters and documentation of Python functions (Supplementary Video 14). Omega can segment cells and nuclei in 2D and 3D images using cellpose [3] (Supplementary Video 15) and StarDist [4] (Supplementary Videos 1 and 4). It can also denoise images using Aydin [5] (Supplementary Video 16). Omega inherits ChatGPT's Python coding abilities and knowledge (Supplementary Video 17).
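As a rough illustration of the three-step request quoted in the abstract (segment nuclei, count them, tabulate positions and areas), the sketch below shows what such an analysis might look like in plain Python. It is not code produced by Omega and does not use napari, StarDist, or cellpose: it assumes a synthetic image and swaps in simple thresholding with connected-component labeling from scipy.ndimage. The function names make_synthetic_nuclei, segment_nuclei, and nuclei_table are hypothetical and chosen only for this example.

```python
import numpy as np
from scipy import ndimage as ndi


def make_synthetic_nuclei(shape=(64, 64),
                          centers=((16, 16), (16, 48), (48, 32)),
                          radius=6):
    """Build a toy image with bright disks standing in for nuclei (hypothetical data)."""
    yy, xx = np.mgrid[: shape[0], : shape[1]]
    image = np.zeros(shape, dtype=float)
    for cy, cx in centers:
        image[(yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2] = 1.0
    return image


def segment_nuclei(image, threshold=0.5):
    """Threshold the image and label connected components.

    Assumption: a fixed threshold stands in for a learned segmenter
    such as StarDist or cellpose.
    """
    mask = image > threshold
    labels, count = ndi.label(mask)
    return labels, count


def nuclei_table(labels, count):
    """Return one row per nucleus: label, centroid (row, col), area in pixels."""
    index = np.arange(1, count + 1)
    foreground = (labels > 0).astype(float)
    centroids = ndi.center_of_mass(foreground, labels, index)
    # bincount gives pixel counts per label; entry 0 is the background.
    areas = np.bincount(labels.ravel(), minlength=count + 1)[1:]
    return [
        {"label": int(i), "row": float(c[0]), "col": float(c[1]), "area": int(a)}
        for i, c, a in zip(index, centroids, areas)
    ]


if __name__ == "__main__":
    image = make_synthetic_nuclei()
    labels, count = segment_nuclei(image)
    print(f"nuclei found: {count}")
    for row in nuclei_table(labels, count):
        print(row)
```

On real micrographs a fixed threshold would merge touching nuclei, which is one reason dedicated segmenters such as those named in the abstract are used instead.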
biochemical research methods