Abstract: Droplet-based microfluidic devices have substantial promise as cost-effective alternatives to current assessment tools in biological research. Moreover, machine learning models that leverage tabular data, including input design parameters and their corresponding efficiency outputs, are increasingly utilised to automate the design process of these devices and to predict their performance. However, these models fail to fully leverage the data presented in the tables, neglecting crucial contextual information, including column headings and their associated descriptions. This study presents MicroFluidic-LLMs, a framework designed for processing and feature extraction, which effectively captures contextual information from tabular data formats. MicroFluidic-LLMs overcomes processing challenges by transforming the content into a linguistic format and leveraging pre-trained large language models (LLMs) for analysis. We evaluate our MicroFluidic-LLMs framework on 11 prediction tasks, covering aspects such as geometry, flow conditions, regimes, and performance, utilising a publicly available dataset on flow-focusing droplet microfluidics. We demonstrate that our MicroFluidic-LLMs framework can empower deep neural network models to be highly effective and straightforward while minimising the need for extensive data preprocessing. Moreover, the exceptional performance of deep neural network models, particularly when combined with advanced natural language processing models such as DistilBERT and GPT-2, reduces the mean absolute error in the droplet diameter and generation rate by nearly 5- and 7-fold, respectively, and enhances the regime classification accuracy by over 4%, compared with the performance reported in a previous study. This study lays the foundation for the huge potential applications of LLMs and machine learning in a wider spectrum of microfluidic applications.

What problem does this paper attempt to address?

This paper addresses the data-processing challenges in the design and performance prediction of droplet-generating microfluidic devices. Existing machine-learning models do not fully use the contextual information in tabular data, such as column headers and their descriptions. This limits their performance when predicting droplet diameter and generation rate and when classifying droplet generation regimes. In addition, inconsistent units and data types across different tabular datasets make data processing more complex.

To address these issues, the authors propose a framework named μ-Fluidic-LLMs. The framework converts tabular data into natural-language sentences and uses pre-trained large language models (LLMs) to generate embedding vectors, so that the contextual information in the table is captured. These embedding vectors are then used as inputs to standard machine-learning models to improve prediction performance and design automation.

### Main contributions:

1. **Data-processing method**: A new data-processing method is proposed that converts tabular data into natural-language sentences and uses large language models to generate embedding vectors, thereby better capturing contextual information.
2. **Performance improvement**: Combining deep neural networks (DNNs) with large language models (such as DistilBERT and GPT-2) significantly improves the prediction accuracy for droplet diameter and generation rate, and also improves regime classification accuracy.
3. **Design automation**: The framework is applied to design-automation tasks, where it predicts design parameters more accurately.
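The first step described above, turning a table row into a sentence that keeps the column context, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the column names, their descriptions, the sentence template, and the helper `row_to_sentence` are all assumptions. The embedding step with a pre-trained model such as DistilBERT or GPT-2 is not shown.

```python
from typing import Mapping

# Hypothetical column descriptions; the real dataset's headers may differ.
COLUMN_DESCRIPTIONS = {
    "orifice_width": "orifice width in micrometres",
    "aspect_ratio": "aspect ratio of the channel",
    "flow_rate_ratio": "flow rate ratio",
    "capillary_number": "capillary number",
}


def row_to_sentence(
    row: Mapping[str, float],
    descriptions: Mapping[str, str] = COLUMN_DESCRIPTIONS,
) -> str:
    """Serialize one table row into a sentence that keeps the column context."""
    parts = []
    for column, value in row.items():
        # Fall back to the raw header when no description is available.
        label = descriptions.get(column, column.replace("_", " "))
        parts.append(f"{label} is {value:g}")
    return "The " + ", the ".join(parts) + "."


# Illustrative values only, not taken from the paper's dataset.
example_row = {
    "orifice_width": 150,
    "aspect_ratio": 2.0,
    "flow_rate_ratio": 10,
    "capillary_number": 0.05,
}
sentence = row_to_sentence(example_row)
print(sentence)
```

A sentence produced this way would then be passed to a pre-trained language model, and the resulting embedding vector would serve as the input features for a downstream regressor or classifier.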
### Specific improvements:

- **Predicting droplet diameter**: The model combining a DNN with GPT-2 reduces the mean absolute error (MAE) from approximately 7.5 to approximately 2.5, reported in the abstract as a nearly 5-fold reduction relative to the previous study.
- **Predicting droplet generation rate**: The same combination reduces the MAE from approximately 20 to 3.1921, reported in the abstract as a nearly 7-fold reduction.
- **Classifying generation regimes**: Classification accuracy improves by more than 4%.

### Experimental verification:

The authors evaluated 11 prediction tasks on a public dataset containing 998 data points, covering geometric parameters, flow conditions, generation regimes, and performance. The results show that the μ-Fluidic-LLMs framework performs well across multiple tasks, with the largest gains when a DNN is combined with GPT-2.

In conclusion, this research improves the design and performance-prediction capabilities for droplet-generating microfluidic devices by introducing large language models and natural-language-processing techniques, providing new ideas and tools for future microfluidic applications.
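The comparisons above rest on the mean absolute error and on the ratio between a baseline MAE and a new MAE. A minimal sketch of both calculations is given below. The function names are hypothetical and the sample values are purely illustrative, not taken from the paper or its dataset.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the absolute differences between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fold_reduction(baseline_mae: float, new_mae: float) -> float:
    """How many times smaller the new MAE is than the baseline MAE."""
    if new_mae <= 0:
        raise ValueError("new_mae must be positive")
    return baseline_mae / new_mae


# Illustrative observations and two sets of predictions (made-up numbers).
observed = [100.0, 120.0, 80.0, 150.0]
baseline_pred = [110.0, 112.0, 86.0, 138.0]
improved_pred = [102.0, 119.0, 81.0, 148.0]

baseline_mae = mean_absolute_error(observed, baseline_pred)
improved_mae = mean_absolute_error(observed, improved_pred)
print(baseline_mae, improved_mae, fold_reduction(baseline_mae, improved_mae))
```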

Autonomous Droplet Microfluidic Design Framework with Large Language Models

Droplet based microfluidics integrated with machine learning

Machine learning enhanced droplet microfluidics

Chat-Microreactor: A Large-Language-Model-Based Assistant for Designing Continuous Flow Systems

A deep learning perspective on electro-hydrodynamic micro-droplet interface deformation characteristics

Physics-based statistical learning perspectives on droplet formation characteristics in microfluidic cross-junctions

Changes in VEGF and nitric oxide after deep dermal injury in the female, red Duroc pig-further similarities between female, Duroc scar and human hypertrophic scar.

Design automation of microfluidic single and double emulsion droplets with machine learning

An artificial intelligence-assisted digital microfluidic system for multistate droplet control

Artificial intelligence-based droplet size prediction for microfluidic system

Precise and Fast Microdroplet Size Distribution Measurement Using Deep Learning

A Large Language Model and Denoising Diffusion Framework for Targeted Design of Microstructures with Commands in Natural Language

Deep learning enabled label-free microfluidic droplet classification for single cell functional assays

FLUID-LLM: Learning Computational Fluid Dynamics with Spatiotemporal-aware Large Language Models

Modeling of Droplet Traffic in Interconnected Microfluidic Ladder Devices

Accelerating intelligent microfluidic image processing with transfer deep learning: A microchannel droplet/bubble breakup case study

Machine learning-enhanced predictive modeling for arbitrary deterministic lateral displacement design and test

In-process monitoring and prediction of droplet quality in droplet-on-demand liquid metal jetting additive manufacturing using machine learning

Measuring arrangement and size distributions of flowing droplets in microchannels through deep learning using DropTrack

Large Language Models as Molecular Design Engines

Explainable AI models for predicting drop coalescence in microfluidics device