Computational toolkit for predicting thickness of 2D materials using machine learning and autogenerated dataset by large language model

Chinedu Ekuma
2024-05-24
Abstract: The thickness of 2D materials not only plays a crucial role in determining the performance of nanoelectronic and optoelectronic devices but also introduces complexities in predicting volume-dependent properties such as energy storage capacity, due to the intrinsic vacuum within these materials. Although a plethora of experimental techniques, including but not limited to optical contrast, Raman spectroscopy, nonlinear optical spectroscopy, near-field optical imaging, and hyperspectral imaging, facilitate the measurement of 2D material thickness, comprehensive data for many materials remains elusive. Over the last decade, the exponential proliferation of 2D materials and their heterostructures has outstripped the capabilities of conventional experimental and computational approaches. In this evolving landscape, machine learning (ML) has emerged as an indispensable tool, offering novel avenues to augment these traditional methodologies. Addressing the critical gap, we introduce THICK2D - Thickness Hierarchy Inference and Calculation Kit for 2D Materials. This Python-based computational framework harnesses an autogenerated thickness database, developed using large language models (LLMs), and advanced ML algorithms to facilitate the rapid and scalable estimation of material thickness, relying solely on crystallographic data. To demonstrate the utility and robustness of THICK2D, we successfully employed the toolkit to predict the thickness of more than 8000 2D-based materials, sourced from two extensive 2D material databases. THICK2D is disseminated as an open-source utility, accessible on GitHub and archived on Zenodo at https://doi.org/10.5281/zenodo.11216648.
Materials Science, Strongly Correlated Electrons
What problem does this paper attempt to address?
The paper addresses how to predict the thickness of two-dimensional (2D) materials efficiently. It introduces THICK2D, a Python computational framework that combines machine learning algorithms with a thickness database generated by a large language model (LLM) to estimate the thickness of 2D materials quickly and scalably from crystallographic data alone. Traditional experimental methods (e.g., optical contrast, Raman spectroscopy) are time-consuming and labor-intensive when thickness must be measured for a large number of 2D materials, so THICK2D offers an alternative based on data augmentation and deep neural networks. The workflow has three stages: generating high-quality thickness data with PropertyExtractor, expanding the dataset through data augmentation, and training deep neural networks and traditional machine learning models for prediction. The paper demonstrates the approach by predicting the thickness of more than 8,000 2D materials, and it highlights why these predictions matter for understanding material properties and for applications in nanoelectronics, optoelectronics, and energy storage. THICK2D is open source, available on GitHub and archived on Zenodo. The tool is intended to complement traditional experimental and computational methods and to speed up the analysis of 2D material thickness.
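The augment-then-train stages of this workflow can be illustrated with a minimal sketch. This is not THICK2D's code or API: the function names (`augment`, `fit_ridge`, `predict`), the three generic feature columns standing in for crystallographic descriptors, and the synthetic thickness values are all assumptions made for illustration. A closed-form ridge regression is swapped in for the deep neural networks the paper describes, so the example depends only on NumPy.

```python
import numpy as np


def augment(X, y, n_copies=5, noise=0.01, seed=0):
    """Expand a small dataset by appending jittered copies of the features.

    Hypothetical augmentation scheme: each copy adds small Gaussian noise
    to the features and reuses the original target values.
    """
    rng = np.random.default_rng(seed)
    X_parts, y_parts = [X], [y]
    for _ in range(n_copies):
        X_parts.append(X + rng.normal(0.0, noise, size=X.shape))
        y_parts.append(y)
    return np.vstack(X_parts), np.concatenate(y_parts)


def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression with a bias column.

    Stand-in for the neural networks and ML models used in the paper.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def predict(w, X):
    """Apply the fitted weights (including bias) to new feature rows."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


# Synthetic placeholder data: 20 "materials" with 3 made-up descriptors.
# These numbers are NOT real thickness measurements.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(20, 3))
true_w = np.array([1.5, -0.5, 2.0])
y = X @ true_w + 3.0  # synthetic "thickness" target

X_aug, y_aug = augment(X, y)      # stage 2: data augmentation
w = fit_ridge(X_aug, y_aug)       # stage 3: model training
pred = predict(w, X)
mae = float(np.mean(np.abs(pred - y)))
print(f"augmented samples: {X_aug.shape[0]}, MAE on original data: {mae:.4f}")
```

In a real setting the feature matrix would come from crystallographic data and the targets from the LLM-extracted thickness database, and the error would be measured on held-out materials rather than on the training set.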