Learning effective good variables from physical data

Giulio Barletta, Giovanni Trezza, Eliodoro Chiavazzo
2024-01-10
Abstract: We assume that a sufficiently large database is available, where a physical property of interest and a number of associated ruling primitive variables or observables are stored. We introduce and test two machine learning approaches to discover possible groups or combinations of primitive variables: the first approach is based on regression models, whereas the second is based on classification models. The variable group (here referred to as the new effective good variable) can be considered as successfully found when the physical property of interest is characterized by the following effective invariant behaviour: in the first method, invariance of the group implies invariance of the property up to a given accuracy; in the other method, upon partition of the physical property values into two or more classes, invariance of the group implies invariance of the class. For the sake of illustration, the two methods are successfully applied to two popular empirical correlations describing the convective heat transfer phenomenon and to Newton's law of universal gravitation.
Data Analysis, Statistics and Probability; Machine Learning
What problem does this paper attempt to address?
This paper presents methods for learning effective variables from physical data. It assumes that a sufficiently large database is available, storing a physical property of interest together with the primitive variables or observables that govern it. Two machine learning approaches are proposed for discovering groups (combinations) of primitive variables: one based on regression models and the other on classification models. A group is considered a new effective variable when the property behaves invariantly with respect to it. In the regression-based method, invariance of the group implies invariance of the property value up to a given accuracy. In the classification-based method, the property values are first partitioned into two or more classes, and invariance of the group implies invariance of the class. Both methods are illustrated on two popular empirical correlations for convective heat transfer and on Newton's law of universal gravitation.

The paper also discusses how to reduce the set of material descriptors through a multi-objective optimization process in order to improve classification performance. This approach was previously applied to superconductors and has been used to identify reduced variable sets in photocatalytic microsystems, yielding high-performance combinations as alternatives to expensive components. In the methodology, regression models are used to find the variable groups, while classification models are used to find the variable mixtures that best separate the classes. The reported results indicate that these methods can identify simplified descriptions of complex systems and can be extended to more general functional forms.
Overall, the paper aims to provide an automated method that identifies, from data, the key variable combinations needed for a simplified description of a physical system, thereby easing theoretical modeling and numerical simulation.
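
To make the invariance idea behind the regression-based approach concrete, here is a minimal sketch in Python using Newton's law of universal gravitation as the test case. It is not the authors' algorithm: it assumes a power-law group and uses an ordinary least-squares fit in log space, and the synthetic data ranges and the names `group`, `newton_force`, `state_a`, and `state_b` are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
G = 6.674e-11  # gravitational constant, N m^2 / kg^2


def newton_force(m1, m2, r):
    """Physical property of interest: gravitational force magnitude."""
    return G * m1 * m2 / r**2


# Synthetic "database" of primitive variables (ranges are arbitrary assumptions).
n = 500
m1 = rng.uniform(1.0, 100.0, n)
m2 = rng.uniform(1.0, 100.0, n)
r = rng.uniform(0.1, 10.0, n)
F = newton_force(m1, m2, r)

# Assumed power-law ansatz: log F = c + a*log(m1) + b*log(m2) + d*log(r).
X = np.column_stack([np.ones(n), np.log(m1), np.log(m2), np.log(r)])
coef, *_ = np.linalg.lstsq(X, np.log(F), rcond=None)
log_prefactor, exponents = coef[0], coef[1:]


def group(m1, m2, r, e=exponents):
    """Candidate effective variable: m1^a * m2^b * r^d with fitted exponents."""
    return m1 ** e[0] * m2 ** e[1] * r ** e[2]


# Invariance check: two different states that share the same group value
# should also share the same value of the property.
state_a = (2.0, 8.0, 1.0)
state_b = (4.0, 16.0, 2.0)

print("fitted exponents:", np.round(exponents, 6))
print("group values:", group(*state_a), group(*state_b))
print("forces:", newton_force(*state_a), newton_force(*state_b))
```

In this noise-free setting the fitted exponents should come out close to 1, 1, and -2, so the group reduces to m1 m2 / r^2. The two states differ in every primitive variable, yet they share the same group value and therefore the same force, which is the invariance property described above. With noisy data, the same check would only hold up to a chosen accuracy.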