Is a high-throughput experimental dataset large enough to accurately estimate a statistic?
Yifan Zhou, Sirui Lin, Xuhui Zhang, Hou Wu, Jose Blanchet, Zhigang Suo, Tongqing Lu
DOI: https://doi.org/10.1016/j.jmps.2023.105521
IF: 5.582
2023-12-13
Journal of the Mechanics and Physics of Solids
Abstract: In materials science, experimental datasets are commonly used to estimate various statistics of random variables. This paper focuses on a specific random variable: the rupture stretch of a material. Examples of statistics include average, standard deviation, coefficient of variation, and different quantiles. How accurate is the estimate of such a statistic? The answer depends on the statistic, the size of the experimental dataset, and how much the random variable scatters. Here we demonstrate a procedure to generate a large experimental dataset and use the experimental dataset to estimate the accuracy of various statistics of the rupture stretch. We use a high-throughput experiment to measure the rupture stretches of 160 specimens of a silicone rubber. We then use the bootstrap method to determine the 90% confidence intervals of several statistics. We find that the experimental dataset accurately estimates the average, standard deviation, and 50% quantile. However, the experimental dataset does not reliably estimate extremely low or high quantiles. This finding indicates an experimental dataset much larger than 160 specimens is needed to accurately estimate rare-event rupture stretch. We further apply the bootstrap method to an experimental dataset of strengths of 33 specimens of a ceramic. The result indicates that this experimental dataset is too small to accurately estimate the average strength of the ceramic. Our findings suggest that small datasets, though commonly used, may not support accurate estimates of some statistics of material properties. The high-throughput experiment provides a large experimental dataset of rupture stretch, from which the bootstrap method quantifies the accuracy of the estimates of various statistics. The bootstrap method does not require the user to have sophisticated expertise in statistical analysis. Nor does the bootstrap method require the dataset to obey any statistical distribution.
mechanics; materials science, multidisciplinary; physics, condensed matter
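
The abstract describes using the bootstrap method to obtain 90% confidence intervals for statistics such as the average, standard deviation, and quantiles. A minimal sketch of a percentile bootstrap is given below, as an illustration only: the abstract does not state which bootstrap variant the authors use, so the percentile interval, the function name `bootstrap_ci`, the number of resamples, and the seeds are all assumptions. The sample is synthetic placeholder data drawn from a Weibull distribution, not the measurements reported in the paper.

```python
import numpy as np


def bootstrap_ci(data, statistic, n_resamples=2000, level=0.90, seed=0):
    """Percentile-bootstrap confidence interval (hypothetical helper).

    Returns (point_estimate, lower_bound, upper_bound).
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = data.size
    # Draw n_resamples resamples with replacement, each of the original size n.
    idx = rng.integers(0, n, size=(n_resamples, n))
    estimates = np.array([statistic(data[row]) for row in idx])
    # Take the central `level` fraction of the resampled estimates.
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(estimates, [alpha, 1.0 - alpha])
    return statistic(data), lower, upper


# Synthetic placeholder data (NOT the paper's measurements): 160 values.
sample = np.random.default_rng(1).weibull(8.0, size=160) * 5.0

# Statistics named in the abstract, plus one low quantile as an example.
statistics = {
    "average": np.mean,
    "standard deviation": lambda x: np.std(x, ddof=1),
    "50% quantile": lambda x: np.quantile(x, 0.50),
    "1% quantile": lambda x: np.quantile(x, 0.01),
}

results = {name: bootstrap_ci(sample, f) for name, f in statistics.items()}

for name, (estimate, lower, upper) in results.items():
    rel_width = (upper - lower) / estimate
    print(f"{name}: {estimate:.3f}, 90% CI [{lower:.3f}, {upper:.3f}], "
          f"relative width {rel_width:.1%}")
```

Comparing the relative width of each interval is one simple way to judge whether a dataset of a given size estimates a statistic accurately. The resampling step makes no assumption about the distribution the data follow.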