Determination of rank by median absolute deviation (DRMAD): a simple method for determining the number of principal factors responsible for a data matrix

Edmund R. Malinowski
DOI: https://doi.org/10.1002/cem.1182
IF: 2.5
2009-01-01
Journal of Chemometrics
Abstract: Median absolute deviation (MAD) is a well-established statistical method for determining outliers. This simple statistic can be used to determine the number of principal factors responsible for a data matrix by direct application to the residual standard deviation (RSD) obtained from principal component analysis (PCA). Unlike many other popular methods, the proposed method, called determination of rank by MAD (DRMAD), does not involve pseudo degrees of freedom, pseudo F-tests, extensive calibration tables, time-consuming iterations, or empirical procedures. The method does not require strict adherence to normal distributions of experimental uncertainties. The computations are direct, simple to use, and extremely fast, which makes the method well suited to online data processing. The results obtained using various sets of chemical data previously reported in the chemical literature agree with the earlier work. Limitations of the method, determined from model data, are discussed. An algorithm, written in MATLAB format, is presented in the Appendix. Copyright © 2008 John Wiley & Sons, Ltd.
Subject categories: chemistry, analytical; instruments & instrumentation; mathematics, interdisciplinary applications; automation & control systems; computer science, artificial intelligence; statistics & probability
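The abstract describes the idea (apply MAD to the RSD values obtained from PCA) but not the exact decision rule, so the following Python sketch is only an illustration of that idea, not the DRMAD algorithm published in the paper's Appendix. It assumes the commonly used definition RSD_n = sqrt(sum of the eigenvalues beyond the n-th / (r(c - n))) for an r × c matrix with c ≤ r, treats the RSD values of the signal-bearing factors as high outliers relative to the noise-level RSD values, and uses a cutoff of 3 scaled MADs. The function names, the cutoff, the 1.4826 scaling constant, and the synthetic test matrix are all assumptions made for this example.

```python
import numpy as np


def residual_standard_deviation(data):
    """Return RSD_n for n = 0 .. c-1 factors (assumed definition, see lead-in)."""
    data = np.asarray(data, dtype=float)
    if data.shape[0] < data.shape[1]:
        data = data.T  # work with r >= c
    r, c = data.shape
    # Squared singular values are the eigenvalues of data.T @ data (descending).
    eigenvalues = np.linalg.svd(data, compute_uv=False) ** 2
    # tail_sums[n] = sum of the eigenvalues left over after keeping n factors.
    tail_sums = np.cumsum(eigenvalues[::-1])[::-1]
    n = np.arange(c)
    return np.sqrt(tail_sums / (r * (c - n)))


def estimate_rank_by_mad(rsd, cutoff=3.0):
    """Count RSD values that are high outliers by a MAD test.

    Hypothetical decision rule: the cutoff and the single-pass flagging are
    illustration choices, not taken from the paper.
    """
    rsd = np.asarray(rsd, dtype=float)
    median = np.median(rsd)
    mad = np.median(np.abs(rsd - median))
    scaled_mad = 1.4826 * mad  # consistency constant for normal data
    if scaled_mad == 0.0:
        return 0  # no spread to test against
    # RSD_0 .. RSD_(k-1) exceed the noise level when k factors are present,
    # so the number of high outliers is the rank estimate.
    return int(np.sum(rsd - median > cutoff * scaled_mad))


def make_demo_matrix(seed=0):
    """Synthetic 50 x 10 matrix built from 3 factors plus small noise."""
    rng = np.random.default_rng(seed)
    scores = rng.standard_normal((50, 3))
    loadings = rng.standard_normal((3, 10))
    noise = 0.001 * rng.standard_normal((50, 10))
    return scores @ loadings + noise


if __name__ == "__main__":
    rsd_values = residual_standard_deviation(make_demo_matrix())
    print("RSD values:", rsd_values)
    print("Estimated number of factors:", estimate_rank_by_mad(rsd_values))
```

Because the median and MAD are taken over all RSD values at once, this simplified rule can only work when fewer than half of the values belong to signal-bearing factors; the paper discusses the limitations of the actual method using model data.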