Supplementary Material to "Chemometric Analysis of Aerosol Mass Spectra: Exploratory Methods to Extract and Classify Anthropogenic Aerosol Chemotypes"

Mikko Äijälä, Liine Heikkinen, Roman Fröhlich, Francesco Canonaco, André S. H. Prévôt, Heikki Junninen, Tuukka Petäjä, Markku Kulmala, Douglas R. Worsnop, Mikael Ehn
DOI: https://doi.org/10.5194/acp-2016-632-supplement
2016-01-01
Abstract:

S.1 PMF: on robust mode and rotational ambiguity

Since the amount of weight given in PMF to an observation (here: the value of a variable i at a certain time j) by the iterative process is proportional to the square of E_i,j / σ_i,j, outliers with abnormally high squared signal or low variance may end up dominating the model solution. This phenomenon is especially relevant in environmental observations, as several types of outliers could conceivably cause this behaviour, such as errors in the functioning of the measurement instrument, or extreme, rare events that are considered contamination from the point of view of the analysis. Therefore a "robust mode" for PMF was introduced (Paatero, 1997). In short, the approach introduces a limit α for the weight given to a point (E_i,j / σ_i,j), beyond which the point is considered an outlier and dynamically down-weighted to negate its disproportionate effect on the objective function Q. For a complete explanation of outliers and the robust mode, we refer the reader to the original work (Paatero, 1997).

The main weaknesses of PMF, and indeed of most factor analytic or linear algebraic methods, are:

1) The "rotational ambiguity" of the solutions, i.e. the existence of multiple, sometimes very different, mathematical solutions with an equally high rate of explanation of the observed (weighted) variance (Paatero, 1997; Paatero et al., 2002). Exploring the rotations and selecting the best solution from the "solution space" needs to be done by the analyst, often based mainly on the interpretability of the results in the context of the particular research topic at hand.

2) The selection of the number of factors, f. While the rate of decrease of Q with increasing f can be considered an indicator of the number of factors present in the set of data (Paatero and Tapper, 1993; Ulbrich et al., 2009; Reff et al., 2007), it rarely gives unambiguous answers. In the end it is up to the analyst to decide f based on both the diagnostics offered by Q and the interpretability of the result.

These two subjective selections are often considered the most debatable part of a factor analysis (Ulbrich et al., 2009; Reff et al., 2007; Kim and Mueller, 1978). An additional constraint of the method is that it regards the chemical composition of a factor as invariable, and
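
The down-weighting used by the robust mode can be illustrated with a minimal sketch. It assumes the Huber-type scheme commonly described for robust PMF, in which a point whose scaled residual |E_i,j / σ_i,j| exceeds α has its squared contribution to Q divided by |E_i,j / σ_i,j| / α. The function name `robust_q`, the default α = 4 and the example matrices are illustrative assumptions, not values taken from the text above.

```python
import numpy as np


def robust_q(residuals, sigma, alpha=4.0):
    """Return (Q, Q_robust) for a residual matrix and an uncertainty matrix.

    Hypothetical helper for illustration only; alpha=4.0 is an assumed default.
    """
    # Scaled residuals |E_ij / sigma_ij|
    scaled = np.abs(residuals / sigma)

    # Ordinary objective: every point contributes its squared scaled residual
    q = float(np.sum(scaled ** 2))

    # Assumed robust scheme: h^2 = 1 within the limit alpha,
    # h^2 = |E_ij / sigma_ij| / alpha beyond it
    h_sq = np.where(scaled <= alpha, 1.0, scaled / alpha)
    q_robust = float(np.sum(scaled ** 2 / h_sq))
    return q, q_robust


# Toy residuals with one gross outlier (20 sigma) and unit uncertainties
E = np.array([[0.5, -1.0],
              [2.0, 20.0]])
sigma = np.ones_like(E)

q, q_robust = robust_q(E, sigma)
print(q, q_robust)
```

In this toy case the single outlier accounts for almost all of the ordinary Q, whereas under the assumed robust weighting its contribution grows only linearly with the scaled residual, so it no longer dominates the sum.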