Robust Nonparametric Regression for Compositional Data: the Simplicial-Real case

Ana M. Bianco, Graciela Boente, Wenceslao González-Manteiga, Francisco Gude Sampedro, Ana Pérez-González
DOI: https://doi.org/10.48550/arXiv.2405.12924
2024-05-22
Abstract: Statistical analysis of compositional data has gained a lot of attention because of its wide range of potential applications. A feature of these data is that they are multivariate vectors that lie in the simplex, that is, the components of each vector are positive and sum to a constant value. This poses a challenge to the analyst because of the internal dependency among the components, which exhibit a spurious negative correlation. Since classical multivariate techniques are not appropriate in this scenario, it is necessary to endow the simplex with a suitable algebraic-geometric structure, which is the starting point for developing adequate methodology and strategies to handle compositions. We focus on regression problems with real responses and compositional covariates, and we adopt a nonparametric approach because of the flexibility it provides. Aware of the potential damage that outliers may cause, we introduce a robust estimator in the framework of nonparametric regression for compositional data. The performance of the estimators is investigated by means of a numerical study in which different contamination schemes are simulated. The advantages of using a robust procedure are illustrated through a real data analysis.