Clustering functional data with measurement errors: a simulation-based approach

Tingyu Zhu,Lan Xue,Carmen Tekwe,Keith Diaz,Mark Benden,Roger Zoh
DOI: https://doi.org/10.48550/arxiv.2406.11942
2024-06-17
Methodology
Abstract:Clustering analysis of functional data, which comprises observations that evolve continuously over time or space, has gained increasing attention across various scientific disciplines. Practical applications often involve functional data that are contaminated with measurement errors arising from imprecise instruments, sampling errors, or other sources. These errors can significantly distort the inherent data structure, resulting in erroneous clustering outcomes. In this paper, we propose a simulation-based approach designed to mitigate the impact of measurement errors. Our proposed method estimates the distribution of functional measurement errors through repeated measurements. Subsequently, the clustering algorithm is applied to simulated data generated from the conditional distribution of the unobserved true functional data given the observed contaminated functional data, accounting for the adjustments made to rectify measurement errors. We illustrate through simulations show that the proposed method has improved numerical performance than the naive methods that neglect such errors. Our proposed method was applied to a childhood obesity study, giving more reliable clustering results
What problem does this paper attempt to address?