Generation of synthetic gait data: application to multiple sclerosis patients' gait patterns

Klervi Le Gall, Lise Bellanger, David Laplaud, Aymeric Stamm
2024-11-20
Abstract: Multiple sclerosis (MS) is the leading cause of severe non-traumatic disability in young adults and its incidence is increasing worldwide. The variability of gait impairment in MS necessitates the development of a non-invasive, sensitive, and cost-effective tool for quantitative gait evaluation. The eGait movement sensor, designed to characterize human gait through unit quaternion time series (QTS) representing hip rotations, is a promising approach. However, the small sample sizes typical of clinical studies pose challenges for the stability of gait data analysis tools. To address these challenges, this article presents two key scientific contributions. First, a comprehensive framework is proposed for transforming QTS data into a form that preserves the essential geometric properties of gait while enabling the use of any tabular synthetic data generation method. Second, a synthetic data generation method is introduced, based on nearest neighbors weighting, which produces high-fidelity synthetic QTS data suitable for small datasets and private data environments. The effectiveness of the proposed method is demonstrated through its application to MS gait data, showing very good fidelity and respect for the initial geometry of the data. Thanks to this work, we are able to produce synthetic data sets and study the stability of clustering methods.
Computer Vision and Pattern Recognition, Applications
What problem does this paper attempt to address?
The paper asks how to generate high-fidelity synthetic data for the gait patterns of patients with multiple sclerosis (MS), so that small sample sizes and data privacy constraints in clinical research do not undermine the stability of gait data analysis tools. Specifically, it targets the following key problems:

1. **The challenge of small sample sizes**: Clinical studies usually have few participants, which makes gait data analysis tools less stable and less reliable. A method is needed to generate enough synthetic data to evaluate the performance of these tools.
2. **Data privacy protection**: Sharing clinical data raises privacy concerns, and traditional de-identification methods are not sufficient to protect personal data. A method is therefore required that generates synthetic data without revealing patient information.
3. **The complexity of gait data**: Gait impairments in patients with multiple sclerosis are highly variable, and traditional gait analysis methods have difficulty capturing their complex geometric features. The paper works with unit quaternion time series (QTS), which describe joint movements in gait more faithfully.

To solve these problems, the paper makes two main scientific contributions:

- **Framework design**: A comprehensive framework for converting QTS data into tabular data is proposed. It retains the key geometric properties of gait data and allows the use of any tabular synthetic data generation method.
- **Synthetic data generation method**: A synthetic data generation method based on nearest-neighbor weighting is introduced. It generates high-fidelity synthetic QTS data and is especially suited to small samples and private data environments.
Through these contributions, the paper demonstrates the proposed method on multiple sclerosis gait data and reports very good fidelity together with preservation of the geometric properties of the original data. This lets researchers study and improve the stability and reliability of gait analysis tools without relying on large-scale real data.
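The two contributions can be illustrated with a minimal Python sketch. It is not the authors' implementation: this summary does not specify the exact transform or weighting scheme, so the sketch assumes (a) the quaternion logarithm at the identity as the map from unit quaternions to unconstrained 3-vectors, and (b) random convex weights over a seed observation and its `k` nearest neighbors. All function names (`quat_log`, `quat_exp`, `qts_to_row`, `row_to_qts`, `synthesize`) are hypothetical.

```python
import math
import random


def quat_log(q):
    """Map a unit quaternion (w, x, y, z) to a 3-vector (tangent space at identity)."""
    w, x, y, z = q
    if w < 0:  # q and -q encode the same rotation; use the w >= 0 representative
        w, x, y, z = -w, -x, -y, -z
    norm_v = math.sqrt(x * x + y * y + z * z)
    if norm_v < 1e-12:
        return (0.0, 0.0, 0.0)
    half_angle = math.atan2(norm_v, w)
    scale = half_angle / norm_v
    return (scale * x, scale * y, scale * z)


def quat_exp(v):
    """Inverse of quat_log: map a 3-vector back to a unit quaternion."""
    x, y, z = v
    half_angle = math.sqrt(x * x + y * y + z * z)
    if half_angle < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    scale = math.sin(half_angle) / half_angle
    return (math.cos(half_angle), scale * x, scale * y, scale * z)


def qts_to_row(qts):
    """Flatten a QTS (list of unit quaternions) into one tabular row of 3 * T values."""
    row = []
    for q in qts:
        row.extend(quat_log(q))
    return row


def row_to_qts(row):
    """Rebuild a QTS from a tabular row; every output quaternion has unit norm."""
    return [quat_exp(row[i:i + 3]) for i in range(0, len(row), 3)]


def synthesize(rows, k=3, rng=None):
    """Draw one synthetic row as a random convex combination of a randomly
    chosen seed row and its k nearest neighbors (Euclidean distance).
    This weighting scheme is an assumption made for illustration."""
    rng = rng or random.Random()
    seed = rng.choice(rows)
    # The seed itself is at distance 0, so it is always among the k + 1 rows kept.
    neighbors = sorted(rows, key=lambda r: math.dist(r, seed))[:k + 1]
    # Normalized exponential draws are uniformly distributed on the simplex.
    raw = [rng.expovariate(1.0) for _ in neighbors]
    total = sum(raw)
    weights = [w / total for w in raw]
    return [sum(w * r[j] for w, r in zip(weights, neighbors))
            for j in range(len(seed))]
```

A typical use would convert every recorded QTS with `qts_to_row`, call `synthesize` repeatedly on the resulting table, and map each synthetic row back with `row_to_qts`. Averaging happens in the unconstrained vector space rather than on raw quaternion components, so the reconstructed quaternions stay on the unit sphere, which is the kind of geometric property the framework is meant to preserve.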