Abstract: Smartwatch health sensor data are increasingly utilized in smart health applications and patient monitoring, including stress detection. However, such medical data often comprise sensitive personal information and are resource-intensive to acquire for research purposes. In response to this challenge, we introduce the privacy-aware synthetization of multi-sensor smartwatch health readings related to moments of stress, employing Generative Adversarial Networks (GANs) and Differential Privacy (DP) safeguards. Our method not only protects patient information but also enhances data availability for research. To ensure its usefulness, we test synthetic data from multiple GANs and employ different data enhancement strategies on an actual stress detection task. Our GAN-based augmentation methods demonstrate significant improvements in model performance, with private DP training scenarios observing an 11.90-15.48% increase in F1-score, while non-private training scenarios still see a 0.45% boost. These results underline the potential of differentially private synthetic data in optimizing utility-privacy trade-offs, especially with the limited availability of real training samples. Through rigorous quality assessments, we confirm the integrity and plausibility of our synthetic data, which, however, are significantly impacted when increasing privacy requirements.

What problem does this paper attempt to address?

The paper addresses how to generate synthetic health sensor data for wearable stress detection while protecting privacy. Specifically, the researchers face the following challenges:

1. **Sensitivity of medical data**: Health sensor data collected by wearable devices such as smartwatches usually contain sensitive personal information. Using these data directly for research risks privacy leakage.
2. **Difficulty of data acquisition**: Acquiring high-quality real medical data is costly and resource-intensive, which limits its use in research.
3. **Trade-off between privacy protection and data utility**: The quality of the data used to train machine learning models must be maintained or improved while user privacy is preserved.

To address these problems, the paper proposes a method based on Generative Adversarial Networks (GANs) and Differential Privacy (DP) for generating synthetic multi-modal time-series data. The method protects patient information and increases the amount of data available for research, improving the trade-off between privacy protection and data utility.

### Main contributions

1. **Generate synthetic multi-modal time-series data**: Trained GAN models generate synthetic data that resemble real smartwatch health sensor data. Each data point represents a stressed or non-stressed moment and carries a corresponding label.
2. **Ensure the authenticity and privacy of data**: The generated synthetic data stay close to the original distribution and can expand or replace the limited existing data set while providing privacy guarantees.
3. **Improve the performance of privacy-preserving models**: Stress detection models trained with the synthetic data perform better under privacy-preserving conditions. Private DP training scenarios see an 11.90-15.48% increase in F1-score, and non-private training scenarios still see a 0.45% increase.
4. **Promote practical applications**: The method makes smartwatch-based stress detection possible while protecting user privacy, so that the generated health data can be shared more widely and support further research.

### Key technologies

- **Generative Adversarial Networks (GANs)**: Used to capture the statistical distribution of a given data set and generate new synthetic data samples.
- **Differential Privacy (DP)**: Protects the privacy of individual data points by introducing calibrated noise, so that adding or removing a single data point does not significantly change the statistical results.
- **Differentially Private Stochastic Gradient Descent (DP-SGD)**: A modified stochastic gradient descent method that achieves differential privacy by clipping per-example gradients and adding noise to them before each update.

With these technologies, the paper generates synthetic health sensor data under privacy protection and demonstrates their usefulness on a stress detection task, while noting that data quality is significantly affected as privacy requirements increase.
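The DP-SGD idea summarized above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a plain logistic regression model instead of a GAN, and the function names (`clip_gradients`, `dp_sgd_step`) and the parameters `clip_norm` and `noise_multiplier` are hypothetical names chosen for illustration. It shows only the two core steps, per-example gradient clipping and Gaussian noise addition, and does not track the resulting privacy budget.

```python
import numpy as np

def clip_gradients(per_example_grads, clip_norm):
    """Scale each row so its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return per_example_grads * scale

def dp_sgd_step(w, X, y, lr=0.1, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One DP-SGD-style update for logistic regression (illustrative only)."""
    rng = np.random.default_rng() if rng is None else rng
    # Per-example gradients of the logistic loss, shape (batch, features).
    preds = 1.0 / (1.0 + np.exp(-(X @ w)))
    per_example_grads = (preds - y)[:, None] * X
    # Bound each example's influence on the update.
    clipped = clip_gradients(per_example_grads, clip_norm)
    # Add Gaussian noise scaled to the clipping bound, then average.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_grad = (clipped.sum(axis=0) + noise) / len(X)
    return w - lr * noisy_grad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(32, 4))          # toy feature batch (made-up data)
    y = (X[:, 0] > 0).astype(float)       # toy binary "stress" labels
    w = np.zeros(4)
    for _ in range(100):
        w = dp_sgd_step(w, X, y, rng=rng)
    print("weights after noisy training:", w)
```

A larger `noise_multiplier` gives stronger privacy but noisier updates, which reflects the utility-privacy trade-off the summary describes. In practice, a library that also performs privacy accounting would typically be used instead of a hand-written step like this one.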

Generating Synthetic Health Sensor Data for Privacy-Preserving Wearable Stress Detection

Generating Synthetic Mixed-Type Longitudinal Electronic Health Records for Artificial Intelligent Applications

Generating synthetic personal health data using conditional generative adversarial networks combining with differential privacy

A Conditional GAN for Generating Time Series Data for Stress Detection in Wearable Physiological Sensor Data

Protect and Extend -- Using GANs for Synthetic Data Generation of Time-Series Medical Records

GANs for Enhancing Wearable Biosensor Data Accuracy

Privacy Risk Assessment for Synthetic Longitudinal Health Data

Evaluating Differentially Private Synthetic Data Generation in High-Stakes Domains

Towards Generating Realistic Wrist Pulse Signals Using Enhanced One Dimensional Wasserstein GAN

Privacy-Preserving Synthetic Smart Meters Data

Patient-centric synthetic data generation, no reason to risk re-identification in biomedical data analysis

Generative models for wearables data

Anonymization Through Data Synthesis Using Generative Adversarial Networks (ADS-GAN)

On the Trade-Off between Fidelity, Utility and Privacy of Synthetic Patient Data

EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing

Data Augmentation of Wrist Pulse Signal for Traditional Chinese Medicine Using Wasserstein GAN

An Explainable Deep Learning Approach for Stress Detection in Wearable Sensor Measurements

Generating high-fidelity synthetic patient data for assessing machine learning healthcare software

Synthesis and quality assessment of combined time-series and static medical data using a real-world time-series generative adversarial network

Privacy-hardened and hallucination-resistant synthetic data generation with logic-solvers