Multi-objective Feature Selection in Remote Health Monitoring Applications

Le Ngu Nguyen, Constantino Álvarez Casado, Manuel Lage Cañellas, Anirban Mukherjee, Nhi Nguyen, Dinesh Babu Jayagopi, Miguel Bordallo López
2024-01-11
Abstract: Radio frequency (RF) signals have facilitated the development of non-contact human monitoring tasks, such as vital signs measurement, activity recognition, and user identification. In some specific scenarios, an RF signal analysis framework may prioritize the performance of one task over that of others. In response to this requirement, we employ a multi-objective optimization approach inspired by biological principles to select discriminative features that enhance the accuracy of breathing pattern recognition while simultaneously impeding the identification of individual users. This approach is validated using a novel vital signs dataset consisting of 50 subjects engaged in four distinct breathing patterns. Our findings indicate a substantial divergence in accuracy between breathing recognition and user identification. As a complementary viewpoint, we present the converse result, in which feature selection maximizes user identification accuracy and minimizes the system's capacity for breathing activity recognition.
Machine Learning
What problem does this paper attempt to address?
The paper asks how multi-objective feature selection can, in remote health monitoring applications, improve the accuracy of respiratory pattern recognition while reducing the accuracy of user identification, thereby protecting user privacy. Specifically, it addresses the following problems:

1. **Multi-task conflict**: In non-contact health monitoring, some features may be very useful for one task (such as activity recognition) while inadvertently helping another, sensitive task (such as user identification), which may violate user privacy.
2. **Multi-objective optimization of feature selection**: Traditional feature selection methods usually focus on a single objective (such as maximizing classification accuracy) and cannot handle multiple conflicting objectives. A method is therefore needed that optimizes several objectives simultaneously, for example improving the accuracy of respiratory pattern recognition while reducing the accuracy of user identification.
3. **Balancing convenience and privacy**: In practical applications, non-contact health monitoring offers great convenience but also raises potential privacy problems. How to protect user privacy effectively without sacrificing that convenience is an important research topic.

### Main contributions of the paper

1. **Multi-objective optimization framework**: The paper proposes a bio-inspired multi-objective genetic algorithm for feature selection. The method improves the accuracy of respiratory pattern recognition while reducing the accuracy of user identification, thereby achieving privacy protection.
2. **Validation with a new dataset**: The paper validates the proposed method on a new dataset containing 50 subjects and four different respiratory patterns, showing significant performance differences between the two tasks.
3. **Parameter exploration and model optimization**: The paper explores key parameters of the genetic algorithm (such as population size and number of iterations) and proposes using an efficient classification model to speed up the evaluation of the fitness function, making the feature selection process more efficient.
4. **Application flexibility**: The proposed multi-objective feature selection method can be customized according to user needs. For example, the system can be configured to provide high-accuracy respiratory activity recognition without revealing the user's identity, or to focus on detecting the presence of a specific user without exposing their activities.

### Formula representation

The formulas involved in the paper are as follows:

- **Accuracy of respiratory pattern recognition**:
  \[ O_1 = a_R = \frac{1}{K}\sum_{i=1}^{K}\delta(y_i,\hat{y}_i) \]
  where \(K\) is the total number of samples, \(y_i\) is the true label of sample \(i\), \(\hat{y}_i\) is the predicted label, and \(\delta(y_i,\hat{y}_i)\) is 1 if the prediction is correct and 0 otherwise.
- **Error rate of user identification**:
  \[ O_2 = 1 - a_I \]
  where \(a_I\) is the accuracy of the user identification model.
- **Performance difference between the two models**:
  \[ O_3 = a_R - a_I \]
- **Definition of the multi-objective optimization problem**:
  \[ \hat{s} = \arg\max_{s\subset F}(O_1, O_2, O_3) \]
  where \(F\) is the feature set and \(s\) is the selected feature subset.

With these definitions, the paper formulates a multi-objective optimization problem and solves it with the NSGA-II algorithm to find an optimal feature subset.
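
The objectives defined above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names (`accuracy`, `objectives`, `dominates`, `pareto_front`) and the candidate accuracies are hypothetical, and the simple non-dominated filter stands in for only the dominance comparison that NSGA-II relies on, not the full genetic algorithm.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, float, float]


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    """Fraction of correct predictions: (1/K) * sum_i delta(y_i, y_hat_i)."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("label sequences must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def objectives(a_r: float, a_i: float) -> Objectives:
    """Return (O1, O2, O3) = (a_R, 1 - a_I, a_R - a_I); all are maximized."""
    return (a_r, 1.0 - a_i, a_r - a_i)


def dominates(p: Objectives, q: Objectives) -> bool:
    """p dominates q if it is no worse in every objective and better in one."""
    no_worse = all(a >= b for a, b in zip(p, q))
    strictly_better = any(a > b for a, b in zip(p, q))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[int]:
    """Indices of the points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical (a_R, a_I) pairs for four candidate feature subsets A, B, C, D.
# These numbers are made up for illustration; they are not results from the paper.
candidates = [(0.90, 0.60), (0.85, 0.20), (0.80, 0.50), (0.92, 0.70)]
scores = [objectives(a_r, a_i) for a_r, a_i in candidates]
front = pareto_front(scores)
print(front)
```

In this made-up example, candidate B is at least as good as candidate C in all three objectives, so C is dropped and the remaining candidates form the non-dominated set. In a full pipeline, each candidate's \(a_R\) and \(a_I\) would come from training the two classifiers on the selected feature subset \(s\).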