Abstract: In this research, we consider the problem of verifying user identity based on keystroke dynamics obtained from free-text. We employ a novel feature engineering method that generates image-like transition matrices. For this image-like feature, a convolutional neural network (CNN) with cutout achieves the best results. A hybrid model consisting of a CNN and a recurrent neural network (RNN) is also shown to outperform previous research in this field.
What problem does this paper attempt to address?
The paper addresses user authentication based on free-text keystroke dynamics. Specifically, it proposes a new feature-engineering technique that converts keyboard-input behavior into an image-like feature matrix, and it applies a convolutional neural network (CNN), as well as a hybrid of a CNN and a recurrent neural network (RNN), to improve authentication accuracy.
### Main Problems of the Paper
1. **Fixed-text vs. Free-text**
- Early research mainly focused on fixed-text keystroke dynamics, where the user types the same content each time. In practical applications, however, free-text input is more common, which makes feature extraction harder.
2. **Challenges in Feature Extraction**
- For free-text input, the number of useful features varies with the input sequence. In addition, the length of the keyboard-event sequence must be chosen to balance processing speed against feature richness.
3. **Improvement of Model Performance**
- Previous methods have limited effectiveness on free-text keystroke dynamics, so new methods need to be explored to improve model performance.
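The sequence-length trade-off mentioned above can be made concrete with a small sketch. It assumes a free-text session is cut into fixed-length, possibly overlapping subsequences before feature extraction; the event layout `(key, press_ms, release_ms)`, the function name `split_into_windows`, and the `length` and `stride` parameters are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# One keystroke event: (key, press_ms, release_ms). This layout is an assumption.
Event = Tuple[str, int, int]


def split_into_windows(events: List[Event], length: int, stride: int) -> List[List[Event]]:
    """Cut an event stream into fixed-length, possibly overlapping subsequences.

    Trailing events that do not fill a complete window are dropped.
    """
    if length <= 0 or stride <= 0:
        raise ValueError("length and stride must be positive")
    last_start = len(events) - length
    return [events[i:i + length] for i in range(0, last_start + 1, stride)]


# Ten synthetic events, 100 ms apart, each held for 60 ms.
events = [(chr(ord("a") + i), 100 * i, 100 * i + 60) for i in range(10)]
windows = split_into_windows(events, length=4, stride=2)
```

A longer `length` gives each sample more key transitions to draw on but delays the authentication decision, and a smaller `stride` yields more training samples from the same session at the cost of overlap between them.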
### Solutions
To address the above challenges, the paper proposes the following solutions:
1. **New Feature-engineering Technique**
- Organize the timing features of keyboard events into a multi-channel, image-like transition matrix. Each channel represents one timing feature (such as key-press duration or key-press interval), and the values of repeatedly occurring key pairs are averaged to reduce sparsity.
2. **Application of Deep-learning Model**
- Use a convolutional neural network (CNN) to process the image-like feature matrix, and introduce the cutout regularization technique to prevent overfitting.
- Combine a CNN with a gated recurrent unit (GRU), a type of RNN, in a hybrid model to better capture temporal dependencies in the sequence data.
3. **Experiment and Evaluation**
- Conduct extensive experiments on two publicly available free-text keystroke-dynamics datasets (Buffalo and Clarkson II), evaluate the performance of different models, and analyze the influence of various hyperparameters.
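The feature-engineering and cutout steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the event layout `(key, press_ms, release_ms)`, the 27-symbol key set `KEYS`, the choice of two channels (hold time of the first key, press-to-press interval), and the function names are all hypothetical. Plain Python lists are used so the sketch runs without dependencies. The cutout variant shown keeps the masked square fully inside the matrix, which is a simplification.

```python
import random
from collections import defaultdict
from typing import List, Tuple

Event = Tuple[str, int, int]      # (key, press_ms, release_ms); assumed layout
Matrix = List[List[List[float]]]  # [channel][previous key][next key]

KEYS = "abcdefghijklmnopqrstuvwxyz "  # assumed key set; the paper's may differ
INDEX = {k: i for i, k in enumerate(KEYS)}


def transition_matrix(events: List[Event]) -> Matrix:
    """Build a 2-channel matrix indexed by (previous key, next key).

    Channel 0 holds the mean hold time of the first key, channel 1 the mean
    press-to-press interval. Repeated key pairs are averaged into one cell.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for (k1, p1, r1), (k2, p2, _r2) in zip(events, events[1:]):
        if k1 not in INDEX or k2 not in INDEX:
            continue  # skip keys outside the assumed key set
        pair = (INDEX[k1], INDEX[k2])
        sums[pair][0] += r1 - p1  # hold time of the first key
        sums[pair][1] += p2 - p1  # press-to-press interval
        counts[pair] += 1
    n = len(KEYS)
    matrix = [[[0.0] * n for _ in range(n)] for _ in range(2)]
    for (i, j), seen in counts.items():
        for c in range(2):
            matrix[c][i][j] = sums[(i, j)][c] / seen
    return matrix


def cutout(matrix: Matrix, size: int, rng: random.Random) -> Matrix:
    """Return a copy with one random size x size square zeroed in every channel."""
    n = len(matrix[0])
    if not 0 < size <= n:
        raise ValueError("size must be between 1 and the matrix width")
    top = rng.randrange(n - size + 1)
    left = rng.randrange(n - size + 1)
    out = [[row[:] for row in channel] for channel in matrix]
    for channel in out:
        for i in range(top, top + size):
            for j in range(left, left + size):
                channel[i][j] = 0.0
    return out


# The pair ("t", "h") occurs twice, so its two observations share one cell.
events = [("t", 0, 80), ("h", 120, 190), ("e", 250, 310), ("t", 400, 500), ("h", 560, 640)]
matrix = transition_matrix(events)
masked = cutout(matrix, size=5, rng=random.Random(0))
```

In this example the `("t", "h")` cell averages hold times of 80 ms and 100 ms and intervals of 120 ms and 160 ms. During training, `cutout` would typically be applied with a fresh random position for each sample so the network cannot rely on any single region of the matrix.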
### Main Contributions
- Propose a new feature-engineering method that organizes keystroke-dynamics features into an image-like matrix.
- Analyze the use of cutout regularization, a technique from image processing, on these image-like features.
- Systematically study the influence of different hyperparameters (such as the length of the keyboard-event sequence) on model performance.
Through these methods, the paper reports better authentication performance on free-text keystroke dynamics than previous research, providing a useful reference for future related work.