HWD: A Novel Evaluation Score for Styled Handwritten Text Generation

Vittorio Pippi, Fabio Quattrini, Silvia Cascianelli, Rita Cucchiara
2023-10-31
Abstract: Styled Handwritten Text Generation (Styled HTG) is an important task in document analysis, aiming to generate text images with the handwriting of given reference images. In recent years, there has been significant progress in the development of deep learning models for tackling this task. Being able to measure the performance of HTG models via a meaningful and representative criterion is key for fostering the development of this research topic. However, despite the current adoption of scores for natural image generation evaluation, assessing the quality of generated handwriting remains challenging. In light of this, we devise the Handwriting Distance (HWD), tailored for HTG evaluation. In particular, it works in the feature space of a network specifically trained to extract handwriting style features from the variable-length input images and exploits a perceptual distance to compare the subtle geometric features of handwriting. Through extensive experimental evaluation on different word-level and line-level datasets of handwritten text images, we demonstrate the suitability of the proposed HWD as a score for Styled HTG. The pretrained model used as backbone will be released to ease the adoption of the score, aiming to provide a valuable tool for evaluating HTG models and thus contributing to advancing this important research area.
Computer Vision and Pattern Recognition, Digital Libraries
What problem does this paper attempt to address?
The paper addresses the problem of evaluating Styled Handwritten Text Generation (Styled HTG). Existing scores borrowed from natural image generation, such as the Fréchet Inception Distance (FID), are limited when assessing generated handwriting because they mainly capture the overall appearance of an image rather than the specific characteristics of a handwriting style. To address this, the authors propose a new evaluation score, the Handwriting Distance (HWD). Its main features are:
1. **Domain-Specific Feature Extraction**: Features are extracted with a convolutional network pre-trained on a synthetic dataset of handwritten text images, rather than on a general natural image dataset such as ImageNet.
2. **Perceptual Distance**: A Euclidean distance in feature space measures the perceptual difference between generated and real handwritten images, rather than a distribution-based distance.
3. **Handling Variable-Length Images**: Text images of different lengths are processed in full, which avoids the information loss caused by evaluating only a portion of each image.
4. **Numerical Stability**: The score remains numerically stable even when only a limited number of samples is available.
In experiments on multiple word-level and line-level datasets, HWD captures subtle differences between handwriting styles better than existing scores such as FID. It is also more stable and consistent across datasets of different sizes, which makes it a valuable tool for evaluating handwritten text generation models.
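The Euclidean-distance idea described above can be illustrated with a minimal sketch. It is not the authors' released implementation: it assumes each image has already been turned into a feature map of shape (channels, width) by some backbone, that variable widths are handled by averaging over the width axis, and that the score is the Euclidean distance between the mean feature vectors of the real and generated sets. The names `pool_over_width` and `handwriting_distance_sketch` are hypothetical.

```python
import numpy as np


def pool_over_width(feature_map):
    """Collapse a (channels, width) feature map into a (channels,) vector.

    Assumption: averaging over the width axis is used here only to show how
    images of different lengths can map to vectors of the same size.
    """
    fmap = np.asarray(feature_map, dtype=np.float64)
    if fmap.ndim != 2:
        raise ValueError("expected a 2-D (channels, width) feature map")
    return fmap.mean(axis=1)


def handwriting_distance_sketch(real_feats, fake_feats):
    """Euclidean distance between the mean feature vectors of two sets.

    Both inputs are (n_samples, n_features) arrays; the sample counts may
    differ, but the feature dimension must match.
    """
    real = np.asarray(real_feats, dtype=np.float64)
    fake = np.asarray(fake_feats, dtype=np.float64)
    if real.ndim != 2 or fake.ndim != 2 or real.shape[1] != fake.shape[1]:
        raise ValueError("inputs must be 2-D with the same feature dimension")
    # Compare the centroids of the two feature sets.
    return float(np.linalg.norm(real.mean(axis=0) - fake.mean(axis=0)))


# Toy usage with hand-made 2-D "features" (placeholders, not real data).
real = [[0.0, 0.0], [2.0, 0.0]]   # centroid (1, 0)
fake = [[1.0, 3.0], [1.0, 5.0]]   # centroid (1, 4)
score = handwriting_distance_sketch(real, fake)   # 4.0
```

Because this sketch only compares centroids, it needs no covariance estimate, which is one plausible reading of why a score of this kind stays stable when few samples are available.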