Historical Printed Ornaments: Dataset and Tasks

Sayan Kumar Chaki, Zeynep Sonat Baltaci, Elliot Vincent, Remi Emonet, Fabienne Vial-Bonacci, Christelle Bahier-Porte, Mathieu Aubry, Thierry Fournel
2024-08-16
Abstract: This paper aims to develop the study of historical printed ornaments with modern unsupervised computer vision. We highlight three complex tasks that are of critical interest to book historians: clustering, element discovery, and unsupervised change localization. For each of these tasks, we introduce an evaluation benchmark, and we adapt and evaluate state-of-the-art models. Our Rey's Ornaments dataset is designed to be a representative example of a set of ornaments historians would be interested in. It focuses on an XVIIIth century bookseller, Marc-Michel Rey, providing a consistent set of ornaments with a wide diversity and representative challenges. Our results highlight the limitations of state-of-the-art models when faced with real data and show simple baselines such as k-means or congealing can outperform more sophisticated approaches on such data. Our dataset and code can be found at https://printed-ornaments.github.io/.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses three key problems in the analysis of historical printed ornaments, all of particular interest to book historians:

1. **Clustering**
   - The goal is to group ornamental images printed from woodcuts into clusters. Historical ornaments often exist in many similar but slightly different versions, so assigning each image to the correct cluster is a challenge.
   - For example, different print runs of the same ornamental pattern may vary due to ink, paper quality, or changes during the printing process.
2. **Element Discovery**
   - The goal is to automatically identify and segment the individual visual elements (vignettes) in composite ornamental images. Composite ornaments are usually assembled from several types of ornamental patterns, and each element may be an independent visual unit.
   - The difficulty is that these elements are often closely arranged or even connected to each other, which makes them hard to separate and identify.
3. **Unsupervised Change Localization**
   - The goal is to detect which pixels have changed across a series of ornamental images. For example, there may be subtle changes, such as ink distribution or wear, between different print versions of the same ornamental pattern.
   - The difficulty is that changes in practice are often subtle and complex, and the algorithm must distinguish normal printing differences from real changes.

To evaluate these tasks, the authors constructed a dataset named "Rey’s Ornaments", built from the ornaments in the books of the 18th-century bookseller Marc-Michel Rey. On this dataset, the authors tested several state-of-the-art models and demonstrated their limitations on real-world data.
For example, simple baseline methods (such as k-means) can sometimes outperform more complex deep-learning methods, which indicates the shortcomings of existing benchmarks and the importance of designing new algorithms.

### Formula Representation

- **Clustering Performance Evaluation**: Normalized Mutual Information (NMI) and accuracy are used to evaluate clustering quality.

  \[ \text{NMI}(X, Y)=\frac{\text{I}(X; Y)}{\sqrt{\text{H}(X)\cdot\text{H}(Y)}} \]

  where \(\text{I}(X; Y)\) is the mutual information, and \(\text{H}(X)\) and \(\text{H}(Y)\) are the entropies of \(X\) and \(Y\) respectively.

- **Element Discovery Evaluation**: Mean Average Precision (mAP) is used to evaluate element discovery.

  \[ \text{mAP}=\frac{1}{N}\sum_{i = 1}^{N}\text{AP}_i \]

  where \(N\) is the number of classes, and \(\text{AP}_i\) is the average precision of the \(i\)-th class.

- **Change Localization Evaluation**: Intersection over Union (IoU) is used to evaluate change localization.

  \[ \text{IoU}=\frac{|A\cap B|}{|A\cup B|} \]

  where \(A\) and \(B\) are the predicted and annotated change regions respectively.

Using these evaluation metrics, the authors demonstrated the limitations of existing methods on real historical ornament data and emphasized the need for further research and improved algorithms.
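The three metrics above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function names, the representation of change regions as sets of `(row, col)` pixel coordinates, and the conventions for degenerate inputs (a single cluster, an empty union) are all choices made here for illustration. The mAP helper assumes the per-class AP values have already been computed elsewhere.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(X) of a label sequence, in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(x, y):
    """Mutual information I(X; Y) between two label sequences of equal length."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # p(a,b) * log(p(a,b) / (p(a) p(b))), written with raw counts
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def nmi(x, y):
    """NMI with geometric-mean normalization: I(X;Y) / sqrt(H(X) * H(Y))."""
    hx, hy = entropy(x), entropy(y)
    if hx == 0.0 or hy == 0.0:
        # Assumed convention: identical trivial partitions score 1, otherwise 0.
        return 1.0 if hx == hy else 0.0
    return mutual_information(x, y) / math.sqrt(hx * hy)


def mean_average_precision(ap_per_class):
    """mAP as the plain mean of precomputed per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


def iou(pred, target):
    """IoU of two regions given as sets of (row, col) pixel coordinates."""
    union = pred | target
    if not union:
        # Assumed convention: two empty regions agree perfectly.
        return 1.0
    return len(pred & target) / len(union)
```

NMI is invariant to a relabeling of the clusters, so `nmi([0, 0, 1, 1], [1, 1, 0, 0])` evaluates to 1 up to floating-point rounding, while two independent partitions such as `[0, 0, 1, 1]` and `[0, 1, 0, 1]` score 0. For IoU, a predicted region sharing two of four pixels in the union with the annotation yields 0.5.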