Automating Manga Character Analysis: A Robust Deep Vision-Transformer Approach to Facial Landmark Detection
Sirawich Vachmanus, Noppanan Phinklao, Naruparn Phongsarnariyakul, Thanat Plongcharoen, Seiji Hotta, Suppawong Tuarob
DOI: https://doi.org/10.1109/access.2024.3459419
IF: 3.9
2024-09-27
IEEE Access
Abstract: Comics, particularly Japanese manga, are a powerful medium that blends images and text to convey ideas and encapsulate a unique cultural heritage. Going beyond mere entertainment, manga merges diverse styles and content deeply rooted in Japanese culture. Acknowledging the growing significance of technology in analyzing manga images, this study applies computer vision analysis with a specific focus on facial landmark detection. Through a comprehensive exploration of various methods, the research identifies BERT Pre-Training of Image Transformers (BEiT), an extension of Bidirectional Encoder Representations from Transformers (BERT), as a standout performer due to its efficiency and effectiveness. The BEiT model's success lies in its ability to extract facial features, establishing it as a go-to solution for landmark detection on manga faces. The model achieved the lowest Failure Rate among the landmark detection networks compared, approximately 9.4%, with a Mean Average Error of about 4.6 pixels. Beyond its technical accomplishments, this study carries cultural significance, contributing to the ongoing narrative of manga in Japan.
computer science, information systems; telecommunications; engineering, electrical & electronic