Comic MTL: optimized multi-task learning for comic book image analysis

Nhu-Van Nguyen, Christophe Rigaud, Jean-Christophe Burie
DOI: https://doi.org/10.1007/s10032-019-00330-3
2019-07-17
International Journal on Document Analysis and Recognition (IJDAR)
Abstract: Comic book image analysis methods often propose multiple algorithms or models for multiple tasks such as panel and character (body and face) detection, balloon segmentation, and text recognition. In this work, we aim to reduce the processing time of comic book image analysis by proposing a single model, called Comic MTL, that learns multiple tasks instead of using one model per task. In addition to the detection and segmentation tasks, we integrate the relation analysis task for balloons and characters into the Comic MTL model. The experiments are carried out on the DCM772 and eBDtheque public datasets, which contain annotations for panels, balloons, and characters, as well as the associations between balloons and characters. We show that the Comic MTL model can detect the associations between balloons and their speakers (comic characters) and handle other tasks, such as panel and character detection and balloon segmentation, with promising results.