appjsonify: An Academic Paper PDF-to-JSON Conversion Toolkit

Atsuki Yamaguchi, Terufumi Morishita
2023-10-03
Abstract: We present appjsonify, a Python-based PDF-to-JSON conversion toolkit for academic papers. It parses a PDF file using several visual-based document layout analysis models and rule-based text processing approaches. appjsonify is a flexible tool that allows users to easily configure the processing pipeline to handle the specific format of the paper they wish to process. We are publicly releasing appjsonify as an easy-to-install toolkit available via PyPI and GitHub.
Computation and Language, Artificial Intelligence