Automatic Taxonomy Classification by Pretrained Language Model
Ayato Kuwana, Atsushi Oba, Ranto Sawai, Incheon Paik
DOI: https://doi.org/10.3390/electronics10212656
IF: 2.9
2021-10-29
Electronics
Abstract: In recent years, automatic ontology generation has received significant attention in information science as a means of systematizing vast amounts of online data. As our initial attempt at ontology generation with a neural network, we previously proposed a recurrent neural network-based method. However, recent developments in natural language processing (NLP) make it possible to update that architecture. In particular, transfer learning of language models trained on a large, unlabeled corpus has yielded a breakthrough in NLP. Inspired by these achievements, we propose a novel workflow for ontology generation comprising two-stage learning. Our results show that our best method improved accuracy by over 12.5%. As an application example, we applied our model to the Stanford Question Answering Dataset to demonstrate ontology generation on real-world data. The results show that our model can generate a good ontology, with some exceptions on real-world data, indicating directions for future research to improve quality.
engineering, electrical & electronic; computer science, information systems; physics, applied