Abstract: With the growth of literature resources, researchers face an increasing problem of information explosion and knowledge overload. To help scholars retrieve literature and acquire knowledge effectively, clarifying the semantic structure of the content in academic literature has become an essential research question. In research on identifying the structure function of chapters in academic articles, only a few studies have used deep learning models or explored the optimization of feature input, which limits the application and optimization potential of deep learning models for this task. This paper takes articles from the ACL conference as the corpus. We employ traditional machine learning models and deep learning models to construct classifiers based on various feature inputs. Experimental results show that (1) compared with the chapter content, the chapter title is more conducive to identifying the structure function of academic articles; (2) relative position is a valuable feature for building traditional models; and (3) inspired by (2), introducing contextual information into the deep learning models achieves significant results. Meanwhile, our models show good transfer ability in an open test containing 200 samples drawn from outside the training data. We also annotated the ACL main conference papers from the most recent five years using the best-performing models and performed a time series analysis of the overall corpus. This work explores and summarizes the practical features and models for this task through multiple comparative experiments and provides a reference for related text classification tasks. Finally, we indicate the limitations and shortcomings of the current model and directions for further optimization.
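The two features highlighted in findings (2) and (3), relative position and contextual information, can be illustrated with a minimal sketch. The abstract does not specify how either feature is computed, so everything below is an assumption made for illustration: the function names `relative_positions` and `contextual_inputs`, the scaling of position to [0, 1], the use of neighbouring chapter titles as context, the window size, and the `[NONE]` padding token are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def relative_positions(n_chapters: int) -> List[float]:
    """Scale each chapter index to [0, 1] (assumed definition, not the paper's)."""
    if n_chapters <= 0:
        return []
    if n_chapters == 1:
        return [0.0]
    return [i / (n_chapters - 1) for i in range(n_chapters)]


def contextual_inputs(
    titles: List[str], window: int = 1, pad: str = "[NONE]"
) -> List[Dict[str, object]]:
    """Pair each chapter title with its relative position and neighbouring titles.

    `window` neighbours are taken on each side; positions that fall outside
    the article are filled with the hypothetical padding token `pad`.
    """
    positions = relative_positions(len(titles))
    rows = []
    for i, title in enumerate(titles):
        before = [titles[j] if j >= 0 else pad for j in range(i - window, i)]
        after = [
            titles[j] if j < len(titles) else pad
            for j in range(i + 1, i + 1 + window)
        ]
        rows.append(
            {
                "title": title,
                "relative_position": positions[i],
                "context": before + after,
            }
        )
    return rows


if __name__ == "__main__":
    # Illustrative chapter titles only; not drawn from the ACL corpus.
    example = ["Introduction", "Related Work", "Method", "Experiments", "Conclusion"]
    for row in contextual_inputs(example):
        print(row)
```

Under these assumptions, each row could be fed to a classifier either as numeric and text features (for a traditional model) or as a concatenated text sequence (for a deep learning model); which encoding the authors actually used is not stated in the abstract.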

Academic conference homepage understanding using constrained hierarchical conditional random fields.

Joint CRF and Locality-Consistent Dictionary Learning for Semantic Segmentation.

Extracting Academic Information from Conference Web Pages

Enhancing Semantic Web by Semantic Annotation: Experiences in Building an Automatic Conference Calendar.

Tree-structured conditional random fields for semantic annotation

Reliable Academic Conference Question Answering: A Study Based on Large Language Model

2D Conditional Random Fields for Web Information Extraction

Dynamic Hierarchical Markov Random Fields and Their Application to Web Data Extraction

Chunk Parsing and Entity Relation Extracting to Chinese Text by Using Conditional Random Fields Model

Simultaneous record detection and attribute labeling in web data extraction.

Closing the Loop in Webpage Understanding

A Hierarchical Multi-Task Learning Framework for Semantic Annotation in Tabular Data

Social Network Extraction of Academic Researchers

Semi-supervised learning for image annotation based on conditional random fields

Automatic Identification of Concurrent Structure Based on Conditional Random Field

Interactive Assigning of Conference Sessions with Visualization and Topic Modeling

Dynamic Hierarchical Markov Random Fields for Integrated Web Data Extraction

Automatic Image Annotation Based on Wordnet and Hierarchical Ensembles

Discovering Visual Concept Structure with Sparse and Incomplete Tags

Enhancing Identification of Structure Function of Academic Articles Using Contextual Information

Hierarchical Multi-label Text Classification: Self-adaption Semantic Awareness Network Integrating Text Topic and Label Level Information