Abstract: Conditional random fields (CRFs) have been shown to be one of the most successful approaches to sequence labeling. Various linear-chain neural CRFs (NCRFs) have been developed to implement non-linear node potentials in CRFs while still keeping the linear-chain hidden structure. In this paper, we propose NCRF transducers, which consist of two RNNs, one extracting features from observations and the other capturing (theoretically infinite) long-range dependencies between labels. Different sequence labeling methods are evaluated on POS tagging, chunking and NER (English, Dutch). Experimental results show that NCRF transducers achieve consistent improvements over linear-chain NCRFs and RNN transducers across all four tasks, and can improve state-of-the-art results.

What problem does this paper attempt to address?

The main problem this paper addresses is that linear-chain conditional random fields (CRFs) and their neural extensions (NCRFs) capture only first-order dependencies between labels in sequence-labeling tasks and ignore potential long-range dependencies. This limitation may hurt performance in practical applications.

To address it, the authors propose a new model, **Neural CRF transducers (NCRF transducers)**, which combines two recurrent neural networks (RNNs):

1. **Feature-extraction RNN**: extracts features from the input sequence.
2. **Prediction RNN**: captures long-range dependencies between labels.

In this way, the NCRF transducer can model long-range dependencies between labels while remaining globally normalized, which improves performance on multiple sequence-labeling tasks.

### Main contributions

- **Long-range dependency modeling**: Unlike linear-chain NCRFs, the NCRF transducer can capture long-range dependencies between labels and can, in theory, model dependencies of unbounded length.
- **Global normalization**: Unlike locally normalized RNN transducers, the NCRF transducer is globally normalized, which avoids the label-bias and exposure-bias problems.
- **Experimental verification**: Experiments on part-of-speech (POS) tagging, chunking, and named entity recognition (NER) in English and Dutch show that the NCRF transducer yields consistent improvements on these tasks and achieves state-of-the-art results.

### Experimental results

The NCRF transducer outperforms linear-chain NCRFs and RNN transducers on all four tasks, with a particularly significant improvement on named entity recognition. For example, on the CoNLL-2003 English NER task, it achieves an F1 score of 92.36, exceeding the previous best result.

Overall, the paper aims to improve existing sequence-labeling methods by combining long-range dependency modeling with global normalization.
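The two-RNN design and global normalization described above can be made concrete with a toy sketch. This is not the authors' implementation: the class name `ToyNCRFTransducer`, the unidirectional Elman-style RNN cells, the random weights, and the way the two hidden states are concatenated and scored are all assumptions chosen for brevity. The partition function is computed by brute-force enumeration, which only works for tiny inputs and is used here purely to show what "globally normalized" means; it is not presented as the paper's training procedure.

```python
import itertools
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rnn_step(w_h, w_x, h, x):
    """One Elman-style step: h' = tanh(W_h h + W_x x)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_h, h), matvec(w_x, x))]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def one_hot(i, n):
    return [1.0 if j == i else 0.0 for j in range(n)]


class ToyNCRFTransducer:
    """Hypothetical toy model: one RNN over observations, one over label history."""

    def __init__(self, in_dim, hid, n_labels, seed=0):
        rng = random.Random(seed)
        self.hid = hid
        self.n_labels = n_labels
        # Feature-extraction RNN (reads observations).
        self.fw_h = rand_matrix(rng, hid, hid)
        self.fw_x = rand_matrix(rng, hid, in_dim)
        # Prediction RNN (reads previous labels; index n_labels is a start symbol).
        self.lw_h = rand_matrix(rng, hid, hid)
        self.lw_x = rand_matrix(rng, hid, n_labels + 1)
        # Output layer scoring each label from the concatenated states (an assumption).
        self.out = rand_matrix(rng, n_labels, 2 * hid)

    def path_score(self, xs, ys):
        """Unnormalized log-potential of the whole label sequence ys given xs."""
        h = [0.0] * self.hid
        g = [0.0] * self.hid
        prev = self.n_labels  # start symbol
        total = 0.0
        for x, y in zip(xs, ys):
            h = rnn_step(self.fw_h, self.fw_x, h, x)
            # g summarizes all labels before this position, not just the last one.
            g = rnn_step(self.lw_h, self.lw_x, g, one_hot(prev, self.n_labels + 1))
            total += matvec(self.out, h + g)[y]
            prev = y
        return total

    def log_partition(self, xs):
        """Brute-force log Z over every label sequence (tiny inputs only)."""
        scores = [
            self.path_score(xs, ys)
            for ys in itertools.product(range(self.n_labels), repeat=len(xs))
        ]
        m = max(scores)
        return m + math.log(sum(math.exp(s - m) for s in scores))

    def log_prob(self, xs, ys):
        """Globally normalized log-probability: path score minus log Z."""
        return self.path_score(xs, ys) - self.log_partition(xs)


model = ToyNCRFTransducer(in_dim=2, hid=3, n_labels=3, seed=1)
xs = [[0.5, -1.0], [1.0, 0.2], [-0.3, 0.8]]
print(model.log_prob(xs, (0, 1, 2)))
```

Because the prediction RNN's state depends on the entire label prefix, the score of a label at one position can be influenced by labels arbitrarily far back, which a first-order linear-chain potential cannot express. The price is that the exact partition function no longer factorizes over adjacent label pairs, which is why the sketch falls back on enumeration.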

Neural CRF transducers for sequence labeling

NCRF++: an Open-source Neural Sequence Labeling Toolkit.

Citation Metadata Extraction Via Deep Neural Network-based Segment Sequence Labeling

Hybrid Semi-Markov CRF for Neural Sequence Labeling.

Embedded-State Latent Conditional Random Fields for Sequence Labeling

Upgrading CRFS to JRFS and Its Benefits to Sequence Modeling and Labeling.

Neural Latent Dependency Model for Sequence Labeling

Sequence Classification with Neural Conditional Random Fields

Sparse Higher Order Conditional Random Fields for Improved Sequence Labeling.

Label Attention Network for Structured Prediction

Gradual Transition Detection with Conditional Random Fields.

Sequence Transduction with Graph-based Supervision

Chinese Named Entity Recognition with the Improved Smoothed Conditional Random Fields

Analyzing Sequence Data Based on Conditional Random Fields with Co-training

Regular-pattern-sensitive CRFs for Distant Label Interactions

Bidirectional LSTM-CRF Models for Sequence Tagging

Recognizing Biomedical Named Entities Using Skip-Chain Conditional Random Fields.

A CRFS-based joint labeling approach for cascaded tasks

ASRNN: A recurrent neural network with an attention model for sequence labeling

A New Recurrent Neural CRF for Learning Non-linear Edge Features

Automatic Indexing Model Based on Conditional Random Fields