Abstract: Conditional Random Field (CRF) based neural models are among the most performant methods for solving sequence labeling problems. Despite its great success, CRF has the shortcoming of occasionally generating illegal sequences of tags, e.g. sequences containing an "I-" tag immediately after an "O" tag, which is forbidden by the underlying BIO tagging scheme. In this work, we propose Masked Conditional Random Field (MCRF), an easy-to-implement variant of CRF that imposes restrictions on candidate paths during both training and decoding phases. We show that the proposed method thoroughly resolves this issue and brings consistent improvement over existing CRF-based models with near zero additional cost.

What problem does this paper attempt to address?

The problem this paper addresses is that Conditional Random Field (CRF) models occasionally generate illegal label sequences, i.e. sequences that violate the rules of a tagging scheme such as BIO or BIOES. For example, under the BIO scheme an "O" label may not be directly followed by an "I-" label. Such illegal paths cause prediction errors and lower the overall performance of the model.

### Specific manifestations of the problem

1. **Definition of illegal paths**:
   - In the BIO encoding scheme, certain label transitions are prohibited, such as "O" followed by "I-LOC".
   - In the BIOES encoding scheme, the label transition rules are more stringent. For example, an "I-" label must be preceded by a "B-" or "I-" label of the same type, and the entity must be closed by an "E-" label.
2. **Shortcomings of existing methods**:
   - Existing methods usually rely on manually designed post-processing steps to repair illegal paths, such as relabeling the illegal fragments.
   - Such repairs are ad hoc and can lead to suboptimal performance.

### Solution

To solve these problems, the paper proposes the Masked Conditional Random Field (MCRF). MCRF avoids generating illegal paths at the source by restricting the candidate paths during both the training and decoding stages.

### Main improvement points of MCRF

1. **Training stage**:
   - The loss function is modified so that the normalization term sums over legal paths only, which removes the influence of illegal paths.
   - The new loss function is:
     \[ L'(W, A) := -\frac{1}{|S|} \sum_{(x,y) \in S} \log \frac{\exp(s(y,x))}{\sum_{p \in P/I} \exp(s(p,x))} \]
   - Here \(P/I\) denotes the space of all legal paths, i.e. the set of all paths \(P\) with the illegal paths \(I\) removed.
2. **Decoding stage**:
   - Decoding searches for the optimal path within the legal path space only.
   - The optimal path is:
     \[ y'_{\text{opt}} = \arg\max_{p \in P/I} s(p, x_{\text{test}}, W'_{\text{opt}}, A'_{\text{opt}}) \]
3. **Implementation details**:
   - A mask matrix \(\bar{A}(c)\) masks the illegal transitions, where \(c \ll 0\) is a large negative constant.
   - After each parameter update, the weights of illegal transitions are reset to \(c\).

### Experimental results

The paper verifies the effectiveness of MCRF through experiments on multiple datasets:

- **Chinese Named Entity Recognition (NER)**: MCRF achieves new best results on multiple Chinese NER datasets.
- **Slot-filling task**: MCRF significantly outperforms the baseline model on the ATIS and SNIPS datasets.
- **Chunking task**: MCRF also performs well on the CoNLL2000 chunking task.

In conclusion, MCRF eliminates the illegal path problem and improves model performance by masking illegal paths during both the training and decoding stages.
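The masking idea summarized above can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation: the helper names (`is_legal_bio`, `mask_transitions`, `viterbi`), the toy scores, and the value `c = -1e4` are all assumptions chosen for illustration. It overwrites illegal BIO transitions with a large negative constant and then runs ordinary Viterbi decoding on the masked matrix.

```python
def is_legal_bio(prev, curr):
    """True if tag `curr` may directly follow tag `prev` under the BIO scheme."""
    if curr.startswith("I-"):
        # I-X may only continue an entity of the same type X.
        return prev in ("B-" + curr[2:], "I-" + curr[2:])
    return True  # "O" and "B-X" may follow any tag


def mask_transitions(tags, A, c=-1e4):
    """Return a copy of transition matrix A with illegal entries set to c.

    A[i][j] is the score of moving from tags[i] to tags[j];
    c is a large negative constant (the value here is an assumption).
    """
    n = len(tags)
    return [[A[i][j] if is_legal_bio(tags[i], tags[j]) else c
             for j in range(n)] for i in range(n)]


def viterbi(tags, emissions, A, c=-1e4):
    """Return the highest-scoring tag path; emissions[t][j] scores tag j at step t."""
    n = len(tags)
    # Assumption of this sketch: an "I-" tag cannot open a sequence under BIO,
    # so it is penalized at the first position as well.
    score = [emissions[0][j] + (c if tags[j].startswith("I-") else 0.0)
             for j in range(n)]
    back = []  # back[t-1][j] = best predecessor of tag j at step t
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + A[i][j])
            new_score.append(score[best_i] + A[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    j = max(range(n), key=lambda k: score[k])
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    return [tags[k] for k in reversed(path)]


if __name__ == "__main__":
    tags = ["O", "B-LOC", "I-LOC"]
    A = [[0.0] * 3 for _ in range(3)]          # toy transition scores
    emissions = [[2.0, 0.0, 0.0],              # toy per-token tag scores
                 [0.0, 0.5, 2.0],
                 [0.0, 0.0, 2.0]]
    print("unmasked:", viterbi(tags, emissions, A))
    print("masked:  ", viterbi(tags, emissions, mask_transitions(tags, A)))
```

With these toy scores, the unmasked transition matrix yields the illegal path `O, I-LOC, I-LOC`, while the masked matrix yields the legal path `O, B-LOC, I-LOC`. In a trainable model the same masked matrix would presumably also feed the forward algorithm that computes the normalization term, so that training and decoding see the same restricted path space.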

Masked Conditional Random Fields for Sequence Labeling

Sequence Classification with Neural Conditional Random Fields

Embedded-State Latent Conditional Random Fields for Sequence Labeling

Neural CRF transducers for sequence labeling

Automatic Indexing Model Based on Conditional Random Fields

Upgrading CRFS to JRFS and Its Benefits to Sequence Modeling and Labeling.

Hybrid Semi-Markov CRF for Neural Sequence Labeling.

Sparse Higher Order Conditional Random Fields for Improved Sequence Labeling.

Analyzing Sequence Data Based on Conditional Random Fields with Co-training

Citation Metadata Extraction Via Deep Neural Network-based Segment Sequence Labeling

Uncertainty-Aware Sequence Labeling

Uncertainty-Aware Label Refinement for Sequence Labeling

Conditional Random Fields for Image Labeling

Segment-Level Sequence Modeling Using Gated Recursive Semi-Markov Conditional Random Fields

Label Attention Network for Structured Prediction

Bidirectional LSTM-CRF Models for Sequence Tagging

Neural Latent Dependency Model for Sequence Labeling

NCRF++: an Open-source Neural Sequence Labeling Toolkit.

A Chinese Part-of-speech Tagging Approach Using Conditional Random Fields

Gradual Transition Detection with Conditional Random Fields.

Chinese Named Entity Recognition with the Improved Smoothed Conditional Random Fields