Abstract: The ability to take into account the characteristics (also called features) of observations is essential in Natural Language Processing (NLP) problems. The Hidden Markov Chain (HMC) model, when used with the classic Forward-Backward probabilities, cannot handle arbitrary features like prefixes or suffixes of any size, except under an independence condition. For twenty years, this shortcoming has encouraged the development of other sequential models, starting with the Maximum Entropy Markov Model (MEMM), which elegantly integrates arbitrary features. More generally, it has led to the neglect of HMC in NLP. In this paper, we show that the problem is not due to HMC itself, but to the way its restoration algorithms are computed. We present a new way of computing HMC-based restorations using original Entropic Forward and Entropic Backward (EFB) probabilities. Our method allows features to be taken into account in the HMC framework in the same way as in the MEMM framework. We illustrate the efficiency of HMC using EFB in Part-Of-Speech Tagging, showing its superiority over MEMM-based restoration. We also specify, as a perspective, how HMCs with EFB might appear as an alternative to Recurrent Neural Networks for treating sequential data with a deep architecture.

What problem does this paper attempt to address?

The problem this paper attempts to solve is: **how to enable the Hidden Markov Chain (HMC) model to handle arbitrary features effectively, so as to achieve better performance in Natural Language Processing (NLP) tasks**.

Specifically, the paper points out that the HMC model, when used with the classic Forward-Backward (FB) algorithm, cannot handle arbitrary features (such as affixes, word length, etc.) unless these features are assumed to be independent of each other. In NLP tasks this independence assumption usually does not hold, so the HMC model performs poorly in text segmentation tasks.

To solve this problem, the authors propose a new way of computing HMC restorations, based on **Entropic Forward-Backward (EFB)** probabilities. The model itself is unchanged; only the way its restoration is computed differs. This method allows the HMC model to use arbitrary features as flexibly as the Maximum Entropy Markov Model (MEMM), without relying on the independence assumption. The paper also shows experimentally that HMC with EFB outperforms MEMM-based restoration in the Part-Of-Speech Tagging (POS Tagging) task, and explores its potential as an alternative to Recurrent Neural Networks (RNN).

### Key Point Summary

1. **Problem Background**:
   - The traditional HMC model, with its classic restoration algorithms (such as Viterbi and FB), performs poorly in NLP tasks because it cannot flexibly handle arbitrary features.
   - This limitation has prompted researchers to develop other sequential models, starting with MEMM, and more generally to neglect HMC for NLP.
2. **Solution**:
   - Propose a new way of computing HMC-based restorations using EFB probabilities, which removes the limitation on feature processing.
   - The new method allows the HMC model to use arbitrary features directly, without the independence assumption.
3. **Experimental Results**:
   - In the POS tagging task, the EFB-based HMC performs better than the MEMM, with a significant improvement on unknown words in particular.
4. **Future Prospects**:
   - Discuss how HMCs with EFB could be extended to treat sequential data with a deep architecture, which may make them an alternative to RNNs.

Through this research, the paper demonstrates the potential of the HMC model in the NLP field and provides a new direction for future research.
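
The idea that an HMC can consume MEMM-style quantities can be illustrated with Bayes' rule: since p(y_t | x_t) = p(x_t | y_t) p(y_t) / p(x_t), and p(y_t) does not depend on the hidden state, the posterior marginals are unchanged when the emission term is replaced by p(x_t | y_t) / p(x_t). The Python sketch below checks this numerically on a toy chain. It is not the paper's EFB recursion, whose exact definitions are not given here; the function name `forward_backward`, the toy parameters, and the stationarity assumption (so that p(x_t) is the same at every step) are all illustrative assumptions.

```python
import numpy as np

def forward_backward(pi, A, scores):
    """Posterior marginals p(x_t | y_1..T) for a hidden Markov chain.

    pi     : (K,)   initial state distribution p(x_1)
    A      : (K, K) transitions, A[i, j] = p(x_{t+1}=j | x_t=i)
    scores : (T, K) per-step state scores, proportional in k to p(y_t | x_t=k)
    """
    T, K = scores.shape
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    alpha[0] = pi * scores[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * scores[t]
        alpha[t] /= alpha[t].sum()      # rescale to avoid underflow
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (scores[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()        # rescale to avoid underflow
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

# Toy model (hypothetical numbers): 2 hidden states, 3 observation symbols.
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
pi = np.array([0.4, 0.6])               # stationary for A, so p(x_t) = pi for all t
B = np.array([[0.5, 0.4, 0.1],          # B[k, v] = p(y=v | x=k)
              [0.1, 0.3, 0.6]])
y = [0, 2, 1, 2]                        # observed symbol indices

# Generative scores: p(y_t | x_t), the classic HMC emission term.
gen = B[:, y].T                         # shape (T, K)

# Discriminative scores: p(x_t | y_t) / p(x_t), obtained here via Bayes' rule.
joint = pi[:, None] * B                 # p(x=k, y=v)
p_x_given_y = joint / joint.sum(axis=0, keepdims=True)
disc = (p_x_given_y[:, y] / pi[:, None]).T

post_gen = forward_backward(pi, A, gen)
post_disc = forward_backward(pi, A, disc)
same = np.allclose(post_gen, post_disc)
```

In this toy setting `p_x_given_y` is derived from the generative parameters, so the check is an identity. The practical interest is that p(x_t | y_t) could instead be supplied by any classifier over arbitrary features of the observation, which is the kind of flexibility the summary attributes to MEMM.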

Hidden Markov Chains, Entropic Forward-Backward, and Part-Of-Speech Tagging

Introducing the Hidden Neural Markov Chain framework

Highly Fast Text Segmentation With Pairwise Markov Chains

Expressivity of Hidden Markov Chains vs. Recurrent Neural Networks from a system theoretic viewpoint

Diversified Hidden Markov Models for Sequential Labeling

Forward-Backward Latent State Inference for Hidden Continuous-Time semi-Markov Chains

Comparative Analysis of Hidden Markov Model and Bidirectional Long Short-Term Memory for POS Tagging in Eastern Armenian

Experimental Study of Hidden Markov Model Based Part-of-speech Tagging for Chinese Texts

Disentangled Sticky Hierarchical Dirichlet Process Hidden Markov Model

Increasing the Interpretability of Recurrent Neural Networks Using Hidden Markov Models

Thermal characteristics of Spirulina platensis cells under nongrowing conditions at various values of pH medium.

Bidirectional LSTM-CRF Models for Sequence Tagging

Sentiment analysis incorporating convolutional neural network into hidden Markov model

Part-of-Speech Tagging for Historical English

A hidden Markov optimization model for processing and recognition of English speech feature signals

End-to-End Training of a Neural HMM with Label and Transition Probabilities

A New Method in Hidden Markov Model for Modeling Frame Correlation

The Use of Hidden Markov Model in Natural ARABIC Language Processing: a survey

A maximum entropy approach to adaptive statistical language modelling

Reduction of Maximum Entropy Models to Hidden Markov Models

Efficient Ensemble for Multimodal Punctuation Restoration using Time-Delay Neural Network