Abstract: Text categorization has become increasingly important for businesses that handle massive volumes of data generated online, and it is widely used in natural language processing (NLP). The ability to group texts into distinct categories is crucial for users to retain and use important information effectively. This study aims to improve on existing recurrent neural network (RNN) techniques for text classification with a deep learning approach. The main difficulty in text classification is raising classification quality, because overall efficacy is often hampered by insufficient context sensitivity in the semantics of the data. To address this difficulty, we present a unified approach for examining the effects of word embedding and the gated recurrent unit (GRU) on text classification. We use the standard TREC dataset. The RCNN has four convolution layers, four LSTM layers, and two GRU layers; the RNN, by contrast, has four GRU layers and four LSTM layers. The GRU is a type of RNN well known for its ability to model sequential data. In our tests, words with similar meanings were typically located near each other in the embedding space. The experimental results show that our hybrid GRU model efficiently learns word usage patterns from the provided training set. The depth and breadth of the training data greatly influence the model's effectiveness. On a single benchmark dataset, the proposed method performs well compared with other well-known recurrent algorithms such as RNN, MV-RNN, and LSTM. The proposed model achieves an F-measure of 0.982, compared with 0.952 for the hybrid GRU.
We compared the performance of the proposed method with that of the three most popular recurrent neural network designs at present (RNNs, MV-RNNs, and LSTMs) and found that the new method achieved better results on two benchmark datasets in terms of both accuracy and error rate.
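
To make the GRU mentioned in the abstract concrete, the following is a minimal sketch of a single GRU update applied over a sequence of word embeddings, written in plain Python. It is not the authors' model: the function names (`gru_step`, `gru_encode`), the parameter layout, and the update convention (the update gate `z` keeps the previous state, as in one common formulation) are assumptions made for illustration only.

```python
import math


def sigmoid(x):
    """Logistic function used for the GRU gates."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x, h_prev, params):
    """One GRU update for input vector x and previous hidden state h_prev.

    params is a hypothetical dict with keys "z", "r", "h"; each value is a
    tuple (W, U, b): input weights, recurrent weights, and bias.
    """
    Wz, Uz, bz = params["z"]
    Wr, Ur, br = params["r"]
    Wh, Uh, bh = params["h"]

    # Update gate: how much of the previous state to keep.
    z = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(Wz, x), matvec(Uz, h_prev), bz)]
    # Reset gate: how much of the previous state feeds the candidate.
    r = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(Wr, x), matvec(Ur, h_prev), br)]
    # Candidate state computed from the input and the reset-gated history.
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = [math.tanh(a + b + c)
              for a, b, c in zip(matvec(Wh, x), matvec(Uh, gated), bh)]
    # Assumed convention: z weights the old state, (1 - z) the candidate.
    return [zi * hp + (1.0 - zi) * hc
            for zi, hp, hc in zip(z, h_prev, h_cand)]


def gru_encode(embeddings, params, hidden_size):
    """Run the GRU over a sequence of embedding vectors; return the final state."""
    h = [0.0] * hidden_size
    for x in embeddings:
        h = gru_step(x, h, params)
    return h


def zero_params(input_size, hidden_size):
    """Build all-zero parameters, useful only for checking the wiring."""
    def block():
        W = [[0.0] * input_size for _ in range(hidden_size)]
        U = [[0.0] * hidden_size for _ in range(hidden_size)]
        b = [0.0] * hidden_size
        return (W, U, b)
    return {"z": block(), "r": block(), "h": block()}
```

In a classifier, the final hidden state returned by `gru_encode` would typically be passed to a softmax layer over the target classes; that layer, and the training of the weights, are omitted here.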
