A hybrid approach of Poisson distribution LDA with deep Siamese Bi-LSTM and GRU model for semantic similarity prediction for text data

Viji, D., Revathy, S.
DOI: https://doi.org/10.1007/s11042-023-15050-4
IF: 2.577
2023-03-19
Multimedia Tools and Applications
Abstract: Predicting the semantic similarity of text data is an open and challenging research problem in natural language processing (NLP). Traditional semantic text-similarity techniques capture lexical features of text but neglect its syntactic and semantic properties, and they produce high-dimensional feature vectors. To overcome these issues, the present study develops a hybrid approach that integrates a Deep Siamese bidirectional long short-term memory (Bi-LSTM) network with a gated recurrent unit (GRU) neural network as the training model. The proposed approach estimates vector weights and reduces the feature vector dimension before the training phase. First, the pre-processing phase eliminates special characters from the text and converts it to feature vectors through vectorization; the weight values are then updated using weighted term frequency-inverse document frequency (TF-IDF), aided by a log-likelihood weight calculation method. The Poisson Normal LDA (linear discriminant analysis) technique then reduces the dimension of the feature vectors. The resulting embedded vectors are fed as weight values into the training model, where the trained model estimates similarity scores for the input data and performs text classification using the Deep Siamese Bi-LSTM and GRU classifiers. In the performance assessment, the proposed model attains a 19% higher accuracy on the STS dataset than existing methods. The model also showed better results on the other datasets. The higher accuracy and F1 score demonstrate the efficiency of the proposed framework.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
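
The pre-processing and weighting steps named in the abstract can be illustrated with a minimal sketch. This is a plain TF-IDF baseline with cosine similarity, not the paper's method: the log-likelihood weighting, the Poisson Normal LDA reduction, and the Siamese Bi-LSTM/GRU model are not specified in the abstract and are not reproduced here. The function names, the smoothed IDF formula, and the example sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep only alphanumeric runs, which drops special characters.
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        # Smoothed IDF (an assumption, not taken from the paper).
        vectors.append({
            t: (c / total) * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
            for t, c in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    # Cosine of the angle between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sentence pairs in the style of a similarity benchmark.
sentences = [
    "A man is playing a guitar.",
    "A person plays the guitar!",
    "The stock market fell sharply today.",
]
vecs = tfidf_vectors(sentences)
sim_related = cosine_similarity(vecs[0], vecs[1])
sim_unrelated = cosine_similarity(vecs[0], vecs[2])
```

A purely lexical score like this only rewards shared tokens ("playing" and "plays" do not match), which is the limitation the abstract attributes to traditional techniques and the motivation for a learned model on top of the weighted vectors.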