Abstract: A document's keywords provide high-level descriptions of its content, summarizing the document's central themes, concepts, ideas, or arguments. These descriptive phrases make it easier for algorithms to find relevant information quickly and efficiently. Keyword extraction plays a vital role in document processing tasks such as indexing, classification, clustering, and summarization. Traditional keyword extraction approaches rely mostly on the statistical distribution of key terms in a document. Recent advances show that contextual information is critical in determining the semantics of a text, so context-based features may also benefit keyword extraction. For example, the context of a phrase can be described simply by the word that precedes or follows it. This research presents several experiments to validate that context-based keyword extraction offers a significant improvement over traditional methods. The proposed KeyBERT-based methodology also yields improved results. The proposed work identifies a group of important words or phrases in the document's content that reflect the authors' main ideas, concepts, or arguments, and it uses contextual word embeddings to extract these keywords. Finally, the findings are compared to those obtained using older approaches such as TextRank, RAKE, Gensim, YAKE, and TF-IDF. The Journal of Universal Computer Science (JUCS) dataset was employed in our research. Only the abstracts were used to produce keywords for each research article, and the KeyBERT model outperformed traditional approaches in producing keywords similar to the author-provided keywords. The average similarity of our approach with author-assigned keywords is 51%.
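The idea of describing a phrase's context by its neighbouring words can be illustrated with a minimal sketch. It is not the method evaluated in this work: the function name `context_features`, the simple alphanumeric tokenizer, and the sample text are all assumptions made for illustration only.

```python
import re


def context_features(text, phrase):
    """Collect (previous_word, next_word) pairs for each occurrence of phrase.

    Hypothetical helper for illustration: tokenization is a plain lowercase
    alphanumeric split, and sentence boundaries are ignored.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    target = re.findall(r"[a-z0-9]+", phrase.lower())
    n = len(target)
    features = []
    if n == 0:
        return features
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            # None marks a missing neighbour at the start or end of the text.
            prev_word = tokens[i - 1] if i > 0 else None
            next_word = tokens[i + n] if i + n < len(tokens) else None
            features.append((prev_word, next_word))
    return features


sample = (
    "Keyword extraction supports indexing. "
    "Contextual keyword extraction uses embeddings."
)
print(context_features(sample, "keyword extraction"))
# prints [(None, 'supports'), ('contextual', 'uses')]
```

Such neighbour pairs are only the simplest form of context; contextual word embeddings, as used in the proposed work, encode far richer information about the surrounding text.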

Keyword Extraction Based on Statistical Information for Cyrillic Mongolian Script

Multilingual Single Document Keyword Extraction For Information Retrieval

Chinese Keyword Extraction Based on N-Gram and Word Co-occurrence

Domain Term Extraction Method Based on Hierarchical Combination Strategy for Chinese Web Documents

Automatic Keywords Extraction Based on Co-Occurrence and Semantic Relationships Between Words

FRAKE: Fusional Real-time Automatic Keyword Extraction

Impact analysis of keyword extraction using contextual word embedding

Chinese Keyword Extraction Algorithm Based on Neighbour Words

KSW: Khmer Stop Word based Dictionary for Keyword Extraction

Keyword Extraction Based on Tf/idf for Chinese News Document

Keyword extraction and clustering for document recommendation in conversations

Keyword extraction from handwritten Chinese document image based on matching and voting of the local topological structure

Keyword extraction using support vector machine

Topic Detection Technology for Chinese Text Based on Statistics and Semantic Information

Keyword Extraction using the Word Co-occurrence Network Properties that is Independent of Languages and Document Types and Its Evaluation by Prediction of Headline Words

Algorithm of Chinese Keywords Extraction based on Multi-feature

Mongolian-Chinese Cross-Language Query Expansion Based on Cross-Language Word Vectors

Multi-font Printed Mongolian Document Recognition System

Domain-Specific Keyword Extraction Using Joint Modeling of Local and Global Contextual Semantics

Word Level Script Recognition for Uighur Document Mixed with English Script

Keyword Extraction Based on Lexical Chains and Word Co-occurrence for Chinese News Web Pages