Keyword Extraction from Single Document Using Hybrid Approach

Fairoz Ahmad, Kiranpreet Kaur

Published: Jan 4, 2024

Abstract

Keyword extraction from a single document is a core task in natural language processing (NLP), with direct applications in information retrieval and document summarization. This research introduces a hybrid approach that combines statistical and semantic methods to improve the precision and contextual relevance of extracted keywords. The hybrid model integrates Rapid Automatic Keyword Extraction (RAKE), a statistical algorithm based on word frequency and co-occurrence, with semantic analysis using word embeddings. Combining the two gives the model a fuller picture of keyword significance than either method provides alone. The proposed approach applies a weighting mechanism that assigns significance to the keywords derived from each method, balancing statistical and semantic evidence. Evaluation on diverse datasets shows that the hybrid model outperforms the individual methods. Key findings include improved accuracy in capturing contextual meaning, adaptability to domain-specific terminology, and robust keyword extraction from single documents. Open challenges remain in interpreting the weighting mechanism and in scaling the approach to real-world applications.

This research contributes an effective and adaptable hybrid solution to the field of keyword extraction. The findings have implications for information retrieval systems, search engines, and automated document summarization, where they point toward better contextual understanding and relevance in NLP applications.
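The abstract does not give the scoring formulas, so the following is only a minimal sketch of how a RAKE-style statistical score and an embedding-based semantic score might be blended with a single weight. Everything specific here is an assumption made for illustration, not the authors' method: the tiny stopword list, the use of cosine similarity to the document centroid as the semantic score, the max-normalization, the `alpha` weight, and the function names. The embeddings are passed in as a plain dictionary of word vectors, and the vectors in the usage example are invented toy values.

```python
import math
import re
from collections import defaultdict

# Tiny illustrative stopword list (assumption; real RAKE setups use a much larger one).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "for", "to", "from", "with", "on", "by"}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stopwords, RAKE-style."""
    phrases = []
    for fragment in re.split(r"[.,;:!?()\n]+", text.lower()):
        current = []
        for word in re.findall(r"[a-z][a-z\-]*", fragment):
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(phrases):
    """Score each phrase as the sum of degree(w) / frequency(w) over its words."""
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences within the phrase, counting the word itself
    return {p: sum(degree[w] / freq[w] for w in p) for p in set(phrases)}


def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding; None if there are none."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_scores(phrases, embeddings):
    """Assumed semantic score: cosine similarity of a phrase to the document centroid, clamped at 0."""
    centroid = mean_vector([w for p in phrases for w in p], embeddings)
    scores = {}
    for phrase in set(phrases):
        vector = mean_vector(phrase, embeddings)
        if vector is None or centroid is None:
            scores[phrase] = 0.0
        else:
            scores[phrase] = max(0.0, cosine(vector, centroid))
    return scores


def normalize(scores):
    """Scale scores so the largest becomes 1.0, putting both methods on a common scale."""
    top = max(scores.values(), default=0.0)
    return {k: (v / top if top > 0 else 0.0) for k, v in scores.items()}


def hybrid_keywords(text, embeddings, alpha=0.5, top_k=5):
    """Blend the two scores: alpha weights the statistical part, 1 - alpha the semantic part."""
    phrases = candidate_phrases(text)
    stat = normalize(rake_scores(phrases))
    sem = normalize(semantic_scores(phrases, embeddings))
    combined = {p: alpha * stat[p] + (1 - alpha) * sem[p] for p in stat}
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return [(" ".join(p), round(score, 3)) for p, score in ranked[:top_k]]


if __name__ == "__main__":
    # Hypothetical 3-dimensional vectors, invented purely for this demonstration.
    toy_embeddings = {
        "keyword": [0.9, 0.1, 0.0],
        "extraction": [0.8, 0.2, 0.1],
        "retrieval": [0.6, 0.4, 0.1],
        "semantic": [0.2, 0.9, 0.1],
    }
    sample = ("Keyword extraction supports information retrieval. "
              "Hybrid keyword extraction combines statistical and semantic scores.")
    for phrase, score in hybrid_keywords(sample, toy_embeddings, alpha=0.6):
        print(f"{score:.3f}  {phrase}")
```

Setting `alpha=1.0` reduces the ranking to the statistical score alone and `alpha=0.0` to the semantic score alone, which gives a simple way to compare the blended ranking against each individual method on the same document.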

Issue

Vol. 45 No. 01 (2024)

Section

Articles
