Keyword Extraction from Single Document Using Hybrid Approach
Abstract
Keyword extraction from single documents is a central task in natural language processing (NLP), with direct applications in information retrieval and document summarization. This research introduces a novel hybrid approach that combines statistical and semantic methods to improve the precision and contextual relevance of extracted keywords. The hybrid model integrates Rapid Automatic Keyword Extraction (RAKE), a statistical algorithm based on word frequency and co-occurrence, with semantic analysis using word embeddings. Combining the two gives a fuller picture of keyword significance than either method provides alone. The proposed approach uses a weighting mechanism that assigns a significance score to the keywords derived from each method, so that statistical and semantic evidence are balanced in the final ranking. Evaluation on diverse datasets shows that the hybrid model outperforms each individual method. Key findings include improved accuracy in capturing contextual meaning, adaptability to domain-specific terminology, and robust keyword extraction from single documents. Open challenges remain in interpreting the weighting mechanism and in scaling the approach to real-world applications.
This research contributes an effective and adaptable hybrid solution to the problem of keyword extraction. The findings have implications for information retrieval systems, search engines, and automated document summarization, and point toward improved contextual understanding and relevance in NLP applications.
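The abstract does not spell out how the statistical and semantic scores are weighted, so the following is only a minimal Python sketch of one plausible reading, not the authors' implementation. It assumes a simplified RAKE scorer (sum of degree/frequency per word), embeddings supplied as a plain dictionary from word to vector, a semantic score defined as cosine similarity between a phrase and the document centroid, and a hypothetical mixing weight `alpha`. The stopword list, the function names, and the min-max normalization are all illustrative assumptions.

```python
import math
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; RAKE normally uses a larger one).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "for", "to",
             "from", "with", "on", "by"}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(phrases):
    """Statistical score: sum of degree(w) / frequency(w) over a phrase's words."""
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences, counting the word itself
    return {p: sum(degree[w] / freq[w] for w in p) for p in set(phrases)}


def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding, or None."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_scores(phrases, embeddings):
    """Semantic score: cosine similarity of each phrase to the document centroid."""
    centroid = mean_vector([w for p in phrases for w in p], embeddings)
    scores = {}
    for phrase in set(phrases):
        vector = mean_vector(phrase, embeddings)
        if vector is None or centroid is None:
            scores[phrase] = 0.0  # no embedding coverage, so no semantic evidence
        else:
            scores[phrase] = cosine(vector, centroid)
    return scores


def normalize(scores):
    """Min-max scale scores to [0, 1]; all-equal scores carry no signal and map to 0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def hybrid_keywords(text, embeddings, alpha=0.5, top_k=5):
    """Rank phrases by alpha * statistical + (1 - alpha) * semantic score."""
    phrases = candidate_phrases(text)
    stat = normalize(rake_scores(phrases))
    sem = normalize(semantic_scores(phrases, embeddings))
    combined = {p: alpha * stat[p] + (1 - alpha) * sem[p] for p in stat}
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return [(" ".join(phrase), score) for phrase, score in ranked[:top_k]]


if __name__ == "__main__":
    # Toy two-dimensional embeddings, invented purely for illustration.
    toy_embeddings = {"deep": [1.0, 0.0], "learning": [1.0, 0.0],
                      "keyword": [0.0, 1.0], "extraction": [0.0, 1.0],
                      "search": [1.0, 0.0]}
    sample = "deep learning for keyword extraction and deep learning for search"
    for phrase, score in hybrid_keywords(sample, toy_embeddings, alpha=0.5):
        print(f"{score:.2f}  {phrase}")
```

Setting `alpha` to 1.0 reduces the ranking to the statistical score alone and 0.0 to the semantic score alone, which gives a simple way to compare the hybrid against each individual method. In practice the toy dictionary would be replaced by pretrained word vectors, and `alpha` would need to be tuned on held-out data.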