Deciphering The Genomic Clues: Advanced Feature Selection Strategies for Cancer Detection in Microarray Gene Expression Profiles
Abstract
Microarray gene expression analysis is a pivotal research area for early disease detection and effective treatment. Public gene expression datasets record the activation profiles of thousands of genes in patients with suspected disease, and the resulting high-dimensional feature vectors make analysis difficult. Identifying the genes associated with a disease is therefore paramount. This research introduces a novel method that combines feature discretization and feature selection within a machine learning framework. Our experiments show high accuracy, few false negatives, and substantial dimensionality reduction. The resulting gene subsets are interpretable by clinical experts, which facilitates disease verification. Microarray technology, integral to genetic research, has diverse applications in health, including disease prediction and cancer investigation. However, analyzing large volumes of raw gene expression data is computationally demanding. Our research also covers feature selection methods, applicable to both labelled and unlabelled data, which are crucial for robust cancer classification in the presence of high dimensionality, small sample sizes, and noise. We present a comprehensive taxonomy of these methods and discuss open research questions and potential inferences for microarray-based cancer prediction.
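As an illustration of what combining discretization with selection can look like, the following is a minimal sketch, not the method evaluated in this article. The abstract does not specify the discretizer or the selection criterion, so equal-width binning and a mutual-information ranking are assumed here as stand-ins. The function names (`discretize`, `mutual_information`, `select_genes`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def discretize(values, n_bins=3):
    """Equal-width binning of one gene's expression values (assumed discretizer)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant gene carries no information; put every sample in bin 0.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # Clamp so the maximum value falls in the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(xs, ys):
    """Mutual information, in bits, between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def select_genes(expression, labels, k=2, n_bins=3):
    """Return the indices of the k genes whose discretized expression
    shares the most information with the class labels (assumed criterion)."""
    n_genes = len(expression[0])
    scores = []
    for g in range(n_genes):
        column = [row[g] for row in expression]
        score = mutual_information(discretize(column, n_bins), labels)
        scores.append((score, g))
    # Highest score first; ties broken by gene index for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [g for _, g in scores[:k]]


# Toy data: 4 samples (rows) x 3 genes (columns), invented for illustration.
expression = [
    [0.1, 5.0, 2.2],
    [0.2, 1.0, 2.1],
    [0.9, 5.1, 2.3],
    [1.0, 0.9, 2.2],
]
labels = [0, 0, 1, 1]  # hypothetical class labels, e.g. 0 = healthy, 1 = tumour

selected = select_genes(expression, labels, k=1)
```

In this toy example only the first gene separates the two classes after binning, so it is the one retained. Real microarray data would have thousands of columns and call for a cross-validated choice of `k` and `n_bins`.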