Evaluating the Effectiveness of Categorical Encoding Methods on Higher Secondary Student’s Data for Multi-Class Classification

P. Amutha, R. Priya

Abstract

Multi-class classification is a prominent research area in which researchers and academicians classify data labels in fields such as crop yield, health care, accounting, finance, agriculture, bioinformatics, cyber security, cloud computing, simulation, and education, in order to solve problems, alleviate risks, and open up new opportunities. Encoding categorical data as numeric values is an essential step in data mining and machine learning, because algorithms in these domains have difficulty interpreting string values in a data set and classify poorly when given them. This research study focused on converting categorical data into numeric values using encoding methods. Various classifiers were considered in order to compare the performance of the encoding methods. The experimental results were compared in terms of accuracy, precision, recall, and F1 score. The combination of Random Forest and label encoding outperformed the other methods for multi-class classification.
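To make the idea of encoding concrete, the following is a minimal sketch in plain Python of label encoding, the method named above, next to one-hot encoding, which is included only for contrast and is an assumption here since the abstract does not list the other methods compared. The `stream` column and its values are hypothetical and are not taken from the study's data set.

```python
# Illustrative sketch only: the feature values are invented, not the study's data.

def label_encode(values):
    """Map each distinct category to an integer, using sorted category order."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Map each value to a 0/1 vector with one position per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return rows, categories


# Hypothetical categorical feature for four students.
stream = ["science", "commerce", "arts", "science"]

label_codes, label_mapping = label_encode(stream)
one_hot_rows, one_hot_columns = one_hot_encode(stream)

print(label_codes)    # one integer per student
print(one_hot_rows)   # one 0/1 vector per student
```

Label encoding keeps a single numeric column but implies an ordering among categories, whereas one-hot encoding avoids that ordering at the cost of one column per category. Which of the two suits a given classifier is an empirical question of the kind the study examines.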
