Phishing URL Detection Using Machine Learning
Abstract
Phishing attacks are among the most significant cybersecurity risks of the online era because they deceive users into disclosing sensitive data through fraudulent websites that mimic trusted sources. Rule-based and blacklist-based detection methods are static, so they often fail to detect newly created phishing URLs. To overcome this limitation, this paper proposes a machine learning system for real-time phishing URL detection. The proposed system uses a supervised Gradient Boosting Classifier trained on a labeled dataset of URLs. Feature extraction covers lexical, domain-based, and webpage characteristics, including URL length, use of special characters, use of HTTPS, presence of an IP address, domain age, redirection, and HTML-based features. Based on these features, the model classifies each URL as legitimate or phishing and produces a confidence-based threat score. The system is deployed behind a web-based interface that displays risk levels visually and presents feature-based explanations, making its decisions more transparent and comprehensible to the user. Experimental analysis indicates that the proposed system achieves a high detection rate, which supports its suitability as a practical, explainable, and deployable phishing URL detection system.
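As an illustration of the lexical part of the feature extraction described above, the following is a minimal sketch using only the Python standard library. It is not the paper's implementation: the function names, the set of special characters, and the risk thresholds (0.3 and 0.7) are assumptions chosen for the example, and the domain-based and HTML-based features (domain age, redirection, page content) are omitted because they require network lookups.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical set of characters treated as "special"; the paper does not list them.
SPECIAL_CHARS = "@-_?=&%"


def host_is_ip(host: str) -> bool:
    """Return True if the host part of the URL is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_lexical_features(url: str) -> dict:
    """Compute a few lexical features of a URL (illustrative subset only)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "special_char_count": sum(url.count(c) for c in SPECIAL_CHARS),
        "uses_https": int(parsed.scheme == "https"),
        "has_ip_host": int(host_is_ip(host)),
    }


def risk_level(phishing_probability: float) -> str:
    """Map a model confidence in [0, 1] to a coarse risk label.

    The 0.3 / 0.7 cut-offs are assumptions for this sketch, not values from the paper.
    """
    if phishing_probability >= 0.7:
        return "high"
    if phishing_probability >= 0.3:
        return "medium"
    return "low"


if __name__ == "__main__":
    example = "http://192.168.0.1/login?user=a@b.com"
    print(extract_lexical_features(example))
    print(risk_level(0.9))
```

In a full pipeline, feature dictionaries like these would be converted to numeric vectors and passed to a trained classifier (for example, scikit-learn's `GradientBoostingClassifier`, whose `predict_proba` output could serve as the confidence fed to `risk_level`).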