International Journal For Multidisciplinary Research

E-ISSN: 2582-2160     Impact Factor: 9.24

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal

Phishing attack using Ensemble Machine Learning

Author(s) Ms. Padmaja Sunil Mengane, Dr. Sarika Jadhav
Country India
Abstract The number of phishing websites and other new kinds of URL-based phishing threats continues to grow as more people come online. These threats use URLs to lure careless or unsuspecting users and harvest sensitive information such as usernames, passwords, and other personal and financial data. As the number of phishing websites increases, traditional detection systems (e.g., blacklists) cannot detect new-generation phishing sites in a timely and effective manner. Smarter mechanisms are needed that are more active and adaptive, so that they can counter changing attack behaviors.

In this study, we build a model that detects phishing from the URL of a website. Features are extracted from URLs using Natural Language Processing, and detection relies on supervised learning techniques. Of these, Random Forest and AdaBoost are chosen and compared to determine which performs better. Initially, data is collected from external data repositories (for example, Kaggle and UCI). Preprocessing steps are then applied (data cleaning, data monumenting, stop-word removal, and stemming, mostly with the Porter stemmer). Feature extraction and reduction eliminate redundancy and reduce the dimensionality of the data and, therefore, the load on the classifiers. Detection uses adaptive supervised learning techniques and is based on optimized features. The classifiers are evaluated using confusion-matrix metrics: accuracy, precision, recall, and F-measure. In the experiments, the Random Forest classifier performs better, achieving 93.89% accuracy compared with 92.67% for AdaBoost. The study concludes that hybrid machine learning techniques outperform traditional methods in the efficiency of detecting phishing URLs. The proposed system is scalable and dependable, supports real-time phishing detection, and thereby improves user safety and reduces threats.
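As a rough illustration of two steps named in the abstract, the sketch below splits a URL into word tokens with stop-word removal and then computes accuracy, precision, recall, and F-measure from confusion-matrix counts. It is not the authors' implementation: the stop-word list `URL_STOP_WORDS`, the function names, and the toy labels are assumptions made for this example, and stemming and the classifiers themselves are left out.

```python
import re

# Hypothetical stop-word list for URLs; the study does not publish its own.
URL_STOP_WORDS = {"http", "https", "www", "com"}


def tokenize_url(url):
    """Split a URL into lowercase alphanumeric tokens and drop stop words."""
    tokens = re.split(r"[^a-z0-9]+", url.lower())
    return [t for t in tokens if t and t not in URL_STOP_WORDS]


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), treating `positive` as the phishing class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F-measure from the counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Toy labels for illustration only (1 = phishing, 0 = legitimate).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
    print(tokenize_url("http://www.secure-login.example.com/verify?id=1"))
    print(classification_metrics(y_true, y_pred))
```

In a full pipeline, the tokens would presumably be stemmed and vectorized before being passed to a Random Forest or AdaBoost classifier, and the same metric function could then be applied to each classifier's predictions to compare them.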
Keywords Phishing Detection, Machine Learning, Random Forest, AdaBoost, NLP, URL Classification
Field Computer
Published In Volume 8, Issue 3, May-June 2026
Published On 2026-05-12
