Phishing attack using Ensemble Machine Learning

Padmaja Sunil Mengane; Sarika Jadhav

Author(s)	Ms. Padmaja Sunil Mengane, Dr. Sarika Jadhav
Country	India
Abstract	As more people come online, the number of phishing websites and other URL-based phishing threats continues to grow. These threats use deceptive URLs to lure careless or unsuspecting users and harvest sensitive information such as usernames, passwords, and other personal and financial data. As the number of phishing websites increases, traditional detection systems such as blacklists cannot detect new phishing sites quickly or effectively. Smarter mechanisms are needed that adapt to attackers' changing behavior. In this study, we build a model that detects phishing from the URL of a website. Features are extracted from URLs using Natural Language Processing (NLP), and classification is performed with supervised learning techniques. Of these, Random Forest and AdaBoost are selected and compared to determine which performs better. Data is first collected from external repositories (for example, Kaggle and UCI). Preprocessing is then applied: data cleaning, data monumenting, stop-word removal, and stemming with the Porter stemmer. Feature extraction and reduction eliminate redundancy and reduce the dimensionality of the data, which lowers the load on the classifiers. The classifiers are trained on the optimized feature set and evaluated using confusion-matrix metrics: accuracy, precision, recall, and F-measure.
In the experiments, the Random Forest classifier performs better, achieving 93.89% accuracy compared with 92.67% for AdaBoost. The study concludes that hybrid machine learning techniques outperform traditional methods in detecting phishing URLs. The proposed system is scalable and dependable, supports real-time phishing detection, and thereby improves user safety and reduces threats.
Keywords	Phishing Detection, Machine Learning, Random Forest, AdaBoost, NLP, URL Classification
Field	Computer
Published In	Volume 8, Issue 3, May-June 2026
Published On	2026-05-12
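
The abstract reports accuracy, precision, recall, and F-measure derived from a confusion matrix. As a rough illustration of how those four figures relate, the sketch below computes them from raw counts. It assumes phishing is treated as the positive class; the `ConfusionMatrix` type, the `metrics` helper, and the example counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary task: phishing (positive) vs. legitimate (negative)."""
    tp: int  # phishing URLs correctly flagged
    fp: int  # legitimate URLs wrongly flagged as phishing
    fn: int  # phishing URLs that were missed
    tn: int  # legitimate URLs correctly passed


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def metrics(cm: ConfusionMatrix) -> dict:
    """Compute accuracy, precision, recall, and F-measure from the counts."""
    total = cm.tp + cm.fp + cm.fn + cm.tn
    precision = safe_div(cm.tp, cm.tp + cm.fp)
    recall = safe_div(cm.tp, cm.tp + cm.fn)
    return {
        "accuracy": safe_div(cm.tp + cm.tn, total),
        "precision": precision,
        "recall": recall,
        # F-measure is the harmonic mean of precision and recall.
        "f_measure": safe_div(2 * precision * recall, precision + recall),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = ConfusionMatrix(tp=90, fp=10, fn=5, tn=95)
scores = metrics(example)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Accuracy alone can hide an imbalance between missed phishing URLs and false alarms, which is presumably why the abstract lists precision, recall, and F-measure alongside it.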

E-ISSN 2582-2160

All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.

International Journal For Multidisciplinary Research

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal