
International Journal For Multidisciplinary Research
E-ISSN: 2582-2160
Impact Factor: 9.24
A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal

Data Preprocessing Methods for Machine Learning: An Empirical Comparison
Author(s) | Dr. P Yasodha |
---|---|
Country | India |
Abstract | The accuracy and efficiency of machine learning (ML) algorithms depend largely on the quality and structure of the input data. Data preprocessing is a crucial step in the ML pipeline that transforms raw data into a clean, structured format suitable for modeling. Despite the diversity of preprocessing techniques, such as normalization, standardization, missing value imputation, categorical variable encoding, and feature selection, there remains a lack of comprehensive empirical evaluation of their comparative effectiveness. This paper presents a systematic comparison of prominent data preprocessing methods across multiple real-world datasets and machine learning algorithms. Using a controlled experimental setup, we analyze the influence of different preprocessing techniques on model performance metrics such as accuracy, precision, recall, F1-score, and training time. The study reveals that while certain methods, such as standardization and one-hot encoding, generally improve performance, their effectiveness depends on both the dataset and the algorithm. The findings highlight the importance of tailoring preprocessing strategies to specific use cases and provide guidelines for selecting optimal preprocessing combinations for different ML contexts. |
Keywords | Data preprocessing, Machine learning, Feature scaling, Missing value imputation, Categorical encoding, Feature selection, Model performance, Empirical comparison |
Field | Arts |
Published In | Volume 7, Issue 3, May-June 2025 |
Published On | 2025-06-19 |
DOI | https://doi.org/10.36948/ijfmr.2025.v07i03.48569 |
Short DOI | https://doi.org/g9qw9w |
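
The abstract names several preprocessing techniques without showing them. The following is a minimal sketch of four of them (mean imputation, min-max normalization, standardization, and one-hot encoding) in plain Python. It is an illustration only, not the paper's experimental code: the function names and the toy `ages` and `colors` values are assumptions made for this example, and the paper itself does not specify an implementation.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly into the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot_encode(labels):
    """Map each label to a 0/1 vector, one position per sorted category."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


# Hypothetical toy data, invented for this sketch.
ages = [20, None, 40, 30]
colors = ["red", "blue", "red"]

filled = impute_mean(ages)
normalized = min_max_normalize(filled)
standardized = standardize(filled)
encoded = one_hot_encode(colors)
```

In practice, the scaling and imputation statistics would be estimated on the training split only and then applied to the test split, so that no information leaks from test data into the model.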

CrossRef DOI is assigned to each research paper published in our journal.
IJFMR DOI prefix is 10.36948/ijfmr
All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.
