International Journal For Multidisciplinary Research

E-ISSN: 2582-2160     Impact Factor: 9.24

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal


Data Preprocessing Methods for Machine Learning: An Empirical Comparison

Author(s) Dr. P Yasodha
Country India
Abstract The accuracy and efficiency of machine learning (ML) algorithms depend largely on the quality and structure of the input data. Data preprocessing is a crucial step in the ML pipeline that transforms raw data into a clean, structured format suitable for modeling. Despite the diversity of preprocessing techniques, such as normalization, standardization, missing value imputation, categorical variable encoding, and feature selection, there remains a lack of comprehensive empirical evaluation of their comparative effectiveness. This paper presents a systematic comparison of prominent data preprocessing methods across multiple real-world datasets and machine learning algorithms. Using a controlled experimental setup, we analyze the influence of different preprocessing techniques on model performance metrics such as accuracy, precision, recall, F1-score, and training time. The study reveals that while certain methods, such as standardization and one-hot encoding, generally improve performance, their effectiveness is dataset- and algorithm-dependent. The findings highlight the importance of tailoring preprocessing strategies to specific use cases and provide guidelines for selecting preprocessing combinations suited to different ML contexts.
Keywords Data preprocessing, Machine learning, Feature scaling, Missing value imputation, Categorical encoding, Feature selection, Model performance, Empirical comparison
Field Arts
Published In Volume 7, Issue 3, May-June 2025
Published On 2025-06-19
DOI https://doi.org/10.36948/ijfmr.2025.v07i03.48569
Short DOI https://doi.org/g9qw9w
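
The abstract names several preprocessing techniques without showing how they operate. As a rough illustration only, the sketch below implements four of them (mean imputation, min-max normalization, z-score standardization, and one-hot encoding) in plain Python. The function names and the toy values are assumptions made for this example; they are not taken from the paper, and the paper's actual implementation and datasets are not described on this page.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant feature carries no scale information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot_encode(labels):
    """Return the sorted category list and one indicator row per label."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


# Hypothetical toy feature with one missing value.
ages = impute_mean([20, None, 40])
ages_normalized = min_max_normalize(ages)
ages_standardized = standardize(ages)

# Hypothetical toy categorical feature.
categories, encoded = one_hot_encode(["red", "blue", "red"])
```

In a comparison like the one the abstract describes, the statistics used by each transform (mean, minimum, maximum, standard deviation, category list) would normally be computed on the training split only and then reused on the test split, so that no test information leaks into the model.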
