Crop Prediction Using Machine Learning Algorithm

Abstract: Many developing nations rely heavily on agriculture as their main source of income. Modern agriculture continues to evolve through constant innovation in farming practices. Meeting the ever-changing demands of our planet and satisfying the expectations of merchants and consumers are significant challenges for farmers. These challenges include coping with climate changes due to soil erosion and industrial emissions; addressing deficiencies of soil nutrients such as potassium, nitrogen, and phosphorus, which can hinder crop growth; and overcoming the tendency to cultivate the same crops repeatedly without experimenting with different varieties. The objective of this paper is to identify the optimal model for predicting crop outcomes, assisting farmers in choosing suitable crops based on climatic conditions and soil nutrient levels. The paper compares the accuracy of different models and uses the best among them.


Introduction:
Machine learning serves as a valuable tool for decision-making in predicting agricultural yields and determining optimal crop choices and activities throughout the growing season. Various machine learning methods have been employed in crop prediction studies to enhance accuracy and efficiency.
In this work, we use machine learning to predict which crops will do well on a given farm. Imagine sensors in the soil that measure nutrients (N, P, K), pH level, humidity, and temperature. By gathering and analyzing this information, we want to create a smart system that helps farmers decide which crops to plant. We also test different models to see which one works best. One standout is the Random Forest model, which handles large amounts of information well and makes accurate predictions. The broader goal is to give farmers a handy tool that considers all of these factors, so that they can make smarter choices about what to plant among the crops in the provided historical crop dataset.

Literature Survey
Ashwani Kumar Kushwaha [2] discusses methods for predicting crop yield to enhance farmer profits and the agriculture sector's quality. The study employs big data, specifically soil and weather data, collected through the Hadoop platform and agro algorithms. The repository data is utilized to predict suitable crops for specific conditions, thereby improving crop quality.

Email: editor@ijfmr.com

IJFMR230611095
Volume 5, Issue 6, November-December 2023

Dahikar S [3] highlights the significance of crop prediction and proposes methods to enhance accuracy. The paper introduces a feed-forward backpropagation Artificial Neural Network approach to model and forecast crop yields in rural areas based on soil parameters (pH, nitrogen, potassium) and atmospheric parameters (rainfall, humidity).
Rahul Katarya [4] investigates diverse machine learning approaches to enhance crop yield. The paper explores the application of artificial intelligence, incorporating machine learning algorithms and big data analysis in precision agriculture. The author details the implementation of a crop recommendation system, employing methods such as K-Nearest Neighbors (KNN), Ensemble-based Models, and Neural Networks.
Dhanush Vishwakarma [5] utilizes the SVM algorithm for rainfall prediction and the Decision Tree algorithm for crop prediction. Input variables include N, P, K, pH, rainfall, and humidity. This integrated approach aims to provide accurate predictions for better agricultural planning.

Proposed Work:
The proposed system aims to forecast the optimal crop for a specific piece of land by considering soil composition and various weather parameters, including temperature, humidity, soil pH, and rainfall. These values are collected through sensors and testing.

Collecting Raw Data
The process of collecting and analyzing data from diverse sources is known as data collection. This practice facilitates the retrospective examination of events and supports the application of data analysis to identify recurrent patterns. The dataset utilized in the 'Crop Recommendation' project is sourced from the Kaggle platform. This dataset includes 22 distinct crops as class labels and comprises seven attributes:
1. Nitrogen Ratio (N): The proportion of nitrogen in the soil, a critical factor for plant growth.

Data Preprocessing
Data preprocessing is the crucial step of transforming raw data into a format suitable for analysis and machine learning algorithms. It enables analysts and data scientists to derive insights or predict outcomes. In this project, our data preprocessing primarily focuses on identifying and handling missing values. It is common for datasets to contain empty cells, null values, or specific characters like question marks, all of which may indicate missing data. Fortunately, the dataset utilized in this project is free from any missing values.
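The missing-value check described here could be sketched as follows. This is a minimal illustration using pandas on a tiny made-up table, not the project's actual code; the column names and sample values are assumptions chosen only to show how empty cells, nulls, and question marks can be detected.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample (hypothetical values and column names, for illustration only).
df = pd.DataFrame({
    "N": [90, 85, "?", 74],
    "ph": [6.5, None, 7.0, 6.8],
    "label": ["rice", "rice", "maize", ""],
})

# Treat placeholder characters and empty strings as missing values.
cleaned = df.replace({"?": np.nan, "": np.nan})

# Count missing entries per column; all zeros would mean a complete dataset.
missing_per_column = cleaned.isna().sum()
print(missing_per_column)

# One possible handling strategy (an assumption): drop incomplete rows.
complete_rows = cleaned.dropna()
print(len(complete_rows), "complete rows out of", len(cleaned))
```

On a dataset without missing values, as reported for this project, every count printed by `isna().sum()` would be zero and no rows would be dropped.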

Train and Test Split
The dataset is divided into a training dataset and a testing dataset using the train_test_split() method from the scikit-learn module. Out of the 2200 data points in the dataset, 80% (1760 data points) constitute the training dataset, while the remaining 20% (440 data points) form the testing dataset.
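A minimal sketch of this split is shown below. It uses randomly generated placeholder data with the same shape as the dataset (2200 rows, 7 features, 22 classes); the `random_state` value and the use of `stratify` are assumptions, since the paper does not state them.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data with the same shape as the dataset: 2200 rows, 7 features,
# and 22 class labels (100 rows per class). The values themselves are random.
rng = np.random.default_rng(0)
X = rng.random((2200, 7))
y = np.repeat(np.arange(22), 100)

# 80/20 split. random_state and stratify are assumptions, not from the paper.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # (1760, 7) (440, 7)
```

Stratifying on the label keeps the class proportions the same in both parts, which can matter when there are many classes and a small test set.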

Fitting the model
Model fitting involves adjusting the model's parameters to enhance accuracy. This process entails running the algorithm on labeled data to establish a machine learning model. The model's accuracy is then assessed by comparing its predictions against the known target variable. A well-fitted model can generalize well to new, unseen data.
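As a rough illustration of fitting and then checking generalization, the sketch below trains a classifier on synthetic data and compares accuracy on the training data with accuracy on held-out data. The synthetic dataset, the hyperparameters, and the variable names are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the labeled data (an assumption, not the real dataset).
X, y = make_classification(
    n_samples=2200, n_features=7, n_informative=5, n_redundant=0,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit the model on the labeled training data.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Compare predictions against the known targets on seen and unseen data.
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```

A large gap between the two numbers would suggest the model has memorized the training data rather than learned patterns that carry over to new samples.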

Model used for the system: Random Forest Classifier
The chosen model for this system is the Random Forest Classifier. This ensemble learning method utilizes multiple decision tree classifiers to improve overall performance. The algorithm creates decision trees using randomly selected instances from the training set. Each decision tree provides a prediction, and the final model prediction is determined through majority voting. The Random Forest Classifier is favored in machine learning because combining many trees reduces overfitting, and accuracy generally improves, up to a plateau, as more trees are added. The algorithm proceeds as follows:
1. Randomly select K instances from the provided training dataset.
2. Build a decision tree based on the chosen instances.
3. Specify the number of estimators (N) for the total trees to be created.
4. Repeat steps 1 and 2 for N iterations.
5. For a new instance, gather predictions from each estimator, and assign the category with the highest number of votes as the final prediction.
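The steps above can be sketched by hand as follows. This is an illustrative sketch, not the implementation used in the paper: the helper names `fit_forest` and `predict_forest` are hypothetical, sampling with replacement and the `max_features="sqrt"` setting are assumptions, and the data is synthetic.

```python
from collections import Counter

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def fit_forest(X, y, n_estimators=25, sample_size=None, seed=0):
    """Steps 1-4: build n_estimators trees, each on K randomly chosen instances."""
    rng = np.random.default_rng(seed)
    k = sample_size or len(X)
    trees = []
    for _ in range(n_estimators):
        # Step 1: randomly select K instances (with replacement; an assumption).
        idx = rng.integers(0, len(X), size=k)
        # Step 2: build a decision tree on the chosen instances.
        tree = DecisionTreeClassifier(
            max_features="sqrt", random_state=int(rng.integers(1_000_000))
        )
        trees.append(tree.fit(X[idx], y[idx]))
    return trees


def predict_forest(trees, X):
    """Step 5: collect one vote per tree and return the majority class."""
    votes = np.stack([tree.predict(X) for tree in trees])  # (n_trees, n_samples)
    return np.array([Counter(column).most_common(1)[0][0] for column in votes.T])


# Synthetic stand-in data (an assumption, not the crop dataset).
X, y = make_classification(
    n_samples=600, n_features=7, n_informative=5, n_redundant=0,
    n_classes=4, random_state=0,
)
X_train, y_train, X_test, y_test = X[:480], y[:480], X[480:], y[480:]

trees = fit_forest(X_train, y_train, n_estimators=25)
preds = predict_forest(trees, X_test)
vote_accuracy = float(np.mean(preds == y_test))
print(f"majority-vote accuracy: {vote_accuracy:.3f}")
```

In practice, scikit-learn's `RandomForestClassifier` packages the same idea behind a single estimator, with `n_estimators` controlling the number of trees.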

F1 score
The F1 score is the harmonic mean of precision and recall, ranging from 0.0 (indicating the worst performance) to 1.0 (representing the best performance). F1 scores tend to be lower than accuracy measurements, as they take into account both precision and recall in their computation.

$$F1 = \frac{2 \cdot P \cdot R}{P + R}$$

where P is Precision and R is Recall.

Accuracy
Model accuracy is calculated as the ratio of correct predictions to the total number of predictions. This metric is available in scikit-learn's metrics module.

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$

In this formula, TP represents True Positive, FP stands for False Positive, TN denotes True Negative, and FN represents False Negative.
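A small worked example may help make the two metrics concrete. The confusion counts below are made up for illustration, and the helper function names are hypothetical; the formulas are the standard definitions of F1 (harmonic mean of precision and recall) and accuracy.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2PR / (P + R), with P = TP/(TP+FP) and R = TP/(TP+FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def accuracy_from_counts(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


# Made-up confusion counts for a single class (illustrative only).
tp, tn, fp, fn = 40, 45, 5, 10

f1 = f1_from_counts(tp, fp, fn)                 # P = 40/45, R = 40/50
accuracy = accuracy_from_counts(tp, tn, fp, fn)
print(f"F1 = {f1:.4f}, accuracy = {accuracy:.4f}")  # F1 = 0.8421, accuracy = 0.8500
```

For a multi-class problem such as crop recommendation, these counts would be computed per class and then averaged; which averaging scheme to use is a design choice the paper does not specify.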

Result and Analysis
We conducted experiments with various machine learning models, including the random forest classifier, decision tree, support vector machine, k-nearest neighbors (KNN), and a stacked model. Our dataset contains seven features: (i) content ratio of N, (ii) content ratio of P, (iii) content ratio of K in the soil, (iv) temperature, (v) relative humidity, (vi) soil pH, and (vii) rainfall.

Currently, our farmers are not making optimal use of technology and analysis, increasing the risk of selecting the wrong crops for cultivation and subsequently decreasing their income. In order to mitigate these potential losses, we are working on the development of a user-friendly system with a graphical user interface (GUI). This system aims to predict the most suitable crop for a specific piece of land and provide additional information about the recommended crop. By empowering farmers with accurate insights, we hope to enable them to make informed decisions in crop selection, contributing to the development of the agricultural sector through innovative ideas.
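As a rough illustration of the model comparison described in this section, the sketch below fits the five model types on synthetic data and reports test accuracy for each. Everything specific here is an assumption: the data is synthetic, the hyperparameters are library defaults or arbitrary choices, and the composition of the stacked model (the four base models with a logistic regression on top) is a guess, since the paper does not describe it.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the seven-feature dataset (an assumption).
X, y = make_classification(
    n_samples=600, n_features=7, n_informative=5, n_redundant=0,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)


def base_models():
    """Fresh instances of the four base model types named in the text."""
    return {
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
        "decision_tree": DecisionTreeClassifier(random_state=0),
        # SVM and KNN are distance-based, so features are standardized first.
        "svm": make_pipeline(StandardScaler(), SVC()),
        "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    }


models = base_models()
# Hypothetical stacked model: base models combined by logistic regression.
models["stacked"] = StackingClassifier(
    estimators=list(base_models().items()),
    final_estimator=LogisticRegression(max_iter=1000),
)

# Fit each model and record its accuracy on the held-out test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name:>14}: {acc:.3f}")
```

The numbers printed on synthetic data say nothing about the paper's results; the sketch only shows one way such a comparison could be organized so that every model is evaluated on the same split.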

Dataset attributes, continued from item 1 (Nitrogen Ratio):
2. Phosphorus Ratio (P): The proportion of phosphorus in the soil, an essential nutrient influencing plant development.
3. Potassium Ratio (K): The proportion of potassium in the soil, a key element for overall plant health.
4. Soil Temperature: The temperature in degrees Celsius of the surrounding environment, affecting biochemical reactions in plants.
5. Relative Humidity: The percentage of water vapor in the air relative to its maximum capacity at the given temperature.
6. Soil pH: The measurement of soil acidity or alkalinity, influencing nutrient availability for plants.
7. Rainfall: The amount of precipitation in millimeters, a crucial factor for crop irrigation.

Figure 4.2: pH range representation