A Comprehensive Survey on Heart Disease Prediction

Heart disease, stroke, and other vascular illnesses will kill one-third of humanity, according to the World Health Organization. Reducing mortality rates and providing the best clinical decision support for cardiac patients necessitates the use of an appropriate machine learning model to target early detection and accurate heart disease prediction. In this study, the Cleveland and Z-Alizadeh Sani datasets are used to examine the efficacy of several heart disease prediction algorithms.


Introduction
The complex and sometimes deadly nature of heart disease (HD) has long been known. This condition causes abnormal cardiac function, which in turn blocks blood arteries and increases the risk of heart attack, angina, and stroke. Coronary artery disease, coronary heart disease, congestive heart failure, and abnormal cardiac rhythms are the most frequent forms of heart disease. Conventional risk variables such as age, sex, hypertension, excessive cholesterol, irregular pulse, and many others pose significant difficulties for the early prediction of HD [1]. Even in economically deprived regions and rural areas, cardiovascular disease (CVD) has been recognised as one of the leading causes of mortality in India, despite the fact that cardiovascular risk factors vary widely across segments of society. The primary impetus for this research came from worldwide data showing that premature death due to CVD increased from 23.2 million in 1990 to 37 million in 2010 at an annualised rate of 59% [2].
Because of the importance of an accurate diagnosis of heart disease, a number of invasive clinical procedures have been developed, such as angiography. This has prompted a number of scientists to investigate the feasibility of using data-mining methods for the reliable diagnosis of CVD.
What we mean by "Machine Intelligence" is the ability of machines to learn and adapt, allowing them to solve problems and collaborate with other machines and the physical environment [2].
Artificial Intelligence (AI) techniques like machine learning and deep learning will likely serve as the basis for the model used to make predictions and verify the data. Both are very effective and deserve to be used in medical data analytics. Consider using several machine intelligence paradigms if you are [...] The StatLog Heart, Hungarian, Long Beach VA, and Kaggle Framingham datasets are also employed by researchers for prediction. As Table 1 shows, the Statlog dataset comprises 270 samples with 13 features, which are comparable to those found in Cleveland.
In contrast to the Cleveland dataset shown in Table 1, the Hungarian and Long Beach VA datasets are available from the UCI repository, and both include 274 samples for each of the 14 characteristics.
The Kaggle Framingham dataset has a vast amount of information, with samples totalling 4,240 patients and 16 characteristics that integrate behavioural, demographic, and medical risk variables.
In a healthy human body, new cells are created and old ones are destroyed. When this process fails, that is, when the old cells are left behind and the new cells grow needlessly, a lump of tissue called a tumor results.
The WHO has noted that the growing radio-frequency electromagnetic fields associated with electronic devices such as cell phones may be a cause of brain tumors. According to the National Health Portal, Government of India, tumors are a deadly illness, with fewer than 4% of patients surviving longer than 4 years.

International Journal for Multidisciplinary Research (IJFMR)
E-ISSN: 2582-2160 • Website: www.ijfmr.com • Email: editor@ijfmr.com
Different methods, such as neurological testing, angiograms, spinal taps, CT scans, and MRIs, aid in the identification of brain tumors based on symptoms and family history.
We looked at a number of newly popular strategies in this research to segment and categorize the brain tumor seen in MRI images. We have also provided a comparison based on how well the techniques for classifying abnormality and normalcy have worked. We also discuss the brain tumor datasets that are currently available for future technique validation.

Basic Prediction
The following figure shows the basic diagram of a heart disease prediction system.

Comparison and Results
The comparative assessment of the reviewed publications on heart disease prediction is shown in Table 1.

Discussion
Over the last few years, a number of researchers have put a significant amount of effort into developing methods to forecast instances of heart disease using the datasets described above.
In 1979, G.A. Diamond and J.S. Forrester used Bayes' Theorem [5] to draw a diagnostic conclusion regarding the likelihood of illness in a particular patient, based on data from procedures such as stress electrocardiography, cardiokymography, thallium scintigraphy, and cardiac fluoroscopy. Later, W.F. Wilson et al. [6] introduced a new facet to heart disease prediction by estimating CHD from risk factor categories using regression equations and logistic methods, in an effort to improve the accuracy of the methodologies.
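The Bayesian reasoning described above can be illustrated with a minimal Python sketch. It is not the procedure from [5]: the function names, the prior, and the sensitivity and specificity values in the usage note are hypothetical, and chaining several tests assumes that the test results are conditionally independent given the patient's true status.

```python
def posterior_probability(prior, sensitivity, specificity, positive):
    """Update the probability of disease after one test result (Bayes' theorem)."""
    if positive:
        p_result_given_disease = sensitivity
        p_result_given_healthy = 1.0 - specificity
    else:
        p_result_given_disease = 1.0 - sensitivity
        p_result_given_healthy = specificity
    numerator = p_result_given_disease * prior
    denominator = numerator + p_result_given_healthy * (1.0 - prior)
    return numerator / denominator


def sequential_posterior(prior, tests):
    """Chain several results; assumes the tests are conditionally independent.

    `tests` is a list of (sensitivity, specificity, positive) tuples.
    """
    probability = prior
    for sensitivity, specificity, positive in tests:
        probability = posterior_probability(
            probability, sensitivity, specificity, positive
        )
    return probability
```

For example, with a hypothetical pre-test probability of 0.5 and a test whose sensitivity and specificity are both 0.8, `posterior_probability(0.5, 0.8, 0.8, True)` raises the estimate to 0.8, and a second positive result from an equally accurate, independent test raises it further.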
In later years, a number of researchers developed novel machine learning and deep learning algorithms to predict the occurrence of cardiovascular disease accurately.
This article is a synopsis of current studies conducted on determining the likelihood of developing cardiovascular disease.
A large body of research on the prognosis of cardiac disease has been carried out in recent years by a variety of researchers using the datasets described above.

As early as 1979, G.A. Diamond and J.S. Forrester consolidated the findings of multiple tests, including stress ECG, CK, thallium scintigraphy, and cardiac fluoroscopy, into a single diagnosis.
The chance of illness in a particular patient may be determined using Bayes' Theorem [5]. In the following years, cardiology made some progress in its attempt to quantify CHD by classifying probable risk factors. This was an important step forward, using logistic and regression techniques developed by W.F. Wilson and colleagues [6]. Many researchers then developed machine learning and deep learning methods, using the datasets from the UCI repository, to predict cardiovascular disease [7][8][9][10][11][12][13][14][15][16][17][18].
The prognosis of cardiovascular sickness is the topic of discussion in this research, which includes a literature review of works on the subject.
In many instances, the reported accuracies were higher than expected given the features and machine learning algorithms used. To reach a high level of accuracy, a number of models relied on a limited sample size rather than on a highly associated factor such as age [7]. In contrast to [10], the studies in [9], [11], and [12] achieve high accuracy with the complete feature set by using a technique that is both efficient and compact. On closer examination, however, [10] attains lower accuracy than [12] even though it uses the identical ML approach. In other words, sample size is quite important for the trustworthiness of predictions.
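One way to see why sample size matters for the trustworthiness of a reported accuracy is to attach a confidence interval to it. The sketch below uses the Wilson score interval, a standard statistical interval for a proportion (unrelated to reference [6]); it is an illustration and not a method taken from the surveyed papers. The function name and the accuracy value in the usage note are hypothetical.

```python
import math


def wilson_interval(accuracy, n, z=1.96):
    """Approximate 95% Wilson score interval for an accuracy measured on n samples."""
    if n <= 0:
        raise ValueError("n must be a positive number of evaluated samples")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie between 0 and 1")
    z2 = z * z
    scale = 1.0 + z2 / n
    centre = (accuracy + z2 / (2.0 * n)) / scale
    half_width = (
        z * math.sqrt(accuracy * (1.0 - accuracy) / n + z2 / (4.0 * n * n)) / scale
    )
    return centre - half_width, centre + half_width
```

Calling `wilson_interval(0.90, 303)` and `wilson_interval(0.90, 4240)` for the same hypothetical accuracy of 0.90 gives a noticeably wider interval for the smaller sample, so an identical headline accuracy carries less certainty when it is measured on fewer patients.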
Some researchers opt for other approaches, such as feature selection and optimization procedures, to improve the accuracy of their predictions. These techniques remove data with lower correlations. For instance, removing one or more strongly correlated parameters that are necessary for diagnosing the illness, such as age, resting ECG, and ST depression, leads to improved accuracy, as demonstrated in a number of studies [18,20,21].
However, the practise of feature selection in prediction models [14,16,17,19] has not only increased accuracy but also mitigated problems such as increased processing costs and overfitting brought on by irrelevant input features during the learning process. In addition, these approaches may present design problems of their own, which may be solved with the most appropriate advanced prediction models within the framework of a prospective research project.
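As a rough illustration of the feature selection discussed above, the following sketch keeps only the features whose absolute Pearson correlation with the target reaches a threshold. This simple filter is a stand-in chosen for illustration, not the selection or optimization method of any cited study; the function names, the threshold, and the toy values in the usage note are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.3):
    """Return (kept feature names, |r| scores) for features meeting the threshold.

    `columns` maps a feature name to its list of values, one per patient.
    """
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return kept, scores
```

With toy data such as `{"age": [40, 50, 60, 70], "noise": [1, 2, 1, 2]}` and a target of `[0, 0, 1, 1]`, only `age` is kept. A filter of this kind scores each feature in isolation, so it cannot detect features that are informative only in combination.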
There are several techniques to assess the effectiveness of segmentation or classification systems. Researchers demonstrate their verified findings using a variety of approaches. Mean Square Error (MSE), Confusion matrix, Jaccard Index, Peak Signal to Noise Ratio (PSNR), Specificity, Accuracy metric, Recall, Sensitivity, and Precision are some of the commonly used performance measures that are analyzed in this study. Confusion matrices provide the crucial information about the actual outcome and the predicted outcome given by segmentation or classification algorithms.
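Several of the performance measures listed above follow directly from the four cells of a binary confusion matrix. The sketch below shows one way to compute them, assuming labels where 1 marks the presence of disease; the function names and the convention of returning 0.0 for an undefined ratio are choices made for this illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def safe_ratio(numerator, denominator):
    """Return 0.0 instead of failing when a ratio is undefined."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(actual, predicted, positive=1):
    """Derive common measures from the confusion matrix cells."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    return {
        "accuracy": safe_ratio(tp + tn, tp + fp + tn + fn),
        "precision": safe_ratio(tp, tp + fp),
        "recall": safe_ratio(tp, tp + fn),  # also called sensitivity
        "specificity": safe_ratio(tn, tn + fp),
        "jaccard": safe_ratio(tp, tp + fp + fn),
    }
```

Recall and sensitivity are the same quantity, which is why the sketch reports them under a single key. In a diagnostic setting a missed case (false negative) is usually more costly than a false alarm, so recall and specificity are worth reporting alongside accuracy.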

Conclusion
Machine intelligence may be used as an alternative diagnostic approach to forecast sickness and keep patients informed.
This article examines machine learning, ensemble, and deep learning cardiac prediction systems. In the studied literature, the Cleveland heart disease dataset, with 303 cases and 14 characteristics, is the most widely used, and its small sample size is a limitation. Any research using additional data sources used a single dataset with few characteristics. As a result, high-accuracy prediction models produced by removing extraneous information, removing strongly correlated components, or employing feature selection or optimization approaches cannot be generalized, which is a severe flaw.
Despite researchers' efforts, prediction models are not standardized. Alternative heart disease datasets with more characteristics should be investigated to improve classification and prediction accuracy. Future studies will focus on developing a predictive framework model that addresses most of the flaws identified in this paper. In addition, real-time data should be analysed using the working learning model to verify clinical correlation and validation.