XGBoost for Heart Disease Prediction: Achieving High Accuracy with Robust Machine Learning Techniques
Keywords:
Heart Disease Prediction, Machine Learning, XGBoost, Predictive Modeling
Abstract
This study examines the efficacy of the XGBoost algorithm for predicting heart disease with machine learning. We compared XGBoost against Decision Trees and Random Forests on a Kaggle dataset of more than 1,000 patient records with fourteen features. Our hyperparameter-optimized XGBoost model achieved an accuracy of 98.04%, a precision of 96.39%, a recall of 100%, and an F1-score of 98.16%. It surpasses prior approaches such as the hybrid Random Forest model, which achieved an accuracy of 88.7% with a reduced dataset. The model missed no positive cases (no false negatives) and produced few false positives, which supports its dependability for heart disease prediction. Our results suggest that XGBoost could be a powerful tool for early detection of heart disease; it offers substantial gains over the methods compared here and lays the groundwork for future studies in predictive analytics.
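The four reported metrics are linked through the confusion matrix, and a short calculation shows how they fit together. The sketch below is only an illustration: the abstract does not report a confusion matrix, so the counts used here (80 true positives, 3 false positives, 0 false negatives, 70 true negatives) are an assumption, chosen because they reproduce the reported figures to the stated precision. The function name `classification_metrics` is likewise hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators so the function is safe on edge cases.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# ASSUMED counts (not taken from the paper): one test split that is
# consistent with the metrics quoted in the abstract.
metrics = classification_metrics(tp=80, fp=3, fn=0, tn=70)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
# accuracy: 0.9804
# precision: 0.9639
# recall: 1.0000
# f1: 0.9816
```

With zero false negatives the recall is exactly 1, so the F1-score is determined by the precision alone: F1 = 2P / (1 + P). Under these assumed counts, the three false positives would account for every classification error.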
License
Copyright (c) 2024 Rajni Gandha, Pankaj Richhariya

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Re-users must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. This license allows redistribution and adaptation, for commercial and non-commercial purposes, as long as the original work is properly credited and adaptations are distributed under the same license.