TY - GEN
T1 - Cardiovascular Disease Detection Using Machine Learning Models
AU - Lucas, Augusto Manuel Juanino
AU - Al-Absi, Hamada R.H.
AU - Alirr, Omar Ibrahim
AU - Alam, Tanvir
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Cardiovascular diseases (CVDs) are the leading cause of death worldwide. Early detection of CVD is therefore crucial for managing health outcomes and reducing the burden on healthcare systems. Artificial intelligence-based solutions have been studied in this domain to improve CVD detection and management. In this article, our objective is to detect the onset of CVD with high accuracy using biomarkers that are easily accessible in the community. We used three publicly available datasets from Kaggle and the UCI Repository to build machine learning models for the detection of CVD. Three feature subset selection methods (genetic algorithm, recursive feature elimination, and Chi-square-based techniques) were applied to the datasets in support of the ensemble-based Extreme Tree Classifier (ETC) model, which achieved the best results with 98.35%, 99.61%, and 94.02% accuracy on the three datasets. After feature subset selection, the RF-based model outperformed the existing models in the literature on all three datasets. We also identified the features (e.g., age, gender, fasting blood sugar, type of chest pain, cholesterol, and exercise-induced angina) that contributed to the improvement in the AI model's performance. We believe the proposed model will support the early detection of CVD with high accuracy in a clinical setup and reduce the healthcare burden.
AB - Cardiovascular diseases (CVDs) are the leading cause of death worldwide. Early detection of CVD is therefore crucial for managing health outcomes and reducing the burden on healthcare systems. Artificial intelligence-based solutions have been studied in this domain to improve CVD detection and management. In this article, our objective is to detect the onset of CVD with high accuracy using biomarkers that are easily accessible in the community. We used three publicly available datasets from Kaggle and the UCI Repository to build machine learning models for the detection of CVD. Three feature subset selection methods (genetic algorithm, recursive feature elimination, and Chi-square-based techniques) were applied to the datasets in support of the ensemble-based Extreme Tree Classifier (ETC) model, which achieved the best results with 98.35%, 99.61%, and 94.02% accuracy on the three datasets. After feature subset selection, the RF-based model outperformed the existing models in the literature on all three datasets. We also identified the features (e.g., age, gender, fasting blood sugar, type of chest pain, cholesterol, and exercise-induced angina) that contributed to the improvement in the AI model's performance. We believe the proposed model will support the early detection of CVD with high accuracy in a clinical setup and reduce the healthcare burden.
KW - artificial intelligence
KW - cardiovascular disease
KW - machine learning
UR - https://www.scopus.com/pages/publications/105018035423
U2 - 10.1109/ICoDSA67155.2025.11157545
DO - 10.1109/ICoDSA67155.2025.11157545
M3 - Conference contribution
AN - SCOPUS:105018035423
T3 - 2025 International Conference on Data Science and Its Applications, ICoDSA 2025
SP - 1261
EP - 1266
BT - 2025 International Conference on Data Science and Its Applications, ICoDSA 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 8th International Conference on Data Science and Its Applications, ICoDSA 2025
Y2 - 3 July 2025 through 5 July 2025
ER -