Scaling Technique Using Robust Scaler to Handle Data Outliers in a Heart Attack Prediction Model
DOI:
https://doi.org/10.62411/tc.v23i2.10463
Keywords:
Heart attack prediction, Outlier data, Robust scaler, Classification
Abstract
Heart attacks are one of the leading causes of death worldwide, and diagnosing them requires sophisticated procedures that can significantly increase costs. Predicting heart disease is a major challenge in healthcare because the diagnostic equipment available for this disease is limited. Accurate prediction of heart disease is essential for treating patients before a heart attack occurs. Such prediction can be achieved with an optimal machine learning model trained on rich healthcare datasets about heart disease. However, heart disease prediction models commonly face problems such as extreme deviations in the data (outliers), missing data, inconsistent data, and data that mixes numerical and categorical values. Inconsistent data raises the likelihood of prediction errors and affects the prediction results. In this study we attempt to address the outlier problem in a heart disease dataset using one feature scaling method, the robust scaler. In experiments with a K-Nearest Neighbors classification model, scaling with the robust scaler gave better results than omitting it, with an F1 score of 0.86.
References
M. M. Ahsan, M. A. P. Mahmud, P. K. Saha, K. D. Gupta, and Z. Siddique, “Effect of Data Scaling Methods on Machine Learning Algorithms and Model Performance,” Technologies (Basel), vol. 9, no. 3, pp. 5–9, 2021, doi: 10.3390/technologies9030052.
F. Ali et al., “A smart healthcare monitoring system for heart disease prediction based on ensemble deep learning and feature fusion,” Information Fusion, vol. 63, pp. 208–222, 2020, doi: 10.1016/j.inffus.2020.06.008.
J. S. Soni, U. Ansari, D. Sharma, and S. Soni, “Predictive Data Mining for Medical Diagnosis: An Overview of Heart Disease Prediction,” International Journal of Computer Applications (0975-8887), vol. 17, no. 8, pp. 43–48, 2011, doi: 10.4337/9781848442986.00014.
M. Ozcan and S. Peker, “A classification and regression tree algorithm for heart disease modeling and prediction,” Healthcare Analytics, vol. 3, no. December 2022, p. 100130, 2023, doi: 10.1016/j.health.2022.100130.
W. P. Lord and D. C. Wiggins, Medical decision support systems, G. Spekowi. Springer: Berlin/Heidelberg, 2006.
R. Williams, T. Shongwe, A. N. Hasan, and V. Rameshar, “Heart Disease Prediction using Machine Learning Techniques,” 2021 International Conference on Data Analytics for Business and Industry, ICDABI 2021, no. October, pp. 118–123, 2021, doi: 10.1109/ICDABI53623.2021.9655783.
P. Ghosh et al., “Efficient prediction of cardiovascular disease using machine learning algorithms with relief and lasso feature selection techniques,” IEEE Access, vol. 9, pp. 19304–19326, 2021, doi: 10.1109/ACCESS.2021.3053759.
A. U. Haq, J. P. Li, M. H. Memon, S. Nazir, R. Sun, and I. García-Magariño, “A hybrid intelligent system framework for the prediction of heart disease using machine learning algorithms,” Mobile Information Systems, vol. 2018, 2018, doi: 10.1155/2018/3860146.
M. S. Maulana, R. Sabarudin, and W. Nugraha, “Prediksi Ketepatan Kelulusan Mahasiswa Diploma dengan Komparasi Algoritma Klasifikasi,” JUSTIN, vol. 07, no. 03, pp. 202–206, 2019.
H. Qian, Q. Wen, L. Sun, J. Gu, Q. Niu, and Z. Tang, “RobustScaler: QoS-Aware Autoscaling for Complex Workloads,” Apr. 2022, [Online]. Available: http://arxiv.org/abs/2204.07197
K. V. A. Reddy, S. R. Ambati, Y. S. Rithik Reddy, and A. N. Reddy, “AdaBoost for Parkinson’s Disease Detection using Robust Scaler and SFS from Acoustic Features,” in 2021 Smart Technologies, Communication and Robotics (STCR), IEEE, Oct. 2021, pp. 1–6. doi: 10.1109/STCR51658.2021.9588906.
S. Yulianto and K. H. Hidayatullah, “Analisis klaster untuk pengelompokan kabupaten/kota di provinsi Jawa Tengah berdasarkan indikator kesejahteraan rakyat,” J. Statistika. Univ. Muhammadiyah Semarang, vol. 2, no. 1, pp. 56–63, May 2014.
X. H. Cao, I. Stojkovic, and Z. Obradovic, “A robust data scaling algorithm to improve classification accuracies in biomedical data,” BMC Bioinformatics, vol. 17, no. 1, Sep. 2016, doi: 10.1186/s12859-016-1236-x.
Q. H. Nguyen et al., “Influence of data splitting on performance of machine learning models in prediction of shear strength of soil,” Math Probl Eng, vol. 2021, 2021, doi: 10.1155/2021/4832864.
A. Luque, A. Carrasco, A. Martín, and A. de las Heras, “The impact of class imbalance in classification performance metrics based on the binary confusion matrix,” Pattern Recognit, vol. 91, pp. 216–231, 2019, doi: 10.1016/j.patcog.2019.02.023.
License Terms
All articles published in Techno.COM Journal are licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). This means:
1. Attribution
Readers and users are free to:
- Share – Copy and redistribute the material in any medium or format.
- Adapt – Remix, transform, and build upon the material.
As long as proper credit is given to the original work by citing the author(s) and the journal.
2. Non-Commercial Use
- The material cannot be used for commercial purposes.
- Commercial use includes selling the content, using it in commercial advertising, or integrating it into products/services for profit.
3. Rights of Authors
- Authors retain copyright and grant Techno.COM Journal the right to publish the article.
- Authors can distribute their work (e.g., in institutional repositories or personal websites) with proper acknowledgment of the journal.
4. No Additional Restrictions
- The journal cannot apply legal terms or technological measures that restrict others from using the material in ways allowed by the license.
5. Disclaimer
- The journal is not responsible for how the published content is used by third parties.
- The opinions expressed in the articles are solely those of the authors.
For more details, visit the Creative Commons License Page:
https://creativecommons.org/licenses/by-nc/4.0/