Improvement of Accuracy and Handling of Missing Value Data in the Naive Bayes Kernel Algorithm

Bijanto Bijanto, Ryan Yunus

Abstract


Missing data can seriously affect the research process and classification results, leading to biased parameter estimates, loss of statistical information, decreased data quality, increased standard errors, and weak generalization of the findings. In this paper, we address a limitation of the Naive Bayes Kernel algorithm: it cannot process data that contain missing values. To handle such data, we propose mean imputation, which fills each missing value with the average of the existing values. The data are public data from UCI, namely the HCV (Hepatitis C Virus) dataset. Before mean imputation, bootstrapping is applied to the dataset. The imputed data are then processed with the Naive Bayes Kernel algorithm. In our tests, this approach achieved an accuracy of 96.05% with a data computation time of 1 second.
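The pipeline described in the abstract (bootstrapping, then mean imputation, then a kernel-based Naive Bayes classifier) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function and class names, the Gaussian kernel, the fixed bandwidth of 1.0, the per-column mean, and the toy data are all hypothetical and are not taken from the paper or from the HCV dataset.

```python
import math
import random


def bootstrap(rows, labels, seed=0):
    """Resample the dataset with replacement, keeping the original size."""
    rng = random.Random(seed)
    idx = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in idx], [labels[i] for i in idx]


def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Assumption: a column with no observed values falls back to 0.0.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


class KernelNaiveBayes:
    """Naive Bayes with a Gaussian kernel density estimate per class and feature."""

    def __init__(self, bandwidth=1.0):
        self.bandwidth = bandwidth  # assumed fixed; the paper does not state one

    def fit(self, rows, labels):
        self.by_class = {}
        for r, y in zip(rows, labels):
            self.by_class.setdefault(y, []).append(r)
        self.n = len(rows)
        return self

    def _log_kde(self, x, samples):
        h = self.bandwidth
        dens = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
        dens /= len(samples) * h * math.sqrt(2 * math.pi)
        return math.log(dens + 1e-300)  # small constant avoids log(0)

    def predict(self, row):
        best, best_score = None, -math.inf
        for y, members in self.by_class.items():
            # Log prior plus the sum of per-feature log densities.
            score = math.log(len(members) / self.n)
            for j, x in enumerate(row):
                score += self._log_kde(x, [m[j] for m in members])
            if score > best_score:
                best, best_score = y, score
        return best


# Toy data (hypothetical); None marks a missing value.
rows = [[1.0, 10.0], [1.2, None], [0.9, 11.0],
        [5.0, 30.0], [None, 29.0], [5.2, 31.0]]
labels = [0, 0, 0, 1, 1, 1]

boot_rows, boot_labels = bootstrap(rows, labels)
model = KernelNaiveBayes().fit(mean_impute(boot_rows), boot_labels)
prediction = model.predict([1.1, 10.5])
```

The imputation step runs before the classifier is fitted because the kernel density estimate needs a numeric value for every feature of every training row.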






DOI: https://doi.org/10.33633/jais.v6i2.5288


 

 

 

 

Journal of Applied Intelligent System (e-ISSN: 2502-9401, p-ISSN: 2503-0493) is published by the Department of Informatics, Universitas Dian Nuswantoro Semarang, and IndoCEISS.

  

 



This journal is licensed under a Creative Commons Attribution 4.0 International License.
