Manhattan, Euclidean And Chebyshev Methods In K-Means Algorithm For Village Status Grouping In Aceh Province

DOI:

https://doi.org/10.33633/jais.v7i3.7037

Abstract

The Ministry of Villages, Development of Disadvantaged Regions and Transmigration (Ministry of Villages PDTT) is the Indonesian government ministry in charge of developing villages and rural areas, empowering rural communities, accelerating the development of disadvantaged areas, and transmigration. The 2014 Village Potential Data (Podes 2014), released by the Central Statistics Agency in collaboration with the Ministry of Villages PDTT, is unlabeled (unsupervised) data covering 6,474 villages in the province of Aceh. Podes 2014 describes the level of village development (village specific) in Indonesia, using the village as the unit of analysis. Data mining can be used to group the objects in a dataset into classes that share the same criteria (clustering). One algorithm suited to clustering is k-means, which groups data by assigning each data point to the centroid at the shortest distance. This study compares three distance calculation methods for k-means: Manhattan, Euclidean, and Chebyshev. The methods are evaluated by execution time and by the Davies-Bouldin index. The tests yield five clusters: 2,639 developing villages, 1,188 independent villages, 1,182 very underdeveloped villages, 1,266 developed villages, and 199 disadvantaged villages. The Chebyshev distance has the lowest accumulated execution time compared with Manhattan and Euclidean, while the Euclidean distance gives the best Davies-Bouldin index.
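As an illustration of the comparison described in the abstract, the following is a minimal Python sketch of k-means with a pluggable distance function, together with a Davies-Bouldin index computation. It is not the authors' implementation. The function names, the mean-based centroid update used for all three distances, the stopping rule, and the toy data are assumptions made for this example only.

```python
import math

def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chebyshev(a, b):
    # Largest absolute coordinate difference.
    return max(abs(x - y) for x, y in zip(a, b))

def assign_clusters(points, centroids, distance):
    # Each point gets the index of its nearest centroid.
    return [
        min(range(len(centroids)), key=lambda j: distance(p, centroids[j]))
        for p in points
    ]

def update_centroids(points, labels, centroids):
    # Assumption: the centroid is the coordinate-wise mean for every metric.
    # An empty cluster keeps its previous centroid.
    updated = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            updated.append(old)
    return updated

def kmeans(points, centroids, distance, max_iter=100):
    # Alternate assignment and update until the labels stop changing.
    labels = None
    for _ in range(max_iter):
        new_labels = assign_clusters(points, centroids, distance)
        if new_labels == labels:
            break
        labels = new_labels
        centroids = update_centroids(points, labels, centroids)
    return labels, centroids

def davies_bouldin(points, labels, centroids, distance=euclidean):
    # Lower is better. Assumes at least two clusters, none of them empty.
    k = len(centroids)
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        scatter.append(sum(distance(p, centroids[j]) for p in members) / len(members))
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / distance(centroids[i], centroids[j])
            for j in range(k) if j != i
        )
    return total / k

# Toy data (hypothetical, not Podes 2014): two well-separated groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
initial = [(0, 0), (10, 10)]
for name, dist in [("manhattan", manhattan), ("euclidean", euclidean), ("chebyshev", chebyshev)]:
    labels, centroids = kmeans(points, initial, dist)
    print(name, labels, davies_bouldin(points, labels, centroids))
```

Passing the distance as a parameter keeps the clustering loop identical across the three methods, so differences in execution time or index value come from the distance calculation alone. Timing each call, for example with `time.perf_counter`, would give the execution-time side of the comparison.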

Published

2022-12-28