Manhattan, Euclidean And Chebyshev Methods In K-Means Algorithm For Village Status Grouping In Aceh Province
DOI: https://doi.org/10.33633/jais.v7i3.7037
Abstract
The Ministry of Villages, Development of Disadvantaged Regions and Transmigration (Ministry of Villages PDTT) is the Indonesian government ministry responsible for developing villages and rural areas, empowering rural communities, accelerating the development of disadvantaged areas, and transmigration. The 2014 Village Potential Data (Podes 2014), released by the Central Statistics Agency in collaboration with the Ministry of Villages PDTT, is unlabeled data covering 6,474 villages in Aceh Province. Podes 2014 describes the level of village development (village specific) in Indonesia, with the village as the unit of analysis. Clustering is a data mining method that groups the objects in a dataset into classes whose members share the same criteria, and k-means is one of the algorithms that can be used for it. K-means assigns each data point to the centroid at the shortest distance from it. This study compares three distance calculation methods within k-means: Manhattan, Euclidean and Chebyshev. The methods are evaluated by execution time and by the Davies-Bouldin index. The resulting clusters contain 2,639 developing villages, 1,188 independent villages, 1,182 very underdeveloped villages, 1,266 developed villages and 199 disadvantaged villages. The Chebyshev distance calculation has the lowest cumulative execution time compared with Manhattan and Euclidean, while the Euclidean method yields the best Davies-Bouldin index.
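The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the mean-based centroid update used for all three metrics, and the toy data are assumptions made only for illustration, and the Podes 2014 attributes are not reproduced here. The sketch shows the three distance functions, a k-means loop that takes the distance as a parameter, and one common formulation of the Davies-Bouldin index (lower values indicate more compact, better separated clusters).

```python
import math


def manhattan(p, q):
    # Sum of absolute coordinate differences (L1).
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    # Square root of the summed squared differences (L2).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def chebyshev(p, q):
    # Largest absolute coordinate difference (L-infinity).
    return max(abs(a - b) for a, b in zip(p, q))


def assign(points, centroids, dist):
    # For each point, the index of the nearest centroid under `dist`.
    return [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points]


def update(points, labels, centroids):
    # Assumption: every metric uses the coordinate-wise mean as the new
    # centroid; an empty cluster keeps its previous centroid.
    new = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            new.append(tuple(old))
    return new


def kmeans(points, centroids, dist, max_iter=100):
    # Alternate assignment and update until the centroids stop moving.
    centroids = [tuple(c) for c in centroids]
    labels = assign(points, centroids, dist)
    for _ in range(max_iter):
        new = update(points, labels, centroids)
        if new == centroids:
            break
        centroids = new
        labels = assign(points, centroids, dist)
    return labels, centroids


def davies_bouldin(points, labels, centroids, dist=euclidean):
    # Mean, over clusters, of the worst (scatter_i + scatter_j) / separation_ij.
    # Requires at least two clusters with distinct centroids.
    k = len(centroids)
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        scatter.append(sum(dist(p, centroids[j]) for p in members) / len(members)
                       if members else 0.0)
    worst = [max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k


# Toy two-dimensional data (hypothetical, not village records).
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_start = [(0, 0), (10, 10)]
for name, dist in [("manhattan", manhattan),
                   ("euclidean", euclidean),
                   ("chebyshev", chebyshev)]:
    toy_labels, toy_centroids = kmeans(toy_points, toy_start, dist)
    print(name, toy_labels, davies_bouldin(toy_points, toy_labels, toy_centroids))
```

Because the distance is passed as an argument, execution time per metric could be measured by wrapping each `kmeans` call with `time.perf_counter()`. Note that the coordinate-wise mean minimizes within-cluster cost only for squared Euclidean distance, so using it with Manhattan or Chebyshev is a simplification.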
License
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).