Part-of-Speech Tagging in Javanese Using Pre-Trained Bidirectional Encoder Representation Model from Transformers
DOI:
https://doi.org/10.33633/joins.v11i1.14923
Keywords:
BERT, deep learning, Javanese language, part-of-speech tagging
Abstract
Part-of-speech (POS) tagging assigns a word class to each token in a text and is a core step in natural language processing. For Javanese, POS tagging remains challenging because linguistic resources are limited and the language is complex. This study fine-tunes a pre-trained BERT (Bidirectional Encoder Representations from Transformers) model to classify word classes in Javanese, a low-resource language. The javanese-bert-small model was trained on the UD_Javanese-CSUI dataset and evaluated using precision, recall, F1-score, and accuracy. The model reached an accuracy of 88.87% and trained stably, without significant overfitting. These findings indicate that the BERT-based approach is effective in handling word-class ambiguity in Javanese and can serve as a stepping stone for further development of NLP systems for regional languages.
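The four evaluation metrics named in the abstract can be illustrated with a small token-level computation. The following is a minimal sketch, not the authors' evaluation code: the function name `evaluate_tags`, the macro averaging of F1, and the toy tag sequences (generic Universal Dependencies tags, not sentences from UD_Javanese-CSUI) are all assumptions made for illustration. The abstract does not state which averaging scheme was used.

```python
from collections import Counter


def evaluate_tags(gold, pred):
    """Return token-level accuracy, macro F1, and per-tag precision/recall/F1.

    Hypothetical helper for illustration; gold and pred are flat lists of
    tag strings aligned token by token.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for tag g
        else:
            fp[p] += 1   # tag p was predicted but is wrong
            fn[g] += 1   # tag g was missed

    per_tag = {}
    for tag in sorted(set(gold) | set(pred)):
        predicted = tp[tag] + fp[tag]
        actual = tp[tag] + fn[tag]
        precision = tp[tag] / predicted if predicted else 0.0
        recall = tp[tag] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_tag[tag] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(gold) if gold else 0.0
    # Macro average: every tag counts equally (an assumption, see lead-in).
    macro_f1 = (
        sum(m["f1"] for m in per_tag.values()) / len(per_tag) if per_tag else 0.0
    )
    return accuracy, macro_f1, per_tag


# Toy sequences (assumed): one NOUN token is mis-tagged as VERB.
gold = ["PRON", "VERB", "NOUN", "ADP", "NOUN", "PUNCT"]
pred = ["PRON", "VERB", "NOUN", "ADP", "VERB", "PUNCT"]

accuracy, macro_f1, per_tag = evaluate_tags(gold, pred)
print(f"accuracy = {accuracy:.4f}, macro F1 = {macro_f1:.4f}")
for tag, scores in per_tag.items():
    print(tag, scores)
```

In this toy case the single error lowers recall for NOUN and precision for VERB, which shows why per-tag precision and recall can reveal word-class confusions that a single accuracy figure hides.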
Copyright (c) 2026 JOINS (Journal of Information System)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

This work is licensed under a Creative Commons Attribution 4.0 International License.


















