Part-of-Speech Tagging of Javanese Using the Pre-Trained Bidirectional Encoder Representations from Transformers Model
DOI:
https://doi.org/10.33633/joins.v11i1.14923
Keywords:
Javanese, deep learning, BERT, part-of-speech tagging
Abstract
Part-of-speech (POS) tagging is the process of assigning word classes to the words in a text and is a core task in natural language processing (NLP). For Javanese, POS tagging remains challenging because linguistic resources are limited and the language is complex. This study fine-tunes BERT (Bidirectional Encoder Representations from Transformers) to tag word classes in Javanese, a low-resource language. The javanese-bert-small model was trained on the UD_Javanese-CSUI dataset and evaluated with precision, recall, F1-score, and accuracy. The model reached an accuracy of 88.87% and trained stably, with no significant overfitting. These findings indicate that a BERT-based approach is effective at handling word-class ambiguity in Javanese and can serve as a foundation for further development of NLP systems for regional languages.
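The four evaluation metrics named in the abstract can be illustrated with a small token-level sketch. This is not the authors' evaluation code: the function name `tagging_metrics`, the toy tag sequences, and the choice of macro-averaging over tags are all assumptions made for illustration, since the abstract does not say how per-tag scores were averaged.

```python
from collections import Counter


def tagging_metrics(gold, pred):
    """Accuracy plus macro-averaged precision, recall, and F1 for flat tag lists.

    Hypothetical helper; macro-averaging is an assumption, not the paper's
    stated protocol.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("gold and pred must not be empty")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1      # correct prediction for tag g
        else:
            fp[p] += 1      # tag p was predicted but is wrong
            fn[g] += 1      # tag g was missed

    tags = sorted(set(gold) | set(pred))
    precisions, recalls, f1s = [], [], []
    for t in tags:
        p_den = tp[t] + fp[t]
        r_den = tp[t] + fn[t]
        prec = tp[t] / p_den if p_den else 0.0
        rec = tp[t] / r_den if r_den else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(tags)
    return {
        "accuracy": sum(tp.values()) / len(gold),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with illustrative universal POS tags (not data from the paper).
gold = ["PRON", "VERB", "NOUN", "PUNCT"]
pred = ["PRON", "VERB", "VERB", "PUNCT"]
metrics = tagging_metrics(gold, pred)
print(metrics)
```

In this toy case one of four tokens is mistagged, so accuracy is 0.75, while macro precision drops further because the never-predicted NOUN tag contributes a precision of zero. Weighted or micro averaging would give different precision, recall, and F1 values from the same predictions.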
Copyright (c) 2026 JOINS (Journal of Information System)

This article is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.