Language-Similarity-Guided Transfer Fine-Tuning of Pre-trained Transformer Models for Sentiment Analysis Across 12 Indonesian Regional Languages

Authors

  • Brian Rizqi Paradisiaca Darnoto, University of Jember
  • Dony Bahtera Firmawan, University of Jember

DOI:

https://doi.org/10.62411/jcta.15975

Keywords:

Indonesian regional languages, Low-resource NLP, NusaX, Pre-trained language models, Sentiment analysis, SHAP explainability, Transfer learning, XLM-R

Abstract

Sentiment analysis for Indonesian regional languages faces two persistent challenges: labeled training data is extremely limited for most regional varieties, and transformer models pre-trained on Bahasa Indonesia do not generalize reliably to languages with substantially different morphological structures. Prior work on the NusaX benchmark has primarily relied on direct fine-tuning, treating each regional language independently without exploiting linguistic proximity between related languages as a transfer signal. This paper proposes Language-Similarity-Guided Transfer (LSGT), a sequential fine-tuning strategy that first adapts a pre-trained model to a pivot language selected by character trigram similarity and then fine-tunes on the target language. Four transformer models (IndoBERT, NusaBERT, mBERT, and XLM-R) are evaluated across all 12 NusaX languages using the official train/validation/test splits. Performance is measured with four metrics: accuracy, macro F1, macro precision, and macro recall. Experimental results show that LSGT improves macro F1 in 44 of 48 model-language combinations, demonstrating that the fine-tuning strategy itself is a major factor in low-resource cross-lingual sentiment classification. XLM-R benefits most strongly from LSGT, achieving an average improvement of +0.137 macro F1 and a peak gain of +0.298 on Madurese. SHAP-based token attribution analysis further reveals that predictions rely heavily on named entities and domain-specific nouns rather than sentiment-bearing vocabulary, indicating a dataset-level bias inherited from the original SmSA corpus and propagated through the NusaX translation pipeline.
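The abstract describes pivot-language selection via character trigram similarity but does not specify the exact similarity measure. As a minimal sketch of the idea, the snippet below compares trigram frequency profiles with cosine similarity and picks the closest candidate language; the metric choice and the function names (`trigram_profile`, `select_pivot`) are illustrative assumptions, not the paper's actual implementation.

```python
from collections import Counter
from math import sqrt

def trigram_profile(text: str) -> Counter:
    """Count overlapping character trigrams in a lowercased text."""
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def cosine_similarity(p: Counter, q: Counter) -> float:
    """Cosine similarity between two trigram frequency profiles."""
    dot = sum(p[g] * q[g] for g in set(p) & set(q))
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

def select_pivot(target_corpus: str, candidate_corpora: dict) -> str:
    """Return the candidate language whose trigram profile is closest to the target's."""
    tp = trigram_profile(target_corpus)
    return max(candidate_corpora,
               key=lambda lang: cosine_similarity(tp, trigram_profile(candidate_corpora[lang])))
```

Under this sketch, fine-tuning would proceed first on the language returned by `select_pivot` and then on the target language, as the LSGT strategy prescribes.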

Author Biographies

Brian Rizqi Paradisiaca Darnoto, University of Jember

Informatics Department, University of Jember, Jember 68121, Indonesia

Dony Bahtera Firmawan, University of Jember

Informatics Department, University of Jember, Jember 68121, Indonesia

References

B. Liu, Sentiment Analysis: Mining Opinions, Sentiments, and Emotions. Cambridge University Press, 2020. doi: 10.1017/9781108639286.

W. Medhat, A. Hassan, and H. Korashy, “Sentiment analysis algorithms and applications: A survey,” Ain Shams Eng. J., vol. 5, no. 4, pp. 1093–1113, Dec. 2014, doi: 10.1016/j.asej.2014.04.011.

M. Wankhade, A. C. S. Rao, and C. Kulkarni, “A survey on sentiment analysis methods, applications, and challenges,” Artif. Intell. Rev., vol. 55, no. 7, pp. 5731–5780, Oct. 2022, doi: 10.1007/s10462-022-10144-1.

S. Zein, Language Policy in Superdiverse Indonesia. New York: Routledge, 2020. doi: 10.4324/9780429019739.

A. F. Aji et al., “One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia,” in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022, pp. 7226–7249. doi: 10.18653/v1/2022.acl-long.500.

G. I. Winata et al., “NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages,” in Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023, pp. 815–834. doi: 10.18653/v1/2023.eacl-main.57.

J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Jun. 2019, pp. 4171–4186. doi: 10.18653/v1/N19-1423.

B. Wilie et al., “IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding,” in Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020, pp. 843–857. doi: 10.18653/v1/2020.aacl-main.85.

W. Wongso, D. S. Setiawan, S. Limcorn, and A. Joyoadikusumo, “NusaBERT: Teaching IndoBERT to be Multilingual and Multicultural,” arXiv. Mar. 04, 2024. [Online]. Available: http://arxiv.org/abs/2403.01817

T. Pires, E. Schlinger, and D. Garrette, “How Multilingual is Multilingual BERT?,” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019, pp. 4996–5001. doi: 10.18653/v1/P19-1493.

A. Conneau et al., “Unsupervised Cross-lingual Representation Learning at Scale,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020, pp. 8440–8451. doi: 10.18653/v1/2020.acl-main.747.

T. D. Purnomo and J. Sutopo, “Comparison of Pre-Trained Bert-Based Transformer Models for Regional Language Text Sentiment Analysis in Indonesia,” Int. J. Sci. Technol., vol. 3, no. 3, pp. 11–21, Nov. 2024, doi: 10.56127/ijst.v3i3.1739.

J. Phang et al., “English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too,” in Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020, pp. 557–575. doi: 10.18653/v1/2020.aacl-main.56.

S. Wu and M. Dredze, “Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019, pp. 833–844. doi: 10.18653/v1/D19-1077.

S. M. Lundberg and S.-I. Lee, “A Unified Approach to Interpreting Model Predictions,” in NIPS’17: Proceedings of the 31st International Conference on Neural Information Processing Systems, Nov. 2017, pp. 4768–4777. [Online]. Available: http://arxiv.org/abs/1705.07874

M. Aufar, R. Andreswari, and D. Pramesti, “Sentiment Analysis on Youtube Social Media Using Decision Tree and Random Forest Algorithm: A Case Study,” in 2020 International Conference on Data Science and Its Applications (ICoDSA), Aug. 2020, pp. 1–7. doi: 10.1109/ICoDSA50139.2020.9213078.

H. A. Santoso, E. H. Rachmawanto, A. Nugraha, A. A. Nugroho, D. R. I. M. Setiadi, and R. S. Basuki, “Hoax classification and sentiment analysis of Indonesian news using Naive Bayes optimization,” TELKOMNIKA (Telecommunication Computing Electronics and Control), vol. 18, no. 2, p. 799, Apr. 2020, doi: 10.12928/telkomnika.v18i2.14744.

S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997, doi: 10.1162/neco.1997.9.8.1735.

A. Purwarianti and I. A. P. A. Crisdayanti, “Improving Bi-LSTM Performance for Indonesian Sentiment Analysis Using Paragraph Vector,” in 2019 International Conference of Advanced Informatics: Concepts, Theory and Applications (ICAICTA), Sep. 2019, pp. 1–5. doi: 10.1109/ICAICTA.2019.8904199.

S. Styawati, A. Nurkholis, A. A. Aldino, S. Samsugi, E. Suryati, and R. P. Cahyono, “Sentiment Analysis on Online Transportation Reviews Using Word2Vec Text Embedding Model Feature Extraction and Support Vector Machine (SVM) Algorithm,” in 2021 International Seminar on Machine Learning, Optimization, and Data Science (ISMODE), Jan. 2022, pp. 163–167. doi: 10.1109/ISMODE53584.2022.9742906.

H. Murfi, Syamsyuriani, T. Gowandi, G. Ardaneswari, and S. Nurrohmah, “BERT-based combination of convolutional and recurrent neural network for Indonesian sentiment analysis,” Appl. Soft Comput., vol. 151, p. 111112, Jan. 2024, doi: 10.1016/j.asoc.2023.111112.

H. Ahmadian, T. F. Abidin, H. Riza, and K. Muchtar, “Hybrid Models for Emotion Classification and Sentiment Analysis in Indonesian Language,” Appl. Comput. Intell. Soft Comput., vol. 2024, no. 1, Jan. 2024, doi: 10.1155/2024/2826773.

K. S. Nugroho, A. Y. Sukmadewa, H. Wuswilahaken DW, F. A. Bachtiar, and N. Yudistira, “BERT Fine-Tuning for Sentiment Analysis on Indonesian Mobile Apps Reviews,” in 6th International Conference on Sustainable Information Engineering and Technology 2021, Sep. 2021, pp. 258–264. doi: 10.1145/3479645.3479679.

P. Subarkah, P. Arsi, D. I. S. Saputra, A. Aminuddin, Berlilana, and N. Hermanto, “Indonesian Police in the Twitterverse: A Sentiment Analysis Perspectives,” in 2023 IEEE 7th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE), Nov. 2023, pp. 76–81. doi: 10.1109/ICITISEE58992.2023.10405357.

A. Angdresey, L. Sitanayah, and I. L. H. Tangka, “Sentiment Analysis for Political Debates on YouTube Comments using BERT Labeling, Random Oversampling, and Multinomial Naïve Bayes,” J. Comput. Theor. Appl., vol. 2, no. 3, pp. 342–354, Jan. 2025, doi: 10.62411/jcta.11668.

A. Vaswani et al., “Attention Is All You Need,” in Advances in Neural Information Processing Systems, vol. 30, 2017. [Online]. Available: http://arxiv.org/abs/1706.03762

F. Koto, A. Rahimi, J. H. Lau, and T. Baldwin, “IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP,” arXiv. Nov. 02, 2020. [Online]. Available: http://arxiv.org/abs/2011.00677

T. Nguyen et al., “CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages,” in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), May 2024, pp. 4226–4237. doi: 10.63317/5iz6z5g7eit3.

A. Kumar and V. H. C. Albuquerque, “Sentiment Analysis Using XLM-R Transformer and Zero-shot Transfer Learning on Resource-poor Indian Language,” ACM Trans. Asian Low-Resource Lang. Inf. Process., vol. 20, no. 5, pp. 1–13, Sep. 2021, doi: 10.1145/3461764.

P. Přibáň, J. Šmíd, J. Steinberger, and A. Mištera, “A comparative study of cross-lingual sentiment analysis,” Expert Syst. Appl., vol. 247, p. 123247, Aug. 2024, doi: 10.1016/j.eswa.2024.123247.

M. E. Peters, S. Ruder, and N. A. Smith, “To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks,” in Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), 2019, pp. 7–14. doi: 10.18653/v1/W19-4302.

S. Ruder, I. Vulić, and A. Søgaard, “A Survey of Cross-lingual Word Embedding Models,” J. Artif. Intell. Res., vol. 65, pp. 569–631, Aug. 2019, doi: 10.1613/jair.1.11640.

J. Barnes, R. Klinger, and S. Schulte im Walde, “Bilingual Sentiment Embeddings: Joint Projection of Sentiment Across Languages,” in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 2483–2493. doi: 10.18653/v1/P18-1231.

N. D. A. Saputra, M. Muljono, A. Karim, and D. R. I. M. Setiadi, “End-to-End Fine-Tuning of DeBERTa-Base for Stance Detection,” J. Futur. Artif. Intell. Technol., vol. 2, no. 4, pp. 698–715, Feb. 2026, doi: 10.62411/faith.3048-3719-168.

E. J. Hu et al., “LoRA: Low-Rank Adaptation of Large Language Models,” arXiv. Oct. 16, 2021. [Online]. Available: http://arxiv.org/abs/2106.09685

M. Sundararajan, A. Taly, and Q. Yan, “Axiomatic Attribution for Deep Networks,” in ICML’17: Proceedings of the 34th International Conference on Machine Learning - Volume 70, Jun. 2017, pp. 3319–3328. [Online]. Available: http://arxiv.org/abs/1703.01365

S. Jain and B. C. Wallace, “Attention is not Explanation,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 3543–3556. doi: 10.18653/v1/N19-1357.

S. Wiegreffe and Y. Pinter, “Attention is not not Explanation,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019, pp. 11–20. doi: 10.18653/v1/D19-1002.

A. Barredo Arrieta et al., “Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI,” Inf. Fusion, vol. 58, pp. 82–115, Jun. 2020, doi: 10.1016/j.inffus.2019.12.012.

J. B. Oluwagbemi, A. E. Mesioye, and R. S. Akinbo, “Depress-HybridNet: A Linguistic-Behavioral Hybrid Framework for Early and Accurate Depression Detection on Social Media,” J. Futur. Artif. Intell. Technol., vol. 2, no. 3, pp. 432–444, Sep. 2025, doi: 10.62411/faith.3048-3719-266.


Published

2026-05-07

How to Cite

Darnoto, B. R. P., & Firmawan, D. B. (2026). Language-Similarity-Guided Transfer Fine-Tuning of Pre-trained Transformer Models for Sentiment Analysis Across 12 Indonesian Regional Languages. Journal of Computing Theories and Applications, 3(4), 547–563. https://doi.org/10.62411/jcta.15975