A Systematic Evaluation of BERT Classifiers for Indonesia-based Text Data

Yogie Oktavianus Sihombing; Khusnul Muchlisin; Tri Fidrian Arya; Moh. Jabir Mubarok; Reza Fuad Rachmadi

doi:10.62411/tc.v25i2.15843

Authors

Yogie Oktavianus Sihombing, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia
Khusnul Muchlisin, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia
Tri Fidrian Arya, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia
Moh. Jabir Mubarok, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia
Reza Fuad Rachmadi, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia

DOI:

https://doi.org/10.62411/tc.v25i2.15843

Abstract

This study presents a systematic evaluation of Indonesian BERT models across multiple natural language processing (NLP) tasks: named entity recognition (NER), sentiment analysis (SA), emotion classification (EmoT), and hate speech detection (HS). Unlike prior studies that focus primarily on effectiveness metrics, this work measures both effectiveness (F1-Macro and accuracy) and efficiency (training time and memory usage) to provide a more comprehensive benchmark. Experimental results show that IndoRoBERTa achieves the highest overall F1-Macro (0.826), indicating strong generalization across tasks, while IndoNLU attains the highest accuracy (0.833), suggesting better performance on dominant classes. IndoLEM is the most efficient model, with the lowest training time (988.68 seconds) and the lowest GPU memory usage (4.00 GB), making it suitable for resource-constrained environments. In contrast, the multilingual mBERT model incurs a higher computational cost. The findings highlight a trade-off between effectiveness and computational efficiency: no single model leads on every metric, yet monolingual Indonesian models consistently outperform the multilingual model in both effectiveness and resource utilization. These results provide practical guidance for selecting a pretrained language model according to task requirements and computational constraints in Indonesian NLP applications.

Keywords: BERT; Indonesian NLP; model efficiency; multi-task evaluation
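
The abstract reports both F1-Macro and accuracy and notes that high accuracy can reflect good performance on dominant classes. The following is a minimal sketch of why the two metrics can diverge on imbalanced data. It is not the authors' evaluation code: the function names and the toy labels (`neutral`, `hate`) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in gold or predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # gold class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced toy set: 8 majority-class and 2 minority-class examples.
y_true = ["neutral"] * 8 + ["hate"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["neutral"] * 10

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

In this toy case the classifier never detects the minority class, so its accuracy stays high while its macro-averaged F1 drops sharply, because each class contributes equally to the macro average regardless of its frequency.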

Published

2026-05-28

Issue

Vol. 25 No. 2 (2026): May 2026

Section

Articles

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

License Terms

All articles published in Techno.COM Journal are licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). This means:

1. Attribution

Readers and users are free to:

Share – Copy and redistribute the material in any medium or format.
Adapt – Remix, transform, and build upon the material.

These freedoms apply as long as proper credit is given to the original work by citing the author(s) and the journal.

2. Non-Commercial Use

The material cannot be used for commercial purposes.
Commercial use includes selling the content, using it in commercial advertising, or integrating it into products/services for profit.

3. Rights of Authors

Authors retain copyright and grant Techno.COM Journal the right to publish the article.
Authors can distribute their work (e.g., in institutional repositories or personal websites) with proper acknowledgment of the journal.

4. No Additional Restrictions

The journal cannot apply legal terms or technological measures that restrict others from using the material in ways allowed by the license.

5. Disclaimer

The journal is not responsible for how the published content is used by third parties.
The opinions expressed in the articles are solely those of the authors.

For more details, visit the Creative Commons License Page:
https://creativecommons.org/licenses/by-nc/4.0/
