A Comparative Study of Multi-Label Classification for Document Labeling in Ethical Protocol Review


  • Rizka Wakhidatus Sholikah, Institut Teknologi Sepuluh Nopember
  • Diana Purwitasari, Institut Teknologi Sepuluh Nopember
  • Mohammad Zaenuddin Hamidi, Institut Teknologi Sepuluh Nopember




Ethical protocol, multi-label classification, automatic labeling


An ethical clearance document certifies that a research project will protect its subjects in accordance with established ethical principles. Ethical clearance is issued by the Research Ethics Commission (KEP), which reviews each proposed ethical protocol against the seven standards contained in a protocol. KEP currently performs this review manually. Because a large number of protocols must be reviewed, the process often becomes a bottleneck: obtaining ethical clearance takes a long time, which can delay the research schedule. This study therefore compares multi-label classification methods for automating the ethical protocol review process. Automatic labeling can make the review more effective because it gives the reviewer an overview of a document's labels before the more in-depth review begins. The experimental results show that the traditional machine learning approach performs better than the deep learning approach. The best-performing method is Naïve Bayes with bag-of-words (BoW) features, which achieves precision, recall, and F-score values of 0.76, 0.80, and 0.78, respectively.
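As a rough illustration of the kind of pipeline the abstract names (Naïve Bayes on BoW features, scored by precision, recall, and F-score), the following minimal sketch trains one binary Naïve Bayes model per label and computes micro-averaged scores. Several parts are assumptions, not details from the study: the one-model-per-label (binary relevance) decomposition, whitespace tokenization, Laplace smoothing, micro-averaging, and the toy texts and label names ("consent", "risk"), which are purely hypothetical.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words: lowercase whitespace tokens mapped to their counts."""
    return Counter(text.lower().split())

class BinaryNB:
    """Multinomial Naive Bayes for a single yes/no label (Laplace smoothing)."""

    def fit(self, docs, targets):
        self.vocab = {w for d in docs for w in d}
        self.counts = {0: Counter(), 1: Counter()}  # word counts per class
        self.n_docs = {0: 0, 1: 0}                  # document counts per class
        for d, t in zip(docs, targets):
            self.counts[t].update(d)
            self.n_docs[t] += 1
        return self

    def _log_score(self, doc, c):
        total = sum(self.counts[c].values()) + len(self.vocab)
        prior = (self.n_docs[c] + 1) / (sum(self.n_docs.values()) + 2)
        score = math.log(prior)
        for w, n in doc.items():
            if w in self.vocab:  # words unseen in training are ignored
                score += n * math.log((self.counts[c][w] + 1) / total)
        return score

    def predict(self, doc):
        return int(self._log_score(doc, 1) > self._log_score(doc, 0))

def fit_binary_relevance(texts, label_sets, labels):
    """Assumed decomposition: one independent binary model per label."""
    docs = [bow(t) for t in texts]
    return {lab: BinaryNB().fit(docs, [int(lab in s) for s in label_sets])
            for lab in labels}

def predict_labels(models, text):
    doc = bow(text)
    return {lab for lab, m in models.items() if m.predict(doc)}

def micro_prf(true_sets, pred_sets):
    """Micro-averaged precision, recall, and F1 over all (document, label) pairs."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    fp = sum(len(p - t) for t, p in zip(true_sets, pred_sets))
    fn = sum(len(t - p) for t, p in zip(true_sets, pred_sets))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical toy data; real protocols and their seven standards are not shown here.
train_texts = [
    "informed consent form for participants",
    "risk of harm to participants",
    "consent and risk described",
    "budget and schedule",
]
train_labels = [{"consent"}, {"risk"}, {"consent", "risk"}, set()]

models = fit_binary_relevance(train_texts, train_labels, ["consent", "risk"])
predicted = predict_labels(models, "informed consent form")
print(predicted)
print(micro_prf([{"consent"}], [predicted]))
```

If the reported F-score is the usual harmonic mean of precision and recall, the abstract's figures are consistent with one another: 2 × 0.76 × 0.80 / (0.76 + 0.80) ≈ 0.78.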







