Indonesian-Language Multi-Document Text Summarization Using Sentence Scoring and SVM
DOI:
https://doi.org/10.62411/tc.v23i1.9648
Keywords:
Text Summarization, Multi-Document, SVM, Online News
Abstract
Online news comes from a wide range of news portals that are freely available on the internet. However, the abundance of online news can come at the expense of detail and accuracy, because the aim is to deliver as much up-to-date information as possible. The sheer volume of online news can lead to information overload and leave readers with an unclear understanding of the substance of the news. It is therefore important to find a representation of online news documents that captures the gist of the news. This study focuses on generating multi-document summaries of online news through feature extraction and classification with a support vector machine (SVM). Sentence Scoring is used for feature extraction, and the resulting scores serve as input to the SVM, which performs the classification that determines the summary. Test results show that Fold 3 gave the best results, with an average Recall of 0.946, Precision of 0.487, and F-Measure of 0.634. ROUGE-1 also reached its highest value in Fold 3, at 0.946. The key factors in the summarization results are feature extraction using Sentence Scoring and training on the data with SVM. Features such as numerical data and inter-sentence similarity have a significant effect on the final summary.
References
N. Hayatin, K. M. Ghufron, and G. W. Wicaksono, “Summarization of COVID-19 news documents deep learning-based using transformer architecture,” Telkomnika (Telecommunication Comput. Electron. Control.), vol. 19, no. 3, pp. 754–761, 2021, doi: 10.12928/TELKOMNIKA.v19i3.18356.
R. C. Belwal, S. Rai, and A. Gupta, “Extractive text summarization using clustering-based topic modeling,” Soft Comput., 2022, doi: 10.1007/s00500-022-07534-6.
W. Widodo, M. Nugraheni, and I. P. Sari, “A comparative review of extractive text summarization in Indonesian language,” IOP Conf. Ser. Mater. Sci. Eng., vol. 1098, no. 3, p. 032041, 2021, doi: 10.1088/1757-899x/1098/3/032041.
T. M. P. Aulia, A. Jamaludin, and ..., “Extractive Text Summerization Pada Berita Berbahasa Indonesia Menggunakan Algoritma Support Vector Machine,” J-SAKTI (Jurnal Sains …, vol. 5, no. September, pp. 727–735, 2021, [Online]. Available: http://ejurnal.tunasbangsa.ac.id/index.php/jsakti/article/view/371.
A. Qaroush, I. Abu Farha, W. Ghanem, M. Washaha, and E. Maali, “An efficient single document Arabic text summarization using a combination of statistical and semantic features,” J. King Saud Univ. - Comput. Inf. Sci., vol. 33, no. 6, pp. 677–692, 2021, doi: 10.1016/j.jksuci.2019.03.010.
M. Gambhir and V. Gupta, “Recent automatic text summarization techniques: a survey,” Artif. Intell. Rev., vol. 47, no. 1, pp. 1–66, 2017, doi: 10.1007/s10462-016-9475-9.
K. Kurniawan and S. Louvan, “IndoSum: A New Benchmark Dataset for Indonesian Text Summarization,” 2018 Int. Conf. Asian Lang. Process., pp. 215–220, 2018.
E. Rainarli and K. E. Dewi, “Relevance Vector Machine for Summarization,” IOP Conf. Ser. Mater. Sci. Eng., vol. 407, no. 1, 2018, doi: 10.1088/1757-899X/407/1/012075.
S. Mandal, P. Achary, S. Phalke, K. V. K. Poorvaja, and M. Kulkarni, “Extractive Text Summarization Using Supervised Learning and Natural Language Processing,” 2021 Int. Conf. Intell. Technol. CONIT 2021, pp. 1–7, 2021, doi: 10.1109/CONIT51480.2021.9498322.
N. S. Shirwandkar and S. Kulkarni, “Extractive Text Summarization Using Deep Learning,” Proc. - 2018 4th Int. Conf. Comput. Commun. Control Autom. ICCUBEA 2018, pp. 1–5, 2018, doi: 10.1109/ICCUBEA.2018.8697465.
P. Verma and H. Om, “A novel approach for text summarization using optimal combination of sentence scoring methods,” Sadhana - Acad. Proc. Eng. Sci., vol. 44, 2019, doi: 10.1007/s12046-019-1082-4.
License Terms
All articles published in Techno.COM Journal are licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). This means:
1. Attribution
Readers and users are free to:
- Share – Copy and redistribute the material in any medium or format.
- Adapt – Remix, transform, and build upon the material.
As long as proper credit is given to the original work by citing the author(s) and the journal.
2. Non-Commercial Use
- The material cannot be used for commercial purposes.
- Commercial use includes selling the content, using it in commercial advertising, or integrating it into products/services for profit.
3. Rights of Authors
- Authors retain copyright and grant Techno.COM Journal the right to publish the article.
- Authors can distribute their work (e.g., in institutional repositories or personal websites) with proper acknowledgment of the journal.
4. No Additional Restrictions
- The journal cannot apply legal terms or technological measures that restrict others from using the material in ways allowed by the license.
5. Disclaimer
- The journal is not responsible for how the published content is used by third parties.
- The opinions expressed in the articles are solely those of the authors.
For more details, visit the Creative Commons License Page:
https://creativecommons.org/licenses/by-nc/4.0/