Information Retrieval on Frequently Asked Questions (FAQ) Using the String Similarity Method
DOI:
https://doi.org/10.33633/tc.v21i4.6843
Keywords:
Information Retrieval, Frequently Asked Questions, Cosine Similarity
Abstract
Information retrieval is a means of automatically finding information in a collection of structured or unstructured data. Implementations of information retrieval, such as search engines, take a query from the user in natural language, and the system then finds documents or information related to that query. This study proposes an information retrieval system for Frequently Asked Questions (FAQ) that searches the list of questions in the database for those similar to the question given by the user, using the Cosine Similarity algorithm to find the highest cosine similarity. The system then responds with the answer that was previously labeled for the relevant question with the highest similarity. An FAQ dataset was produced and preprocessed, and the Cosine Similarity algorithm was applied between the input question (query) and the dataset, producing a weight for each question (label) in the dataset. In an evaluation of the accuracy of the similarity weighting, carried out with nine input questions divided into three categories by degree of similarity, accuracy reached 100%; information retrieval with Cosine Similarity was thus able to assign weights in line with the degree of similarity between the query and the questions in the FAQ dataset.
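The retrieval step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not specify the preprocessing or the term weighting, so the sketch assumes plain lowercase tokenization and raw term counts. The function names (`tokenize`, `cosine_similarity`, `best_answer`) and the sample FAQ entries are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens.

    Assumption: a stand-in for the preprocessing step, whose details
    the abstract does not give.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


def best_answer(query, faq):
    """Return (score, question, answer) for the stored question most similar to the query."""
    q_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(q_vec, Counter(tokenize(question))), question, answer)
        for question, answer in faq
    ]
    return max(scored, key=lambda item: item[0])


# Hypothetical FAQ entries, invented for illustration only.
FAQ = [
    ("How do I reset my password?", "Use the 'Forgot password' link on the login page."),
    ("What are the library opening hours?", "The library is open from 8 am to 4 pm."),
    ("How do I register for courses?", "Register through the student portal."),
]

if __name__ == "__main__":
    score, question, answer = best_answer("how to reset password", FAQ)
    print(f"{score:.2f} | {question} | {answer}")
```

With these sample entries, the query "how to reset password" shares three tokens with the first stored question, so that question receives the highest weight (about 0.61) and its labeled answer is returned. A real system would likely add stemming, stop-word removal, or TF-IDF weighting, none of which are shown here.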
License
Copyright (c) 2022 Gede Herdian Setiawan, I Made Budi Adnyana

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
License Terms
All articles published in Techno.COM Journal are licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). This means:
1. Attribution
Readers and users are free to:
- Share – Copy and redistribute the material in any medium or format.
- Adapt – Remix, transform, and build upon the material.
As long as proper credit is given to the original work by citing the author(s) and the journal.
2. Non-Commercial Use
- The material cannot be used for commercial purposes.
- Commercial use includes selling the content, using it in commercial advertising, or integrating it into products/services for profit.
3. Rights of Authors
- Authors retain copyright and grant Techno.COM Journal the right to publish the article.
- Authors can distribute their work (e.g., in institutional repositories or personal websites) with proper acknowledgment of the journal.
4. No Additional Restrictions
- The journal cannot apply legal terms or technological measures that restrict others from using the material in ways allowed by the license.
5. Disclaimer
- The journal is not responsible for how the published content is used by third parties.
- The opinions expressed in the articles are solely those of the authors.
For more details, visit the Creative Commons License Page:
https://creativecommons.org/licenses/by-nc/4.0/