Tuning a Football Twitter Sentiment Analysis Model on a Small, Balanced Dataset
DOI:
https://doi.org/10.33633/joins.v5i1.3275
Abstract
Football supporters are people who back a club's players and give them motivation and encouragement, and whose fanaticism can be positive or negative, both in the real world and on social media such as Twitter. This study produces a classification model for predicting the sentiment of football supporters' tweets from a small, balanced dataset. The model is built through exploratory data analysis and the selection of a baseline model, drawing on null accuracy, polarity and subjectivity, feature selection, and linear and non-linear classifiers. The selected model is then tuned for more precise and accurate results and evaluated with a confusion matrix and a classification report, which give deeper insight into the classifier's behavior than global accuracy alone. The study found polarization of words with negative meaning appearing in the positive class at 88%, with a frequency of 4% and a harmonic mean of 8%. Multinomial Naïve Bayes was selected as the best model, with 99% accuracy (0.8% error) on the training data and 100% accuracy (0% error) on the validation data. In an experiment on 30 new test entries, the model's predictions reached 87% accuracy (13% error), meaning only 4 predictions were wrong. Future work should test feature-extraction models or apply boosting, bagging, and deep learning to see whether the results improve.
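The figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is an illustration, not the authors' code: it assumes the reported 8% is the harmonic mean of the 88% polarization rate and the 4% frequency, and it uses a hypothetical 15/15 label split to show what null accuracy looks like on a balanced dataset (the abstract does not give the actual class counts).

```python
from collections import Counter
from statistics import harmonic_mean


def accuracy_and_error(n_total, n_wrong):
    """Return (accuracy, error rate) from a count of wrong predictions."""
    acc = (n_total - n_wrong) / n_total
    return acc, 1.0 - acc


def null_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# 30 new test entries with 4 wrong predictions, as reported in the abstract.
acc, err = accuracy_and_error(30, 4)

# Assumption: the reported harmonic mean combines the 88% rate and 4% frequency.
hm = harmonic_mean([0.88, 0.04])

# Hypothetical balanced label set; the real class counts are not given.
balanced = ["pos"] * 15 + ["neg"] * 15

print(f"accuracy      = {acc:.1%}")
print(f"error         = {err:.1%}")
print(f"harmonic mean = {hm:.1%}")
print(f"null accuracy = {null_accuracy(balanced):.1%}")
```

Under these assumptions, 26 correct predictions out of 30 round to the reported 87% accuracy and 13% error, and the harmonic mean rounds to 8%. On a balanced two-class dataset the null accuracy is 50%, which is the baseline any classifier has to beat.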
License
This work is licensed under a Creative Commons Attribution 4.0 International License.