Film Review Sentiment Analysis: Comparison of Logistic Regression and Support Vector Classification Performance Based on TF-IDF

Dadan Saepul Ramdan, Riri Damayanti Apnena, Castaka Agus Sugianto

Abstract


Film sentiment analysis evaluates the sentiment expressed in film reviews so that positive and negative responses to a film can be identified. This study performs sentiment analysis on film reviews from IMDb to determine which reviews from film critics are positive and which are negative. The reviews are preprocessed and weighted with TF-IDF, and each processed review is then predicted as positive or negative using two algorithms: logistic regression and support vector classification. The dataset consists of 2000 IMDb film reviews, divided into 1000 positive and 1000 negative reviews. The data are preprocessed first and then split into 70% training data and 30% testing data. Prediction with the logistic regression algorithm obtains a test accuracy of 80.61%, while prediction with the support vector classification algorithm obtains a test accuracy of 82.42%.
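
The pipeline summarized in the abstract (preprocessing, TF-IDF weighting, a 70/30 split, then a classifier) can be illustrated with a minimal standard-library sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the toy reviews are invented, and the smoothed IDF formula, L2 normalisation, and gradient-descent settings are one common choice rather than the configuration used in the study. Only logistic regression is sketched; a support vector classifier would be trained on the same TF-IDF vectors.

```python
import math
import random
import re
from collections import Counter

def tokenize(text):
    """Lowercase a review and keep alphabetic tokens (a stand-in for preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())

def fit_idf(token_lists):
    """Smoothed inverse document frequency learned from the training reviews."""
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

def tfidf_vector(tokens, idf):
    """Sparse, L2-normalised TF-IDF vector; terms unseen in training are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), holding out test_fraction of each class."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == label]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx

def score(vec, weights, bias):
    """Linear decision value w.x + b for a sparse vector."""
    return bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())

def train_logreg(vectors, labels, epochs=200, lr=0.5):
    """Unregularised logistic regression fitted by stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            z = max(-30.0, min(30.0, score(vec, weights, bias)))  # avoid overflow
            error = 1.0 / (1.0 + math.exp(-z)) - y  # sigmoid(z) - label
            bias -= lr * error
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * error * v
    return weights, bias

def predict(vec, weights, bias):
    """Predict 1 (positive) when the decision value is non-negative, else 0."""
    return 1 if score(vec, weights, bias) >= 0.0 else 0

# Invented toy reviews (1 = positive, 0 = negative); not the study's data.
reviews = [
    ("a wonderful moving film", 1),
    ("great acting and a great story", 1),
    ("wonderful cast great direction", 1),
    ("a boring dull film", 0),
    ("terrible acting and a dull story", 0),
    ("boring plot terrible direction", 0),
]
texts = [text for text, _ in reviews]
labels = [label for _, label in reviews]

train_idx, test_idx = stratified_split(labels, test_fraction=0.3)
train_tokens = [tokenize(texts[i]) for i in train_idx]
idf = fit_idf(train_tokens)  # IDF is learned from training reviews only
X_train = [tfidf_vector(tokens, idf) for tokens in train_tokens]
y_train = [labels[i] for i in train_idx]
weights, bias = train_logreg(X_train, y_train)

X_test = [tfidf_vector(tokenize(texts[i]), idf) for i in test_idx]
y_test = [labels[i] for i in test_idx]
correct = sum(predict(x, weights, bias) == y for x, y in zip(X_test, y_test))
print(f"test accuracy: {correct / len(y_test):.2%}")
```

With a balanced set of 2000 reviews, `stratified_split` with `test_fraction=0.3` holds out 300 reviews per class, giving 1400 training and 600 testing reviews. Fitting the IDF weights on the training portion only keeps information from the test reviews out of the features.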






DOI: https://doi.org/10.33633/jais.v8i3.9090


Journal of Applied Intelligent System (e-ISSN: 2502-9401, p-ISSN: 2503-0493) is published by the Department of Informatics, Universitas Dian Nuswantoro Semarang, and IndoCEISS.

  

 



This journal is licensed under a Creative Commons Attribution 4.0 International License.
