Utilization Of Principal Component Analysis To Improve Emotion Classification Performance In Text Using Artificial Neural Networks


  • Mahazam Afrad Universitas Dian Nuswantoro
  • Muljono Muljono Universitas Dian Nuswantoro
  • Pujiono Pujiono Universitas Dian Nuswantoro




Emotions are transient and variable, differing across locations, times, and individuals. Automatic emotion identification matters in many domains, including education and business. In education, emotion analysis supports intelligent e-learning environments; in business, it helps assess customer satisfaction with products. This study proposes applying Principal Component Analysis (PCA) to improve text emotion classification with an Artificial Neural Network (ANN). PCA, a pattern identification method, reduces the dimensionality of the text features, which helps classification by capturing similarities between words. Its advantage is that it reduces dimensionality while preserving most of the information in the data. Classification is run in two configurations: on the TF-IDF features after PCA dimension reduction, and directly on the TF-IDF features without PCA. With PCA, recall for the happy class rose to 0.92 from 0.91 without PCA, and precision for the sadness class rose to 0.90 from 0.80. These results indicate that integrating PCA can improve the performance of ANN-based emotion classification in text.
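The comparison described above (TF-IDF features fed to an ANN with and without a PCA step) can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: it assumes scikit-learn, uses `MLPClassifier` as a stand-in for the ANN, and relies on a hypothetical toy corpus. The helper names `to_dense` and `build_model`, the number of components, and the network size are all assumptions chosen for brevity.

```python
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import precision_score, recall_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer

# Hypothetical toy corpus; the dataset used in the study is not reproduced here.
texts = [
    "i am so happy today", "what a joyful wonderful day",
    "this makes me smile and laugh", "feeling great and cheerful",
    "i feel sad and lonely", "this is a miserable gloomy day",
    "i cried all night in sorrow", "feeling down and heartbroken",
]
labels = ["happy"] * 4 + ["sadness"] * 4


def to_dense(matrix):
    """Convert the sparse TF-IDF matrix to a dense array for PCA."""
    return matrix.toarray()


def build_model(use_pca, n_components=4):
    """TF-IDF features, an optional PCA reduction, then a small MLP as the ANN."""
    steps = [TfidfVectorizer(), FunctionTransformer(to_dense)]
    if use_pca:
        # n_components is an arbitrary choice for this toy example.
        steps.append(PCA(n_components=n_components, random_state=0))
    steps.append(
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    )
    return make_pipeline(*steps)


results = {}
for use_pca in (False, True):
    model = build_model(use_pca).fit(texts, labels)
    # Scored on the training texts only to keep the sketch short;
    # a real evaluation would use a held-out test split.
    predicted = model.predict(texts)
    results[use_pca] = {
        "recall_happy": recall_score(labels, predicted, pos_label="happy"),
        "precision_sadness": precision_score(labels, predicted, pos_label="sadness"),
    }
    print("with PCA" if use_pca else "without PCA", results[use_pca])
```

The two metrics printed per configuration mirror the ones reported in the abstract (recall for the happy class, precision for the sadness class); the values produced by this toy corpus say nothing about the study's results.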

