Enhanced Vision Transformer and Transfer Learning Approach to Improve Rice Disease Recognition
DOI:
https://doi.org/10.62411/jcta.10459

Keywords:
Multi-head Attention, Paddy Disease Classification, Rice Leaves Disease Recognition, Self-Attention, Transfer Learning, Vision Transformer

Abstract
Recognizing rice diseases through computational models is a critical challenge in agricultural technology, one that has predominantly been addressed with Convolutional Neural Networks (CNNs). However, the localized feature extraction of CNNs often falls short in complex scenarios, motivating a shift toward models capable of global contextual understanding. The Vision Transformer (ViT) is a deep learning architecture that uses a self-attention mechanism to capture image features in a comprehensive global context, overcoming this limitation of CNNs. This research adapts the ViT-Base (ViT-B) transfer learning model to the nuanced task of rice disease recognition through careful reconfiguration, layer augmentation, and hyperparameter tuning, and evaluates it on both balanced and imbalanced datasets. The proposed ViT model outperformed traditional CNN models, including VGG, MobileNet, and EfficientNet, achieving superior recall (0.9792), precision (0.9815), specificity (0.9938), F1-score (0.9791), and accuracy (0.9792) on challenging datasets. These results establish a new benchmark in rice disease recognition and demonstrate the model's performance and stability across diverse tasks and datasets, underscoring its potential as a transformative tool for agricultural AI applications.
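As a minimal illustration of the self-attention mechanism the abstract refers to (a generic sketch, not the paper's implementation), scaled dot-product attention over a sequence of patch embeddings can be written as follows; the matrix shapes and random projections here are purely illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X  : (n_patches, d)  patch embeddings
    Wq, Wk, Wv : (d, d_k) learned projection matrices
    Returns the attended values and the attention weights.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # scale by sqrt(d_k)
    A = softmax(scores, axis=-1)             # each row attends over all patches
    return A @ V, A
```

Because every patch attends to every other patch, each output row mixes information from the whole image, which is the "global context" property contrasted with the localized receptive fields of CNNs.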
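The evaluation metrics reported above (recall, precision, specificity, F1-score, accuracy) can all be derived from a multiclass confusion matrix. The sketch below uses macro averaging, a common convention for imbalanced datasets; whether the paper averages exactly this way is an assumption:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    # Rows are true classes, columns are predicted classes.
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def macro_metrics(cm):
    # Per-class counts (zero-division edge cases are not handled in this sketch).
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp          # predicted as class c but actually not
    fn = cm.sum(axis=1) - tp          # actually class c but predicted otherwise
    tn = cm.sum() - tp - fp - fn
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "recall": recall.mean(),
        "precision": precision.mean(),
        "specificity": specificity.mean(),
        "f1": f1.mean(),
        "accuracy": tp.sum() / cm.sum(),
    }
```

For example, with three classes and predictions `[0, 0, 1, 2, 2, 2]` against ground truth `[0, 0, 1, 1, 2, 2]`, accuracy is 5/6 while macro recall is lower, illustrating how macro averaging exposes the misclassified minority class.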
License
Copyright (c) 2024 Rahadian Kristiyanto Rachman
This work is licensed under a Creative Commons Attribution 4.0 International License.