Deep Learning Convolution Neural Network to Detect and Classify Tomato Plant Leaf Diseases

Abstract

The tomato is an important staple crop and one of the most commonly consumed crops. Plant diseases reduce both the quality and the quantity of production, so detecting and classifying them is necessary. Many diseases infect the tomato plant, including bacterial spot, late blight, Septoria leaf spot, tomato mosaic and yellow leaf curl. Early detection of plant diseases increases production and improves its quality. Intelligent approaches are now widely used to detect and classify these diseases, helping farmers identify the type of disease infecting a crop. The main objective of this work is to apply a modern technique to identify and classify tomato leaf diseases. The technique is based on a convolution neural network (CNN), a machine learning method, to obtain early detection of the condition of plants. The CNN extracts features (such as color and leaf edges) from the input image, and the classification decision is made on that basis. A Matlab m-file was used to build the CNN structure, and a dataset obtained from PlantVillage was used to train the network. The suggested network was applied to classify six tomato leaf conditions (one healthy and five diseased). The results show that the CNN achieved a classification accuracy of 96.43%. Real images, captured with a 5-megapixel camera on a real farm, are used to validate the ability of the suggested CNN to detect and classify diseases, because the most common diseases that infect the plant are similar.

Share and Cite:

Salih, T., Ali, A. and Ahmed, M. (2020) Deep Learning Convolution Neural Network to Detect and Classify Tomato Plant Leaf Diseases. Open Access Library Journal, 7, 1-12. doi: 10.4236/oalib.1106296.

1. Introduction

Agriculture has a major impact on a nation's economy, in addition to being the backbone of people's lives. The tomato is one of the most important crops, and it directly affects human life. Recently, plant diseases (such as bacterial spot, late blight, Septoria leaf spot, tomato mosaic and yellow leaf curl) have become widespread; they badly affect plant growth and reduce the quality and quantity of production [1]. Between 80% and 90% of plant diseases occur on the leaves [2]. Monitoring a farm and identifying the different diseases affecting its plants is time consuming. In addition, a farmer's determination of the type of plant disease may be inaccurate, and as a result the protection measures adopted may be ineffective and sometimes harmful to the plant. It is therefore important to find a smart technology that detects and classifies the diseases affecting tomato plants with high accuracy.

Deep learning convolution neural network (DLCNN) technology is widely used to detect and classify plant leaf diseases because it achieves high accuracy. In this work the technique is applied to the tomato plant. Figure 1(a) and Figure 1(b) show infected and healthy tomato plant leaves, respectively.

Tomato plant diseases have attracted many researchers because the diseases are widespread and the crop is commercially important.

Zhang et al. [3] discussed how to identify tomato leaf disease using a deep learning convolution neural network (CNN). The paper used several pre-trained networks (AlexNet, GoogLeNet and ResNet) and reported an accuracy of 97.19%.

Figure 1. (a) Types of tomato plant leaf diseases; (b) Healthy tomato plant leaf.

Prajwala TM et al. [4] suggested a method to detect and classify tomato plant leaf diseases using a convolution neural network (CNN) based on a pre-trained network model called LeNet. The achieved accuracy was 94% - 95%.

Santosh Adhikari et al. [5] created a system containing a Raspberry Pi microcontroller (RPM) with a convolution neural network model to detect and classify tomato plant leaf diseases, achieving 89% accuracy.

H. Sabrol et al. [6] proposed an approach to identify tomato plant diseases using a tree classifier model (TCM). Five types of diseases and one healthy class were classified using 382 images, and an accuracy of 97.3% was achieved.

Vetal et al. [7] introduced a method for classifying four types of diseases using kurtosis and skewness filters and a multi-class support vector machine (SVM) classifier, achieving an accuracy of 93.75%.

Ishak et al. [8] discussed an approach to analyzing plant leaf quality in three steps: image acquisition, image processing and classification. Images were acquired with an 8-megapixel smartphone camera, and the samples were divided into fifty healthy and fifty unhealthy images. The image processing method consists of three components: contrast enhancement, segmentation and feature extraction. Classification was done with a multi-layer feed-forward artificial neural network, and two network structures were compared: Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF). The RBF network performed better than the MLP network. The study classifies plant leaf images only as healthy or unhealthy; it cannot detect the type of disease.

Sabrol et al. [9] discussed an approach to identify and classify tomato plant leaf diseases using CIE XYZ color space analysis, color moments, histograms and color coherence. The best classification accuracy achieved was 87.2%.

Rangarajan et al. [10] proposed a feasible solution for classifying tomato crop diseases using two pre-trained networks (AlexNet and VGG16 net). The best classification accuracy, achieved with 13,262 images, was 97.49%.

Coulibaly et al. [1] implemented a system to detect and diagnose the diseases that infect millet crops. Their approach extracts leaf features using transfer learning with a CNN model: a pre-trained VGG16 network transfers its learning ability to their suggested neural network, and the best accuracy achieved was 95%.

de Luna et al. [11] designed a convolution neural network to detect and classify tomato plant leaf diseases using transfer learning with a deep learning CNN based on AlexNet. This approach was used to classify four types of tomato plant diseases. A set of 4932 images was used, divided into 80% for training and 20% for testing, and the achieved accuracy was 95.75%.

Mortazi et al. [12] built their own network, which was used to detect and classify five different types of tomato plant diseases. Their work depends on constructing a network of several layers that requires a short training time compared with previous related works based on pre-trained networks (such as AlexNet and LeNet). These neural networks consist of any number of layers and hyperparameters (learning rate, bias and weight, mini-batch size, classification precision and accuracy, and the time required for execution). This is the main reason for building our network instead of using a pre-trained network.

The CNN suggested in this paper is capable of categorizing five disease classes in the tomato crop using 6202 images with an accuracy of 96.34%. Table 1 summarizes the related work.

The paper is organized as follows: Section 2 describes the proposed methodology. Section 3 presents the simulation analysis. Section 4 explains the results and discussion. Finally, Section 5 states the conclusions.

2. Proposed Methodology

The classifier model in Figure 2 consists of four major stages. In the first stage, the PlantVillage dataset is obtained. In the second stage, all images in the dataset are resized. In the third stage, the images are split into training and testing sets, and in the fourth stage they are classified using a deep learning convolution neural network (DLCNN). The DLCNN consists of several layer types (input layer, convolution layer, batch normalization layer, activation function layer, max pooling layer, fully connected layer, softmax layer and classification layer), as shown in Figure 3.

2.1. Dataset

The dataset used in the proposed system was obtained from the PlantVillage dataset [13]. It contains 6202 images divided into six groups (five groups for infected plant leaves and one for healthy leaves). The images are distributed as follows: bacterial spot 591, late blight 460, Septoria leaf spot 591, tomato mosaic 372, yellow leaf curl 3597 and healthy 591. All images are in RGB color space, in JPG and PNG format.

Table 1. Summary for related work.

Figure 2. Classifier model.

Figure 3. Architecture of proposed network.
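The class distribution above can be checked with a few lines of Python. This is only an illustrative sketch (the authors worked in Matlab); the dictionary keys are labels chosen here for readability, and the counts are the ones listed in the text.

```python
# Image counts per class, as listed in the dataset description.
class_counts = {
    "bacterial spot": 591,
    "late blight": 460,
    "Septoria leaf spot": 591,
    "tomato mosaic": 372,
    "yellow leaf curl": 3597,
    "healthy": 591,
}

total = sum(class_counts.values())


def class_shares(counts):
    """Return each class's share of the dataset as a percentage."""
    n = sum(counts.values())
    return {name: 100.0 * c / n for name, c in counts.items()}


shares = class_shares(class_counts)
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {class_counts[name]:5d} images  {pct:5.1f}%")
print("total:", total)
```

The counts sum to the stated 6202 images, and the yellow leaf curl class alone accounts for roughly 58% of them, so per-class results are worth reading alongside the overall accuracy.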

2.2. Resizing Dataset

An image size of 145 × 145 × 3 is proposed in this approach to reduce the training time and increase the accuracy. The appropriate image size depends on the size of the network and on the graphics processing unit (GPU).
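As a rough illustration of the resizing stage, the sketch below shrinks an RGB image to 145 × 145 × 3 with nearest-neighbour sampling in plain Python. The interpolation method and the 256 × 256 source image are assumptions made for this example; the paper does not state which resizing method was used.

```python
def resize_nearest(image, out_h, out_w):
    """Resize an H x W x C image (nested lists) by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    resized = []
    for r in range(out_h):
        # Map each output row/column back to a source row/column.
        src_r = min(in_h - 1, r * in_h // out_h)
        row = []
        for c in range(out_w):
            src_c = min(in_w - 1, c * in_w // out_w)
            row.append(list(image[src_r][src_c]))
        resized.append(row)
    return resized


# Hypothetical 256 x 256 RGB input filled with a simple gradient.
source = [[[r % 256, c % 256, 0] for c in range(256)] for r in range(256)]
small = resize_nearest(source, 145, 145)

# Number of values the network input layer receives per image.
values_per_image = 145 * 145 * 3
print(len(small), len(small[0]), len(small[0][0]), values_per_image)
```

Each resized image carries 63075 values, which is the quantity that drives memory use and training time in the first layers.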

2.3. Split Images

An offline neural network must be trained on a set of data to improve its accuracy before it is tested with real data. Accordingly, the dataset should be divided into training and testing groups to avoid overfitting. The dataset can be divided into 60% - 80% for training and 40% - 20% for testing [14]; the training share is sometimes increased to improve the efficiency of the network. In this paper, the images in the dataset are divided into 70% for training and 30% for testing. Table 2 shows the training and test splits and the corresponding network accuracy.
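A 70/30 split can be sketched as follows. The per-class (stratified) splitting, the rounding rule and the random seed are assumptions of this example rather than details given in the paper; the class counts are those reported for the dataset.

```python
import random

# Image counts per class, as reported for the dataset.
class_counts = {
    "bacterial spot": 591,
    "late blight": 460,
    "Septoria leaf spot": 591,
    "tomato mosaic": 372,
    "yellow leaf curl": 3597,
    "healthy": 591,
}


def stratified_split(counts, train_percent=70, seed=0):
    """Shuffle image indices within each class and split them train/test."""
    rng = random.Random(seed)
    train, test = {}, {}
    for name, n in counts.items():
        indices = list(range(n))
        rng.shuffle(indices)
        # Integer rounding to the nearest whole image avoids float error.
        n_train = (n * train_percent + 50) // 100
        train[name] = indices[:n_train]
        test[name] = indices[n_train:]
    return train, test


train, test = stratified_split(class_counts)
n_train = sum(len(v) for v in train.values())
n_test = sum(len(v) for v in test.values())
print(n_train, n_test)
```

Under these assumptions the split yields 4342 training and 1860 testing images, which agrees with the totals reported for the experiment.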

2.4. Classification

A deep learning convolution neural network (DLCNN) can be used to detect and classify tomato plant leaf diseases. The proposed approach is a simple DLCNN model that consists of convolution, batch normalization, activation, max-pooling, fully connected, softmax and classification layers. The proposed network architecture consists of three blocks. The first block includes convolution, batch normalization, activation function and max-pooling. The remaining blocks include convolution, activation function and max-pooling, and are followed by a fully connected layer, a softmax layer and a classification layer, as shown in Figure 3.

Table 2. Split dataset with efficiency of network.

The convolution operation extracts features such as color, edges and shape from the input image. In the current work, the filter size is fixed in all convolution layers, but the number of filters changes: the first convolution layer has 8 filters, while the second and third have 16 and 32, respectively.
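To make the feature-extraction idea concrete, the following sketch slides a single 3 × 3 filter over a toy grayscale patch. The filter values, the 3 × 3 size and the patch are illustrative assumptions; in a trained CNN the filter weights are learned, and each layer applies many such filters (8, 16 and 32 in this work).

```python
def conv2d(image, kernel):
    """Slide a k x k kernel over a 2-D image (stride 1, no padding).

    As in most CNN libraries, the kernel is not flipped, so this is
    strictly a cross-correlation.
    """
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Toy 5 x 5 grayscale patch: dark left columns, bright right columns.
patch = [[0, 0, 10, 10, 10] for _ in range(5)]

# Hand-picked vertical-edge filter (hypothetical, for illustration only).
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

feature_map = conv2d(patch, vertical_edge)
print(feature_map)
```

The response is large where the window straddles the dark-to-bright boundary and zero where the patch is uniform, which is the sense in which a filter "detects" an edge.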

The batch normalization layer is used to speed up the training of the convolution neural network and to reduce its sensitivity to network initialization.
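A minimal sketch of what batch normalization computes is given below, assuming the usual formulation: activations in a mini-batch are shifted to zero mean and scaled to unit variance, then passed through a learnable scale and offset. The names gamma, beta and eps and the sample values are choices made for this example.

```python
import math


def batch_normalize(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel's activations across a mini-batch."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    # eps guards against division by zero when the variance is tiny.
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


activations = [2.0, 4.0, 6.0, 8.0]  # made-up activations for one channel
normalized = batch_normalize(activations)

norm_mean = sum(normalized) / len(normalized)
norm_var = sum((v - norm_mean) ** 2 for v in normalized) / len(normalized)
print(normalized, norm_mean, norm_var)
```

Keeping activations in a consistent range from layer to layer is what makes training less dependent on how the weights were initialized.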

The activation function layer, the Rectified Linear Unit (ReLU), eliminates negative values. It can be represented by Equation (1) and Figure 4.

$$f(x) = \begin{cases} x, & x > 0 \\ 0, & x \le 0 \end{cases} \qquad (1)$$
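Expressed in code, the ReLU function passes positive values through and maps everything else to zero. The input values below are arbitrary examples.

```python
def relu(x):
    """Rectified Linear Unit: x for x > 0, otherwise 0."""
    return x if x > 0 else 0


samples = [-3, -0.5, 0, 2, 7]
activated = [relu(v) for v in samples]
print(activated)  # [0, 0, 0, 2, 7]
```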

The max pooling layer has parameters such as the filter (pool) size and the step size (stride). It reduces the number of samples by selecting the maximum value in each window and discarding the remaining values, as shown in Figure 5. Figure 6 shows the features extracted by convolution1, ReLU1, maxpooling1, convolution2, batch normalization1, ReLU2, maxpooling2, convolution3, ReLU3 and maxpooling3.
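The pooling step can be sketched as below. The 2 × 2 window and stride of 2 are assumptions for illustration; the paper does not state the pooling parameters in the text.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Keep the maximum of each size x size window, moving by `stride`."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, stride):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Made-up 4 x 4 feature map.
fm = [[1, 3, 2, 4],
      [5, 6, 1, 0],
      [7, 2, 9, 8],
      [0, 1, 3, 4]]

pooled = max_pool2d(fm)
print(pooled)  # [[6, 4], [7, 9]]
```

With these settings each spatial dimension is halved, so a quarter of the values survive each pooling layer.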

The fully connected layer (FCL) connects each input to all of its neurons. Its number of neurons equals the number of classes, which is 6 in this work (the six tomato leaf conditions).

The softmax layer calculates the probability of each of the six target classes. Each probability lies in the range from 0 to 1, and the probabilities sum to one. The layer returns the probability of each class, and the target class will have the highest probability.

The classification layer is the output (final) layer of the deep learning convolution neural network. It is responsible for assigning the image to a specific category.
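The last two layers can be sketched together: softmax turns the six outputs of the fully connected layer into probabilities, and the classification step picks the most probable class. The class order and the score values are invented for this example and do not come from the trained network.

```python
import math

# Hypothetical class order; the ordering used in the paper is not stated.
CLASS_NAMES = [
    "bacterial spot",
    "late blight",
    "Septoria leaf spot",
    "tomato mosaic",
    "yellow leaf curl",
    "healthy",
]


def softmax(scores):
    """Convert raw scores into probabilities that sum to one."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most probable class name and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs[best]


scores = [1.2, 0.3, -0.8, 0.1, 3.5, 0.0]  # made-up fully connected outputs
probs = softmax(scores)
label, confidence = classify(scores)
print(label, round(confidence, 3))
```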

Pre-trained networks (such as AlexNet, ResNet, LeNet and GoogLeNet) have a large number of layers and millions of parameters, so they consume a great deal of time during training, and their parameters cannot be changed. Building a custom network with a limited number of layers and its own training parameters is therefore considered a better choice for avoiding these problems.
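To give a sense of scale, the sketch below traces feature-map sizes and counts learnable parameters for a three-block network with 8, 16 and 32 filters on a 145 × 145 × 3 input. Several values are assumptions not given in the text: 3 × 3 filters with size-preserving padding, 2 × 2 pooling with stride 2, and a single batch normalization layer placed in the second block (following the layer order listed for the extracted features).

```python
KERNEL = 3       # assumed filter size (not stated in the text)
BN_BLOCK = 2     # assumed position of the single batch normalization layer
NUM_CLASSES = 6


def conv_params(k, c_in, c_out):
    """Weights plus one bias per filter for a k x k convolution."""
    return k * k * c_in * c_out + c_out


def pooled_size(n, size=2, stride=2):
    """Output side length of max pooling without padding."""
    return (n - size) // stride + 1


side, channels = 145, 3
total_params = 0
layers = 1  # input layer

for block, n_filters in enumerate([8, 16, 32], start=1):
    total_params += conv_params(KERNEL, channels, n_filters)
    channels = n_filters
    layers += 3  # convolution, ReLU, max pooling
    if block == BN_BLOCK:
        total_params += 2 * channels  # per-channel scale and offset
        layers += 1
    side = pooled_size(side)  # convolution is assumed to keep the size

flat = side * side * channels
total_params += flat * NUM_CLASSES + NUM_CLASSES  # fully connected layer
layers += 3  # fully connected, softmax, classification

print(layers, side, flat, total_params)
```

Under these assumptions the network has 14 layers and fewer than 70 thousand learnable parameters, compared with roughly 60 million for AlexNet, which illustrates why a small custom network trains much faster.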

3. Simulation Analysis

Matlab software has been used to simulate the suggested 14-layer neural network based on a deep learning algorithm. The proposed DLCNN has been trained using the tomato plant leaf disease dataset obtained from PlantVillage. The 6202 images, divided into six classes, are split into 4342 images for training and 1860 for testing. The training options used in this experiment are shown in Table 3.

Figure 4. Function of Relu.

Figure 5. Max pooling function.

Figure 6. Features extracted.

Table 3. Training parameters of the suggested neural network.

4. Results and Discussion

The proposed classification system (a deep learning convolution neural network) achieves good results in detecting and classifying the tomato plant leaf disease dataset obtained from PlantVillage. The plant diseases are divided into five categories in addition to a healthy type. The dataset passes through several stages: the images are resized, split into training and testing images, and classified using the DLCNN. Figure 7 illustrates the training and validation accuracy of the network (DLCNN) as a function of the number of epochs. The total number of epochs is 10 with 67 iterations per epoch, so the total number of iterations is 670.
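The iteration count follows directly from the reported figures, as the short calculation below shows. The second part is an inference rather than a reported value: assuming the incomplete final mini-batch of each epoch is dropped, it lists the mini-batch sizes that would give 67 iterations per epoch for the 4342 training images.

```python
epochs = 10
iterations_per_epoch = 67
total_iterations = epochs * iterations_per_epoch

train_images = 4342  # training images reported for the 70% split

# Mini-batch sizes b for which floor(train_images / b) equals 67,
# assuming the last partial mini-batch is discarded (an assumption).
compatible_batch_sizes = [
    b for b in range(1, 257)
    if train_images // b == iterations_per_epoch
]

print(total_iterations, compatible_batch_sizes)
```

Only a mini-batch size of 64 is consistent with these numbers under that assumption.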

Training and validation of the neural network take about 21 minutes. Figure 8 shows the training and validation loss at each step. An increase in training and validation accuracy corresponds to a decrease in training and validation loss. The classification accuracy achieved by the proposed network is 96.34%, while the achieved training accuracy is 99.36%.

In addition, a confusion matrix method (CMM) has been applied to evaluate the classification accuracy for each class. It can also be used to determine the True Positive Rate (TPR) and False Positive Rate (FPR) according to Equation (2) and Equation (3) [15] [16] [17].

$$\mathrm{TPR} = \frac{TP}{TP + FN} \times 100\% \qquad (2)$$

$$\mathrm{FPR} = \frac{FP}{FP + TN} \times 100\% \qquad (3)$$

The confusion matrix method compares the network output (predicted) with the target (actual) output, then calculates the classification and error percentages for each class in the test dataset; it thus describes the performance of the neural network model. Table 4 and Figure 9 illustrate the results.
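The evaluation procedure can be sketched as follows: build the confusion matrix from actual and predicted labels, then derive TP, FN, FP and TN for each class in a one-versus-rest fashion. The labels below are invented for a three-class toy case and are not the paper's results; TPR is taken as TP / (TP + FN) and FPR as FP / (FP + TN), the standard definitions.

```python
def confusion_matrix(actual, predicted, n_classes):
    """Rows are actual classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def per_class_rates(matrix):
    """Return (TPR %, FPR %) for each class, treating it as the positive class."""
    total = sum(sum(row) for row in matrix)
    rates = []
    for k in range(len(matrix)):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp
        fp = sum(row[k] for row in matrix) - tp
        tn = total - tp - fn - fp
        rates.append((100.0 * tp / (tp + fn), 100.0 * fp / (fp + tn)))
    return rates


# Made-up labels for a three-class toy example.
actual = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
predicted = [0, 0, 1, 1, 1, 1, 2, 2, 0, 2]

cm = confusion_matrix(actual, predicted, 3)
rates = per_class_rates(cm)
accuracy = 100.0 * sum(cm[k][k] for k in range(3)) / len(actual)

for k, (tpr, fpr) in enumerate(rates):
    print(f"class {k}: TPR = {tpr:.1f}%  FPR = {fpr:.1f}%")
print(f"overall accuracy = {accuracy:.1f}%")
```

The diagonal of the matrix holds the correctly classified images, so the overall accuracy is the diagonal sum divided by the number of test images.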

5. Conclusions

Plant diseases have a bad effect on farmers' budgets and on the economy of many countries. If these diseases are not discovered at an early stage of infection, treatment will be very expensive and crop production will be reduced.

Table 4. Classification accuracy percentage for each class.

Figure 7. Training and validation accuracy.

Figure 8. Training and validation loss.

Figure 9. Confusion matrix.

Thus, this work focuses on detecting the primary infection on the plant, where a neural network has been trained to monitor any change in the color or shape of the leaves. The deep learning approach is considered a modern technique for recognizing and detecting variations in the color and shape of images, and it can be successfully used to identify the diseases of plant leaves. The deep learning convolution neural network has learned to recognize and detect all these changes in a short time. Pre-trained neural networks consist of a huge number of layers and elements and need a very long time to produce results, so the neural network used in the current work has been built to reduce these drawbacks. The authors suggest a new, simple convolution neural network structure with a minimum number of layers (14 layers). The limitations of the DLCNN are that it requires a large dataset, needs a long time for training, and requires the resolution of accepted input images to be determined in advance.

This network has been trained using a dataset obtained from PlantVillage (6202 images of tomato plant leaves), with 70% used for training and 30% for testing. There is no rule for determining the number of images needed to train the network, but a larger number of images in the dataset will increase network efficiency.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Coulibaly, S., Kamsu-Foguem, B., Kamissoko, D. and Traore, D. (2019) Deep Neural Networks with Transfer Learning in Millet Crop Images. Computers in Industry, 108, 115-120. https://doi.org/10.1016/j.compind.2019.02.003
[2] Zhang, S.W., Shang, Y.J. and Wang, L. (2015) Plant Disease Recognition Based on Plant Leaf Image. The Journal of Animal & Plant Sciences, 25, 25-28.
[3] Zhang, K., Wu, Q., Liu, A. and Meng, X. (2018) Can Deep Learning Identify Tomato Leaf Disease? Advances in Multimedia, 2018, Article ID: 6710865. https://doi.org/10.1155/2018/6710865
[4] Tm, P., Pranathi, A., Saiashritha, K., Chittaragi, N.B. and Koolagudi, S.G. (2018) Tomato Leaf Disease Detection Using Convolutional Neural Networks. The 11th International Conference on Contemporary Computing, Noida, 2-4 August 2018, 1-5. https://doi.org/10.1109/IC3.2018.8530532
[5] Adhikari, S., Saban Kumar, K.C., Balkumari, L., Shrestha, B. and Baiju, B. (2018) Tomato Plant Diseases Detection System Using Image Processing. 1st KEC Conference on Engineering and Technology, Lalitpur, Vol. 1, 81-86.
[6] Sabrol, H. and Satish, K. (2016) Tomato Plant Disease Classification in Digital Images Using Classification Tree. International Conference on Communication and Signal Processing, Melmaruvathur, 6-8 April 2016, 1242-1246. https://doi.org/10.1109/ICCSP.2016.7754351
[7] Vetal, S. and Khule, R.S. (2017) Tomato Plant Disease Detection Using Image Processing. International Journal of Advanced Research in Computer and Communication Engineering, 6, 293-297. https://doi.org/10.17148/IJARCCE.2017.6651
[8] Ishak, S., Rahiman, M.H.F., Kanafiah, S.N.A.M. and Saad, H. (2015) Leaf Disease Classification Using Artificial Neural Network. Jurnal Teknologi, 77, 109-114. https://doi.org/10.11113/jt.v77.6463
[9] Sabrol, H. and Kumar, S. (2016) Fuzzy and Neural Network Based Tomato Plant Disease Classification Using Natural Outdoor Images. Indian Journal of Science and Technology, 9, 1-8. https://doi.org/10.17485/ijst/2016/v9i44/92825
[10] Rangarajan, A.K., Purushothaman, R. and Ramesh, A. (2018) Tomato Crop Disease Classification Using Pre-Trained Deep Learning Algorithm. Procedia Computer Science, 133, 1040-1047. https://doi.org/10.1016/j.procs.2018.07.070
[11] De Luna, R.G., Dadios, E.P. and Bandala, A.A. (2019) Automated Image Capturing System for Deep Learning-Based Tomato Plant Leaf Disease Detection and Recognition. Proceedings/TENCON, Vol. 2018, 1414-1419. https://doi.org/10.1109/TENCON.2018.8650088
[12] Mortazi, A. and Bagci, U. (2018) Automatically Designing CNN Architectures for Medical Image Segmentation. International Workshop on Machine Learning in Medical Imaging, Granada, 16 September 2018, Lecture Notes in Computer Science Book Series (LNCS, Volume 11046), 98-106. https://doi.org/10.1007/978-3-030-00919-9_12
[13] Hughes, D.P. and Salathe, M. (2015) An Open Access Repository of Images on Plant Health to Enable the Development of Mobile Disease Diagnostics.
[14] Mohanty, S.P., Hughes, D.P. and Salathé, M. (2016) Using Deep Learning for Image-Based Plant Disease Detection. Frontiers in Plant Science, 7, 1-10. https://doi.org/10.3389/fpls.2016.01419
[15] Alnima, R.R.O. (2017) Signal Processing and Machine Learning Techniques for Human Verification Based on Finger Textures. Newcastle University, Newcastle upon Tyne, 1-195.
[16] Al-Sumaidaee, S.A.M., Abdullah, M.A.M., Al-Nima, R.R.O., Dlay, S.S. and Chambers, J.A. (2017) Multi-Gradient Features and Elongated Quinary Pattern Encoding for Image-Based Facial Expression Recognition. Pattern Recognition, 71, 249-263. https://doi.org/10.1016/j.patcog.2017.06.007
[17] Al-Nima, R.R.O., Dlay, S.S., Woo, W.L. and Chambers, J.A. (2016) A Novel Biometric Approach to Generate ROC Curve from the Probabilistic Neural Network. 24th IEEE Signal Processing and Communication Application Conference, Zonguldak, 16-19 May 2016, 141-144. https://doi.org/10.1109/SIU.2016.7495697

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.