Bridge Girder Crack Assessment Using Faster RCNN Inception V2 and Infrared Thermography

Manual inspection of infrastructure such as highway bridges, pavements, dams, and multistory garage ceilings is time consuming, costly, and sometimes life threatening. An automated computerized system can reduce inspection time, inspection errors, and inspection cost. In this study, we developed a computer model based on a deep learning Convolutional Neural Network (CNN) that automatically classifies structural images as crack or non-crack. The goal of this research is to enable the application of state-of-the-art deep neural networks and Unmanned Aerial Vehicle (UAV) technologies to highway bridge girder inspection. As a pilot study of deep learning for bridge girders, we study the recognition, length, and location of cracks in the concrete ceiling slab of the UTC campus old garage. A total of 2086 crack and non-crack images were taken of the UTC Old Library parking garage ceiling using a handheld mobile phone and a drone. After training, the model shows 98% accuracy in distinguishing crack from non-crack structures.


Introduction
Bridges constitute a sizable share of the infrastructure in our environment and are often quite expensive to construct and maintain. Their integrity, safety, sustainability, reliability, and maintenance are as important as the initial construction. These factors, however, are often impaired by deterioration due to age, long-term service, and exposure to harsh environmental conditions such as wind and earthquakes. The science of health monitoring emerged to mitigate this rapid deterioration of bridges.
The role of infrastructure is undeniably critical, and it serves as a significant index of a nation's development. However, modern challenges in infrastructure development go beyond merely building roads, bridges, and other social facilities; they include finding ways to mitigate deterioration. The rate of deterioration of US infrastructure, for example, has been a subject of concern among politicians, engineers, and the public at large [1]. According to the ASCE 2017 report card, US infrastructure received a cumulative grade of D+ (i.e. poor) with bridges averaging a grade of C+ (i.e. mediocre) [1] [2]. Bridges, like other infrastructure, are very costly to construct, and their failure cannot be tolerated. To prevent sudden failure, bridges are now routinely inspected for member or connection failures, and their performance is measured or predicted regularly, with necessary actions taken to reduce their rate of deterioration. Bridge condition assessment techniques that have been explored include, but are not limited to: non-destructive techniques, e.g. the ultrasonic pulse velocity method, impact-echo/impulse-response method, acoustic emission method, radiographic method, eddy current method, and infrared thermographic methods; and dynamic characteristics-based methods, e.g. the use of changes in natural frequency, modal damping, FRF, mode shape curvature, modal strain energy, and flexibility [5]. More recently, the use of techniques, such as Digital Im-
Deep learning is a machine learning technique that utilizes deep neural networks [6]. It allows computational models composed of multiple processing layers to learn representations of data with multiple levels of abstraction [7].
By harnessing this capability of machine learning, it is possible to use a well-trained neural network to detect and classify defects in concrete structures such as bridges, thereby aiding engineering judgement of bridge condition. The objectives of this paper are to: develop a framework for automating bridge inspection; train a network for concrete crack classification; develop an algorithm to obtain the size and location of cracks in a structure; and build a 3D crack visualization model to help maintenance engineers determine whether a crack needs immediate attention.

Literature Review
Many researchers have developed image processing algorithms for infrastructure health monitoring, but most of these algorithms are limited to analyzing a single picture rather than real-time video. Tsai, Y. C. et al. (2009) analyzed the Sobel and Otsu methods for pavement crack detection [8]. Comparing them with fractal-based segmentation on 1280 × 1024 pavement surface images, they found that the fractal method outperforms both the Sobel and Otsu methods for analyzing cracked pavement surfaces. The Sobel method picks up spurious edges caused by differential operations and noise. The Otsu method produces segmentation errors because of the uneven grey-level distribution of the background and the low grey-level contrast between crack and background. Some of the existing algorithms and their objectives are listed in Table 1.
We applied the Sobel, Canny, Roberts, Prewitt, and Laplacian of Gaussian edge detection filters to detect cracks in images captured with a handheld mobile phone. None of these algorithms identified the cracks satisfactorily, as shown in Figure . In summary, the available image processing techniques are impractical for crack detection because of their limited identification capability. Moreover, most of them operate on a single image, whereas in practice a technique should work on a sequence of images. In this research, a state-of-the-art technique, the convolutional neural network, is used to produce useful detection information.
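For reference, the classical baseline discussed above can be sketched in a few lines. The following is a minimal pure-Python illustration of Sobel gradient magnitude and Otsu thresholding applied to a tiny synthetic grayscale image. The function names and the toy image are assumptions made for illustration; this is not the code used in this study.

```python
import math


def sobel_magnitude(img):
    """Return the Sobel gradient magnitude of a 2D list of grayscale values.

    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    h, w = len(img), len(img[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal derivative
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical derivative
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for j in range(3):
                for i in range(3):
                    p = img[y + j - 1][x + i - 1]
                    gx += kx[j][i] * p
                    gy += ky[j][i] * p
            out[y][x] = math.hypot(gx, gy)
    return out


def otsu_threshold(values, levels=256):
    """Return the integer threshold that maximizes between-class variance."""
    hist = [0] * levels
    for v in values:
        hist[min(levels - 1, max(0, int(v)))] += 1
    total = len(values)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for t in range(levels):
        w0 += hist[t]            # pixels with value <= t
        sum0 += t * hist[t]
        w1 = total - w0          # pixels with value > t
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy 6x6 image: bright background (200) with a dark vertical "crack" (20).
img = [[200] * 6 for _ in range(6)]
for row in img:
    row[3] = 20

edges = sobel_magnitude(img)
threshold = otsu_threshold([p for row in img for p in row])
crack_mask = [[1 if p <= threshold else 0 for p in row] for row in img]
```

On this clean toy image both operators isolate the dark column. On real concrete surfaces, texture, stains, and uneven lighting produce many additional responses, which is consistent with the limitations reported above.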

Data Acquisition
In this work, we used a UAV (i.e. drone) to gain access to the bridge or any other location where images needed to be collected. A FLIR 6.8 mm f/1.3 thermal imaging camera was mounted on the drone to capture thermal images, with which damage inside structural components can be identified, classified, and evaluated. A short tutorial demonstrating the data acquisition is available in this YouTube video (https://www.youtube.com/watch?v=q4ZDH7MDh9E). A total of 2086 crack and non-crack images were collected [37]. The images were split into two groups: one for training and the other for testing or validation. In the next step, a neural network model was developed and compared with a commercially available deep convolutional neural network using the labeled training images. Finally, the trained network was applied to classify unknown images. Figure 3 shows the conceptual framework of data acquisition and model training. Figure 4 and Figure 5 show the DJI Phantom 4 Pro drone and the infrared thermal camera, respectively. The difference between the infrared image and the naked-eye image is shown in Figure 6.

Journal of Transportation Technologies

Table 1. Comparison of image processing models [1].

Researcher                    Methods                     Objectives of the methods
Ying and Salari (2010) [9]    Beamlet Transforms (BT)     Extraction of curvilinear features of crack from pavement images.
Pallotta (2014) [15]          Zernike moments operator    Compute all the edge parameters for subpixel detection.

Data Pre-Processing
All the images were labeled and annotated using the LabelImg software, with each image annotated as either crack or non-crack. Additionally, PIX resizer software was used to reduce the image dimensions to 838 × 600. Data augmentation was applied to enlarge the training and validation data sets. Figure 7 displays the sprite image of all 2086 images (before augmentation) used in this study.
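When images are resized or flipped for augmentation, any bounding-box annotations have to be transformed in the same way. The sketch below illustrates this bookkeeping under several assumptions that the text does not confirm: boxes are stored as (xmin, ymin, xmax, ymax) pixel coordinates, 838 × 600 is read as width × height, and the original image size of 1676 × 1200 is a hypothetical value chosen for the example. The helper names are illustrative.

```python
def scale_box(box, old_size, new_size=(838, 600)):
    """Rescale an (xmin, ymin, xmax, ymax) pixel box after an image resize.

    old_size and new_size are (width, height) tuples.
    """
    sx = new_size[0] / old_size[0]
    sy = new_size[1] / old_size[1]
    xmin, ymin, xmax, ymax = box
    return (round(xmin * sx), round(ymin * sy),
            round(xmax * sx), round(ymax * sy))


def hflip_box(box, width):
    """Mirror a box horizontally, matching a left-right flip of the image."""
    xmin, ymin, xmax, ymax = box
    return (width - xmax, ymin, width - xmin, ymax)


# Hypothetical original image of 1676 x 1200 pixels with one annotated crack.
original_box = (400, 300, 1000, 700)
resized_box = scale_box(original_box, old_size=(1676, 1200))
flipped_box = hflip_box(resized_box, width=838)
```

Here the resized box is (200, 150, 500, 350) and its mirrored counterpart is (338, 150, 638, 350). The same pattern extends to other augmentations such as vertical flips.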

Model Development
A 22-layer convolutional model (20 hidden layers, one input layer, and one output layer) was developed and named Visual Geometry Group 22 (VGG 22). During training, different ratios of training to test images were tried; in one example, 1668 images were used for training the model and 418 images for validation. The input image size was kept at 224 × 224 × 3 (height × width × channel). Figure 8 describes the VGG 22 model architecture.
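The 1668/418 example corresponds to an 80/20 split of the 2086 images. A minimal sketch of such a split is given below; the file names, the fixed random seed, and the function name are assumptions made for illustration, and the study's actual splitting procedure is not described in the text.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle items reproducibly and split them into train and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical file names standing in for the 2086 collected images.
image_ids = [f"img_{i:04d}.jpg" for i in range(2086)]
train_ids, val_ids = split_dataset(image_ids, train_fraction=0.8)
```

With 2086 images and a training fraction of 0.8, this yields 1668 training images and 418 validation images. Changing `train_fraction` reproduces the other ratios mentioned above.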

Model Training
To train the model, a high-performance supercomputer at the UTC SimCenter was used to run the deep learning simulations. Python was the computational tool used to implement deep learning for damage assessment. During VGG-22 model training, we tried multiple combinations of epochs and batch sizes. Ten percent of all images were used for testing/validation and the remaining images were used for training. Data augmentation techniques were deployed to enlarge the training image dataset. Figure 9 presents a feature map of the crack image in hidden layer 3 after the application of 32 filters. The Stochastic Gradient Descent (SGD) optimizer was used to update the network weights. We reached a validation accuracy of more than 78% (Figure 10), which is reasonable considering that the images are noisy and that only a limited number of images were used to train the network. The gap between the training and validation accuracies is attributed to overfitting. Figure 11 shows that the loss becomes relatively stable after 20 epochs. Figure 12 shows the confusion matrix, which summarizes the correct and erroneous detections of the model and measures how well the algorithm identifies the correct class.
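To make the role of the confusion matrix concrete, the following sketch builds one for the two classes and derives the accuracy from it. The labels and predictions are toy values invented for the example and are not results from this study.

```python
def confusion_matrix(y_true, y_pred, labels=("crack", "non-crack")):
    """Return a matrix where rows are true classes and columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Toy labels, for illustration only.
y_true = ["crack"] * 4 + ["non-crack"] * 4
y_pred = ["crack", "crack", "crack", "non-crack",
          "non-crack", "non-crack", "crack", "non-crack"]

cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
```

The off-diagonal entries count the erroneous detections: `cm[0][1]` is the number of cracks missed and `cm[1][0]` is the number of non-crack images flagged as cracks.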
Although the VGG-22 model reached 78% validation accuracy, the Faster RCNN model was chosen to increase the validation accuracy. The Faster RCNN architecture has two network modules: the first is the region proposal network (RPN), and the second generates bounding boxes for the detected objects [38]. Inception v2 was used as the feature extractor with Faster RCNN. The feature extractor is the component that recognizes patterns in the image. The input image size was 838 × 600. The model was trained for up to 60 k global steps. After training, the model reached almost zero loss, as shown in Figure 13. Figure 14 and Figure 15 show the localization loss against the number of global steps.
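The localization loss measures how far the predicted bounding boxes are from the annotated ones. The original Faster RCNN formulation uses a smooth L1 loss on the box regression offsets. Whether the training pipeline used here applies exactly this form, weighting, or normalization is not stated in the text, so the sketch below is an assumption offered for illustration only; the function name and the sample offsets are hypothetical.

```python
def smooth_l1(pred, target, beta=1.0):
    """Sum of smooth L1 terms over paired box offsets.

    Quadratic for small errors (|d| < beta), linear for large ones, which
    makes the loss less sensitive to outlier boxes than a squared error.
    """
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        if d < beta:
            total += 0.5 * d * d / beta
        else:
            total += d - 0.5 * beta
    return total


# Hypothetical predicted and target box offsets (tx, ty, tw, th).
predicted_offsets = (0.5, 0.0, 2.0, 0.0)
target_offsets = (0.0, 0.0, 0.0, 0.0)
loc_loss = smooth_l1(predicted_offsets, target_offsets)
```

In this example the small error of 0.5 contributes 0.125 and the large error of 2.0 contributes 1.5, giving a total of 1.625. A loss approaching zero indicates that predicted and annotated boxes nearly coincide.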

Crack Location
Google Earth and Geotag software were used to retrieve the GPS coordinate information. Figure 16 shows the Google Earth view and the crack location in the UTC old garage building. To find the GPS coordinates of the crack images, an open-source software package named Geotag was used to extract the image latitude, longitude, altitude, county name, city name, and province. The software generates GPS coordinates from the image metadata. In Figure 17, the red box marks a sample of the GPS coordinates found in our collected image data.
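Image metadata typically stores GPS positions as degrees, minutes, and seconds together with a hemisphere reference, whereas mapping tools usually expect signed decimal degrees. The conversion can be sketched as follows. The function name and the sample coordinates are hypothetical values chosen for illustration and are not taken from the collected images.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees.

    South latitudes and west longitudes are returned as negative values.
    """
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if ref in ("S", "W") else value


# Hypothetical coordinates, for illustration only.
latitude = dms_to_decimal(35, 2, 45.6, "N")
longitude = dms_to_decimal(85, 18, 0.0, "W")
```

These sample values convert to approximately 35.046 and -85.3, a pair that can be entered directly into Google Earth to place a marker.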

Crack Length
3D drone mapping was used to determine the area and length of the cracks. Pix4D software was used to process the aerial photogrammetry images into a 3D model. The crack measurements were made with the Pix4D dimension tool, as shown in Figure 18.
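Geometrically, measuring a crack on a 3D model amounts to summing the straight-line distances between successive points picked along the crack. The sketch below shows only this underlying calculation; it does not reproduce how the Pix4D tool works internally, and the function name and sample points are assumptions made for illustration.

```python
import math


def polyline_length(points):
    """Total length of a path through successive (x, y, z) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# Hypothetical points traced along a crack, in model units (e.g. metres).
crack_points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
crack_length = polyline_length(crack_points)
```

For these points the two segments measure 5 and 12 units, so the traced length is 17 units. Picking more points along a curved crack gives a closer approximation of its true length.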

Results
The VGG-22 model shows 80.35% validation accuracy and 70.03% training accuracy at the 19th global step. The Faster RCNN with Inception V2 shows 99% confidence at 60 k global steps. A test image was supplied to the Faster RCNN, and the model successfully detected the crack and non-crack structures with an average confidence of 99%, as demonstrated in Figure 19. A confidence of 99% means that the trained model is 99% confident in its classification of an image as crack or non-crack, which is the highest confidence obtained in this study.

Conclusions
Bridge girder health assessment using deep learning with infrared thermography and drones advances the technology of traditional health monitoring. This method can save time and cost and make structural health assessment more robust and efficient.