Damage Detection in Simply Supported Beams Using 1D Convolutional Neural Networks with Vibration Signals
1. Introduction
Civil engineering structures, such as bridges and buildings, are indispensable components of modern infrastructure [1]. Throughout their service life, they are inevitably subjected to gradual deterioration due to environmental factors, material aging, and unexpected events like earthquakes or impacts. This degradation, if undetected and unaddressed, can compromise structural integrity and potentially lead to catastrophic failures, resulting in significant economic losses and threats to public safety [2]-[4]. Therefore, the development of efficient and reliable structural health monitoring (SHM) systems has become a major research focus, aiming to provide early-stage damage detection and facilitate timely maintenance decisions.
Vibration-based damage detection methods form a cornerstone of modern SHM. These techniques operate on the premise that damage, such as the formation of cracks or stiffness reduction, alters the physical properties of a structure, which in turn modifies its dynamic response characteristics (e.g., natural frequencies, mode shapes, and damping ratios). Traditional methods extensively rely on extracting these modal parameters and using them as damage-sensitive features. While successful in certain applications, these methods face several inherent limitations. The process often requires considerable expert knowledge for feature selection and is susceptible to environmental and operational variations. Furthermore, modal parameters, especially lower-order frequencies, are globally sensitive and may lack the necessary resolution to identify small-scale or localized damage, making precise localization and quantification a challenging task.
In recent years, the advent of data-driven approaches and the rise of deep learning have opened new frontiers in SHM [5]. Among various deep learning architectures, Convolutional Neural Networks (CNNs) have demonstrated exceptional performance in pattern recognition tasks, particularly in automatically learning hierarchical features from raw data. While 2D-CNNs have been widely applied to image-based damage detection, their application to one-dimensional vibration signals often requires pre-processing, such as converting time-series data into time-frequency representations (e.g., spectrograms) [6]. This conversion, however, may introduce information loss or artifacts. In this context, one-dimensional CNNs (1D-CNNs) emerge as a more natural and efficient alternative. They can directly operate on raw temporal vibration signals, automatically extracting discriminative features at multiple scales through their convolutional and pooling layers, thereby bypassing the need for manual feature engineering or signal transformation.
As one of the most fundamental structural elements, the simply supported beam is a common idealization for many bridge decks and structural components. Its simplicity makes it an excellent starting point for developing and validating new damage detection methodologies before applying them to more complex structures. Despite the potential of 1D-CNNs, their application for direct, end-to-end damage diagnosis (encompassing detection, localization, and severity assessment) in simply supported beams using only raw acceleration signals warrants further in-depth investigation, particularly regarding their robustness and performance under noisy operational conditions.
To bridge this gap, this paper proposes a novel damage detection framework for simply supported beams based on a 1D-CNN model. The main contributions of this work are fourfold:
1) An End-to-End Framework: We develop a 1D-CNN model capable of performing automated damage detection, localization, and severity assessment directly from raw acceleration time-history data, eliminating the dependency on manually crafted features or modal analysis.
2) Comprehensive Damage Identification: The model is designed not only to detect the presence of damage but also to precisely identify its location and quantify its severity, providing a more comprehensive diagnostic solution.
3) Robustness Validation: The performance of the proposed method is rigorously evaluated under different levels of simulated measurement noise, demonstrating its superior robustness and practical applicability.
4) Benchmarking Performance: The diagnostic accuracy and efficiency of the proposed 1D-CNN approach are compared with traditional vibration-based methods, highlighting its significant advantages.
The remainder of this paper is organized as follows. Section 2 details the methodology, including the finite element model of the simply supported beam, damage scenario simulation, and the architecture of the proposed 1D-CNN model. Section 3 presents and discusses the numerical results. Finally, Section 4 concludes the paper by summarizing the key findings and outlining potential directions for future research.
2. Methodology
2.1. Finite Element Model and Damage Simulation
A simply supported steel I-beam (Figure 1) with a span of L = 10 m was modeled using commercial finite element software (ABAQUS). The beam was discretized using a sufficient number of 3D solid or beam elements to ensure accurate dynamic response calculation. The material properties were assigned as follows: Young’s modulus E = 210 GPa, Poisson’s ratio ν = 0.3, and density ρ = 7850 kg/m³.
To simulate damage, a common approach of reducing the local bending stiffness (EI) was adopted. In this study, damage was represented as a single transverse crack at various locations along the beam span. The crack was modeled by reducing the cross-sectional area and the moment of inertia of the elements within a specific, short segment of the beam. The severity of the damage was quantified by the percentage reduction in the bending stiffness (ΔEI) at the damaged location, denoted as α (e.g., α = 10%, 20%, 30%).
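The severity definition above can be written compactly as follows. The symbol (EI)_d for the bending stiffness of the damaged segment is introduced here for illustration only and is not notation taken from the FE model.

```latex
(EI)_d = (1 - \alpha)\,EI, \qquad \alpha = \frac{\Delta EI}{EI}, \qquad 0 \le \alpha < 1
```

For example, α = 20% leaves the damaged segment with a bending stiffness of 0.8 EI.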
A total of N = 21 scenarios were simulated: the healthy (baseline) state plus 20 damage scenarios obtained by combining the following parameters:
1) Damage Location (L_d): The crack was introduced at P = 5 points along the beam span (0.2 L, 0.4 L, midspan (0.5 L), 0.6 L, and 0.8 L).
2) Damage Severity (α): For each location, S = 4 severity levels were simulated, corresponding to stiffness reductions of α = 10%, 20%, 30%, and 40%.
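As a quick check of the scenario count, the combinations can be enumerated in a few lines of Python. This is a minimal sketch; the names `LOCATIONS`, `SEVERITIES`, and `build_scenarios` are illustrative and do not come from the study's code.

```python
from itertools import product

# Relative damage locations (fraction of the span) and severities
# (fractional stiffness reduction), as listed in the text.
LOCATIONS = [0.2, 0.4, 0.5, 0.6, 0.8]
SEVERITIES = [0.10, 0.20, 0.30, 0.40]


def build_scenarios():
    """Return all classes: the healthy state first, then every
    (location, severity) combination."""
    scenarios = [("healthy", None, None)]
    for loc, sev in product(LOCATIONS, SEVERITIES):
        scenarios.append(("damaged", loc, sev))
    return scenarios


scenarios = build_scenarios()
print(len(scenarios))  # 1 healthy + 5 locations x 4 severities = 21
```

The position of each tuple in the returned list can serve as the integer class index used later for labeling.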
Figure 1. Simply supported steel I-beam. (a) Simply supported beam FE model with loading and measurement points; (b) Damage location schematic; (c) Damage severity levels.
For each scenario, a transient dynamic analysis was performed. The beam was subjected to Gaussian white noise excitation applied vertically at a point to simulate ambient vibration. The vertical acceleration responses were collected from M sensor locations along the beam’s length at a high sampling frequency (fs) for a duration of T seconds.
2.2. Data Pre-Processing and Dataset Construction
The raw acceleration time-history data obtained from the FE simulations were processed to create a suitable dataset for the 1D-CNN. The following steps were undertaken:
1) Noise Injection: To enhance the model’s robustness and mimic real-world measurement conditions, Gaussian white noise was added to the pristine simulation signals. The signal-to-noise ratio (SNR) was varied (e.g., from 20 dB to 40 dB) to create datasets with different noise levels for training and testing.
2) Signal Segmentation: Long time-history signals were segmented into shorter samples using a sliding window approach with a window length of W = 1024 data points and an overlap of O = 512 points (i.e., 50% overlap). This data augmentation technique significantly increased the number of training samples and improved the model’s generalization ability. Each segmented sample thus formed a 1D input vector of length 1024 for the 1D-CNN model.
3) Labeling: For this study, a classification-based approach was adopted. The problem was discretized, and each data sample was assigned a label using a one-hot encoded vector representing the combined class of damage state (e.g., “Healthy”, “Damaged at Location X with Severity Y”). This integrated label structure allows the model to simultaneously perform damage detection, localization, and severity assessment within a unified classification framework. (Note: A regression-based output, e.g., using a continuous vector like [Damage_Indicator, Location_Parameter, Severity_Parameter], was considered as an alternative but was not pursued in this work).
4) Dataset Splitting: The entire dataset was randomly shuffled and divided into three subsets: a training set (e.g., 70%), a validation set (e.g., 15%), and a test set (e.g., 15%). The training set was used for model learning, the validation set for hyperparameter tuning and preventing overfitting, and the test set for the final, unbiased evaluation of the model’s performance.
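The four steps above can be sketched with NumPy as follows. This is an illustrative outline under stated assumptions, not the pipeline used in the study: the random records stand in for FE acceleration responses, the record length of 8192 points and the 30 dB SNR are arbitrary choices, and all function names are hypothetical.

```python
import numpy as np


def add_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian white noise at the requested SNR (in dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def segment(signal, window=1024, overlap=512):
    """Split a 1D signal into overlapping windows (hop = window - overlap)."""
    hop = window - overlap
    n_windows = (len(signal) - window) // hop + 1
    return np.stack([signal[i * hop: i * hop + window] for i in range(n_windows)])


def one_hot(class_ids, n_classes):
    """Encode integer class ids as one-hot row vectors."""
    encoded = np.zeros((len(class_ids), n_classes), dtype=np.float32)
    encoded[np.arange(len(class_ids)), class_ids] = 1.0
    return encoded


def split(x, y, rng, train=0.70, val=0.15):
    """Shuffle the samples and split them into train/validation/test subsets."""
    order = rng.permutation(len(x))
    n_train = int(train * len(x))
    n_val = int(val * len(x))
    tr, va, te = np.split(order, [n_train, n_train + n_val])
    return (x[tr], y[tr]), (x[va], y[va]), (x[te], y[te])


rng = np.random.default_rng(0)
n_classes, n_points = 21, 8192  # n_points is an assumed record length
samples, labels = [], []
for class_id in range(n_classes):
    clean = rng.standard_normal(n_points)  # placeholder for an FE response
    noisy = add_noise(clean, snr_db=30, rng=rng)
    windows = segment(noisy)
    samples.append(windows)
    labels.extend([class_id] * len(windows))

x = np.concatenate(samples)[..., np.newaxis]  # (n_samples, 1024, 1)
y = one_hot(np.array(labels), n_classes)
train_set, val_set, test_set = split(x, y, rng)
print(x.shape, y.shape)
```

Because adjacent windows share 50% of their points, a purely random split can place overlapping segments in different subsets. Splitting by record before segmentation is a common alternative when stricter separation between subsets is desired.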
2.3. 1D-CNN Architecture and Design
The architecture of the proposed 1D-CNN model is designed to effectively extract hierarchical spatial-temporal features from the segmented acceleration signals at multiple scales. The design rationale for key components is as follows:
1) Kernel Sizes: The kernel sizes were chosen to capture features at different temporal resolutions. A larger kernel size (64) in the first convolutional layer allows the network to learn broader, low-frequency patterns and overall signal trends. Subsequent layers employ progressively smaller kernels (32, 16) to focus on extracting more localized, high-frequency features and finer details indicative of damage.
2) Number of Filters: The number of filters increases with the network’s depth (64, 128, 256). This common design pattern in CNNs allows the initial layers to learn a wide set of basic feature detectors (e.g., specific waveforms or slopes). As the signal is down-sampled by pooling layers, the deeper layers with more filters can then learn to combine these basic features into more complex and abstract representations relevant for the final diagnosis task.
3) Global Average Pooling: A Global Average Pooling layer is used instead of a fully connected layer at the end to significantly reduce the number of parameters, mitigate overfitting, and force the network to learn robust feature maps for each class.
The detailed structure is summarized in Table 1.
Table 1. The architecture of the proposed 1D-CNN model.
| Layer (Type) | Output Shape | Kernel Size/Stride | Activation | Param # |
| --- | --- | --- | --- | --- |
| Input Layer | (W, 1) | - | - | 0 |
| Conv1D-1 | (W, 64) | 64/1 | ReLU | 4,160 |
| MaxPooling1D-1 | (W/2, 64) | 2/2 | - | 0 |
| Conv1D-2 | (W/2, 128) | 32/1 | ReLU | 262,272 |
| MaxPooling1D-2 | (W/4, 128) | 2/2 | - | 0 |
| Conv1D-3 | (W/4, 256) | 16/1 | ReLU | 524,544 |
| Global Average Pooling1D | (256) | - | - | 0 |
| Dropout (0.5) | (256) | - | - | 0 |
| Dense (Output Layer) | (N_output) | - | Linear/Softmax | Varies |
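The output shapes and parameter counts implied by the stated kernel sizes and filter counts can be checked with a short calculation. The sketch below assumes "same" padding (so convolutions preserve the sequence length), bias terms in every layer, W = 1024, and a 21-class output; the last two values are taken from Sections 2.1 and 2.2, while the helper names are hypothetical.

```python
def conv1d_params(kernel_size, in_channels, filters, use_bias=True):
    """Trainable parameters of a 1D convolution layer."""
    return kernel_size * in_channels * filters + (filters if use_bias else 0)


def dense_params(in_features, out_features, use_bias=True):
    """Trainable parameters of a fully connected layer."""
    return in_features * out_features + (out_features if use_bias else 0)


W = 1024       # window length from Section 2.2
N_OUTPUT = 21  # assumed: one class per scenario

length, channels = W, 1
total = 0
rows = []
# (name, kernel size, filters) for the three convolution blocks
conv_blocks = [("Conv1D-1", 64, 64), ("Conv1D-2", 32, 128), ("Conv1D-3", 16, 256)]
for name, kernel, filters in conv_blocks:
    params = conv1d_params(kernel, channels, filters)
    channels = filters
    rows.append((name, (length, channels), params))
    total += params
    if name != "Conv1D-3":  # stride-2 max pooling follows the first two blocks
        length //= 2

# Global average pooling collapses the time axis; dropout has no parameters.
out_params = dense_params(channels, N_OUTPUT)
rows.append(("Dense", (N_OUTPUT,), out_params))
total += out_params

for name, shape, params in rows:
    print(f"{name:10s} {str(shape):12s} {params:>9,d}")
print(f"Total trainable parameters: {total:,d}")
```

Under these assumptions the three convolution layers contribute 4,160, 262,272, and 524,544 parameters, and the 21-class output layer adds 5,397, for a total of 796,373.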
2.4. Model Training and Implementation
The model was implemented using the TensorFlow and Keras deep learning frameworks. The Adam optimizer was chosen for its adaptive learning rate capability, with an initial learning rate of 0.001. Because the classification formulation of Section 2.2 was adopted, categorical cross-entropy was used as the loss function (mean squared error, MSE, would be the counterpart for the regression alternative, which was not pursued). The model was trained for a maximum of 200 epochs with a batch size of 64. To avoid overfitting, an early stopping callback halted training if the validation loss did not improve for 15 consecutive epochs. The model’s performance was monitored using accuracy and loss on both the training and validation sets.
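The early stopping rule can be illustrated independently of any framework. The sketch below is a plain-Python restatement of the stopping logic under the stated settings (200 epochs maximum, patience of 15); the function name is hypothetical and the loss curve is synthetic. In Keras, the same behaviour is typically obtained with `tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=15)`.

```python
def train_with_early_stopping(val_losses, max_epochs=200, patience=15):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training halts once the validation loss has not improved for `patience`
    consecutive epochs, or when `max_epochs` is reached.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return min(len(val_losses), max_epochs), best_epoch


# Synthetic curve: the loss improves for 30 epochs, then plateaus higher.
losses = [1.0 / e for e in range(1, 31)] + [0.05] * 100
print(train_with_early_stopping(losses))  # -> (45, 30)
```

Here the best validation loss occurs at epoch 30, and training stops at epoch 45 after 15 epochs without improvement.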
3. Results and Discussion
This section presents a comprehensive evaluation of the proposed 1D-CNN model’s performance. The results are analyzed from three perspectives: the overall diagnostic accuracy, the model’s robustness to noise, and a comparative study with a traditional method.
3.1. Performance Evaluation of the 1D-CNN Model
The model was trained and tested on the dataset described in Section 2.2, using the multi-class classification setup outlined there. The model’s output is a single health state class (e.g., “Healthy”, “Damage at 0.4 L with 20% severity”), which provides an integrated diagnosis encompassing detection, localization, and severity assessment.
Table 2 summarizes the overall performance of the model on the test set under a moderate noise level (SNR = 30 dB). The model achieved a remarkable overall classification accuracy of 99.2%, demonstrating its powerful capability to distinguish between different damage scenarios.
Table 2. Overall classification accuracy on the test set (SNR = 30 dB).
| Model | Overall Accuracy | Precision | Recall | F1-Score |
| --- | --- | --- | --- | --- |
| Proposed 1D-CNN | 99.2% | 0.992 | 0.992 | 0.992 |
To provide a more detailed view, the normalized confusion matrix is shown in Figure 2. The near-diagonal pattern indicates that the model makes very few misclassifications. Most confusion, where it occurs, is between adjacent damage locations or similar severity levels at the same location, which is a challenging task even for expert analysis. The model shows perfect precision and recall for the “Healthy” class, meaning it never mistakes a healthy state for a damaged one or vice versa, a critical feature for practical SHM.
Figure 2. Normalized confusion matrix for the test set. The x-axis is the predicted label, and the y-axis is the true label. Dark blue cells along the main diagonal (high values) indicate correct predictions.
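For reference, a row-normalized confusion matrix and the summary metrics of the kind reported in Table 2 can be computed as sketched below. Macro-averaging over classes is an assumption, since the averaging scheme is not stated; the function names and the three-class toy labels are illustrative only.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Count matrix with true labels on rows and predicted labels on columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def summarize(cm):
    """Row-normalized matrix, accuracy, and macro precision/recall/F1."""
    tp = np.diag(cm).astype(float)
    support = cm.sum(axis=0 + 1)  # samples per true class (row sums)
    predicted = cm.sum(axis=0)    # samples per predicted class (column sums)
    recall = np.divide(tp, support, out=np.zeros_like(tp), where=support > 0)
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    normalized = np.divide(cm, support[:, None], out=np.zeros(cm.shape),
                           where=support[:, None] > 0)
    return {
        "normalized": normalized,
        "accuracy": tp.sum() / cm.sum(),
        "precision": precision.mean(),
        "recall": recall.mean(),
        "f1": f1.mean(),
    }


# Toy example with three classes (not data from the study).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
report = summarize(confusion_matrix(y_true, y_pred, n_classes=3))
print(report["normalized"])
print({k: round(v, 3) for k, v in report.items() if k != "normalized"})
```

Each row of the normalized matrix sums to one, so the diagonal entries are the per-class recall values.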
Beyond mere classification, the model’s ability to generalize was further investigated by testing it on damage severities not seen during training (e.g., a 15% severity level when only 10% and 20% were trained). The results showed that the model could interpolate effectively, predicting the correct location and a severity close to the unseen one, showcasing its strong feature learning and generalization potential.
3.2. Comparative Study with a Traditional Method
To further highlight the superiority of the proposed method, a comparative study was conducted with a well-established traditional vibration-based method: Damage Detection based on Modal Frequency Shift (MFS).
The traditional method involved:
1) Extracting the first three natural frequencies from the acceleration signals for each scenario using Fast Fourier Transform (FFT).
2) Using the relative frequency shifts between the damaged and healthy states as the damage-sensitive feature.
3) Training a Multi-Layer Perceptron (MLP) classifier on these 3-dimensional feature vectors to perform the same classification task.
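Steps 1 and 2 of this baseline can be sketched as follows. Peak picking on a Hann-windowed amplitude spectrum is an assumed procedure (the text states only that FFT was used), the sign convention of the relative shift is likewise an assumption, and the sinusoidal test signals with their frequencies are synthetic placeholders rather than beam responses from the study.

```python
import numpy as np


def dominant_frequencies(signal, fs, n_peaks=3):
    """Frequencies (Hz) of the n_peaks strongest spectral peaks, ascending."""
    window = np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(signal * window))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    # Interior local maxima of the amplitude spectrum.
    is_peak = (spectrum[1:-1] > spectrum[:-2]) & (spectrum[1:-1] > spectrum[2:])
    peak_idx = np.flatnonzero(is_peak) + 1
    # Keep the strongest peaks, then order them by frequency.
    strongest = peak_idx[np.argsort(spectrum[peak_idx])[-n_peaks:]]
    return np.sort(freqs[strongest])


def relative_shift(f_healthy, f_damaged):
    """Relative frequency drop per mode (positive when stiffness is lost)."""
    return (f_healthy - f_damaged) / f_healthy


def synthetic_response(mode_freqs, fs, n_points, amplitudes=(1.0, 0.6, 0.3)):
    """Sum of sinusoids used here as a stand-in for a measured response."""
    t = np.arange(n_points) / fs
    return sum(a * np.sin(2 * np.pi * f * t) for a, f in zip(amplitudes, mode_freqs))


fs, n_points = 512.0, 4096  # 8 s record, 0.125 Hz frequency resolution
healthy = synthetic_response([4.0, 16.0, 36.0], fs, n_points)
damaged = synthetic_response([3.875, 15.75, 35.5], fs, n_points)

f_h = dominant_frequencies(healthy, fs)
f_d = dominant_frequencies(damaged, fs)
features = relative_shift(f_h, f_d)  # 3-dimensional feature vector
print(f_h, f_d, features)
```

The resulting three-element vector is the kind of input the MLP classifier in step 3 would receive. Its low dimensionality helps explain why frequency-shift features carry limited information about damage location.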
Table 3 presents a comparative summary of the two methods tested on the same dataset (SNR = 30 dB).
Table 3. Performance comparison between the proposed method and the traditional method.
| Method | Feature Extraction | Overall Accuracy | Data Efficiency | Accuracy at SNR = 20 dB |
| --- | --- | --- | --- | --- |
| Traditional (MFS + MLP) | Manual (FFT) | 91.5% | Lower | 83.2% |
| Proposed (1D-CNN) | Automatic | 99.2% | Higher | 96.5% |
The results clearly show that the proposed 1D-CNN method significantly outperforms the traditional MFS-based method in all aspects:
1) Higher Accuracy: The 1D-CNN achieves an accuracy 7.7 percentage points higher than the traditional method (99.2% vs. 91.5%). This is because the 1D-CNN leverages the entire time-domain signal, capturing subtle, non-linear patterns that are lost when the data is compressed into only three modal frequencies.
2) Automatic Feature Extraction: The traditional method requires expert knowledge for modal analysis and feature selection (e.g., choosing which frequencies to use). In contrast, the 1D-CNN is completely end-to-end, eliminating this manual and potentially subjective step.
3) Superior Robustness: The performance gap widens under noisy conditions, from 7.7 percentage points at SNR = 30 dB to 13.3 percentage points at SNR = 20 dB. The accuracy of the traditional method drops to 83.2% at SNR = 20 dB, while the 1D-CNN remains highly reliable at 96.5%. A plausible explanation is that noise distorts the FFT spectrum, making frequency extraction less precise, whereas the CNN learns noise-tolerant features during training.
3.3. Discussion
The outstanding performance of the proposed 1D-CNN model is a direct benefit of its data-driven, end-to-end architecture. Unlike traditional methods that rely on pre-defined features, the 1D-CNN learns a hierarchical feature representation directly from the raw data. While the explicit content of these learned features is not visualized in this study, the model’s ability to outperform methods based solely on global modal parameters (e.g., natural frequencies) strongly suggests that it successfully leverages more nuanced information present in the signal. This likely includes local transient responses and higher-order harmonics, which are sensitive to localized damage but are not conventionally extracted in manual feature engineering.
The study confirms that deep learning models, particularly 1D-CNNs, are capable of surpassing the limitations of traditional SHM methods. The automation of feature extraction reduces dependency on specialist knowledge and makes the system more accessible for routine monitoring. Furthermore, the demonstrated robustness to noise is a crucial step towards deployment in real-world environments where signal quality is often compromised.
A limitation observed, albeit minor, is the model’s occasional difficulty in distinguishing between very closely spaced damage locations with identical severity. This is a fundamental resolution limit tied to the sensor density and the dynamic characteristics of the beam itself. In practical terms, a sparser sensor arrangement would be expected to further degrade the spatial resolution of damage localization. With fewer measurement points, the unique dynamic fingerprint of damage at a specific location becomes less distinct, increasing the likelihood of misclassification between adjacent, un-instrumented segments. This underscores the importance of strategic sensor placement and highlights an inherent trade-off between system cost (number of sensors) and diagnostic precision. Future work could explore sensor fusion techniques, optimization of sensor placement, or more advanced network architectures to mitigate this challenge and maximize performance under constrained sensing resources.
4. Conclusions
This study has successfully developed and validated a novel, data-driven framework for damage detection in simply supported beam structures based on a one-dimensional Convolutional Neural Network (1D-CNN). The core of the proposed method lies in its ability to perform end-to-end damage diagnosis directly from raw acceleration response signals, completely bypassing the need for manual feature extraction or modal analysis. Through extensive numerical simulations encompassing various damage locations and severity levels, the key findings of this research are summarized as follows:
1) The proposed 1D-CNN model demonstrated exceptional performance, achieving an overall accuracy of 99.2% in the multi-class health state classification task, which successfully identifies the healthy state, precisely localizes damage, and quantifies damage severity.
2) The model exhibited remarkable robustness to simulated measurement noise. Even under high-noise conditions (SNR = 20 dB), it maintained an accuracy of 96.5%, significantly outperforming the traditional vibration-based method, whose accuracy fell to 83.2% at the same noise level.
3) A comparative study with a traditional method based on Modal Frequency Shift (MFS) confirmed the superiority of the automatic feature learning capability of the 1D-CNN. The model’s ability to learn discriminative features directly from the raw time-domain data led to a substantial improvement in accuracy (+7.7 percentage points) and reliability compared to methods relying on handcrafted features.
Acknowledgements
This study was partially supported by the Talent introduction program of Guangzhou Railway Polytechnic (No. GTXYR2431), the Natural Science Foundation of Henan Province (No. 252300423474), and Guangdong Basic and Applied Basic Research Foundation (No. 2025A1515010155).