Fusion of Model-Based and Data Driven Based Fault Diagnostic Methods for Railway Vehicle Suspension

Transportation of freight and passengers by train is one of the oldest forms of transport and is now well established in most developing countries, especially in Africa. With the advent and development of high-speed trains, continuous monitoring of the railway vehicle suspension has become significantly important: it helps avoid catastrophic events and ensures comfort, safety, and improved performance while reducing life cycle costs. The suspension system is a very important part of the railway vehicle. It supports the car-body and the bogie, isolates the forces generated by track unevenness at the wheels, and controls the attitude of the car-body with respect to the track surface for ride comfort. Its reliability is directly related to vehicle safety. The railway vehicle suspension often develops faults, such as worn springs and dampers in the primary and secondary suspension. To avoid a complete system failure, early detection of faults in the suspension of trains is of high importance. The main contribution of this research work is the prediction of faulty regimes of a railway vehicle suspension based on a hybrid model. The hybrid model framework has four parts. First, the vehicle suspension system was modeled to generate the vertical acceleration of the railway vehicle. Second, parameter estimation (identification) was performed to obtain the nominal parameter values of the vehicle suspension system from the measured data. Third, a supervised machine learning model based on a support vector machine (SVM) was built to predict the faulty and healthy states of the suspension system components (damage scenarios). Lastly, a new SVM model was developed with the damage scenarios to predict faults on the test data. The level of degradation at which the spring and damper become faulty was determined for both the primary and secondary suspension systems.
How to cite this paper: Ankrah, A.A., Kimotho, J.K. and Muvengei, O.M. (2020) Fusion of Model-Based and Data Driven Based Fault Diagnostic Methods for Railway Vehicle Suspension. Journal of Intelligent Learning Systems and Applications, 12, 51-81. https://doi.org/10.4236/jilsa.2020.123004
Received: November 26, 2019. Accepted: August 10, 2020. Published: August 13, 2020.
Copyright © 2020 by author(s) and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/


Introduction
Rail transport plays an important role in today's global economy. It is the most efficient land-based mode of transport for freight and the most reliable commuting method for passengers. Given the global pervasiveness of railroads, making this transportation mode even more reliable, efficient, and safe is of significant importance. The suspension system is the most crucial part of the railway vehicle: it supports the car-body and the bogie, isolates the forces generated by track unevenness at the wheels, and controls the attitude of the car-body with respect to the track surface for better ride comfort, reliability, and safety of the railway vehicle [1] [2]. Condition monitoring of railway vehicles presents numerous benefits to railway operation. Detecting faulty components at an early stage avoids further deterioration in vehicle performance and improves vehicle safety. Timely repair or replacement of faulty components increases operational reliability and availability. Scheduled maintenance and its related cost can be significantly reduced, because future maintenance may be carried out on demand [3] [6]. Fault detection and isolation is the process of identifying operational faults and finding their root cause. Several conventional approaches have been studied for detecting faults in the wheels, axles, and roller bearings of railway vehicles, including ultrasonic and electromagnetic ultrasound inspection, acoustic wayside detectors, and excessive heat detection systems, but issues of safety, reliability, and availability of the system still linger [7]. The emergence of reliability-based, data-driven, and model-based approaches over the decades has remedied the lingering issues of the conventional approaches. Each category of methods has its own merits and demerits, and in practical applications the categories are often combined into what is called a hybrid model.
The accuracy of data-driven approaches depends heavily on the amount of condition monitoring data available, and they require a large amount of data for training. Deriving models of real physical systems for model-based approaches is very challenging because of system complexity and the stochastic degradation behavior of components, and their accuracy depends on correct modeling of the system. Multiple case studies have validated the use of modern approaches, including data-driven, reliability-based, and model-based methods, as presented in [2] [6] [8] [9] [10] [11] [12].
In 2008, Ding and Mei [13] carried out a study on fault detection and isolation of the bogie suspension components, particularly the dampers of the primary and secondary suspension systems. A cross-correlation function approach was used to detect changes in the acceleration signals. The authors' proposed approach was shown to be effective on a Simulink dataset. Li and Goodall [8] applied a model-based approach to condition monitoring of the railway vehicle system by modelling the dynamics of the car-body and the bogie, and thus the primary and secondary suspension. The authors considered the lateral and yaw models for the dynamic modelling. Fault detection and isolation were done with a Kalman filter approach that generated the residual for diagnostics. Fault isolation was achieved by comparing the power spectral densities (PSDs) or the root mean square (RMS) values calculated from the innovation data before and after the fault alarm. Mori and Tsunashima (2010) [7] demonstrated the possibility of detecting faults in the suspension system of a railway vehicle using an interacting multiple-model (IMM) approach. Measurement data for fault detection were generated by full-vehicle model simulation. The IMM model was integrated with a Kalman filter (KF) to improve detection performance, and it was concluded that the proposed approach could efficiently detect suspension system faults.
Alfi et al. (2011) [14] developed a condition monitoring approach for fault detection and isolation of bogie suspension components using model-free and model-based methods. The model-free approach is a data-driven method based on the Random Decrement Technique (RDT) for signal processing. The authors extracted features for bogie incipient failure detection from the lateral acceleration signal, which was virtually measured by simulation. The model-based approach is a combination of the extended Kalman filter (EKF) and Bayesian statistics. The railway vehicle suspension system model was built and simulated to generate different faulty virtual measurements. Both proposed methods performed well in detecting incipient faults. Wu et al. (2015) [15] investigated incipient failure detection and estimation of a closed-loop secondary suspension system for high-speed trains. They developed a dynamic model of the suspension system and then proposed a total measurable fault information residual (ToMFIR) estimation method. The suspension system was simulated in Matlab/Simulink with external disturbance parameters to generate faulty signals for their study. The authors' proposed approach was able to detect and estimate propagating faults. Melnik et al. (2014) [16] presented an on-line monitoring system for suspension fault detection. Faulty acceleration signals were simulated by altering the stiffness and damping parameters of the suspension system. For the real dataset, they acquired acceleration signals for the primary suspension via sensors located on the bogie frame and for the secondary suspension via sensors installed on the car body. Features extracted from both scenarios were used in fault detection by calculating the Euclidean distances between a faulty signal and a normal one for suspension diagnostics.
A semi-supervised fault detection approach was used by Peng and Jin (2018) [6] to detect faults in a vehicle suspension system with one-class multi-sensor data. The authors modelled the suspension system in SIMPACK to generate multi-sensor data. Features were extracted from the data using a semi-supervised learning method integrated with physics-based domain knowledge to improve the accuracy of the fault detection. A one-class SVM classifier was used to detect faults in each subsystem, and the efficacy of their proposed approach was demonstrated on a rail vehicle suspension system.
In all the works reviewed above, the authors did not consider the level of degradation at which each spring and damper in the primary or secondary suspension system becomes faulty. To merge the merits of the data-driven and model-based approaches and thereby improve accuracy, this study presents a hybrid model that detects and isolates faulty regimes or components in the railway vehicle suspension system and also quantifies the level of degradation at which each component (spring or damper) becomes faulty. The main contributions of this work are quantifying the level of degradation of the springs and dampers of the primary and secondary suspension, identifying the nominal parameter values of the railway vehicle from the measured data, and increasing the prediction accuracy of the proposed approach.
This paper is structured as follows: the railway vehicle suspension and its governing equations are described in Section 2. Section 3 describes the vertical track input and its modeling as an input to the railway vehicle. The proposed approach is described in Section 4. The results and discussion are presented in Section 5, and finally conclusions are presented in Section 6.

Model Description of Railway Vehicle Suspension
A railway vehicle consists of many components. For a vehicle-track dynamics simulation, the mechanical properties of the main components are of interest. The vehicle components can be subdivided into body components and suspension components [1]. The dominating body components are the car-body, bogie frame, and wheelset, which essentially hold the vehicle mass (weight). The main suspension components are various springs and dampers, whose forces are essentially related to the displacements and velocities of the components. The model consists of a car-body, bogies, and a set of wheelsets interconnected via force elements. The primary suspension connects the bogie frame and the wheelset, while the secondary suspension connects the car-body and the bogie frame [1]. The springs and dampers of the primary and secondary suspension systems are the main critical components considered in this work. The bogie and suspension systems of a conventional bogie vehicle are presented, followed by the development of a mathematical model. Figure 2 depicts the body and suspension system of a railway vehicle.
Parameters of the rigid bodies and other parameters of a conventional bogie model [17] which are relevant for the simulation of the railway vehicle are presented in Table 1.
The vertical suspension of the railway vehicle is considered in this work. The vertical dynamic model is designed to study the dynamic response of the vehicle to track irregularities. A nine-degree-of-freedom (9-DoF) model of the dynamics of the railway vehicle is considered. Motions that are directly related to the vertical suspensions are considered, including the vertical acceleration, roll, and pitch motion of the car-body and the two bogies. The equations governing the dynamic behaviour of the railway vehicle suspension were derived using Newton's second law of motion with the help of the schematic diagram (lateral view) in Figure 3. In the equations describing the 3 DoF of the car-body, $z$, $z_1$, and $z_2$ represent the vertical displacements of the car-body, the front bogie, and the rear bogie respectively, $\varphi$ denotes the pitch angle about the centre of gravity, and $\theta$ denotes the roll angle about the centre of gravity of the masses. In the equations describing the front bogie, $d_{1r}$ and $d_{1l}$ denote the vertical displacements of the right and left wheels in the leading wheelset, and $d_{2r}$ and $d_{2l}$ denote the vertical displacements of the right and left wheels in the trailing wheelset.
In the equations describing the rear bogie, $d_{3r}$ and $d_{3l}$ denote the vertical displacements of the right and left wheels in the leading wheelset, and $d_{4r}$ and $d_{4l}$ denote the vertical displacements of the right and left wheels in the trailing wheelset.
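As a much-reduced illustration of how Newton's second law yields vertical equations of motion, the sketch below simulates a two-mass model (one car-body over one bogie) excited by a single wheel displacement. It is not the 9-DoF model of this paper: roll and pitch are ignored, the function name is hypothetical, and the masses, stiffnesses, and damping values are placeholder assumptions rather than the values in Table 1.

```python
def simulate_two_mass(d, dt, mc=1000.0, mb=100.0, ks=4.0e4, cs=4.0e3,
                      kp=4.0e5, cp=4.0e3):
    """Simulate a simplified two-mass vertical suspension model.

    d      : list of wheel (track) displacements, one per time step [m]
    dt     : time step [s]
    mc, mb : car-body and bogie masses (assumed placeholder values)
    ks, cs : secondary stiffness and damping (assumed placeholder values)
    kp, cp : primary stiffness and damping (assumed placeholder values)
    Returns (car-body accelerations, final car-body disp., final bogie disp.).
    """
    z = vz = zb = vzb = 0.0          # car-body and bogie states start at rest
    d_prev = d[0]
    acc = []
    for dk in d:
        vd = (dk - d_prev) / dt      # finite-difference track velocity
        d_prev = dk
        # Force of the secondary suspension acting on the car-body.
        f_sec = ks * (zb - z) + cs * (vzb - vz)
        # Force of the primary suspension acting on the bogie.
        f_pri = kp * (dk - zb) + cp * (vd - vzb)
        a = f_sec / mc               # Newton's second law, car-body
        ab = (f_pri - f_sec) / mb    # Newton's second law, bogie
        # Semi-implicit Euler step: velocities first, then positions.
        vz += a * dt
        vzb += ab * dt
        z += vz * dt
        zb += vzb * dt
        acc.append(a)
    return acc, z, zb


# A constant 1 cm track offset: both masses should settle at 1 cm.
acc, z_end, zb_end = simulate_two_mass([0.01] * 10000, 1e-3)
```

The same force-balance pattern extends to more bodies and to rotational degrees of freedom, at the cost of more states and geometry terms.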

Modelling of the Vertical Track Input
The vertical motion of the railway vehicle and its related motions are considered in this work. The displacement and its derivatives at each of the wheels of the railway vehicle are provided by the rail track. The track inputs are irregular as a result of track misalignment. Track imperfections are the main cause of the dynamic wheel load of a moving railway vehicle. The Gaussian stochastic process has been used by many researchers [4] [17] [18] in modeling the road or track profile. The power spectral density (PSD) of the vertical track irregularity is modeled as a function of the wave number $\Omega$ (rad/m), which denotes the rate of cycle change with respect to distance, treating the track irregularity as a stationary Gaussian random process. Figure 4 shows the vertical track input as a time series, generated from a band-limited noise block with a selected random seed.
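The idea of a seeded, band-limited random track input can be sketched as follows. This is only an illustration under stated assumptions: it passes seeded Gaussian white noise through a first-order low-pass filter, it does not reproduce the PSD model used in this paper, and the function name, cutoff frequency, and noise level are hypothetical.

```python
import math
import random


def track_profile(n, dt, seed=0, sigma=1.0, cutoff_hz=5.0):
    """Return n samples of low-pass-filtered Gaussian noise.

    n         : number of samples
    dt        : sample time [s]
    seed      : random seed, so the same profile can be regenerated
    sigma     : standard deviation of the white noise (assumed value)
    cutoff_hz : cutoff of the first-order low-pass filter (assumed value)
    """
    rng = random.Random(seed)
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (dt + rc)           # discrete first-order filter gain
    y = 0.0
    profile = []
    for _ in range(n):
        w = rng.gauss(0.0, sigma)    # white Gaussian sample
        y += alpha * (w - y)         # low-pass filtering step
        profile.append(y)
    return profile


profile = track_profile(2000, 0.001, seed=42)
```

Fixing the seed makes the irregularity repeatable, which is what allows healthy and faulty simulations to be compared under the same track input.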

Experiment and Simulation Setup
The suspension system under investigation in this work is shown in the corresponding figure. Figure 6 shows the modelling process of the suspension system. The modelling of the suspension system in Matlab-Simulink was done in three sections, as shown in Figure 6. The first section is the input. Here the system parameters, which include the masses, stiffnesses, damping coefficients, and inertias of the suspension system, serve as the initial parameters for simulation; the track irregularities (track profile) serve as the inputs $d_1$, $d_2$, $d_3$, and $d_4$ to the wheelsets, as shown in Figure 4; and the speed of the railway vehicle ($V$) is the last input. The second section is the process, where the equations governing the dynamic behavior of the railway vehicle suspension (Equations (1)-(9)) are represented by blocks from the Matlab-Simulink library. The last section is the output, where the vertical acceleration, pitch, and roll of each rigid body (the car-body and the bogies) are modelled; only the vertical acceleration of each body is considered in this work.

Data Description
The data were provided by [19]. Table 2 gives a detailed description of the data. The vertical acceleration measurements from the data were used to identify the parameters of the suspension system.

Parameter Identification or Estimation Scheme
The parameters of the model are identified with the parameter estimation toolbox in Matlab-Simulink. The nonlinear least squares optimization method, which reduces the distance between the model response and the measured response, is used with the trust region algorithm to minimize the objective function. The objective function is given by $SSE = \sum_{i=1}^{N} \left( y_m(i) - y_p(i) \right)^2$, where $SSE$ is the sum of squared errors, $y_m$ is the measured value, $y_p$ is the predicted value, and $N$ is the number of samples. The trust region algorithm works by formulating a quadratic model of the objective function within a given radius (the trust region). The algorithm converges when the norm of the step falls below the specified tolerance values. The tolerance value for both the cost function and the parameters is 0.001. The trust region algorithm converges fast, can step out of local minima, and requires fewer function or gradient evaluations. The parameters to be identified include the suspension stiffness ($K_s$). Figure 7 shows the parameter identification scheme. The measured vertical acceleration signal of the car-body (azs_1) [19] was compared with the car-body vertical acceleration signal of the Simulink model, and the vehicle parameters were estimated by applying the optimization algorithm to the objective function, that is, by reducing the SSE between the measured and simulated signals. The model is numerically integrated several times while the objective function is evaluated to obtain a minimal SSE. The vehicle parameters that give the least SSE are then written back to the Simulink model and become its identified (adopted) parameter values.
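The identification loop can be sketched in a few lines. The example below is a minimal illustration, not the toolbox procedure: it replaces the trust region algorithm with a one-dimensional golden-section search, fits a single hypothetical gain parameter, and uses a toy linear model in place of the Simulink model. Only the SSE objective is taken from the text.

```python
def sse(measured, predicted):
    """Sum of squared errors between measured and predicted samples."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted))


def fit_scalar(model, measured, lo, hi, tol=1e-3):
    """Golden-section search for the scalar parameter that minimizes SSE.

    model    : callable(parameter) -> list of predicted samples
    measured : list of measured samples
    lo, hi   : search bounds for the parameter
    tol      : width of the final search interval
    Assumes the SSE has a single minimum inside [lo, hi].
    """
    ratio = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if sse(measured, model(c)) < sse(measured, model(d)):
            b = d                    # minimum lies in [a, d]
        else:
            a = c                    # minimum lies in [c, b]
    return (a + b) / 2


# Toy stand-in for the simulation: response proportional to a gain k.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
measured = [3.2 * x for x in xs]
k_hat = fit_scalar(lambda k: [k * x for x in xs], measured, 0.0, 10.0)
```

In the actual scheme each objective evaluation requires a full simulation run, which is why an algorithm that needs few function evaluations is attractive.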

Proposed Methodology
In this section, a fault detection method based on a hybrid model is discussed in detail. The hybrid model combines the model-based and data-driven approaches to detect and isolate faulty components in the primary suspension (between the wheelsets and the bogie frame) and the secondary suspension (between the bogie frame and the vehicle body) of a railway vehicle, as shown in Figure 8. Faults were induced in the Simulink model by reducing the stiffness and damping coefficients of the springs and dampers of the primary and secondary suspension systems. An SVM model was developed with the measured data in [19]. The developed SVM model was used to determine the health state of each damage scenario, that is, each sequential loss of stiffness or damping coefficient from 10% to 80% of the nominal parameter values. Features were extracted from the vertical acceleration of the Simulink model for each damage scenario. The extracted features were then fed into the machine learning (ML) algorithm with their targets (health states) to build a new classifier (SVM model). The new SVM model was then used to predict on the test data in [19] based on selected features from the measured vertical acceleration.
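The damage scenarios described above can be enumerated programmatically. The sketch below is illustrative only: the parameter names and nominal values are hypothetical placeholders (not the identified values of this paper), and it assumes one parameter is degraded at a time in 10% steps from 10% to 80%.

```python
def damage_scenarios(nominal, loss_percents=range(10, 90, 10)):
    """Build one degraded parameter set per (parameter, loss) pair.

    nominal       : dict of nominal parameter values
    loss_percents : percentage losses to apply, one parameter at a time
    Returns a dict mapping (parameter name, loss percent) to a parameter set.
    """
    scenarios = {}
    for name in nominal:
        for pct in loss_percents:
            degraded = dict(nominal)             # copy, keep nominal intact
            degraded[name] = nominal[name] * (100 - pct) / 100
            scenarios[(name, pct)] = degraded
    return scenarios


# Hypothetical nominal values: secondary/primary stiffness and damping.
nominal = {"k_s": 4.0e5, "c_s": 2.0e4, "k_p": 1.0e6, "c_p": 3.0e4}
scenarios = damage_scenarios(nominal)
```

Each resulting parameter set would be passed to the simulation model to generate one vertical acceleration signal for feature extraction.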

Diagnostic Model Development
Fault diagnostics involves identifying and isolating the fault (failed component), the failure mode (cause of failure), and the degradation level in condition monitoring. Figure 9 and Figure 13 illustrate the diagnostic approach and the workflow of the health state evaluation approach to diagnostics. A diagnostic model to detect faults in the suspension system is developed as follows. The procedure is laid out in Figure 9.
1) Data preparation: The raw acceleration measurements are pre-processed to extract time-domain and frequency-domain features from a data set that contains labeled and unlabeled data.
2) Feature selection: Extracted features are selected based on their importance, using the backward approach, to improve classification accuracy and to remove irrelevant features. The detailed procedure will be discussed in Section 6.5.
3) Fault detection: A support vector machine (SVM) classifier trained on features extracted from the labeled data was used to determine the normal and faulty conditions of the data set. The trained SVM model was used to predict the state of the unlabeled data set. The reason for choosing SVM and the procedure will be discussed in Section 6.6.

Feature Extraction
Feature extraction is a usual step in all diagnostic and prognostic approaches. It is the process of obtaining time-domain, frequency-domain, and time-frequency-domain features from raw signal data acquired by sensors. Feature extraction reduces the dimensionality of the data, removes redundancy, and hence minimizes the complexity and the computational requirements of the machine learning algorithm [20] [21] [22]. A total of 12 features were extracted from each signal: 9 features from the time domain and 3 from the frequency domain, as shown in Table 3 and Table 4 respectively. Table 4 shows the frequency-domain features used in this work to identify various types of faults.
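A minimal sketch of time-domain feature extraction is given below. The nine statistics computed here are common choices in condition monitoring and are an assumption for illustration; they are not necessarily the nine features listed in Table 3. The function assumes a non-constant, non-empty signal.

```python
import math


def time_features(x):
    """Compute common time-domain features of a signal x (list of floats)."""
    n = len(x)
    mean = sum(x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    var = sum((v - mean) ** 2 for v in x) / n    # population variance
    std = math.sqrt(var)
    peak = max(abs(v) for v in x)
    mean_abs = sum(abs(v) for v in x) / n
    return {
        "mean": mean,
        "rms": rms,
        "std": std,
        "peak": peak,
        "peak_to_peak": max(x) - min(x),
        "crest_factor": peak / rms,
        "skewness": sum((v - mean) ** 3 for v in x) / (n * std ** 3),
        "kurtosis": sum((v - mean) ** 4 for v in x) / (n * var ** 2),
        "shape_factor": rms / mean_abs,
    }


features = time_features([1.0, -1.0, 1.0, -1.0])
```

Applying such a function to every acceleration signal turns each signal into one fixed-length row of a feature table, which is the form a classifier expects.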

Feature Selection
Feature selection is one of the dimensionality reduction approaches. It helps to remove irrelevant or redundant features and to increase the classification accuracy. Feature selection techniques obtain a new feature set from the original set [21]. Figure 10 shows the procedure for evaluation and selection. The original feature set is $f = (X_1, X_2, X_3, X_4, \ldots, X_n)$, where $X_1, X_2, X_3, X_4, \ldots, X_n$ are the extracted features.
The selected feature subset is $f' = (X'_1, X'_2, X'_3, X'_4, \ldots, X'_m)$, where $X'_1, X'_2, X'_3, X'_4, \ldots, X'_m$ are the selected features. The feature subset $f'$ gives the optimum performance with respect to some objective function and significance criterion. The backward feature selection algorithm was used in this work.
The full feature set from the training data set is fed into the search algorithm to produce a feature subset. The feature subset is then used with the ML algorithm, and the prediction accuracy is obtained. Features with the smallest impact on the error are dropped until the whole procedure is completed. The final feature subset is then fed into the ML algorithm for training.
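The backward selection loop can be sketched generically. The version below is an assumed greedy variant: at each step it drops the feature whose removal gives the best score and stops when every removal would lower the score. The scoring function is a toy stand-in for training and evaluating the classifier, and all names are hypothetical.

```python
def backward_select(features, score, min_features=1):
    """Greedy backward feature elimination.

    features     : list of feature names (the full set)
    score        : callable(list of names) -> score, higher is better
    min_features : never go below this many features
    Returns (selected feature names, best score).
    """
    selected = list(features)
    best = score(selected)
    while len(selected) > min_features:
        # Score every subset obtained by removing exactly one feature.
        candidates = [(score([f for f in selected if f != g]), g)
                      for g in selected]
        cand_score, drop = max(candidates)
        if cand_score < best:
            break                    # every removal hurts: stop here
        best = cand_score
        selected.remove(drop)
    return selected, best


# Toy score: reward two informative features, penalize subset size.
informative = {"rms", "kurtosis"}


def toy_score(subset):
    return len(informative & set(subset)) - 0.01 * len(subset)


selected, best = backward_select(["rms", "peak", "kurtosis", "mean"], toy_score)
```

In practice the score would be a cross-validated classification accuracy, so each step costs one model fit per remaining feature.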

Support Vector Machine (SVM)
Support vector machine (SVM) is a supervised learning algorithm that can be used for binary data classification problems. The binary classification problem is solved by constructing an optimal hyperplane as a decision surface such that the margin of separation between the two classes in the data is maximized, as shown in Figure 11. Data points that fall on the boundary planes are called support vectors. The separating hyperplane is given by $W \cdot x + b = 0$, where $W$ is a vector orthogonal to the hyperplane and $b$ is a constant. The hyperplane $(W, b)$ that separates the data is described by the function $f(x) = \operatorname{sign}(W \cdot x + b)$ for a good classification of the data [20]. Choosing the hyperplane such that the boundary planes are at a functional distance of at least 1 from it results in the constraints $y_i (W \cdot x_i + b) \ge 1$ for every training sample $x_i$ with class label $y_i \in \{-1, +1\}$.
SVM problems involve maximizing the margin separating the two classes of data, as shown in Figure 12. The margin between the two classes is given by $2 / \lVert W \rVert$. The kernel function allows the construction of a hyperplane in a higher-dimensional feature space without explicitly performing calculations in this feature space. Typical kernel functions include the linear, polynomial, radial basis function (RBF), and multi-layer perceptron kernels.
The SVM is used in two phases. In the training phase, the classification algorithm (SVM) uses labeled data to learn the underlying behavior of the different health states. This results in a classification model with the necessary weights, biases, and parameters. In the testing or online phase, unlabeled features from condition monitoring data of a similar unit are used as the input to the algorithm for classification (Figure 13).
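The training and testing phases can be illustrated with a small linear SVM. This is a sketch under assumptions, not the classifier used in this work: it handles only the linear kernel, it minimizes the regularized hinge loss by batch subgradient descent instead of solving the usual quadratic program, and the data, hyperparameters, and function names are hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Train a linear SVM by batch subgradient descent.

    Minimizes lam/2 * ||w||^2 + mean(max(0, 1 - y_i * (w . x_i + b))).
    X : list of feature vectors, y : list of labels in {-1, +1}.
    Returns the weight vector w and the bias b.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        gw = [lam * wj for wj in w]      # gradient of the regularizer
        gb = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:               # sample violates the margin
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b


def predict(w, b, x):
    """Classify x with the decision function sign(w . x + b)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Training phase on labeled toy data (+1 healthy, -1 faulty, assumed coding).
X = [[2, 2], [3, 3], [2, 3], [-2, -2], [-3, -3], [-2, -3]]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
# Testing phase on an unlabeled sample.
label = predict(w, b, [4, 4])
```

A kernel SVM follows the same two phases but replaces the inner product with a kernel evaluation, which is what library implementations provide.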

Faults in Suspension System
The suspension fault types considered in this work are the sequential stiffness loss and damping coefficient loss of the secondary and primary suspension systems. Faults associated with dampers and springs include leakage, seal wear, fatigue crack propagation, and material deformation (yielding or creep) [5]. The failure mechanisms considered in this study are leakage for dampers and fatigue crack propagation for springs. These failure mechanisms reduce the stiffness and damping effect and, as a result, change the dynamic behavior of the railway vehicle. Suspension faults were artificially introduced into the Simulink model, as summarized in Table 5.
The dynamic behaviour of each damage scenario, that is, the vertical acceleration of the car-body and the bogie, was extracted. Features were extracted from the vertical acceleration of each scenario. The selected features were then fed into the classifier to determine the state of each scenario. After the state of each damage scenario was determined, a model was trained using the features extracted from the signal (vertical acceleration). The trained model was used to predict the state and possible failure mode of the testing data.
Figure 14 represents the vertical displacement of the track input obtained from the simulated data in Figure 4, which serves as an input to the Simulink model, and Figure 15 shows its first-order derivative at the vehicle speed of 180 km/h. The magnitude of the track input depends on the vehicle speed. The PSD of a typical road profile [26] matches the simulated output in Figure 14, which has a displacement amplitude of 19 cm, showing a good correlation with a typical road profile. The simulated output is the processed output of the developed Simulink model in Figure 4. The deviation in SSE is a result of incorrect system parameters used for simulation with respect to the measured data [19]. A lower SSE corresponds to more accurate system parameters, and vice versa. Figure 17 shows how the simulated and measured responses match after the optimization algorithm. A minimum SSE of 24.459 was obtained after the algorithm, which indicates better system parameter values for the railway vehicle suspension than the initial system values in Table 1. The obtained system parameter values are also relevant for condition monitoring of the suspension system.

Parameter Identification or Estimation
The SSE after the algorithm was 24.459, reached after 31 iterations, as shown in Figure 18. It can also be seen in Figure 17 that the measured response matches the simulated response at all peaks, which indicates a good correlation. Table 6 gives the estimated nominal values of the suspension system based on the measured data, extracted from Figure 19. Figure 20 and Figure 21 depict the model prediction diagrams for the secondary and primary suspension systems with different features from the measured data.

Support Vector Machine Trained Model
The normal or healthy class is in red, the faulty class is in blue, and × represents a wrong prediction of the model. The data from normal conditions are clustered together in the extracted feature space. However, the data from faulty conditions are in the same cluster too. The normal and faulty samples are more separable in

Model Evaluation
The metric used to assess the performance of the model is the accuracy, described by

$$\text{Accuracy} = \frac{\text{Number of experiments correctly predicted}}{\text{Total number of testing experiments}} \qquad (17)$$

Figure 22 illustrates the confusion matrices based on the model prediction accuracy of 0.988 and 0.989 for the secondary and primary suspension systems respectively.
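The accuracy metric can be computed directly from a confusion matrix, as in the short sketch below. The counts are made-up illustrative numbers, not results from this work, and the function name is hypothetical.

```python
def accuracy(confusion):
    """Accuracy from a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical two-class example: rows are true healthy / true faulty.
acc = accuracy([[50, 2], [3, 45]])
```

The diagonal holds the correctly predicted experiments, so the ratio of the diagonal sum to the grand total is exactly the accuracy definition above.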
This SVM model detects anomalies in the data: it classifies the data as healthy or faulty but does not explain the real cause of the fault, that is, whether it is the damper or the spring. Therefore, it is necessary to develop a model that explains the origin of the faulty or healthy output, as presented later.

Suspension Fault Conditions
The secondary and primary suspensions are crucial for the comfort of the vehicle. The secondary suspension reduces the effect of vibration from the bogie frame on the car-body, while the primary suspension ensures the guidance of the wheelset on the track and reduces vibrations transmitted from the wheelsets. Figure 23 relates to the secondary suspension and the figure that follows it to the primary suspension. At nominal condition the railway vehicle moves with a velocity of 180 km/h. The vibration effect on the car-body is very minimal, and the natural frequency falls within the range to which human beings are most sensitive. Low vibration effects correspond to healthy system components, and vice versa. Table 7 shows the sequential damping coefficient loss and the sequential stiffness loss of the secondary and primary suspension at 0.8 and 0.3 reduction of the nominal parameter values. It lists the peak value, the maximum FFT amplitude, and the frequency at maximum FFT amplitude. Table 8 shows the prediction of damage scenarios of the damper and the spring in the secondary suspension system, and it indicates that the damper and spring become faulty in use when they degrade by 50% and 40% respectively of the nominal value. Table 9 (damage scenarios prediction using the trained SVM model) shows the prediction of damage scenarios of the damper and the spring in the primary suspension system, and it indicates that the damper and spring become faulty in use when they degrade by 30% and 40% respectively of the nominal value. Figure 25 and Figure 26 illustrate the model prediction diagrams for the secondary and primary suspension systems with different features from the simulated data. The yellow class represents the healthy or normal samples (healthy spring and damper), the blue class represents a faulty spring, the red class represents a faulty damper, and × represents a wrong prediction of the model.
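The three quantities reported for each damage scenario (peak value, maximum FFT amplitude, and frequency at maximum FFT amplitude) can be computed as sketched below. This is an assumed illustration: it uses a plain discrete Fourier transform in place of an FFT library, scales the spectrum to single-sided amplitude, ignores the DC and Nyquist bins, and the function name and test signal are hypothetical.

```python
import cmath
import math


def spectral_peak(x, fs):
    """Return (peak value, max spectral amplitude, frequency at that max).

    x  : list of signal samples
    fs : sampling frequency [Hz]
    """
    n = len(x)
    best_amp, best_freq = 0.0, 0.0
    for k in range(1, (n + 1) // 2):     # skip DC and Nyquist bins
        xk = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                 for i, v in enumerate(x))
        amp = 2.0 * abs(xk) / n          # single-sided amplitude
        if amp > best_amp:
            best_amp, best_freq = amp, k * fs / n
    peak = max(abs(v) for v in x)
    return peak, best_amp, best_freq


# Unit-amplitude 5 Hz sine sampled at 100 Hz for one second.
signal = [math.sin(2 * math.pi * 5 * i / 100) for i in range(100)]
peak, amp, freq = spectral_peak(signal, 100.0)
```

For long acceleration records an FFT routine would be used instead of this direct sum, since the direct sum scales quadratically with the signal length.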
The model predicts the faulty and healthy classes accurately, with a low prediction error of 0.156. The misclassifications arise because the model cannot efficiently detect the differences between some healthy and faulty samples. A comparison of the prediction accuracies of several classifiers using the same features is presented in Table 10. It shows that the proposed approach (SVM) correctly predicts most of the experiments, with an accuracy of 0.844. Figure 27 illustrates the confusion matrices for the SVM model predictions in Figure 25 and Figure 26, based on the model prediction accuracy of 0.844 (using Equation (17)) with a minimum prediction error of 0.156 for the primary and secondary suspension systems.
Based on the results obtained, the following conclusions were drawn:

Conclusions
1) The developed Simulink model of the suspension system made it possible to identify the nominal parameter values by comparing the simulated and measured data, with an SSE of 24.459.
2) The proposed models for the primary and secondary suspension were able to predict anticipated faults and the level of degradation of each component with an accuracy of 0.844.
3) The damper and spring become faulty when their nominal values degrade by 50% and 40% respectively for the secondary suspension system, and by 30% and 40% respectively for the primary suspension system.
4) The proposed hybrid model was able to predict the exact fault in the suspension system on the test data.