Hybrid Methodology for Structural Health Monitoring Based on Immune Algorithms and Symbolic Time Series Analysis

This hybrid methodology for structural health monitoring (SHM) is based on immune algorithms (IAs) and symbolic time series analysis (STSA). Real-valued negative selection (RNS) is used to detect damage, and the adaptive immune clonal selection algorithm (AICSA) is used to localize and quantify the damage. Data symbolization by using STSA alleviates the effects of harmful noise in raw acceleration data. This paper explains the mathematical basis of STSA and the procedure of the hybrid methodology. It also describes the results of a simulation experiment on a five-story shear frame structure, which indicated that the hybrid strategy can efficiently and precisely detect, localize and quantify damage to civil engineering structures in the presence of measurement noise.


Introduction
Structural health monitoring (SHM) is a vast, interdisciplinary research field whose literature spans several decades. The focus of SHM research is the detection, localization, and quantification of damage in a variety of structures. Broadly speaking, SHM techniques for detecting, localizing, and quantifying damage rely on measuring the structural response to ambient vibrations or forced excitations. Ambient vibrations can be caused by earthquakes, wind, or passing vehicles, and forced vibrations can be delivered by hydraulic or piezoelectric shakers. SHM techniques infer the existence, location and severity of damage by detecting differences in local or global structural responses before and after the damage occurs.
Some success has been achieved with various heuristic optimization algorithms. Simulated annealing (SA) and genetic algorithm (GA) methods have been used to accurately describe the dynamic behavior of structures [1]. Cunha & Smith used GAs to identify the elastic constants of composite materials [2]. Particle swarm optimization (PSO) has been used to estimate the severity of damage and identify the parameters of shear frame building structures [3]. An improved clonal selection algorithm (CSA), called adaptive immune CSA (AICSA), has been used for structural damage localization and quantification [4,5]. More recently, a novel pattern identification technique called symbolic time series analysis (STSA) was developed. The core concept of STSA is the identification of statistical patterns from symbol sequences generated by coarse-graining of time series data. STSA for anomaly detection in complex systems [6] has the potential to deal with noise. Several case studies [7][8][9] have shown that STSA is more effective at anomaly detection than pattern recognition techniques such as principal component analysis and neural networks. STSA has also been used for fault detection in electromechanical systems, e.g., three-phase induction motors [10].
In this paper, we propose a hybrid methodology combining immune algorithms (real-valued negative selection (RNS) and AICSA) and symbolic time series analysis (STSA) for detection, localization and quantification of damage to structural systems. In this methodology, RNS detects damage, and AICSA localizes and quantifies it by minimizing the Euclidean distance between the state sequence histograms (SSHs) that STSA obtains by transforming the raw acceleration data. We mathematically show that STSA improves noise immunity, and our experimental results show that this hybrid strategy can efficiently and precisely detect, localize and quantify damage to civil engineering structures in the presence of measurement noise.

Real-Valued Negative Selection and Adaptive Immune Clonal Selection Algorithm
The negative selection (NS) algorithm [11] was inspired by observation of the activity of the human immune system, in particular, the selection process that takes place inside the thymus. In this process, T-cells that recognize the body's own cells (self cells) are eliminated; this guarantees that the remaining T-cells will recognize only foreign molecules. Gonzalez et al. [12] proposed a new negative selection algorithm that uses a real-valued representation of the self/non-self space. RNS tries to alleviate some of the drawbacks of NS while using a higher-level representation, the real-valued space, to speed up the detector generation process. Refer to [13] for the detailed procedure of RNS for damage detection.
Inspired by the clonal selection principle (CSP), the clonal selection algorithm (CSA) has been used to deal with optimization problems, thanks to its superior search capability compared with classical optimization techniques [14]. CSP explains how an immune response is mounted when a non-self antigenic pattern is recognized by B cells. In natural immune systems, only the antibodies that can recognize the intruding antigens are selected to proliferate by cloning [15]. Hence, the main idea of CSA is that those cells (antibodies) capable of recognizing the non-self cells (antigens) will proliferate. Although CSA has great advantages over GA, it is still difficult to solve complex problems with it. To address this, AICSA embodies three strategies (secondary response, adaptive mutation regulation, and vaccination) that speed up CSA's convergence and improve its ability to find the global optimum. For more information about AICSA, please refer to [16].

Symbolic Time Series Analysis
It may be appropriate to say that, while classical data analysis focuses on individuals, symbolic data analysis deals with concepts, a less specific type of information. The original time series signals are converted into sequences of discrete symbols, and the statistical features of the symbols can be used to describe the dynamic status of a system. Consider a structural system whose raw acceleration data is recorded by sensors. A section of this data, denoted as $\{\ddot{x}_0, \ddot{x}_1, \ldots, \ddot{x}_{T-1}\}$, can be obtained by sliding a rectangular window of length $T$ along the time series of the raw acceleration data. The first step is to transform the raw acceleration data into a binary symbol sequence. We can then derive the statistics of the symbolic states, i.e., compute the vector of the observed state frequencies, which forms the state sequence histogram (SSH).
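The symbolization step can be illustrated with a minimal Python sketch. It assumes, in line with the noise-immunity discussion later in the paper, that the partition line is the window mean and that an SSH has $2^r$ states for word length $r$. The function name `state_sequence_histogram`, the use of overlapping words, the binary-number encoding of each word, and the rule that samples equal to the mean map to 0 are illustrative assumptions, not details confirmed by the text.

```python
from collections import Counter

def state_sequence_histogram(window, word_length):
    """Return the relative frequency of each of the 2**word_length states."""
    # Assumption: the partition line is the mean of the window.
    mean = sum(window) / len(window)
    # Binary symbolization: 1 above the partition line, 0 otherwise.
    symbols = [1 if x > mean else 0 for x in window]
    # Assumption: words overlap, so a window of T samples yields T - r + 1 words.
    n_words = len(symbols) - word_length + 1
    counts = Counter()
    for i in range(n_words):
        state = 0
        for s in symbols[i:i + word_length]:
            state = (state << 1) | s  # read the word as a binary number
        counts[state] += 1
    return [counts[s] / n_words for s in range(2 ** word_length)]

# Toy window (hypothetical values, not data from the paper).
window = [1.0, -2.0, 3.0, 4.0, -5.0, -1.0, 2.0, 6.0]
ssh = state_sequence_histogram(window, word_length=2)
print(ssh)
```

The histogram always sums to one, so two windows of the same length can be compared state by state.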

Introduction
In our methodology, the self (non-self) element is an SSH obtained by STSA from raw acceleration data of a healthy (damaged) structure. The detector is represented as a redistribution of the states of self elements.
We introduce an index, the relative state sequence histogram error (RSSHe), to measure the distance between two histograms $SSH_a$ and $SSH_b$; in its definition, $a_i$ and $b_i$ are the frequencies of state $i$ in $SSH_a$ and $SSH_b$, respectively.

Procedure
Our methodology has two stages: damage detection using RNS and damage localization and quantification using AICSA. The procedure is as follows (see Figure 1).
Stage 1: Damage detection.
1) Training phase
a) Raw acceleration data of the healthy structure is symbolized by using STSA to create a corresponding $SSH_j$.
b) To create the self set, the RSSHe distances between the $j$th SSH and all previous SSHs in the self set are calculated. If one of the distances is less than the predefined self radius $r_s$, the $j$th SSH is discarded. Otherwise, it is stored in the self set as a new self element. This procedure is repeated until the predefined criterion is reached.
c) RNS is applied to the self set to generate detectors.
2) Detection phase
a) New raw acceleration data of a structure is symbolized into a corresponding $SSH_j$ by using the same method as in step a) of the training phase.
b) When matching $SSH_j$ with the detectors generated in the training phase, if any detectors are activated (the distance between them is less than a certain value), a signal indicating the occurrence of the abnormal structural state will be given.
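The self-set construction and detector matching of Stage 1 can be sketched as follows. This is only an illustration under stated assumptions: the paper measures distance with its RSSHe index, whereas the sketch substitutes a plain Euclidean distance as a placeholder, and the names `build_self_set` and `is_abnormal`, the `(center, radius)` detector layout, and all numeric values are hypothetical.

```python
import math

def euclidean(a, b):
    # Placeholder distance; the paper uses its RSSHe index here instead.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_self_set(histograms, self_radius, dist=euclidean):
    """Keep an SSH only if it is at least self_radius from every stored one."""
    self_set = []
    for h in histograms:
        if all(dist(h, s) >= self_radius for s in self_set):
            self_set.append(h)
    return self_set

def is_abnormal(histogram, detectors, dist=euclidean):
    """A detector (center, radius) is activated when the SSH falls inside it."""
    return any(dist(histogram, center) < radius for center, radius in detectors)

# Hypothetical two-state histograms from a healthy structure.
healthy = [[0.50, 0.50], [0.51, 0.49], [0.60, 0.40]]
self_set = build_self_set(healthy, self_radius=0.05)

# Hypothetical detector; in the paper, detectors come from RNS.
detectors = [([0.90, 0.10], 0.10)]
print(len(self_set), is_abnormal([0.88, 0.12], detectors))
```

The sketch processes a fixed list of histograms; the stopping criterion mentioned in step b) is not modeled.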
Stage 2: Damage localization and quantification.
In the research field of structural parameter identification, the time response of the system is usually compared with that of a parameterized model using a norm or some performance criterion to give us a measure of how well the model explains the system.
Suppose that damage to a structure has been detected by the above procedure. Moreover, suppose that there is a parameterized model able to capture the behavior of the physical system, and this model depends on a set of $n$ parameters, i.e., $x = (x_1, x_2, \ldots, x_n)$. Given a candidate parameter value $x$ and a guess $\hat{X}_0$ of the initial state, $\hat{y}(x, t_i)$, the value of the parameterized model, i.e., the identified system at the $i$th discrete time step, can be obtained. Hence, the problem of system identification boils down to finding a set of parameters that minimize the prediction error between the system output $y(t_i)$, which is the measured data, and the model output $\hat{y}(x, t_i)$, which is calculated at each time instant $t_i$.
Usually, our interest would lie in minimizing the predefined error norm of the time series outputs, e.g., the following mean square error (MSE) function,

$$f(x) = \frac{1}{T}\sum_{i=1}^{T}\left\lVert y(t_i) - \hat{y}(x, t_i)\right\rVert_2^2,$$

where $\lVert\cdot\rVert_2$ represents the Euclidean norm of vectors and $T$ is the number of time steps. Formally, the optimization problem requires one to find a set of $n$ parameters $x \in \mathbb{R}^n$ so that a certain quality criterion is satisfied, namely, that the error norm $f(\cdot)$ is minimized. The parameters of the model are taken to be equivalent to the parameters of the structure.
In our methodology, instead of comparing raw acceleration data directly, the Euclidean distance between the SSHs from the structure and the model is used as a measure of the distance between the system output $SSH_a$ and the model output $SSH_b$.
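A minimal sketch of this SSH-based objective is given below. The paper minimizes the distance with AICSA; the sketch replaces it with an exhaustive search over a handful of candidates purely for illustration. The function names, the one-parameter `toy_model_ssh` stand-in for the structural model, and all numbers are hypothetical.

```python
import math

def ssh_distance(ssh_a, ssh_b):
    """Euclidean distance between two state sequence histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ssh_a, ssh_b)))

def identify_stiffness(measured_ssh, model_ssh, candidates):
    # Stand-in for AICSA: pick the candidate whose model SSH is closest.
    return min(candidates, key=lambda k: ssh_distance(measured_ssh, model_ssh(k)))

def toy_model_ssh(k):
    # Hypothetical model: maps a stiffness value to a two-state histogram.
    p = k / 10.0
    return [p, 1.0 - p]

measured = toy_model_ssh(4.0)  # pretend this SSH came from the structure
best = identify_stiffness(measured, toy_model_ssh, [2.0, 3.0, 4.0, 5.0])
print(best)
```

In practice the model SSH would come from simulating the candidate structure under the recorded ground acceleration and symbolizing its output.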

Generation of Detectors
If the window length is $T$ and the word length is $r$, the number of states in the SSH is $2^r$ and the minimum unit of change is $1/(T-r+1)$. The total number of possible distributions of the SSH then reduces to a classic combinatorial problem, that of putting $T-r+1$ identical balls into $2^r$ different boxes, and the combinatorial number is $\binom{T-r+2^r}{2^r-1}$. Supposing that the SSH can capture the dynamic features of a structure, the normal state of the structure is represented by $SSH_s$ (self), and damage to that structure can be expressed by other distributions of states in the SSH (non-self). Instead of randomly generating candidate detectors (which may result in creating an impossible distribution of states in the SSH), a novel procedure to create candidate detectors on the basis of self elements is proposed, parameterized by the number of self elements.
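As a side illustration of the counting argument in this section, the balls-in-boxes problem has the standard "stars and bars" solution, which the following sketch evaluates. The function name `count_distributions` is hypothetical, and the reading that a window of length $T$ contributes $T - r + 1$ units to the SSH is an assumption rather than something the text states outright.

```python
from math import comb

def count_distributions(units, states):
    """Ways to place `units` identical balls into `states` distinct boxes."""
    return comb(units + states - 1, states - 1)

# Small case that can be checked by hand: (3,0), (2,1), (1,2), (0,3).
small = count_distributions(3, 2)

# Assumed mapping to the paper's settings: T = 3000, r = 9.
T, r = 3000, 9
large = count_distributions(T - r + 1, 2 ** r)
print(small, len(str(large)))  # the second value is the digit count
```

The digit count shows why randomly sampling the whole space of histograms for candidate detectors is impractical.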

Noise Immunity
Usually the raw acceleration data is chosen as input, and the Euclidean distance of raw acceleration data is used as the damage index or objective function; the problem is that the possible number of representations of raw acceleration data is infinite for any Euclidean distance. In other words, the raw acceleration data is easily affected by noise. STSA is a coarse-graining process that is robust against measurement noise. It is expected that small changes in time series data do not affect the symbolized data. Therefore, it can be assumed that a certain band of states represents a similar dynamic status of the structural system, and instead of having an infinite number of representations of the Euclidean distance, as is the case when raw acceleration data is used as input, the representation is finite.

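Taking an SDOF (single-degree-of-freedom) system as an example, the dynamic equation can be represented as

$$m\ddot{x} + c\dot{x} + kx = F,$$

where $m$, $c$, and $k$ are respectively the mass, damping coefficient, and stiffness of the system, and $F$ is the force linked to the ground acceleration. By dividing both sides by the mass $m$, the equation of motion becomes

$$\ddot{x} + 2\xi\omega\dot{x} + \omega^2 x = \frac{F}{m},$$

where $\omega = \sqrt{k/m}$ is the natural frequency, $\xi = c/c_c$ is the damping ratio, and $c_c = 2\sqrt{km}$ is called the critical damping. Acceleration $\ddot{x}$ is

$$\ddot{x} = \frac{F}{m} - 2\xi\omega\dot{x} - \omega^2 x.$$

At time $t$, $\ddot{x}_t$ can be obtained as

$$\ddot{x}_t = \frac{F_t}{m} - 2\xi\omega\dot{x}_t - \omega^2 x_t.$$

Since the window length is $T$ and the word length is $r$, all the raw acceleration data in the window is $\{\ddot{x}_0, \ddot{x}_1, \ldots, \ddot{x}_{T-1}\}$. The mean value of $\ddot{x}$ is

$$\bar{\ddot{x}} = \frac{1}{T}\sum_{t=0}^{T-1}\ddot{x}_t.$$

This mean value of $\ddot{x}_t$ is for the noise-free case. When the raw acceleration data is polluted by noise, the acceleration data at time $t$ is $\ddot{x}_t + \eta_t$, where $\eta_t$ is the value of the noise at time $t$. The partition line (the mean value of the noisy acceleration) is then

$$\frac{1}{T}\sum_{t=0}^{T-1}\left(\ddot{x}_t + \eta_t\right) = \bar{\ddot{x}} + \frac{1}{T}\sum_{t=0}^{T-1}\eta_t.$$

For zero-mean noise and a sufficiently long window length $T$ and word length $r$, $\frac{1}{T}\sum_{t}\eta_t \approx 0$, which means that noise will not affect the partition line.

A random sample of SSH is defined as $SSH_a$. From Equation (2), any other SSH (i.e., $SSH_b$) will be some distance $\varepsilon$ away from $SSH_a$, where $\alpha$ is an even number.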
From reference [18], given the minimum unit of change and the fact that the dimension of the SSH is $2^r$, there is a possibility that not just one but many different SSHs will lie at the same distance from $SSH_a$ as $SSH_b$. This possibility occurs when Equation (13) is satisfied, where $\alpha$ is the number of different units between $SSH_a$ and $SSH_b$. Each distinct value of $\varepsilon$ can be called a representative band, so the number of representative bands equals the number of distinct values of $\varepsilon$. Also, for a given value of $\alpha$, the number of SSHs falling in the same band is given by a combinatorial number. For example, when $r = 9$ and $\alpha$ takes its minimum value of 2, the combinatorial number is $\binom{2^9}{2} = 130816$, which is a large number, and we should note that this is only the minimum one. The conclusion is that STSA has very good fault tolerance.
In addition, the total number of units in the SSH is fixed by the window length and the word length. In the damage localization and quantification stage, the severity of damage (Equation (6)) is defined in terms of $k_i$ and $k_r$, the identified stiffness and real stiffness of the $i$th story, respectively.
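The partition-line argument in this section can be checked numerically with a short sketch. It assumes zero-mean Gaussian noise and uses a synthetic Gaussian signal as a stand-in for acceleration data; the window length of 3000 matches the simulation settings reported later, while the noise standard deviation of 0.05, the seed, and the helper name `symbolize` are illustrative assumptions.

```python
import random

def symbolize(window):
    """Binary symbols relative to the window mean (the partition line)."""
    mean = sum(window) / len(window)
    return [1 if x > mean else 0 for x in window]

random.seed(0)
T = 3000
clean = [random.gauss(0.0, 1.0) for _ in range(T)]    # stand-in signal
noise = [random.gauss(0.0, 0.05) for _ in range(T)]   # assumed zero-mean noise
noisy = [x + e for x, e in zip(clean, noise)]

# Shift of the partition line caused by the noise.
shift = abs(sum(noisy) / T - sum(clean) / T)
# Fraction of symbols that change once noise is added.
flipped = sum(a != b for a, b in zip(symbolize(clean), symbolize(noisy))) / T
print(shift, flipped)
```

Only samples lying very close to the partition line change symbol, which is the sense in which coarse-graining absorbs small perturbations.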

Description of Model
For simplicity and generality, we used a five-story shear frame structure as a representative case to verify the performance of our methodology, and we modeled it as a five degree-of-freedom lumped mass system (Figure 2). Table 1 lists the structural and modal parameters of the structure.
The dynamic equation is [17]:

$$M\ddot{X} + C\dot{X} + KX = F,$$

where $M$, $C$, and $K$ are the mass, damping, and stiffness matrices, $F$ is the force vector linked to the ground acceleration, and $\ddot{X}$, $\dot{X}$, and $X$ are respectively the relative acceleration, velocity, and displacement responses. The sampling frequency was 100 Hz. In the simulation, the input signal was Gaussian white noise (Figure 3). To test the noise immunity of our method, noise at levels of 5% and 10% was added to the raw acceleration data.

Results of Damage Detection
The training data sets used to generate detectors were acceleration time histories from the top story of the healthy shear structure under ground motion following a pattern of randomly generated Gaussian white noise. In the test phase, the test data sets (self and non-self) were obtained from the top story of the healthy and damaged shear structure under ground motion following another pattern of randomly generated Gaussian white noise. The sampling frequency was 100 Hz, and all time histories were normalized.
As is shown in [18], the word length and window length greatly affect the performance of the methodology. Larger word and window lengths yield better performance. The reason is that a longer word or window can symbolize the raw acceleration data much more accurately than a shorter one. As the word length and window length increase, much more dynamic information about the system is captured, and the representation space of the problem becomes more accurate. In the simulation, the word length was set to 9 and the window length was set to 3000.
Several tests were performed by simulating stiffness reductions occurring in different locations and to different degrees. The simulated cases included stiffness reductions on a single story (2nd story or 4th story) and on two stories (2nd story and 4th story). The degree of stiffness reduction was 10% or 20%. Table 2 lists the most pertinent results, and Figure 4 compares the DR and FAR values of the different damage cases.
The results show that, by employing our methodology, perfect results can be obtained for the noise-free case: DR is 100% and FAR is 0%, which means that all the damage can be detected and no SSH from the healthy structure is misclassified. For the noise-polluted cases as well, a high DR and low FAR can be obtained regardless of the location or degree of stiffness reduction. Even as the noise level increases and the detection rate decreases, the extent of misclassification remains very small. This means that noise is not problematic in our methodology. Indeed, the good results probably reflect the fact that the RSSHe is much more sensitive to changes in the structure itself than to the environment.

Numerical Simulation of Damage Localization and Quantification

Description of Procedure
In this simulation, the system to be identified was the same five-story shear frame structure as stated before; the mass distribution and damping parameters were assumed to be known, and the stiffness of each story was the parameter to be identified. First, the parameters of the healthy structure were identified as a reference. AICSA was performed when the abnormal SSH of the damaged structure was detected.
The abnormal SSH was the output of the damaged structure. The input ground acceleration data corresponding to the abnormal SSH was used as the input of the candidate model, and the output of the candidate model was symbolized using the procedure in Section 2.2. The Euclidean distance between the system and candidate SSHs was used as the objective function to be minimized. After the parameters of the damaged structure were identified, the severity (Equation (6)) showing the location and degree of the damage was calculated.

Identification Results for Healthy and Damaged Structures
The input signal was Gaussian white noise, as in the previous simulation. The window and word lengths were the same as before, and the full output of the structure was used. Table 3 lists the identified stiffness of each story of the healthy and damaged structures. The damage cases are the same as those in the damage detection stage. Severity of damage was calculated for each damage case using the results in Table 3. The results show that AICSA combined with STSA can identify the parameters of a structure accurately regardless of whether the structure is healthy or damaged. Although the error in the severity of each story increases slightly as the noise level of the raw acceleration data increases, the location and severity of the damage can be identified distinctly.

Conclusion
We proposed a hybrid methodology based on immune algorithms (IAs) and symbolic time series analysis (STSA) for structural health monitoring (SHM). In this methodology, RNS is used to detect damage, and AICSA is used to localize and quantify the damage to the structure. Data symbolization by using STSA alleviates the effects of noise in the raw acceleration data. We explained the mathematical basis of STSA and described a simulation experiment on a five-story shear frame structure. The results showed that this hybrid methodology can efficiently and precisely detect, localize and quantify damage to civil engineering structures in the presence of measurement noise.

Figure 1. Procedure of IA and STSA for structural health monitoring.

For each representative band $\varepsilon$, there will be many representations of $SSH_b$; in other words, a small change in the raw acceleration data (say, due to noise) will not affect the distance between $SSH_a$ and $SSH_b$.

Indexes Used in the Performance Evaluation
Two indexes are used to evaluate the classification accuracy: the detection rate (DR) and the false alarm rate (FAR). DR is the ratio of correctly classified positive elements to the total positive elements, and FAR is the ratio of negative elements incorrectly classified as positive to the total negative elements. Four values are needed to calculate DR and FAR: the number of true positives (TP, positive elements identified as positive), true negatives (TN, negative elements identified as negative), false positives (FP, negative elements identified as positive), and false negatives (FN, positive elements identified as negative). DR and FAR are calculated as $DR = TP/(TP + FN)$ and $FAR = FP/(FP + TN)$.
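The two performance indexes can be computed with a few lines of Python. The sketch assumes the usual confusion-matrix forms, DR = TP / (TP + FN) and FAR = FP / (FP + TN), which follow from the TP, TN, FP and FN definitions given in the text; the function name `detection_rates` and the counts in the example are hypothetical, not results from the paper.

```python
def detection_rates(tp, tn, fp, fn):
    """Return (DR, FAR) from confusion-matrix counts."""
    dr = tp / (tp + fn)    # share of positive (damaged) elements detected
    far = fp / (fp + tn)   # share of negative (healthy) elements flagged
    return dr, far

# Hypothetical counts: 100 damaged and 100 healthy test SSHs.
dr, far = detection_rates(tp=95, tn=98, fp=2, fn=5)
print(f"DR = {dr:.2%}, FAR = {far:.2%}")
```

A perfect detector in this sense has DR equal to 1 and FAR equal to 0.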

Figure 4. DR and FAR for damage detection.
is the radius of existing detectors; $o$ is the overlap rate, defined in 3.4), delete it. If not, store it as a new detector. Set as the radius of the new detector.

Figures 5, 6 and 7 plot the severity, calculated from the results in Table 3, for the noise-free, 5% noise and 10% noise cases, respectively.