This paper presents a comparative analysis of the effectiveness of prediction using the PRT and RIA approaches, which use, respectively, only the linear component of long-term memory and both the linear and nonlinear components. The noise immunity of prediction under both approaches is considered in the presence of additive noise with a normal or uniform distribution.

In the analysis of random signals, a typical situation is that the information is distorted by noise of various natures: the random signals generated by the originating system are available to the observer only with distortions, due to errors of the measuring equipment used to record the signals, the presence of various kinds of interference when transmitting information over a radio channel, rounding during data digitization, and other causes. In all these cases, it is necessary to extract useful information from the received oscillation, which represents the distorted signal.

In describing random processes generated by physiological systems in different parts of autonomously regulated rhythms (e.g., heart rate [

In this article, using the example of intervals between heartbeats, we analyze the effectiveness of methods for predicting outliers of dynamic series with fractal properties, using information about the linear and nonlinear components of the long-term dependence in the presence of additive white-type noise, and we examine the stability of these methods at various interference levels [

To solve the problem of predicting outliers of a random signal with fractal properties, two main approaches are possible [

The class of processes described by the first approach comprises monofractal processes with linear long-term correlations (LTC), for which the autocorrelation function (ACF) C(s) obeys the power law C(s) ~ s^{−γ} (0 < γ < 1), with a single exponent h(2) = 1 − γ/2 describing the fluctuations of the process in a time window of length s [

The second approach is based on the analysis of the nonlinear component of the long-term dependence. Within this approach, the mathematical apparatus of interval statistics is used to predict random signals with fractal properties: the probability W_Q(t) that the random signal y_i exceeds the threshold Q at the next time step is estimated from the time t that has elapsed since the last such exceedance, according to the expression

W_Q(t) = [C_Q(t + 1) − C_Q(t)] / [1 − C_Q(t)], (1)

where C_Q(t) is the probability distribution function of the intervals between outliers of the random signal above the given threshold [
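The hazard estimate in Equation (1) can be illustrated with a minimal Python sketch, assuming C_Q is taken as the empirical distribution function of observed return intervals (the function name and toy data are ours):

```python
import numpy as np

def hazard_W(return_intervals, t):
    """Estimate W_Q(t) = [C_Q(t+1) - C_Q(t)] / [1 - C_Q(t)], where C_Q is
    the empirical distribution function of the observed return intervals."""
    r = np.asarray(return_intervals, dtype=float)
    C = lambda u: np.mean(r <= u)          # empirical CDF C_Q
    denom = 1.0 - C(t)
    if denom <= 0:                          # no intervals longer than t observed
        return 1.0
    return (C(t + 1) - C(t)) / denom

# toy usage: intervals between successive threshold exceedances
intervals = [3, 5, 5, 8, 12, 20]
w = hazard_W(intervals, t=4)
```

The estimate grows as the elapsed time t approaches the typical return interval, which is the mechanism RIA exploits for real-time risk assessment.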

The class of processes described by the second approach comprises multifractal processes. In contrast to monofractal records (of the process values y_i), for which a single Hurst exponent H describes the fluctuations on any scale, for multifractal records each moment order q used to calculate the generalized fluctuation function F_q(s) in the method of multifractal fluctuation analysis (MFFA) [
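A simplified q-th order fluctuation function can be sketched as follows (illustrative Python; the polynomial detrending step of full multifractal fluctuation analysis is deliberately omitted here, so this is only a schematic of how F_q(s) aggregates window fluctuations):

```python
import numpy as np

def fluctuation_function(x, q, s):
    """Simplified q-th order fluctuation function F_q(s): the series is split
    into non-overlapping windows of length s, per-window mean-square
    fluctuations are computed, and combined via the q-th moment
    (detrending is omitted in this sketch)."""
    x = np.asarray(x, dtype=float)
    n = len(x) // s
    segs = x[: n * s].reshape(n, s)
    f2 = np.mean((segs - segs.mean(axis=1, keepdims=True)) ** 2, axis=1)
    if q == 0:  # limiting case handled via logarithmic averaging
        return np.exp(0.5 * np.mean(np.log(f2)))
    return np.mean(f2 ** (q / 2.0)) ** (1.0 / q)

rng = np.random.default_rng(7)
x = rng.normal(size=4096)
fq = fluctuation_function(x, q=2, s=64)
```

For monofractal records F_q(s) ~ s^{h} with the same exponent for every q, whereas multifractal records yield a q-dependent family of exponents h(q).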

This approach based on interval statistics, called the return interval approach (RIA), is most suitable for real-time forecasting of extreme events using nonlinear memory [

However, it should be borne in mind that the forecasting problem in the above works was considered under the assumption of undistorted random signals, in particular without taking into account the effects of noise. Meanwhile, when analyzing recorded random signals generated by physiological systems (in particular, the cardiovascular system (CVS)), the impact of noise factors during acquisition and of measurement error is very significant, and the assumption of the absence of noise can reduce the reliability of the result. Heart rate, like many other physiological rhythms, is a random signal with pronounced linear and nonlinear components of long-term dependence. When analyzing the prediction efficiency for an artificially synthesized random signal (generated using a multiplicative random cascade (MRC) mathematical model with parameters simulating the dynamics of the normal heart rhythm), the results indicate comparable prediction efficiency of the PRT and RIA approaches, which is consistent with the data [

On the other hand, when analyzing real heart rate records, the forecast based on RIA is supplemented by an additional analysis involving the sensitivity operator Sens, which shows the frequency of correct predictions of Q-events (i.e., exceedances of the threshold Q), and the specificity operator Spec, which indicates the frequency of correct predictions of non-Q-events (i.e., non-exceedances of the threshold Q).

Such an analysis, called “receiver operating characteristic” (ROC) analysis in [

Denote by N_{11} and N_{01} the numbers of correct and erroneous predictions of Q-events, and by N_{00} and N_{10} the numbers of correct and erroneous predictions of non-Q-events. Then the quantities

D = N_{11}/(N_{11} + N_{01}) and α = N_{10}/(N_{00} + N_{10}) (2)

will be equal to the proportion of correct predictions of Q-events and the proportion of erroneous predictions of non-Q-events, respectively. The plot of D versus α is called the ROC curve [
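The quantities in Equation (2) can be computed directly from binary prediction and outcome series; a minimal sketch (function name and toy arrays are ours):

```python
import numpy as np

def roc_point(pred_event, true_event):
    """Compute D (hit rate) and alpha (false-alarm rate) per Equation (2)
    from boolean arrays of predicted and actually observed Q-events."""
    p = np.asarray(pred_event, dtype=bool)
    a = np.asarray(true_event, dtype=bool)
    N11 = np.sum(p & a)      # correct predictions of Q-events
    N01 = np.sum(~p & a)     # missed Q-events
    N10 = np.sum(p & ~a)     # false alarms on non-Q-events
    N00 = np.sum(~p & ~a)    # correct predictions of non-Q-events
    D = N11 / (N11 + N01)
    alpha = N10 / (N00 + N10)
    return D, alpha

D, alpha = roc_point([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
```

Sweeping the decision threshold that produces `pred_event` and plotting D against α traces out the ROC curve used throughout the comparison below.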

The most significant factor that could account for the discrepancy in the results is the measurement error of the intervals between heartbeats in electrocardiogram analysis, since ECG recording during outpatient monitoring is usually carried out in a complex noise environment.

In this regard, we will assume that the random signal under investigation, obtained either from the synthetic MRC model or from real observations of the heart rhythm, is distorted (noisy) by additive white noise. Noise affecting the observed signal records may arise for two main reasons. The first is the possibly random nature of the measured process itself; examples are the heartbeat intervals in atrial fibrillation syndrome [

It is usually assumed that the noise distribution is Gaussian (normal), since, according to the central limit theorem, it describes measurement errors that are a superposition of many factors (external noise, the accuracy of the measuring equipment, etc.). In addition, the discretization of data generates a uniform distribution of the resulting observations. In fact, random processes generated by complex systems are characterized by distributions forming a much wider class than noise distributions [

In the following, we will assume that the noise is white, i.e., that it has a flat power spectrum and is uncorrelated. This is not always true: preliminary filtering by the instrumentation results in “colored” noise, characterized by a finite-width power spectrum. However, the correlation time of measurement noise is usually shorter than even the length of the precursor pattern used in the PRT approach, and much shorter than the time between extreme events used in the RIA analysis, so the white-noise assumption can be considered justified.

For uncorrelated source data {X_i} (i.e., with C_x(s) = 0 for s > 1, where C_x(s) is the autocorrelation function of the data), the return intervals will also be uncorrelated, with C_Q(s) = 0 for s > 1. For data with long-term correlations, the return intervals have an autocorrelation function (ACF) obeying the power law C_Q(s) ~ s^{−β} with β = γ, where γ is the correlation exponent in C_x(s) ~ s^{−γ}. However, for return intervals with a large period R_Q, deviations from a power law with a constant exponent β are observed [
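The uncorrelated baseline case is easy to verify numerically; a sketch extracting return intervals above a threshold Q and estimating their sample ACF (function names are ours):

```python
import numpy as np

def return_intervals(x, Q):
    """Intervals between successive exceedances of the threshold Q."""
    idx = np.flatnonzero(np.asarray(x) > Q)
    return np.diff(idx)

def acf(r, max_lag):
    """Sample autocorrelation C(s) of a series r for lags s = 1..max_lag."""
    r = np.asarray(r, dtype=float)
    r = r - r.mean()
    var = np.sum(r * r)
    return np.array([np.sum(r[:-s] * r[s:]) / var
                     for s in range(1, max_lag + 1)])

# for i.i.d. data the return-interval ACF should be near zero at all lags
rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
r = return_intervals(x, Q=2.0)
c = acf(r, max_lag=5)
```

Repeating the experiment with long-term-correlated input instead of i.i.d. noise would reveal the power-law decay C_Q(s) ~ s^{−β} discussed above.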

For purely multifractal source data without linear correlations (C_x(s) = 0), the return intervals have an ACF C_Q(s) ~ s^{−β(Q)} [

An interesting fact is that for multifractal data, even in the absence of linear correlations, the return intervals will have linear long-term correlations. Both the linear and the nonlinear correlations present in the source data contribute to the linear correlations of the return intervals [

Evidently, the nonlinear correlations of multifractal initial data induce long-term correlations of the return intervals [

Let us compare the effectiveness of RIA and PRT analysis in predicting outliers from information on the linear component of the long-term dependence, based on synthetic data obtained from the MRC model with additive Gaussian interference and averaged over 20 different realizations of the Gaussian interference. We take the length of the series to be L = 2^{21}, with the values h(2) = 0.6, 0.8 and 0.98 (γ = 0.8, 0.4 and 0.04, respectively) and with precursor pattern lengths in PRT of k = 2, 3 and 4; the number of levels l in the MRC is chosen so that the total number of patterns l^k equals 10^4. Figures 1(a)-(c) present the ROC curves for α ≤ 0.35 at R_Q = 70, and Figures 1(d)-(f) at R_Q = 500.
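The synthetic data generation can be sketched as follows. This is a generic binomial multiplicative cascade with lognormal weights plus additive Gaussian noise, not the paper's exact MRC parameterization; the weight distribution, σ value, and noise level are illustrative assumptions:

```python
import numpy as np

def multiplicative_cascade(n_levels, rng):
    """Simplified binomial multiplicative cascade of length 2**n_levels:
    each value is repeatedly split into two halves scaled by independent
    random weights, producing a multifractal series. Parameters here are
    illustrative, not the paper's calibrated heart-rhythm model."""
    x = np.ones(1)
    for _ in range(n_levels):
        w = rng.lognormal(mean=0.0, sigma=0.5, size=2 * len(x))
        x = np.repeat(x, 2) * w   # split each value in two and rescale
    return x

rng = np.random.default_rng(42)
series = multiplicative_cascade(12, rng)            # length 2**12 = 4096
# additive Gaussian interference at 10% of the signal's standard deviation
noisy = series + rng.normal(scale=0.1 * series.std(), size=series.size)
```

Averaging prediction scores over many independent noise realizations, as done in the text (20 realizations), then smooths out the dependence on any particular interference sample.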

To increase the value of h(2) and, consequently, decrease the value of γ = 2(1 − h(2)) in the ECG record, it is sufficient to apply a “thinning” procedure to the data by dropping repeated records.
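The thinning procedure is not fully specified in the text; under the assumption that it means dropping consecutive repeated records, a minimal sketch is:

```python
import numpy as np

def thin_repeats(x):
    """Drop consecutive repeated records, keeping the first of each run
    (one plausible reading of the 'thinning' procedure in the text)."""
    x = np.asarray(x)
    keep = np.concatenate(([True], x[1:] != x[:-1]))  # True where value changes
    return x[keep]

thinned = thin_repeats([1, 1, 2, 2, 2, 3, 1])
```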

As one would expect, the prediction efficiency both in PRT and RIA analysis improves with an increase in the Hurst index due to the enhanced persistence property.

From the ROC curves, higher values of the indicator D for correct predictions of Q-events are obtained for PRT than for RIA analysis, demonstrating that, when information about the linear component of the long-term dependence is used and white noise is absent, the PRT technique has an advantage over RIA analysis in predicting extreme events.

We now consider a monofractal model with a purely linear long-term dependence and a normal distribution of signal and noise. We focus on random signals characterized by the exponents h(2) = 0.6 (weak long-term dependence), 0.8 (an intermediate value), and 0.98 (pronounced long-term dependence).

The value of S/N is usually taken as the ratio of the signal amplitude S to the standard deviation of the noise, σ_N. Since we are interested in events exceeding Q, we can take S/N = Q/σ_N.
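Under this convention, setting a target S/N fixes the noise scale as σ_N = Q / (S/N); a minimal sketch of adding noise at a prescribed S/N (function name is ours):

```python
import numpy as np

def add_noise_at_snr(x, Q, snr, rng):
    """Add Gaussian white noise whose standard deviation sigma_N is chosen
    so that S/N = Q / sigma_N, per the convention in the text."""
    sigma_n = Q / snr
    return x + rng.normal(scale=sigma_n, size=len(x))

rng = np.random.default_rng(1)
x = rng.normal(size=50_000)
y = add_noise_at_snr(x, Q=2.5, snr=5.0, rng=rng)   # sigma_N = 0.5
```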

The results show that, for records with linear long-term correlations, predictions obtained with PRT using short-term memory are more effective than RIA predictions of extreme events using long-term memory, owing to the high linear persistence of such records. It should be noted that similar conclusions are obtained in the presence of uniformly distributed (“digitized”) additive noise after discretization: in this case too, the PRT forecasts are superior to the RIA forecasts.

We now consider the predictability of extreme events in multifractal records. Figures 3(a)-(c) show ROC curves characterizing the efficiency of forecasts obtained for MRC records of length L = 2^{21} with h(2) = 0.5, 0.8 and 0.98 at R_Q = 70 using RIA and PRT analysis (with the same parameters as for the data with linear long-term correlations (LTC)). Similar results for R_Q = 500 are presented in Figures 3(d)-(f).

For MRC records, predictability reaches higher values.


Let us now consider the predictability of multifractal records in the presence of additive noise, focusing again on a Gaussian noise distribution and a precursor pattern length of k = 2 in the PRT.


As S/N approaches unity, the prediction results of both methods deteriorate. It should be noted that, near this critical S/N value, the RIA forecasts pass through an intermediate phase (“phase transition”) between the phase of high correct prediction, where linear and nonlinear persistence dominate, and the phase of low correct prediction, where noise dominates. The shape of this transition becomes more pronounced as nonlinear memory increases, from h(2) = 0.98, when linear memory dominates, to h(2) = 0.5, when only nonlinear memory exists.


Next, we analyze the situation when the data are characterized by a broader class of distributions. We impose on the MRC records Gaussian interference in ascending order of its values.

In this case, when Gaussian noise is added, the RIA and PRT forecasts show a smooth transition between the two phases, and neither forecast is superior. When noise with a uniform distribution is added, the shift of the noise-resistance characteristics (NRC) of the RIA forecasts takes an explicit form, while for the PRT forecasts these characteristics remain unchanged. It follows that when the distribution of the data is much wider than the distribution of the additive noise (which is most typical for complex systems), the superiority of the RIA forecasts becomes significant.

The main difference between the two prediction methods is that RIA uses exclusively information about previous exceedances of Q, which are relatively less affected by noise than the precursors used in PRT, which lie significantly below Q for large R_Q, except for records with very strong persistence. Consequently, as the level of additive noise increases, the shift from a phase of good predictability to a phase of weak predictability indicates that the noise generates a large number of “false extremes”. Detecting such false extremes, the RIA forecast assigns high risk levels to the following time units, leading to an increase in the false-alarm indicator α for the same value of the test threshold Q_p. Keeping α constant and choosing a high Q_p, we arrive at a modified value of the correct-forecast indicator D. However, this situation is atypical for practical forecasting, since we are usually interested in high thresholds Q, for which the signal-to-noise ratio lies well above the phase-transition point at which RIA gains its superiority.

Since the information used in PRT and RIA is complementary, it is interesting to test possible combinations of the two methods for predicting MRC records. We test the following combinations of the risk probability W_C: (i) the product of the risk probabilities obtained by each of the methods,

W_C = W_PRT ⋅ W_RIA;

(ii) weighted sum of risk probabilities

W_C = a ⋅ W_PRT + (1 − a) ⋅ W_RIA, 0 < a < 1;

(iii) switching between the two risk estimates at a certain point in time t_S, i.e.

W_C = { W_PRT, t ≤ t_S; W_RIA, t > t_S } and W_C = { W_RIA, t ≤ t_S; W_PRT, t > t_S }, 1 ≤ t_S ≤ 20.
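The three combination schemes can be sketched directly on arrays of risk probabilities (function names and toy values are ours):

```python
import numpy as np

def combine_product(w_prt, w_ria):
    """(i) product of the two risk probabilities."""
    return np.asarray(w_prt) * np.asarray(w_ria)

def combine_weighted(w_prt, w_ria, a):
    """(ii) weighted sum with weight 0 < a < 1."""
    return a * np.asarray(w_prt) + (1 - a) * np.asarray(w_ria)

def combine_switch(w_prt, w_ria, t_s):
    """(iii) one risk estimate up to time t_s, the other afterwards
    (here: PRT first, then RIA; the mirrored variant swaps the roles)."""
    w = np.asarray(w_ria, dtype=float).copy()
    w[:t_s] = np.asarray(w_prt, dtype=float)[:t_s]
    return w

w_prt = np.array([0.2, 0.4, 0.6])
w_ria = np.array([0.5, 0.5, 0.5])
wc = combine_weighted(w_prt, w_ria, a=0.5)
```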

In all cases, in the absence of additive noise, no significant improvement in the forecast was found in comparison with each method taken separately. In the presence of additive noise, in all three cases a smoother phase transition occurs than in RIA alone. Thus, the combination of risk estimates does not always outperform the individual estimates at intermediate noise levels.

Thus, the following conclusion can be drawn: the RIA forecast is better than the PRT forecast in the following cases: 1) the records contain a purely nonlinear memory component; 2) the records contain an additive random noise component with a distribution from a narrower class than the distribution of the original data. In all other cases, the RIA forecast is either comparable to or worse than the PRT forecast. Combining the two approaches does not improve the forecast. An important advantage of RIA analysis is that it requires only information about the times of occurrence of previous events and can easily be implemented for time-series prediction when it is necessary to detect special events (with a large outlier) from one or several observations and to obtain predicted values for each initial moment of time.

The authors declare no conflicts of interest regarding the publication of this paper.

Abdullayev, N.T., Dyshin, O.A., Ibrahimova, I.D. and Ahmadova, Kh.R. (2019) Analysis of the Noise Immunity of Emission Prediction in the Dynamics of the Heartbeat Using Information about Short-Term and Long-Term Dependencies. Open Access Library Journal, 6: e5888. https://doi.org/10.4236/oalib.1105888