New ECG Signal Compression Model Based on Set Theory Applied to Images

Cardiovascular diseases are a leading cause of death worldwide. They require practitioners to use efficient diagnostic methods, such as telemedicine, in order to detect anomalies quickly for the daily care and monitoring of patients. The electrocardiogram (ECG) is an examination that can detect abnormal functioning of the heart; it generates a large amount of digital data, which can be stored or transmitted for further analysis. For storage or transmission, one of the challenges is to reduce the space occupied by the ECG signal. This calls for increasingly efficient algorithms that achieve high compression rates while offering good reconstruction quality in a relatively short time. In this paper we propose a new ECG compression scheme based on splitting the signal into subsets and processing it in 2D, together with the discrete wavelet transform (DWT) and SPIHT coding, both of which have proved their worth in signal processing and compression. They are used to decorrelate and code the signal. The results obtained are significant and open up many perspectives.


Introduction
Telemedicine and medical imaging now generate an impressive amount of data, which makes powerful signal and image compression software imperative for the mass backup and storage of information [1]. The transmission of this data over telecommunication channels is more efficient when the size of the data to be transferred is minimal. The problem we are trying to solve is that of improving transmission times by reducing the size of the data to be transmitted, and of optimizing the available storage space. Compression is a way to overcome this problem, as presented by Ntsama E.P and Kabiena in [2] and by Istepanian Robert and A. P. Arthur in [3]. The challenge is to achieve higher compression rates while offering a faithful reconstruction, avoiding the slightest degradation that could lead to a fatal diagnostic error. Several techniques for compressing biomedical signals exist in the literature. In [4], S. Olmos, M. Millan, J. Garcia and P. Laguna proposed ECG data compression based on the Karhunen-Loeve transform. K. Uyar and Y. Z. Ider [5] proposed an approach for developing a compression algorithm suitable for exercise ECG data. In [6], L. V. Batista et al. proposed the compression of ECG signals based on the optimum quantization of discrete cosine transform coefficients and Golomb-Rice coding. In this work we use transform-based methods, which make it possible to avoid redundancy errors and to approach the smallest element, and are therefore better suited to ECG signals [7] [8]. We also propose a new method of compressing the ECG signal that exploits the conversion of the 1D signal to 2D. The image thus formed is processed by the DWT and then coded by the SPIHT algorithm.
The new 2D layout model that we have implemented makes it possible to deal with the specificities and complexities of the ECG signal (P wave, U wave, T wave, QRS complex, etc.; Figure 1 is an illustration), with good prospects for local filtering.

General Information on Electrocardiographic Signals
The electrocardiogram is the most effective examination for the diagnosis of cardiac pathologies [9]. Because this examination of the activity of the cardiac muscle tissue is limited (it lasts no more than a few seconds, is carried out only while the subject is at rest, etc.), it has evolved into a much more effective technique, the Holter test. The Holter test records the electrocardiogram signal for 24 hours or more and therefore contains the maximum amount of information on the patient's cardiac activity. A normal ECG recording represents a normal heart cycle; it is characterized by a frequency spectrum distributed below 40-50 Hz and by waves and time intervals that repeat in a quasi-periodic manner, as presented in Figure 2.
This figure shows that a normal ECG has four maxima. It is this fundamental property that we exploit to develop our new model.

The Theory of Sets or the Interest of Consolidation in Subsets
Cantor is the main creator of set theory, which he introduced in the early 1880s [11]. It was while working on problems of convergence of trigonometric series, in the 1870s, that he came to define a notion of derivation of sets of real numbers. His work has had several consequences and applications, particularly in algebra and signal processing. Applying the consequences of set theory to the ECG signal vector, we deduce that within the set P of elements of this vector there exist subsets that can be organized into windows of dimension M × N according to their value (the intensity of the future pixels). If we consider P as consisting of four windows $F_i$, each window $F_i$ grouping elements close to one of the four specific features of the ECG signal (the P, T and U waves and the QRS complex), then $P = F_1 \cup F_2 \cup F_3 \cup F_4$. The windows therefore consist of close values, which increases the inter-pixel correlation and makes it possible to filter complex parts of the signal (P, T and U waves, QRS complex). This is illustrated in Figure 3.
In this case the noise is considered an integral part of the image and is compressed in the high frequencies; during the decompression operation its effect is therefore lessened, which is justified by the triangle inequality. Indeed, the data of the ECG vector arranged in 2D are coded on 8 bits, that is, as positive pixel values between 0 and 255 (0 and 1 in certain cases). The positive variant of the triangle inequality ($F_1$ and $F_2$ positive), as defined by Cauchy, allows us to write:
$$|F_1 + F_2| \le |F_1| + |F_2|$$
By an algebraic extension of this relation, we obtain:
$$\left|\sum_i F_i\right| \le \sum_i |F_i|$$
Applying the same reasoning to the statistical error committed during reconstruction (decompression) by the method implemented on a set or subset X, we deduce a corresponding inequality between the error on the whole set and the errors on its subsets. This inequality shows that the probability of introducing errors or parasites in a process is reduced if one proceeds piece by piece rather than by a detrimental global processing. Hence the need for compression by windows or subsets, with a quantizing effect on the noise (reconstruction error) [12].

Wavelet Transform
The discrete wavelet transform (DWT) is a multi-resolution, multi-frequency representation [13] [14]. It makes it possible to analyze efficiently signals in which phenomena of very different scales combine. The stages of the transform follow a filtering hierarchy. Since the data are arranged as an image, a two-dimensional DWT is needed; we use a separable two-dimensional DWT (rows + columns). At each level the input image is decomposed with low-pass (h) and high-pass (g) filters into four sub-images: the approximation image CA, the horizontal detail DH, the vertical detail DV and the diagonal detail DD. The reconstruction uses quadrature mirror filters, represented by their impulse responses (h and g) [15] [16]. This principle is illustrated in Figure 4.
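As a rough illustration of the separable row and column decomposition described above, the following Python sketch computes one level of a 2D transform with the Haar wavelet. Haar is used here only because its filters fit in a few lines; it is not the biorthogonal wavelet used in this work. The function names (haar_1d, haar_2d) are illustrative assumptions, and the naming of the DH and DV sub-images follows one common convention that may differ between toolboxes.

```python
import math


def haar_1d(values):
    """One level of the orthonormal Haar DWT on an even-length sequence.

    Returns the low-pass (approximation) and high-pass (detail) halves.
    """
    s = math.sqrt(2.0)
    approx = [(values[k] + values[k + 1]) / s for k in range(0, len(values), 2)]
    detail = [(values[k] - values[k + 1]) / s for k in range(0, len(values), 2)]
    return approx, detail


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def filter_columns(half):
    """Apply haar_1d to every column and return (low, high) sub-images."""
    lows, highs = [], []
    for col in transpose(half):
        a, d = haar_1d(col)
        lows.append(a)
        highs.append(d)
    return transpose(lows), transpose(highs)


def haar_2d(image):
    """One level of a separable 2D Haar DWT: rows first, then columns.

    Assumes an image with an even number of rows and columns.
    Returns the four sub-images CA, DH, DV, DD.
    """
    low_rows, high_rows = [], []
    for row in image:
        a, d = haar_1d(row)
        low_rows.append(a)
        high_rows.append(d)
    ca, dh = filter_columns(low_rows)   # low-pass rows: approximation, horizontal detail
    dv, dd = filter_columns(high_rows)  # high-pass rows: vertical detail, diagonal detail
    return ca, dh, dv, dd
```

Applying haar_2d again to CA would give the next decomposition level; a flat image yields detail sub-images that are entirely zero, which is the decorrelation effect the coder then exploits.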

SPIHT Coding
SPIHT (set partitioning in hierarchical trees) coding, presented in Figure 5, is an efficient coding method that is currently widely used in the specialised literature. The principle is to use a comparison threshold to decide whether a coefficient is an approximation or a detail. A hierarchy of tree branches is set up from a root coefficient, or parent, denoted X(i) (in the following, to simplify, we write i for the pair (i, j)). The set O(i) contains the first two sons of X(i), the set D(i) contains all descendants of X(i), including O(i), and the set L(i) is defined by $L(i) = D(i) - O(i)$ [18]. A coefficient resulting from the decorrelation is thus considered an approximation or a detail depending on whether it is lower or higher than the comparison threshold, whose expression involves k, the number of decorrelated samples. The coding is then carried out progressively.
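The set definitions above can be sketched in Python for a one-dimensional coefficient tree. The sketch assumes a heap-style binary tree in which node i has sons 2i+1 and 2i+2, takes L(i) as D(i) without O(i) (the usual SPIHT definition), and assumes an initial threshold equal to the largest power of two not exceeding the largest coefficient magnitude. That threshold rule is the usual SPIHT choice and not necessarily the expression used in this paper. All function names are hypothetical.

```python
import math


def offspring(i, n):
    """O(i): the (at most two) sons of node i among indices 0..n-1."""
    return [c for c in (2 * i + 1, 2 * i + 2) if c < n]


def descendants(i, n):
    """D(i): every descendant of node i, sons included."""
    found, frontier = [], offspring(i, n)
    while frontier:
        found.extend(frontier)
        frontier = [c for p in frontier for c in offspring(p, n)]
    return found


def grand_descendants(i, n):
    """L(i) = D(i) - O(i): descendants that are not direct sons."""
    sons = set(offspring(i, n))
    return [d for d in descendants(i, n) if d not in sons]


def initial_threshold(coeffs):
    """Assumed rule: largest power of two <= max |c| (requires a nonzero peak)."""
    peak = max(abs(c) for c in coeffs)
    return 2 ** int(math.floor(math.log2(peak)))


def is_significant(indices, coeffs, threshold):
    """A set is significant if any of its coefficients reaches the threshold."""
    return any(abs(coeffs[k]) >= threshold for k in indices)
```

In a progressive coder the threshold is halved after each pass, so sets that were insignificant at one pass may be split and coded at the next.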

Compression Scheme
To implement the new approach, we chose to split the ECG signal into four blocks or subsets, corresponding to the four critical parts of an ECG signal (P, T, U, QRS). Thus, the block or subset $F_1$ will contain as many P waves as the examination sequence itself.
Considering I the initial ECG vector, I can be written as an empirical, non-matrix sum of four blocks or subsets. The algorithm first evaluates the maxima of the basic sequence, denoted $M_1$, $M_2$, $M_3$ and $M_4$, corresponding to the QRS, T, P and U waves. The data of the entire sequence are then classified into four blocks according to their proximity to the maxima $M_i$. An operation (Equation 13) is then carried out to harmonize the dimensions of the blocks, where i and j are the rows and columns respectively and ∑∑ denotes an empirical sum in the sense of subsets, not an arithmetic one. Figure 6 presents the grouping into four blocks based on QRS, T, P and U. Figure 7 presents the new proposed algorithm and Figure 8 shows the 2D compression scheme using the subset model.
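The classification of samples by proximity to the four maxima can be sketched as follows. This is a minimal Python illustration, not the implementation used in this work: the function names are hypothetical, the per-sample labels kept for reordering are an assumption, and the padding rule (filling each block with its reference maximum so that all blocks share the same dimensions) is an assumption made for the sketch and is not the harmonization operation of Equation 13.

```python
def group_into_blocks(signal, maxima):
    """Assign each sample to the block whose reference maximum is closest."""
    blocks = [[] for _ in maxima]
    labels = []  # block index of every sample, kept so the order can be restored
    for sample in signal:
        nearest = min(range(len(maxima)), key=lambda k: abs(sample - maxima[k]))
        blocks[nearest].append(sample)
        labels.append(nearest)
    return blocks, labels


def harmonize(blocks, maxima, width):
    """Pad every block to a common length and reshape it into rows of `width`.

    Assumed rule: missing entries are filled with the block's own maximum.
    """
    longest = max(len(b) for b in blocks)
    rows = -(-longest // width)  # ceiling division
    target = rows * width
    images = []
    for block, m in zip(blocks, maxima):
        padded = block + [m] * (target - len(block))
        images.append([padded[r * width:(r + 1) * width] for r in range(rows)])
    return images


def regroup(blocks, labels):
    """Rebuild the original sample order from the blocks and the labels."""
    cursors = [0] * len(blocks)
    restored = []
    for k in labels:
        restored.append(blocks[k][cursors[k]])
        cursors[k] += 1
    return restored
```

Each resulting block holds values of similar amplitude, which is what raises the inter-pixel correlation; the labels are side information that a real codec would also have to store or derive.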

Evaluation Method
Evaluating the quality of a signal or an image means associating it with one or more qualifiers that locate its relative position in a frame of reference defined by our senses and by the envisaged application. The quality of the signal is not only linked to visual appearance but also depends on the application and use. To assess the results of the proposed scheme, we combined a subjective assessment, in which human observers (end users) assess or compare the signal quality according to a well-defined protocol, with an objective evaluation, which refers to methods based on the analysis and quantitative measurement of the level of degradation using metrics directly related to the physical signal [19]. The compression ratio (CR) is used to control the reduction of the data relative to the size of the original signal. It depends on two parameters: the wavelet level (generally from 1 to 4) and, above all, the bpp (bits per pixel), which takes a value between 0 and 1. Varying the CR makes it possible to appreciate the performance of the new method by observing the results of the objective and subjective evaluations. The compression ratio is given by:
$$\mathrm{CR} = \left(1 - \frac{\text{number of bits of the compressed signal}}{\text{number of bits of the original signal}}\right) \times 100$$
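As a small worked example, the sketch below computes the compression ratio as the percentage reduction in bit count, CR = (1 - compressed bits / original bits) × 100, and relates it to the bpp setting. It assumes that each original sample is stored on 8 bits and that the coder spends exactly bpp bits per pixel on average; the function names are illustrative.

```python
def compression_ratio(original_bits, compressed_bits):
    """Percentage reduction in size: (1 - compressed / original) * 100."""
    if original_bits <= 0:
        raise ValueError("original_bits must be positive")
    return (1.0 - compressed_bits / original_bits) * 100.0


def ratio_from_bpp(n_pixels, bpp, bits_per_sample=8):
    """CR obtained when the coder spends `bpp` bits per pixel on average.

    Assumption: the original uses `bits_per_sample` bits for every pixel.
    """
    return compression_ratio(n_pixels * bits_per_sample, n_pixels * bpp)
```

Under these assumptions a 4000-sample record coded at 0.8 bpp shrinks from 32000 to 3200 bits, that is, a CR of about 90%.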

Subjective Evaluation
Considered the most reliable way to measure real quality, subjective evaluation directly involves human observers. They are called upon to judge the quality of the images or signals presented to them according to an evaluation grid with several levels of appreciation. At the end of these tests, a subjective score called the MOS (Mean Opinion Score) is obtained. We chose to implement the single stimulus protocol in this work, as presented in Figure 9. The single stimulus method, called "Single Stimulus Continuous Quality Scale (SSCQS)", is used to judge the quality of one stimulus at a time. The images are presented one by one, with a lag time between two presentations that allows the observer to note the quality of the image or signal. Scales with 5, 6, 7 or 100 levels of appreciation can be used. Table 1 presents an example of a scale. This method is very widely used to validate a quantitative or objective metric [20], and it is the one used in this article.
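The MOS is simply the arithmetic mean of the scores given by the observers. A minimal Python sketch, assuming a 5-level scale from 1 (worst) to 5 (best), which is only one of the scales mentioned above; the function name is hypothetical:

```python
def mean_opinion_score(scores, low=1, high=5):
    """Arithmetic mean of observer scores on an assumed low..high scale."""
    if not scores:
        raise ValueError("at least one score is required")
    for s in scores:
        if not low <= s <= high:
            raise ValueError("score outside the rating scale")
    return sum(scores) / len(scores)
```

For example, four observers giving 5, 4, 4 and 3 yield a MOS of 4.0.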

Objective Evaluation
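Compression fidelity was assessed by the percent root-mean-square difference (PRD), which quantifies the reconstruction distortion: the lower the PRD, the better the quality. Its definition and use come from the field of audio/video signals. It is defined as follows:
$$\mathrm{PRD} = 100 \times \sqrt{\frac{\sum_{n=1}^{N} \left(x_n - \hat{x}_n\right)^2}{\sum_{n=1}^{N} x_n^2}}$$
where $x_n$ are the samples of the original signal, $\hat{x}_n$ those of the reconstructed signal and N the number of samples.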
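A minimal Python sketch of this metric, assuming the commonly used form of the PRD (root of the squared-error energy divided by the energy of the original signal, times 100, without mean removal); the function name is illustrative:

```python
import math


def prd(original, reconstructed):
    """Percent root-mean-square difference between two equal-length signals."""
    if len(original) != len(reconstructed):
        raise ValueError("signals must have the same length")
    error_energy = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    signal_energy = sum(x * x for x in original)
    if signal_energy == 0:
        raise ValueError("original signal has zero energy")
    return 100.0 * math.sqrt(error_energy / signal_energy)
```

A perfect reconstruction gives a PRD of 0, and the value grows with the energy of the reconstruction error.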

Results and Discussion
The proposed approach was evaluated on real ECG data from the MIT-BIH arrhythmia database [21]. The 4/4 biorthogonal wavelet was chosen for the DWT. The results were obtained with the records X100, X101, X219 and X222. The number of samples is 4000. In the following, we study the variation of the PRD as a function of the compression rate in order to obtain a synthesis. The reconstructed signals are then submitted to a doctor and compared with results from the recent literature. Figure 10 shows the ECG signal X219 reconstructed with CR = 90% and PRD = 0.2. This is the ECG recording of a patient with a heart condition. Visually, the quality of reconstruction is good: at this rate, no degradation is observed in the reconstructed signal X219.
Also for this signal, the degradation remains imperceptible to the human eye. The PRD is 0.13%, which indicates good reconstruction quality, and the amplitude of the error is around 0.1 mV. We then varied the compression rate from 92% (Figure 12) to 93% (Figure 13) for the signal X100, an ECG without anomalies but with P and T waves less pronounced than those of X101. At CR = 93%, the degradations become significant and perceptible to the human visual system (HVS), as shown in Figure 13.
The magnitude of the error is now around 180 μV, significantly higher than at CR = 92%, and the PRD has also increased. In the reconstructed signal, most of the P and T waves are difficult to distinguish, and the signal is very noisy.
To identify the optimal compression rate, that is, the highest rate at which the reconstruction still has good visual quality, we tested several signals around CR = 93% ± ∆ (∆ ≤ 1). For signal X222, at CR = 94% the PRD is equal to 2.2. Figure 14 makes it clear that the signal is more degraded when the compression rate is increased. However, the extent of the degradation varies from signal to signal. For example, the signal X222 tolerates high compression ratios better than the other signals used. Figure 14 and Figure 15 show the evolution of the PRD as a function of the compression ratio for the records considered. The lowest curve is that of X219, followed by X101 and X100 respectively, for compression rates below 96%. The closer a curve is to the abscissa axis, the better the signal tolerates our algorithm. According to the literature, PRD values lower than one reflect excellent reconstruction quality; the plot of the PRD as a function of the CR shows that below CR = 93%, and for all the signals considered, the PRD is around 1.
However, it remains essential to complement this mathematical analysis with the subjective criterion of the human eye. These results were therefore submitted to the appreciation of a doctor from the Central Hospital of the University of Yaounde (CHUY), according to the single stimulus protocol: different reconstructed signals were presented to the doctor without details, and the observations made were recorded in Table 2. The goal is to see at what rate the doctor becomes unable to interpret the signals. The results of these tests allow us to compare the same signals with the results obtained by other authors in the literature using different methods.
Here the PRD and CR criteria are compared in Table 3.

Conclusion
By exploiting the properties of set theory, we have implemented a robust algorithm for ECG compression whose particularity is to group the signal data according to its main peaks before compression. This grouping into four subsets helps to reduce the errors and bias introduced during reconstruction, while increasing the correlation between signal data in order to achieve high compression rates. A comparison between the results obtained and results from other researchers has also been presented. According to the PRD and CR criteria, the proposed solution performs better than those of the other authors considered. Looking ahead, the model can also give the clinician filtering options for the specific and critical parts of the ECG, namely the P, T and U waves and the QRS complex, which contain the most relevant information.