Electrocardiogram Feature Extraction and Pattern Recognition Using a Novel Windowing Algorithm

This paper presents a novel windowing algorithm for electrocardiogram (ECG) feature extraction and pattern recognition. It offers a simple and efficient way of detecting the ECG features, namely the P, Q, R, S and T waves. A windowing method is used to select these waves, with the windows based on varying R-R intervals. The algorithm has been tested on ECG simulator data and on different records of the MIT-BIH arrhythmia database, producing satisfactory results. ECG timing intervals are also required for monitoring the cardiac condition of patients. Hence, after feature detection, ECG timing intervals such as the PR interval, the QRS duration, the QT interval, the corrected QT interval and the ventricular rate (Vent Rate) are calculated using the proposed formulae.


Introduction
An electrocardiogram is a graph depicting the electrical activity generated by the depolarization and repolarization of the atria and ventricles. An ECG wave is a periodic wave with one cycle consisting of a P wave, the QRS complex and a T wave, as shown in Figure 1. ECG signal analysis is vital for making the data useful in the diagnosis of heart diseases. Thus the development of efficient ECG feature extraction algorithms is of great value and importance.
Most of the clinically useful information in an ECG signal is present in the intervals and amplitudes defined by its features. An ECG feature extraction algorithm is also helpful in the detection of cardiac problems known as arrhythmias, including tachycardia, bradycardia, heart rate variation etc. Beat detection is used to determine the heart rate and identify arrhythmias, while further processing is performed to detect abnormal beats. A number of techniques have already been proposed for the detection of ECG features. This paper discusses a novel algorithm based on the windowing technique, which is used for high-precision ECG feature extraction and pattern recognition.
This paper is organized as follows. Section 2 explains the preprocessing required before ECG signal analysis. It also explains the different steps involved in implementing the windowing algorithm. Section 3 deals with the mathematical formulae involved in calculating ECG wave intervals and other parameters. Section 4 contains figures illustrating the results of the windowing algorithm. Section 5 concludes the paper.

ECG Preprocessing
An ECG signal acquired from the body contains interference from a variety of sources. Power Line Interference (PLI) and baseline wandering are two major sources of noise in an ECG signal. Therefore preprocessing is required to obtain a signal that is useful for analysis. Power line interference is 50-60 Hz noise superimposed on the ECG signal, as illustrated in Figure 2, and is caused by the AC mains supply. It can be removed by the algorithm proposed in [1], which is based on adaptive notch filtering.
Baseline wandering is low-frequency noise caused by the movement of the patient during signal acquisition, as shown in Figure 3. Other factors contributing to this type of noise may include loose connection of the electrodes, metal contact with the patient's body and the quality of the electrodes. It is removed by applying a 2nd-order high-pass IIR (Infinite Impulse Response) filter with a cutoff frequency of 0.05 Hz.
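A 2nd-order high-pass IIR stage of this kind can be sketched in Python as follows. This is only an illustration under stated assumptions: the text gives the order and the 0.05 Hz cutoff but not the filter family, so a Butterworth response (q = 1/sqrt(2)) obtained from a standard bilinear-transform biquad design is assumed, and the function names highpass_biquad and apply_iir are hypothetical.

```python
import math


def highpass_biquad(fc, fs, q=1 / math.sqrt(2)):
    """Return (b, a) coefficients of a 2nd-order high-pass IIR section.

    Bilinear-transform biquad design. q = 1/sqrt(2) gives a Butterworth
    response, which is an assumption: the source states only the order
    and the cutoff frequency.
    """
    w0 = 2 * math.pi * fc / fs
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    # Coefficients are normalised so that a[0] == 1.
    b = [(1 + cos_w0) / 2 / a0, -(1 + cos_w0) / a0, (1 + cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a


def apply_iir(b, a, x):
    """Filter the sequence x with a direct-form I difference equation."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        y.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return y


# Cutoff from the text (0.05 Hz) and the sampling frequency quoted later (100 Hz).
b, a = highpass_biquad(fc=0.05, fs=100)
```

With these coefficients a constant offset decays towards zero within a few tens of seconds, while components in the ECG band pass essentially unchanged.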
Once the effects of interference are removed, the clean ECG signal, shown in Figure 4, is passed to the algorithm proposed in this paper. Lead II is chosen for processing.

Windowing Algorithm
The windowing algorithm is based on the following:
• The most prominent peaks in an ECG signal are the R-peaks. These peaks are detected by imposing a threshold condition on the amplitude of the signal, as shown in Equation (1), where $\tau$ and $m$ denote the threshold and the peak value of the signal respectively. The values lying above $\tau$ are the R-peaks of the ECG.
• R-peaks occur periodically in an ECG signal. The threshold condition returns several values in each period containing an R-peak. The particular R-peak in any period is selected by taking the mean of the R values in that period.
• Different periods are selected by finding the difference between consecutive values obtained by the threshold condition. Values in one period are very close to each other, and a sharp variation appears as one period ends. That sharp variation identifies the boundary of a particular period.
• Once the R-peaks have been identified, the R-R intervals, denoted by $t_{rr}$, are calculated as given by Equation (2):

$$t_{rr}(n) = \frac{R_{loc}(n+1) - R_{loc}(n)}{f_s} \tag{2}$$

where $f_s$ represents the sampling frequency (100 Hz) and $R_{loc}$ denotes the R-peak locations. $t_{rr}$ is used in making the windows for the P, Q, S and T waves.
• P and T waves exist in one R-R interval. T waves lie next to the first R-peak, and P waves lie nearer to the second R-peak of the interval.
• The window for the T wave in one R-R interval starts at the first R-peak location plus 15% of the R-R interval and continues to the same location plus 55% of the R-R interval.
• The window for the P wave in one R-R interval starts at the first R-peak location plus 65% of the R-R interval and continues to the same location plus 95% of the R-R interval.
• The particular P and T peak locations are selected by taking the highest value in their respective windows.
• The Q-peak is chosen by selecting the minimum value in the window that starts 20 ms before the corresponding R-peak and ends at that R-peak.
• Similarly, the S-peak is chosen by selecting the lowest value in the window that starts at the R-peak and ends 20 ms after it.
These windows are adaptive because they depend upon the R-R interval values: as this interval changes, the windows change with it.
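The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the threshold fraction (0.6 of the peak value), the grouping gap of 5 samples, the use of the mean sample index as the R-peak location, and the function names are all hypothetical choices made for the example. The window percentages and the 20 ms Q and S windows are taken from the text.

```python
def detect_r_peaks(signal, threshold_fraction=0.6, gap=5):
    """Return one R-peak index per beat.

    threshold_fraction and gap are assumed values; the text states only
    that a threshold tau is applied relative to the peak value m.
    """
    m = max(signal)
    tau = threshold_fraction * m
    above = [i for i, v in enumerate(signal) if v > tau]
    if not above:
        return []
    groups, current = [], [above[0]]
    for idx in above[1:]:
        if idx - current[-1] > gap:  # sharp jump: a new period begins
            groups.append(current)
            current = [idx]
        else:
            current.append(idx)
    groups.append(current)
    # One location per period: the mean index of the samples above tau.
    return [round(sum(g) / len(g)) for g in groups]


def argmax_in(signal, start, stop):
    """Index of the largest sample in signal[start:stop]."""
    return max(range(start, stop), key=lambda i: signal[i])


def argmin_in(signal, start, stop):
    """Index of the smallest sample in signal[start:stop]."""
    return min(range(start, stop), key=lambda i: signal[i])


def locate_waves(signal, r_locs, fs=100):
    """Locate S and T after the first R-peak and P and Q before the
    second R-peak of every R-R interval."""
    qs = max(1, round(0.020 * fs))  # 20 ms expressed in samples
    beats = []
    for r1, r2 in zip(r_locs, r_locs[1:]):
        rr = r2 - r1  # R-R interval in samples
        t = argmax_in(signal, r1 + round(0.15 * rr), r1 + round(0.55 * rr) + 1)
        p = argmax_in(signal, r1 + round(0.65 * rr), r1 + round(0.95 * rr) + 1)
        s = argmin_in(signal, r1, min(len(signal), r1 + qs + 1))
        q = argmin_in(signal, max(0, r2 - qs), r2 + 1)
        beats.append({"R": r1, "S": s, "T": t, "P": p, "Q": q})
    return beats
```

Because the T and P windows are expressed as fractions of each R-R interval, they stretch and shrink with the heart rate, whereas the Q and S windows keep a fixed 20 ms width around the R-peak.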

Equations
ECG timing intervals are calculated as follows.

PR Interval
The P-R interval, denoted by $t_{pr}$, is calculated using the following formula:

$$t_{pr} = \frac{R_{loc} - P_{loc}}{f_s} \tag{3}$$

where $f_s$ denotes the sampling frequency, $R_{loc}$ denotes the R-peak locations and $P_{loc}$ denotes the P-peak locations. The formula yields an array of $t_{pr}$ values, which are then averaged to obtain a single $t_{pr}$.

QRS duration
The QRS duration $t_{qrs}$ is calculated using the following formula:

$$t_{qrs} = \frac{(S_{loc} + x) - (Q_{loc} - x)}{f_s} \tag{4}$$

where $x$ denotes the samples in the immediately adjacent 5 ms. These samples are added to $S_{loc}$ and subtracted from $Q_{loc}$ because the QRS duration is defined from the start of the Q wave to the end of the S wave, as shown in Figure 5. $S_{loc}$ and $Q_{loc}$ denote the S-peak and Q-peak locations respectively. The formula gives an array of $t_{qrs}$ values, which are then averaged to obtain a single $t_{qrs}$.
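The QRS duration calculation described above can be sketched in Python as follows. The function name and argument layout are hypothetical, and the sketch assumes that Q and S locations are sample indices paired beat by beat and that the 5 ms margin is converted to samples as 0.005 times the sampling frequency.

```python
def qrs_duration(q_locs, s_locs, fs, offset_ms=5.0):
    """Mean QRS duration in seconds.

    q_locs and s_locs are Q-peak and S-peak sample indices, one pair per
    beat. offset_ms is the 5 ms margin from the text, widened on both
    sides so the span runs from the start of Q to the end of S.
    """
    x = offset_ms * fs / 1000.0  # margin in samples (may be fractional)
    durations = [((s + x) - (q - x)) / fs for q, s in zip(q_locs, s_locs)]
    return sum(durations) / len(durations)
```

For example, with a sampling frequency of 1000 Hz the margin is 5 samples, so a beat with Q at sample 100 and S at sample 180 contributes (185 - 95) / 1000 = 0.09 s before averaging.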

QT Interval
The QT interval $t_{qt}$ is calculated using Equation (5), in which $T_{loc}$ denotes the T-peak locations. A factor of 0.13 is multiplied by $t_{rr}$ and the result is added to $T_{loc}$, which has the same effect as the addition of 5 ms. A single $t_{qt}$ value is obtained by taking the mean of all values in the resulting array.

QT Corrected
The QT interval depends on the heart rate. An increase in heart rate results in shorter RR and QT intervals [2]. The QT interval is therefore corrected to obtain the corrected QT interval, denoted by $t_{qt(corr)}$, using the correction known as Bazett's formula [3]:

$$t_{qt(corr)} = \frac{t_{qt}}{\sqrt{t_{rr}}} \tag{6}$$

This correction also improves the detection of patients at increased risk of ventricular arrhythmia.
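Bazett's correction divides the QT interval by the square root of the R-R interval, with both expressed in seconds. A minimal Python sketch (the function name is a hypothetical choice) is:

```python
import math


def qt_corrected(t_qt, t_rr):
    """Bazett's correction: QT divided by the square root of RR.

    Both t_qt and t_rr are expected in seconds.
    """
    return t_qt / math.sqrt(t_rr)
```

At 60 BPM the R-R interval is 1 s, so the corrected value equals the measured QT interval. At faster rates the R-R interval is below 1 s and the corrected value is larger than the measured one.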

Vent Rate
The ventricular rate, in beats per minute (BPM), is calculated using Equation (7):

$$BPM = \frac{60}{t_{rr}} \tag{7}$$

A single BPM value is obtained by taking the mean of all values in the resulting array.
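As an illustration, the rate calculation can be sketched in Python, assuming the conventional relation of 60 divided by the R-R interval in seconds and R-peak locations given as sample indices. The function name is hypothetical.

```python
def ventricular_rate(r_locs, fs):
    """Mean heart rate in beats per minute from R-peak sample indices."""
    rr_seconds = [(b - a) / fs for a, b in zip(r_locs, r_locs[1:])]
    bpm = [60.0 / rr for rr in rr_seconds]  # one value per R-R interval
    return sum(bpm) / len(bpm)
```

For R-peaks spaced 80 samples apart at a sampling frequency of 100 Hz, every R-R interval is 0.8 s and the resulting rate is 75 BPM.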

Results
The algorithm has been tested in MATLAB using a large number of normal ECG signals taken from a Fluke PS-410 ECG simulator, ranging from 30 BPM to 200 BPM, as shown in Figure 6. It has also been tested on 42 records of the MIT-BIH arrhythmia database. Two parameters, sensitivity and specificity, are reported in Table 1 and Table 2 respectively. They are computed from TP, FP and FN, where TP denotes a true positive, a condition in which a certain peak is present and is detected correctly. FP is a false positive, a condition in which a feature was detected but was not actually present. FN is a false negative, a condition in which a feature is present but not detected. For normal data, 99% accurate results are obtained.
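The counting of TP, FP and FN can be sketched in Python as follows. Several details here are assumptions made for illustration: the matching tolerance of 5 samples, the greedy matching strategy and the function names are hypothetical, and the second ratio, TP / (TP + FP), is the positive predictivity commonly reported in beat-detection studies. Whether it coincides with the specificity figure used in the tables is not confirmed by the text.

```python
def count_detections(reference, detected, tolerance=5):
    """Count TP, FP and FN for detected peak indices.

    A detection is a true positive when it lies within `tolerance`
    samples of a not-yet-matched reference annotation (assumed rule).
    """
    unmatched = sorted(detected)
    tp = fn = 0
    for ref in sorted(reference):
        hit = next((d for d in unmatched if abs(d - ref) <= tolerance), None)
        if hit is None:
            fn += 1  # feature present but not detected
        else:
            tp += 1  # feature present and detected
            unmatched.remove(hit)
    fp = len(unmatched)  # detections with no matching feature
    return tp, fp, fn


def sensitivity(tp, fn):
    """Fraction of true features that were detected: TP / (TP + FN)."""
    return tp / (tp + fn)


def positive_predictivity(tp, fp):
    """Fraction of detections that were correct: TP / (TP + FP)."""
    return tp / (tp + fp)
```

With reference peaks at samples 50, 150, 250 and 350 and detections at 51, 149, 300 and 352, three detections match, one reference peak is missed and one detection is spurious, so both ratios equal 0.75.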
For the MIT-BIH arrhythmia database, data segments 10 seconds long were used for analysis, and high sensitivity and specificity values were achieved for PQRST detection.
Figure 7 shows the results of the algorithm on two records from the MIT-BIH database, 101.dat and 219.dat. For comparison with existing techniques, algorithms based on the Autoregressive (AR), Wavelet Transform (WT), Eigenvector, Fast Fourier Transform (FFT), Linear Prediction (LP) and Independent Component Analysis (ICA) approaches are considered [5]. Table 3 presents sensitivity and specificity values for all these existing algorithms. A comparison of these values shows that the proposed windowing algorithm is comparable with the Eigenvector method, which is regarded as the best method [5], and better than the wavelet transform and linear prediction methods. The windowing algorithm is also easier to implement on application-based hardware, while the existing methods are complex and require large computation time.
The sensitivity and specificity values for P and T detection are moderate because these waves are harder to detect than the QRS complex features, irrespective of the approach or algorithm used. Reliable detection of the P and T waves is more difficult than QRS complex detection for several reasons, including low amplitudes, low signal-to-noise ratio, amplitude and morphological variability, and possible overlapping of the P wave with the QRS complex [6].

Conclusion
A novel windowing algorithm has been proposed for electrocardiogram wave feature detection. Feature extraction and pattern recognition have been achieved on normal ECG data ranging from 30 BPM to 200 BPM with an accuracy of 99%. The same algorithm was tested on patient data from the MIT-BIH database [7], and highly accurate results were obtained. Interval values obtained using this approach are within ±5% of the exact values. The algorithm was tested on about 100 normal ECG records and 42 records of the MIT-BIH database.

Figure 7. (a) Algorithm result on record 101.dat from MIT-BIH Database; (b) Algorithm result on record 201.dat from MIT-BIH Database; (c) Algorithm result on record 100.dat from MIT-BIH Database; (d) Algorithm result on record 219.dat from MIT-BIH Database.

Table 1. Sensitivity calculation of PQRST detection on the MIT-BIH arrhythmia database.

Table 2. Specificity calculation of PQRST detection on the MIT-BIH arrhythmia database.

Table 3. Sensitivity and specificity percentages of existing algorithms.