EEGcorco: a computer program to simultaneously calculate and statistically analyze EEG coherence and correlation

EEGcorco is a computer program designed to analyze the degree of synchronization between two electroencephalographic (EEG) signals by means of correlation and coherence indexes. The correlation and coherence values permit the quantitative determination of the similarity among EEG signals from homologous areas of the two cerebral hemispheres (interhemispheric), and among localized areas within one cerebral hemisphere (intrahemispheric). EEG coherence is a function of frequency; thus it is commonly presented in a spectral manner (coherence values at every frequency of the spectrum). In contrast, the correlation function has been employed mainly to search for periodic components of bioelectrical signals, and normally appears as point values defined in time; hence it is not common to calculate correlation spectra. EEGcorco offers an easy and novel way to calculate correlation spectra by applying the Fast Fourier Transformation (FFT) to digitized EEG signals. Both correlation and coherence spectra are obtained for individual frequencies and for frequencies grouped in wide bands. Moreover, the program applies parametric statistical analyses to those coherence and correlation spectra, both for each individual frequency and for the frequencies grouped in bands. The program runs on any PC-compatible computer equipped with a Pentium or superior processor and a minimum of 512 MB of RAM (though the higher the capacity the better). The space required on the hard disk depends on the signals to be analyzed, as the output takes the form of text files that occupy very little space. The program was developed entirely in the Delphi environment for the Windows operating system. The efficacy and versatility of EEGcorco allow it to be easily adapted to different experimental and clinical needs.


INTRODUCTION
The electroencephalogram (EEG) is defined as a mixture of rhythmic sinusoidal-like fluctuations in voltage generated by the brain that, it has been suggested, represent the global activity of the pyramidal cells of the cortex and the activity of the neurons in the subcortical structures [1]. It has been used for many years as a sensitive tool that makes it possible to examine brain functionality noninvasively under many physiological conditions, during hormonal and pharmacological manipulations, and while subjects are solving different tasks [2,3]. This technique has an excellent temporal resolution that allows the researcher to obtain recordings of brain electrical activity over periods ranging from milliseconds to hours, days and even months; it is probably due to this advantage that the EEG is still used in numerous laboratories around the world.
Although qualitative EEG analysis is still used in medicine, quantitative analysis of EEGs has become even more common due to the advances offered by personal computers. Such computerized analyses require digitized EEG signals (discrete in amplitude and time) and can be based on two techniques: coherence, in the frequency domain; and correlation, in the time domain.
Coherence and correlation are two mathematical indexes that allow the determination of the degree of similarity between two electroencephalographic signals and the establishment of a possible functional relation among different regions of the brain. Though the two methods are frequently considered equivalent, there are some important differences in the procedures used to calculate them and in the results they provide. Coherence is calculated by dividing the squared magnitude of the cross-spectrum by the product of the autospectra. Therefore, it is sensitive to changes in power as well as to alterations in phase relationships. Consequently, if either power or phase changes in one of the signals, the coherence value is affected. Another important distinction is that the value of coherence for a single epoch or segment is always 1, regardless of the true phase relationship and the differences in power between the two signals. Over successive epochs, the measure of coherence depends on the power and phase of the two signals across the epochs. If there is no variation over time in the original relationship between the two signals, then the coherence value will equal 1. This means that coherence does not give direct information on the true relationship between the two signals, but only on the stability of this relationship with respect to power asymmetry and phase relationships.
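The single-epoch property can be checked with a short worked derivation. The symbols $F_A(x)$ and $F_B(x)$ (instantaneous spectra of the two signals at frequency x) and the epoch index k are introduced here only for illustration. With one epoch there is no averaging, so the cross-spectrum is the product $F_A(x)F_B^{*}(x)$ and

$$\frac{\left|F_A(x)\,F_B^{*}(x)\right|^{2}}{\left|F_A(x)\right|^{2}\left|F_B(x)\right|^{2}} = 1$$

whatever the amplitudes and phases are. With nd epochs the spectra are averaged before the ratio is formed, and the Cauchy-Schwarz inequality gives

$$\frac{\left|\frac{1}{nd}\sum_{k} F_{A,k}(x)\,F_{B,k}^{*}(x)\right|^{2}}{\left(\frac{1}{nd}\sum_{k}\left|F_{A,k}(x)\right|^{2}\right)\left(\frac{1}{nd}\sum_{k}\left|F_{B,k}(x)\right|^{2}\right)} \le 1$$

with equality only when $F_{B,k}(x) = c\,F_{A,k}(x)$ for the same complex constant c in every epoch, that is, when the amplitude ratio and phase difference are stable across epochs.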
In contrast, correlation may be calculated over a single epoch or over several epochs and is sensitive to both phase and polarity, regardless of amplitude. The calculation of coherence involves squaring the signal, which results in coherence values of 0 to 1, and a loss of polarity information. Unlike coherence, correlation is sensitive to polarity; hence correlation values range from -1 to 1 [4].
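The sensitivity of correlation to polarity, and its insensitivity to amplitude, can be illustrated with a minimal sketch. It uses only the Python standard library; the function name `pearson` and the test waveform are illustrative choices, not part of EEGcorco.

```python
import math

def pearson(a, b):
    """Pearson product-moment correlation between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical test waveform: 3 cycles of a sine over 128 samples.
wave = [math.sin(2 * math.pi * 3 * n / 128) for n in range(128)]

r_scaled = pearson(wave, [10 * v for v in wave])   # same shape, larger amplitude
r_inverted = pearson(wave, [-v for v in wave])     # same shape, opposite polarity

print(r_scaled, r_inverted)
```

The amplified copy yields a correlation of 1 and the inverted copy a correlation of -1; squaring either value gives 1, which is the loss of polarity information mentioned above.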
From a historical point of view, correlation and coherence have evolved in different ways: the former is a product of work carried out by statisticians and mathematicians, whereas the latter has been developed principally by engineers, as it is based on spectral analysis, which is a fundamental tool in diverse areas of engineering [5]. The mathematical method for calculating the product-moment correlation coefficient was created by Karl Pearson [6].
Early publications related to the use of coherence in the analysis of EEG signals appeared after the publication of Cooley and Tukey's work [7], which reported the algorithm required to rapidly calculate the Discrete Fourier Transformation. Since then, EEG coherence has evolved as a method that involves calculating the spectra of the signals in a reasonably short time.
EEG coherence, meanwhile, is a function of frequency; thus it is commonly presented in a spectral manner (coherence values in every frequency of the spectrum) (Figure 1).
On the other hand, the correlation function has been employed mainly to search for the periodic components of bioelectrical signals, and normally appears as point values defined in time. This analysis can be applied if there are two continuous variables whose relation is linear and whose scores have been obtained in independent pairs [8].
In recent years, interest in investigating the functional relationships among different cortical areas has increased, especially the way that this relationship changes from one physiological state to another. This interest is based on the assumption that electroencephalographic similarity between two cortical areas reflects similarity in the underlying neurophysiological processes, such as the same inputs, similar information processing, or broad connections between them. In the opposite case, when the underlying neurophysiological processes of two cortical areas are different, the EEG signals from the two areas are different as well [9][10][11][12]. In other words: the greater the functional relation between the two areas, the greater the similarity in their respective activity [11,13].
Considering the broad utility and application of these brain synchronization methods in both basic and clinical research, this report describes a computer program, EEGcorco, that has been designed to obtain, rapidly and simultaneously, the coherence and correlation spectra of EEG signals, as well as (parametric) statistical comparisons among groups or conditions of the signals involved.

Algorithm for Calculating Coherence
The program calculates coherence from the Fast Fourier Transformation (FFT). Figure 2 presents a diagram indicating the steps necessary to make this calculation. Coherence is defined in the frequency domain and its spectrum can be calculated by Eq. 1:

$$\mathrm{Coh}_{AB}(x) = \frac{\left|S_{AB}(x)\right|^{2}}{S_{AA}(x)\,S_{BB}(x)} \tag{1}$$

where:
x = 0, 1, 2, … is the frequency index.
$S_{AB}(x)$ is the crossed spectrum between signals A and B in the x frequency.
$S_{AA}(x)$ is the autospectrum of signal A in the x frequency.
$S_{BB}(x)$ is the autospectrum of signal B in the x frequency.

From the digital signals, the instantaneous spectrum (1 segment) of each signal is obtained and, later, the autospectra ($S_{AA}$ and $S_{BB}$) and crossed spectrum ($S_{AB}$) over nd segments. The coherence spectrum is calculated on the basis of the autospectra and crossed spectrum.
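The steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not EEGcorco's Delphi code: a naive discrete Fourier transform from the standard library replaces the FFT to keep the sketch dependency-free, and the names `dft_bin`, `coherence_spectrum` and `tone`, as well as the synthetic signals, are hypothetical.

```python
import cmath
import math

def dft_bin(segment, x):
    """Naive DFT coefficient of one segment at frequency index x."""
    n_points = len(segment)
    return sum(v * cmath.exp(-2j * math.pi * x * n / n_points)
               for n, v in enumerate(segment))

def coherence_spectrum(segments_a, segments_b):
    """Coherence for x = 0 .. N/2: |S_AB|^2 / (S_AA * S_BB), averaged over nd segments."""
    nd = len(segments_a)
    n_points = len(segments_a[0])
    spectrum = {}
    for x in range(n_points // 2 + 1):
        s_aa = s_bb = 0.0
        s_ab = 0j
        for seg_a, seg_b in zip(segments_a, segments_b):
            fa, fb = dft_bin(seg_a, x), dft_bin(seg_b, x)
            s_aa += abs(fa) ** 2 / nd            # autospectrum of A
            s_bb += abs(fb) ** 2 / nd            # autospectrum of B
            s_ab += fa * fb.conjugate() / nd     # crossed spectrum
        power = s_aa * s_bb
        spectrum[x] = abs(s_ab) ** 2 / power if power > 0 else 0.0
    return spectrum

N = 64  # points per segment (kept small because the DFT above is naive)

def tone(k, phase, amp=1.0):
    """Sine wave with k cycles per segment and the given phase and amplitude."""
    return [amp * math.sin(2 * math.pi * k * n / N + phase) for n in range(N)]

# Four synthetic segments: at index 4 the phase difference between A and B is
# constant (0.7 rad); at index 9 it changes from one segment to the next.
segments_a, segments_b = [], []
for s in range(4):
    a = [u + v for u, v in zip(tone(4, 0.3 * s), tone(9, 0.0))]
    b = [u + v for u, v in zip(tone(4, 0.3 * s + 0.7, amp=2.0), tone(9, 1.0 * s))]
    segments_a.append(a)
    segments_b.append(b)

coh = coherence_spectrum(segments_a, segments_b)
print(coh[4], coh[9])
```

In this example the coherence at index 4 is essentially 1 despite the amplitude difference and the phase lag, because the relationship is stable across segments, while the value at index 9 is clearly lower. Indices that carry no signal power contain only rounding noise and should not be interpreted.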

Algorithm for Calculating Correlation
Having ascertained each of the components (frequencies) of the bioelectrical signals in the time domain, the Pearson product-moment correlation coefficient for each frequency can be calculated using Eq. 2:

$$r_{AB}(x) = \frac{\mathrm{cov}_{AB}(x)}{\sqrt{\mathrm{var}_{AA}(x)\,\mathrm{var}_{BB}(x)}} \tag{2}$$

where:
x = 0, 1, 2, …, N - 1 (N frequencies into which signals A and B are decomposed).
$\mathrm{cov}_{AB}(x)$ is the covariance between signals A and B, in the x frequency.
$\mathrm{var}_{AA}(x)$ is the variance of signal A in the x frequency.
$\mathrm{var}_{BB}(x)$ is the variance of signal B in the x frequency.
The correlation coefficient can also be calculated following a method similar to that used in the coherence calculation. It involves the use of the spectral amplitudes of the signals, is illustrated in Figure 3, and is explained in steps 1 to 4:

1) Discrete Direct Fourier Transformation: Eq. 3 indicates how, from Eqs. 4 and 5, it is possible to obtain the instantaneous spectrum of a digitized signal. Due to the properties of the Discrete Fourier Transformation, the number of frequencies into which the elements of a digital signal can be separated is N/2, plus an element that represents the average level of the data (known as the level of direct current, or "DC" level); the remaining N/2 - 1 frequencies are an image of the first ones.

2) Discrete Inverse Fourier Transformation: Eq. 6 makes it possible to return a digitized signal from the frequency domain to the time domain.

3) The autospectra and crossed spectrum of the digitized signals are given by Eqs. 7-16 (autospectrum of signal A, autospectrum of signal B, and crossed spectrum between signals A and B), where nd is the number of segments, the spectra entering the equations are the instantaneous spectra of signals A and B in the x frequency, and the conjugate of a complex number is obtained by reversing the sign of its imaginary part.

4) By applying the Inverse Fourier Transformation to the crossed spectrum, the cross-correlation function is obtained, as expressed in Eq. 17. With respect to this correlation function, the only element of interest is the correlation at time zero for each x frequency; thus n will always be zero. In other words, the only factor of interest is the first place (correlation at time zero) of the correlation function, which leads to Eq. 18. This simplification is possible because n = 0. It is clear that calculating the first place of the correlation function does not require the use of the imaginary part of the crossed spectrum. Moreover, if it is supposed that for each x frequency all others are zero (as in an ideal filter, where the signals contain only one frequency), then for each x frequency the correlation function at time zero is calculated by Eq. 19, where x runs over the N/2 frequencies into which the signals are decomposed (x starts at one because the "DC" component is not considered). Since Fre(x) in Eq. 19 is equal to the real part of the crossed spectrum, Eq. 20 follows. Upon considering a single segment (nd = 1, which is possible for the correlation), Eqs. 21 and 22 are obtained.

With these correlation values for each one of the N/2 frequencies, it becomes possible to calculate the correlation spectrum for the two signals involved. However, their values do not lie between -1 and +1, which is the most convenient range for interpreting them. To obtain values in this range, the value for every x frequency must be divided by the square root of the product of place zero of the inverse transformation of the autospectra of the signals for the same x frequency; i.e., the autocorrelation at time zero of each one of the A and B signals (facA and facB), as indicated in Eq. 23, where facA and facB are defined by Eqs. 24 and 25. However, considering that the only point of interest is place zero of the Inverse Transformation (n = 0), Eqs. 26 and 27 are developed. Since the autospectra do not contain an imaginary part, and considering a value of zero for all frequencies other than x, FreA(x) and FreB(x) are equivalent to the autospectra of the A and B signals, respectively (Eqs. 28 and 29). Therefore, for one segment (nd = 1), Eqs. 30 and 31 are obtained. Upon substituting Eqs. 20, 30 and 31 in Eq. 32, the correlation for each x frequency is obtained from the real part of the crossed spectrum and the autospectra. In this way, it is possible to calculate the correlation spectrum (correlation values for every frequency into which the signal is decomposed) between two cerebral areas.

Figure 3. Method for calculating Pearson's correlation in each x frequency. Analog signals A and B are digitized through an analog-to-digital (A/D) converter. From those digital signals, the instantaneous (1 segment) autospectra ($S_{AA}$ and $S_{BB}$) and crossed spectrum ($S_{AB}$) are calculated. Correlation in each x frequency is calculated on the basis of the autospectra and the real part of the crossed spectrum for this x frequency.

Use of the EEGcorco program is very simple. It requires a text file (the "file of names") that contains, on each line, the name of one data file (also in text format). Each data file contains only one datum per line. All signals provided to the program must first be examined in order to eliminate all segments contaminated with artifacts.
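Returning to the correlation spectrum derived in this section, the result can be read as $r_{AB}(x) = \mathrm{Re}\,[S_{AB}(x)] / \sqrt{S_{AA}(x)\,S_{BB}(x)}$, that is, the real part of the crossed spectrum normalized by the autospectra. The sketch below implements that reading; it is an assumption based on the verbal description, not a transcription of the program's equations or code. A naive DFT from the Python standard library stands in for the FFT, and the names `dft_bin` and `correlation_spectrum` and the synthetic signals are hypothetical.

```python
import cmath
import math

def dft_bin(segment, x):
    """Naive DFT coefficient of one segment at frequency index x."""
    n_points = len(segment)
    return sum(v * cmath.exp(-2j * math.pi * x * n / n_points)
               for n, v in enumerate(segment))

def correlation_spectrum(segments_a, segments_b):
    """Correlation for x = 1 .. N/2 (the DC term is skipped).

    Computes Re(S_AB) / sqrt(S_AA * S_BB); one segment is enough.
    """
    n_points = len(segments_a[0])
    spectrum = {}
    for x in range(1, n_points // 2 + 1):
        s_aa = s_bb = s_ab_real = 0.0
        for seg_a, seg_b in zip(segments_a, segments_b):
            fa, fb = dft_bin(seg_a, x), dft_bin(seg_b, x)
            s_aa += abs(fa) ** 2
            s_bb += abs(fb) ** 2
            s_ab_real += (fa * fb.conjugate()).real
        # The 1/nd averaging factor cancels between numerator and denominator.
        denom = math.sqrt(s_aa * s_bb)
        spectrum[x] = s_ab_real / denom if denom > 0 else 0.0
    return spectrum

N = 64  # points per segment (kept small because the DFT above is naive)

# One synthetic segment per signal: at index 4, B is the inverted copy of A;
# at index 9, B is shifted by a quarter cycle (sine versus cosine).
sig_a = [math.sin(2 * math.pi * 4 * n / N) + math.sin(2 * math.pi * 9 * n / N)
         for n in range(N)]
sig_b = [-math.sin(2 * math.pi * 4 * n / N) + math.cos(2 * math.pi * 9 * n / N)
         for n in range(N)]

r = correlation_spectrum([sig_a], [sig_b])
print(r[4], r[9])
```

With a single segment the value at index 4 is -1 (opposite polarity) and the value at index 9 is essentially 0 (quarter-cycle shift), whereas the coherence of a single segment would be 1 at both indices. Indices without signal power contain only rounding noise.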

Parametric Statistical Tests
Before executing the program, it is necessary to know precisely which statistical design is suitable for the data to be analyzed. EEGcorco compares the correlation or coherence values among groups using the following parametric statistical tests: correlated and uncorrelated Student's t, correlated and uncorrelated ANOVA of one and two factors, and split-plot ANOVA of two factors. In addition, when ANOVA tests are performed, the program automatically calculates Duncan's and Tukey's post hoc tests. Figure 4 graphically represents the different designs in which the data can be arranged; in each design, the statistical test used is indicated. It is important to clarify that in all those designs the number of subjects in every cell must be the same (i.e., unequal cell sizes are not allowed). As Figure 4 shows, cells are numbered in sequential order from left to right and from top to bottom. Each signal (recorded channel) must be in an individual file (in text format, with a specific name). The names of the individual files are then listed in a main file in ASCII format, which we call the "file of names". In this file, the individual files are arranged following the order of the cells.
Figure 5 exemplifies a "file of names" that contains data from a two-factor mixed design (2 × 2).First, the names of the files that constitute the first cell (group 1 in the first condition) appear, followed by cell 2 (group 1 in the second condition), cell 3 (group 2 in the first condition) and, finally, cell 4 (group 2 in the second condition).
It is important to clarify that EEGcorco does not function if any files are missing; that is, it is necessary to have all the files for all subjects (and all must appear in the "file of names").
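The input convention described above (a "file of names" listing one data file per line, each data file holding one datum per line, and no missing files) can be sketched as follows. This is only an illustration: the function name `load_signals`, the file names used in the demonstration, and the assumption that data files sit in the same folder as the "file of names" are all hypothetical.

```python
import tempfile
from pathlib import Path

def load_signals(names_file):
    """Read the 'file of names' and load every listed data file.

    Assumption: data files are located in the same folder as the file of names.
    Raises FileNotFoundError if any listed file is missing.
    """
    names_path = Path(names_file)
    folder = names_path.parent
    names = [line.strip() for line in names_path.read_text().splitlines()
             if line.strip()]
    missing = [name for name in names if not (folder / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing data files: {missing}")
    # One datum per line; the listing order (cell order) is preserved.
    return {name: [float(v) for v in (folder / name).read_text().split()]
            for name in names}

# Demonstration with two tiny hypothetical data files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "S01F3.N10").write_text("1.5\n-0.5\n2.0\n")
    (folder / "S01F4.N10").write_text("0.25\n0.75\n-1.0\n")
    (folder / "DEMO.DIR").write_text("S01F3.N10\nS01F4.N10\n")
    signals = load_signals(folder / "DEMO.DIR")

print(signals)
```

Failing early on a missing file mirrors the requirement that all files for all subjects be present before any analysis starts.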
EEGcorco transforms the coherence and correlation data to Fisher's Z values in order to bring them as close as possible to a normal distribution (a requirement for the application of parametric statistics) [45].
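For reference, Fisher's Z transformation of a correlation value r is z = atanh(r) = 0.5 ln((1 + r) / (1 - r)). A minimal sketch follows; the function names are illustrative, and whether EEGcorco applies the transformation to coherence directly or to some function of it (such as its square root) is not stated here, so only the correlation case is shown.

```python
import math

def fisher_z(r):
    """Fisher's Z transformation; defined for -1 < r < 1."""
    return math.atanh(r)

def inverse_fisher_z(z):
    """Back-transformation from Z to the original correlation scale."""
    return math.tanh(z)

correlations = [-0.9, 0.0, 0.5, 0.9]   # hypothetical correlation values
z_values = [fisher_z(r) for r in correlations]
recovered = [inverse_fisher_z(z) for z in z_values]

print(z_values)
```

The transformation stretches values near -1 and +1, where the sampling distribution of r is most skewed, and leaves values near 0 almost unchanged.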
When the program is executed, an initial screen appears in which the user must provide the parameters for program execution. Figure 6 shows an initial screen in which those parameters have been filled in. The user must indicate whether there are 2 or 4 channels (in other words, whether 2 or 4 derivations will be analyzed); the number of levels of the factors to be considered (for the statistics); whether sampling was done at 256, 512 or 1024 Hz (the only sampling rates allowed); and whether the number of points in each segment is 256, 512 or 1024 (the only segment lengths allowed by the program). The condition that each segment have a duration of at least one second must always be respected. By default, the program sets the limits of the EEG bands as follows: band 1, from 1 to 3 Hz; band 2, from 4 to 7 Hz; band 3, from 8 to 10 Hz; band 4, from 11 to 13 Hz; band 5, from 14 to 19 Hz; band 6, from 20 to 30 Hz; and band 7, from 31 to 50 Hz. Nevertheless, in the initial screen the user can modify the limits of the first six bands of analysis (the limits of band 7 cannot be modified). If the user works with correlated or mixed designs, it is possible to choose to subtract the first cell from the other ones, since it is considered to be the baseline.
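The parameter constraints and default band limits listed above can be expressed compactly. The sketch below is illustrative only: the names are hypothetical, and averaging the per-frequency values inside each band is an assumption made for the example, since the way EEGcorco combines frequencies into bands is not detailed in this section.

```python
ALLOWED_VALUES = (256, 512, 1024)   # allowed sampling rates (Hz) and points per segment

# Default band limits in Hz, inclusive.
DEFAULT_BANDS = {1: (1, 3), 2: (4, 7), 3: (8, 10), 4: (11, 13),
                 5: (14, 19), 6: (20, 30), 7: (31, 50)}

def frequency_resolution(sampling_rate, points_per_segment):
    """Return the spacing in Hz between spectral lines, after checking the inputs."""
    if sampling_rate not in ALLOWED_VALUES or points_per_segment not in ALLOWED_VALUES:
        raise ValueError("only 256, 512 or 1024 are accepted")
    if points_per_segment < sampling_rate:
        raise ValueError("each segment must last at least one second")
    return sampling_rate / points_per_segment

def band_means(values_by_hz, bands=DEFAULT_BANDS):
    """Average the per-frequency values within each band (assumed grouping rule)."""
    means = {}
    for band, (low, high) in bands.items():
        in_band = [v for hz, v in values_by_hz.items() if low <= hz <= high]
        if in_band:
            means[band] = sum(in_band) / len(in_band)
    return means

# Hypothetical spectrum with one value per integer frequency from 1 to 50 Hz.
values_by_hz = {hz: hz / 100 for hz in range(1, 51)}
band_values = band_means(values_by_hz)
resolution = frequency_resolution(512, 512)

print(resolution, band_values)
```

With 512 points sampled at 512 Hz, each segment lasts one second and the spectral lines are 1 Hz apart; 1024 points at 256 Hz would give four-second segments and a spacing of 0.25 Hz.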

PROGRAM PERFORMANCE
When the user presses the "Start the analysis" button, the program initiates the calculations and the result files listed in Table 1 are obtained. All these files are in text format and are arranged in matrix form (columns and rows), which makes them easy to process in programs such as Excel or SPSS. As can be seen in Table 1, there are 20 output files; all of them take the name of the "file of names", though their extensions (the last 3 characters) are different. The ERR file contains the interhemispheric correlation spectrum for each frequency (from 1 to 30 Hz); ERH has the interhemispheric coherence spectrum for these frequencies; the EZR and EZH files contain, respectively, the interhemispheric correlation spectrum and the interhemispheric coherence spectrum for each frequency transformed to Fisher's Z. TER has the interhemispheric correlation spectrum of the 7 bands with both transformed and non-transformed data; TEH contains the equivalent coherence spectrum. ARR and ARH contain the intrahemispheric spectra (1 to 30 Hz) of correlation and coherence for each frequency (in the anterior and posterior areas of the same hemisphere); AZR and AZH contain the respective correlation and coherence spectra transformed to Fisher's Z. TRR and TRH show the intrahemispheric correlation and coherence spectra grouped in the 7 bands considered; each file contains both transformed and non-transformed values. The RES file contains the results of all the calculated spectra, both for frequencies and bands, without transformation, while RET contains the transformed values.
Figure 7 shows the results of applying split-plot analysis of variance to each variable (columns) contained in the RES and RET result files. The tests that turned out to be significant (p < 0.05) in the AVA and AVT files are indicated by an asterisk. When the statistical design contains only 2 cells, the Student's t test for independent or correlated groups should be applied. Figure 8 shows part of the result files TES (for non-transformed data) and TET (for transformed data), obtained upon applying the Student's t test to cells that contain independent groups.
If the user applies analysis of variance, EEGcorco also produces the TUK (for non-transformed data) and TUT (for transformed data) files. These contain the comparisons among cells, using both the Duncan and Tukey tests, in order to determine the significant differences among groups (or among conditions).
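For the two-cell case, the Student's t statistics mentioned above follow the standard textbook formulas, sketched here with the Python standard library. The function names and the sample values are hypothetical, and the sketch stops at the t statistic and its degrees of freedom; obtaining p values requires a t distribution, which the standard library does not provide. ANOVA and the post hoc tests are not covered.

```python
import math
import statistics

def t_independent(cell_1, cell_2):
    """Student's t for independent groups (pooled variance); returns (t, df)."""
    n1, n2 = len(cell_1), len(cell_2)
    pooled = ((n1 - 1) * statistics.variance(cell_1)
              + (n2 - 1) * statistics.variance(cell_2)) / (n1 + n2 - 2)
    t = (statistics.mean(cell_1) - statistics.mean(cell_2)) / math.sqrt(
        pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def t_correlated(cell_1, cell_2):
    """Student's t for correlated (paired) measurements; returns (t, df)."""
    diffs = [a - b for a, b in zip(cell_1, cell_2)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical Fisher Z values at one frequency, four subjects per cell
# (equal cell sizes, as the program requires).
cell_1 = [0.8, 0.9, 1.0, 1.1]
cell_2 = [0.55, 0.6, 0.65, 0.8]

t_ind, df_ind = t_independent(cell_1, cell_2)
t_rel, df_rel = t_correlated(cell_1, cell_2)
print(t_ind, df_ind, t_rel, df_rel)
```

The same data give a much larger t when treated as correlated measurements, because pairing removes the between-subject variability from the error term; this is why the design must be chosen before the analysis is run.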

HARDWARE AND SOFTWARE SPECIFICATIONS
EEGcorco has been written in Delphi and runs in the Windows environment on any PC-compatible computer that has at least a Pentium processor and 512 MB of RAM (though program performance improves with more RAM). The program requires little space on the hard disk because both the signals to be analyzed and the output files are in text format (ASCII) and thus occupy only a small amount of disk space. Memory requirements or limitations are determined by the amount of data to be processed. The program requires that the signals be digitized (discrete in amplitude and time) from analog signals (continuous in amplitude and time); hence, it is necessary to take "N" points (samples) that are equally spaced in time for every signal segment. Another requirement is that several segments representing the conditions of interest be taken.

LESSONS LEARNED AND AVAILABILITY
In the present article, we have presented the computer program EEGcorco, which offers an easy and novel way to obtain complex quantitative analyses of EEG synchronization. This program allows one to calculate, in a fast and simultaneous manner, the correlation and coherence spectra of EEG signals, as well as their respective parametric statistical analyses. The correlation and coherence spectra can be obtained for both narrow and broad bands in a very short time, together with several parametric statistical tests. Unlike most commercial programs designed to analyze the EEG, EEGcorco extracts both coherence and correlation. Correlation has several advantages over coherence, since it is sensitive to both phase and polarity, regardless of amplitude. Coherence does not provide direct information as to the true relationship between two signals; it only reflects the stability of this relationship with respect to power asymmetry and the phase relationship. For these reasons, when interest focuses on waveform and time coupling between two brain regions, correlation is a better choice than coherence.
The main contribution of EEGcorco is that it allows simultaneous inter- and intrahemispheric correlation and coherence spectra calculations, together with the application of adequate parametric statistical analyses, a combination for which, to our knowledge, there are no commercially available programs.
EEGcorco offers numerous advantages: it runs on any PC, requires no complex equipment, and its output files take up little space on the hard disk. The versatility and flexibility of this program make it easily adaptable to diverse experimental and clinical needs. Moreover, the fact that EEGcorco can also be easily used on portable computers means that it can solve the problems that may arise when analyzing signals in locations outside the laboratory; for example, in hospitals or schools.
EEGcorco has been used by our work group as well as by other researchers in both basic and clinical studies. For example, in a previous study we demonstrated that the functional coupling between the prefrontal and parietal cortices shows a characteristic pattern, specific to each age group (male children, teenagers and young adults), during performance of the Hanoi task, a neuropsychological test widely used to evaluate executive functions such as planning [46]. In other studies, we found that alcohol decreases the correlation between frontal and parietal areas in humans, and between subcortical structures in rats [47,48]. Although only the correlation analysis was used in those investigations, in future studies we will use and compare both correlation and coherence analyses in clinical populations, particularly female adolescents with posttraumatic stress disorder and abused children. Finally, we believe that the analysis presented in this paper could be useful in clinical practice, since the coupling between cortical areas could predict cognitive functioning in neurodegenerative states such as Alzheimer's disease [49,50].
Although EEGcorco offers several advantages, it has some limitations. One very important condition is that the data of the different derivations and conditions of each subject must be very well organized in their respective files; if this condition is met, the program will run the required statistical test(s) with no problem. Another limitation of this program is that it cannot analyze more than four derivations simultaneously. Nevertheless, for studies that require more than four derivations, the program can be run several times, until all the comparisons of interest have been carried out. This program was created in order to answer specific questions about functional similarity between two or among four brain regions.
EEGcorco is available upon request.

Figure 2. Method for calculating coherence. The analog signals A and B are digitized through the analog-to-digital (A/D) converter. From the digital signals, the instantaneous spectrum (1 segment) of each signal is obtained and, later, the autospectra ($S_{AA}$ and $S_{BB}$) and crossed spectrum ($S_{AB}$) (nd segments). The coherence spectrum is calculated on the basis of the autospectra and crossed spectrum.

EEGcorco is a flexible program since it works with EEG signals transformed to ASCII format. Most commercial computer programs designed to acquire EEG signals (e.g., Neuroscan Scan IV, Grass Technology PolyVIEW or Medicit Track Waker) allow their data to be exported to ASCII format.

Figure 4. Schematic representation of the different statistical designs that can be applied by means of the EEGcorco program. The statistical test used is indicated in each design.

Figure 5. The order in which the file names must be arranged in the "file of names" supplied to the EEGcorco program. Subjects 01, 21, 22 and 24 belong to the first group, while subjects 09, 10, 12 and 14 belong to the second one. Each group was recorded in 2 conditions, HABA and HAHA (hence, there are 4 cells). Also, each subject was recorded in 4 regions (derivations): F3, F4, P3 and P4. The extension of the files is N10 (but can be of any length).

Figure 6. In this example of an initial screen, the user has indicated to the program that there are 4 channels, whose signals are divided into segments of 512 points and were sampled at a rate of 512 Hz. The decision was taken to group the frequencies into the 7 bands with the pre-established limits, without subtracting the first cell. This is a mixed 2 × 2 design (4 cells). The "file of names" is EEGcorco.DIR.

Figure 7. Part of the AVA and AVT result files, in which the correlation and coherence spectra were calculated from 4 channels employing a 2 × 2 mixed statistical design. The columns present the significance values for factor A [p (FA)], factor B [p (FB)], and the A × B interaction [p (FAB)].

Figure 8. Parts of both the TES and TET results files.A comparison has been made among the cells where the Student's t test was applied for independent groups.

Table 1. Names of EEGcorco's output files. Files whose names end in R are the correlation files, while those ending in H are the coherence files. The rest of the files contain both correlation and coherence.