Coherence Modified for Sensitivity to Relative Phase of Real Band-Limited Time Series

As is well known, coherence does not distinguish the relative phase of a pair of real, sinusoidal time series; the coherence between them is always unity. This behavior can limit the applicability of coherence analysis in the special case where the time series are band-limited (nearly-monochromatic) and where sensitivity to phase differences is advantageous. We propose a simple modification to the usual formula for coherence in which the cross-spectrum is replaced by its real part. The resulting quantity behaves similarly to coherence, except that it is sensitive to relative phase when the signals being compared are strongly band-limited. Furthermore, it has a useful interpretation in terms of the zero-lag cross-correlation of real band-passed versions of the time series.


Introduction
In this paper, we examine the well-known formula for the frequency-dependent coherence $C^2$ between two time series and argue that it is not well-suited for quantifying the similarity of band-limited data. Using a time domain-based analysis, we identify a critical step in the development of the traditional algorithm, which we show is inappropriate in the band-limited case, and propose an alternative that leads to the definition of a new quantity $S^2$, which, while having a definition similar to $C^2$, is better behaved. We then use both synthetic tests and analytic methods to elucidate the behavior of $S^2$, and show that it is a viable alternative to $C^2$. Our belief is that the choice of time series analysis technique should be guided by the properties of the data; one analyzes time series in a way designed to best extract knowledge from them. One should always be willing to adapt an analysis method to achieve this goal.
The issue considered here is how best to quantify the similarity between time series that are 1) real (as contrasted to complex) and 2) band-limited (in the sense of being nearly monochromatic). Such time series constitute important special cases because most natural phenomena are described using real numbers and many are dominated by a single period of oscillation. For example, the daily period often contributes strongly to physiological and meteorological signals, the annual period to environmental and climatic signals, the precessional period (25.7 ka) [1] to sedimentary and paleontological signals, and so forth. Furthermore, commonly-used techniques such as multiple window coherence analysis [2], where two long time-series are divided into a sequence of shorter pairs before coherence analysis is performed, may accentuate the degree to which a single period of oscillation dominates the signal.
An important property of nearly-monochromatic signals is their relative phase. Whether two time series are in-phase (as in Figure 1(a)) or out-of-phase (as in Figure 1(b)) may be important, for example, from the perspective of an analyst trying to unravel the dynamics of the underlying causative processes.
Traditional coherence analysis [3] has very limited application in this case, because of the well-known insensitivity of coherence to relative phase. The coherence of two sinusoidal time series of the same period is always unity, irrespective of their relative phase. Simply put, coherence does not distinguish a sine from a cosine. Given the general usefulness of coherence in other settings, it is well to ask why it "fails" in this special case and whether it can be modified to produce what may, in some circumstances, be a more useful measure of similarity.
When asking why any quantity encountered in time series analysis, such as coherence, behaves in a certain way, one must contend with the fact that most, if not all, such quantities can be derived from several different perspectives. Any answer will probably make sense only from one of these points of view. Consider, for example, the estimated mean of a time series. This deceptively simple quantity can be understood, alternately, as arising through the minimizing of error (a deterministic derivation) [4] or through the maximizing of likelihood (a probabilistic derivation) [5] or through the maximization of importance (an informational derivation) [6], to name just a few. The answer to a question concerning the estimated mean, say, for example, whether it should always be bounded by the smallest and largest datum, will necessarily refer to one of these perspectives. The same is true for coherence. We adopt here a deterministic perspective: the coherence between two time series, at frequency $\omega_0$, is closely related to the zero-lag cross-correlation of band-passed versions of those time series, where the band-pass filter is one-sided and has center-frequency $\omega_0$. In fact, the former is merely a normalized and squared version of the latter.
This is but one perspective among many, but one we find helpful because it brings out a relationship to the cross-correlation, another quantity useful in assessing the similarity between two time series. Cross-correlation is defined in the time-domain, as contrasted to coherence, which is defined in the frequency-domain, so the link provides complementary information.
The appearance of a one-sided filter (Figure 2(a)) may seem counter-intuitive, because such filters are almost never used in practice, or at least not when the data are real, for they turn a real time series into a complex one. All the band-pass filters that an analyst would commonly use are two-sided (Figure 2(b)), and so have real output. The reason for its appearance here is that the usual definition of coherence is completely general. It does not presume that the signals being compared are real, and so builds in the possibility that negative and positive frequency components of the time series behave completely differently from one another. This is in contrast to real time series, where they are complex conjugate pairs. However, in being general, it cannot exploit an important property of real signals: that sines and cosines are distinguishable from one another. As we show below, substituting a two-sided filter produces a version of coherence that distinguishes sines from cosines; that is, one that is sensitive to the relative phase of band-limited signals.
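The contrast between the two filter types can be illustrated numerically. The following is a minimal sketch rather than the author's code: the helper names (`dft`, `idft`, `boxcar_filter`), the test signal, and the band parameters are illustrative assumptions, and a direct DFT is used only to keep the example free of dependencies.

```python
import cmath
import math

def dft(x):
    """Direct discrete Fourier transform of the sequence x."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse of dft(); returns a list of complex samples."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def boxcar_filter(x, k0, half_width, two_sided):
    """Keep DFT bins within half_width of +k0 (and of -k0 if two_sided)."""
    n = len(x)
    X = dft(x)
    keep = set(range(k0 - half_width, k0 + half_width + 1))
    if two_sided:
        keep |= {(n - k) % n for k in keep}  # mirror band at negative frequencies
    return idft([X[k] if k in keep else 0.0 for k in range(n)])

# Hypothetical test signal: a real cosine at bin 5 with an arbitrary phase.
n = 64
x = [math.cos(2 * math.pi * 5 * t / n + 0.3) for t in range(n)]

one_sided = boxcar_filter(x, k0=5, half_width=1, two_sided=False)
two_sided = boxcar_filter(x, k0=5, half_width=1, two_sided=True)

max_imag_one = max(abs(z.imag) for z in one_sided)  # substantial: output is complex
max_imag_two = max(abs(z.imag) for z in two_sided)  # rounding error only: output is real
```

In this sketch the one-sided filter returns a complex exponential (half the cosine plus an imaginary sine of equal amplitude), whereas the two-sided filter returns the original real cosine.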

Coherence-Like Measure of Similarity Based on Cross-Correlation
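The problem we consider is how to quantify the similarity of two real, transient time series, $u(t)$ and $v(t)$, in the vicinity of a specified frequency, $\omega_0$. The strategy we adopt is to band-pass filter these time series and then to compute their zero-lag cross-correlation. The filter selects out frequencies near $\omega_0$ and the cross-correlation quantifies similarity, since it attains its largest value when $u(t) = v(t)$ (ignoring, for the moment, the issue of normalization). We denote the filtered time series as $f * u$ and $f * v$, where the symbol $*$ denotes convolution. We require the filtered time series to be purely real, so that the filter, $f(t)$, has a two-sided Fourier transform with the symmetry $\tilde{f}(\omega) = \tilde{f}^{*}(-\omega)$, where the tilde denotes Fourier transformation and the asterisk denotes complex conjugation. We choose a filter with a purely real Fourier transform, built from two unit-amplitude boxcar functions, one centered at $+\omega_0$ and the other at $-\omega_0$, each of width $2\Delta\omega$. This filter does not affect Fourier components within the pass-band and completely rejects those outside of it.

The convolution, $u * v$, and cross-correlation, $u \star v$, of two real time series are defined as ([7], their pages 24 and 46):

$$(u * v)(t) = \int_{-\infty}^{+\infty} u(\tau)\, v(t - \tau)\, d\tau \quad \text{and} \quad (u \star v)(t) = \int_{-\infty}^{+\infty} u(\tau)\, v(t + \tau)\, d\tau \tag{1a,b}$$

Note that at zero lag, cross-correlation is just the area beneath the product of the two time series:

$$(u \star v)(t = 0) = \int_{-\infty}^{+\infty} u(\tau)\, v(\tau)\, d\tau \tag{2}$$

Note also that the definitions of the convolution and cross-correlation in (1a), (1b) differ only by the sign of $\tau$ in the argument of $v$. The substitution, $\tau' = -\tau$, leads to the very useful, well-known identity, $u(t) \star v(t) = u(-t) * v(t)$ ([7], their page 47). Applying this identity, we find that the cross-correlation of the filtered time series is:

$$c(t) = (f * u) \star (f * v) = f(-t) * u(-t) * f(t) * v(t) \tag{3}$$

At zero lag, the cross-correlation is proportional to the integral of its Fourier transform, $\tilde{c}(\omega)$:

$$c(t = 0) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \tilde{c}(\omega)\, d\omega \tag{4}$$

Inserting (3) into (4) and using the rule that the Fourier transform of a convolution is the product of the transforms ([7], page 115) and the rule that the transform of $u(-t)$ is $\tilde{u}^{*}(\omega)$ (see Appendix) yields:

$$c(t = 0) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} |\tilde{f}|^2\, \tilde{u}^{*} \tilde{v}\, d\omega = \frac{1}{2\pi} \left[ \int_{-\omega_0 - \Delta\omega}^{-\omega_0 + \Delta\omega} \tilde{u}^{*} \tilde{v}\, d\omega + \int_{\omega_0 - \Delta\omega}^{\omega_0 + \Delta\omega} \tilde{u}^{*} \tilde{v}\, d\omega \right] = \frac{2\Delta\omega}{\pi}\, \mathrm{Re}\, \overline{\tilde{u}^{*} \tilde{v}} \tag{5}$$

Here the overbar denotes the average over the positive-frequency pass-band:

$$\overline{z}(\omega_0) = \frac{1}{2\Delta\omega} \int_{\omega_0 - \Delta\omega}^{\omega_0 + \Delta\omega} z(\omega)\, d\omega \tag{6}$$

The last step in (5) follows because, for real time series, $\tilde{u}(-\omega) = \tilde{u}^{*}(\omega)$, so that the integral over the negative-frequency band is the complex conjugate of the integral over the positive-frequency band. The normalized measure of similarity, say $S$, is:

$$S(\omega_0) = \frac{\left[ (f * u) \star (f * v) \right]_{t=0}}{\left\{ \left[ (f * u) \star (f * u) \right]_{t=0} \left[ (f * v) \star (f * v) \right]_{t=0} \right\}^{1/2}} = \frac{\mathrm{Re}\, \overline{\tilde{u}^{*} \tilde{v}}}{\left( \overline{|\tilde{u}|^2}\; \overline{|\tilde{v}|^2} \right)^{1/2}} \tag{7}$$

Note that the quantity, $S^2$, which we nickname here similarity, varies between zero and unity. It has almost exactly the functional form of the quantity called coherence, except for the taking of the real part. The imaginary part cancelled from (5) precisely because the time series are real and the filter is two-sided.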
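As a rough illustration of how such a quantity might be estimated from sampled data, the sketch below replaces the band integrals with averages over a few discrete Fourier bins. It is an assumption-laden example, not the author's implementation: the function names, the use of integer DFT bins for the band, and the sinusoidal test signals are all hypothetical choices made for illustration.

```python
import cmath
import math

def dft_at(x, bins):
    """Direct DFT of the real sequence x, evaluated only at the given bins."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in bins]

def similarity_and_coherence(u, v, k0, half_width):
    """Return (S2, C2) from averages over bins k0-half_width .. k0+half_width."""
    bins = range(k0 - half_width, k0 + half_width + 1)
    U, V = dft_at(u, bins), dft_at(v, bins)
    m = len(U)
    cross = sum(a.conjugate() * b for a, b in zip(U, V)) / m  # band-averaged cross-spectrum
    power_u = sum(abs(a) ** 2 for a in U) / m                 # band-averaged power of u
    power_v = sum(abs(b) ** 2 for b in V) / m                 # band-averaged power of v
    s2 = cross.real ** 2 / (power_u * power_v)  # similarity: real part only
    c2 = abs(cross) ** 2 / (power_u * power_v)  # coherence: full modulus
    return s2, c2

# Hypothetical test: two sinusoids at bin 8 that differ in phase by pi/3.
n, k0 = 128, 8
phi = math.pi / 3
u = [math.sin(2 * math.pi * k0 * t / n) for t in range(n)]
v = [math.sin(2 * math.pi * k0 * t / n + phi) for t in range(n)]
s2, c2 = similarity_and_coherence(u, v, k0, half_width=2)
```

For this test pair, `s2` should come out close to $\cos^2(\pi/3) = 0.25$ while `c2` stays close to unity, which is the phase sensitivity that distinguishes the two measures.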

Coherence Related to Zero-Lag Cross-Correlation
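As asserted in the Introduction, the usual formula for coherence can be obtained simply by switching to a one-sided filter, a single unit-amplitude boxcar function of width $2\Delta\omega$ centered at frequency $\omega_0$ (where $-\infty < \omega_0 < +\infty$). The filtered time series $f * u$ and $f * v$ are complex, so that one must define a cross-correlation appropriate for complex signals; that is, replace $u(\tau)$ with $u^{*}(\tau)$ in (1b). These modifications lead to a version of (7) that is exactly the usual formula for the coherence:

$$C^2(\omega_0) = \frac{\left| \overline{\tilde{u}^{*} \tilde{v}} \right|^2}{\overline{|\tilde{u}|^2}\; \overline{|\tilde{v}|^2}} \tag{8}$$

where the overbar denotes an average over the band of width $2\Delta\omega$ centered on $\omega_0$.

Similarity and Coherence of Real Band-Limited Signals
For two sinusoids of the same frequency that differ in phase by $\phi$, say $u(\tau) = \sin(\tau)$ and $v(\tau) = \sin(\tau + \phi)$, the similarity, $S(\omega_0)$, is most easily calculated using its time-domain definition. Taking, without loss of generality, the window of observation to be $0 < \tau < 2\pi$, we have:

$$\int_0^{2\pi} \sin^2(\tau)\, d\tau = \pi \quad \text{and} \quad \int_0^{2\pi} \sin^2(\tau + \phi)\, d\tau = \pi$$

and

$$\int_0^{2\pi} \sin(\tau) \sin(\tau + \phi)\, d\tau = \cos\phi \int_0^{2\pi} \sin^2(\tau)\, d\tau + \sin\phi \int_0^{2\pi} \sin(\tau) \cos(\tau)\, d\tau = \pi \cos\phi + 0$$

so $S = \cos\phi$.

For the coherence, only the positive-frequency components survive the one-sided filter, and for these $\tilde{v} = e^{i\phi}\, \tilde{u}$. We then find:

$$C^2(\omega_0) = \left| e^{i\phi} \right|^2 = 1$$

This is the well-known result that the coherence, $C^2$, is unity irrespective of the relative phase of the two sinusoids. This behavior is a consequence of the one-sided filter, which turns both sinusoids into complex exponentials that differ only by the unit-modulus factor $e^{i\phi}$.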
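The time-domain calculation for a pair of sinusoids is easy to check numerically. The sketch below is only illustrative: it assumes unit-frequency sinusoids sampled uniformly over one full period, and the function name `zero_lag_similarity` is a hypothetical label for the normalized zero-lag cross-correlation.

```python
import math

def zero_lag_similarity(u, v):
    """Zero-lag cross-correlation normalized by the square roots of the energies."""
    num = sum(a * b for a, b in zip(u, v))
    return num / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

n = 1000
tau = [2 * math.pi * i / n for i in range(n)]  # one full period, 0 <= tau < 2*pi

results = {}
for phi in (0.0, math.pi / 4, math.pi / 2, math.pi):
    u = [math.sin(t) for t in tau]
    v = [math.sin(t + phi) for t in tau]
    results[phi] = zero_lag_similarity(u, v)  # expected to track cos(phi)
```

Under these assumptions the computed values follow $\cos\phi$: unity for in-phase sinusoids, zero for a sine compared with a cosine, and minus one for sinusoids in anti-phase.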

Examples
We consider the example of a sequence of nearly-monochromatic wavelets, formed by taking the product of a phase-shifted sinusoid of frequency, $\omega_0$, and a normal envelope function of half-width, $\sigma$. Consider an analysis scenario where $u(t)$ represents the external forcing applied to some dynamical system, and $v(t)$ represents the response. In such a context, the distinction between these different wavelet shapes is important, say for detecting whether or not some anticipated interaction has occurred. In this case, the similarity, $S(\omega)$ (red curves in Figure 3(c) and Figure 3(d)), is a more useful quantity than the coherence, $C(\omega)$ (black curves), since it varies strongly with the phase-relationships, whereas coherence does not.
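A test of this kind can be sketched as follows. The wavelet formula used here, a cosine multiplied by a Gaussian envelope centered in the record, is an assumed form chosen for illustration, as are the parameter values and the helper names `wavelet` and `band_measures`; they are not taken from the paper's figures.

```python
import cmath
import math

def wavelet(n, k0, sigma, phi):
    """Assumed form: cos(w0 (t - t0) + phi) * exp(-(t - t0)^2 / (2 sigma^2))."""
    w0, t0 = 2 * math.pi * k0 / n, n / 2
    return [math.cos(w0 * (t - t0) + phi) * math.exp(-(t - t0) ** 2 / (2 * sigma ** 2))
            for t in range(n)]

def band_measures(u, v, k0, half_width):
    """Return (S, C2) using sums over DFT bins k0-half_width .. k0+half_width."""
    n = len(u)
    bins = range(k0 - half_width, k0 + half_width + 1)

    def dft_at(x):
        return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in bins]

    U, V = dft_at(u), dft_at(v)
    cross = sum(a.conjugate() * b for a, b in zip(U, V))
    norm = math.sqrt(sum(abs(a) ** 2 for a in U) * sum(abs(b) ** 2 for b in V))
    return cross.real / norm, abs(cross) ** 2 / norm ** 2

# Hypothetical parameters: 256 samples, center bin 16, envelope half-width 20 samples.
n, k0, sigma = 256, 16, 20.0
forcing = wavelet(n, k0, sigma, 0.0)
cases = [("in_phase", 0.0), ("quadrature", math.pi / 2), ("anti_phase", math.pi)]
measures = {name: band_measures(forcing, wavelet(n, k0, sigma, phi), k0, 4)
            for name, phi in cases}
```

With these assumed wavelets, the signed similarity moves from about +1 (in phase) through about 0 (quadrature) to about -1 (anti-phase), while the coherence remains near unity in all three cases.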
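We have not performed an exhaustive analysis of the differences between $S(\omega)$ and $C(\omega)$ when they are applied to broad-band signals. The key difference is the effect of the taking of the real part:

$$C^2 - S^2 = \frac{\left( \mathrm{Im}\, \overline{\tilde{u}^{*} \tilde{v}} \right)^2}{\overline{|\tilde{u}|^2}\; \overline{|\tilde{v}|^2}} = \frac{\left( \overline{u_R v_I - u_I v_R} \right)^2}{\overline{|\tilde{u}|^2}\; \overline{|\tilde{v}|^2}}$$

where the Fourier transforms are written in terms of their real and imaginary parts, $\tilde{u} = u_R + i u_I$ and $\tilde{v} = v_R + i v_I$, and the overbar denotes an average over the pass-band. Since $S^2$ and $C^2$ differ by a manifestly non-negative amount, we are guaranteed that $C^2 \geq S^2$. However, without further specification of the behavior of $\tilde{u}$ and $\tilde{v}$, no further characterization is possible. In the special case where both time series contain a common function, $w(t)$, plus other contributions, the common part adds only the purely real quantity $|\tilde{w}|^2$ to $\tilde{u}^{*} \tilde{v}$. We might expect in that case that $C^2 \approx S^2$, since the cross-terms are averages of functions that oscillate around zero and are therefore likely to be small. Numerical tests (Figure 4) support this idea, at least for non-transient broad-band time series with a moderate degree of correlation.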
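The inequality between the two measures can be checked on synthetic broad-band data. The sketch below is a hypothetical test, not a reproduction of Figure 4: the signals are a shared Gaussian random sequence plus independent noise, and the seed, noise level, band centers, and the function name `band_s2_c2` are all assumptions.

```python
import cmath
import math
import random

def band_s2_c2(u, v, k0, half_width):
    """Return (S2, C2) using sums over DFT bins k0-half_width .. k0+half_width."""
    n = len(u)
    bins = range(k0 - half_width, k0 + half_width + 1)

    def dft_at(x):
        return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in bins]

    U, V = dft_at(u), dft_at(v)
    cross = sum(a.conjugate() * b for a, b in zip(U, V))
    denom = sum(abs(a) ** 2 for a in U) * sum(abs(b) ** 2 for b in V)
    return cross.real ** 2 / denom, abs(cross) ** 2 / denom

rng = random.Random(0)                       # fixed seed so the run is repeatable
n = 256
w = [rng.gauss(0, 1) for _ in range(n)]      # common broad-band part
u = [wi + 0.5 * rng.gauss(0, 1) for wi in w] # common part plus independent noise
v = [wi + 0.5 * rng.gauss(0, 1) for wi in w]

# Evaluate both measures in several bands across the spectrum.
pairs = [band_s2_c2(u, v, k0, 8) for k0 in (20, 50, 80, 110)]
```

Every `(S2, C2)` pair produced this way satisfies `S2 <= C2 <= 1`, since the squared real part of a complex number cannot exceed its squared modulus; how close the two values are depends on the particular noise realization.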

Conclusion
In summary, we recommend this simple modification of coherence in cases where the time series that are being compared are narrow-band and where phase relationships between them are considered important. For pure sinusoids differing by phase, $\phi$, it obeys the rule $S^2 = \cos^2\phi$; that is, similarity decreases monotonically from unity, when $\phi = 0$, to zero, when $\phi = \pi/2$. In other respects, it behaves very similarly to coherence. Finally, it has a very intuitive time-domain interpretation: $S(\omega_0)$ gives exactly what one would obtain by band-pass filtering each time series with a two-sided boxcar filter centered around frequency, $\omega_0$, normalizing each filtered time series by the square root of its energy, and computing their zero-lag cross-correlation.
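The equivalence between the time-domain and frequency-domain views can be illustrated with a short numerical comparison. This is a sketch under stated assumptions: the band is a set of integer DFT bins, the data are hypothetical random sequences with a shared component, and the function names are illustrative rather than drawn from the paper.

```python
import cmath
import math
import random

def dft(x):
    """Direct discrete Fourier transform of the sequence x."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def two_sided_bandpass(x, k0, half_width):
    """Zero all DFT bins outside the bands around +k0 and -k0; return real output."""
    n = len(x)
    X = dft(x)
    keep = set(range(k0 - half_width, k0 + half_width + 1))
    keep |= {n - k for k in keep}  # mirror band at negative frequencies
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in keep).real / n
            for t in range(n)]

def similarity_time(u, v, k0, half_width):
    """Filter, then take the energy-normalized zero-lag cross-correlation."""
    uf = two_sided_bandpass(u, k0, half_width)
    vf = two_sided_bandpass(v, k0, half_width)
    num = sum(a * b for a, b in zip(uf, vf))
    return num / math.sqrt(sum(a * a for a in uf) * sum(b * b for b in vf))

def similarity_freq(u, v, k0, half_width):
    """Real part of the band-summed cross-spectrum over the root band powers."""
    U, V = dft(u), dft(v)
    band = range(k0 - half_width, k0 + half_width + 1)
    cross = sum(U[k].conjugate() * V[k] for k in band)
    power = sum(abs(U[k]) ** 2 for k in band) * sum(abs(V[k]) ** 2 for k in band)
    return cross.real / math.sqrt(power)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 128
common = [rng.gauss(0, 1) for _ in range(n)]
u = [c + rng.gauss(0, 1) for c in common]
v = [c + rng.gauss(0, 1) for c in common]

s_time = similarity_time(u, v, 20, 5)
s_freq = similarity_freq(u, v, 20, 5)
```

The two routes agree to rounding error, as the discrete form of Parseval's theorem requires for real signals filtered with a symmetric two-sided band.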