Scanning for Clusters of Large Values in Time Series: Application of the Stein-Chen Method

The purpose of this application paper is to apply the Stein-Chen (SC) method to provide a Poisson-based approximation and corresponding total variation distance bounds in a time series context. The SC method that is used approximates the probability density function (PDF) defined on how many times a pattern such as $\{I_t, I_{t+1}, I_{t+2}\} = \{1\ 0\ 1\}$ occurs starting at position t in a time series of length N that has been converted to binary values using a threshold. The original time series that is converted to binary is assumed to consist of a sequence of independent random variables, and could, for example, be a series of residuals that result from fitting any type of time series model. Note that if {1 0 1} is known to not occur, for example, starting at position t = 1, then this information impacts the probability that {1 0 1} occurs starting at position t = 2 or t = 3, because the trials to obtain {1 0 1} are overlapping and thus not independent, so the Poisson distribution assumptions are not met. Nevertheless, the results shown in four examples demonstrate that Poisson-based approximation (that is strictly correct only for independent trials) can be remarkably accurate, and the SC method provides a bound on the total variation distance between the true and approximate PDF.


Introduction and Background
Suppose there is interest in the probability that a pattern such as {1 0 1} or {1 1 1} occurs in a sequence of N = 10 independent Bernoulli trials. The main interest in this paper is the case with a small Bernoulli success probability, $p = P(I_i = 1)$, where a success is, for example, a residual from a fitted time series model exceeding a threshold. A pattern such as {1 0 1} or {1 1 1} could indicate a departure from the fitted model, perhaps indicating that a signal of interest is present. This paper considers scanning for {1 x 1} with x = 0 or 1, with $p = P(I_i = 1)$ being quite small, such as 0.10 or less. Then the probability of the pattern {1 x 1} at any given starting position is $p^2$, and there are N - 2 = 8 possible starting locations for the pattern in N = 10 trials. Because there are only $2^{10} = 1024$ possible patterns of 0's and 1's, all 1024 patterns could be listed, and the probabilities assigned to each set of 10 binary values that include {1 x 1} at least once could be summed to provide an exact calculation (Example 2 in Section 4). For larger values of N, this exact calculation is unwieldy, so an approximate method is desired, provided the approximation is highly accurate with provable error bounds.
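The exact calculation described above can be sketched in a few lines. The following is a minimal illustration, not the authors' code; the function names are hypothetical, and the values N = 10 and p = 0.10 are taken from the text only as an example.

```python
from itertools import product


def has_pattern(bits):
    """Return True if {1 x 1} (a 1, any value, then a 1) occurs anywhere in bits."""
    return any(bits[t] == 1 and bits[t + 2] == 1 for t in range(len(bits) - 2))


def exact_prob_at_least_once(n, p):
    """Sum the probabilities of all 2**n binary sequences that contain {1 x 1}."""
    total = 0.0
    for bits in product((0, 1), repeat=n):
        if has_pattern(bits):
            k = sum(bits)  # number of 1's in this sequence
            total += p**k * (1 - p) ** (n - k)
    return total


if __name__ == "__main__":
    # Enumerates all 2**10 = 1024 sequences of length 10.
    print(exact_prob_at_least_once(10, 0.10))
```

The loop visits 2^N sequences, so the run time doubles with each additional observation, which is why this approach is limited to short series.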
This paper uses the SC approximation method (Section 3) to greatly simplify calculating the probability that a specified pattern occurs in a sequence of independent residuals in a time series context. The SC method approximates the probability density function (PDF) defined on how many times a pattern such as $\{I_t, I_{t+1}, I_{t+2}\} = \{1\ 0\ 1\}$ occurs starting at position t in a time series of length N that has been converted to binary values using a threshold. The original time series that is converted to binary is assumed to consist of a sequence of independent random variables, and could, for example, be a series of residuals that result from fitting any type of time series model. Note that if {1 0 1} is known to not occur, for example, starting at position t = 1, then this information impacts the probability that {1 0 1} occurs starting at position t = 2 or t = 3, because the trials to obtain {1 0 1} are overlapping and thus not independent, so the Poisson distribution assumptions are not met.
Figure 1 is an example of time series data, consisting of electric consumption recorded every hour for 14 days for a total of 336 measurements (the data named elec_load aggregated from 30 minute to 60 minute time steps from the TSrepr package in [1]). This electric consumption data is used here simply as an example of the type of time series that this paper considers. Figure 2 is a binary version of the series in Figure 1, with values 3.5 or larger set to 1 and values less than 3.5 set to 0. Approximately 2% (6 of 336) of the values reach or exceed 3.5. Figure 2 is the type of data that motivated this case-study application of the SC method. In the applications of interest, the combinatorial counting can only be done for very short time series (and so it is done only in Example 2 below with a length 5 time series). Therefore, the application led the authors to apply and assess the SC bound for a corresponding simple Poisson approximation.
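The conversion from an original series to a binary series can be sketched as follows. This is only an illustration: the readings below are made-up values, not the elec_load data, and the function name to_binary is hypothetical. The threshold 3.5 and the "3.5 or larger" rule follow the description of Figure 2.

```python
def to_binary(series, threshold):
    """Set values at or above the threshold to 1 and all other values to 0."""
    return [1 if x >= threshold else 0 for x in series]


# Hypothetical hourly readings (NOT the elec_load values).
readings = [2.1, 3.6, 2.9, 3.5, 1.8, 3.4]
binary = to_binary(readings, threshold=3.5)

# The fraction of 1's is an empirical estimate of p = P(I_i = 1).
exceed_fraction = sum(binary) / len(binary)
print(binary, exceed_fraction)
```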
The SC bound does not seem to be well known among practitioners; however, as this paper shows, the SC bound can defend the use of the simple Poisson approximation in some real applications, and can provide a very small bound on the approximation error.
The advantage of the SC method in this context is simplicity and tractability (as shown in Examples 1 to 4 below). The disadvantage is that the SC method is an approximate method for which the total variation distance bound must be calculated in order to assess the quality of the approximation under various conditions (as shown in Examples 1 to 4 below). Fortunately, the SC approximation quality is excellent in the applications of interest.

Methodology: Scanning for Specified Patterns
The N = 336 binary values in Figure 2 are an example of the type of binary time series considered here. The binary values $I_1, I_2, I_3, \ldots, I_N$ are assumed to be independent and identically distributed with constant probability $p = P(I_i = 1)$. The probability p is the probability that the original time series X exceeds a threshold, and the I notation denotes an indicator or binary variable. As an aside, the SC method can also be applied if p is not constant over time, but the independence assumption is difficult to avoid [2]. Any type of time series model [3] can be fit to the series of interest, and then the resulting residuals become the original series that is thresholded to convert to binary; therefore, the application is quite general. Suppose that large values of the original series are thought to rarely cluster, so, for example, a pattern such as {1 0 1} or {1 1 1} could indicate a departure from the assumed time series model, perhaps indicating that a signal of interest is present. This paper will consider scanning for {1 x 1} with x = 0 or 1, with $p = P(I_i = 1)$ being quite small, such as 0.10 or less. Then the probability of the pattern {1 x 1} is $p \cdot p = p^2$.
Start at index i = 1 and check whether {1 x 1} occurs in positions {1 2 3}, then start at index i = 2 and check whether {1 x 1} occurs in positions {2 3 4}, then start at index i = 3, and so on. Note, for example, that if {1 x 1} occurs starting at index i = 1, then the probability that {1 x 1} also occurs starting at index 3 is p rather than $p^2$, because the value at index 3 is already known to be 1 and only the value at index 5 remains random. Clearly, there is a small neighborhood of dependence around each starting index, as just illustrated. This neighborhood of dependence violates the assumptions for a Poisson distribution (as a limit distribution for a sequence of N Bernoulli trials, each with small probability of success), but [2] shows that provided the dependence neighborhood is modest, the Poisson distribution can still provide an excellent approximation to the PDF defined on the number of times {1 x 1} occurs in a series of length N.
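The scan just described, and the dependence between overlapping starting positions, can be sketched as follows. This is an illustrative sketch with hypothetical function names; the example sequence is made up, and positions are reported with 1-based indexing to match the text.

```python
from itertools import product


def start_positions(bits):
    """Return the 1-based starting positions at which {1 x 1} occurs."""
    return [t + 1 for t in range(len(bits) - 2)
            if bits[t] == 1 and bits[t + 2] == 1]


def count_occurrences(bits):
    """Count how many of the len(bits) - 2 overlapping trials match {1 x 1}."""
    return len(start_positions(bits))


def conditional_prob_overlap(p):
    """P(pattern starts at 3 | pattern starts at 1), by enumerating 5 bits."""
    joint = 0.0  # probability that the pattern starts at both 1 and 3
    given = 0.0  # probability that the pattern starts at 1
    for bits in product((0, 1), repeat=5):
        k = sum(bits)
        weight = p**k * (1 - p) ** (5 - k)
        if bits[0] == 1 and bits[2] == 1:
            given += weight
            if bits[4] == 1:
                joint += weight
    return joint / given


example = [1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1]  # hypothetical binary series
print(start_positions(example), count_occurrences(example))
print(conditional_prob_overlap(0.10))
```

The last function confirms by brute force that the conditional probability equals p, not the unconditional value p^2, for starting positions that share an index.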

Stein-Chen Method
According to Theorem 2 in [2], the Poisson PDF with mean parameter $\lambda = (N-2)p^2$ approximates the PDF of the number of times {1 x 1} occurs in the N - 2 overlapping trials, and the SC method bounds the total variation distance (TVD) between the two PDFs. The TVD is a quite general distance measure between two PDFs. The TVD is defined here as the maximum absolute difference between the probability assigned by Y and the probability assigned by W to any specified subset of possible integer values. In the current scanning context, the most important subset of possible values to consider is the single value {0}, which would imply that the pattern {1 x 1} never occurred (occurred 0 times) in the N - 2 overlapping trials. Then, the SC method in this context uses the Poisson approximation to assign a value to P({0}), and the SC method ensures that the Poisson approximation to P({0}) is quite accurate, as shown below.
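The two quantities discussed here, the Poisson value for P({0}) and the TVD between two PDFs on the integers, can be sketched as follows. The sketch assumes that the Poisson mean is the number of starting positions times the pattern probability, λ = (N - 2)p^2; that is an assumption made for illustration, as are the function names. The TVD is computed as half the sum of absolute differences, which equals the maximum over subsets when both inputs sum to 1. The SC bound itself is not reproduced here.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of exactly k occurrences with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def total_variation_distance(pmf_a, pmf_b):
    """TVD between two PMFs given as {integer: probability} dictionaries.

    Equals max over subsets S of |P_a(S) - P_b(S)| when both PMFs sum to 1.
    """
    support = set(pmf_a) | set(pmf_b)
    return 0.5 * sum(abs(pmf_a.get(k, 0.0) - pmf_b.get(k, 0.0))
                     for k in support)


if __name__ == "__main__":
    n, p = 10, 0.10        # illustrative values from the introduction
    lam = (n - 2) * p**2   # ASSUMED mean: starting positions times p**2
    # Poisson approximation to the probability that {1 x 1} never occurs.
    print(poisson_pmf(0, lam))
```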

Simulation Results
In this example, the value of λ is nearly the same as in Example 1, but N is quite large and p is quite small. The simulation-based PDF assigns 0.903 to 0 occurrences of {1 x 1} and 0.091 to 1 occurrence. Example 4 is close to the real application that motivated this paper, and for that length time series, the exact method's combinatorial counting (as was done in Example 2 where N = 5) is prohibitively unwieldy, so the SC bound becomes indispensable.
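A simulation-based PDF of the kind referred to above can be sketched as follows. This is not the authors' simulation. The values N = 1000 and p = 0.01 are hypothetical, chosen only to give a large N and a small p, and the Poisson mean λ = (N - 2)p^2 is an assumption made for the comparison.

```python
import math
import random
from collections import Counter


def simulate_count_pmf(n, p, reps, seed=0):
    """Estimate the PMF of the number of {1 x 1} occurrences by simulation."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(reps):
        # One independent binary series of length n with P(1) = p.
        bits = [1 if rng.random() < p else 0 for _ in range(n)]
        w = sum(1 for t in range(n - 2) if bits[t] and bits[t + 2])
        counts[w] += 1
    return {k: c / reps for k, c in sorted(counts.items())}


def poisson_pmf(k, lam):
    """Poisson probability of exactly k occurrences with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


if __name__ == "__main__":
    n, p = 1000, 0.01      # HYPOTHETICAL values, not those of any example
    lam = (n - 2) * p**2   # ASSUMED Poisson mean
    sim = simulate_count_pmf(n, p, reps=20000)
    for k in range(3):
        print(k, sim.get(k, 0.0), poisson_pmf(k, lam))
```

Printing the simulated and Poisson probabilities side by side for 0, 1, and 2 occurrences gives a direct view of how close the approximation is for the chosen N and p.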

Conclusions and Summaries
This paper applied the SC method to approximate the PDF for the number of occurrences of an example pattern in an independent binary time series. In scanning for whether a pattern such as {1 x 1} occurs starting at index i, there are overlapping attempts to achieve the pattern, resulting in many non-independent trials, each consisting of the values in three successive indices. As the time series length increases and the probability $p = P(I_i = 1)$ decreases, the SC method shows that the Poisson approximation is excellent, with a small total variation distance bound, just as in the case of many independent trials, each with small success probability.
The SC bound does not seem to be well known among practitioners; however, related references are available [4] [5] [6] [7]. Reference [4] applies the SC method to calculate coincidence probabilities. References [5] and [6] apply the SC method in time series contexts different from ours.
The main contribution of this paper is to show that the SC bound can defend use of the simple Poisson approximation in real applications (in place of combinatorial calculations such as those in Example 2, which become unwieldy for larger time series lengths N), and can provide a very small bound on the approximation error. Example 4 is close to the real application that motivated this paper, and for that length time series, the exact method's combinatorial counting (as was done in Example 2 where N = 5) is prohibitively unwieldy, so the SC bound becomes indispensable.