A Bayesian Approach for Stable Distributions: Some Computational Aspects

In this work, we study some computational aspects of the Bayesian analysis involving stable distributions. It is well known that, in general, there is no closed form for the probability density function of stable distributions. However, the use of a latent or auxiliary random variable makes it possible to obtain the posterior distributions of interest when working with stable distributions. To show the usefulness of the computational aspects, the methodology is applied to two examples: one is related to daily price returns of Abbey National shares, considered in [1], and the other is the length distribution analysis of coding and non-coding regions in a Homo sapiens chromosome DNA sequence, considered in [2]. Posterior summaries of interest are obtained using the OpenBUGS software.


Introduction
A wide class of distributions that encompasses the Gaussian one is given by the class of stable distributions. This large class defines location-scale families that are closed under convolution. The Gaussian distribution is a special case of this distribution family (see, for instance, [1]), which is described by four parameters: α, β, δ and σ. The parameter α ∈ (0, 2] defines the "fatness of the tails", and when α = 2 this class reduces to the Gaussian distributions. The parameter β ∈ [−1, 1] is the skewness parameter, and for β = 0 one has symmetric distributions. The location and scale parameters are, respectively, δ ∈ (−∞, ∞) and σ > 0 (see [3]).
The difficulty associated with stable distributions S_α(β, δ, σ) is that, in general, there is no simple closed form for their probability density functions. However, it is known that the probability density functions of stable distributions are continuous [6,7] and unimodal [8,9]. Also, the support of all stable distributions is (−∞, ∞), except for α < 1 and |β| = 1, when the support is (−∞, 0) for β = 1 and (0, ∞) for β = −1 (see [10]). The characteristic function Φ(·) of a stable distribution is given by

log Φ(t) = iδt − |σt|^α [1 − iβ sign(t) tan(πα/2)], for α ≠ 1,
log Φ(t) = iδt − |σt| [1 + iβ (2/π) sign(t) log|t|], for α = 1,   (1.1)

where i = √(−1) and the sign(·) function is given by

sign(t) = 1 if t > 0, sign(t) = 0 if t = 0, and sign(t) = −1 if t < 0.   (1.2)

Although this is a good class for data modeling in different areas, one has difficulties in obtaining estimates under a classical inference approach, due to the lack of closed-form expressions for the probability density functions. An alternative is the use of Bayesian methods; however, the computational cost of assessing the posterior summaries of interest can be high. A Bayesian analysis of stable distributions was introduced by Buckle (1995) [1] using Markov Chain Monte Carlo (MCMC) methods. The use of Bayesian methods with MCMC simulation can have great flexibility by considering latent variables (see, for instance, [11,12]), where samples of the latent variables are simulated in each step of the Gibbs or Metropolis-Hastings algorithms.
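As a quick numerical illustration (our own addition, not part of the original paper), the α ≠ 1 branch of the characteristic function above can be evaluated directly; the function name stable_cf and its defaults are our own choices.

```python
import numpy as np

def stable_cf(t, alpha, beta, delta=0.0, sigma=1.0):
    """Characteristic function of a stable law (alpha != 1 branch).

    Uses log CF = i*delta*t - |sigma*t|^alpha * (1 - i*beta*sign(t)*tan(pi*alpha/2)).
    """
    t = np.asarray(t, dtype=float)
    skew = 1.0 - 1j * beta * np.sign(t) * np.tan(np.pi * alpha / 2.0)
    return np.exp(1j * delta * t - np.abs(sigma * t) ** alpha * skew)
```

For α = 2 the tangent term vanishes, so the characteristic function reduces to exp(iδt − σ²t²), i.e. a Gaussian distribution with mean δ and variance 2σ², as stated in the text.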
Considering a latent or auxiliary variable, [1] proved a theorem that is useful for simulating samples of the joint posterior distribution of the parameters α, β, δ and σ. This theorem establishes that a stable distribution for a random variable Z defined in (−∞, ∞) is obtained as the marginal of a bivariate distribution for the random variable Z itself and an auxiliary random variable Y. This variable Y is defined in the interval (−0.5, a_{α,β}) when z < 0, and in (a_{α,β}, 0.5) when z > 0. The quantity a_{α,β} is given by

a_{α,β} = −η_{α,β}/(πα), where η_{α,β} = β min(α, 2 − α) π/2.   (1.3)

The joint probability density function of the random variables Z and Y is given by

f(z, y | α, β) = [α/|α − 1|] exp{−(z/τ_{α,β}(y))^{α/(α−1)}} (z/τ_{α,β}(y))^{α/(α−1)} (1/|z|),   (1.4)

where

τ_{α,β}(y) = [sin(παy + η_{α,β})/cos(πy)] [cos(πy)/cos(π(α − 1)y + η_{α,β})]^{(α−1)/α},   (1.5)

for z/τ_{α,β}(y) > 0.
From the bivariate density (1.4), [1] shows that the marginal distribution of the random variable Z is the stable S_α(β, 0, 1) distribution. Usually, the computational costs of obtaining the posterior summaries of interest using MCMC methods are high for this class of models, which can limit practical applications. One problem can be the convergence of the simulation algorithm. In this paper, we propose the use of popular, freely available software to obtain the posterior summaries of interest: the OpenBUGS software.
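To make the construction concrete, a small numerical check (ours, not from the paper) is to code the bivariate density (1.4)-(1.5) and integrate out the auxiliary variable for α = 2, where the marginal must be the Gaussian density with mean 0 and variance 2; the helper names eta, tau and joint_density are our own.

```python
import numpy as np
from scipy.integrate import quad

def eta(alpha, beta):
    # eta_{alpha,beta} = beta * min(alpha, 2 - alpha) * pi / 2
    return beta * min(alpha, 2.0 - alpha) * np.pi / 2.0

def tau(y, alpha, beta):
    # tau_{alpha,beta}(y), expression (1.5)
    e = eta(alpha, beta)
    return (np.sin(np.pi * alpha * y + e) / np.cos(np.pi * y)) * \
           (np.cos(np.pi * y) / np.cos(np.pi * (alpha - 1.0) * y + e)) ** ((alpha - 1.0) / alpha)

def joint_density(z, y, alpha, beta):
    # bivariate density (1.4) of (Z, Y); valid where z / tau(y) > 0
    r = z / tau(y, alpha, beta)
    p = alpha / (alpha - 1.0)
    return abs(p) * np.exp(-r ** p) * r ** p / abs(z)
```

Integrating joint_density(z, y, 2, 0) over y in (0, 0.5) recovers exp(−z²/4)/(2√π), the N(0, 2) density, in agreement with the marginalization theorem of [1].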
The paper is organized as follows: in Section 2, we introduce a special case of the stable distributions, namely, the Lévy distribution. In Section 3, we introduce a Bayesian analysis for stable distributions. Two applications are presented in Section 4. Section 5 is devoted to some concluding remarks.

A Special Case of Stable Distributions: Lévy Distribution
Some special cases of stable distributions are obtained for specified values of α and β. If α = 2 and β = 0, one has the Gaussian distribution with mean δ and variance 2σ². If α = 1/2 and β = 1, one has a Lévy distribution with probability density function given by

f(x | δ, σ) = √(σ/(2π)) exp{−σ/[2(x − δ)]} (x − δ)^{−3/2},   (2.1)

for δ < x < ∞. Figure 1 presents Lévy probability density functions for δ = 0 and different values of the σ scale parameter.
The probability distribution function of the random variable X with the Lévy distribution defined in (2.1) is given by

F(x) = erfc(√{σ/[2(x − δ)]}),   (2.2)

where erfc(x) = 1 − erf(x) is the complementary error function, with the error function erf(·) given by

erf(x) = (2/√π) ∫₀^x e^{−t²} dt.   (2.3)

The Lévy distribution with probability density function (2.1) has infinite mean and infinite variance, but its median is given by

median(X) = δ + σ/{2 [erfc⁻¹(1/2)]²}.   (2.4)
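These closed forms can be cross-checked numerically (a check of ours, not in the paper) against scipy.stats.levy, which implements the same density with loc = δ and scale = σ; the helper names below are our own.

```python
import numpy as np
from scipy.stats import levy
from scipy.special import erfc, erfcinv

def levy_pdf(x, delta=0.0, sigma=1.0):
    # Lévy density (2.1), defined for x > delta
    x = np.asarray(x, dtype=float)
    return np.sqrt(sigma / (2.0 * np.pi)) * np.exp(-sigma / (2.0 * (x - delta))) / (x - delta) ** 1.5

def levy_cdf(x, delta=0.0, sigma=1.0):
    # distribution function (2.2)
    return erfc(np.sqrt(sigma / (2.0 * (np.asarray(x, dtype=float) - delta))))

def levy_median(delta=0.0, sigma=1.0):
    # median (2.4): delta + sigma / (2 * erfcinv(1/2)**2)
    return delta + sigma / (2.0 * erfcinv(0.5) ** 2)
```

For the standard case (δ = 0, σ = 1) the median is approximately 2.198, far to the right of the mode, reflecting the heavy right tail.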

 
To compute the probability distribution function or the median of a random variable X with a Lévy density, different approximations for the (complementary) error function proposed in the literature can be used (see [13]). Some special cases are presented below (all for x ≥ 0; for x < 0, one uses erf(x) = −erf(−x), with sign(·) given by (1.2)).

1) erf(x) ≈ 1 − (1 + a₁x + a₂x² + a₃x³ + a₄x⁴)⁻⁴,   (2.5)
where the maximum error is 5 × 10⁻⁴ and a₁ = 0.278393, a₂ = 0.230389, a₃ = 0.000972 and a₄ = 0.078108.

2) erf(x) ≈ 1 − (1 + a₁x + a₂x² + a₃x³ + a₄x⁴ + a₅x⁵ + a₆x⁶)⁻¹⁶,   (2.6)
where the maximum error is 3 × 10⁻⁷ and a₁ = 0.0705230784, a₂ = 0.0422820123, a₃ = 0.0092705272, a₄ = 0.0001520143, a₅ = 0.0002765672 and a₆ = 0.0000430638.

3) An approximation for the inverse error function is given by

erf⁻¹(x) ≈ sign(x) √{ √{[2/(πa) + ln(1 − x²)/2]² − ln(1 − x²)/a} − [2/(πa) + ln(1 − x²)/2] },   (2.7)

where a ≈ 0.147 and the sign(·) function is given by (1.2).

Assuming a random sample x₁, …, xₙ of size n from a Lévy distribution with probability density function (2.1), the likelihood function for δ and σ is given by

L(δ, σ) = [σ/(2π)]^{n/2} [∏ᵢ₌₁ⁿ (xᵢ − δ)^{−3/2}] exp{−(σ/2) Σᵢ₌₁ⁿ 1/(xᵢ − δ)} I(δ < x₍₁₎),   (2.8)

where x₍₁₎ = min{x₁, …, xₙ} and I(A) denotes the indicator function of the set A.
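As a sanity check (our own addition), the first rational approximation above can be compared against the library error function; erf_approx is a hypothetical helper name.

```python
import math

def erf_approx(x):
    # rational approximation of erf for x >= 0 (maximum error about 5e-4)
    a1, a2, a3, a4 = 0.278393, 0.230389, 0.000972, 0.078108
    return 1.0 - (1.0 + a1 * x + a2 * x ** 2 + a3 * x ** 3 + a4 * x ** 4) ** (-4)
```

Across a grid of points the absolute deviation from math.erf stays close to the stated 5 × 10⁻⁴ bound.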
Inferences for the parameters δ and σ of the Lévy distribution are obtained using standard Markov Chain Monte Carlo (MCMC) methods (see [14,15]).
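As a minimal illustration of such an MCMC scheme (our own sketch; the paper itself delegates the simulation to OpenBUGS), a random-walk Metropolis sampler for (δ, σ) under flat priors can be written in a few lines; levy_loglik and metropolis_levy are hypothetical helper names, and the step size and starting values are arbitrary choices.

```python
import math
import random

def levy_loglik(delta, sigma, xs):
    # Lévy log-likelihood (Section 2); -inf outside the support delta < min(xs), sigma > 0
    if sigma <= 0.0 or delta >= min(xs):
        return float("-inf")
    n = len(xs)
    return (0.5 * n * math.log(sigma / (2.0 * math.pi))
            - 1.5 * sum(math.log(x - delta) for x in xs)
            - 0.5 * sigma * sum(1.0 / (x - delta) for x in xs))

def metropolis_levy(xs, n_iter=5000, step=0.1, seed=42):
    """Random-walk Metropolis for (delta, sigma) under flat priors."""
    rng = random.Random(seed)
    delta, sigma = min(xs) - 1.0, 1.0          # arbitrary valid starting values
    cur = levy_loglik(delta, sigma, xs)
    chain = []
    for _ in range(n_iter):
        d_new = delta + rng.gauss(0.0, step)
        s_new = sigma + rng.gauss(0.0, step)
        cand = levy_loglik(d_new, s_new, xs)
        # accept with probability min(1, exp(cand - cur))
        if rng.random() < math.exp(min(0.0, cand - cur)):
            delta, sigma, cur = d_new, s_new, cand
        chain.append((delta, sigma))
    return chain
```

In practice one monitors trace plots for convergence and discards a burn-in period, exactly as done with the OpenBUGS runs in Section 4.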

 
A Bayesian Analysis for General Stable Distributions
Let us assume that xᵢ = δ + σzᵢ, i = 1, …, n, where x = (x₁, …, xₙ) and y = (y₁, …, yₙ) are, respectively, the observed and the non-observed (latent) data vectors. Assuming a joint prior distribution π(α, β, δ, σ) for α, β, δ and σ, [1] shows that the joint posterior distribution for the parameters α, β, δ, σ and the latent variables y is given by

π(α, β, δ, σ, y | x) ∝ π(α, β, δ, σ) ∏ᵢ₌₁ⁿ σ⁻¹ f((xᵢ − δ)/σ, yᵢ | α, β),   (3.1)

where f(·, · | α, β) is the bivariate density (1.4). Notice that the distribution in expression (3.1) is given in terms of xᵢ and the latent variables yᵢ, and not in terms of zᵢ and yᵢ (there is a Jacobian, given by the factor σ⁻¹ multiplying the right-hand side of expression (1.4)). In the simulation algorithm used to obtain a Gibbs sample of the random quantities α, β, δ and σ from the joint posterior distribution (3.1), we assume a uniform U(−0.5, 0.5) prior distribution for each latent random quantity yᵢ, i = 1, …, n. Observe that, in this case, we are assuming prior independence among α, β, δ and σ. With this choice of priors, one can use a standard software package such as OpenBUGS (see [16]), which greatly simplifies obtaining the simulated Gibbs samples from the joint posterior distribution.
In this way, one has the following algorithm:
1) Start with initial values α⁽⁰⁾, β⁽⁰⁾, δ⁽⁰⁾ and σ⁽⁰⁾;
2) Simulate the latent variables y₁, …, yₙ from their conditional distributions π(yᵢ | α⁽⁰⁾, β⁽⁰⁾, δ⁽⁰⁾, σ⁽⁰⁾, x);
3) Simulate α, β, δ and σ from their conditional distributions, given the current values of the latent variables and the data;
4) Repeat Steps 2) and 3) until convergence.
Since we are using the OpenBUGS software to simulate samples from the joint posterior distribution (3.1), we do not present here all the full conditional distributions needed for the Gibbs sampling algorithm. This software only requires the specification of the data distribution and of the prior distributions of the random quantities of interest, which gives great computational simplification in determining the posterior summaries of interest, as shown in the applications below.
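To fix ideas, the unnormalized augmented log-posterior that such a sampler targets can be sketched directly; this is our own illustration (log_aug_posterior is a hypothetical name), with the support handling deliberately simplified.

```python
import math

def log_aug_posterior(alpha, beta, delta, sigma, xs, ys):
    """Log augmented posterior (3.1), up to an additive constant, under the
    independent uniform priors of Section 3.  It sums the log of the bivariate
    density (1.4)-(1.5) over (x_i, y_i), after standardizing z_i = (x_i - delta)/sigma
    (with a -log(sigma) Jacobian term per observation)."""
    if not (0.0 < alpha <= 2.0 and alpha != 1.0
            and -1.0 <= beta <= 1.0 and sigma > 0.0):
        return float("-inf")
    eta = beta * min(alpha, 2.0 - alpha) * math.pi / 2.0
    p = alpha / (alpha - 1.0)
    total = 0.0
    for x, y in zip(xs, ys):
        if not -0.5 < y < 0.5:
            return float("-inf")
        c1 = math.cos(math.pi * y)
        c2 = math.cos(math.pi * (alpha - 1.0) * y + eta)
        if c1 <= 0.0 or c2 <= 0.0:          # a simplification for this sketch
            return float("-inf")
        tau = (math.sin(math.pi * alpha * y + eta) / c1) * (c1 / c2) ** ((alpha - 1.0) / alpha)
        z = (x - delta) / sigma
        if tau == 0.0 or z / tau <= 0.0:    # outside the support of (1.4)
            return float("-inf")
        r = z / tau
        total += math.log(abs(p)) - r ** p + p * math.log(r) - math.log(abs(z)) - math.log(sigma)
    return total
```

Returning -inf outside the support is what lets a generic Metropolis-within-Gibbs step reject invalid proposals automatically, which is essentially what OpenBUGS does internally from the model specification.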

Buckle's Data
In Table 1, we have a data set introduced by [1].This is the daily price return data of Abbey National shares in the period from July 31, 1991 to October 08, 1991.
In Figure 2(a), we present the histogram of the returns ρ(·) time series (given in Table 2), while in Figure 2(b) we have the Gaussian probability plot for the same data. From these figures, one observes that the Gaussian distribution does not fit the data well.
Assuming a Lévy distribution with probability density function given in (2.1), for a Bayesian analysis we consider the following prior distributions for δ and σ: δ ~ U(−1, −0.0271) and σ ~ U(0, 1), where U(a, b) denotes a uniform distribution on the interval (a, b). Observe that the minimum value of the ρ(·) data is −0.0271. To simulate samples from the joint posterior distribution of δ and σ using standard MCMC methods, we have used the OpenBUGS software, which only requires the log-likelihood function and the prior distributions for the model parameters. In Table 3, we present the posterior summaries of interest, considering a burn-in sample of size 5000, discarded to eliminate the effect of the initial values. After this burn-in period, we simulated another 200,000 Gibbs samples, taking every 10th sample. This gives a final sample of size 20,000 to be used for finding the posterior summaries of interest. Convergence of the Gibbs sampling algorithm was verified by trace plots of the simulated Gibbs samples. From the OpenBUGS output we obtain a Deviance Information Criterion (DIC) value equal to −151.7. In Figure 2 (red line), we have the plot of the fitted Lévy density with δ = −0.0487 as the location parameter and σ = 0.0391 as the scale parameter, together with the histogram of the ρ(·) returns.
Assuming a general stable distribution, we present in Table 4 the posterior summaries of interest obtained using the OpenBUGS software, considering the following priors: α ~ U(1, 2), β ~ U(−1, 0), δ ~ U(−0.5, 0.5) and σ ~ U(0, 0.5). In the simulation procedure, we used a burn-in sample of size 10,000 and another 490,000 Gibbs samples, taking every 100th sample. This gives a final sample of size 4900 to be used for finding the posterior summaries of interest.
In Figure 3, we have the trace plots of the simulated Gibbs samples. In Figure 2, we also have the plot of the fitted stable distribution with α = 1.653, β = −0.3455, δ = 0.00782 and σ = 0.001132. We observe a good fit of the stable distribution (black line). The obtained DIC value is smaller than the value obtained for the Lévy distribution.

Coding and Non-Coding Regions in DNA Sequences
Crato et al. [2] introduced the length distribution of coding and non-coding regions for all Homo sapiens chromosomes available from the European Bioinformatics Institute. In this way, they considered a transformation of the genomes into numerical sequences. As an illustration, we have, respectively, in Tables 5 and 6, the data for coding and non-coding length sequences for a H. sapiens chromosome, transformed to a logarithmic scale (sequence CM000275, extracted from Table 2 in [2]). Figure 4 presents the histograms of the data given in Tables 5 and 6, assuming a logarithmic transformation. From these plots, we observe that a Gaussian distribution would not be a reasonable model for fitting the data. Assuming a Lévy distribution with probability density function (2.1), for a Bayesian analysis we consider the following prior distributions: δ ~ U(−1000, 1.0986), where 1.0986 is the minimum of the observations on the logarithmic scale, and σ ~ U(0, 10000). In Table 7, we have the posterior summaries of interest for the transformed coding and non-coding data, obtained using the OpenBUGS software.
Figure 4 also shows the fitted Lévy densities with δ = 0.9693 and σ = 3.167 (for the coding data) and with δ = 4.182 and σ = 1.633 (for the non-coding data). From this figure, we observe that the data are not well fitted by the Lévy distribution.
For a Bayesian analysis of the data assuming a general stable distribution, we consider uniform prior distributions for α, β, δ and σ. Using the OpenBUGS software, we simulated 600,000 Gibbs samples. From these 600,000 samples, we discarded the first 100,000 as a "burn-in sample" to eliminate the effect of the initial values. After this burn-in period, we took every 500th sample, which gives a final Gibbs sample of size 1000 to be used for the Monte Carlo estimates of the random quantities of interest. Convergence of the Gibbs sampling algorithm was verified from trace plots of the simulated samples for each parameter. Table 8 presents the posterior summaries of interest. Figure 5 shows the fitted stable distributions for the coding and non-coding data. We observe a good fit of the stable distributions for both data sets.


Concluding Remarks
The use of stable distributions is attractive for many applications in data analysis, since this model has great flexibility for fitting data. With the use of Bayesian methods and MCMC simulation, it is possible to obtain inferences for the model despite the nonexistence of an analytical form for the probability density function. It is important to point out that the computational work of simulating samples from the joint posterior distribution of interest can be greatly simplified using standard free software such as OpenBUGS.
In the simulation study considered in both examples introduced in Section 4, the use of data augmentation techniques (see, for instance, [11]) was essential for the performance of the MCMC simulation method in applications using stable distributions. Observe that MCMC methods are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its equilibrium distribution. The state of the chain after a large number of steps is then used as a sample from the desired distribution; the quality of the sample improves as a function of the number of steps. The simulation results obtained for the applications in Section 4 could be easily replicated using the same auxiliary random variable Y defined in Section 1 and the non-informative prior distributions defined in Section 3 for the parameters of the model. More accurate posterior summaries could be obtained using informative prior distributions.


Figure 1. Lévy density function for δ = 0 and different values of the σ scale parameter.
Observe that when α = 2 one has α/(α − 1) = 2 and η_{α,β} = 0; in this case, one has a Gaussian distribution with mean δ and variance 2σ². For a Bayesian analysis of the proposed model, we assume independent uniform U(a, b) priors for α, β, δ and σ, where the hyperparameters a and b are assumed to be known in each application, following the restrictions 0 < α ≤ 2, −1 ≤ β ≤ 1 and σ > 0.

Table 4. Posterior summaries for the general stable distribution.
From this value, we conclude that the data are better fitted by the general stable distribution than by the Lévy distribution (since the former has the smaller DIC value).