Bayesian Regularized Quantile Regression Analysis Based on Asymmetric Laplace Distribution

In recent years, variable selection based on penalty likelihood methods has aroused great concern. Based on the Gibbs sampling algorithm of asymmetric Laplace distribution, this paper considers the quantile regression with adaptive Lasso and Lasso penalty from a Bayesian point of view. Under the nonBayesian and Bayesian framework, several regularization quantile regression methods are systematically compared for error terms with different distributions and heteroscedasticity. Under the error term of asymmetric Laplace distribution, statistical simulation results show that the Bayesian regularized quantile regression is superior to other distributions in all quantiles. And based on the asymmetric Laplace distribution, the Bayesian regularized quantile regression approach performs better than the non-Bayesian approach in parameter estimation and prediction. Through real data analyses, we also confirm the above conclusions.


Introduction
Since the pioneering work by Koenker and Bassett in 1978, quantile regression (QR) has been deeply studied and widely applied to descript the elaborate relationship between the dependent variable and predictors [1]. Compared with the traditional mean regression, quantile regression has more robustness to data with outliers. In 1999, Koenker  the MCMC iterations to construct asymptotically valid intervals [9].
In 2004, Koenker added the Lasso regularization method to the mixed-effect quantile regression model for the first time, and the Lasso penalty made the random effect shrink to zero [10]. In 2007, Wang et al. considered the least absolute deviance (LAD) estimate with adaptive Lasso penalty (LAD-lasso) and proved its oracle property [11]. In 2008, Li and Zhu considered quantile regression with the Lasso penalty and developed its piecewise linear solution path [12]. In 2009, Wu and Liu studied the quantile regression with the SCAD method and the adaptive Lasso method [13]. In 2008, Park and Casella studied the Lasso penalty from the Bayesian angle, and proposed that the hierarchical model can be effectively solved by the Gibbs sampler, thereby introducing the regularization method [14]. In 2010, Li et al. studied the regularization method in quantile regression from the perspective of Bayesian and proposed to set the prior distribution of parameters to Laplace prior, and use Gibbs sampler to sampling Bayesian Lasso quantile regression [15]. In 2012, Alhamzawi et al.
proposed Bayesian adaptive Lasso quantile regression, by setting different penalty parameters for different variables, and setting the penalty parameter to inverse gamma distribution, and the inverse gamma priori Parameters are treated as unknowns and estimated along with other parameters [16]. In 2018, Adlouni et al. showed that a regularized quantile regression model with B-Splines based on five penalties (Lasso, Ridge, SCAD0, SCAD1 and SCAD2) in Bayesian framework [17].
Based on the existing literature, the Bayesian quantile regression is realized by expressing the asymmetric Laplace distribution as scale mixtures of the standard normal distribution and the standard exponential distribution, and the Gibbs sampler is used to simulate the distributed parameters. The regularized quantile regression under the Bayesian framework is compared with the non-Bayesian regularized quantile regression method. Finally, the prostate cancer data sets are used to illustrate the advantages and disadvantages of these two approaches.
where, σ is the scale parameter, µ is the location parameter and θ is the asymmetrc parameter.
Then the likelihood function of the sample ( ) 1 2 , , , n y y y ′ = y can be expressed as Tsionas [18], Kozumi and Kobayashi [7] have demonstrated that ALD can be viewed as a mixture of standard an exponential distribution,

Bayesian Quantile Regression with Lasso and Adaptive Lasso Penalty
The Bayesian quantile regression parameter estimation model with Lasso penalty (Li and Zhu) [12] is Li et al. [15] set the prior distribution of the parameter j β to a Laplace prior It is also assumed that the error term i µ obeys the asymmetric Laplace distribution (ALD).
Bayesian quantile regression with adaptive Lasso penalty (BQR-AL) is based on different penalty parameters are applied to different regression coefficients.
Therefore, the parameter estimation model of BQR-AL is ( ) Alhamzawi and Yu [16] proposed different penalization parameters for different regression coefficients, and the prior of the penalty parameters is set to the inverse gamma distribution, and treat the hyperparameters of the inverse gamma prior as unknowns. Thus, the Laplace prior on j β by ( ) Andrews and Mallows [19] The prior distribution of 2 j λ is set to the inverse gamma prior, so the distri- where 0 δ > and 0 γ > are two hyperparameters. Yi and Xu [20] pointed out that the values of these two hyperparameters determine the degree of compression of the variables in (13), because small γ large δ will lead to greater compression, so in order to avoid the impact of special values on the estimation of regression coefficients, we consider δ and γ as unknown parameters.
In summary, the Bayesian quantile regression hierarchical model with adaptive Lasso penalty is

Gibbs Sampling
From the hierarchical model, the joint posterior density function of each parameter is Since the full condition posterior distribution ( ) ( ) of δ does not have a closed form, it is a logarithmic convex function. Gilks [21] proposed using the adaptive rejection sampling algorithm to sample this distribution.

Independent and Identically Distributed Random Errors
Here, we follow the same simulation strategy introduced by Li, Xi and Lin [15] in the simulation studies 1, 2 and 3 with different parameter values for the error distributions.
We consider a linear model In the first three simulation studies, the rows of x is generated in a multiva- In Simulation 4, we first In each simulation, we consider the error distributions in our simulation fol-  Figure 1 shows that under the condition of simulation 1, that is, sparse coefficients, the MMD values of BQR-AL and BQR-L are lower than those of the frequency school method, in which the MMAD value satisfying the error term compliance distribution ( ) , 0.5,1 ALD µ is smaller. Figure 2 corresponds to the dense coefficient in simulation 2, and it can be seen that the MMAD values of BQR-AL and BQR-L are very small and close to each other. Figure 3 corresponds to the case where the coefficient of the simulation 3 is sparse, and the same conclusion as the simulation 1 can still be obtained. And the effect of Bayesian regularized quantile regression is more obvious.
For simulation 4, since the number of variables is larger than the sample size, the design matrix is a singular matrix, so the QR and QR-SCAD methods cannot be run in this simulation, and the other methods can still operate normally. This also proves the advantages of the regularization method. The results are shown in Figure 4. Journal of Applied Mathematics and Physics      4 show that, in terms of the MMAD, BQR-AL and BQR-L method performs better than the other regularized quantile regression method. The results of the MMAD for simulation 1 -4 are reported in Figures 1-4. From these simulation, we can learn the following results: 1) In the above simulation, the MMADs of BQR-AL and BQR-L tend to give lower MMAD compared with the other regularized quantile regression under non-Bayesian for all distributions under considerations. It is shown that the stability and repeatability of the Bayesian regularized quantile regression are better.
2) In the case of sparse and very sparse regression coefficient, the MMAD value of BQR-AL is the smallest. In the case of dense regression coefficient, the MMAD value of BQR-L is smaller. Moreover, the estimation effect of the two methods is similar.
3) The BQR-AL and BQR-L methods can achieve good results under all error term distributions. It is shown that the regularized Bayesian quantile regression method is robust to the assumption of the error term, and the two methods are satisfactory even if the error term deviates from ALD. 4) No matter what the distribution of the original data is, when the error distribution is ALD, the regularized quantile regression method under Bayesian framework has high accuracy, especially the BQR-AL method, and the estimated value of its parameters is the closest to the real value.
In addition to observing the MMADs of each method, this paper can also observe the estimation of its parameters. Due to the limited space, in this paper, the parameter estimation results of error obeying ALD distribution in simulation 1 are simulated: It can be seen from the parameter estimates in Table 1, the QR class method generally gives less biased parameter estimates, but this does not guarantee a good quantile prediction, as implied by the MMADs in Figure 4.

Non-i.i.d. Random Errors
Consider the following model when the error term is subject to a non-i.i.d.

( )
The remaining five noise variables 4 5 6 7 8 , , , , x x x x x are generated from the independent standard normal distribution. The results are shown in Table 2 and are based on 50 repetitions, each with sample size n = 200.
It can be known from Table 2 that the BQR-L method have smaller MMAD values, indicating that the regularized quantile regression method under the Bayesian framework is also superior to the regularized quantile regression method under the non-Bayesian framework when the error obeys the non-i.i.d..

Prostate Cancer Data Set
This section mainly analyzes prostate cancer data in the "bayesQR" package [22]. The dataset was first proposed by Stamey et al. [23], including a medical    In Table 3, we will compare three methods: QR, BQR-L, and BQR-AL. The QR method will use the "quantreg" package in R and the default rank method to get the confidence interval. Here, the 95% interval is considered and parameter estimation is performed. For the Bayesian method, the MCMC algorithm is used to perform 5000 simulations on the posterior distribution by default, and the Q. Q. Tang et al.
first 1000 samples are discarded. The results are shown in Table 3.
It can be seen from Table 3 that the parameter estimates of the BQR-L and BQR-AL methods are very close to the classical quantile regression and the credible intervals of the BQR-L and BQR-AL methods are narrower than the QR, in which the BQR-AL interval to be more accurate than BQR-L, from this point of view, the Bayesian quantile regression method with penalty is estimated to be more accurate than the non-Bayesian regularized quantile regression method.
Of course, we can also intuitively understand by drawing images. In this section, for a more intuitive observation, the estimated values of the various methods for 0.7 θ = are plotted, and similar results are obtained for other quantiles. In order to be intuitive, the estimated values of each method will be translated, and the image is drawn as shown in Figure 5.  It can be clearly seen from Figure 5 that the quantile regression with Bayesian can provide a credible interval, while the non-Bayesian quantile regression is not always available and the credible interval of BQR-AL is narrower than BQR-L.
The Bayesian quantile regression with penalty is estimated to be more accurate than the non-Bayesian regularized quantile regression. The effect of BQR-AL is better.

Conclusion
Bayesian quantile regression with adaptive Lasso penalty is an extension and improvement of the Lasso method. Adaptive Lasso penalty is based on different penalty parameters are applied to different regression coefficients. This method can effectively eliminate the influence of noise variables and obtain more accurate parameter estimation. Through the Gibbs sampling algorithm, this paper systematically compares the regularized quantile regression under the non-Bayesian and Bayesian framework, and finds that when the error term obeys the independent identically distributed or heteroscedasticity distribution, both BQR-AL and BQR-L have higher accuracy and are superior to non-Bayesian methods. When the error obeys ALD, the BQR-AL method has the highest accuracy for the MMAD under the same quantile, and its parameter estimate is also the closest to the true value in general. In the real data set, we can also find the same conclusion. Therefore, we can say that the Bayesian penalty regression method can get a good effect under the condition that the coefficient is sparse or dense, and it can be described in full aspect at different quantile points, and it will occupy a very important position in the future high-dimensional data analysis.

Funding
This work was supported by the National Natural Science Foundation of China