1. Introduction
As a prominent member of the Lp regression family, quantile regression is a classic statistical method, first introduced by Koenker and Bassett [1] in 1978, that provides a comprehensive description of data by estimating the conditional distribution at different quantile levels. It is widely applied in fields such as economics, medicine, and ecology; we refer readers to the review paper on quantile regression methods [2]. As a parallel method, expectile regression was proposed by Newey and Powell [3]. This type of method has gained popularity in finance thanks to its intrinsic properties, although it has not been as widely adopted as the quantile method in general. The Lp quantile regression theory proposed by Chen [4], developed on the basis of quantile and expectile regression, possesses advantages of both.
Chen investigated Lp quantile regression and used it to test the symmetry of data. Recently, several authors have found that Lp quantile regression combines the robustness of quantile regression with the efficiency advantage of the expectile approach, and that it significantly outperforms traditional quantile regression and expectile regression in handling outliers and non-normally distributed data. With the advancement of data science and statistics, improving and developing Lp quantile regression not only enriches the statistical toolbox but also promotes its widespread application in practical scenarios, offering new insights and methodologies for understanding complex data.
The purpose of this paper is to systematically review the development history of Lp quantile regression, compare the advantages and disadvantages of different regression methods, and review the latest research progress. The innovation of this paper lies in the synthesis and comparison of existing quantile regression methods, especially Lp quantile regression, to provide a comprehensive framework to help researchers and practitioners understand the application scope of these methods and their applicable contexts. In addition, the paper highlights the unique advantages of Lp quantile regression in financial risk measurement, demonstrating its potential in dealing with extreme data and complex risk scenarios.
The structure of the article is as follows. Section 2 introduces the basic concepts of, and connections among, quantile regression, expectile regression, and Lp quantile regression; Section 3 reviews the advantages and disadvantages of the three regression methods and related applied research; Section 4 introduces Lp quantile regression in the field of financial risk measurement; and Section 5 concludes the paper and looks forward to future research directions. Through this review, we hope to provide a new perspective for the theoretical development and practical application of Lp quantile regression and to stimulate further research interest and applied exploration.
2. Definitions of Quantile, Expectile, and General Lp Quantile Regression
Suppose that the observations $(X_i, Y_i)$, $i = 1, \dots, n$, come from the following linear model:
$$Y_i = X_i^{\top}\beta + \varepsilon_i,$$
where $X_i \in \mathbb{R}^d$ with first component equal to $1$, and $\varepsilon_i$ is the error term. Quantile regression uses a loss function $\rho_\tau(u)$, which assigns different weights to positive and negative residuals and is defined as follows:
$$\rho_\tau(u) = \left|\tau - I(u < 0)\right| |u|,$$
where $I(\cdot)$ is the indicator function and $\tau \in (0, 1)$ represents the weight level. Then, the regression coefficient estimates in quantile regression are obtained from the following optimization problem:
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\left(Y_i - X_i^{\top}\beta\right).$$
Inspired by the quantile regression concept, Newey and Powell [3] defined the asymmetric squared loss function
$$\rho_\tau(u) = \left|\tau - I(u < 0)\right| u^2,$$
where $\tau \in (0, 1)$ represents the weight level. Just as the quantile loss function can induce the quantile of a random variable, the asymmetric squared loss also induces a certain statistic, which is now generally called the expectile. Expectile regression solves the following optimization problem to obtain the estimates of the regression coefficients:
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \left|\tau - I\left(Y_i - X_i^{\top}\beta < 0\right)\right| \left(Y_i - X_i^{\top}\beta\right)^2.$$
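In practice, the asymmetric least-squares problem above is commonly solved by iteratively reweighted least squares: each iteration refits a weighted least-squares problem with weights τ or 1 − τ according to the sign of the current residuals. A minimal sketch on simulated data (assumed for illustration, not the authors' code):

```python
import numpy as np

def fit_expectile_regression(X, y, tau, n_iter=50):
    """Asymmetric least squares via iteratively reweighted least squares:
    residuals above the current fit get weight tau, those below get 1 - tau."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS start
    for _ in range(n_iter):
        w = np.where(y - X @ beta >= 0, tau, 1 - tau)
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ y)  # weighted normal equations
    return beta

# tau = 0.5 gives equal weights, i.e. ordinary least squares (the mean).
rng = np.random.default_rng(2)
y = rng.normal(size=1000)
X = np.ones((1000, 1))
e50 = fit_expectile_regression(X, y, tau=0.5)[0]
e90 = fit_expectile_regression(X, y, tau=0.9)[0]
```

The τ = 0.5 case reducing to the sample mean, and τ > 0.5 lying above it, mirrors the special-case structure discussed in the text.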
Chen [4] proposed Lp quantile regression. For $p \ge 1$ and $\tau \in (0, 1)$, the loss function $\rho_\tau^{p}(u)$ is defined as follows:
$$\rho_\tau^{p}(u) = \left|\tau - I(u < 0)\right| |u|^{p},$$
where $\tau$ represents the level of weights. When $p = 1$ and $p = 2$, $\rho_\tau^{p}$ reduces to the quantile and expectile regression loss functions, respectively; when $p = 2$ and $\tau = 0.5$, it is the loss function of the classical least squares regression method. The various loss functions are shown in Figure 1.
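The unified loss takes only a few lines to write down; the snippet below (an illustrative sketch, not from the original papers) checks numerically that p = 1 and p = 2 recover the quantile and expectile losses:

```python
import numpy as np

def lp_loss(u, tau, p):
    """Asymmetric power loss |tau - I(u < 0)| * |u|**p (Chen's Lp quantile loss)."""
    weight = np.where(u < 0, 1 - tau, tau)
    return weight * np.abs(u) ** p

u = np.linspace(-3, 3, 601)
tau = 0.7
quantile_loss = np.where(u >= 0, tau * u, (tau - 1) * u)
expectile_loss = np.where(u < 0, 1 - tau, tau) * u ** 2

assert np.allclose(lp_loss(u, tau, p=1), quantile_loss)   # p = 1: quantile loss
assert np.allclose(lp_loss(u, tau, p=2), expectile_loss)  # p = 2: expectile loss
assert np.allclose(lp_loss(u, 0.5, p=2), 0.5 * u ** 2)    # p = 2, tau = 0.5: half the squared error
```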
Comparing the loss functions of the various regression methods in Figure 1, when p lies between 1 and 2, the Lp quantile loss curve also lies between the quantile and expectile loss curves; when p exceeds 2, it follows directly from the definition that the loss value grows explosively as p continues to increase.
Additionally, although the loss function defined by Lp quantiles includes the quantile and expectile loss functions as special cases, it also differs from them. In both quantile regression and Lp quantile regression, $\tau$ represents the level of asymmetry, but $\tau$ in quantile regression can be interpreted as a probability, whereas $\tau$ in Lp quantile regression has no probability interpretation. Jones [5] proved theoretically that there is a one-to-one mapping from expectiles to quantiles. Efron [6] proposed a method to estimate quantiles via expectiles. Similarly, we can calculate the relationship between the Lp quantile level $\tau$ and the quantile level $\alpha$ for different values of p, as shown in Figure 2, where we plot this relationship under the standard normal distribution and the t(5) distribution. Figure 2 shows that, under different distributions, the Lp quantile is distinct from the quantile and the expectile, but the three can be transformed into one another.
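The mapping in Figure 2 can be computed from the first-order condition of the Lp quantile: the level-$\tau$ Lp quantile $q$ of a random variable $Y$ satisfies $\tau\,E[(Y-q)_+^{\,p-1}] = (1-\tau)\,E[(q-Y)_+^{\,p-1}]$. The sketch below (an assumed illustration; `lp_level` is a hypothetical helper name) solves this for the $\tau$ matching a given quantile level $\alpha$:

```python
import numpy as np
from scipy import integrate, stats

def lp_level(alpha, p, dist=stats.norm):
    """Return the Lp quantile level tau whose Lp quantile equals the
    alpha-th quantile of `dist`, via the first-order condition
    tau * E[(Y - q)_+^(p-1)] = (1 - tau) * E[(q - Y)_+^(p-1)]."""
    q = dist.ppf(alpha)  # alpha-th quantile of the distribution
    below = integrate.quad(lambda y: (q - y) ** (p - 1) * dist.pdf(y), -np.inf, q)[0]
    above = integrate.quad(lambda y: (y - q) ** (p - 1) * dist.pdf(y), q, np.inf)[0]
    return below / (below + above)

# p = 1 recovers the quantile level itself; p = 2 gives the expectile level.
tau_p1 = lp_level(0.9, p=1)                       # equals 0.9
tau_p2 = lp_level(0.9, p=2)                       # expectile level under N(0, 1)
tau_t5 = lp_level(0.9, p=2, dist=stats.t(5))      # same mapping under t(5)
```

Sweeping `alpha` over (0, 1) for several values of `p` reproduces curves of the kind shown in Figure 2.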
In past studies, quantile regression and expectile regression have established solid theoretical foundations, and Lp quantile regression provides a more flexible framework encompassing both. By employing a loss function with an adjustable weight, Lp quantile regression can not only capture different characteristics of the conditional distribution but also balance robustness and efficiency when analyzing complex data, especially in the presence of high-dimensional heteroscedasticity.
Figure 1. Comparison of four loss functions when τ is 0.7.
Figure 2. Under the standard normal distribution and t(5) distribution, when p takes different values, the relationship between τ and α.
The Lp quantile regression method reviewed in this paper can provide a more efficient solution for dealing with outliers and non-normal distributions and can be widely used in finance, economics, biostatistics, and other fields. In addition, the discussion of different values of the parameter p not only deepens the understanding of the connections and differences among the three regression methods, but also provides a new entry point for future research and encourages researchers to continue exploring the potential and value of quantile-based methods in practical applications. This work enriches existing statistical regression theory and provides new ideas and tools for data analysis, and thus has important value for both academic research and practical application.
3. Merits and Demerits of the Three Regression Methods
3.1. Quantile Regression
Quantile regression has received much attention in statistics, econometrics, and finance in the past decade, and scholars have gradually realized its importance in obtaining information about random variables. In particular, when investigating relationships between response variables and certain covariates, both classical and Bayesian studies have developed new tools to delve deeper into these relationships, especially when focusing on tail behavior. Yu and Moyeed [7] pioneered discussions of Bayesian quantile regression, and subsequent studies such as Koenker [8], Kottas and Krnjajić [9], Taddy and Kottas [10], and Bernardi et al. ([11] [12]) provided their own Bayesian perspectives, further enriching research in this field. In recent studies, Tang et al. [13] explored quantile regression with adaptive Lasso and Lasso penalties from a Bayesian perspective, analyzed a Gibbs sampling algorithm based on the asymmetric Laplace distribution, and compared several regularized quantile regression methods under different error distributions and heteroscedastic conditions. The results show that, under asymmetric Laplace errors, Bayesian regularized quantile regression outperforms non-Bayesian methods in parameter estimation and prediction accuracy, and it shows superior performance at all quantiles. Combining the advantages of quantile regression and the Bayesian method makes the model flexible, robust, and efficient, especially when dealing with outliers and complex data structures, showing significant application potential.
The quantile regression method has shown its powerful function and wide applicability across various data types and application fields. Huang et al. [14] pointed out that quantile regression can effectively detect the heterogeneous effects of covariates, which is especially important when dealing with independent data, time-to-event data, and longitudinal data. When there are outliers or heavy-tailed distributions in the data, quantile regression not only compensates for the deficiencies of mean regression, but also provides a new perspective for research in science and finance. Mike and Israel [15] studied the impact of climate change in Florida and, using quantile regression analysis, found that demographic factors such as knowledge, age, and income have a significant impact on the "social communication index" at different quantiles, which provides a basis for the effective communication of climate change information. In addition, Varouchas et al. [16] explored the impact of board gender diversity on bank performance. In general, the quantile regression method plays an important role in revealing complex relationships between variables and capturing data heterogeneity, which provides strong support for research and policymaking in related fields.
In recent years, quantile regression has been further applied and developed in survival data analysis [17], environmental science ([18]-[20]), medical research [21], and extreme value theory [22].
A significant advantage of quantile regression is its robustness. Unlike least squares estimation, quantile regression is insensitive to outliers in the sample data, particularly when the disturbance terms follow a non-normal or heavy-tailed distribution, where its effectiveness is notably higher than that of least squares estimation. Moreover, under the assumption of homoskedasticity, the estimators in quantile regression at different quantile levels only vary with the intercept term, while the slope term remains constant. Under the assumption of heteroskedasticity, the slope term changes with the quantile, thus enabling quantile regression to effectively test for heteroskedasticity.
As an effective alternative to mean regression, the advantages of quantile regression mainly lie in the following aspects: firstly, it has weaker requirements for conditions, being able to comprehensively depict the effect of explanatory variables on the entire conditional distribution of the response variable; secondly, quantile regression does not require distribution assumptions for the model’s random disturbance term, thereby enhancing the robustness of model construction; thirdly, it has a monotonic transformation property for the response variable; finally, under large sample theory, parameter estimation exhibits asymptotic superiority. These advantages make quantile regression increasingly valued in certain scenarios, exhibiting different properties from mean regression.
In short, compared with traditional estimation methods, quantile regression exhibits stronger robustness in the face of outlier values, and there is no need to make excessive assumptions about the error terms in the model. It can reflect the characteristics of data at different quantiles and provide more effective information, so it has shown a wide range of potential and value in various scientific and financial applications.
3.2. Expectile Regression
As an effective alternative to traditional quantile regression, expectile regression has received widespread attention in quantitative risk management, nonparametric regression, and extreme value theory.
The expectile is widely used in the field of quantitative risk management; related studies include Schnabel and Eilers [23], De Rossi and Harvey [24], Sobotka and Kneib [25], Sobotka et al. [26], and Guo et al. [27]. In these studies, expectiles were introduced for analyzing risk, and they gradually gained popularity in the subsequent literature, such as Cai and Weng [28], Bellini and Di Bernardino [29], and Daouia et al. [30], further enriching the theoretical basis and application examples of expectile regression.
In the field of nonparametric regression, research on expectile regression has gradually emerged. Nonparametric expectile regression was discussed in early literature such as Yao and Tong [31], De Rossi and Harvey [24], Schnabel and Eilers [23], and Waltrup et al. [32]. In recent years, Guerra et al. [33] introduced two new nonparametric smoothing methods, proposing a fuzzy transform approach based on the L1 and L2 norms to construct fuzzy approximation models and verifying their effectiveness on real financial datasets. In addition, Almanjahie et al. [34] proposed a kernel-based expectile regression estimator for strongly mixing functional time series data and proved its consistency and asymptotic normality in large samples, indicating its application potential in financial time series analysis.
The application of expectiles in extreme value theory is also worthy of attention. Padoan and Stupfler [35] discussed the estimation of extreme expectiles and their application to heavy-tailed time series, emphasizing the importance of data dependence in financial risk management. Girard, Stupfler, and Usseglio-Carleve [36] established an estimation theory for extreme conditional expectiles in heavy-tailed heteroscedastic regression models, demonstrated its application in a variety of important models, and verified its effectiveness on empirical data. In addition, Daouia, Padoan, and Stupfler [37] innovatively proposed an extreme expectile estimation method suitable for short-tailed distributions, further expanding expectile estimation theory in risk management.
It is worth mentioning that the expectile regression method combines characteristics of both the mean and the quantile and has attracted much attention in recent years. While traditional least squares regression has limitations in analyzing high-dimensional heteroscedastic data [38], the expectile method can fully describe the conditional distribution and shows greater flexibility because it does not require distributional assumptions. Expectiles are not as intuitive as means and quantiles, but they are seen as an effective alternative in expectile regression based on asymmetric least squares estimation.
In summary, expectile regression exhibits significant advantages in high-dimensional heteroscedasticity analysis, and the fact that it requires no distributional assumptions makes it effective in characterizing conditional distributions. However, the interpretability of expectiles needs further exploration to promote their wide acceptance and use in practical applications.
3.3. Lp Quantile Regression
Since the development of quantile and expectile regression theory, the concepts of the M quantile and the Lp quantile have been proposed in succession. Breckling and Chambers [39] and Chen [4], respectively, introduced broader classes of tools extending the idea of the traditional quantile: the M quantile is obtained by minimizing a general asymmetric loss function, while the Lp quantile is obtained by minimizing an asymmetric power function. The expectile can be regarded as a special case of the M quantile. In Lp quantile regression, for a loss function whose exponent p lies in the interval (1, 2), estimates of the regression coefficients can be obtained from the following optimization problem:
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \left|\tau - I\left(Y_i - X_i^{\top}\beta < 0\right)\right| \left|Y_i - X_i^{\top}\beta\right|^{p}.$$
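For 1 < p < 2 this objective is convex and continuously differentiable, so a generic gradient-based optimizer suffices. The following is a minimal sketch on simulated data (an assumed illustration, not code from the cited works):

```python
import numpy as np
from scipy.optimize import minimize

def fit_lp_quantile(X, y, tau, p):
    """Minimize sum_i |tau - I(u_i < 0)| * |u_i|**p over beta,
    with u_i = y_i - x_i' beta; smooth for 1 < p < 2."""
    def objective(beta):
        u = y - X @ beta
        w = np.where(u < 0, 1 - tau, tau)
        return np.sum(w * np.abs(u) ** p)
    beta0 = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS warm start
    return minimize(objective, beta0, method="BFGS").x

# Heavy-tailed noise, where intermediate p balances robustness and efficiency.
rng = np.random.default_rng(3)
n = 800
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 1.0 + 2.0 * X[:, 1] + rng.standard_t(df=3, size=n)
beta_hat = fit_lp_quantile(X, y, tau=0.5, p=1.5)
```

With symmetric noise and τ = 0.5, the fitted intercept and slope should be close to the true values 1 and 2.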
In the application of Lp quantiles, existing research indicates that they can be combined with methods such as expectiles, extreme quantiles, and Bayesian approaches to solve problems more efficiently. For instance, Girard et al. [40] proposed a method for estimating extreme expectiles based on Lp quantiles, unifying absolute-error and squared-error loss; it was applied to estimate the extreme value parameters of the underlying distribution and validated on multiple datasets. Stupfler and Usseglio-Carleve [41] introduced a composite method based on the Lp quantile for extreme quantiles and expectiles to address issues with existing estimators and parameter selection. Lukas and Johan [42] outlined Bayesian and frequentist estimation procedures for Lp quantile regression based on the skewed exponential power distribution (SEPD), compared them with previous studies, and demonstrated that the new methods provided better estimation accuracy in terms of mean squared error in practical data applications. Additionally, Arnroth [43] extended single Lp quantile regression and composite quantile regression, providing a Bayesian method shown to outperform traditional Bayesian composite quantile regression in many respects.
Moreover, particularly within the range 1 < p < 2, the Lp quantile is of interest for its robustness and efficiency. For instance, Jiang, Lin, and Zhou [44] introduced a new asymmetric loss function lying between the quantile and expectile losses, developing k-th power expectile regression, which further expands the application of Lp quantiles. Gao and Wang [45] proposed two distributed Lp quantile regression estimators to address issues caused by high-dimensional external covariates. They explored how to obtain low-dimensional parameter pre-estimates through regularized projection score estimation and proposed two efficient proxy projection score estimators. Through simulation experiments, mainly for independent and identically distributed data, they found that choosing p between 1 and 2 provides a good balance between robustness and efficiency.
Despite the less intuitive interpretation of Lp quantile regression compared to standard quantile regression, it can be linked to quantile estimates through appropriate transformations ([5] [46]). As illustrated in Figure 3, Lp quantile regression provides estimates for any quantile of the dependent variable; as $\tau$ approaches 0, the loss values for different values of p converge, whereas they exhibit greater disparity as $\tau$ approaches 1. This indicates that when p is between 1 and 2, the Lp quantile loss function lies between the quantile and expectile losses, highlighting the importance of the least-absolute-deviation criterion in robust regression, while the squared criterion is more efficient under normality. This suggests that Lp quantile regression strikes a favorable balance between the traditional quantile and the expectile, effectively addressing heavy-tailed distributions.
Therefore, although Lp quantile regression lacks intuitiveness to a certain extent, it shows significant advantages in balancing robustness and efficiency and has become an emerging method for solving high-dimensional heteroscedasticity problems and for parameter estimation. It not only enriches existing statistical regression theory, but also provides new ideas and practical tools.
Figure 3. Comparison of Lp quantile loss function.
To sum up, the merits and demerits of the three methods of quantile regression, expectile regression, and Lp quantile regression are summarized in Table 1.
4. Application of Lp Quantile Regression in the Financial Risk Field
Quantile regression is a powerful statistical tool widely used in economics and finance. It models the different quantiles of the conditional distribution of the dependent variable, making it suitable for analyzing and managing extreme risks.
The quantile, or value-at-risk (VaR) (see [47]), is a generally accepted risk measure. As one of the main applications of quantitative risk assessment, quantile-related research has grown steadily, and the quantile and its related concepts play a central role in the financial and actuarial science literature as key tools for calculating capital requirements. VaR is defined as the maximum loss of an investment portfolio at a given confidence level, i.e., a quantile of the loss distribution; it was introduced by J. P. Morgan in 1994 as a measure of financial risk. Since then, it has become the most widely used risk measure for regulatory purposes; see Jorion [48]. Quantile regression provides a solid theoretical basis for VaR estimation based on historical data, effectively capturing different quantiles of the return distribution and thus helping financial institutions evaluate risks more comprehensively.
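Since VaR is simply a quantile of the loss distribution, its most basic estimator is the empirical quantile of historical losses. A minimal sketch on simulated heavy-tailed returns (assumed toy data, not from the paper):

```python
import numpy as np

def historical_var(returns, alpha=0.99):
    """Historical-simulation VaR: the alpha-quantile of the loss
    distribution, where losses are negative returns."""
    losses = -np.asarray(returns)
    return np.quantile(losses, alpha)

# Toy daily returns with heavy tails (Student t with 5 degrees of freedom).
rng = np.random.default_rng(0)
daily_returns = rng.standard_t(df=5, size=2500) * 0.01
var_99 = historical_var(daily_returns, alpha=0.99)
var_95 = historical_var(daily_returns, alpha=0.95)
```

The limitations of this historical-simulation approach (and of variance-covariance and Monte Carlo methods) are exactly what motivate the CAViaR and expectile-based approaches discussed next.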
However, traditional VaR measurement methods, such as the variance-covariance method, historical simulation method, and Monte Carlo simulation method, have their own limitations. To solve these problems, Engle and Manganelli [49] proposed the CAViaR (Conditional Autoregressive Value at Risk) model, which directly modeled VaR itself and pioneered the application of quantile regression in financial risk measurement. CAViaR model is especially suitable for autocorrelation and volatility characteristics in financial data because of its lower sample size requirements and stronger robustness.
At the same time, the expectile, as an extension of quantile regression, provides a new approach to risk measurement. Efron [6] showed that VaR can be effectively estimated via expectiles, because expectiles are more sensitive to the tails of the conditional distribution, which makes them particularly effective at capturing extreme risks. Many scholars (e.g., [50]-[52]) have demonstrated the superior performance of expectiles in VaR estimation, further promoting their application in quantitative risk management.

Table 1. Merits and demerits of three regression methods.

Methods | Merits | Demerits
Quantile regression | 1) Strong robustness to outliers and non-normally distributed data; 2) No distributional assumptions on the error terms, so it is applicable to a wider range of scenarios; 3) Capable of describing the effects of explanatory variables on different quantiles of the response variable; 4) Parameter estimation has asymptotic optimality. | 1) May be less efficient when dealing with high-dimensional data; 2) It describes only certain properties of the conditional distribution and cannot explain the overall mean.
Expectile regression | 1) Combines mean and quantile features to provide comprehensive information; 2) Bypasses the assumptions of traditional least squares and enhances flexibility; 3) Requires no distributional assumptions, so conditional distributions can be effectively characterized. | 1) Relatively weak intuitiveness, difficult to understand and explain; 2) May not perform as well as traditional methods in some cases.
Lp quantile regression | 1) Combines the robustness of quantile regression with the efficiency of expectile regression, applicable to a variety of data scenarios; 2) Shows good robustness to heavy-tailed distributions; 3) Can address high-dimensional heteroscedasticity problems, with more flexible parameter estimation. | 1) The theoretical interpretation is relatively complicated and not as intuitive as the quantile or the mean; 2) When the value of p is chosen improperly, model performance may suffer.
With the deepening of research, Lp quantile regression, as a natural extension of the expectile and the quantile, has gradually become a hot topic in the field of risk management. It can not only effectively measure VaR, but also extract features and handle extreme risks. The relevant literature shows that the Lp quantile is advantageous for high-dimensional and extreme data. For example, Usseglio-Carleve [53] and Tang and Chen [54] demonstrated the validity and applicability of the Lp quantile in conditional extreme risk measures and in models based on realized volatility, respectively. Sun et al. [55], in their latest research, focused on the application of Lp quantile regression to accurate VaR estimation, proposed a conditional Lp quantile nonlinear autoregression model (CAR-LP-quantile model), established limit theorems for the regression estimator, and derived algorithms for parameter estimation and optimal p-value selection. Simulation and empirical analysis show that the CLVaR method based on the CAR-LP-quantile model is highly effective and advantageous.
Overall, quantile regression, expectile regression, and Lp quantile regression demonstrate strong potential for application in financial risk management. By analyzing different quantiles, financial institutions can achieve a more comprehensive risk assessment and develop appropriate response strategies according to specific circumstances. Especially in extreme risk and complex data environments, Lp quantile regression provides a more robust and effective tool for financial risk management, enabling institutions to identify and mitigate potential risks more effectively.
5. Conclusions and Prospect
In general, the quantile regression method is widely used in economics, finance, medicine, environmental science, and other fields due to its good effectiveness and applicability. Expectile regression provides higher robustness due to its independence from distributional assumptions, which strengthens the application potential of quantile-type regression in complex data environments. At the same time, Lp quantile regression balances the robustness of quantile regression with the efficiency of expectile regression, showing unique advantages as a powerful statistical regression method. Especially in financial risk measurement, the application of Lp quantile regression has gradually increased; it not only provides a richer data summary than mean regression, but also shows great potential in dealing with high-dimensional data and uncertainty modeling.
Recent studies have revealed that Lp quantile regression can provide more accurate results than traditional regression methods in the context of high-dimensional heteroscedasticity. This finding makes its application to risk management and credit scoring in financial markets as well as insurance more effective. The adjustable parameter (p) of Lp quantile regression provides researchers with the ability to flexibly adjust the model according to the needs of specific problems, so as to better capture different data characteristics.
Despite the remarkable results achieved by Lp quantile regression, further research is needed in terms of algorithmic efficiency and model design to improve its adaptability in the ever-changing data environment. The future research direction should focus on multiple fields: First, improve the efficiency of the algorithm to reduce the computational complexity of Lp quantile regression on large-scale data sets. Second, explore the possibility of combining it with deep learning technology to develop more flexible and accurate prediction models. In addition, attention should be paid to applications in dynamic data environments, especially in time series data, and the impact of its dynamics on risk assessment. With the deepening of the research on Lp quantile regression, combined with the progress of computing technology, its algorithm efficiency and application breadth will continue to improve, providing new perspectives and methods for future data analysis and decision-making and promoting the further development of theory and practice.
Funding
This work is partly supported by the Graduate Textbook Construction Project of Sichuan University of Science and Engineering (Grant No. KA202011) and the Opening Project of Sichuan Province University Key Laboratory of Bridge Non-destruction Detecting and Engineering Computing (2024QYY02).