Quantile Regression Based on a Laplacian Manifold Regularizer with Data Sparsity in $\ell^1$ Spaces

In this paper, we consider regularized learning schemes based on an $\ell^1$ regularizer and the pinball loss in a data-dependent hypothesis space. The target is the error analysis of quantile regression learning. No regularity condition is imposed on the kernel function except continuity and boundedness. The graph-based semi-supervised algorithm leads to an extra error term, called the manifold error. New error bounds and convergence rates are explicitly derived with techniques based on the $\ell^1$-empirical covering number and an error decomposition.


Introduction
The classical least-squares regression models have focused mainly on estimating conditional mean functions. In contrast, quantile regression provides richer information about the conditional distribution of the response variable, such as the stretching or compressing of its tails, so it is particularly useful in applications where the lower and upper, or indeed all, quantiles are of interest. In recent years, quantile regression has become a popular statistical method in various research fields, such as reference charts in medicine [1], survival analysis [2], and economics [3].
In addition, relative to least-squares regression, quantile regression estimates are more robust against outliers in the response measurements. We introduce a framework for data-dependent regularization that exploits the geometry of the marginal distribution. The labeled and unlabeled data are used to construct a graph, which is incorporated as an additional regularization term. This term exploits the geometry of the probability distribution that generates the data. Hence, there are two regularization terms: one controlling the complexity of the estimator in the ambient space, and the other controlling its complexity as measured by the geometry of the distribution in the intrinsic space.

The Model
In this paper, under the framework of learning theory, we study $\ell^1$-regularized and manifold-regularized quantile regression. Let $X$ be a compact subset of $\mathbb{R}^n$ and $Y \subset \mathbb{R}$. There is a probability distribution $\rho$ on $X \times Y$ according to which examples are generated for function learning. Labeled examples are pairs $(x, y) \in X \times Y$ generated according to $\rho$; unlabeled examples are simply points $x \in X$ drawn according to the marginal distribution $\rho_X$ of $\rho$. We will make the specific assumption that there is an identifiable relation between $\rho_X$ and the conditional distributions $\rho(\cdot \mid x)$. For $0 < \tau < 1$, the conditional $\tau$-quantile is the set-valued function defined by
$$f_{\tau,\rho}(x) \in \left\{ t \in \mathbb{R} : \rho\big((-\infty, t] \mid x\big) \geq \tau \ \text{ and } \ \rho\big([t, \infty) \mid x\big) \geq 1 - \tau \right\},$$
where $\rho(\cdot \mid x)$ denotes the conditional distribution of $y$ given $x$. The empirical method for estimating the conditional $\tau$-quantile function is based on the $\tau$-pinball loss
$$\psi_\tau(u) = \begin{cases} \tau u, & \text{if } u \geq 0, \\ (\tau - 1)u, & \text{if } u < 0, \end{cases}$$
applied to $u = y - f(x)$ for $f : X \to \mathbb{R}$. Based on the labeled observations $\mathbf{z} = \{(x_i, y_i)\}_{i=1}^{l}$, the empirical risk of the function $f$ is
$$\mathcal{E}_{\mathbf{z}}(f) = \frac{1}{l} \sum_{i=1}^{l} \psi_\tau\big(y_i - f(x_i)\big),$$
while the generalization error $\mathcal{E}(f) = \int_{X \times Y} \psi_\tau\big(y - f(x)\big)\, d\rho$ is minimized by the conditional $\tau$-quantile function $f_{\tau,\rho}$. Next, we assume that $|f_{\tau,\rho}(x)| \leq 1$ for a.e. $x \in X$ with respect to $\rho_X$.
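To make the pinball loss and empirical risk concrete, here is a minimal numerical sketch (ours, not the paper's; all names are illustrative). It also checks the well-known fact that, over constant predictors, the empirical pinball risk is minimized near the sample $\tau$-quantile.

```python
import numpy as np

def pinball_loss(u, tau):
    """tau-pinball loss: tau*u if u >= 0, (tau - 1)*u if u < 0."""
    return np.where(u >= 0, tau * u, (tau - 1) * u)

def empirical_risk(f_values, y, tau):
    """Empirical risk (1/l) * sum_i psi_tau(y_i - f(x_i))."""
    return np.mean(pinball_loss(y - f_values, tau))

# Sanity check: minimizing over constant predictors recovers the
# sample tau-quantile of y.
rng = np.random.default_rng(0)
y = rng.standard_normal(1000)
tau = 0.75
grid = np.linspace(-3.0, 3.0, 601)
risks = [empirical_risk(np.full_like(y, t), y, tau) for t in grid]
print(grid[int(np.argmin(risks))], np.quantile(y, tau))  # roughly equal
```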
In kernel-based learning, this minimization usually takes place in a hypothesis space, namely the Reproducing Kernel Hilbert Space (RKHS) [4] [5] $\mathcal{H}_K$ generated by a kernel function $K : X \times X \to \mathbb{R}$. In the empirical case, a graph-based regularized quantile regression problem can typically be formulated in terms of the following optimization:
$$f_{\mathbf{z}} = \arg\min_{f \in \mathcal{H}_K} \left\{ \mathcal{E}_{\mathbf{z}}(f) + \gamma_A \|f\|_K^2 + \frac{\gamma_I}{(l+u)^2}\, \mathbf{f}^T L \mathbf{f} \right\}. \tag{2.5}$$
By the representer theorem, the solution to (2.5) can be written as $f_{\mathbf{z}}(x) = \sum_{i=1}^{l+u} \alpha_i K(x, x_i)$. Replacing the RKHS norm by an $\ell^1$ penalty on the coefficients not only shrinks the fitted coefficients toward zero but also forces some of them to be exactly zero when $\gamma_A$ is made sufficiently large. When the data lie on a low-dimensional manifold, graph-based methods seem especially effective for semi-supervised learning, and many such approaches have been proposed, for instance Transductive SVM [6], Measure-based Regularization [7], and so on. The $\ell^1$-regularized and manifold-regularized quantile regression is then
$$f_{\mathbf{z}} = \arg\min_{f \in \mathcal{H}_{K,\mathbf{z}}} \left\{ \mathcal{E}_{\mathbf{z}}(f) + \gamma_A \sum_{i=1}^{l+u} |\alpha_i| + \frac{\gamma_I}{(l+u)^2}\, \mathbf{f}^T L \mathbf{f} \right\}, \tag{2.6}$$
where $\mathcal{H}_{K,\mathbf{z}} = \big\{ \sum_{i=1}^{l+u} \alpha_i K(\cdot, x_i) : \alpha_i \in \mathbb{R} \big\}$ is the data-dependent hypothesis space, $\mathbf{f} = (f(x_1), \ldots, f(x_{l+u}))^T$, and $L = D - W$ is the graph Laplacian built from the weight matrix $W = (\omega_{ij})$, where $D$ is a diagonal matrix with diagonal entries $D_{ii} = \sum_{j=1}^{l+u} \omega_{ij}$. The weight $\omega_{ij}$ is given by a similarity function $\omega(x_i, x_j)$: the more similar $x_i$ and $x_j$, the larger $\omega_{ij}$ should be.
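The following Python sketch spells out objective (2.6) and a crude subgradient solver. It is our illustration under assumed choices (Gaussian similarity weights, a Gaussian kernel matrix, generic step sizes and parameter values), not the paper's algorithm.

```python
import numpy as np

def gaussian_weights(X, sigma):
    """Similarity weights w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def objective(alpha, K, L, y, l, tau, gamma_A, gamma_I):
    """Objective of (2.6): pinball risk + l1 penalty + manifold penalty."""
    f = K @ alpha                    # f(x_i) = sum_j alpha_j K(x_i, x_j)
    u = y - f[:l]                    # residuals on the labeled points
    risk = np.mean(np.where(u >= 0, tau * u, (tau - 1) * u))
    manifold = f @ (L @ f) / K.shape[0] ** 2
    return risk + gamma_A * np.abs(alpha).sum() + gamma_I * manifold

def fit(K, L, y, l, tau, gamma_A, gamma_I, steps=5000):
    """Crude subgradient descent on (2.6); illustration only."""
    n = K.shape[0]
    alpha = np.zeros(n)
    for t in range(steps):
        f = K @ alpha
        u = y - f[:l]
        g = (-K[:l].T @ np.where(u >= 0, tau, tau - 1) / l  # risk term
             + gamma_A * np.sign(alpha)                      # l1 term
             + 2 * gamma_I * K.T @ (L @ f) / n ** 2)         # manifold term
        alpha -= 0.1 / np.sqrt(t + 1) * g
    return alpha

# l labeled and u unlabeled points; hypothetical data for illustration.
rng = np.random.default_rng(1)
l, u, tau = 40, 60, 0.5
X = rng.uniform(-1, 1, (l + u, 1))
y = np.sin(np.pi * X[:l, 0]) + 0.1 * rng.standard_normal(l)
W = gaussian_weights(X, 0.3)
L_graph = np.diag(W.sum(axis=1)) - W     # graph Laplacian L = D - W
K = gaussian_weights(X, 0.3)             # Gaussian kernel matrix
alpha = fit(K, L_graph, y, l, tau, 1e-3, 1e-2)
print("nonzero coefficients:", int(np.sum(np.abs(alpha) > 1e-4)))
print("objective:", objective(alpha, K, L_graph, y, l, tau, 1e-3, 1e-2))
```

The $\ell^1$ penalty makes the objective non-smooth, which is why a subgradient (rather than gradient) step is used; for serious use one would prefer a proximal or linear-programming formulation.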

The Restriction
Definition 3.1. The projection operator $\pi$ on the space of measurable functions $f : X \to \mathbb{R}$ is defined by
$$\pi(f)(x) = \begin{cases} 1, & \text{if } f(x) > 1, \\ f(x), & \text{if } -1 \leq f(x) \leq 1, \\ -1, & \text{if } f(x) < -1. \end{cases}$$
Hence, it is natural to measure the approximation ability of $f$ by the distance $\|\pi(f) - f_{\tau,\rho}\|_{L^r_{\rho_X}}$ for some $r \in (0, \infty)$. We say that $\rho$ has a $\tau$-quantile of $p$-average type $q$, for some $p \in (0, \infty]$ and $q \in (1, \infty)$, if for almost all $x \in X$ with respect to $\rho_X$ there exist a $\tau$-quantile $t^* \in \mathbb{R}$ and constants $a_x \in (0, 2]$, $b_x > 0$ such that for each $s \in [0, a_x]$,
$$\rho\big(\{y : t^* - s < y < t^*\} \mid x\big) \geq b_x s^{q-1} \quad \text{and} \quad \rho\big(\{y : t^* < y < t^* + s\} \mid x\big) \geq b_x s^{q-1},$$
and that the function $x \mapsto (b_x a_x^{q-1})^{-1}$ lies in $L^p_{\rho_X}$. If $\rho$ has a $\tau$-quantile of $p$-average type $q$ for some such $p$ and $q$, then for any measurable function $f : X \to \mathbb{R}$ a comparison inequality holds, bounding a power of $\|\pi(f) - f_{\tau,\rho}\|_{L^r_{\rho_X}}$ with $r = pq/(p+1)$ by the excess generalization error $\mathcal{E}(\pi(f)) - \mathcal{E}(f_{\tau,\rho})$. We assume throughout the paper that $\kappa := \sup_{x, u \in X} |K(x, u)| < \infty$. Our approximation condition is given as
$$\mathcal{D}(\gamma) \leq c_0 \gamma^\beta, \quad \text{for some } 0 < \beta \leq 1; \tag{3.7}$$
here, the kernel $\tilde{K}$ is defined by
$$\tilde{K}(x, y) = \int_X K(x, u)\, K(y, u)\, d\rho_X(u).$$
Hence, although the kernel $K$ is not necessarily positive semi-definite, $\tilde{K}$ is a Mercer kernel, and $\mathcal{H}_{\tilde{K}}$ denotes the associated reproducing kernel Hilbert space. The kernel $\tilde{K}$ defines an integral operator $L_{\tilde{K}} : L^2_{\rho_X} \to L^2_{\rho_X}$ by $L_{\tilde{K}} f(x) = \int_X \tilde{K}(x, y) f(y)\, d\rho_X(y)$. For a set $\mathcal{F}$ of functions on $X$ and $\varepsilon > 0$, the $\ell^1$-empirical covering number is
$$\mathcal{N}_1(\mathcal{F}, \varepsilon) = \sup_{k \in \mathbb{N}} \sup_{\mathbf{x} \in X^k} \min\Big\{ N \in \mathbb{N} : \text{there exist } \{f_j\}_{j=1}^N \subset \mathcal{F} \text{ such that for all } f \in \mathcal{F} \text{ there is } j \text{ with } \tfrac{1}{k} \textstyle\sum_{i=1}^{k} |f(x_i) - f_j(x_i)| \leq \varepsilon \Big\}.$$
Our capacity assumption is that there exist an exponent $\mu \in (0, 2)$ and a constant $c_\mu > 0$ such that
$$\log \mathcal{N}_1(B_1, \varepsilon) \leq c_\mu \varepsilon^{-\mu}, \quad \forall \varepsilon > 0. \tag{3.12}$$
The performance of $\mathcal{H}_{\tilde{K}}$ in approaching $f_{\tau,\rho}$ can be described through the regularizing function $f_\gamma$ defined by
$$f_\gamma = \arg\min_{f \in \mathcal{H}_{\tilde{K}}} \left\{ \mathcal{E}(f) - \mathcal{E}(f_{\tau,\rho}) + \gamma \|f\|_{\tilde{K}}^2 \right\}, \tag{3.13}$$
so that $\mathcal{D}(\gamma) = \mathcal{E}(f_\gamma) - \mathcal{E}(f_{\tau,\rho}) + \gamma \|f_\gamma\|_{\tilde{K}}^2$.
Denote by $B_R = \big\{ \sum_{i=1}^{l+u} \alpha_i K(\cdot, x_i) : \sum_{i=1}^{l+u} |\alpha_i| \leq R \big\}$ the ball of radius $R > 0$ in the data-dependent hypothesis space $\mathcal{H}_{K,\mathbf{z}}$; $B_1$ in (3.12) is its unit ball.
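To illustrate the average-type condition above (a standard example, ours rather than reproduced from the paper): suppose $\rho(\cdot \mid x)$ has a density $h_x$ with $h_x(y) \geq b > 0$ whenever $|y - t^*| \leq a$. Then for $0 \leq s \leq a$,
$$\rho\big((t^* - s,\, t^*) \mid x\big) = \int_{t^*-s}^{t^*} h_x(y)\, dy \geq b s, \qquad \rho\big((t^*,\, t^* + s) \mid x\big) \geq b s,$$
so the defining inequalities hold with $q = 2$, $b_x = b$, and $a_x = \min\{a, 2\}$. Since these constants do not depend on $x$, such a $\rho$ has a $\tau$-quantile of $p$-average type 2 for every $p$, including $p = \infty$.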

Error Decomposition
Proposition 4.1. Let $f_{\mathbf{z}}$ be given by (2.6) with $\gamma_A > 0$, $\gamma_I > 0$. Then the total error $\mathcal{E}(\pi(f_{\mathbf{z}})) - \mathcal{E}(f_{\tau,\rho})$ is bounded by the sum of the sample error, the hypothesis error, the manifold error, and the regularization error.

Proof. Insert the empirical risks of $\pi(f_{\mathbf{z}})$ and of the regularizing function $f_\gamma$, together with the penalty terms, so that the total error splits into a telescoping sum. The second item on the right-hand side of the resulting expression is at most 0 by the fact that $f_{\mathbf{z}}$ minimizes (2.6); by the reason that $\gamma \leq \gamma_A$, we see that the last but one item is at most 0; and the fifth item is bounded by the manifold error. Thus we complete the proof.
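Schematically, with $\Omega(f) = \gamma_A \sum_i |\alpha_i| + \frac{\gamma_I}{(l+u)^2} \mathbf{f}^T L \mathbf{f}$ the total penalty in (2.6), $g$ a comparison function from the hypothesis space, and assuming (as is standard in this setting) that $|y| \leq 1$ almost surely so that $\mathcal{E}_{\mathbf{z}}(\pi(f)) \leq \mathcal{E}_{\mathbf{z}}(f)$, the decomposition follows the usual pattern (shown here for orientation, not as the paper's exact display):
$$\mathcal{E}(\pi(f_{\mathbf{z}})) - \mathcal{E}(f_{\tau,\rho}) + \Omega(f_{\mathbf{z}}) \;\leq\; \underbrace{\big[\mathcal{E}(\pi(f_{\mathbf{z}})) - \mathcal{E}_{\mathbf{z}}(\pi(f_{\mathbf{z}}))\big] + \big[\mathcal{E}_{\mathbf{z}}(g) - \mathcal{E}(g)\big]}_{\text{sample error}} \;+\; \underbrace{\big[\mathcal{E}_{\mathbf{z}}(f_{\mathbf{z}}) + \Omega(f_{\mathbf{z}})\big] - \big[\mathcal{E}_{\mathbf{z}}(g) + \Omega(g)\big]}_{\text{hypothesis error}} \;+\; \underbrace{\mathcal{E}(g) - \mathcal{E}(f_{\tau,\rho}) + \Omega(g)}_{\text{regularization and manifold error}}.$$
The hypothesis-error bracket compares the minimizer with the comparison function, and the graph penalty inside $\Omega(g)$ is what produces the extra manifold error term.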

Estimation of the Regularization Error
Proposition 4.2. Assume (3.7) holds. Denoting by $\mathcal{D}(\gamma)$ the regularization error defined after (3.13), we get the relationships $\mathcal{E}(f_\gamma) - \mathcal{E}(f_{\tau,\rho}) \leq \mathcal{D}(\gamma)$ and $\gamma \|f_\gamma\|_{\tilde{K}}^2 \leq \mathcal{D}(\gamma)$, from which we derive the desired bound.
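Since each summand in $\mathcal{D}(\gamma) = \mathcal{E}(f_\gamma) - \mathcal{E}(f_{\tau,\rho}) + \gamma \|f_\gamma\|_{\tilde{K}}^2$ is nonnegative, (3.7) yields the two bounds used repeatedly later (a one-line consequence we make explicit here):
$$\mathcal{E}(f_\gamma) - \mathcal{E}(f_{\tau,\rho}) \leq c_0 \gamma^\beta \qquad \text{and} \qquad \|f_\gamma\|_{\tilde{K}} \leq \sqrt{\frac{\mathcal{D}(\gamma)}{\gamma}} \leq \sqrt{c_0}\, \gamma^{(\beta - 1)/2}.$$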

Estimation of the Manifold Error
In this subsection, we estimate the manifold error, that is, the deviation of the empirical graph penalty $\frac{1}{(l+u)^2} \mathbf{f}^T L \mathbf{f}$ from its population counterpart. Denote by $\xi$ the random variable on the probability space $X \times X$ given by $\xi(x, x') = \omega(x, x')\big(f(x) - f(x')\big)^2$.

Proof. By the definition of $\xi$, the empirical graph penalty is an average of evaluations of $\xi$ at pairs of sample points, so a standard concentration argument applied over the $l + u$ sample points shows that the manifold error bound holds true.
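A quick numerical illustration of this concentration (ours, not from the paper; the identity $\mathbf{f}^T L \mathbf{f} = \frac{1}{2}\sum_{i,j} \omega_{ij}(f(x_i) - f(x_j))^2$ for $L = D - W$ lets us compute the penalty directly from the weights):

```python
import numpy as np

def manifold_penalty(x, f, sigma=0.5):
    """(1/n^2) f^T L f via f^T L f = (1/2) sum_ij w_ij (f_i - f_j)^2."""
    w = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * sigma ** 2))
    d2 = (f(x)[:, None] - f(x)[None, :]) ** 2
    return (w * d2).sum() / (2 * len(x) ** 2)

rng = np.random.default_rng(2)
for n in [50, 200, 800]:
    vals = [manifold_penalty(rng.uniform(-1, 1, n), np.sin) for _ in range(20)]
    # The mean stabilizes and the spread shrinks as n grows.
    print(n, round(float(np.mean(vals)), 4), round(float(np.std(vals)), 4))
```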

Estimation of the Hypothesis Error
This subsection is devoted to estimating the hypothesis error.
Proof. We first estimate $\mathcal{H}_1$. Recalling that $f_{\mathbf{z}}$ minimizes (2.6) and comparing its objective value with that of the comparison function, the required inequality holds, and the bound then follows. Finally, the bound for $\mathcal{H}_2$ has been proved in [8].

Estimation of the Sample Error
If the capacity assumption (3.12) holds with exponent $\mu \in (0, 2)$, then there exists a constant $c_\mu$ depending only on $\mu$ such that for any $0 < \delta < 1$, with confidence $1 - \delta$, a uniform upper bound holds for $\mathbb{E}g - \frac{1}{m}\sum_{i=1}^m g(z_i)$ over the function class. The same bound also holds true for $\frac{1}{m}\sum_{i=1}^m g(z_i) - \mathbb{E}g$. The following proposition, which has been proved in [9], will be utilized to bound $\mathcal{S}_1$.
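For orientation, confidence statements of this form ultimately rest on the one-sided Bernstein inequality (a standard fact, stated here for a single random variable rather than in the paper's notation): if $|\xi - \mathbb{E}\xi| \leq B$ almost surely and $\sigma^2 = \operatorname{Var}(\xi)$, then for every $\varepsilon > 0$,
$$\operatorname{Prob}\left\{ \frac{1}{m} \sum_{i=1}^m \xi(z_i) - \mathbb{E}\xi \geq \varepsilon \right\} \leq \exp\left( -\frac{m \varepsilon^2}{2\big(\sigma^2 + \tfrac{1}{3} B \varepsilon\big)} \right).$$
Setting the right-hand side equal to $\delta$ and solving for $\varepsilon$ produces the familiar "with confidence $1 - \delta$" bounds, and the covering-number assumption (3.12) extends this from a single $\xi$ to a whole function class via a union argument.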
Proposition 4.5. Suppose that $\rho$ has a $\tau$-quantile of $p$-average type $q$ for some $p \in (0, \infty]$ and $q \in (1, \infty)$, and that $B_R$ satisfies the capacity assumption (3.12) with some $0 < \mu < 2$. Then, for any $0 < \delta < 1$, with confidence $1 - \delta$, the stated bound holds for all $f \in B_R$. Here $C_1$ and $C_2$ are constants depending on $\kappa$, $c_\mu$, $\mu$, $\theta$, and $C_\theta$. The following proposition, which has been proved in [9], will be utilized to bound $\mathcal{S}_2$.
Proposition 4.6. Under the assumptions of Proposition 4.5, for any $0 < \delta < 1$, with confidence $1 - \delta$, the term $\mathcal{S}_2$ admits an analogous bound.

Proof. For any $g \in \mathcal{F}_r$ we first verify the required conditions. By Lemma 3.1, the variance-expectation condition for $g(z)$ is satisfied with $\theta$ given by (3.4) and $c = C_\theta$, $\beta = \theta$. Applying Lemma 4.2 to $\mathcal{F}_r$ together with (4.5.8), there exists a subset $V_r^\gamma$ of $X^{l+u}$ with measure at most $\delta/5$ outside of which the corresponding bound holds for every sample. From the process of estimating $\mathcal{S}_1$, there exists a subset $V_1$ of $X^{l+u}$ with measure at most $2\delta/5$ outside of which the bound with the $2\log(5/\delta)$ terms holds. Proposition 4.4 tells us that there exists a subset $V_2$ of $X^{l+u}$, with measure at most $2\delta/5$, with the analogous property. Set $V = V_1 \cup V_2 \cup V_r^\gamma$. Obviously, the measure of $V$ is at most $\delta$, and for every $\mathbf{x} \in X^{l+u} \setminus V$ the above inequalities hold simultaneously. Finally, combining (4.5.11), (4.5.12), and (4.5.13), the result is completed.

Convergence Radius and Main Result
Taking the union of the above confidence sets, with confidence $1 - \delta$ all the preceding bounds hold simultaneously with $2\log(10/\delta)$ factors, where we only need to choose the regularization parameters appropriately. Combining (6.2) and (6.10), we thus derive the following inequality, which plays an important role in our mathematical analysis.
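To indicate how such an inequality yields convergence rates (a generic illustration; the exponents $a, b > 0$ are placeholders, not the paper's values): if the total error is bounded by $C\big(\gamma^\beta + \gamma^{-a} m^{-b}\big)$ in the sample size $m$, balancing the two terms by setting
$$\gamma^\beta = \gamma^{-a} m^{-b} \quad \Longrightarrow \quad \gamma^* = m^{-b/(\beta + a)}$$
gives the learning rate $\mathcal{E}(\pi(f_{\mathbf{z}})) - \mathcal{E}(f_{\tau,\rho}) = O\big(m^{-\beta b/(\beta + a)}\big)$.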

Conclusion
In this paper, we have discussed the convergence rate of quantile regression with manifold regularization, which exploits the intrinsic structure of the data through the unlabeled examples. The main result establishes an upper bound for the total error and the resulting learning rate. Meanwhile, quantile regression provides a piecewise-linear yet convex technique to overcome difficulties such as a highly nonlinear dependence of the response on the predictor, for which linear models are suboptimal. Finally, the sparsity is analyzed in the $\ell^1$ space.

