Engineering, 2013, 5, 78-83
doi:10.4236/eng.2013.55B016 Published Online May 2013 (http://www.scirp.org/journal/eng)
Metasample-Based Robust Sparse Representation for
Tumor Classification
Bin Gan1, Chun-Hou Zheng1,2, Jin-Xing Liu1
1College of Information Technology and Communication, Qufu Normal University, Rizhao, China
2College of Electrical Engineering and Automation, Anhui University, Hefei, China
Email: ganbinganxinhai@126.com
Received 2013
ABSTRACT
In this paper, based on sparse representation classification and robust estimation theory, we propose a new classifier, named MRSRC (Metasample-Based Robust Sparse Representation Classifier), for DNA microarray data classification. Firstly, we extract metasamples from the training samples. Secondly, a weight matrix W is introduced into the $l_1$-regularized least squares problem. Finally, each testing sample is classified according to its sparse coefficient vector. The experimental results on DNA microarray data classification prove that the proposed algorithm is efficient.

Keywords: DNA Microarray Data; Sparse Representation Classification; MRSRC; Robust
1. Introduction
A tumor is a neoplasm arising from abnormal growth of cells. Accurate, effective and prompt treatment of a tumor is essential for the patient. Before treatment, however, classifying the tumor is an even more important task, because tumors have many types and a wrong analysis can make the treatment itself harmful.
DNA microarray is a biotechnology that simultaneously monitors the expression of tens of thousands of genes in cells. Many methods have been used for tumor classification through microarray gene expression profiling, such as independent component analysis (ICA) and nonnegative matrix factorization (NMF). Because these biological data are very large in scale and complicated in structure, they are difficult to store, search, process and analyze, which also poses a big challenge to data mining technology. New, efficient data mining techniques are therefore necessary for improving the accuracy of tumor classification.
Sparse representation has been successfully used in image processing applications [1], DNA microarray data classification [2], and text classification. Intuitively, the sparsity of the coding coefficient vector of a sample can be measured by $l_0$-norm or $l_1$-norm minimization ($l_1$-norm minimization is the closest convex function to $l_0$-norm minimization). The $l_1$-norm minimization is widely applied in sparse coding. Generally, the sparse coding problem can be formulated as

$$\min_{\alpha} \|\alpha\|_1 \quad \text{subject to} \quad \|y - D\alpha\|_2^2 \leq \varepsilon \qquad (1)$$
where y is a given signal, e.g. the gene expression profile of a sample, D is the dictionary of coding atoms, $\alpha$ is the coding vector of y over D, and $\varepsilon > 0$ is a constant.

For gene expression data, this formulation can be interpreted as follows: by coding the DNA microarray data y as a sparse linear combination of the training samples D via $l_0$-norm or $l_1$-norm minimization (here we use $l_1$-norm minimization), SRC (Sparse Representation Classification) classifies y by evaluating which class of training samples results in the minimal reconstruction error with the associated coding coefficients.
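For concreteness, here is a minimal sketch of this $l_1$ coding step in Python. It solves the Lagrangian form of Eq. (1) with scikit-learn's Lasso; the function name, the dictionary layout (genes × atoms) and the regularization weight lam are our illustrative assumptions, not values fixed by the paper.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_code(D, y, lam=0.01):
    """l1 sparse coding of sample y over dictionary D, i.e. the
    Lagrangian form of Eq. (1): min_a ||y - D a||_2^2 / (2n) + lam*||a||_1."""
    model = Lasso(alpha=lam, fit_intercept=False, max_iter=10000)
    model.fit(D, y)      # rows of D are genes, columns are training atoms
    return model.coef_   # sparse coefficient vector alpha
```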
However, there are two important issues with this model. The first is whether the $l_1$-norm is good enough to represent the signal sparsely. Much work has been done on this, for example: adding a nonnegative constraint to $\alpha$ [3]; introducing a Laplacian term on the coefficients in sparse coding [4]; designing the sparsity regularization terms with Bayesian methods [5]; and using a weighted $l_2$-norm in the sparsity constraint [6]. So the first issue has been solved well. The second issue is whether the $l_2$-norm term $\|y - D\alpha\|_2^2$ is effective enough to represent the signal fidelity when y is noisy or has many outliers. So far, only [7,8] have used the $l_1$-norm fidelity $\|y - D\alpha\|_1$ to keep the coding fidelity. We assume that the testing signal y can be represented by the training samples D, but in practice this assumption may not hold well because of noise or outliers. So neither the $l_2$-norm nor the $l_1$-norm fidelity may be robust enough for DNA microarray data classification.
To improve the robustness and effectiveness of sparse representation based classification, in this paper we propose a Metasample-Based Robust Sparse Representation Classifier (MRSRC) for DNA microarray data classification, using sparse coding and robust estimation theory. A weight matrix W is added to the $l_1$-regularized least squares problem to reduce the influence of noise and outliers.
The rest of this paper is organized as follows. Section
2 presents the proposed MRSRC model and algorithm.
The metasample model of gene expression data is first
presented, then MRSRC is given. Section 3 presents the
numerical experiments. Section 4 concludes the paper.
2. MRSRC
2.1. Classification Based on Sparse Representation

Sparse representation (SR) is an effective data processing method based on $l_1$-norm minimization. Using the SR technique, we can represent a new sample as a linear combination of the training sample set with sparse coefficients, and classify the new sample based on its sparse coefficient vector. Sparse Representation Classification (SRC) has been used in face recognition and tumor classification [9] and has performed well. The core idea of SRC is that a test sample can be well expressed using only the training samples of the same class, in which case there are only a few nonzero elements in the SR coefficient vector. In SR, we use $l_1$-norm sparsity-constrained least squares estimation to obtain the coefficient vector. SRC differs from traditional supervised learning algorithms in that it has no strict training and testing process, so it has no overfitting problem.
However, using the original training samples directly as the dictionary sometimes cannot express a new test sample well enough. At the same time, if there are too many training samples, the algorithm slows down. To solve this problem, we first extract metasamples from the training samples of each class, and then use them to construct the dictionary. The detailed process is described in the following subsection.
2.2. Metasamples of Gene Expression Data
A typical characteristic of DNA microarray data is that the number of genes is much larger than the number of samples. Generally, the number of samples is in the hundreds, while each sample contains thousands of genes. This makes many classic classification methods inapplicable to DNA microarray data analysis. Fortunately, methods that select relevant features or extract new features can solve this issue well. Many studies have used gene selection to classify tumor samples, e.g. [10-12], or extracted new features, e.g. metasamples [16]. Alter et al. [13] used SVD to transform the gene expression data from the "genes × samples" space to the diagonalized "eigengenes × eigenarrays" space, where the eigengenes are unique orthonormal superpositions of the genes. Brunet et al. [14] used NMF to describe the gene expression data through a few metasamples (Figure 1).
Metasamples of gene expression data are defined as linear combinations of several samples. We factorize the gene expression data matrix A into two matrices:

$$A \approx WH, \qquad (2)$$

where A is of size $n \times m$. Each column of A represents the expression levels of the genes in one sample, and each row represents the expression level of one gene across all samples. W is of size $n \times k$ and H is of size $k \times m$; the columns of W are the metasamples.
From the above analysis, it can be seen that there are many methods to extract the metasamples, such as SVD, PCA [13], ICA [15], NMF [14], etc. Because it is simple and fast, in this paper we use SVD to extract the metasamples.

We extract the metasamples from the samples of class i, denote them as $W_i$, and then put them together to construct the new dictionary:

$$D = [W_1, W_2, \ldots, W_k]. \qquad (3)$$
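As an illustrative sketch (assuming each class matrix is stored genes × samples, and keeping 3 metasamples per class as in Section 3.2; the function names are ours):

```python
import numpy as np

def extract_metasamples(A_i, r=3):
    """Metasamples of one class: the top-r left singular vectors
    of its genes-x-samples matrix (SVD-based extraction)."""
    U, s, Vt = np.linalg.svd(A_i, full_matrices=False)
    return U[:, :r]                      # W_i, of size n x r

def build_dictionary(class_matrices, r=3):
    """Dictionary of Eq. (3): D = [W_1, W_2, ..., W_k]."""
    return np.hstack([extract_metasamples(A_i, r) for A_i in class_matrices])
```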
2.3. The RSRC (Robust Sparse Representation Coding) Model
The sparse coding model in Eq. (1) is equivalent to the LASSO problem [22]:

$$\min_{\alpha} \|y - D\alpha\|_2^2 \quad \text{subject to} \quad \|\alpha\|_1 \leq \varepsilon, \qquad (4)$$

where $\varepsilon > 0$ is a constant, $y = [y_1; y_2; \ldots; y_n] \in R^n$ is the signal to be coded, $D = [d_1, d_2, \ldots, d_m] \in R^{n \times m}$ is the dictionary with column vector $d_j$ being its j-th atom, and $\alpha$ is the coding coefficient vector. Here the residual $e = y - D\alpha$ is assumed to follow a Gaussian distribution.

We can see that the sparse coding problem in Eq. (4) amounts to a sparsity-constrained least squares estimation problem. If e follows a Laplacian distribution, the solution becomes

$$\min_{\alpha} \|y - D\alpha\|_1 \quad \text{subject to} \quad \|\alpha\|_1 \leq \varepsilon, \qquad (5)$$
Figure 1. The metasample model of gene expression data.
However, the residual e may be far from Gaussian or Laplacian because of noise or outliers, so the conventional sparse coding models in Eq. (4) and Eq. (5) may not be robust and effective enough for DNA microarray data classification. In order to build a more robust model for DNA microarray data classification, we rewrite D as $D = [r_1; r_2; \ldots; r_n]$, where row vector $r_i$ is the i-th row of D, and set $e = y - D\alpha = [e_1; e_2; \ldots; e_n]$. Then we get $e_i = y_i - r_i\alpha$, and the $e_i$ are distributed according to some probability density function $f_\theta(e_i)$.

Taking the sparsity constraint on $\alpha$ into consideration, the RSRC can be formulated as [23]:

$$\min_{\alpha} \sum_{i=1}^{n} \rho_\theta(y_i - r_i\alpha) \quad \text{subject to} \quad \|\alpha\|_1 \leq \varepsilon, \qquad (6)$$
where $\rho_\theta(e_i) = -\ln f_\theta(e_i)$, which has the following properties: $\rho_\theta(0)$ is the global minimum of $\rho_\theta(e_i)$; $\rho_\theta(e_i) = \rho_\theta(-e_i)$; $\rho_\theta(e_i) < \rho_\theta(e_j)$ if $|e_i| < |e_j|$; and we let $\rho_\theta(0) = 0$.
From Eq. (6), we can see that the proposed RSRC model is a more general sparse coding model; Eq. (4) and Eq. (5) are special cases of it when the residual follows Gaussian or Laplacian distributions, respectively.

Now, by solving Eq. (6) we can get the coding coefficient vector $\alpha$. One key problem is how to determine the distribution $f_\theta$. From the above analysis we can see that directly taking $f_\theta$ as Gaussian or Laplacian is not effective or robust enough. So we do not determine $f_\theta$ directly to solve Eq. (6). Instead, we transform Eq. (6) into an iteratively reweighted sparse coding problem [23]:

$$\min_{\alpha} \|W^{1/2}(y - D\alpha)\|_2^2 \quad \text{subject to} \quad \|\alpha\|_1 \leq \varepsilon, \qquad (7)$$
where W is a diagonal matrix with

$$W_{i,i} = w_\theta(e_i) = \rho_\theta'(e_i)/e_i, \quad W_{i,j} = 0 \ (i \neq j), \qquad (8)$$

Eq. (7) is a weighted LASSO problem. Because W needs to be estimated by Eq. (8), Eq. (7) is only a local approximation of the RSRC in Eq. (6) at $e_0$, and W should be updated using the residuals of the previous iteration via Eq. (8). Using Eq. (7), the determination of the distribution $f_\theta$ is transformed into the determination of W.
As the logistic function has properties similar to the hinge loss function in SVM, the weight function can take

$$w_\theta(e_i) = \frac{\exp(\mu\delta - \mu e_i^2)}{1 + \exp(\mu\delta - \mu e_i^2)}, \qquad (9)$$

where $\mu$ and $\delta$ are positive scalars: $\mu$ controls the decreasing rate from 1 to 0, and $\delta$ controls the location of the demarcation point. From Eq. (9), Eq. (8) and $\rho_\theta(0) = 0$, we get

$$\rho_\theta(e_i) = \frac{1}{2\mu}\left(\ln(1 + \exp(\mu\delta)) - \ln(1 + \exp(\mu\delta - \mu e_i^2))\right), \qquad (10)$$

The sparse coding models in Eqs. (4) and (5) can be interpreted by Eq. (7): when $w_\theta(e_i) = 2$, we get the model in Eq. (4); if we let $w_\theta(e_i) = 1/|e_i|$, we get the model in Eq. (5). Eq. (7) has the following advantage: outliers are assigned low weights to reduce their effects, and the weight function of Eq. (9) is bounded in [0,1].
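As a small sketch, Eq. (9) can be computed as the logistic sigmoid of $z = \mu(\delta - e_i^2)$; the numerically stable branching below is our addition, not part of the paper:

```python
import numpy as np

def weight(e, mu, delta):
    """Eq. (9): w(e_i) = exp(mu*delta - mu*e_i^2) / (1 + exp(mu*delta - mu*e_i^2)),
    i.e. the logistic sigmoid of z = mu*(delta - e_i^2), bounded in [0, 1]."""
    z = mu * (delta - np.asarray(e) ** 2)
    ez = np.exp(-np.abs(z))                     # never overflows
    return np.where(z >= 0, 1.0 / (1.0 + ez), ez / (1.0 + ez))
```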
According to Eqs. (7)-(10), the detailed algorithm to obtain $\alpha$ is as follows.

Start from t = 1:

1) Compute the residual $e^{(t)} = y - y_{rec}^{(t)}$, where $y_{rec}^{(1)}$ is initialized by Eq. (11).

2) Estimate the weights as

$$w_\theta(e_i^{(t)}) = \frac{\exp(\mu^{(t)}\delta^{(t)} - \mu^{(t)}(e_i^{(t)})^2)}{1 + \exp(\mu^{(t)}\delta^{(t)} - \mu^{(t)}(e_i^{(t)})^2)}, \qquad (14)$$

where $\mu^{(t)}$ and $\delta^{(t)}$ are the parameters estimated in the t-th iteration (see Section 3.1).

3) Sparse coding:

$$\alpha^* = \arg\min_{\alpha} \|(W^{(t)})^{1/2}(y - D\alpha)\|_2^2 \quad \text{subject to} \quad \|\alpha\|_1 \leq \varepsilon,$$

where $W^{(t)}$ is the estimated diagonal weight matrix with $W_{i,i}^{(t)} = w_\theta(e_i^{(t)})$.

4) Update the sparse coding coefficients:

If t = 1, $\alpha^{(t)} = \alpha^*$;

If t > 1, $\alpha^{(t)} = \alpha^{(t-1)} + \eta^{(t)}(\alpha^* - \alpha^{(t-1)})$,

where $0 < \eta^{(t)} < 1$ is the step size, which can be searched from 1 to 0 by the standard line-search process.

5) Compute the reconstructed test sample $y_{rec}^{(t)} = D\alpha^{(t)}$, and let t = t + 1.

6) Return to step 1) until convergence or until the maximal number of iterations is reached.

When the iterations finish, output the sparse coefficient vector $\alpha$.
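Putting steps 1)-6) together, the loop might look like the following minimal sketch. The off-the-shelf LASSO solver, the fixed step size eta (the paper line-searches $\eta^{(t)}$) and the regularization weight lam are simplifying assumptions of ours; c and zeta follow Section 3.1.

```python
import numpy as np
from sklearn.linear_model import Lasso

def rsrc(D, y, y_init=None, lam=0.01, c=8.0, zeta=0.8,
         eta=1.0, max_iter=10, gamma=1e-3):
    """Sketch of the iteratively reweighted sparse coding of Eqs. (7)-(12)."""
    n = D.shape[0]
    # Eq. (11): initialize the reconstruction with the mean training
    # sample m_D (here the dictionary mean is used as a stand-in).
    y_rec = D.mean(axis=1) if y_init is None else y_init
    alpha, w_prev = None, None
    for t in range(max_iter):
        e = y - y_rec                                     # step 1: residual
        psi = np.sort(e ** 2)                             # ascending, Section 3.1
        delta = psi[max(int(np.floor(zeta * n)) - 1, 0)]  # Eq. (15), 0-based
        mu = c / delta
        z = mu * (delta - e ** 2)
        ez = np.exp(-np.abs(z))
        w = np.where(z >= 0, 1.0 / (1.0 + ez), ez / (1.0 + ez))  # step 2: Eq. (9)
        sw = np.sqrt(w)
        lasso = Lasso(alpha=lam, fit_intercept=False, max_iter=10000)
        lasso.fit(sw[:, None] * D, sw * y)                # step 3: weighted LASSO
        a_star = lasso.coef_
        alpha = a_star if alpha is None else alpha + eta * (a_star - alpha)  # step 4
        y_rec = D @ alpha                                 # step 5: reconstruction
        if w_prev is not None and \
           np.linalg.norm(w - w_prev) / np.linalg.norm(w_prev) < gamma:
            break                                         # Eq. (12): convergence
        w_prev = w
    return alpha
```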
2.4. Algorithm of MRSRC
When we get a testing sample y, in order to initialize the weights, we set e as $e = y - y_{ini}$. In this paper, we compute $y_{ini}$ as

$$y_{ini} = m_D, \qquad (11)$$

where $m_D$ is the mean of all training samples.

In this algorithm, W changes with e via Eq. (8) at every iteration. We stop the iteration when the following condition is satisfied:

$$\|W^{(t)} - W^{(t-1)}\|_2 / \|W^{(t-1)}\|_2 < \gamma, \qquad (12)$$

where $\gamma$ is a small positive scalar and t is the iteration number.
After the iterations, we get the coefficient vector $\alpha$, and then classify y using the following function:

$$\min_i r_i(y) = \|y - D\delta_i(\alpha)\|_2, \qquad (13)$$

where $\hat{y}_i = D\delta_i(\alpha)$ is the testing sample rebuilt from the metasamples of the i-th class ($\delta_i(\alpha)$ keeps the coefficients associated with class i and sets the rest to zero). Then we can classify y according to the difference between y and $\hat{y}_i$: for example, if the difference between y and $\hat{y}_3$ is the smallest, y is classified to the 3rd class.
The classification algorithm is summarized as follows:

Input: matrix of training samples $A = [A_1, A_2, \ldots, A_k] \in R^{m \times n}$ for k classes; testing sample $y \in R^m$; and $y_{rec}^{(1)}$ initialized as $y_{ini}$.

Step 1: Normalize the columns of A to have unit $l_2$-norm.

Step 2: Extract the metasamples of every class using SVD and get D.

Step 3: Calculate the sparse coefficient vector $\alpha$ using Eq. (7).

Step 4: Compute the residuals $r_i(y) = \|y - D\delta_i(\alpha)\|_2$.

Output: $identity(y) = \arg\min_i r_i(y)$, i.e., the class to which y belongs.
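A minimal sketch of Step 4 and the output rule; the class_ids array, recording which class each dictionary column of $W_i$ came from, is our bookkeeping device rather than the paper's notation:

```python
import numpy as np

def identity(D, class_ids, alpha, y):
    """Eq. (13): rebuild y from each class's coefficients delta_i(alpha)
    and return the class label with the smallest residual."""
    classes = np.unique(class_ids)
    residuals = [np.linalg.norm(y - D @ np.where(class_ids == c, alpha, 0.0))
                 for c in classes]   # zero out coefficients of other classes
    return classes[int(np.argmin(residuals))]
```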
From the algorithm it can be seen that MRSRC is the combination of robust sparse coding and metasample-based dictionary construction. In MRSRC, the complexity of the algorithm depends on the number of iterations t, which in turn depends on the percentage of outliers in the DNA microarray data set. Generally, two iterations suffice, unless the percentage of outliers is very large; in that case, t should be set to about 10 to ensure that the algorithm reaches convergence.
3. Experimental Results
In this section, experiments were performed to show the
efficiency of the proposed method.
3.1. Parameter Selection
In the weight function Eq. (9) there are two parameters, $\delta$ and $\mu$. $\delta$ is the parameter of the demarcation point, and we compute its value as follows. Denote $\psi = [(e_1)^2, (e_2)^2, \ldots, (e_n)^2]$. We sort the elements of $\psi$ in ascending order to get a new array $\psi_a$. Let $k = \lfloor \zeta n \rfloor$, where the scalar $\zeta \in (0, 1]$ and $\lfloor \zeta n \rfloor$ outputs the largest integer smaller than $\zeta n$. We set $\delta$ as

$$\delta = \psi_a(k). \qquad (15)$$

$\mu$ controls the decreasing rate of the weight value from 1 to 0. We set $\mu = c/\delta$, where c is a constant. Based on extensive experiments, we usually set c to 8 and $\zeta$ to 0.8 in our experiments.
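As a short sketch, both parameters can be computed from the current residual as follows (the clamping to a valid 0-based index is our assumption):

```python
import numpy as np

def demarcation(e, zeta=0.8, c=8.0):
    """delta via Eq. (15) (the floor(zeta*n)-th smallest squared residual)
    and mu = c / delta."""
    psi = np.sort(np.asarray(e) ** 2)               # ascending order
    k = max(int(np.floor(zeta * len(psi))) - 1, 0)  # 0-based index
    delta = psi[k]
    return delta, c / delta
```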
3.2. Two-Class Classification
In this subsection, we use three microarray data sets to study tumor classification: the Colon cancer data set, the Prostate cancer data set and the DLBCL data set. The data set details are listed in Table 1. All three data sets have two classes of samples. The Colon data set has 2000 genes, the Prostate data set has 12600 genes and the DLBCL data set has 5469 genes.

We used the proposed method to classify these data sets. For comparison, we also used three other methods, i.e., SVM, LASSO and SRC, to classify the same data sets. The classification results are listed in Table 2. In our method, SVD is used to extract the metasamples of the gene expression data; here, we extract 3-dimensional metasamples from each of the two classes. And as the sample sizes are not big enough, we use nested stratified 10-fold cross validation to get a more accurate result.

From Table 2 we can see that MRSRC has good classification performance on the Colon cancer and DLBCL data sets. On the Prostate cancer data set, although MRSRC is not better than SRC, it still has an advantage over SVM and LASSO. SRC is best on the Prostate data set, but not as good as MRSRC on the other two data sets. In all, MRSRC obtains good results in this two-class classification experiment.
To better illustrate the results, we show the accuracies of MRSRC in Figure 2 when t = 10. Nested stratified 10-fold cross validation is also used here.
Table 1. Three types of tumors for DNA classification experiments.

Datasets          Samples (Class 1)   Samples (Class 2)   Genes
Colon cancer      40                  22                  2000
Prostate cancer   77                  59                  12600
DLBCL             58                  19                  5469
Table 2. The classification accuracies (%) by different methods.

Datasets          SVM     LASSO   SRC     MRSRC-SVD
Colon cancer      85.48   85.48   85.48   90.32
Prostate cancer   91.18   91.91   94.85   93.38
DLBCL             96.10   96.10   97.40   98.70
Figure 2. The classification accuracy with t = 10.
The x-axis represents the metasample dimension k, and the y-axis represents the classification accuracy. From Figure 2, we can see that the best classification accuracy is reached when the dimension of the metasamples is 3. This result fully shows the advantage of metasamples in reducing computational complexity.
3.3. Multiclass Classification
To further investigate the performance of our method, we also used five multiclass data sets in our experiments. All five data sets were produced by oligonucleotide microarrays: the Lung cancer data set [17], which contains 4 lung cancer types and includes 203 samples with 12600 genes; the Leukemia data set [18], which contains 3 kinds of samples and includes 72 samples with 11225 genes; the SRBCT (small round blue cell tumors of childhood) data set [19], which contains 4 types of tumors and includes 83 samples with 2308 genes; the 11_Tumors data set [20], which contains 11 human tumor types and includes 174 samples with 12533 genes; and the 9_Tumors data set [21], which contains 9 human tumor types and includes 60 samples with 5726 genes.
The results are listed in Table 3. From the experiments we can see that, for multiclass classification, the proposed MRSRC does not have clear advantages over SVM and SRC. The reason is that the training samples are so few that the extracted metasamples cannot fully represent the information of these classes. For example, the 9_Tumors data set has only 60 samples but 9 classes; one class has only 7 samples, so the training samples cannot fully represent the testing samples.
4. Conclusions
In this paper, using sparse coding and robust estimation theory, we proposed a Metasample-Based Robust Sparse Representation Classifier (MRSRC). Comparing MRSRC with SRC and SVM on various types of DNA microarray data, the experimental results validated that MRSRC is effective and efficient in tumor classification. One important advantage of MRSRC is its robustness to various types of outliers and noise, because the reweighting function reduces the effect of outliers in each iteration.
Table 3. The multiclass classification accuracies (%) by different methods.
Dataset SVM SRC MRSRC-SVD
Lung cancer 96.05 95.07 95.07
Leukemia 96.60 95.83 97.22
SRBCT 100 100 100
11_Tumors 94.68 94.83 95.40
9_Tumors 65.10 66.67 60.00
It should be noted that our method is based on the assumption that any testing sample can be well represented as a linear combination of the training samples from the same class. This means that the training samples should be sufficiently numerous; otherwise, the experimental results may not be satisfactory. In the future, we will use gene selection together with methods such as SVD or NMF to further reduce the dimension of the training samples, speed up the calculation and improve the classification accuracy.
5. Acknowledgements
This work was supported by the National Science Foun-
dation of China under Grant No. 61272339, the Science
Foundation of Anhui Province under Grant No.
1308085MF85, the China Postdoctoral Science Founda-
tion Funded Project under Grant No. 2012M510091, and
the Key Project of Anhui Educational Committee under
Grant No. KJ2012A005.
REFERENCES
[1] J. Mairal, M. Elad and G. Sapiro, “Sparse Representation for Color Image Restoration,” IEEE Transactions on Image Processing, Vol. 17, No. 1, 2008, pp. 53-69.
[2] C. H. Zheng, L. Zhang, T. Y. Ng, Simon C. K. Shiu and
D. S. Huang, “Metasample-Based Sparse Representation
for Tumor Classification,” IEEE/ACM Transactions on
Computational Biology and Bioinformatics, Vol. 8, No. 5,
2011, pp. 1273-1282. doi:10.1109/TCBB.2011.20
[3] Y. N. Liu, F. Wu, Z. H. Zhang, Y. T. Zhuang and S. C.
Yan, “Sparse face recognition under nonnegative curds
and whey,” IEEE Conference on Computer Vision and
Pattern Recognition, 2010, p. 626.
[4] S. H. Gao, I. W. H. Tsang, L. T. Chia and P. L. Zhao.
“Local Features Are not Lonely-Laplacian Sparse Coding
for Image Classification,” IEEE Conference on computer
vision and pattern recognition, 2010, p. 626.
[5] S. H. Ji, Y. Xue and L. Carin. “Bayesian Compressive
Sensing,” IEEE Transactions on Signal Processing, Vol.
56, No. 6, 2008, pp. 2346-2356.
[6] J. J. Wang, J. C. Yang, K. Yu, F. J. Lv, T. Huang and Y.
H. Gong, “Locality-Constrained Linear Coding for Image
Classification,” IEEE Conference on Computer Vision
and Pattern Recognition, 2010, p. 626.
[7] J. Wright and Y. Ma, “Dense Error Correction Via l1
Minimization,” IEEE Transactions on Information The-
ory, Vol. 56, No. 7, 2010, pp. 3540-3560.
[8] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry and Y. Ma,
“Robust Face Recognition Via Sparse Representation,”
IEEE Transactions on pattern analysis and machine in-
telligence, Vol. 31, No. 2, 2009, pp. 210-227.
[9] X. Huang and F. X. Wu, “Sparse Representation for
Classification of Tumors Using Gene Expression Data,”
Journal of Biomedicine and Biotechnology, Vol. 2009,
No. 1, 2009, pp. 1-6. doi:10.1155/2009/403689
[10] T. R. Golub, D. K. Slonim et al., “Molecular Classifica-
tion of Cancer: Class Discovery and Class Prediction by
Gene Expression Monitoring,” Science, Vol. 286, 1999,
pp. 531-537. doi:10.1126/science.286.5439.531
[11] J. G. Liao and K. V. Chin, “Logistic Regression for Dis-
ease Classification Using Microarray Data: Model Selec-
tion in a Large p and Small n Case,” Bioinformatics, Vol.
23, No. 15, 2007, pp. 1945-1951.
doi:10.1093/bioinformatics/btm287
[12] H. H. Zhang, J. Ahn, X. Lin and C. Park, “Gene Selection
Using Support Vector Machines with Non-Convex Pen-
alty,” Bioinformatics, Vol. 22, 2006, pp. 88-95.
doi:10.1093/bioinformatics/bti736
[13] O. Alter, P. O. Brown and D. Botstein, “Singular Value
Decomposition for Genome-Wide Expression Data Proc-
essing and Modeling,” Proceedings of the National Academy of Sciences of the United States of America, Vol.
97, 2000, pp. 10101-10106.
doi:10.1073/pnas.97.18.10101
[14] J. P. Brunet, P. Tamayo, T. R. Golub and J. P. Mesirov, “Metagenes and Molecular Pattern Discovery Using Matrix Factorization,” Proceedings of the National Academy of Sciences of the United States of America, Vol. 101, No. 12, 2004, pp. 4164-4169. doi:10.1073/pnas.0308531101
[15] D. S. Huang and C. H. Zheng, “Independent Component
Analysis-Based Penalized Discriminant Method for Tu-
mor Classification Using Gene Expression Data,” Bioin-
formatics, Vol. 22, No. 15, 2006, pp. 1855-1862.
doi:10.1093/bioinformatics/btl190
[16] W. Liebermeister, “Linear Modes of Gene Expression
Determined by Independent Component Analysis,” Bio-
informatics, Vol. 18, 2002, pp. 51-60.
doi:10.1093/bioinformatics/18.1.51
[17] A. Bhattacharjee et al., “Classification of Human Lung
Carcinomas by mRNA Expression Profiling Reveals Dis-
tinct Adenocarcinoma Subclasses,” Proceedings of the National Academy of Sciences of the United States of America, Vol. 98, 2001, pp. 13790-13795.
doi:10.1073/pnas.191502998
[18] S. A. Armstrong, J. E. Staunton, L. B. Silverman, R.
Pieters, M. L. den Boer, M. D. Minden, S. E. Sallan, E. S.
Lander, T. R. Golub and S. J. Korsmeyer, “MLL Translo-
cations Specify a Distinct Gene Expression Profile that
Distinguishes a Unique Leukemia,” Nature Genetics, Vol.
30, 2002, pp. 41-47. doi:10.1038/ng765
[19] J. Khan, J. S. Wei, M. Ringner, L. H. Saal, M. Ladanyi, F.
Westermann, F. Berthold, M. Schwab, C. R. Antonescu,
C. Peterson and P. S. Meltzer, “Classification and Diag-
nostic Prediction of Cancers Using Gene Expression Pro-
filing and Artificial Neural Networks,” Nature Medicine,
Vol. 7, No. 6, 2001, pp. 673-679. doi:10.1038/89044
[20] A. I. Su et al., “Molecular Classification of Human Car-
cinomas by Use of Gene Expression Signatures,” Cancer
Research, Vol. 61, No. 20, 2001, pp. 7388-7393.
[21] J. E. Staunton et al., “Chemosensitivity Prediction by
Transcriptional Profiling,” Proceedings of the National Academy of Sciences of the United States of America, Vol.
98, No. 19, 2001, pp. 10787-10792.
doi:10.1073/pnas.191368598
[22] R. Tibshirani, “Regression Shrinkage and Selection Via
the Lasso,” Journal of the Royal Statistical Society B, Vol.
58, No. 1, 1996, pp. 267-288.
[23] M. Yang, L. Zhang, J. Yang et al., “Regularized Robust Coding for Face Recognition,” 2012.