Dynamic Spatial Discrimination Maps of Discriminative Activation between Different Tasks Based on Support Vector Machines

As a set of supervised pattern recognition methods, support vector machines (SVMs) have been successfully applied in the field of functional magnetic resonance imaging (fMRI), but few studies have focused on dynamically visualizing the discriminative regions of the whole brain between different cognitive tasks. This paper presents an SVM-based method for dynamically visualizing the discriminative activation of whole-brain voxels between two kinds of tasks without any contrast. Our method provides a series of dynamic spatial discrimination maps (DSDMs), which represent the temporal evolution of discriminative brain activation during a duty cycle and describe how the discriminating information changes over the duty cycle. The proposed method was applied to a hand-motor task experiment to investigate the discriminative functional activation of whole-brain voxels dynamically. A set of DSDMs between left hand movement and right hand movement was obtained. Our results demonstrate not only where but also when the discriminative activations of whole-brain voxels occurred between left hand movement and right hand movement during one duty cycle.


Introduction
Support vector machines (SVMs) [1][2][3] are a set of related supervised learning methods. SVMs have been applied to functional magnetic resonance imaging (fMRI) [4] data analysis [5][6][7][8][9][10][11][12][13][14][15]. Few of these studies have focused on assessing discriminative activation dynamically [6,11]. Mitchell et al. [6] used the fMRI-sequence$(t_1, t_2)$, i.e., the sequence of fMRI images collected during the contiguous time interval $[t_1, t_2]$, as input to classifiers built with different machine learning methods, including Gaussian Naive Bayes (GNB), SVM and k Nearest Neighbor (kNN). Mourão-Miranda et al. [11] used an SVM termed a spatial-temporal SVM to obtain a dynamic discrimination map, which shows the discriminating weight of each voxel for each time point (TR) within the duty cycle. With this approach it is possible to observe dynamic changes in the brain during the performance of a task or a cognitive state, and to explore brain activity in finer temporal detail. However, this method cannot be adapted to fMRI data of a single subject because, in general, the number of spatiotemporal observations of a single subject is much smaller than the number of features (voxels) in a spatiotemporal observation, so the number of training samples (spatiotemporal observations) is too small to train an SVM-based classifier.
In the present paper, a method based on principal component analysis (PCA) [16,17] and SVM is proposed to investigate discriminative functional brain activation between different tasks dynamically, using a hand-motor task experiment. Our method yields a series of dynamic spatial discrimination maps (DSDMs), which represent the temporal evolution of discriminative brain activation during a duty cycle and describe how the discriminating information changes over the duty cycle. The DSDMs allow us to visualize the discriminative regions of the whole brain between different cognitive tasks dynamically, without computing any contrast with a conventional method such as statistical parametric maps (SPMs) [4]. Moreover, our method is well suited to the single-subject case and thus overcomes the disadvantage described above. The proposed method was applied to fMRI data from a hand-motor experiment to investigate the discriminative activation of whole-brain voxels dynamically between left hand movement and right hand movement.
This paper is organized as follows. Section 2 describes the SVM-based method. In Section 3, the proposed method is applied to fMRI data from a left and right finger movement experiment. Some conclusions are drawn in Section 4.

Dimensionality Reduction
PCA searches for the directions of largest variance in the data and uses them to project the data into a new orthogonal coordinate system; the output is a lower-dimensional representation of the original data [14]. In the current study, PCA was applied only for data compression without losing information. PCA was performed on the selected data of one subject only, and the training data were projected onto the resulting singular vectors (basis). For a detailed description of PCA, we refer to the literature [9].
Before the dimensionality reduction, the experimental data were preprocessed using SPM2 software (http://www.fil.ion.ucl.ac.uk/spm). Reconstructed images were corrected for slice timing effects and motion artifacts and transformed to standard space [18] at 3 × 3 × 3 mm³; spatial smoothing with an isotropic Gaussian kernel of 8 mm FWHM was also performed to increase the MR signal-to-noise ratio. The baseline and linear trend components were removed by applying a regression model to each voxel. Finally, a mask containing brain tissue for all subjects was applied to select voxels.
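Let $D$ be the $M \times N$ preprocessed fMRI data matrix with one volume per column and one voxel per row, and let $\tilde{D}$ be $D$ with the average volume of the data set subtracted from each column. The submatrix $\tilde{D}_i = (v_{1i}, \dots, v_{ti}, \dots, v_{Ti})$ of $\tilde{D}$ presents the continuous observations (Figure 1) within duty cycle $i$ [11], with each column $v_{ti}$ being one volume. PCA replaces each volume $v_{ti}$ by $v^p_{ti}$, where $v^p_{ti}$, of dimension $N$, is the projection of the volume $v_{ti}$ onto the PCA basis.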
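The projection step can be illustrated with a minimal NumPy sketch. It assumes that the PCA basis is obtained from a thin singular value decomposition of the mean-subtracted data matrix; the function name `pca_project` and the toy dimensions are hypothetical and are not taken from the original implementation.

```python
import numpy as np

def pca_project(D):
    """Project the volumes (columns of D) onto a PCA basis.

    D: (M voxels x N volumes) preprocessed data matrix.
    Returns (U, P): U is an (M x N) matrix with orthonormal columns
    (the basis) and P is the (N x N) matrix of projected volumes,
    one projected volume per column.
    """
    # Subtract the average volume of the data set from each column.
    D_centered = D - D.mean(axis=1, keepdims=True)
    # Thin SVD: D_centered = U @ diag(s) @ Vt, with U of shape (M, N) when M >= N.
    U, s, Vt = np.linalg.svd(D_centered, full_matrices=False)
    # Coordinates of every volume in the basis U.
    P = U.T @ D_centered
    return U, P

# Toy data (assumption): 500 voxels, 40 volumes.
rng = np.random.default_rng(0)
D = rng.normal(size=(500, 40))
U, P = pca_project(D)
```

Because the number of volumes is much smaller than the number of voxels, keeping all components makes `U @ P` reproduce the mean-subtracted data, which is one way to read "compression without losing information".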

Support Vector Machines
Support vector machines (SVMs) [1][2][3] are supervised learning methods that have been applied to fMRI data [5][6][7][8][9][10][11][12][13][14]. The main idea of an SVM is to find the maximum-margin hyperplane (defined by the normal $w$ and the distance $b$ to the origin of the multi-dimensional space) that divides the points having $y_i = 1$ (class 1) from those having $y_i = -1$ (class 2). The primal form assumes that the data are linearly separable. The problem of finding the optimal separating hyperplane can be expressed by the following optimization problem:
$$\min_{w,b} \frac{1}{2}\|w\|^2 \quad \text{subject to} \quad y_i (w \cdot x_i - b) \ge 1, \quad 1 \le i \le N, \tag{1}$$
where $x \cdot y$ denotes the inner product of $x$ and $y$. In the general case of overlapping classes, the problem of finding the optimal separating hyperplane that maximizes the distance to the nearest training points of the two classes is defined as the following optimization problem:
$$\min_{w,b,\xi} \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i \quad \text{subject to} \quad y_i (w \cdot x_i - b) \ge 1 - \xi_i, \tag{5}$$
where $\xi_i \ge 0$, $1 \le i \le N$, are slack variables that account for training errors and $C$ is a positive real constant that appears in the dual form only as an additional constraint on the Lagrangian multipliers.
Both cases above can be translated into their dual form. Using the Lagrangian multiplier method, the solution can be found as [2]:
$$w = \sum_{i=1}^{N} \alpha_i y_i x_i,$$
where $\alpha_i \ge 0$ is the Lagrangian multiplier, which is constrained by $0 \le \alpha_i \le C$. In the present work, a linear SVM is used to determine a weight vector $w^p$ (corresponding to a hyperplane in eigen space) comprising the discriminative information of the spatiotemporal observations in eigen space. The input to the SVM consists of examples of the form $(x_i, y_i)$, where $x_i$ represents a spatiotemporal observation in eigen space (see Figure 1) and $y_i$ is the task label ($y_i = 1$ for the right hand movement task cycle (task 1) and $y_i = -1$ for the left hand movement task cycle (task 2)).
A spatiotemporal observation in eigen space is described in Figure 1. In the eigen space, a single spatiotemporal observation, e.g., $V^p_{sti} = (v^{p\prime}_{1i}, \dots, v^{p\prime}_{ti}, \dots, v^{p\prime}_{Ti})'$, is a vector consisting of $T = 20$ projected vectors of volumes within duty cycle $i$, which comprises a task block (10 volumes) and the following rest block (10 volumes). The spatiotemporal data are represented by a spatiotemporal observation matrix.
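As an illustration of how the projected volumes of one duty cycle can be stacked into a single spatiotemporal observation, the following sketch concatenates $T$ consecutive projected volumes in temporal order. The function name and the `cycle_starts` argument (the index of the first volume of each duty cycle) are hypothetical, and storing one observation per row is a choice made here for convenience.

```python
import numpy as np

def build_spatiotemporal_observations(P, cycle_starts, T=20):
    """Stack T consecutive projected volumes into one observation per cycle.

    P: (n_components x n_volumes) projected volumes, one per column.
    cycle_starts: index of the first volume of each duty cycle (assumed input).
    Returns X of shape (n_cycles, T * n_components), one observation per row.
    """
    rows = []
    for start in cycle_starts:
        block = P[:, start:start + T]        # (n_components x T)
        # Row-major flattening of block.T lists volume 1 first, then volume 2, ...
        rows.append(block.T.reshape(-1))
    return np.vstack(rows)

# Toy example (assumption): 2 components, 6 volumes, cycles of T = 3 volumes.
P = np.arange(12).reshape(2, 6)
X = build_spatiotemporal_observations(P, cycle_starts=[0, 3], T=3)
```

Each row of `X`, paired with its task label, is then one training example for the linear SVM.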

Dynamic Spatial Discrimination Maps
Once the weight vector $w^p$ has been obtained, we can project its sub-blocks $w^p_1, w^p_2, \dots, w^p_T$ into the original fMRI data space through the PCA basis, obtaining $w_1, w_2, \dots, w_T$. Each $w_t$ represents a map of the most discriminating regions, i.e., a discriminating volume [9,10] or a spatial discriminance map (SDM) [12]. All weight vectors $w_1, w_2, \dots, w_T$ were normalized to have the same scale. Thus, we obtained a set of dynamic spatial discrimination maps (DSDMs) $w_1, w_2, \dots, w_T$.
As a nonparametric technique, the permutation test [20] has previously been applied to fMRI data analysis. This technique was used to determine the threshold of the DSDMs. First, we applied the proposed method to the training data, i.e., the spatiotemporal observations in eigen space, and obtained the normalized weight vector $w$ (Figure 2), which contains the DSDMs $w_1, w_2, \dots, w_T$ as its sub-blocks. Then, under the null hypothesis of no relationship between the class labels and the global structure of the spatiotemporal observations in eigen space, the class labels were randomly permuted 2000 times, and each time our method was applied to the spatiotemporal observations with the permuted labels to produce a normalized weight vector.
The performance of the classifier was estimated using the conventional leave-one-out cross-validation test [21], which has been used to test the performance of SVM classifiers in several previous fMRI studies [9][10][11][12][13][14]. In the present work, 50 percent of all the spatiotemporal observations in eigen space are used for training the classifier and the rest are used for evaluating the classifier.
The method described above is summarized in Figure 2.
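The back-projection and the permutation test can be sketched as follows. Several points are assumptions rather than details confirmed by the text: the sub-blocks are mapped to voxel space by multiplication with the PCA basis `U`, the maps are scaled by their common maximum absolute value, and a voxel-wise permutation p-value is computed as the fraction of permutations whose absolute weight is at least as large as the observed one. The `mean_difference` function is only a stand-in for the SVM trainer so that the block is self-contained.

```python
import numpy as np

def back_project(w_p, U, T):
    """Map an eigen-space weight vector to T voxel-space maps.

    w_p: (T * r,) weight vector, made of T sub-blocks of length r.
    U:   (M x r) PCA basis with orthonormal columns (assumed).
    Returns W of shape (T, M), scaled so that the largest |value| is 1.
    """
    r = U.shape[1]
    blocks = np.asarray(w_p, dtype=float).reshape(T, r)  # row t = sub-block t
    W = blocks @ U.T                                     # row t = U @ w_t
    return W / np.abs(W).max()

def permutation_p_values(X, y, U, T, fit_weights, n_perm=2000, seed=0):
    """Voxel-wise permutation p-values for the absolute map values."""
    rng = np.random.default_rng(seed)
    observed = np.abs(back_project(fit_weights(X, y), U, T))
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        # Refit with randomly permuted class labels (null hypothesis).
        w_null = fit_weights(X, rng.permutation(y))
        exceed += np.abs(back_project(w_null, U, T)) >= observed
    return (exceed + 1.0) / (n_perm + 1.0)

def mean_difference(X, y):
    """Stand-in for the SVM trainer (NOT an SVM): difference of class means."""
    return X[y == 1].mean(axis=0) - X[y == -1].mean(axis=0)

# Toy data (assumption): T = 2 time points, r = 2 components, M = 3 voxels.
rng = np.random.default_rng(1)
T, r, M = 2, 2, 3
U = np.linalg.qr(rng.normal(size=(M, r)))[0]
X = rng.normal(size=(10, T * r))
y = np.array([1, -1] * 5)
p = permutation_p_values(X, y, U, T, mean_difference, n_perm=200)
```

With 2000 permutations the smallest attainable p-value under this definition is 1/2001, which is below the 0.001 level used for display.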

Experimental Design
Stimuli were presented in a blocked fashion. There were two active conditions, left hand movement and right hand movement, and a control condition (rest). The subjects were required to concentrate on the fixation cross in the control condition, to move their right hand only when the symbol R was presented, and to move their left hand only when the symbol L was presented. Each run comprised 16 blocks and each block lasted 20 seconds (20 seconds of rest, 20 seconds of right hand movement, 20 seconds of rest and 20 seconds of left hand movement, repeated in this order). Before the fMRI experiment, subjects were trained for about 1 hour to ensure that they could perform the task correctly.

Statistical Parametric t-Maps (SPMt)
The statistical parametric t-maps between the left hand movement and right hand movement blocks (right hand movement > left hand movement) were obtained by general linear model (GLM) [4] analysis using SPM2 (http://www.fil.ion.ucl.ac.uk/spm). All voxels with an absolute t-value above the threshold of 5.85 (p < 0.001, corrected) are shown in Figure 3 in color scale (light/dark blue for negative values and red/orange for positive values). Correspondingly, the active regions, the Talairach coordinates of the voxel with the maximum t-value, the cluster size and the Brodmann area of the active voxels for left hand movement and right hand movement are listed in Table 1, where only regions with cluster size > 10 are displayed.

Dynamic Spatial Discrimination Maps
For each of the six subjects, our method was applied to produce a sequence of 20 dynamic spatial weight vectors (scaled to the range from −1.0 to 1.0 by dividing each element of a vector by the maximum absolute value over all the weight vectors), which we term "dynamic spatial discrimination maps" (DSDMs) to distinguish them from the "dynamic discrimination map" of [11].
The color scale identifies the most discriminating regions for each time point (light/dark blue for negative values, i.e., relatively more activation for left hand movement, and red/orange for positive values, i.e., relatively more activation for right hand movement). The first 10 rows (Figures 4(a) and (b)) correspond to time points during left hand movement and right hand movement, and the following 10 rows (Figures 4(c) and (d)) correspond to time points during the rest condition. The DSDMs describe how the discriminating information changes over one duty cycle. It can be seen from Figure 4 that no voxels with a highly discriminating weight were found at 2 s. At 4 s the first discriminating areas between left hand movement and right hand movement appear in red and blue; they increase continuously until the fifth or sixth TR and decrease after the end of the image presentation. At 26 s only a few discriminating voxels remained in primary motor cortex, and they had disappeared completely from 28 s to 40 s. There is a 4-6 s BOLD signal delay (Figure 4(c)). It is possible to observe how the primary motor cortex discriminates between left hand movement and right hand movement through time. Figure 5 shows that more discriminating voxels appear for right hand movement than for left hand movement at all time points from 2 s to 24 s, which suggests, to some extent, asymmetrical cortical activity in the left and right hemispheres of the brain.

Performance of Classifier
Repeating the leave-one-out cross-validation test [21] 1000 times, we obtained a classifier performance of 88 ± 5.1%.
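For readers unfamiliar with the procedure, a leave-one-out estimate for a linear soft-margin SVM can be sketched as below. The trainer is a plain batch subgradient descent on the primal objective, used here only as a simple substitute for a quadratic programming solver; the step size, iteration count, function names and synthetic data are all assumptions and do not reproduce the experiment reported above.

```python
import numpy as np

def train_linear_svm(X, y, C=1.0, n_iter=500, lr=0.01):
    """Soft-margin linear SVM fitted by batch subgradient descent.

    Minimizes 0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * (w . x_i - b)).
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        margins = y * (X @ w - b)
        active = margins < 1                   # points violating the margin
        grad_w = w - C * (y[active, None] * X[active]).sum(axis=0)
        grad_b = C * y[active].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def leave_one_out_accuracy(X, y):
    """Train on all observations but one, test on the held-out one, repeat."""
    correct = 0
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        w, b = train_linear_svm(X[train], y[train])
        prediction = 1 if X[i] @ w - b >= 0 else -1
        correct += int(prediction == y[i])
    return correct / len(y)

# Synthetic, well-separated data (assumption): 10 observations per class.
rng = np.random.default_rng(0)
y = np.array([1] * 10 + [-1] * 10)
X = rng.normal(size=(20, 5)) + 3.0 * y[:, None]
accuracy = leave_one_out_accuracy(X, y)
```

The sign convention `w . x - b` matches the one used for the optimization problems in the Support Vector Machines section.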

Discussion
In this paper, a new neuroimaging method, summarized in Figure 2, was proposed to investigate the discriminative functional activation of whole-brain voxels dynamically, using a hand-motor task experiment. Our method produces a set of DSDMs representing the temporal evolution of discriminative brain activation during a duty cycle. The proposed method extends the ability of present SVM-based methods, both in visualizing discriminative brain regions between different tasks and in being suited to the single-subject case. Our method focuses on the dynamic discriminative activation of whole-brain voxels between two different cognitive or motor tasks.
There are two main advantages of our method. One is that it yields a set of DSDMs. The DSDMs allow us to visualize regions of whole-brain discriminative activation between two kinds of tasks dynamically, without computing any contrast with a conventional method such as SPMs [4,22]. Based on fitting a GLM to each voxel's time series independently, SPMs [4,22] show differences in blood oxygenation level-dependent (BOLD) response between tasks and the estimated significance of these differences. The SPM t-contrast was performed between left hand movement and right hand movement (right hand movement > left hand movement). The SPM t-maps (Figure 3) between left hand movement and right hand movement were obtained and the active regions are listed in Table 1.
The other is that our method is suited to the single-subject case. With PCA-based dimension reduction, the spatiotemporal observations in eigen space (Figure 1) of a single subject allow an SVM to train a classifier. A previous method constructed the spatiotemporal observation in raw data space [11]. However, that method cannot be adapted to fMRI data of a single subject because, in general, the number of spatiotemporal observations of a single subject is much smaller than the number of features (voxels) in a spatiotemporal observation, so the number of training samples is too small to train the SVM classifier. The proposed method overcomes this disadvantage. For this reason, we display only the results of a single subject instead of those of multiple subjects.
A spatial and temporal response factorization was also performed, similar to the method described by Mourão-Miranda et al. [11]. Some previous studies have formally assessed the issue of separability [11,23]. Spatiotemporal factorization is an assumption implicit in conventional unsupervised approaches such as ICA. The spectrum of eigenvalues obtained by PCA decomposition of the spatiotemporal weight vectors (corresponding to the DSDMs) is presented in Figure 6. The first five modes decrease successively, and the fifth is still higher than the remaining modes, which suggests some degree of space-time separability between the first five dynamic spatial discrimination maps. The results suggest that joint spatiotemporal analysis yields significant extra information over separate spatial and temporal analyses.
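The eigenvalue spectrum used to assess separability can be sketched as follows, assuming that the maps are arranged as a $T \times M$ matrix (one map per row) and that the eigenvalues are normalized by their sum; the normalization actually used for Figure 6 is not stated, so this choice and the function name are assumptions. A perfectly separable set of maps (one spatial pattern modulated by one time course) has a single nonzero eigenvalue.

```python
import numpy as np

def normalized_spectrum(W):
    """Normalized eigenvalues of the (T x M) matrix of discrimination maps.

    The squared singular values of W are the eigenvalues of W @ W.T;
    they are divided by their sum so that the spectrum adds up to 1.
    """
    s = np.linalg.svd(W, compute_uv=False)
    eigenvalues = s ** 2
    return eigenvalues / eigenvalues.sum()

# Perfectly separable toy maps: one time course times one spatial pattern.
time_course = np.array([1.0, 2.0, 3.0])
spatial_map = np.array([1.0, 0.0, 1.0, 0.0])
W_separable = np.outer(time_course, spatial_map)
spectrum_separable = normalized_spectrum(W_separable)

# Adding a second space-time mode spreads the spectrum over two eigenvalues.
W_mixed = W_separable + np.outer([1.0, -1.0, 0.0], [0.0, 1.0, 0.0, 1.0])
spectrum_mixed = normalized_spectrum(W_mixed)
```

A spectrum dominated by the first mode indicates that a single spatial pattern explains most of the maps, whereas several sizeable modes indicate spatiotemporal structure that separate spatial and temporal analyses would miss.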
The proposed method was applied to fMRI data from a hand movement experiment. A series of DSDMs (Figure 4) was obtained, which display the discriminative regions between left hand movement and right hand movement over one duty cycle with p-value < 0.001. The DSDMs contain a great deal of discriminating information between left hand movement and right hand movement. Firstly, the DSDMs demonstrate when the discriminative activations of whole-brain voxels between left hand movement and right hand movement occurred and disappeared during one duty cycle. There were no discriminating voxels at 2 s. At 4 s the first discriminating areas between left hand movement and right hand movement appeared (Figure 4(a)). Secondly, the DSDMs also demonstrate where the discriminative activations of whole-brain voxels occurred. The main discriminating area was the primary motor cortex (M1). There was more activation in M1 of the right hemisphere during left hand movement, and more activation in M1 and cerebellum of the left hemisphere during right hand movement. There were no discriminative activations in the supplementary motor area (SMA). This result is consistent with that of Sato et al. [14]. In addition, the DSDMs describe how the discriminating information changes over one duty cycle. The discriminating areas between left hand movement and right hand movement increased continuously from the second TR to the fifth or sixth TR, decreased after the end of the image presentation and had disappeared completely from 28 s to 40 s (Figures 4(c) and (d)). There was also a 4-6 s BOLD signal delay (Figures 4(a) and (c)).
The discriminative regions in the 5th spatial discrimination map (at 10 s) with p-value < 0.001 and cluster size of voxels > 10 between left hand movement and right hand movement are also given in Table 2.
Furthermore, the DSDMs point to asymmetrical cortical activity in the left and right hemispheres of the brain [24]. It can be observed from Figure 5 that there are more discriminating voxels for right hand movement than for left hand movement at all time points from 2 s to 24 s.
In conclusion, the SVM-based method presented in this paper produces a set of DSDMs, which allow the discriminative activation of whole-brain voxels between different tasks to be visualized dynamically over a duty cycle, without computing any contrast with a conventional method such as SPM [4,22]. Moreover, our method is suited to fMRI data of a single subject. The results can be further input to a population inference [24]. The proposed method is useful for dynamically detecting discriminative functional brain activation between complex cognitive tasks without any contrast.

Figure 1. A spatiotemporal observation in eigen space. In the eigen space, a single spatiotemporal observation, e.g., $V^p_{sti} = (v^{p\prime}_{1i}, \dots, v^{p\prime}_{ti}, \dots, v^{p\prime}_{Ti})'$, is a vector consisting of $T = 20$ projected vectors of volumes within duty cycle $i$, which comprises a task block (10 volumes) and the following rest block (10 volumes).
The training points with $\alpha_i > 0$ are called "support vectors", which lie on the supporting hyperplane. For any other points, $\alpha_i = 0$. In the fMRI neuroimaging field, $w$ represents a map of the most discriminating regions, i.e., a discriminating volume [9,10] or a spatial discriminance map [12]. Given two classes, task 1 and task 2, with the labels +1 and −1, a positive value in the map means that the voxel has higher activity during task 1 than during task 2 in the training examples that contribute most to the overall classification, i.e., the support vectors; a negative value means lower activation during task 1 than during task 2. For a detailed theory of SVMs, we refer to the literature [1-3,19].
The spatiotemporal observation matrix has size $d_1 \times d_2$, where $d_1 = T \times N$ is the number of voxels in a spatiotemporal observation and $d_2$ is the number of spatiotemporal observations.

Hand Movement Dataset
Four male and two female subjects aged 19-25 years participated in this fMRI study after giving informed consent under a protocol approved by the local Institutional Review Board. All subjects were right-handed with normal or corrected-to-normal visual acuity. None had a history of any neuropsychiatric disorder. The experiment was conducted at the University of Texas Health Science Center in San Antonio, Texas, with a Siemens 3-T Magnetom Trio scanner. The echo planar imaging (EPI) settings were as follows: repetition time = 2000 ms; matrix size = 64 × 64; voxel size = 3.75 × 3.75 × 4 mm; echo time = 30 ms; flip angle = 90°. The first four scans of each run were discarded to allow for magnetic saturation effects.

Figure 3. SPM t-maps (right hand movement > left hand movement). All voxels with an absolute t-value above 5.85 (p < 0.001, corrected) are shown in color scale (light/dark blue for negative values and red/orange for positive values).

Figures 4(a), (b), (c) and (d) show the DSDMs of a single subject. All voxels with p-value < 0.001 are shown in color scale corresponding to the values in each weight vector. Figure 4(a) covers 2-12 s and Figure 4(b) covers 14-20 s of the movement block; Figures 4(c) and (d) cover the time points of the rest condition.

Figure 4. Dynamic spatial discrimination maps (DSDMs) of a single subject between left hand movement and right hand movement.All voxels with p-value < 0.001 are shown in color scale corresponding to the values in the weight vector (light/dark blue for negative values, i.e., relatively more activation for left hand movement, and red/orange for positive values, i.e., relatively more activation for right hand movement).

Figure 5. The number of discriminative voxels in primary motor cortex for left hand movement and right hand movement during one duty cycle.

Figure 6. Normalized eigenvalues of the spatiotemporal weight vectors.