Long-Term Electrophysiological and Behavioral Analysis on the Improvement of Visual Working Memory Load, Training Gains, and Transfer Benefits

Recent evidence demonstrates that training can enhance visual working memory (VWM) capacity and attention over time in near transfer tasks. Not only do these studies reveal the characteristics of VWM load and the influence of training, they may also provide insights into developing effective rehabilitation for patients with VWM deficiencies. However, few studies have investigated VWM over extended periods of time and evaluated transfer benefits on non-trained tasks. Here, we combined behavioral and electroencephalographic approaches to investigate VWM load, training gains, and transfer benefits. Our results reveal that VWM capacity is directly correlated with the difference between event-related potential waveforms. In particular, the “magic number 4” can be observed through the amplitude of the contralateral delay activity, and the average capacity is 3.25 items across 15 participants. Furthermore, our findings indicate that VWM capacity can be improved through training: after the training exercises, participants in the training group dramatically improved their performance. Likewise, training effects on non-trained tasks can also be observed after 12 weeks of training. Therefore, we conclude that participants can benefit from training gains, and that augmented VWM capacity is sustained over long periods of time on a specific variety of tasks.


Introduction
Visual working memory (VWM), or visual short-term memory, refers to the storage of a limited amount of information for a few seconds [1], and is associated with important cognitive functions, including attention, perception, reasoning, comprehension, and language acquisition [2] [3]. VWM also plays a critical role in preserving and processing information, and its capacity has been suggested to be a sensitive predictor of cognitive ability [4]. For example, researchers have suggested that VWM capacity can distinguish healthy individuals from memory-impaired individuals suffering from attention-deficit hyperactivity disorder (ADHD) [5], schizophrenia [6], stroke [7], Alzheimer's disease [8]-[11], or age-related diseases associated with memory deficits [12] [13]. Recent evidence demonstrates that brain training can enhance an individual's VWM capacity and attention over time [12] by increasing activity in the prefrontal cortex, the parietal cortex, and the basal ganglia [14] [15]. Not only do these studies reveal characteristics of VWM load and the effects of training, they may also provide insights into effective means of rehabilitation for patients with low VWM capacity. Furthermore, healthy individuals who seek to enhance their intellectual performance may also benefit from the training [14].
Despite the potential applications of VWM, very few studies have investigated VWM over extended periods (i.e., beyond 5 weeks) or evaluated transfer benefits on non-trained tasks [14]. Most research has focused on distinguishing different memory systems and memory-processing phases in order to better understand memory characteristics and functions [16]. Because this information bears directly on VWM training, we employed a combined behavioral and electrophysiological test to reveal the impact of VWM load, training, and transfer effects on memory capacity and task-related performance. Event-related potentials (ERPs) elicited by VWM information processing were recorded. Arrays of colored squares were used to estimate VWM capacity through computerized tasks, divided into several experimental blocks. In addition, three task features (color, position, and shape) were taken into account in order to establish a complete model of VWM capacity.
The goal of this study is to evaluate the changes in neural activity that accompany training on a specific variety of tasks. We hypothesized that participants would exhibit increased neural activity related to attention and memorization as a result of training paradigms designed to improve VWM. Three major experiments were conducted. (1) VWM load experiment: we estimated VWM capacity behaviorally through accuracy measurements and electrophysiologically through the amplitude of the contralateral delay activity (CDA). CDA is the contralateral negativity during the memory period of VWM experiments and reflects the number of items in the memory array. CDA amplitude increases as the number of items increases, up to the individual's VWM limit [17]. (2) VWM training gain experiment: VWM training was developed to expand memory capacity over a period of 12 weeks. (3) VWM transfer benefit experiment: the effects of training on untrained tasks were studied. Subjects were divided into two groups, training and control, to evaluate the behavioral evidence and neural activity changes. The transfer benefits were calculated by comparing the pre-training and post-training sessions.

Ethical Statement
The experimental protocols were approved by the Louisiana Tech University's Institutional Review Board Committee. Informed written consent was obtained before experiments began.

Participants
Three experiments were conducted in this study (VWM load, VWM training gains, and VWM transfer benefits). A total of fifteen healthy young participants (18-31 years of age; five females and ten males) took part in the VWM load task (the first experiment), five of the subjects in the training gain study (the second experiment), and ten subjects in the transfer benefit tasks (the third experiment; five per training or control group). All participants had normal or corrected-to-normal eyesight. None had a reported history of neurological or psychological disorders, and none had prior experience with computerized VWM training.

Stimuli and Procedure
Participants were seated in front of a 17-inch computer screen in a dark, sound-attenuated room and viewed the visual cues at a distance of 80-100 cm. All stimuli were presented on a white background (RGB = 255, 255, 255). Participants were instructed to respond to the VWM arrays ("change" or "no-change" tasks) by pressing a button on a response pad as quickly and accurately as possible. All experiments were conducted between 9:00 am and 12:30 pm. Each subject was given 20 min of practice time prior to starting the VWM load task in order to minimize individual differences and their effects.
The sequence of each trial is shown in Figure 1. The color of each square was chosen randomly, one at a time (no color was repeated within the memory array on either side). The positions of the colored squares were also randomly arranged in each trial. In fifty percent of the trials, the memory and test arrays contained squares of the same color and orientation; the remaining trials presented a different target in the test array. Each memory and test array pair was separated by a 900 ms retention interval. The test array lasted at most 2000 ms, or until the subject responded. A 500 ms inter-trial interval directly followed the termination of the test array. Figure 1(b) illustrates the three switch types in the VWM load experiment (the first experiment). The VWM load experiment contained 2, 4, or 6 items on each side of the memory array, and the difficulty levels were presented from easy to complex (2 to 6 items) so that subjects could adapt to the visual stimuli.
In the training gain experiment (the second experiment), participants completed 2 hours of VWM training one day per week for a total of 12 weeks, giving 24 hours of training overall. Unlike the VWM load experiment, the training gain experiment also included 8 colored items in the memory array to raise task difficulty. All sessions were recorded to track weekly changes in performance. The transfer benefit task (the third experiment) required observers to memorize not only the colors and positions in the memory array but also the shape of a particular item displayed in the test array (change detection: no change, or a change in color, position, or shape), as shown in Figure 1(c). The transfer benefit data from participants in the training group before and after the 12-week training period were labeled TB1 and TB2, respectively. For comparison, the control group participated only at the beginning and at the end of the 12-week period, without any training sessions in between. To reduce fatigue, the recording sessions were separated into blocks with 5 min breaks. Each block consisted of 100 trials in the VWM load experiment, or 150 trials in the training gain and transfer benefit experiments. Consequently, the VWM load task was divided into 6 blocks and the training gain and transfer benefit tasks into 4 blocks, with each block lasting approximately 10 min. This procedure resulted in a total of 600 trials per recording session. The experiment time required for each day was approximately 2 hr.
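The block structure described above can be checked with a few lines of arithmetic. The following is only an illustrative sketch in Python: the dictionary keys and function names are hypothetical labels introduced here, while the numbers are taken from the text.

```python
# Trials per block and blocks per session, as stated in the text.
# The keys are illustrative names, not identifiers from the study.
TRIALS_PER_BLOCK = {"vwm_load": 100, "training_gain": 150, "transfer_benefit": 150}
BLOCKS_PER_SESSION = {"vwm_load": 6, "training_gain": 4, "transfer_benefit": 4}


def trials_per_session(experiment: str) -> int:
    """Total number of trials in one recording session of the given experiment."""
    return TRIALS_PER_BLOCK[experiment] * BLOCKS_PER_SESSION[experiment]


def total_training_hours(hours_per_week: int = 2, weeks: int = 12) -> int:
    """Cumulative training time over the whole intervention."""
    return hours_per_week * weeks


for name in TRIALS_PER_BLOCK:
    print(name, trials_per_session(name))
print("training hours:", total_training_hours())
```

Each experiment yields the same number of trials per session, which is what makes the session totals comparable across the three tasks.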

Electroencephalogram Recording
Electroencephalogram (EEG) signals were recorded using a Net Amps 300 amplifier connected to a high-density 128-channel HydroCel Geodesic Sensor Net [18] (Electrical Geodesics Inc., Eugene, OR) with Net-Station 4.3 software. All electrode impedances were below 50 kΩ before recording started [19]. Regions of interest (ROIs) over the parietal-occipital cortex were defined by the posterior parietal electrodes PO3 and PO4 of the standard international 10/20 system. Electrodes were referenced to the average Cz channel. All signals were passed through an anti-aliasing low-pass filter at 100 Hz and digitized at a sampling rate of 250 Hz.

EEG Preprocessing and Artifact Removal
The EEG data were band-pass filtered between 0.1 and 30 Hz and segmented into individual conditions based on the number of items in the memory array. Each data segment, or epoch, lasted 1200 ms and consisted of a 200 ms pre-memory-array period, a 100 ms memory array, and a 900 ms retention interval. The data were re-referenced to the average signal across all 128 electrodes. Bad channels were replaced and bad trials were eliminated. The first 200 ms of each trial was used for baseline correction. Artifacts due to poor skin contact, eye blinks, eye movements, or muscle movements were detected and removed using independent component analysis (ICA) with the extended Infomax algorithm [20] in EEGLAB [21]. Artifact-related independent components (ICs) were removed, and the remaining ICs were projected back to the scalp for further ERP analysis. Details of the ICA denoising approach can be found in other studies [22]-[24].
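As an illustration of the epoch timing and baseline correction described here, the sketch below converts the stated durations to sample counts at the 250 Hz sampling rate and subtracts the mean of the first 200 ms from a single-channel epoch. It is a minimal pure-Python sketch, not the EEGLAB routine used in the study; the function names and the synthetic epoch are assumptions made for the example.

```python
def ms_to_samples(duration_ms: int, fs_hz: int = 250) -> int:
    """Convert a duration in milliseconds to a number of samples."""
    return duration_ms * fs_hz // 1000


def baseline_correct(epoch, fs_hz: int = 250, baseline_ms: int = 200):
    """Subtract the mean of the pre-stimulus baseline from every sample.

    `epoch` is a sequence of amplitudes (in microvolts) for one channel,
    starting at the beginning of the baseline period.
    """
    n_base = ms_to_samples(baseline_ms, fs_hz)
    if n_base <= 0 or n_base > len(epoch):
        raise ValueError("baseline window does not fit inside the epoch")
    baseline = sum(epoch[:n_base]) / n_base
    return [value - baseline for value in epoch]


# Synthetic 1200 ms epoch: a flat 1.0 uV baseline followed by 3.0 uV (made-up values).
epoch = [1.0] * ms_to_samples(200) + [3.0] * ms_to_samples(1000)
corrected = baseline_correct(epoch)
print(len(epoch), corrected[0], corrected[-1])
```

With a 1200 ms epoch at 250 Hz, each epoch holds 300 samples, of which the first 50 form the baseline.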

Event-related Potential (ERP)
EEG waveforms were selected from the parietal-occipital cortex area. For each condition, ERPs were obtained by averaging trials across participants. CDA (300-900 ms) was measured as the difference in ERP amplitude between electrodes contralateral and ipsilateral to the location of the task-relevant cue in the memory array [25]. We analyzed data from PO3/PO4 because the CDA amplitude was consistently high at these sites. However, the same patterns can be obtained over the P3/P4, P5/P6, T5/T6, and O1/O2 electrode pairs [26].
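The CDA measure can be sketched as the mean contralateral-minus-ipsilateral difference within the 300-900 ms window. The example below assumes an epoch that begins 200 ms before memory-array onset and is sampled at 250 Hz, as described in the preprocessing section; the function names and the flat synthetic waveforms are hypothetical and serve only to show the computation.

```python
def window_indices(start_ms: int, end_ms: int, fs_hz: int = 250, pre_stim_ms: int = 200):
    """Sample indices [start, end) of a post-onset window inside an epoch
    that begins `pre_stim_ms` before stimulus onset."""
    start = (pre_stim_ms + start_ms) * fs_hz // 1000
    end = (pre_stim_ms + end_ms) * fs_hz // 1000
    return start, end


def mean_cda(contra, ipsi, start_ms: int = 300, end_ms: int = 900,
             fs_hz: int = 250, pre_stim_ms: int = 200) -> float:
    """Mean contralateral-minus-ipsilateral amplitude in the CDA window."""
    if len(contra) != len(ipsi):
        raise ValueError("contralateral and ipsilateral ERPs must have equal length")
    start, end = window_indices(start_ms, end_ms, fs_hz, pre_stim_ms)
    if not 0 <= start < end <= len(contra):
        raise ValueError("CDA window does not fit inside the epoch")
    diffs = [c - i for c, i in zip(contra[start:end], ipsi[start:end])]
    return sum(diffs) / len(diffs)


# Made-up flat ERPs (microvolts) over a 300-sample epoch.
contra_erp = [-1.5] * 300
ipsi_erp = [-0.5] * 300
print(window_indices(300, 900), mean_cda(contra_erp, ipsi_erp))
```

For a cue on the left side, the contralateral waveform would come from PO4 and the ipsilateral one from PO3, and vice versa for a cue on the right.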

Behavioral Measures and VWM Capacity
For each participant, the reaction time (RT) was calculated from trials with correct responses [27]. Incorrect or no-response trials were excluded from the analyses. Accuracies, expressed as the percentage of correct responses ± standard error, were calculated from all recorded trials. VWM capacity was estimated using Pashler's formula [28] [29]:
$$K = S \times \frac{H - F}{1 - F},$$
where K is the memory capacity, S is the size of the memory array, H is the hit rate, and F is the false alarm rate. Generally, WM capacity has been considered to be limited [30]. The formula assumes that an observer holds K of the S items of the memory array in memory, as inferred from correct performance on the VWM experiments. The commonly accepted capacity limit for an individual is 4 items [30]. Statistical analysis was performed using one-way analysis of variance (ANOVA).
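Pashler's capacity estimate is K = S × (H − F) / (1 − F), with S the memory array size, H the hit rate, and F the false alarm rate. A minimal sketch of this calculation follows; the function name and the hit and false alarm rates in the example are hypothetical and are not data from this study.

```python
def pashler_k(set_size: int, hit_rate: float, false_alarm_rate: float) -> float:
    """Pashler's estimate of working memory capacity: K = S * (H - F) / (1 - F)."""
    if not (0.0 <= hit_rate <= 1.0 and 0.0 <= false_alarm_rate < 1.0):
        raise ValueError("rates must lie in [0, 1], with false_alarm_rate < 1")
    return set_size * (hit_rate - false_alarm_rate) / (1.0 - false_alarm_rate)


# Hypothetical observer: 4-item array, 90% hits, 10% false alarms.
print(round(pashler_k(4, 0.9, 0.1), 2))
```

Perfect performance (H = 1, F = 0) gives K = S, so the estimate is bounded by the array size; this is why arrays larger than the expected capacity are needed to reveal an individual's limit.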

VWM Load Experiment
The goal of VWM load experiments was to estimate the VWM capacity. Because the information that can be maintained and stored in memory is limited, it is important to understand the differences that may impact one's ability to learn. Behavior and brain activity could predict the VWM capacity by task performance and the CDA amplitude of ERP [31] [32]. This VWM load experiment also included a parametric manipulation of the number of possible items in the memory array to further test the hypothesis that CDA amplitude can be used to understand VWM templates [26] [33]. The RT and CDA amplitude were measured at different levels of memory load.

Behavior
The average accuracy decreased significantly as the number of items increased (96.46 ± 0.85%, 90.27 ± 1.51%, and 77.5 ± 2.35% for the 2-, 4-, and 6-item conditions, respectively; F(1,29) = 12.69, p < 0.005; F(1,29) = 20.78, p < 0.005), as indicated in Figure 2(a). The large drop in accuracy for the 6-item condition suggests that a memory array of this complexity may have exceeded the individual's memory limit. The mean memory capacity was 3.25 items across the 15 participants. In addition, the reaction time was slower in the 4- and 6-item conditions than in the 2-item condition (575.6 ± 25.6 ms, 602.0 ± 23.0 ms, and 678.3 ± 28.1 ms for the 2-, 4-, and 6-item conditions, respectively; F(1,29) = 0.59, p > 0.05; F(1,29) = 4.42, p < 0.05). The reaction time was strongly affected by the difficulty of the memory array (Figure 2(b), bar graph).
Figure 3 illustrates the contralateral ERP effects for the VWM load experiment. It shows the average ERP waveforms of the 15 subjects for the 2-, 4-, and 6-item conditions at the PO4 electrode, contralateral to the location of the left cue, during the stimulus encoding phase. The mean N1 component (100-180 ms) and the 300-900 ms interval were associated with the contralateral ERP effects measured in Figure 3(a). The 3D scalp topographies for the 6-item condition at the mean N1 (top) and over the 300-900 ms latency range of the mean ERP (bottom) show activated contralateral parietal-occipital regions (Figure 3(b)). Figure 4 shows the average ERP difference waveforms from the parietal-occipital electrodes for the VWM load experiment. The CDA component was measured 300-900 ms after the memory cue. A paired comparison of the mean CDA showed a significant increase from the 2- to the 4-item condition (−0.2586 μV and −1.2212 μV; F(1,29) = 6.6, p < 0.005), whereas no difference was found between the 4- and 6-item conditions (−1.2212 μV and −1.2904 μV; F(1,29) = 2.7, p > 0.05). These data provide statistical evidence that the CDA reaches a plateau at approximately 4 items, which has been suggested as the maximum memory capacity for most people [34].

VWM Training Gain Experiment
The goal of the VWM training gain experiment was to develop VWM training procedures that improve VWM performance. The effect of training duration was investigated by comparing training gains across task difficulty levels and by observing the improvement in memory capacity over 12 weeks.

Behavior
VWM training gains within the 12-week intervention period are shown in Figure 5(a). At the beginning of the training gain tasks (1st week), the average accuracies of the 8- and 6-item conditions were 56.67% and 79%, respectively. Accuracy exceeded 89% in the 4- and 2-item conditions. After 12 weeks of training, participants achieved significant improvements in average accuracy (up to 80%, 88.33%, and 96.67% for the 8-, 6-, and 4-item trials, respectively). Because the baseline accuracy for the 2-item trials was already high (96%-99%), no significant effect was observed after 12 weeks of training. In particular, VWM performance tended to increase from week to week. This result indicates that the memory capacities of individuals shifted toward the upper limit (index K ↑), which in turn indicates that the VWM capacities of the trainees expanded through a long period of training. Figure 5(b) shows that the reaction time decreased after 12 weeks of training in the 2-, 4-, 6-, and 8-item conditions (570 ms vs. 478 ms, 613 ms vs. 524 ms, 699 ms vs. 578 ms, and 751 ms vs. 563 ms; F(1,9) = 26.0, p < 0.005; F(1,9) = 29.3, p < 0.005; F(1,9) = 38.5, p < 0.005; F(1,9) = 69.3, p < 0.005, respectively), which suggests that the participants performed these tasks better in the last week than in the first week.
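The reaction-time gains reported above can be expressed as absolute and relative reductions per condition. The sketch below uses the first-week and last-week means quoted in the text; the variable and function names are illustrative only.

```python
# Mean reaction times (ms) in the first and last training week, keyed by array size.
# Values are those reported in the text.
RT_MS = {2: (570, 478), 4: (613, 524), 6: (699, 578), 8: (751, 563)}


def rt_reduction(first_ms: float, last_ms: float) -> float:
    """Absolute decrease in reaction time (ms)."""
    return first_ms - last_ms


def percent_reduction(first_ms: float, last_ms: float) -> float:
    """Decrease in reaction time relative to the first week, in percent."""
    return 100.0 * (first_ms - last_ms) / first_ms


for items, (first, last) in RT_MS.items():
    print(items, rt_reduction(first, last), round(percent_reduction(first, last), 1))
```

On these numbers the largest absolute reduction occurs in the 8-item condition, the most demanding array size in the training task.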

VWM Transfer Benefit Experiment
The purpose of the VWM transfer benefit experiment was to study the impact of training gains on high-level cognitive VWM tasks (changes in color, position, and shape features). Subjects were separated into two groups, training and control, to evaluate the behavioral evidence and neural activity. Their performance on non-trained tasks was compared between the pre-training (TB1) and post-training (TB2) sessions. The change in neural activity (TB2 − TB1), expressed as a change in ERP amplitude, can be used as a marker of the training effects.

ERP
The ERP analysis of the VWM transfer benefit was applied to the 8-item condition, because this most difficult task showed the greatest improvement (TB2 − TB1: 8%) for the trainees. In the training group, the ERP amplitude (300-900 ms) differed significantly between TB1 and TB2, as shown in Figure 6(c) (−2.91 μV vs. −1.52 μV; F(1,9) = 8.42, p < 0.05). However, there was no significant difference for the control group (−3.5 μV vs. −3.06 μV; F(1,9) = 0.006, p > 0.05). This diminished ERP in the training group could be used as a predictor of VWM training effects on non-trained tasks. Recent reports have suggested that training could enhance VWM capacity and attention over time and also increase brain activity in the prefrontal and parietal-occipital cortex [6] [14]. We analyzed the transfer benefits with a subtractive measure in which the difference between the post-training TB2 and the pre-training TB1 was computed for both the training and control groups. The positive change in neural activity was produced by training; for the control group, no significant change in ERP activity was observed.
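The subtractive measure can be illustrated with the group mean amplitudes reported above. This is only a worked example of the TB2 − TB1 difference; the names are hypothetical, and the calculation says nothing about statistical significance, which was assessed by ANOVA.

```python
# Mean ERP amplitude (microvolts, 300-900 ms) at TB1 and TB2, as reported in the text.
ERP_UV = {"training": (-2.91, -1.52), "control": (-3.5, -3.06)}


def transfer_difference(tb1_uv: float, tb2_uv: float) -> float:
    """Subtractive measure: post-training (TB2) minus pre-training (TB1)."""
    return tb2_uv - tb1_uv


for group, (tb1, tb2) in ERP_UV.items():
    print(group, round(transfer_difference(tb1, tb2), 2))
```

Because the amplitudes are negative, a less negative value at TB2 yields a positive difference, which is the sense in which a diminished ERP appears as a positive change.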

Discussion
This study demonstrates the long-term training effects of VWM. In summary, we found that VWM capacity can be estimated from accuracy and CDA amplitude. First, neural evidence for the VWM capacity limit, which approximates 4 discrete representations (the "magic number 4") [30], is observed in the CDA waveforms. In particular, the average capacity across 15 participants is 3.25 items, which is similar to values reported in other studies [31] [35] [36]. Next, in Experiment 2, we used a longer training period (up to 12 weeks) than other studies (5 weeks) [15] in order to enhance training effects that raise the number of objects stored in VWM. The reaction time is shorter after 12 weeks of training in the 2-, 4-, 6-, and 8-item conditions, which suggests that the participants may feel more comfortable performing the tasks at the end of training than in the first week. Thus, VWM capacity can be improved through training over an extended period of time. Furthermore, in the transfer benefit experiment, we found that trainees showed greater improvements in accuracy and memory capacity than the control group. Specifically, the increased neural activity is obtained with a subtractive measure (TB2 − TB1), which is the difference between the post-training and pre-training sessions for trainees. The training-related changes in capacity K indicate that the improvements are meaningful, so the diminished ERP (300-900 ms) can be used as a neural marker of VWM training effects on non-trained tasks, where changes in the ERP waveform were associated with improved memory capacity and VWM performance. Therefore, these measures of cognitive improvement are important for revealing the strength of training effects on non-trained tasks.
Because VWM plays a critical role in change detection, the task becomes more demanding as the memory array becomes more complex [2]. In general, a low-item condition involves less information and results in higher efficiency, higher accuracy, and lower CDA amplitude. On the other hand, when a memory array contains many items, the subject's recall of complex information is less efficient, resulting in lower accuracy and higher CDA amplitude. Furthermore, individuals with low memory capacity depend on more of their working memory to perform VWM tasks. In contrast, participants with high memory capacity can perform VWM tasks much more easily and efficiently, and they are able to process and store more information during VWM experiments. This interpretation is supported by the accuracy, reaction time, VWM capacity, and CDA results of the VWM load experiment, which indicate that high-capacity individuals are more accurate and efficient under more complex conditions than low-capacity individuals. We hypothesize that individuals with high VWM capacities can concentrate and manage distractions more effectively.
The results of this study demonstrate that the CDA component can be used to analyze an individual's VWM capacity. Because the CDA increases until it reaches a plateau at the individual's limit, it can be observed in many types of WM studies, and the CDA pattern in the ERP difference waveforms (Figure 4) is consistent with other findings in which the CDA amplitude for a two-object search is twice as high as that for a single target [26]. Meanwhile, the CDA has been reported to disappear when subjects repeat the same search task after a short period of time [37]. This might be because the participants' VWM capacities are enhanced by training, or because of a transition from short-term to long-term memory.
In addition, the present study focuses on differences between trainees and controls in the transfer benefit experiment, in which subjects benefit from the training and transfer these gains to other tasks, such as near and far transfer tasks. In this study, we are mainly interested in near transfer tasks, defined by task similarity, whereas far transfer tasks with major feature differences, along with a non-adaptive control group in the visual-detect-training condition, will be addressed in a future report [38] [39]. Because training gains and transfer effects can be maintained across the 12-week intervention period, and the subtractive measure of transfer benefits suggests that the neural activity induced by training often appears in VWM and attention [40], this study may help in addressing cognitive decline and memory deficits due to normal aging through training and medication [41]. Moreover, a comparison between young and old adults would help in understanding training gains and transfer effects across development [8] [42] [43], although we can expect significant training gains for young adults in most VWM training paradigms.
Additionally, our method may be extended into a useful tool for predicting age-related trends in memory capacity using large cohorts of participants and machine learning techniques [44]. We are also interested in investigating how practice and training may affect an individual's performance with aging [45], and the efficiency of various training programs for different age groups. In addition, integrating EEG with transcranial magnetic stimulation (TMS) or transcranial direct-current stimulation (tDCS) to understand the influence of TMS/tDCS training is also important [46]. Studies on VWM still leave many questions unanswered, such as the optimal duration and the sufficient amount of training time. Likewise, comparisons between visual WM and verbal WM tasks will be necessary to improve the development of WM methods. We believe that the plasticity of the brain enables it to become more effective in memory, attention, information processing, innovative thinking, and problem solving [47] [48] through effective novel brain training simulations [49].
Interestingly, working memory tasks have also been used to study patients with schizophrenia, in whom prefrontal inefficiency and cognitive deficits have been found [7]. At the same time, recent studies have suggested that neurogenesis is ongoing throughout life, so training and cognitive exercises may have a positive impact on building strong neural connections and creating new brain networks [14] [50]. In principle, the neural mechanisms of VWM can be simplified as a recurrent feedback loop that explains the neural network activity during the VWM retention interval [51]. The recurrent feedback loop can easily maintain a simple object, but has difficulty maintaining the representation of a complex object, which requires the interaction of more neurons. Neural oscillations are the driver, involving both local and global communication networks, that maintains the recurrent activation and represents different objects. Therefore, the prefrontal and parietal-occipital functional network could be key to understanding the neural mechanisms of VWM [52].
The EEG-based VWM approach is a precise and sensitive method for directly detecting changes in neural activity from parietal-occipital electrodes [53]. Likewise, compared with other neuroimaging methods [54] [55], such as functional magnetic resonance imaging (fMRI) [56], EEG is less expensive, less time consuming, and easier to operate. Another advantage is that EEG can resolve dynamic neural changes of integrated cognitive activity on the order of milliseconds, which helps to further understand short-term memory representation, whereas fMRI has a temporal resolution on the order of a few seconds [57]. Overall, these results support our hypothesis that participants can benefit from training gains, and demonstrate sustained effects on VWM capacity over a long period of time for a specific variety of tasks. Moreover, the general cognitive improvement associated with training needs further investigation.

Figure 4. ERP difference waveforms of the VWM load experiment. The average ERP difference waveforms of 15 subjects between the PO3 and PO4 electrodes during the stimulus encoding phase are shown. The ERP difference was measured between the waveforms contralateral and ipsilateral to the location of the cue. The CDA component was measured 300-900 ms after cue onset. The yellow bar (0-100 ms) on the timeline represents the epoch of the memory cue.

Figure 6. VWM transfer benefit experiment results. (A) VWM transfer benefit experiment for subjects with and without weekly training. The training group achieved a significant improvement over the control group. (B) The improvements in VWM capacity for the training and control groups. (C) ERPs at TB1 and TB2 for the training and control groups. Signals from the parietal-occipital ROIs show a significantly diminished amplitude (300-900 ms) for the training group at TB2, but not for the control group. *** p < 0.005.