BCI inside a virtual reality classroom: a potential training tool for attention

Background: A growing population is diagnosed with Attention Deficit Hyperactivity Disorder (ADHD) and is currently treated with psychostimulants. A Brain-Computer Interface (BCI) is a method of communicating with an external program or device based on measured electrical signals from the brain. A particular brain signal, the P300 potential, can be measured about 300 ms after voluntary cognitive involvement with an external stimulus. Using the P300 potential, we have designed a BCI-assisted exercising tool targeting attention enhancement within an immersive 3D virtual reality (VR) classroom. Methods: An engaging training environment was created by combining a low-cost infrared camera with an “off-axis perspective projection” algorithm to achieve the illusion of 3D. The setup also includes a single measurement electrode placed on the scalp above the parietal lobe (Pz). Two sets of experiments were performed to elicit the P300 potential. One used a variant of Farwell and Donchin’s P300 speller, and the other required the user to search for a specific letter in a series of changing images. A non-linear, optimized support vector machine (SVM) classifier was used to detect the P300 potential automatically. Results: Six subjects participated in the preliminary experiment to test the prototype system, and an average error rate below 0.30 was achieved, which is noteworthy considering the simplicity of the scheme. Conclusions: This work demonstrates a non-intrusive, low-cost, and portable system targeting attention in a motivating and engaging environment.


Background
Attention deficit hyperactivity disorder (ADHD) is a major ailment among children, characterized by behavioral problems in the form of inattentiveness, hyperactivity and impulsiveness [1]. Worldwide, the childhood prevalence of ADHD is about 5 %, making it one of the most common disorders among children [2]. The current first-line treatment for the disorder is pharmacotherapy with psycho-stimulants, which directly affect the central nervous system and are associated with severe side effects [3]. The demand for an alternative treatment is apparent in the vast majority of ADHD studies. With the recent advances in neurofeedback, the focus has shifted from psycho-stimulants to drug-free treatments that utilize brain waves.
Neurofeedback is a Brain-Computer Interface (BCI) variant in which the user receives feedback based on the frequency components of the electroencephalogram (EEG) [4]. A BCI is an interface between the brain and an external device that enables signals from the brain to control the device. It can be perceived as a communication scheme in which the user's intention is converted to an output without involving the usual output pathways of peripheral nerves and muscles [5]. Several researchers have shown that BCI has potential as an alternative treatment option to train ADHD subjects [6][7][8][9][10][11][12]. Neurofeedback for ADHD treatment is of two types: one based on the Sensory Motor Rhythms (SMR), and the other based on the α, β and θ waves in the EEG. The latter approach is derived from the "low-arousal hypothesis" that ADHD subjects experience less sensory stimulation than normal subjects [4]. Compared with normal control subjects, ADHD subjects show an excess of slow waves (θ and α, usually present during shallow sleep or a relaxed state) and fewer high-frequency waves (β, usually present during excited and mentally active states). Providing positive feedback whenever the subject manages to produce higher-frequency activity yields a training tool for attention.
The widely researched neurofeedback methods remain open to debate. They are based on brain indices that can be altered simply by closing or opening the eyes, or by performing hard mental tasks. These parameters therefore cannot be taken as a measure of attention to the task at hand. In our pilot study, another approach is sought, one that ensures attention to the specific task. We propose a first-of-a-kind (to the best of our knowledge) BCI system for attention training that is based on the P300 potential. P300 is a large positive voltage in the recorded EEG, strongest around the midline of the parietal lobe, peaking about 300 ms after a rare, relevant stimulus. It is only present when the subject becomes involved with a specific stimulus or event, and represents voluntary cognitive processing [13,14]. Therefore, we believe that P300, as a potential only present during cognitive involvement, is a suitable brain index of whether the subject is attentive or not. With this idea in mind, our approach has been to develop a prototype that uses several feedback systems as attention games. These games require the subject to be fully engaged in locating or acquiring relevant information that is presented rarely, or for a short time, among frequent non-relevant information. If the subject attends to the relevant information, a P300 response can be identified in the EEG at that time. A competitive reward system provides points for eliciting P300 responses as well as for correctly answering questions that require attention to the information. A training environment requires challenges for the subject to be able to improve, just as gravity is a challenge when learning to ride a bike. Therefore, we embedded the proposed system in a 3D virtual reality (VR) classroom, which serves as an environment where real-life distractions can be simulated and controlled.

Methods
This section covers the methodology for implementing the prototype of the training tool which could eventually be used for ADHD subjects. It includes the design of the attention games and the VR classroom with the simulated distractions. The experimental protocol on healthy subjects and the machine learning methods to classify a possible attentive state are also explained in detail.

Attention games
Two oddball attention experiments were designed for the BCI setup running inside the VR classroom. These were constructed to challenge the sustained visual attention and visual discrimination. Both experiments require a great deal of conscious effort to concentrate and continue looking as well as quickly shifting attention; tasks that are hard for ADHD subjects [15].
In the first experiment, called ANISPELL, a variant of Farwell and Donchin's P300 speller is used [16]. The interface consists of a 4 × 4 grid of neutral animal pictures in grayscale on a black background. In a pseudorandom fashion, a row or column of pictures was flashed up by displaying the true colors on a white background (illustrated in Fig. 1). The flash-up was held for 100 ms and then followed by 100 ms with all pictures returned to grayscale, corresponding to an inter-stimulus interval (ISI) of 200 ms. The sequence was repeated until every row and column had been flashed up 15 times, which was denoted as a trial.
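The ANISPELL timing described above can be sketched as a simple flash schedule. This is only an illustration under stated assumptions: the names `flash_schedule` and `is_target` are hypothetical, and "pseudorandom" is assumed here to mean that every repetition flashes each of the four rows and four columns once in a freshly shuffled order, which the text does not specify.

```python
import random

FLASH_MS = 100                 # colour flash-up duration (from the text)
BLANK_MS = 100                 # grayscale gap after each flash (from the text)
ISI_MS = FLASH_MS + BLANK_MS   # 200 ms from one flash onset to the next


def flash_schedule(n_rows=4, n_cols=4, repetitions=15, seed=None):
    """Return a list of (onset_ms, kind, index) events for one trial.

    Assumption: each repetition flashes every row and every column once,
    in a newly shuffled order.
    """
    rng = random.Random(seed)
    groups = [("row", r) for r in range(n_rows)]
    groups += [("col", c) for c in range(n_cols)]
    schedule = []
    onset = 0
    for _ in range(repetitions):
        order = groups[:]
        rng.shuffle(order)
        for kind, index in order:
            schedule.append((onset, kind, index))
            onset += ISI_MS
    return schedule


def is_target(event, target_row, target_col):
    """A flash is a target if it contains the attended picture."""
    _, kind, index = event
    return (kind == "row" and index == target_row) or (
        kind == "col" and index == target_col
    )
```

Under these assumptions one trial contains 8 × 15 = 120 flashes, of which 2 × 15 = 30 contain the attended animal and are therefore expected to elicit a P300.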
The subject was asked to attend to a specific animal for the entire trial. The subject had to locate the most dominant color, a unique color (which could be a small area with manually changed pixel values), and a third attribute that was specified only after the trial. Keeping the third question unknown kept the subject engaged even after the answers to the first two questions had been found. As an example, the third question for the goldfish was: "Was the top of the head darker than the body?". Twelve trials were conducted for this experiment, with the first two trials used to generate an unbiased template for the feature extraction process elaborated in the Feature Extraction section.
The second experiment, called T-SEARCH, was designed with inspiration from Frintrop et al. [17]. Twelve different images, each containing several "X" and "T" symbols in different colors, were shown in random order one at a time with an ISI of 200 ms. The images were repeated, pseudorandomly, ten times. Three example images are displayed in Fig. 2a-c. Four of the twelve images contained a blue "T", and the order in which these four appeared did not change across the pseudorandom repetitions. The subjects were asked to pay attention to the blue "T", remember its location, and count the number of red "T" symbols present in the images that contained the blue "T". Five trials were run, each with a different set of images. At the end of each trial, the subjects had to identify the location of the blue "T" in a compartmentalized square (shown in Fig. 2d). The questions answered in the two experiments were used in a cumulative scoring system to create a competitive game between the subjects.

Virtual reality classroom
VR can be seen as a Human-Computer Interface where it is possible to interact and become immersed in a computer created environment that is sought to be naturalistic and provide a sense of presence [18]. It has the advantages that it provides a "real-life" chaotic and naturalistic environment where the subject forgets the controlled test lab environment. Most importantly, the environment can be fully controlled making it possible to simulate distractions. In our study, a classroom was chosen since it is an environment that children are exposed to almost every day, where attention is especially important in relation to learning and socializing.
The VR classroom was created using the gaming engine UNITY, with some of the 3D accessories designed in the 3D modeling software BLENDER. The classroom includes six pupil desks, each having two seats, a projection screen, posters, a soccer ball, several hula hoops, a wall clock, book shelves and a first aid kit. Two windows on the left side of the room face out onto a road. The ceiling of the classroom holds a fan, a projector and six fluorescent lights. The scene also contains a female teacher, her desk with a computer on it, and a blackboard behind her. Screenshots of the VR classroom are shown in Fig. 3.
Fig. 2 a-c Three of the twelve images presented during the attention game T-SEARCH. The images are shown one at a time at a rate of five images per second and repeated several times. The subject is asked to locate any blue "T" symbol presented in the images. d The image displayed at the end of a trial. The subject had to indicate, in the order of their appearance, the areas in which a blue "T" was present.

Distractions
Several distractions have been designed and incorporated inside the VR classroom. They were split into auditory distractions (sound of cars passing outside, and ambient classroom noise of children talking, pencil dropping and chairs moving) and visual distractions (construction worker entering and exiting the classroom, paper planes flying in the classroom, the fan rotating and a car passing outside) inspired by Rizzo et al. [19]. These distractions were not used in our experiments with the prototype system, since the aim of this pilot study was to prove that P300 is indeed a measure of attention. By excluding distractions in our preliminary experiment, we were able to assume attention on all relevant stimuli. These distractions will be included in future training sessions on ADHD subjects.

Microsoft kinect
Following an idea from the work by Lee [20], an immersive "sense of presence" 3D illusion can be produced on a standard monitor. The MICROSOFT Kinect, an infrared camera measuring depth values, makes it possible to track a person in front of it. The recorded depth values were sorted into 32 bins covering a depth range of 4096 mm, with each bin representing a depth resolution of 128 mm. The torso of the subject in front of the camera is in general flat and occupies a large area. Therefore, the bin with the largest count of depth values and the two bins next to it were defined to represent the subject. The neighboring bins were used to increase the robustness when the subject bends. The depth values falling in the three bins were averaged together, and the modulus was calculated with respect to the frame size to find the (x, y, z) position. The position was sent to an "off-axis perspective projection" algorithm [21]. The algorithm was implemented inside UNITY and updates the 3D view of the entire VR classroom based on the received position. The subject was able to look around, above, below or closer in the VR classroom simply by moving left or right, down, up, or closer to the monitor, respectively.
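The tracking and projection steps above can be sketched as follows. This is a minimal sketch and not the authors' UNITY implementation: the function names `track_torso` and `off_axis_frustum` are hypothetical, the "modulus with respect to the frame size" is assumed to mean recovering the pixel column from a flat row-major pixel index, and the screen is assumed to be a rectangle centred at the origin of the plane z = 0 with the viewer at z > 0.

```python
N_BINS = 32
BIN_MM = 128                    # 32 bins x 128 mm = 4096 mm depth range
MAX_MM = N_BINS * BIN_MM


def track_torso(depth_mm, width):
    """Estimate the viewer position from one depth frame.

    depth_mm: flat, row-major list of depth values in mm; values outside
    (0, 4096) are treated as invalid (an assumption).
    Returns (x, y, z) = (mean column, mean row, mean depth) or None.
    """
    counts = [0] * N_BINS
    for d in depth_mm:
        if 0 < d < MAX_MM:
            counts[int(d) // BIN_MM] += 1
    if max(counts) == 0:
        return None
    peak = counts.index(max(counts))          # bin with the largest count
    lo, hi = max(peak - 1, 0), min(peak + 1, N_BINS - 1)
    sx = sy = sz = n = 0
    for i, d in enumerate(depth_mm):
        if 0 < d < MAX_MM and lo <= int(d) // BIN_MM <= hi:
            sx += i % width                   # modulus -> pixel column
            sy += i // width                  # integer division -> pixel row
            sz += d
            n += 1
    return (sx / n, sy / n, sz / n)


def off_axis_frustum(eye, screen_w, screen_h, near):
    """Asymmetric view frustum (left, right, bottom, top) at the near plane.

    eye = (ex, ey, ez) in the same units as the screen size, with ez > 0.
    The screen edges are projected onto the near plane from the eye position.
    """
    ex, ey, ez = eye
    scale = near / ez
    left = (-screen_w / 2 - ex) * scale
    right = (screen_w / 2 - ex) * scale
    bottom = (-screen_h / 2 - ey) * scale
    top = (screen_h / 2 - ey) * scale
    return left, right, bottom, top
```

When the eye is centred in front of the screen the frustum is symmetric; moving the eye to the right shifts both `left` and `right` towards negative values, which is what produces the look-around effect.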

Experimental setup
Six healthy young subjects (one female and five males) aged between 24 and 32, participated in this pilot study (the first subject was excluded due to a different procedural method). The EEG recordings were done using four electrodes. The unipolar reference electrode was placed at the left earlobe, a ground electrode at Fpz, an electrooculography (EOG) electrode below the left eye, and a measurement electrode at Pz. These positions were in accordance with the 10-20 international standard of EEG electrode placement. A fifth electrode input was activated as a trigger channel. Each time a stimulus was presented to the user, a trigger was set as a simultaneous time-stamp to the EEG.
The electrodes were all attached to the GTEC bio-amplifier (G.USBamp), which digitized the signal at 256 Hz, and band-pass (0.5 − 30 Hz) filtered with an 8th-order Butterworth filter. The amplifier was connected with a USB port to a 32-bit Windows XP computer using MATLAB R2008 to run the amplifier. The VR classroom interface was running on a 64-bit Windows 7 with a Xeon processor. A UDP connection was established between the two computers. Figure 4 illustrates the setup with the VR interface that was presented to the user.
The subjects reported no discomfort with the experiments or mounting of the electrodes. They reported that the questions and competition element made it motivating to stay engaged during the entire recording session.

Classifier
We chose a subject-specific classifier for the prototype system, following the findings by Thulasidas et al. ([22], Table 1). The collected EEG signals were further low-pass filtered by a 20-sample moving-average filter, whose first spectral null lies at 256/20 = 12.8 Hz, to minimize possible muscle artifacts while still preserving the P300 potential [23]. The VR classroom training system gave rise to eye movements from the subjects. An adaptive filter based on the Recursive Least Squares (RLS) algorithm [24] was used to minimize the EOG artifacts in the EEG signals.
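To make the smoothing step concrete, the following sketch implements a 20-sample moving average and its magnitude response at the 256 Hz sampling rate. The names `moving_average` and `magnitude_response` are illustrative, and the causal form with a shortened window during warm-up is an assumption, since the text does not say how the filter edges were handled. The response has its first null at fs/N = 12.8 Hz; its −3 dB point lies lower, at roughly 5.7 Hz.

```python
import math

FS_HZ = 256      # sampling rate from the text
WINDOW = 20      # moving-average length from the text


def moving_average(signal, window=WINDOW):
    """Causal moving average over the last `window` samples.

    During warm-up (fewer than `window` samples seen) the average is taken
    over the samples available so far (an assumption).
    """
    out = []
    running = 0.0
    for i, x in enumerate(signal):
        running += x
        if i >= window:
            running -= signal[i - window]
        out.append(running / min(i + 1, window))
    return out


def magnitude_response(f_hz, window=WINDOW, fs=FS_HZ):
    """|H(f)| of the moving average: |sin(pi f N / fs) / (N sin(pi f / fs))|."""
    if f_hz == 0:
        return 1.0
    num = math.sin(math.pi * f_hz * window / fs)
    den = window * math.sin(math.pi * f_hz / fs)
    return abs(num / den)
```

A 12.8 Hz sinusoid has a period of exactly 20 samples at 256 Hz, so the filter averages over one full cycle and removes it completely, while slower components such as the P300 pass with little attenuation.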
The preprocessed signal was divided into epochs, each representing a displayed stimulus from onset until 700 ms after onset. Five epochs representing identical stimuli (same row/column or same image) were then averaged together. An equal number of P300 epochs and non-P300 epochs was ensured by removing a large portion of the non-P300 epochs. The grand average signals from the P300 epochs (solid lines) and the non-P300 epochs (dashed lines) are displayed in Fig. 5.
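The epoching, five-fold averaging and class balancing can be illustrated with the short sketch below. All function names are hypothetical, stimulus onsets are assumed to be available as sample indices taken from the trigger channel, and the non-P300 epochs to discard are assumed to be chosen at random, which the text does not state.

```python
import random

FS_HZ = 256
EPOCH_S = 0.7
EPOCH_LEN = int(EPOCH_S * FS_HZ)   # 179 samples per epoch


def extract_epochs(eeg, onsets):
    """Cut one epoch of EPOCH_LEN samples per stimulus onset (sample index)."""
    return [eeg[t:t + EPOCH_LEN] for t in onsets if t + EPOCH_LEN <= len(eeg)]


def average_in_groups(epochs, group_size=5):
    """Average consecutive groups of `group_size` epochs sample by sample.

    The caller is expected to pass epochs of identical stimuli only.
    """
    averaged = []
    for start in range(0, len(epochs) - group_size + 1, group_size):
        group = epochs[start:start + group_size]
        averaged.append([sum(col) / group_size for col in zip(*group)])
    return averaged


def balance(p300, non_p300, seed=0):
    """Keep as many non-P300 epochs as there are P300 epochs (random subset)."""
    rng = random.Random(seed)
    return p300, rng.sample(non_p300, len(p300))
```

With 15 repetitions per row or column, `average_in_groups` yields three averaged epochs per stimulus and trial.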

Non-parametric permutation test
Before extracting features from the preprocessed EEG signal, an interval defining the P300 occurrence has to be located for each subject. A permutation test using the paired t-test statistic, corrected for the Multiple Comparisons Problem (MCP) by the $t_{\max}$ method [25], made it possible to locate samples that differed significantly between the two conditions. The MCP has to be addressed because all sample points within an epoch are tested: an epoch of 700 ms sampled at 256 Hz corresponds to 179 samples to test for significance. The family-wise probability of a Type I error across the 179 tests ($\alpha_{\mathrm{fam}}$) was set at 0.05. The procedure was to calculate the paired t-value for every sample and then compare it against the non-parametric null distribution generated by permuting the samples between P300 and non-P300 epochs. Each permutation generates 179 new t-values, but only the largest ($t_{\max}$) was stored for the null distribution. The positive, statistically significant samples were chosen to define the P300 interval.
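The procedure above can be sketched in a few lines. This is an illustration under assumptions rather than the authors' code: the names `paired_t` and `tmax_permutation_test` are hypothetical, the pairing of P300 and non-P300 epochs is taken as given, permuting "between P300 and non-P300 epochs" is implemented as swapping the labels within a pair (which flips the sign of that pair's difference), and the number of permutations is a free parameter that the text does not report.

```python
import math
import random


def paired_t(diffs):
    """Paired t-value per sample; diffs[k][n] = P300 minus non-P300, pair k."""
    k = len(diffs)
    t_values = []
    for col in zip(*diffs):
        mean = sum(col) / k
        var = sum((d - mean) ** 2 for d in col) / (k - 1)
        t_values.append(mean / math.sqrt(var / k) if var > 0 else 0.0)
    return t_values


def tmax_permutation_test(p300, non_p300, n_perm=1000, alpha=0.05, seed=0):
    """Return (observed t-values, critical t, indices of significant samples).

    Only the largest t-value of each permutation enters the null
    distribution, which controls the family-wise error rate at `alpha`.
    """
    rng = random.Random(seed)
    diffs = [[a - b for a, b in zip(ea, eb)] for ea, eb in zip(p300, non_p300)]
    observed = paired_t(diffs)
    null_tmax = []
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            sign = rng.choice((1, -1))     # swap labels within this pair
            flipped.append([sign * d for d in row])
        null_tmax.append(max(paired_t(flipped)))
    null_tmax.sort()
    rank = min(math.ceil((1 - alpha) * n_perm), n_perm)
    threshold = null_tmax[rank - 1]
    significant = [n for n, t in enumerate(observed) if t > threshold]
    return observed, threshold, significant
```

Because only positive excursions above the critical value are kept, the returned indices correspond to the positive significant samples that would define the P300 interval.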

Feature extraction
The processed epochs constitute a matrix $S$, shown in Eq. (1), with epochs ($m = 1, 2, \ldots, M$) along the rows and sample numbers ($n = 1, 2, \ldots, N$) along the columns:
$$S = \begin{bmatrix} s_{1,1} & s_{1,2} & \cdots & s_{1,N} \\ s_{2,1} & s_{2,2} & \cdots & s_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ s_{M,1} & s_{M,2} & \cdots & s_{M,N} \end{bmatrix} \qquad (1)$$
The collected data contain M = 240 rows (8 flash targets × 15 repetitions / 5 averages × 10 trials), cut down to M = 120 by equalizing the number of P300 and non-P300 epochs as described earlier, and N = 179 samples. A total of 24 features were extracted from $S$. We used features targeting the temporal shape of the recorded epochs, and (dis)similarity features produced by comparing the epochs with a template epoch generated during the ANISPELL recordings. The most relevant features are presented below (three temporal features and three template features).
(i) Standard fraction: The ratio between the standard deviations in the P300 interval ($t_p$) and the baseline interval ($t_b$), the latter defined here as the interval from onset until 200 ms after onset (the intervals are shown in Fig. 6). It is denoted $f^{3}_{m}$ in Eq. (2), where $T_p$ and $T_b$ are the numbers of samples in $t_p$ and $t_b$, respectively, and $\mu_m(\cdot)$ are the corresponding mean values of epoch $m$.
(ii) Power fraction: The ratio between the total power in the P300 interval ($P_{t_p}$) and in the baseline interval ($P_{t_b}$), denoted $f^{5}_{m}$ in Eq. (3).
(iii) Triangle area: The area of a triangle within the P300 interval (as shown in Fig. 6), denoted $f^{23}_{m}$ in Eq. (4), where $|\cdot|$ denotes the determinant.
(iv) Template-based feature of Eq. (5), where $\mathbf{s}_m = [s_{m,1}, s_{m,2}, \ldots, s_{m,N}]^T$, $\mathbf{s}^*$ is a column vector representing the template epoch and $\|\cdot\|$ denotes the Euclidean norm.
(v) Pearson correlation: A measure of the linear relationship between the epoch and the template. It is calculated using Eq. (6) after making the vectors zero-mean and unit-variance, and is denoted $f^{9}_{m}$.
(vi) Weighted Euclidean: A weighted Euclidean distance between the epoch and the template, denoted $f^{12}_{m}$ in Eq. (7), where $D = \mathrm{diag}(w_1, w_2, \ldots, w_N)$ is an $N \times N$ diagonal matrix whose elements are the weights of the sample points (the $t_p$ interval was weighted four times as heavily as the rest).
At this stage we had a feature matrix of size 120 × 24 for the ANISPELL data. From this matrix, 25 % of the data was allocated as a performance set to evaluate the classifier.
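As a rough illustration of four of the features above, the sketch below gives one plausible form for each, consistent with the verbal definitions. These forms are assumptions, not the paper's Eqs. (2), (3), (6) and (7): the standard deviation is taken with a 1/T normalization, "total power" is taken as the sum of squared samples, and all function names are hypothetical. The triangle area and feature (iv) are left out because their exact definitions cannot be recovered from the text.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _std(values):
    mu = _mean(values)
    return math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))


def standard_fraction(epoch, p_idx, b_idx):
    """(i) std in the P300 interval divided by std in the baseline interval."""
    return _std([epoch[n] for n in p_idx]) / _std([epoch[n] for n in b_idx])


def power_fraction(epoch, p_idx, b_idx):
    """(ii) summed squared amplitude in the P300 interval over the baseline."""
    p_power = sum(epoch[n] ** 2 for n in p_idx)
    b_power = sum(epoch[n] ** 2 for n in b_idx)
    return p_power / b_power


def pearson(epoch, template):
    """(v) Pearson correlation between an epoch and the template."""
    me, mt = _mean(epoch), _mean(template)
    num = sum((a - me) * (b - mt) for a, b in zip(epoch, template))
    den = math.sqrt(
        sum((a - me) ** 2 for a in epoch) * sum((b - mt) ** 2 for b in template)
    )
    return num / den


def weighted_euclidean(epoch, template, weights):
    """(vi) sqrt((s - s*)^T D (s - s*)) with D = diag(weights)."""
    return math.sqrt(
        sum(w * (a - b) ** 2 for a, b, w in zip(epoch, template, weights))
    )
```

For the weighted distance, a weight vector with 4 inside the subject-specific P300 interval and 1 elsewhere would match the weighting described in (vi).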

Support vector machine
The Support Vector Machine (SVM) is a popular classifier for binary separation (here, P300 vs non-P300 epochs) owing to its stability, low variance and generalization properties, which stem from a regularization term (C) and the maximum-margin method [26].
Since the SVM involves only dot products of the features, a nonlinear separation can be achieved by the kernel trick [27]. The Gaussian kernel, the most popular kernel, was used in this study and is shown in Eq. (8) [26], where $\mathbf{x}_i$ is the i'th feature vector and σ denotes the smoothing parameter, which models the degree of nonlinearity. Introducing the nonlinear Gaussian-kernel SVM left two parameters to be tuned: σ and C. These parameters were optimized, through a 3-fold cross-validation (CV), by the Pattern Search (PS) algorithm included in the Global Optimization Toolbox in MATLAB. Starting from a given point, PS searches for the maximum performance in the {σ, C} space by a derivative-free direct search. The Matthews correlation coefficient (φ) was chosen as the performance criterion since it takes all four classification outcomes into account in a single value [28]. A value of φ = 1 indicates a perfect prediction, φ = 0 a random prediction, and φ = −1 a complete mismatch. In each CV fold, PS was run with ten different starting points and with one feature at a time, selected through a forward sequential search algorithm. The feature with the highest averaged CV performance was kept, and a new round with two features was run. This was continued until the averaged test-set performance no longer improved. Pseudo-code for the algorithm is shown in Algorithm 1. The features selected for each subject are shown in Table 1.
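The building blocks of this selection loop can be sketched as follows. This is a simplified illustration, not a reproduction of Algorithm 1: the kernel is written in its common form exp(−‖x_i − x_j‖² / (2σ²)), which is assumed to match Eq. (8), and `forward_selection` takes a caller-supplied `cv_score` function (hypothetical) that would wrap the cross-validated, pattern-search-tuned SVM and return the averaged φ for a given feature subset.

```python
import math


def gaussian_kernel(x_i, x_j, sigma):
    """Gaussian (RBF) kernel in the common form exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def matthews_cc(y_true, y_pred):
    """Matthews correlation coefficient for labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / den if den else 0.0


def forward_selection(n_features, cv_score):
    """Greedy forward sequential search over feature indices.

    cv_score(subset) must return the averaged CV performance (e.g. phi)
    for the given list of feature indices. The search stops as soon as
    adding the best remaining feature no longer improves the score.
    """
    selected, best = [], -math.inf
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scores = {f: cv_score(selected + [f]) for f in candidates}
        f_best = max(scores, key=scores.get)
        if scores[f_best] <= best:
            break
        selected.append(f_best)
        best = scores[f_best]
    return selected, best
```

In a fuller version, `cv_score` would itself run the {σ, C} search from several starting points inside each fold before reporting the averaged φ.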

Results and discussion
The MCP-corrected permutation test was used to locate a subject-specific P300 interval. The P300 interval was successfully observed for all five subjects and is visualized in Fig. 7, which shows the t-values as a function of time. The significant t-values are colored red and blue, while gray represents non-significant time instances. Since P300 is a positive potential, the areas of interest are those in the red color map. Subject 1 had the smallest average statistically significant P300 interval (t(1, 9) = 5.78, p < 0.0153), while subject 4 had the largest (t(1, 9) = 17.31, p < 0.0022). These time instances were used as the P300 interval for the SVM analysis. The result of the analysis is displayed in Table 1.
The best performance was found for subject 4, with an error rate of 0.23. The PS algorithm proved effective, with subject 1 achieving a drop in error rate from 0.47 to 0.37. The φ coefficient for the performance set indicated better performance than random guessing. The AUC value, being above 0.5, indicated a separation between the two classes of epochs. The developed non-intrusive VR BCI system achieved an average error rate below 0.30. The performance is comparable to that reported in the literature but is limited by the child-friendly setup. The setup was based on four electrodes, of which only one was used for classification. A single recording electrode is rarely used to capture the P300 response, as several electrodes can increase the number of averaged epochs, as seen in the P300-controlled apartment by Bayliss [29], which utilized 9 electrodes. In the preprocessing step, the number of epochs for averaging was kept as low as five, whereas it is usually in the order of 10 − 20. In the classification step, we detected P300 or non-P300 epochs with no a priori assumptions, while many P300 classifiers use trial-based soft-score measures in which one epoch (the most likely) is classified as a P300 epoch [30].
A careful tradeoff is needed between simplicity and performance, since performance is particularly important when BCI is used for neuro-rehabilitation. The user takes the feedback to be the correct solution, so many false classifications will inhibit improvement. Besides accuracy, sufficiently fast feedback is also important. Feedback from the P300 response arrives after five averages of the target, which corresponds to a time between 3.4 s and 5 s, depending on the pseudorandom order. This is slower than the usual neurofeedback methods that utilize β and θ waves, and improvement in this area will be needed for a future setup.
The large variability in the selected features reflects the small amount of data available to the SVM analysis. Nevertheless, $f^{9}_{m}$ proved superior, since it was frequently selected as the best feature to discriminate P300 epochs from non-P300 epochs. Only 40 % of the selected features were temporal features, the rest being template-based features. Therefore, the template-generating trials were an important and somewhat limiting factor in the classification method. ADHD subjects are expected to have larger inter-subject differences in the P300 potential, reflecting cognitive and processing difficulties [31]. The T-SEARCH game may prove more difficult than the flashing ANISPELL game, from which the template signal was derived, and could easily increase the number of false negatives from the template-based features. Furthermore, as argued by Wolpaw et al. [32], the P300 is likely to change over time, reflecting task adaptation. Consequently, as the user progresses, the template has to be updated to accommodate the changes in latency and amplitude.

Conclusions
With the results and feedback from the healthy subjects who participated in the first-of-a-kind P300-based VR training system for ADHD subjects, it is apparent that the system addresses the shortcomings seen in previous neurofeedback studies. The automatic P300 detection classifier was trained through forward selection of features derived from a template of the P300 signal and from temporal patterns. Although the non-linear SVM only used five repetitions of the same stimulus, the classifier still managed to detect the P300 with an average error below 0.30.
The P300 was shown to be connected to attention in the healthy subjects, which suggests a positive effect on ADHD subjects as well. With the results of this preliminary study demonstrating our prototype system, we hope to encourage the neurofeedback community to expand their BCI systems with interactive settings, competitive games, and