Real-time brain computer interface using imaginary movements

Background: Brain Computer Interface (BCI) is the method of transforming mental thoughts and imagination into actions. A real-time BCI system can improve the quality of life of patients with severe neuromuscular disorders by enabling them to communicate with the outside world. In this paper, the implementation of a 2-class real-time BCI system based on the event related desynchronization (ERD) of the sensorimotor rhythms (SMR) is described. Methods: Off-line measurements were conducted on 12 healthy test subjects with 3 different feedback systems (cross, basket and bars). From the collected electroencephalogram (EEG) data, the optimum frequency bands for each of the subjects were determined first through an exhaustive search on 325 bandpass filters. The features were then extracted for the left and right hand imaginary movements using the Common Spatial Pattern (CSP) method. Subsequently, a Bayes linear classifier (BLC) was developed and used for signal classification. These three subject-specific settings were preserved for the on-line experiments with the same feedback systems. Results: Six of the 12 subjects were qualified for the on-line experiments based on their high off-line classification accuracies (CAs > 75 %). The overall mean on-line accuracy was found to be 80%. Conclusions: The subject-specific settings applied on the feedback systems have resulted in the development of a successful real-time BCI system with high accuracies.


Background
Brain Computer Interface (BCI), the method of transforming mental thoughts and imagination into actions, has been a very interesting and challenging research topic in neuroscience in recent years. The primary reason for this interest is that it can help to improve the quality of life of patients with severe neuromuscular disorders. BCI-based systems enable such patients to communicate with the outside world even without the output channels of peripheral nerves and muscles [1][2][3][4][5]. They can be used for communication (e.g. a spelling device) [6,7], interaction with external devices (e.g. controlling a wheelchair) [8,9], rehabilitation [10,11] and/or monitoring of mental states [12,13].
It is well established that imagining movements of the left and right hands results in event-related desynchronization (ERD) of the sensorimotor rhythms (SMR) in the contralateral sensorimotor areas and event-related synchronization (ERS) on the ipsilateral side [31,32]. The corresponding distinguishable features in the EEG signals can be used to design motor imagery (MI) based BCI systems [24][25][26][27][28][29][30]. Many BCI studies have reported good offline results with high accuracies. However, a BCI system becomes of practical interest only when it is able to work in real time. In this paper, a real-time MI-based BCI system was developed in which imagined movements of the left and right hands were tested, resulting in a system with a 2-class output [33][34][35]. Figure 1 illustrates the schematic of the BCI system developed in our laboratory at the Technical University of Denmark (DTU), referred to as the DTU-BCI scheme from here on. Using this set-up, offline experiments were first conducted on 12 test subjects with imaginary left/right hand movements as a calibration session to: (i) determine the optimal frequencies that give the best discrimination between the two classes, (ii) create a feature extraction filter that maximizes the distance between the two class features, and (iii) train a classifier for the online measurements. In the online measurements, the data are first filtered with the optimal bandpass filter (obtained from the offline analysis), then the features are extracted using the feature extraction procedure from the offline analysis, and finally the feature vector is classified using the trained classifier.

Methods
In the DTU-BCI set-up shown in Fig. 1, 28 EEG surface electrodes placed on (and around) the motor cortex were used [7,36,37]. Furthermore, EMG electrodes were placed on both wrists during the offline measurements to verify that the arm muscles remained passive. Twelve healthy test subjects (seven males and five females, with an average age of 23 ± 2.6 years) took part in this study. None of them had a previous history of neurological diseases or disorders that might influence the experiments. Each participant went through an Edinburgh Handedness test [38]. The handedness test showed that six males and all females were right-handed, and only one male was left-handed. Each test subject was instructed about the measurement procedures and protocols before the first session. All subjects received remuneration for their participation.
Several studies have shown that the ERD signals can be localized at the sensorimotor cortex [39]. However, the SMR waves in the EEG are generally weak and it is impossible to classify the raw EEG directly [1,40]. Therefore, the EEG data were processed in order to extract the relevant SMR features, which are distinguishable enough to be used as different control signals in a BCI set-up.

Feature extraction
We used the Common Spatial Patterns (CSP) algorithm to extract the features from the collected EEG, as it has been shown to be very efficient for feature extraction in 2-class BCI systems based on movement imagination [7,34]. In our approach, the CSP filter was found from the labeled offline data, a large matrix (V) of dimension N × T, where N (= 28) is the number of channels and T is the number of samples in each channel (which depends on the window length). The data matrix contains 160 mixed trials: 80 trials of right hand imaginary movements (r-trials) and 80 trials of left hand imaginary movements (l-trials).
Let $V_r$ and $V_l$ be the r-labeled and l-labeled trials, respectively. The corresponding covariance matrices, $\Sigma_r$ and $\Sigma_l$, were estimated to calculate $\Sigma_c$, the composite spatial covariance matrix of the data:

$$\Sigma_c = \bar{\Sigma}_r + \bar{\Sigma}_l$$

where the bar represents the average over trials. Using the eigenvectors $B_c$ and eigenvalues $\lambda_c$ of $\Sigma_c$, we transformed the data into the eigenvector space:

$$\Sigma_c = B_c \lambda_c B_c^T$$

The next step was the whitening transformation of the data. Using $W = \lambda_c^{-1/2} B_c^T$, both $\bar{\Sigma}_r$ and $\bar{\Sigma}_l$ were whitened as:

$$S_r = W \bar{\Sigma}_r W^T, \qquad S_l = W \bar{\Sigma}_l W^T$$

Note that both $S_r$ and $S_l$ share the same eigenvectors [41], and hence they can be expressed as:

$$S_r = U \lambda_r U^T, \qquad S_l = U \lambda_l U^T$$

The fact that $\lambda_r + \lambda_l = I$ leads to an important and beneficial condition: the eigenvector with the largest eigenvalue in $S_r$ has the smallest eigenvalue in $S_l$. In order to reduce the dimension, the m (= 3 in our case) largest and m smallest eigenvalues with their corresponding eigenvectors were extracted and used for the data transformation. Using W, the final CSP filter was:

$$P = U^T W$$

where U contains only the 2m selected eigenvectors. The raw data matrix V was then projected onto the CSP space as follows:

$$Z = P V$$

Thus, the dimension was reduced from 28 × T to 2m × T. The variances along the 2m rows of Z were calculated and normalized in order to extract the features (x).
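The CSP steps above can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the trace normalization of each trial covariance and the function names are assumptions, and the features are the plainly normalized variances described in the text.

```python
import numpy as np

def trial_covariance(trial):
    """Spatial covariance of one (channels x samples) trial.
    Normalizing by the trace is an assumption, not stated in the paper."""
    c = trial @ trial.T
    return c / np.trace(c)

def csp_filter(trials_r, trials_l, m=3):
    """Return a (2m x N) CSP projection matrix from two lists of (N x T) trials."""
    cov_r = np.mean([trial_covariance(t) for t in trials_r], axis=0)
    cov_l = np.mean([trial_covariance(t) for t in trials_l], axis=0)
    cov_c = cov_r + cov_l                      # composite covariance
    lam_c, b_c = np.linalg.eigh(cov_c)         # cov_c = B_c diag(lam_c) B_c^T
    w = np.diag(lam_c ** -0.5) @ b_c.T         # whitening matrix W
    s_r = w @ cov_r @ w.T                      # whitened r-class covariance
    lam_r, u = np.linalg.eigh(s_r)             # shared eigenvectors U
    u = u[:, np.argsort(lam_r)[::-1]]          # sort by descending eigenvalue in S_r
    p_full = u.T @ w
    n = p_full.shape[0]
    keep = np.r_[0:m, n - m:n]                 # m largest and m smallest
    return p_full[keep]

def csp_features(p, trial):
    """Normalized variances of the 2m CSP-projected rows."""
    z = p @ trial
    var = np.var(z, axis=1)
    return var / var.sum()
```

With the rows ordered this way, the first m features tend to be large for r-trials and the last m for l-trials.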

Classification
The feature vector (x) extracted from the EEG data needs to be classified as l or r. In this work, the Bayes linear classifier (BLC) is used, which is known to be optimal when the attributes are independent given the class [42]. It is one of the simplest classifiers and is based on minimizing the classification error probability [41,43]. The classifier calculates the conditional probabilities $P(w_l \mid x)$ and $P(w_r \mid x)$, and the class with the largest probability given x is chosen. Assuming that both classes are Gaussian [40] and have equal covariance matrices, i.e., $\Sigma_l = \Sigma_r = \Sigma$, we obtain linear discriminant functions for the two classes. In order to perform an online classification using these functions, the covariance matrix $\Sigma$ and the two class means $\mu_l$ and $\mu_r$ must be known. These were calculated from the offline data using 3 × 5-fold cross-validation. In the online BCI, no cross-validation was applied. Instead, data from a sliding window were first bandpass filtered and then CSP-filtered using the parameters extracted from the offline measurements. The features of each online data segment were classified using the BLC. Each classification was then fed into a voting system that gave the final classification accuracy (CA) after, e.g., 8 data segments. The following parameters were used in the voting system: segment lengths of 0.5-1.5 s, window overlaps of 90 and 95 %, and between 6 and 15 segments. All test parameters were individually selected using a graphical user interface.
Though the parameters estimated from the offline analysis were used in the online measurements, the data from the online measurements could show considerable variation [44]. These variations may be due to non-stationarities caused by small changes in electrode positions, drying conductive gel or electrodes with high impedances, brain plasticity (especially after several sessions), or variations in the cognitive state of the user, e.g. motivation and attention.
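The sliding-window segmentation and the voting step can be sketched as below. The paper does not specify the voting rule or the sampling rate, so the strict-majority rule, the `min_share` parameter and the 500 Hz rate used in the usage note are assumptions for illustration only.

```python
from collections import Counter

def segment_starts(n_samples, fs, seg_len_s, overlap):
    """Start indices of sliding-window segments with a fractional overlap.
    Returns (list of start indices, segment length in samples)."""
    seg = int(round(seg_len_s * fs))
    step = max(1, int(round(seg * (1.0 - overlap))))
    return list(range(0, n_samples - seg + 1, step)), seg

def vote(labels, min_share=0.5):
    """Majority vote over per-segment labels ('l' or 'r').
    Returns the winning label, or None ("no classification") when no
    label exceeds min_share of the votes. The rule itself is assumed."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) > min_share else None
```

For example, with a hypothetical 500 Hz sampling rate, 1 s segments and 90 % overlap, consecutive segments start 50 samples apart, and `vote(list("llllrrll"))` yields `'l'` while a 4-4 split yields no classification.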

BLC
This classifier uses the standard classification rules from Eqs. (10) and (11). The rules can be expressed in terms of the sign of the difference $(D_r - D_l)$, with the decision threshold at zero. Since the data are assumed to be Gaussian distributed, the expression $(D_r - D_l)$ is also Gaussian. To illustrate the decision criteria in the BLC, Fig. 2 shows the probability density function (pdf) of $(D_r - D_l)$ when performing the left and right hand movement imagination.
The red dotted line in the figure represents the pdf of $(D_r - D_l)$ when performing imaginary left hand movements and the blue solid line represents the pdf of $(D_r - D_l)$ when performing imaginary right hand movements. The vertical green line is the decision threshold (= 0). It is worth noting that even though the subject-specific CSP filter maximizes the separation between the two classes, there will always be an overlap (due to the nature of the distributions). Moreover, because of the variations mentioned earlier, the means of $(D_r - D_l)$ for left and right imaginary movements are unlikely to be equidistant from the threshold. Therefore, this classifier may result in a lower classification accuracy in the online BCI.
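A two-class Gaussian classifier with a shared covariance matrix and a zero threshold on the discriminant difference can be sketched as follows. The exact form and sign convention of $D_r$ and $D_l$ are not recoverable from the text, so the log-posterior discriminant below (class r when $D_r - D_l > 0$), the equal priors and the function names are assumptions.

```python
import numpy as np

def fit_blc(x_l, x_r):
    """Estimate class means and the pooled covariance.
    x_l, x_r: arrays of shape (trials, features)."""
    mu_l, mu_r = x_l.mean(axis=0), x_r.mean(axis=0)
    centered = np.vstack([x_l - mu_l, x_r - mu_r])
    sigma = centered.T @ centered / (len(centered) - 2)
    return mu_l, mu_r, sigma

def discriminant(x, mu, sigma_inv, prior=0.5):
    """Linear discriminant of a Gaussian class with shared covariance."""
    return x @ sigma_inv @ mu - 0.5 * mu @ sigma_inv @ mu + np.log(prior)

def classify(x, mu_l, mu_r, sigma):
    """Return 'r' if D_r - D_l > 0, otherwise 'l' (threshold at zero)."""
    sigma_inv = np.linalg.inv(sigma)
    d = discriminant(x, mu_r, sigma_inv) - discriminant(x, mu_l, sigma_inv)
    return "r" if d > 0 else "l"
```

Under this convention, a shift of the online feature distribution moves both class-conditional pdfs of the difference relative to the fixed zero threshold, which is the asymmetry discussed above.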

Feedback systems
To ensure that the output of the online signal processing is fully utilized, a proper feedback paradigm should be designed. A feedback system receives the control signal consisting of commands (left, right, no classification), and converts them into a visual event on the screen. We used three different feedback systems: Cross feedback, Basket feedback, and Bar feedback (Fig. 3).

Cross feedback
This consists of a blue cross in the middle of the screen and two grey bars (the left goal and the right goal), one on each side of the screen (Fig. 3a). The objective is to move the cross to the left or the right goal by performing imaginary left or right hand tapping, respectively. At the beginning of each trial, one of the two goals turns red and becomes the target goal. The cross moves one step to the side that corresponds to the online command. In other words, if the final classification decision is left (l), the cross moves one step to the left. If it is right (r), the cross moves one step to the right. The cross stays in the same position if the command is "no classification". Each session consists of ten pseudo-random trials, five left and five right. After each trial, a 5-s pause is given. Once the cross has reached one of the goals, the system saves the selection and ends the trial.
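The movement logic of the cross can be sketched as a small replay function. The function name and the representation of commands as `'l'`, `'r'` or `None` are illustrative assumptions; the default of ten steps to each goal follows the distance reported later in the paper.

```python
def run_cross_trial(commands, distance=10):
    """Replay a sequence of final classifications for one trial.
    commands: iterable of 'l', 'r' or None (no classification).
    Returns (selected goal or None, number of commands consumed)."""
    position = 0
    used = 0
    for used, cmd in enumerate(commands, start=1):
        if cmd == "l":
            position -= 1          # one step towards the left goal
        elif cmd == "r":
            position += 1          # one step towards the right goal
        # None leaves the cross where it is
        if abs(position) >= distance:
            return ("l" if position < 0 else "r"), used
    return None, used              # no goal reached
```

A single wrong step costs two extra commands (one to undo it, one to redo the progress), which is why misclassifications lengthen trials quickly in this paradigm.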

Basket feedback
This system consists of a blue ball at the top of the screen and two goals at the bottom (Fig. 3b). When the trial begins, the ball starts falling at a constant speed. The objective is to move the ball, by means of imaginary hand movements, to the target goal (the one colored red). The ball speed can be adjusted before running the measurement. This system had the same trial construction as the cross paradigm. Each test subject went through this paradigm several times, and the ball speed was varied for some of the subjects. Table 4 lists the mean trial durations (MTDs) and CAs of the subjects performing the Basket feedback.

Bar feedback
There is no object to move in this feedback system. The feedback consists of two bars, one on each side of the screen. These bars are initially empty and are filled gradually after each final classification (Fig. 3c). The trial begins with an arrow appearing in the middle of the screen, instructing the test subject which imaginary movement to perform, i.e. which bar to fill up. If the arrow points to the right, the test subject has to imagine a right hand movement in order to fill up the right bar, and vice versa. It takes 10 steps to fill a bar. Once one of the bars is filled, a selection is made. Therefore, a trial has a minimum duration of ten steps and a maximum duration of 19 successful steps (nine steps on one bar plus ten on the other). The step duration is the time between two final classifications.
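The 10-to-19 step range follows directly from the fill rule, as the sketch below illustrates. It assumes that "no classification" outcomes fill neither bar and are not counted as steps; the function name is hypothetical.

```python
def run_bar_trial(classifications, steps_to_fill=10):
    """Fill the left/right bars one step per final classification.
    Returns (selected side or None, number of counted steps)."""
    fill = {"l": 0, "r": 0}
    steps = 0
    for label in classifications:
        if label not in fill:      # e.g. None: assumed to fill neither bar
            continue
        fill[label] += 1
        steps += 1
        if fill[label] == steps_to_fill:
            return label, steps    # this bar is full: selection made
    return None, steps
```

The shortest trial fills one bar without error (10 steps); the longest alternates until both bars hold nine steps, so the 19th step decides the trial.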

Results and discussions
The most important experiments and their results are presented in this section. The results are based on about 100 h of measurements and many hours of measurement preparation.

Offline BCI
The preliminary offline measurements were carried out to deselect BCI illiterates and to calculate the filters and other parameters for the test subjects [45,46]. Each offline calibration measurement consisted of 80 left and 80 right trials in a pseudo-random order.

ERD plots
One way of visualizing the offline data is to generate ERD plots (showing the distribution of power) of the data from the C3 and C4 electrode locations. The data from these locations are divided into segments corresponding to left and right hand movement trials. The trial-averaged power is then calculated, and the corresponding ERD plots for subject 9 are shown in Fig. 4.
The ERD plots show larger power in C4 than in C3 during left hand movement imagination. On the other hand, C3 has larger power than C4 during right hand movement imagination. Recall that C3 and C4 are located over the left and right sensorimotor cortex hand areas, respectively. Recall also that the left sensorimotor cortex is responsible for the contralateral (right) body movements, and vice versa. The ERD plots confirm this phenomenon during movement imagination. It is worth noting that the power is not uniformly distributed along the frequency axis. Figure 4 shows that the most significant power changes occur at 5-12 Hz. Another active frequency band is 18-23 Hz. When comparing across the subjects, we do not find precisely the same patterns of active frequency bands. This finding supports our choice to select subject-specific frequency bands in the online BCI.
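As a rough illustration of the trial-averaging step, the sketch below computes a trial-averaged power spectrum for one electrode and the mean power in a band. It is a simplification of the time-frequency ERD plots in Fig. 4: the Hann window, the FFT-based estimate and the function names are assumptions, not the authors' procedure.

```python
import numpy as np

def trial_averaged_power(trials, fs):
    """trials: array (n_trials, n_samples) from one electrode, e.g. C3.
    Returns (frequencies in Hz, power spectrum averaged over trials)."""
    trials = np.asarray(trials, dtype=float)
    window = np.hanning(trials.shape[1])            # reduce spectral leakage
    spectra = np.abs(np.fft.rfft(trials * window, axis=1)) ** 2
    freqs = np.fft.rfftfreq(trials.shape[1], d=1.0 / fs)
    return freqs, spectra.mean(axis=0)

def band_power(freqs, power, low, high):
    """Mean power inside the band [low, high] Hz."""
    mask = (freqs >= low) & (freqs <= high)
    return power[mask].mean()
```

Comparing `band_power` for C3 and C4 within, say, 5-12 Hz, separately for left and right trials, gives a single-number view of the lateralization described above.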

Optimal offline results
The offline data were analyzed using a total of 325 bandpass filters. For each pass-band, a CSP filter was calculated and applied to the data. Finally, the filtered data were classified and the CA was found. This exhaustive analysis was performed on a computer cluster via the DTU Newton server. The optimal frequencies and accuracies of each test subject are listed in Table 1.
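The structure of such an exhaustive search can be sketched as follows. The paper does not state how the 325 pass-bands were generated; as an assumption for illustration, note that 26 candidate edge frequencies give exactly 325 (low, high) pairs. The `score` callable stands in for the full bandpass, CSP and cross-validated classification chain.

```python
from itertools import combinations

def candidate_bands(edges):
    """All (low, high) pass-bands with low < high from a set of edge frequencies."""
    return list(combinations(sorted(edges), 2))

def best_band(bands, score):
    """Return the band with the highest score, e.g. cross-validated CA.
    `score` is a hypothetical callable taking a (low, high) tuple."""
    return max(bands, key=score)
```

For example, `candidate_bands(range(5, 31))` (a hypothetical 1 Hz grid from 5 to 30 Hz) yields 325 bands. Since every band is evaluated independently, the search parallelizes trivially across a cluster.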
It can be seen that subject 9 has the highest offline CA (99.38 %). Subjects 10 and 12 also have high CAs. It is important to note that the optimal frequency ranges differ across the subjects. Only two subjects (1 and 9) have optimal results with frequencies roughly in the α-band. The remaining subjects have optimal frequencies either in the β-band or in the combined α-β band. Based on the offline results, the subjects were divided into different categories (Table 2). This shows that half of the subjects had accuracies below 75 %. Since a well-functioning online BCI requires relatively high accuracies, only the subjects with accuracies above 75 % (six subjects: 2, 4, 7, 9, 10 and 12) were considered for the online measurements.

Table 1 The offline results of all test-subjects. The optimal accuracies listed in the right column are found by analyzing the offline data using different bandpass filters. The optimal frequency ranges listed in the middle column are the bandpass filters that result in optimal classification with the optimal accuracy. While recording the EEG data, the electrode impedance was kept under 5 kΩ

Cross feedback
This feedback system was tested 15 times on average for each of the six test subjects in order to obtain the CAs and MTDs shown in Table 3. Considering the CAs, it was found that all the test subjects except subject 2 could control the cross easily. It is also worth noting that, in terms of the MTDs, two of the standard deviations are extremely high (for subjects 2 and 7). By studying the individual sessions of subject 7, it was found that the first session (MTD of 52 s) differed significantly from the rest of the sessions (9.89 ± 3.37 s). Regarding subject 2, the large standard deviation is realistic, since low accuracy generally leads to frequent misclassifications and thereby to prolonged trials. The MTDs of this paradigm are generally high.

Basket feedback
Table 3 clearly shows that all subjects except subject 2 have CAs between 80 and 90 %. Subject 2 struggled to move the ball to the target side each time, whereas the other subjects found it easy to move the ball in the correct direction. The small standard [...] Table 4 lists the average ball speeds for each subject (inter-session and not intra-session ball speed variation). These results indicate that, using the basket paradigm, a CA of 90 % with an MTD of around 8 s is achievable.

Bar feedback
This feedback system was tested on all online subjects except subject 10 (who could not participate due to personal reasons). The mean accuracies and trial durations for the Bar feedback are given in Table 3. Subjects 4, 7, 9, and 12 accomplished the sessions with a mean CA between 77.50 and 100.00 %. Subject 2's mean CA was higher than for the other two feedback systems, but still significantly lower than that of the other subjects. The MTDs are substantially lower than the trial durations for the two other feedback systems.

Inter-subject analysis
The online CAs along with the offline CAs for the six test subjects are illustrated in Fig. 5. It can be clearly seen that although subject 9 had the highest offline CA, the online results were the worst. Subject 7, on the other hand, had the highest grand average online CA (94.17 %). It is worth noting that only subject 7 improved the CA from offline to online. Subject 2 gave poor results compared with the other five subjects, both in terms of CA and MTD. By studying the metadata of the subjects, we found that subject 2 was the only left-handed subject among the six. However, it is premature to conclude from one isolated case that left-handed people perform worse than right-handed ones. It has been reported that BCI control does not work for 15-30 % of subjects [45], and it could therefore be that subject 2 belongs to this group.

Inter-feedback analysis
From the results in Table 3, we can see that: (i) although all three feedback systems resulted in broadly similar CAs, the Cross feedback showed considerably larger deviations, (ii) the MTDs differ significantly from each other (18.31, 12.35 and 8.08 s for the Cross, Basket and Bar feedbacks, respectively), and (iii) the Cross feedback had the highest standard deviation of the mean trial duration of the three paradigms. These findings indicate that the Cross feedback is not as stable as the other two paradigms. It is, however, possible to refine this paradigm, e.g. by reducing the distance to the goals in order to reduce the MTD and at the same time improve the stability.

The learning effect
The learning effect was investigated by allowing one of the subjects (subject 4) to try the Cross feedback on the Mondays of three consecutive weeks; the CAs are plotted in Fig. 6. A small but clear increase in the accuracy can be seen on the second and third measurement days. The red linear trend-line also confirms this improvement. Notice that not only was the accuracy improved each time, but the standard deviation was also reduced.

Trial duration vs. accuracy
Results from the online measurements showed that when the CA was low, reaching the target became a difficult task (e.g. in the Cross feedback). Furthermore, the MTD became prolonged, since the wrong classifications had to be corrected. However, the reverse relation is also valid: a prolonged MTD exhausts the user, leading to reduced imagery performance and thus a reduced CA. Therefore, CA and MTD affect each other. In Fig. 7, the mean CAs are plotted against their corresponding MTDs. The trend-line confirms that a longer MTD is related to a lower CA, and vice versa.

Feedback improvements
During the online measurements with the three feedback systems, a few implementation errors were registered, and potential refinements were suggested. A general problem was detected in all three paradigms. At the beginning of each trial (except the first trial), the first few segments from the sliding window contained data from the previous trial due to the overlap. These 'old' data could therefore affect the first few classifications of the new trial and, in the worst case, the first final classification of each trial (recall the voting process). Another potential improvement concerning all three feedback systems is to display the stepwise moves (of the cross, the ball and the bar fill) as continuous moves. Even though this change is only visual, it may minimize confusion and prevent unconscious step-synchronized body movements. The last common improvement is to show the subjects the instruction of each paradigm (Cross and Basket: red target, Bars: direction arrow) at least a few seconds before each trial. In the current paradigms, it was introduced concurrently with the beginning of the trial. Thus, the subjects spent up to a couple of seconds reacting to these commands while the segments from the sliding window were being classified, possibly wrongly. In the following three subsections, specific improvements and changes of the paradigms are suggested.
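The size of the carry-over effect can be estimated with a short calculation. The sketch assumes that one segment ends exactly at the trial boundary and that segments advance by a fixed step; the sampling rate in the usage note is hypothetical.

```python
def contaminated_segments(seg_samples, step_samples):
    """Number of segments after a trial boundary that still contain
    samples from the previous trial.
    Segment k ends k * step_samples after the boundary and is free of
    old data only once k * step_samples >= seg_samples."""
    return (seg_samples - 1) // step_samples
```

With a hypothetical 500 Hz sampling rate and 1 s segments, 90 % overlap (step of 50 samples) leaves 9 segments with old data and 95 % overlap (step of 25 samples) leaves 19. Both exceed the example of 8 segments per vote, so under these assumptions every segment in the first final classification of a trial would contain some data from the previous trial.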

Improvements of cross
Recall that there were ten steps to each target in the Cross feedback. The reason for implementing such a large distance to the targets was to ensure that there was enough space to correct wrong classifications. The analysis of the online measurements showed that the Cross paradigm resulted in prolonged and varied MTDs. Since all online recordings had been saved, each online session was analyzed in order to investigate the movement behavior of the cross. In other words, the CA and MTD of each session were recalculated after reducing the target distance gradually down to one step. Figure 8 illustrates the mean CAs of each test subject as a function of the number of steps to the targets. It shows that the CAs do not change significantly when the target distance is reduced by a few steps.
If we assume that the maximum tolerable accuracy reduction is 10 %, then three or four steps to the targets would be enough. For a target distance of four steps, the corresponding percentage changes of the mean accuracies of the test subjects were calculated and are tabulated in Table 5. Note that some of the CAs increase when the target distance is reduced. This may be attributed to the fact that some subjects become exhausted when passing the final steps (due to prolonged imagery). Since fatigue may reduce the imagery performance, this could lead to faulty classifications.
The time reduction is also considered in this analysis. Figure 9 illustrates the estimated MTDs for each subject as a function of the target distance. A large time reduction is achieved if the distance is reduced. For instance, if the distance were four steps instead of ten, the averaged MTD would be 5.42 s. It is worth noting that the time reduction curves in Fig. 9 decrease exponentially with decreasing target distance. This finding supports the view that a large distance to the targets leads to prolonged imagery, which in turn leads to fatigue and thus poor performance. Table 5 summarizes the analytical results for a target distance of four steps.
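The recalculation described above can be sketched as a replay of the saved command sequences with a shorter target distance. The data layout (a list of `(target, commands)` pairs), the fixed step duration `step_s` and the function names are assumptions made for illustration.

```python
def replay(commands, distance):
    """Replay 'l'/'r'/None commands; return (selected goal or None, steps used)."""
    position = 0
    used = 0
    for used, cmd in enumerate(commands, start=1):
        position += {"l": -1, "r": 1}.get(cmd, 0)
        if abs(position) >= distance:
            return ("l" if position < 0 else "r"), used
    return None, used

def reanalyse(trials, distance, step_s):
    """trials: list of (target, commands) pairs from saved sessions.
    Returns (CA in percent, MTD in seconds) for the given target distance."""
    results = [(target, replay(cmds, distance)) for target, cmds in trials]
    hits = sum(1 for target, (selected, _) in results if selected == target)
    mean_steps = sum(steps for _, (_, steps) in results) / len(results)
    return 100.0 * hits / len(results), mean_steps * step_s
```

A trial that starts with two wrong steps and then recovers is counted as correct at ten steps but as wrong at two steps, which illustrates why the CA eventually drops as the distance shrinks while the MTD falls.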
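As a quick check of the size of this saving, the estimated 5.42 s at four steps can be compared with the mean Cross MTD of 18.31 s reported in the inter-feedback analysis:

```latex
\frac{18.31\,\mathrm{s} - 5.42\,\mathrm{s}}{18.31\,\mathrm{s}}
  = \frac{12.89}{18.31} \approx 0.70
```

That is, the trial duration is reduced by roughly 70 %.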

Improvements of basket
Since Basket feedback has limited MTD, reducing the target distance (number of steps to the side borders in this case) will have less effect. However, the large number of online measurements using Basket indicates that a distance of 20 steps is too long. Therefore, a distance reduction may result in a reduction in MTD.

Improvements of bars
Some subjects found the command arrow thin and unclear. Another problem occurred when the filling difference between the left and right bars was small. For instance, if both the left and right bars were filled with nine steps at the end of the trial, then the last step decided the class of the trial. Because the figure is cleared when the decision is made, the user is not able to see the decision. This problem can be solved by displaying the decision at the beginning of the 5-second pause.

Fig. 9 The trial duration as a function of target distance

EMG contribution in online BCI
Besides visually inspecting the subjects during the measurements, the EMG recorded during the offline measurements was used to ensure that the ERDs of the SMR were due to movement imagination rather than real muscle movements. The spectrogram of the EMG data did not show a significant power change that could indicate that the subject was making a real hand flexion. The EMG analysis was in accordance with the visual inspection, which showed that EMG activity was negligible for all subjects. During the online measurements, EMG was not measured; the subjects were only visually inspected. However, real and imaginary movements do not result in the same EEG spatial patterns. Therefore, during online BCI, a CSP filter calculated from offline data with minimal EMG activity will not result in optimal feature extraction if real movements are performed. Consequently, real movements are likely to lead to poor classification. This hypothesis was tested during a few online sessions: the subject was told to perform real hand movements instead of imaginary movements. Many of the resulting classifications were incorrect, and the feedback showed a more or less random path of the cross.

Conclusions
This paper has focused on the challenges of developing a real-time BCI system using the desynchronization phenomenon of the SMR. The first part was to conduct offline calibration measurements to determine the optimal subject-specific parameters to use in the online part. Offline data were processed using CSP to extract the relevant features. The BLC was trained using the labeled features from each data set. Twelve test subjects participated in the offline measurements and six of them qualified to participate in the online measurements. Three online feedback paradigms (cross, basket, and bars) were designed and used in this work. While all three paradigms resulted in similar CAs, the results of the cross paradigm indicated instability. This was reflected by prolonged trial times, a large standard deviation of the trial times, and a large deviation of the CAs. The overall online CA was 80 %. By studying possible improvements of the cross paradigm, it was found that reducing the target distance from ten steps to four steps resulted in a 70 % reduction of the MTD. This change would reduce the CA by only 2 %.