Cross-Subject Transfer Learning Improves the Practicality of Real-World Applications of Brain-Computer Interfaces

Kuan-Jung Chiang, Chun-Shu Wei, Member, IEEE, Masaki Nakanishi, Member, IEEE, and Tzyy-Ping Jung, Fellow, IEEE

Research supported in part by Army Research Laboratory (W911NF-10-2-0022) and a contract from Oculus VR, LLC. K.-J. Chiang, C.-S. Wei, M. Nakanishi, and T.-P. Jung* are with Swartz Center for Computational Neuroscience, Institute of Neural Computation, University of California, San Diego, La Jolla, CA 92093, USA (phone: 858-822-7555; fax: 858-822-7556; e-mail: jung@sccn.ucsd.edu).

Abstract: Steady-state visual evoked potential (SSVEP)-based brain-computer interfaces (BCIs) have shown their robustness in facilitating high-efficiency communication. State-of-the-art training-based SSVEP decoding methods such as extended canonical correlation analysis (CCA) and task-related component analysis (TRCA) are the major players that elevate the efficiency of SSVEP-based BCIs through a calibration process. However, due to notable human variability across individuals and within individuals over time, calibration (training) data collection is non-negligible and often laborious and time-consuming, deteriorating the practicality of SSVEP BCIs in a real-world context. This study aims to develop a cross-subject transfer approach that reduces the need for collecting training data from a test user, using a newly proposed least-squares transformation (LST) method. The results show that the LST reduces the number of training templates required for a 40-class SSVEP BCI. The LST method may lead to numerous real-world applications using near-zero-training/plug-and-play high-speed SSVEP BCIs.

I. INTRODUCTION

Brain-computer interfaces (BCIs) allow users to translate their intention into commands to control external devices, enabling an intuitive interface for disabled and/or non-disabled users [1]. Among various neuromonitoring modalities, the electroencephalogram (EEG) is one of the most popular for developing real-world BCI applications because of its non-invasiveness, low cost, and high temporal resolution [1].

In recent studies, the steady-state visual evoked potential (SSVEP), an intrinsic neural electrophysiological response to repetitive visual stimulation, has attracted increasing attention as a basis for implementing BCI systems because of its robustness [2]. With recent advances in system design and signal processing, the performance of SSVEP-based BCIs has dramatically improved in the past decade [3]. A number of studies have reported a variety of BCI applications, including text spellers [3], [4], a phone-dialing system [5], and a game controller [6].

To develop a real-world SSVEP-based BCI, an algorithm that decodes SSVEPs effectively plays an important role [7]. The target identification process can traditionally be divided into two parts: 1) spatial filtering and 2) model fitting [3], [7]. Spatial filtering techniques, which include minimum energy combination (MEC) [8], canonical correlation analysis (CCA) [9], and task-related component analysis (TRCA) [4], have been introduced to enhance the signal-to-noise ratio (SNR) of SSVEPs by reducing the interference from spontaneous EEG activities. After spatial filtering, target stimuli are identified by fitting models of SSVEPs. Computer-generated SSVEP models consisting of sinusoidal signals have been widely used to detect SSVEPs without requiring any calibration data [8], [9].
In recent studies, individualized templates obtained from a calibration procedure have been shown to characterize user-specific SSVEPs better than the computer-generated models, leading to drastically improved classification performance [4], [7].

In practice, however, because of large human variability both across individuals and within individuals over time, calibration (training) data must be collected from each individual before each session. In general, a large amount of calibration data is required for deriving both spatial filters and templates. The calibration procedure is therefore often laborious and time-consuming, hindering the practicality of BCIs in a real-world context. Many researchers have attempted to adopt transfer-learning techniques to shorten the calibration process without compromising classification accuracy [10], [11]. For instance, Yuan et al. [10], [11] proposed subject-to-subject transfer learning methods, which transfer SSVEP data from existing subjects to new ones. More recently, Nakanishi et al. made it possible to transfer SSVEP data across sessions even with different electrode montages [4]. Although these approaches achieved better performance than training-free algorithms, none of them has reached an accuracy comparable to that obtained with individualized calibration data.

This study proposes using a least-squares transformation (LST) to facilitate cross-subject transfer of SSVEP data, reducing the calibration data and time and enhancing classification accuracy for a new user. The LST method transforms the SSVEP data from existing subjects to fit the SSVEP templates of a new user based on a small number of new templates. That is, the proposed SSVEP BCI can leverage the transformed data from other subjects and a small amount of calibration data from the new user to derive the spatial filters for TRCA or other template-matching approaches.
This new approach was evaluated using a 40-class SSVEP dataset collected from eight subjects to assess its applicability in a high-speed SSVEP-based BCI speller.

II. METHODS

A. EEG Data

The present study used the EEG data collected in our previous study [3]. Forty visual stimuli were presented on a 23.6-inch liquid-crystal display (LCD) screen with a refresh rate of 60 Hz and a resolution of 1,920 × 1,080 pixels. The stimuli were arranged in a 5 × 8 matrix as a virtual keyboard and tagged with 40 different frequencies (8.0 Hz to 15.8 Hz with an interval of 0.2 Hz) and 4 different phases (0, 0.5π, π, and 1.5π). The horizontal and vertical intervals between two neighboring stimuli were 5 cm and 1.5 cm, respectively. The stimulation program was developed under MATLAB (MathWorks, Inc.) using the Psychophysics Toolbox extensions [12].

The dataset contained nine-channel (Pz, PO5, PO3, POz, PO4, PO6, O1, Oz, and O2) EEG signals collected from eight subjects in two sessions conducted on different days. Each session consisted of 15 blocks, in which the subjects were asked to gaze at one of the visual stimuli indicated by the stimulus program in a random order for 0.7 s. The subjects went through 40 trials corresponding to all the visual stimuli in each block. After each stimulus offset, the screen was blank for 0.5 s before the next trial began. The intervals between sessions differed across individuals.

B. TRCA-based SSVEP detection

TRCA is a data-driven method that extracts task-related components efficiently by finding linear coefficients that maximize their reproducibility during task periods [4]. Spatial filtering based on TRCA has been shown to significantly improve the performance of training-based SSVEP detection [4].
In addition, the TRCA-based method has been successfully combined with filter bank analysis, which decomposes EEG signals into sub-band components so that independent information embedded in the harmonic components can be efficiently extracted [13].

In the TRCA-based method with filter bank analysis, the individual calibration data for the $n$-th stimulus are denoted as $x_n \in \mathbb{R}^{N_C \times N_S \times N_T}$, $n = 1, 2, \ldots, N_F$. Here $N_C$ is the number of channels, $N_S$ is the number of sampling points in each trial, $N_T$ is the number of trials, and $N_F$ is the number of visual stimuli (i.e., 40 in this study). In the training phase, the calibration data are divided into $N_K$ sub-bands by a filter bank and become $x_n^k \in \mathbb{R}^{N_C \times N_S \times N_T}$, $k = 1, 2, \ldots, N_K$. $N_K$ was set to five in this study. For each sub-band, a spatial filter $w_n^k \in \mathbb{R}^{N_C}$ is obtained by maximizing $w^T S w$ under a constraint on the variance of the reconstructed signal. Here, $S$ is the sum of inter-trial covariance matrices, i.e., $S = \sum_i \sum_j C_{i,j}$, where $C_{i,j}$ is the covariance matrix between the $i$-th and $j$-th ($i \neq j$) trials of multi-channel EEG.

After obtaining the spatial filters, individual templates are prepared. The calibration data for the $n$-th stimulus, $x_n^k$, are first averaged across all the training trials as $\bar{x}_n^k \in \mathbb{R}^{N_C \times N_S}$ in each sub-band. The individual templates $y_n^k \in \mathbb{R}^{N_S}$ are obtained by applying the spatial filter to $\bar{x}_n^k$ as $y_n^k = (w_n^k)^T \bar{x}_n^k$.

In the testing phase, single-trial testing data $\hat{x} \in \mathbb{R}^{N_C \times N_S}$ also go through the filter bank analysis to be decomposed into $N_K$ sub-bands. The spatial filters $w_n^k$ obtained in the training phase are then applied to the testing signals $\hat{x}^k \in \mathbb{R}^{N_C \times N_S}$ in each sub-band. Feature values $\rho_n^k$ are calculated as correlation coefficients between the testing signals and the individual templates as $\rho_n^k = r((w_n^k)^T \hat{x}^k, y_n^k)$, where $r(a, b)$ indicates Pearson's correlation coefficient between two variables $a$ and $b$. A weighted sum of the ensemble correlation coefficients corresponding to all the sub-bands was calculated as the final feature for target identification as $\rho_n = \sum_{k=1}^{N_K} \alpha(k) \cdot \rho_n^k$. Here, the coefficient $\alpha(k)$ was defined as $\alpha(k) = k^{-1.25} + 0.25$ according to [13]. Finally, the target stimulus $\tau$ can be identified as $\tau = \arg\max_n \rho_n$.

C. Least-Squares Transformation (LST)

Human EEG is known to present pervasive and elusive variability across individuals and even within a single subject over time [14], posing a major obstacle to EEG data exchange across subjects. This study assumes that there exists a transformation of SSVEP signals from one subject to another. That is, if the single-trial SSVEP signals of a new user are denoted as $x$ and those of an existing user are denoted as $\acute{x}$ ($x, \acute{x} \in \mathbb{R}^{N_C \times N_S}$), we aim to find a transformation matrix $P \in \mathbb{R}^{N_C \times N_C}$ such that $x = P\acute{x}$. We can acquire $P$ by applying a channel-wise least-squares regression given $x$ and $\acute{x}$: first, perform a least-squares regression with $\acute{x}$ as the input and the first channel of $x$ as the target; second, repeat with the second channel of $x$ as the target; and so on (see Fig. 1). However, to reduce the interference of noise, instead of using the single-trial signals $x$, we use the averaged signal $\bar{x}$, obtained by averaging multiple trials from the new user. These calibration trials are called templates. Each trial of the existing users is transformed into a signal $\acute{x}_{\mathrm{trans}}$ that is similar to $\bar{x}$, i.e., $\bar{x} \approx \acute{x}_{\mathrm{trans},i} = P\acute{x}_i$ ($i$ is the trial number). Finally, all trials of $x$ and $\acute{x}_{\mathrm{trans}}$ are pooled together as new training data for TRCA.

Fig. 1. The procedure of transferring SSVEPs based on the least-squares error transformation.
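The LST step and the TRCA training on pooled data can be sketched in a few lines of NumPy. This is a minimal illustration for a single stimulus and a single sub-band, not the authors' implementation. The function names `lst_transform` and `trca_filter`, the synthetic 10 Hz data, the 250 Hz sampling rate, the trial counts, and the choice to fit one transformation matrix per existing-user trial are all assumptions made for the example; the filter bank and the correlation-based classification are omitted.

```python
import numpy as np


def lst_transform(template, source_trial):
    """Map one existing-user trial onto the new user's template by least squares.

    Both inputs have shape (n_channels, n_samples). Returns P @ source_trial,
    where P minimizes ||template - P @ source_trial|| (hypothetical helper).
    """
    # lstsq solves A @ X = B with A = source_trial.T and B = template.T.
    # Column c of X holds the regression weights for target channel c,
    # which is the channel-wise regression described in the text, so P = X.T.
    p_transposed, *_ = np.linalg.lstsq(source_trial.T, template.T, rcond=None)
    return p_transposed.T @ source_trial


def trca_filter(trials):
    """TRCA spatial filter; trials has shape (n_trials, n_channels, n_samples)."""
    # Remove each channel's mean within every trial.
    centered = trials - trials.mean(axis=2, keepdims=True)
    # Q: sum of within-trial (unnormalized) covariances, the variance constraint.
    q = sum(x @ x.T for x in centered)
    # S: sum of covariances between all pairs of different trials, using
    # sum_{i != j} x_i x_j^T = (sum_i x_i)(sum_j x_j)^T - sum_i x_i x_i^T.
    total = centered.sum(axis=0)
    s = total @ total.T - q
    # The maximizer of w^T S w subject to w^T Q w = 1 is the leading
    # eigenvector of Q^{-1} S.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(q, s))
    return np.real(eigvecs[:, np.argmax(np.real(eigvals))])


rng = np.random.default_rng(0)
n_channels, n_samples = 9, 200  # 9 channels as in the dataset; sample count is arbitrary
t = np.arange(n_samples) / 250.0  # assumed sampling rate of 250 Hz
latent = np.vstack([np.sin(2 * np.pi * 10 * t), np.cos(2 * np.pi * 10 * t)])


def simulate(mixing, n_trials):
    """Synthetic trials: a subject-specific mixture of a 10 Hz response plus noise."""
    noise = rng.standard_normal((n_trials, n_channels, n_samples))
    return mixing @ latent + noise


new_user = simulate(rng.standard_normal((n_channels, 2)), n_trials=2)
existing = simulate(rng.standard_normal((n_channels, 2)), n_trials=30)

template = new_user.mean(axis=0)  # averaged calibration trials of the new user
transformed = np.stack([lst_transform(template, trial) for trial in existing])

# Pool the new user's trials with the transformed trials and train TRCA on them.
pooled = np.concatenate([new_user, transformed], axis=0)
w = trca_filter(pooled)
y = w @ pooled.mean(axis=0)  # spatially filtered template, shape (n_samples,)

error_before = np.linalg.norm(existing - template)
error_after = np.linalg.norm(transformed - template)
print(f"distance to template: {error_before:.1f} before LST, {error_after:.1f} after")
```

Because the identity matrix is one admissible choice of $P$, the least-squares fit cannot move a trial further from the template than it started, so the printed distance after LST is never larger than the distance before it.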
To validate the efficacy of the LST in transferring SSVEP data, we compared the SSVEP decoding performance using three schemes (shown in Fig. 2):

1) Baseline: a self-decoding approach where all training data are collected from a new user (i.e., the conventional individual-template-based method).
2) Subject-transfer without LST (w/o LST): the training data consist of a small number of templates from the new user and a large number of trials from other subjects without any transformation.
3) Subject-transfer with LST (w/ LST): the training data consist of a small number of templates from the new user and a large number of trials from other subjects transformed using LST.

Fig. 2. The flowchart of preparation of training data for the three schemes.

A series of experiments were conducted to validate the performance of the proposed LST approach for the cross-subject transfer of SSVEP data. The simulation experiments focused on decoding performance in the context of real-world usage. Leave-one-subject-out (LOSO) cross-validation was employed, where a test subject plays the role of a new user and the other subjects are existing users. When one session of the new user was tested, the 15 trials for each stimulus were randomly divided into 5 trials as a template set and 10 trials as a test set. We then tested the decoding performance in that session using different numbers (1-5 trials) of templates from the template set and performed classification on the 10-trial test set with the three schemes. In the w/o LST and w/ LST schemes, 1-5 template trials from the new user were concatenated with all trials from existing users, without or with LST, to form the training set for TRCA. Lastly, both sessions of the test subject were tested independently, and the random separation into template and test sets was repeated 10 times. The decoding performance of each test subject was estimated as the average of 20 accuracies (2 sessions times 10 repeats).

III. RESULTS

Fig. 3 compares the performance across all 8 subjects using the three schemes: baseline, w/o LST, and w/ LST. The results showed that w/ LST outperformed both w/o LST and baseline for all subjects under most template sizes. In particular, when the template size was relatively small (two or fewer), the LST scheme was capable of retaining the accuracy. When the template size was greater than 2, the LST scheme achieved higher accuracy than the other approaches in subjects 1, 3, 6, and 8. The LST scheme did not achieve the best accuracy only when the accuracy approached 100% (no room for improvement).

The overall SSVEP decoding performance is presented in Table I. A two-way repeated-measures analysis of variance (ANOVA) showed significant main effects of scheme ($F(2, 119) = 12.58$, $p < 0.001$) and template size ($F(4, 119) = 10.32$, $p < 0.001$). The LST scheme achieved the highest overall performance regardless of the template size. When the template size was large, w/ LST might not be superior to the baseline. On the other hand, when the template size was as small as 1, both w/ LST and w/o LST outperformed the baseline, but no significant difference was found between these two data-transfer schemes. When the template size was no less than 2, w/ LST outperformed w/o LST. In a nutshell, the LST demonstrates its efficacy in transforming data across subjects and thus is useful for tackling insufficiency of individualized data.

IV. DISCUSSIONS

The study results suggest the efficacy of the proposed LST method, which significantly enhances SSVEP decoding performance, particularly when training templates are limited.
While the current state-of-the-art SSVEP decoding method, the template-based method with TRCA-based spatial filtering [4] (baseline), struggles with time-consuming calibration sessions, the LST is capable of leveraging existing data from other subjects and alleviating the poor decoding performance caused by the lack of individual training data. As shown in Fig. 3, the LST scheme achieves high accuracy using a limited number of templates (down to 1 template per stimulus), and the accuracy increases with the template size.

TABLE I
OVERALL SSVEP DECODING ACCURACY ACROSS SUBJECTS AGAINST TEMPLATE SIZE USING LST AND OTHER SCHEMES.

Template Size   1        2         3         4        5
Baseline        14.4%    55.1%     77.1%     83.5%    86.4%
w/o LST         60.3%*   61.5%     62.5%     63.5%*   64.4%*
w/ LST          70.5%*   79.2%*†   84.2%*†   87.4%†   89.1%†

* p < 0.05 (vs. 'Baseline', signed rank test)
† p < 0.05 (vs. 'w/o LST', signed rank test)

Fig. 3. The averaged SSVEP decoding accuracy against template size (1-5 trials per stimulus) across 2 sessions and 10 shuffled permutations for each subject. The error bars present the standard errors across permutations.

This study validated the efficacy of the LST in transforming SSVEP templates across subjects against the pervasive human variability in the EEG data [14]. For most of the subjects, naive data transfer (w/o LST) led to a lower accuracy than that of the LST, and its performance did not improve with the acquisition of additional templates from the user. The comparison implies that the LST is able to transform SSVEP data to model the brain response across different subjects, obviating the impact of human variability.

Finally, comparable performance was found using the conventional TRCA approach (baseline) and the LST scheme when the template size grows to 5, suggesting that leveraging a large amount of data from others has no observable benefit when newly collected individual templates are sufficient.
This is in line with the rationale of training-based SSVEP methods, which emphasize the importance of individualized calibration for SSVEP decoding. Nonetheless, the proposed LST method provides a satisfactory alternative source of training data and considerably reduces the calibration time for prospective plug-and-play high-speed BCI spellers based on SSVEPs.

V. CONCLUSIONS

This study proposes a cross-subject transfer method, LST, for transforming SSVEP data from one subject to another. The experimental results suggest the efficacy of the LST method in alleviating the inter-subject variability in SSVEP data and significantly improving the transfer efficiency. The improvement in SSVEP decoding accuracy using a limited template size from a new user was very promising, suggesting a practical approach toward an online high-speed SSVEP-based BCI system with minimal calibration effort and maximal convenience and user-friendliness. Further work will validate the LST method on different datasets, including data recorded with different headsets.

REFERENCES

[1] J. R. Wolpaw, N. Birbaumer, D. J. McFarland, G. Pfurtscheller, and T. N. Vaughan, "Brain-computer interfaces for communication and control," Clin. Neurophysiol., vol. 113, no. 6, pp. 767-91, 2002.
[2] Y. Wang, X. Gao, B. Hong, C. Jia, and S. Gao, "Brain-computer interfaces based on visual evoked potentials," IEEE Eng. Med. Biol. Mag., vol. 27, no. 5, pp. 64-71, 2008.
[3] X. Chen, Y. Wang, M. Nakanishi, X. Gao, T.-P. Jung, and S. Gao, "High-speed spelling with a noninvasive brain-computer interface," Proc. Natl. Acad. Sci. U. S. A., vol. 112, no. 44, pp. E6058-67, 2015.
[4] M. Nakanishi, Y. Wang, X. Chen, Y.-T. Wang, X. Gao, and T.-P. Jung, "Enhancing detection of SSVEPs for a high-speed brain speller using task-related component analysis," IEEE Trans. Biomed. Eng., vol. 65, no. 1, pp. 104-12, 2018.
[5] Y. T. Wang, Y. Wang, and T. P. Jung, "A cell-phone-based brain-computer interface for communication in daily life," J. Neural Eng., vol. 8, p. 025018, 2011.
[6] P. Martinez, H. Bakardjian, and A. Cichocki, "Fully online multicommand brain-computer interface with visual neurofeedback using SSVEP paradigm," Comput. Intell. Neurosci., vol. 2007, p. 94561, 2007.
[7] M. Nakanishi, Y. Wang, Y.-T. Wang, Y. Mitsukura, and T.-P. Jung, "A high-speed brain speller using steady-state visual evoked potentials," Int. J. Neural Syst., vol. 24, no. 6, p. 1450019, 2014.
[8] O. Friman, I. Volosyak, and A. Graser, "Multiple channel detection of steady-state visual evoked potentials for brain-computer interfaces," IEEE Trans. Biomed. Eng., vol. 54, no. 4, pp. 742-90, 2006.
[9] Z. Lin, C. Zhang, W. Wu, and X. Gao, "Frequency recognition based on canonical correlation analysis for SSVEP-based BCIs," IEEE Trans. Biomed. Eng., vol. 54, no. 6, pp. 1172-76, 2007.
[10] P. Yuan, X. Chen, Y. Wang, X. Gao, and S. Gao, "Enhancing performances of SSVEP-based brain-computer interfaces via exploiting inter-subject information," J. Neural Eng., vol. 12, p. 046006, 2015.
[11] N. R. Waytowich, V. J. Lawhern, A. W. Bohannon, K. R. Ball, and B. J. Lance, "Spectral transfer learning using information geometry for a user-independent brain-computer interface," Front. Neurosci., vol. 10, p. 430, 2016.
[12] D. H. Brainard, "The Psychophysics Toolbox," Spat. Vis., vol. 10, no. 4, pp. 433-36, 1997.
[13] X. Chen, Y. Wang, S. Gao, T.-P. Jung, and X. Gao, "Filter bank canonical correlation analysis for implementing a high-speed SSVEP-based brain-computer interface," J. Neural Eng., vol. 12, no. 4, p. 046008, 2015.
[14] C.-S. Wei, Y.-P. Lin, Y.-T. Wang, C.-T. Lin, and T.-P. Jung, "A subject-transfer framework for obviating inter- and intra-subject variability in EEG-based drowsiness detection," NeuroImage, vol. 174, pp. 407-19, 2018.
[15] C.-S. Wei, M. Nakanishi, K.-J. Chiang, and T.-P. Jung, "Exploring human variability in steady-state visual evoked potentials," in Proc. IEEE Int. Conf. Syst. Man, Cybern., 2018.
