Selective neuromodulation induced by alpha-based neurofeedback learning: A double-blind randomized study

Affiliations
a Sorbonne Université, Institut du Cerveau Paris Brain Institute (ICM), INSERM U 1127, CNRS UMR 7225, Equipe Aramis, 75013, Paris, France
b myBrain Technologies, 75010, Paris, France
c INRIA, Aramis project-team, 75013, Paris, France
e Institut du Cerveau Paris Brain Institute ICM, Centre MEG-EEG, Paris, France
f CNRS, UMR 7225, F-75013, Paris, France
g Inserm, U 1127, Paris, France
h Sorbonne Université, Paris, France
i Institut du Cerveau Paris Brain Institute ICM, Equipe CIA Cognitive Control, Interoception, Attention, 75013, Paris, France
j AP-HP, Hôpital Pitié-Salpêtrière, Service de Psychiatrie Adulte, 75013, Paris, France
k Institut du Cerveau Paris Brain Institute ICM, Equipe Experimental Neurosurgery, 75013, Paris, France

Neurofeedback (NF) can be considered a particular case of reinforcement learning, in which the subject is rewarded through the feedback for modulating the targeted brain activity. This modulation is usually learnt through a cognitive strategy enforced by task instruction and related to the cognitive or motor ability to be improved or restored. The ensuing cognitive or motor improvements are typically used as the primary outcome measures to assess the efficiency of NF training (NFT), in particular in clinical applications [14], [18], [19]. Control conditions are quite variable, not only targeting the closed loop between brain activity and feedback but also varying the task or the procedure [20]. For example, in clinical NF studies, where NFT aimed at reducing behavioral symptoms associated with various disorders (anxiety [14]-[16], depression [10], [11], [21], addiction [22]-[25], attention deficit [4]-[8], [26]), NF performance was typically compared with other procedures, such as cognitive therapy, mental exercise, and treatment-as-usual [20]. Thus, the behavioral or clinical benefits of NFT may be related to an ensemble of specific and non-specific mechanisms, including psychosocial influences [27], cognitive and attentional/motivational factors [28], test-retest improvement, as well as spontaneous clinical improvement or cognitive development [17], contributing to the ongoing debate about NF efficiency [15], [19].
A key question concerns the neuromodulation associated with NFT, that is, induced by the closed loop between the subject's brain activity and feedback. Addressing this question requires a control condition equating every other aspect of the NF protocol, such as instruction, task, and reward. In some NF studies, the control condition was based on linking the feedback to a brain activity other than the targeted one [20]. Such a control condition may pinpoint the cognitive and motor changes specifically associated with the targeted brain activity [20]. However, it entails an incongruity between the activity driving the feedback and the task (hence the cognitive efforts) of the subject, with potential confounding impacts not only on performance, but also on reward and motivation. A sham feedback (sham-FB) condition, where the feedback in the control group is matched to the experimental group feedback, is necessary to allow investigation of the neuromodulation specifically associated with NF. To date, few studies have used such a control condition, with mixed results. For instance, Witte et al. did not find evidence for neuromodulation of the sensorimotor rhythm over multiple NFT sessions [29]. Naas et al. [30] used a daily NF protocol targeting the upper alpha band over four days and did not find any significantly different modulation of the targeted activity between groups over time. In contrast, Beatty [31] showed an increase of the trained alpha activity in the NF group but not in the sham-FB condition during a single NF session.
Here, we aimed to study the neuromodulation specifically induced by alpha-based NFT over multiple sessions across 12 weeks, by contrasting it with a carefully controlled sham-FB condition in a randomized double-blind study in the general population. We used a new compact and wearable EEG-based NF device (Melomind™, myBrain Technologies, Paris, France) [32]. Participants, immersed in a relaxing soundscape delivered by headphones, were engaged in learning to decrease the volume of a sound indicator based on the individual alpha-band EEG activity recorded by two parietal dry electrodes. Alpha-band activity is used in most NF protocols aiming at anxiety reduction, stress management or well-being [14], [15], [33], [34]. The participants in the sham-FB group received 'yoked' feedback, corresponding to the feedback of randomly chosen subjects from the NF group at the same stage of learning. Hence, this feedback was similar in every aspect to the one in the NF group, except that it was not the result of an established closed loop between the subject's alpha-band activity and the auditory stream. In particular, it controlled for reward and performance across both groups [20]. We expected a selective increase of the targeted neural activity across the program only for the NF group. We also examined behavioral outcomes in terms of anxiety level, relaxation, and feeling of control.

Participants
Forty-eight healthy volunteers participated in this study (mean age: 33.3 years; age range: 18-60; see Supplementary Table S1 for more details). All declared having normal or corrected-to-normal vision, no hearing problems, no history of neurological or psychiatric disorders, and no ongoing psychotropic drug treatment. Participants were randomly and blindly assigned either to the NF group, who received real NF, or to the control group, who received sham-FB, except for the first few participants, who were assigned to the NF group for the purpose of the sham-FB. The experimenters were also blind to the assignment.
All participants gave written informed consent to participate in the study. All procedures were

EEG recording and preprocessing
Brain activity was recorded by two gold-coated dry electrodes placed on parietal regions (P3 and P4) with the CE-marked Melomind™ device (myBrain Technologies, Paris, France; https://www.melomind.com/; Fig. 1). Ground and reference were silver fabric electrodes, placed on the left and right headphones respectively, in mastoid regions. EEG signals were amplified and digitized at a sampling rate of 250 Hz, band-pass filtered from 2 Hz to 30 Hz in real time and sent to the mobile device by embedded electronics in the headset. The headset communicated via Bluetooth with a mobile NF application, which processed EEG data every second to give the user auditory feedback about his/her alpha-band activity (see below). Real-time estimation of signal quality was performed by a dedicated machine learning algorithm [35] to detect noisy segments before computing the feedback.

Experimental protocol
The protocol consisted of 12 NFT sessions, with one session per week (Fig. 2). Each session was composed of 7 exercises of 3 minutes (total: 21 minutes). At the beginning and end of each session, a two-minute resting-state recording was performed and the participant completed the Spielberger State-Trait Anxiety Inventory (STAI, Y-A form, in French, [36])-to assess his/her anxiety state level-

Neurofeedback training procedure
The NF paradigm targeted alpha rhythm centered on the individual alpha frequency (IAF). Before each NFT session, a 30-second calibration phase allowed computing IAF using an optimized, robust estimator dedicated to real-time IAF estimation based on spectral correction and prior IAF estimations [38]. Participants were instructed to close their eyes, be relaxed and try to reduce the auditory feedback volume by their own strategies throughout the exercises (see Supplementary Material for the reported mental strategies). A relaxing landscape background (e.g. forest sounds) was played at a constant, comfortable volume during each exercise. The audio feedback was an electronic chord sound added to this background with a volume intensity derived from EEG signals.
More precisely, the individual alpha amplitude was computed in consecutive 1-second epochs as the root mean square (RMS) of EEG activity filtered in the IAF ± 1 Hz band (NF index); it was normalized to the calibration baseline activity to obtain a 0-1 scale, which was used to modulate the intensity of the feedback sound in the NF group (0: silent feedback; 1: maximal volume). For the participants in the control group, the instruction was identical but they received sham-FB, which consisted of the feedback variations replayed from randomly chosen subjects in the NF group at the same training level (i.e. session).
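The computation described above can be sketched as follows. This is only an illustration under stated assumptions: the input is taken to be already band-pass filtered around the IAF, the exact normalization to the calibration baseline is not given in the text (min-max scaling with clipping is assumed here), and the inversion of the scale (higher alpha, quieter feedback) is inferred from the instruction to reduce the volume. All function names and the layout of the stored traces are hypothetical.

```python
import math
import random

FS = 250  # sampling rate in Hz, as stated in the recording section


def nf_index(filtered, fs=FS):
    """RMS of each consecutive 1-second epoch.

    `filtered` is assumed to be already band-passed to IAF +/- 1 Hz.
    """
    n_epochs = len(filtered) // fs
    index = []
    for k in range(n_epochs):
        epoch = filtered[k * fs:(k + 1) * fs]
        index.append(math.sqrt(sum(x * x for x in epoch) / len(epoch)))
    return index


def to_volume(value, base_min, base_max):
    """Map one NF index value to a feedback volume in [0, 1].

    ASSUMPTION: min-max scaling against calibration bounds, clipped,
    then inverted so that higher alpha gives a quieter feedback sound.
    """
    scaled = (value - base_min) / (base_max - base_min)
    scaled = min(1.0, max(0.0, scaled))
    return 1.0 - scaled


def pick_yoked_trace(nf_traces, session, rng):
    """Sham-FB: replay the volume trace of a randomly chosen NF-group
    subject recorded at the same session.

    `nf_traces` maps (subject_id, session) to a list of volumes
    (hypothetical storage layout).
    """
    candidates = [trace for (sid, s), trace in sorted(nf_traces.items())
                  if s == session]
    return rng.choice(candidates)
```

In this sketch the sham participant hears a trace that was genuinely produced by closed-loop training, so its statistics match real feedback, but it is unrelated to the listener's own alpha activity.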

Data analysis

NF index and learning score
For each participant and each training session, we computed the average value of the NF index (before normalization) for every exercise. Moreover, we built an NF learning score (ΔD(t)) from the NF index variations across exercises and sessions [39]: we computed the median value (med) of the NF index across the 7 exercises of the first session; then, for each session t, we computed D(t), the number of NF index values (one per second) above or equal to med. This corresponded to a cumulated duration. It was divided by the total duration of the training session (21 minutes) in order to express D(t) per minute, and transformed into percent change relative to the first session, as follows (Eq. (1)):
ΔD(t) = 100 × (D(t) - D(1)) / D(1) (1)
where D(1) denotes the value of D for the first session.
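A minimal sketch of the learning score, written from the verbal definition above, may help make the steps concrete. It assumes that the median is taken over the per-second NF index values pooled across the exercises of the first session (the text does not say whether per-second or per-exercise values are used), and the function names are hypothetical.

```python
from statistics import median


def cumulated_duration(nf_values, med, session_minutes=21.0):
    """D(t): number of per-second NF index values >= med,
    expressed per minute of training."""
    above = sum(1 for v in nf_values if v >= med)
    return above / session_minutes


def learning_scores(sessions):
    """Percent change of D(t) relative to the first session.

    `sessions` is a list with one list of per-second NF index values
    per session; the first entry is the reference session.
    ASSUMPTION: the median is taken over pooled per-second values.
    """
    med = median(sessions[0])
    d = [cumulated_duration(values, med) for values in sessions]
    return [100.0 * (dt - d[0]) / d[0] for dt in d]
```

By construction the score is 0 for the first session; positive values indicate more time spent at or above the first-session median.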

Theta and low beta activities
To study the selectivity of the neuromodulation to the targeted alpha activity, we analyzed the between-session evolutions of theta (4-7 Hz) and low beta (13-18 Hz) activities. For each subject, on each exercise and session, theta activity was computed every second as the RMS of EEG activity filtered between 4 and 7 Hz in 4-second sliding windows, on epochs with high or medium quality (see [35] for details about signal quality computation). We then averaged these RMS values for each session. A similar computation was performed between 13 and 18 Hz for low beta activity.
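The sliding-window computation can be illustrated with the short sketch below. It assumes the signal has already been band-pass filtered to the band of interest, and it assumes that a 4-second window is kept only when none of its 1-second epochs is labeled LOWq; the text does not specify how quality labels are applied to windows, so this rule and the function names are hypothetical.

```python
import math


def sliding_band_rms(filtered, quality, fs=250, win_s=4, step_s=1):
    """RMS in 4-second windows updated every second.

    `filtered`: samples already band-passed (e.g. 4-7 Hz for theta).
    `quality`: one label per 1-second epoch ("HIGHq", "MEDq", "LOWq").
    ASSUMPTION: a window is rejected if any of its epochs is "LOWq".
    """
    n_seconds = min(len(filtered) // fs, len(quality))
    values = []
    for end in range(win_s, n_seconds + 1, step_s):
        labels = quality[end - win_s:end]
        if "LOWq" in labels:
            continue
        window = filtered[(end - win_s) * fs:end * fs]
        values.append(math.sqrt(sum(x * x for x in window) / len(window)))
    return values


def session_mean(values):
    """Average of the retained RMS values for one session."""
    return sum(values) / len(values)
```

The same function can be reused for low beta by feeding it a signal filtered between 13 and 18 Hz.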

Signal quality
As encouraged in [17], the quality of EEG signals was analyzed to test the prevalence of bad EEG data between groups and across sessions. For each participant, session, and exercise, the quality of each 1-second EEG epoch on each electrode was determined by a classification-based approach according to three labels: HIGHq, MEDq, and LOWq (see [35] for more details). A quality index Q was then computed for each electrode, during each exercise, as in Eq. (2): with #HIGHq, #MEDq, and #LOWq indicating the numbers of high-, medium-, and low-quality epochs, and N the total number of quality labels during the session. Finally, the average value of Q was computed across both electrodes for each exercise.

Behavioral outcomes
The raw scores of the STAI-Y-A (between 20 and 80) and relax-VAS (between 0 and 10) were computed pre- and post-session. The subjective level of feedback control was measured within- and between-session on the control-VAS (between 0 and 10).
We used Linear Mixed Models (LMMs) [41], [42], because LMMs allow handling missing data, variability in effect sizes across participants, and unbalanced designs [43]. Available data in this study are detailed in Supplementary Table S2.
For all LMM analyses, the NF group at the first session was set as the level of reference in order to specifically estimate the effects of NFT in this group. A random effect structure with random intercept by participant (subject_id) was fit.
As in [44], we used fixed effects of session, exercise, group, and the two-way interactions between session and group and between exercise and group to analyze the within- and between-session NFT effects on the NF index (Eq. (3), as coded in R):
Y ~ 1 + session + exercise + group + session:group + exercise:group + (1|subject_id) (3)
Preliminary analyses indicated that the effect of exercises followed a U-curve. Therefore, the exercises were coded as a quadratic term, that is, exercises 1 to 7 were coded as 9, 4, 1, 0, 1, 4, and 9. The sessions were coded as a numeric variable between 0 and 11. Eq. (3) was also used for the analysis of the control-VAS scores.
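The predictor coding described above can be illustrated with a small sketch. The quadratic coding is simply the squared distance from the middle exercise, and the NF group at the first session is the reference level (all predictors equal to zero). The function names and the "control" group label are hypothetical, and the model fitting itself (done in R in the study) is not reproduced here.

```python
def code_exercise(exercise):
    """Quadratic coding centred on exercise 4:
    exercises 1..7 map to 9, 4, 1, 0, 1, 4, 9."""
    return (exercise - 4) ** 2


def code_row(session, exercise, group):
    """Build the fixed-effect predictors of Eq. (3) for one observation.

    `session` runs from 1 to 12 and is recoded to 0..11;
    `group` is "NF" (reference, coded 0) or "control" (coded 1).
    """
    s = session - 1
    e = code_exercise(exercise)
    g = 0 if group == "NF" else 1
    return {
        "session": s,
        "exercise": e,
        "group": g,
        "session:group": s * g,
        "exercise:group": e * g,
    }
```

With this coding, the intercept estimates the NF index of the NF group at the first session and the middle exercise, and the session:group term captures the difference in slope across sessions between the two groups.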
For the analysis of the NF learning score, theta activity, low beta activity, and the signal quality index, we used the following LMM equation (Eq. (4)):
Y ~ 1 + session + group + session:group + (1|subject_id) (4)
For the behavioral outcomes (relax-VAS and STAI-Y-A), we used LMMs with session, phase (pre- or post-session), group, and the two-way interactions between session and group and between phase and group as fixed effects (Eq. (5)):
Y ~ 1 + session + phase + group + session:group + phase:group + (1|subject_id) (5)
For each model, parameter estimates for the effects of interest were obtained by fitting the models on the corresponding dependent variable, using the Restricted Maximum Likelihood (REML) approach. P-values were estimated via t-tests using the Satterthwaite approximation to degrees of freedom with the lmerTest package [45]. We considered p < 0.05 as statistically significant. As a complement, we computed an analysis of variance (ANOVA) on each LMM to check for main effects when no interaction was found; these are reported in the supplementary material. (Fig. 4).

Selectivity of the neuromodulation on alpha activity
To investigate the selectivity of the neuromodulation relative to the targeted alpha activity, we analyzed EEG activity in two adjacent bands (theta and low beta). For theta activity, there was an overall increase across sessions, which was not specific to the NF group. For low beta activity, no significant effect was observed (see Supplementary Tables S7, S9 and Supplementary Fig. S6 for details).

Signal quality
The estimate of the slope of the quality index evolution across sessions was significantly negative (Table S15).

Discussion
We conducted a double-blind randomized study of the neuromodulation induced by alpha-based NFT over 12 weekly sessions, using a strictly controlled sham-FB condition as control. NFT was performed with a wearable, dry-sensor headset, which delivered intensity-modulated auditory feedback based on EEG signal amplitude in the individual alpha frequency band. To avoid non-contingency between produced efforts and the resulting feedback evolution for the control group [20], the control condition consisted of delivering sham-FB, that is, feedback replayed from randomly chosen users of the NF group at the same training stage. Hence, all participants benefited from the proposed NFT experience, but only those of the NF group experienced a closed loop between the feedback and their own alpha activity.
NF learning refers to the capacity to self-regulate a targeted activity in the desired direction across training [19]. Here, the NF group was expected to increase individual alpha activity across training sessions, reflecting lasting effects [46]. This was confirmed by the analyses of the NF index and the NF learning score, which both demonstrated a specific session effect in the NF group relative to the control group. This finding demonstrates a specific neuromodulation induced by the closed loop between individual alpha activity and FB, which could suggest a Hebbian neuroplasticity associated with NFT [19]. The use of a randomized double-blind protocol together with the strict sham-FB control condition allowed us to control for the many potential confounding factors that may contribute to NFT effects. In particular, it allowed controlling for context, task, reward, and performance, avoiding potential motivational biases in NF versus control conditions [20]. Moreover, we examined whether the neuromodulation was selective for the targeted activity, which is rarely studied [47], [48]. As proposed in [39], we analyzed two adjacent frequency bands (theta and low beta). We found an overall increase across sessions for the theta band, not specific to the NF group, and no significant change for the low beta band. This provided evidence for the selectivity of the neuromodulation associated with NFT. To the best of our knowledge, this is the first evidence of selective longitudinal alpha-band neuromodulation (over 12 weeks) in a double-blind randomized study involving healthy participants trained with a wearable dry-sensor NF device.
Additionally, we uncovered some interesting aspects of the dynamics of alpha-band activity during NF learning, with a U-shape pattern across exercises. Alpha activity is a spontaneous but complex rhythm associated with several cognitive states and processes. Its modulation has been predominantly related to vigilance, attention [49], [50], and an awake but relaxed state [33], [34], [51]-[56].
Here, the NFT procedure was designed to train participants to increase alpha activity with the ultimate aim of improving their relaxation state. Yet, the alpha activity change across exercises during the sessions seemed to reflect the different cognitive processes involved in the task: continuous monitoring of the feedback may have required heightened focused attention [57], [58], error detection [59], [60], and working memory processes [61] during the first training minutes, allowing participants to progressively adapt their cognitive strategy and mental state to the task. This may have resulted in the initial decrease of the alpha activity along the training exercises [58]-[60]. Then, the subjects managed to adjust their cognitive strategy (mainly attentional focus on landscape sounds and attention defocusing) to upregulate their alpha activity. It is important to note that the within-session U-shape pattern of alpha activity was observed in both the NF and control groups.
This supports the idea that the sham-FB condition allowed us to rigorously control for the task performed by the subjects. Altogether, the U-shape exercise effect on alpha activity seemed to reflect the combination of multiple cognitive processes in relation with NF learning, while the specific neuromodulation of alpha activity induced by NFT was revealed in the longitudinal effect across the twelve sessions.
We also examined EEG data quality, because it can have a confounding effect on NF learning [17]. EEG quality decreased slightly across sessions in both groups. Thus, it did not seem to contribute to the neuromodulation observed in the NF group.
We investigated the behavioral changes in terms of relaxation and anxiety levels pre- and post-session and across the training program. There was a reduction of anxiety level and an increase of relaxation across sessions, without any group difference. This is likely explained by our study protocol. Thanks to the design of the sham-FB condition, participants in the control group experienced the same immersive, relaxing experience as the NF group. Thus, our behavioral results may be explained by non-specific mechanisms of the NFT, such as the relaxing training context, placebo effect, and repetition-related effect [17]. One may note that anxiety decreased more from pre- to post-session in the control group than in the NF group. The sham-FB condition may have allowed the subjects from the control group to be rewarded with somewhat less cognitive effort, which may have reduced their state anxiety level, for example in relation to their performance during the session [62]. Furthermore, the subjects of this study had low to moderate anxiety levels, which might have contributed to the lack of difference between groups. Indeed, Hardt and Kamiya [63], in their alpha-upregulation NFT study, observed a reduction of anxiety level for high- but not low-anxiety subjects.
Further investigations with highly anxious participants should make it possible to test whether a specific behavioral effect of NF can be demonstrated. Overall, our findings showed that NFT induced behavioral benefits for all participants, but the part due to neuromodulation remained unclear. Indeed, the links between behavioral outcomes and neurophysiology are complex and include several factors [17], such as cognition, attention, motivation [28], training frequency [64], but also the choice of the neuromarker itself [19]. Further investigations should focus on the search for specific biomarkers related to psychophysiological factors, for example using neurophenomenology to study the link between neural activity modulation and the participant's inner experience [65].
Finally, departing from other studies using sham-FB control groups [20], we asked participants to assess their feeling of control during the training [66]. We found an increase of the feeling of control across sessions in both groups, which suggests that participants of the control group were not aware of the non-contingency between their efforts and the feedback signal and had an experience qualitatively similar to that of the NF group. Moreover, the increase in the feeling of control across sessions was more marked in the NF group. This suggests that the more active nature of FB modulation achieved in the NF group relative to the control group may have resulted in a greater feeling of control, possibly associated with a sense of agency in task performance [66]. Importantly, this result supports the view that the neuromodulation observed in the NF group was due to the subject's gaining control over his/her brain signal, rather than to motivational effects [20].
To conclude, our study demonstrated a selective and specific neuromodulation associated with NFT of alpha-band activity with a wearable dry-sensor EEG device. Even if the relationship between the targeted EEG modulation and behavioral outcomes is complex and remains to be fully elucidated, the recent development of such wearable EEG systems holds promise for making NF protocols easier to deploy, in order to induce neuromodulation in various applications such as Parkinson's disease, epilepsy, sleep, attention, or anxiety disorders, and to better understand the underlying mechanisms.