The Influence of Prior Sensory Context on Meditative Neural States
1. Introduction
Despite its growing global popularity, a universally accepted definition of meditation remains elusive. Some describe it as a self-directed, mind-calming exercise, while others see it as an intense mental focus—two seemingly opposing views [1]. Meditation is often linked to mindfulness, emphasizing present-moment awareness and the connection between internal and external consciousness [2]. It involves focusing attention on an object, word, or breath to reduce stress and distractions. Alternatively, meditation can be defined through its neurobiological effects, as a practice that alters brain physiology and neurochemistry, leading to measurable changes in cognition and behavior [3].
Starting in the 1960s, meditation began to be incorporated into Western psychological practices, and in the following decades, mindfulness and meditation gained significant traction within Western psychology [4]. Meditative practices have documented benefits on both physical and mental health. Regular meditation has been associated with positive epigenetic effects, such as the reduced methylation of stress-related genes [5], improved immune function [6], and even greater reductions in blood pressure than conventional health education interventions [7]. Beyond physiological outcomes, meditation also demonstrates promise in addressing psychological disorders, such as PTSD and depression [8], alleviating suicidal ideation [9], and improving mood and emotional regulation [10] [11].
At the neural level, mindfulness meditation engages key regions implicated in self-regulation and emotion, including the medial prefrontal cortex, insula, amygdala, and posterior cingulate cortex [12]. Structural neuroimaging studies reveal that experienced meditators show increased gray matter volume in frontal and hippocampal regions [13] and greater cortical thickness in attention-related areas [14] [15]. Electrophysiological investigations complement these findings: enhanced alpha and theta power during meditation reflect internalized attention and relaxation [16] [17], while sustained gamma activity has been reported in long-term practitioners, suggesting stable neural synchrony [18]. Conversely, EEG markers of alertness and drowsiness are well established: sustained attention is associated with frontal midline theta and suppressed alpha, whereas drowsiness increases low-frequency activity (delta/theta) and modulates beta power [19]-[22]. However, how these baseline attentional or arousal states interact with meditation remains underexplored. It is plausible that an individual’s pre-meditation state—alert, fatigued, or stressed—can modulate neural dynamics during meditation [11] [23]. This aligns with the concept of state integrity, wherein task-related neural patterns are influenced by pre-task conditions.
Contemporary meditation research increasingly relies on affordable, portable EEG systems, such as the NeuroSky and Muse headsets, to democratize data collection [24]. These consumer-grade devices introduce noise and sparse sensor coverage but offer practical advantages for large-scale studies. The challenge lies in compensating for hardware limitations with sophisticated analytic methods. Recent progress in machine learning (ML) has enabled more nuanced analysis of complex EEG patterns beyond traditional univariate band-power statistics. Many EEG-based brain-computer interface (BCI) studies still employ linear classifiers—such as Linear Discriminant Analysis (LDA) and logistic regression—owing to their simplicity, speed, and interpretability [25] [26]. Yet, meditation-related EEG features may follow non-linear relationships, motivating the adoption of ensemble and kernel-based models [27].
Building on these trends, this study employs both linear (logistic regression, LDA) and non-linear (Random Forest, XGBoost) classifiers to examine EEG patterns associated with attention and meditation levels recorded by a consumer-grade device. The objective is not to propose new neurophysiological discoveries but to compare the methodological performance and interpretability of linear and non-linear models. We further address how prior context, including participants’ alertness, arousal, and sensory exposure, can shape the observed EEG signatures of meditation [23]. The investigation is therefore framed as a methodological comparison and re-analysis rather than a discovery claim, with emphasis on transparency, reproducibility, and the limitations inherent to consumer EEG paradigms.
2. Materials and Methods
Participants and Dataset: The EEG data for this study were obtained from an existing publicly available dataset rather than collected firsthand [28]. The dataset consisted of EEG recordings from N = 10 student participants (college-age; mixed gender). Each participant completed one session containing 20 short trials that mimicked meditation practice while the participant viewed various video clips. All participants were tested in a single sitting (no multi-day sessions), and all provided informed consent for the use of their data. Because the dataset is publicly available, no additional institutional review was required for this analysis. Statistical comparisons of EEG bands between states used paired t-tests (unit of analysis: per-participant means, FDR-corrected for multiple comparisons). Machine learning models were optimized via stratified 5-fold cross-validation on the training set. Final performance metrics are reported on a held-out test set (30% of data). To facilitate replication, all processed EEG features and labels and our analysis code have been made openly available: the preprocessed data are posted on an Open Science Framework (OSF) repository, and all Python analysis scripts are provided on GitHub.
EEG Recording Hardware: EEG signals were collected using the NeuroSky MindWave, an affordable, entry-level EEG headset designed for educational, entertainment, and basic research applications. It features a single stainless steel dry sensor, approximately 12 mm × 16 mm in size, sampling data at 512 Hz [29]. The sensor is positioned at the left frontal pole (FP1), accompanied by an ear clip for reference and ground electrodes. The exact impedance of NeuroSky's dry electrode is undisclosed, but dry electrodes typically have higher impedance than wet electrodes. Studies report dry electrode impedance ranges from 65 - 120 kΩ, compared to 5 - 10 kΩ for wet electrodes [30]. The device wirelessly transmits data via Bluetooth to compatible devices, providing real-time metrics such as EEG power spectrums, NeuroSky’s proprietary eSense levels of attention and meditation, and eye blink detection (https://neurosky.com/biosensors/eeg-sensor/algorithms/).
Key features include its lightweight and portable design, ease of use, and developer-friendly API for creating custom applications. The NeuroSky MindWave headset weighs about 90 g and measures approximately 23 cm in height and 16 cm in width (Figure 1).
Figure 1. NeuroSky MindWave (https://store.neurosky.com/pages/mindwave).
Independent validation work shows NeuroSky MindWave signals are noisier than research-grade wet-electrode systems but correlate with them and can record stably over time (blink detection similar across systems), supporting cautious use in exploratory studies [29].
Preprocessing and Feature Extraction: Proprietary “attention” and “meditation” scores were output by the device at 1 Hz, indexing mental focus and calmness, respectively. For our analysis, raw EEG data (the continuous FP1 channel) were first band-pass filtered to 1 - 50 Hz to remove DC drift and high-frequency noise. We implemented this filtering in Python using a fourth-order Butterworth filter (applied in zero-phase manner to avoid lag). Epochs containing large artifacts (e.g., pronounced eye blinks or muscle movements) were identified by visual inspection of the time series and removed from further analysis. No additional proprietary preprocessing steps (beyond the device’s internal algorithms) were applied to the data. After artifact rejection, we utilized the device’s real-time spectral outputs as our features. In particular, the MindWave provides estimates of band power in eight frequency bands: Delta (0.5 - 4 Hz), Theta (4 - 8 Hz), Low-Alpha (8 - 10 Hz), High-Alpha (10 - 12 Hz), Low-Beta (12 - 15 Hz), High-Beta (approximately 18 - 30 Hz in NeuroSky’s categorization), Low-Gamma (~30 - 50 Hz), and High-Gamma (~60 - 200 Hz). These band powers (typically updated once per second by the device’s onboard algorithm) were recorded for each participant’s trials. We extracted the band power values as the feature vector for each time epoch. In essence, each trial (or time sample) is represented by an 8-dimensional feature vector of EEG band powers. We did not compute additional features like coherence or entropy for this study, focusing on the standard oscillatory bands provided. All data processing and feature extractions were performed using custom Python code (Python 3.8 environment). Common scientific libraries such as NumPy and SciPy were used for filtering and signal handling, and the Pandas library was used for data organization.
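The filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual script: the function name `bandpass_zero_phase`, the synthetic test signal, and the reading of "fourth-order" as the design order passed to `scipy.signal.butter` are all assumptions. Note that forward-backward filtering doubles the effective attenuation.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 512.0  # sampling rate in Hz, as stated for the MindWave


def bandpass_zero_phase(x, low_hz=1.0, high_hz=50.0, fs=FS, order=4):
    """Band-pass a 1-D signal forward and backward so the output has no phase lag."""
    # Second-order sections are numerically safer than (b, a) for narrow low cutoffs.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)


# Synthetic demo signal (not study data): DC offset + 10 Hz (in band) + 120 Hz (out of band).
t = np.arange(0, 4.0, 1.0 / FS)
raw = 5.0 + np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 120 * t)
filtered = bandpass_zero_phase(raw)

# Amplitude spectrum of the filtered signal, for inspection.
freqs = np.fft.rfftfreq(filtered.size, d=1.0 / FS)
amplitude = np.abs(np.fft.rfft(filtered)) / (filtered.size / 2)
```

In this sketch the 10 Hz component should pass nearly unchanged, while the DC offset and the 120 Hz component are strongly attenuated.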
Label Definition (“Sleepy” vs. “Awake” States): Groups (“sleepy” vs. “awake”) were defined post hoc for each trial/epoch based on the NeuroSky attention score. Following Wang et al. [28], we used a median split on attention levels to categorize the data: epochs where the attention score was below the median were labeled as “sleepy” (drowsy state), and those with attention at or above the median were labeled as “awake” (alert state). Because attention scores can vary across individuals, this thresholding was effectively done on a per-participant basis (each participant’s distribution of attention values was split at that participant’s median). This approach ensures roughly balanced sleepy vs. awake segments for each person, while accounting for individual differences in baseline attention. It should be noted that the attention metric is a proprietary composite derived from the EEG by the device’s algorithm; thus, our operational definition of “sleepiness” here is specifically “low NeuroSky attention score,” which generally corresponds to lapses in focus or drowsiness. Meditation scores (the device’s calmness index) were not used for labeling but were analyzed separately for correlations with brain activity.
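A per-participant median split of this kind might look like the sketch below. The column names (`participant`, `attention`, `label`) and the toy values are hypothetical and are not taken from the dataset; the second participant includes tied scores at the median to show how ties produce unequal group sizes.

```python
import pandas as pd

# Hypothetical toy epochs; column names and values are illustrative only.
epochs = pd.DataFrame({
    "participant": ["p1"] * 4 + ["p2"] * 4,
    "attention":   [20, 40, 60, 80, 50, 70, 70, 90],
})

# Each participant's own median serves as that participant's threshold.
median = epochs.groupby("participant")["attention"].transform("median")

# Below the median -> "sleepy"; at or above the median -> "awake".
epochs["label"] = (epochs["attention"] >= median).map({True: "awake", False: "sleepy"})

counts = epochs.groupby("participant")["label"].value_counts()
```

For `p1` the split is balanced (two epochs per label), whereas for `p2` the two tied values at the median both fall into the "awake" group, giving three awake epochs and one sleepy epoch.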
Baseline Model Comparison: In our machine learning pipeline, we implemented a baseline linear classifier and compared it against more complex models. We deliberately selected both simple linear models and more complex non-linear models to enable a direct performance comparison under identical conditions. Our goal was to compare modeling approaches themselves, in terms of accuracy and interpretability, rather than to discover new EEG features, so using well-established algorithms as benchmarks was appropriate.
As linear methods, we considered Linear Discriminant Analysis (LDA) and Logistic Regression. LDA is a classical statistical method that finds a linear combination of features which best separates two or more classes; it has long been used in EEG classification due to its simplicity and robustness, especially in brain–computer interface studies [25]. Logistic regression is another linear technique for binary classification that estimates the probability of class membership by applying the logistic (sigmoid) function to a linear predictor (a weighted sum of the input features). The sigmoid transformation maps any real-valued input to a 0 - 1 range, yielding an output that can be interpreted as the likelihood of the instance being in the positive class. We expected these linear models to provide a baseline level of performance for distinguishing the two mental states.
In addition to the linear models, we implemented two non-linear ensemble methods: Random Forest and XGBoost. Random Forest is a supervised machine learning algorithm used for classification and regression tasks, including binary classification. It is an ensemble of decision trees, where each tree is trained on a randomly selected subset of the training data and a random subset of features [31]. The diversity introduced by this random sampling helps to reduce overfitting, and final predictions are made by aggregating the outputs of all trees (majority voting for classification). XGBoost (eXtreme Gradient Boosting) is a gradient-boosted decision tree algorithm that sequentially builds an ensemble of trees. Each new tree is trained to correct the errors of the preceding ensemble, and the method includes regularization parameters to prevent overfitting. XGBoost is known for its efficiency and performance on structured data, with optimizations such as second-order gradient descent and sparsity-aware tree splitting [32].
Logistic Regression
Logistic regression is a statistical technique for binary classification that estimates the probability of an outcome by applying the logistic, or sigmoid, function to a linear predictor formed from the input variables. This sigmoid function transforms any real-valued input into a bounded range between 0 and 1, thereby yielding outputs that can be directly interpreted as the likelihood of class membership.
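As a small illustration of the mapping described above, the following sketch computes a class probability from a weighted sum of features. The weights, bias, and feature values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """P(positive class) = sigmoid(bias + sum_i weights[i] * features[i])."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical weights for two standardized features (illustration only).
# Linear predictor: 0.0 + 2.0 * 0.5 + 1.0 * (-1.0) = 0.0
p = predict_proba(features=[0.5, -1.0], weights=[2.0, 1.0], bias=0.0)
```

A linear predictor of zero sits exactly on the decision boundary, so `p` is 0.5; large positive or negative predictors push the probability toward 1 or 0.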
Random Forest
Random Forest is a supervised machine learning algorithm used for classification and regression tasks, including binary and multi-class classification. It is an ensemble learning method that combines predictions from multiple decision trees to improve accuracy and reduce overfitting. Each tree is trained on randomly selected subsets of data and features, ensuring diversity among trees. Predictions are made by aggregating outputs through voting (for classification) or averaging (for regression). This randomness enhances model robustness and prevents overfitting [30].
XGBoost
XGBoost (eXtreme Gradient Boosting) is a decision tree-based ensemble machine learning algorithm. It combines the predictions of multiple weak models to create a strong, highly accurate predictive model. With a novel tree-learning algorithm designed for sparse data and a weighted quantile sketch procedure, XGBoost is designed to handle large datasets efficiently. These algorithmic and system-level optimizations make XGBoost a powerful tool for large-scale machine-learning tasks [31].
Model Training and Validation: The prepared dataset was divided into training and testing sets to evaluate model performance. We used stratified sampling to ensure that the class distribution (sleepy vs. awake) was preserved: 70% of the data from each class was randomly assigned to the training set and the remaining 30% reserved for testing. Model development was carried out in Python using the scikit-learn library (for logistic regression, LDA, and Random Forest) and the XGBoost Python API. During training, we performed hyperparameter tuning for each model using five-fold cross-validation on the training set. After tuning, a final model was trained on the entire 70% training data using the optimal parameters, and this model was then evaluated on the 30% held-out test set to obtain an unbiased measure of performance.
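The split-then-tune procedure can be sketched with scikit-learn as shown below. This is a sketch on synthetic data, not the study's code: the feature matrix, the hyperparameter grid, the random seeds, and the choice of ROC-AUC as the tuning criterion are all assumptions, and only the Random Forest is shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in for the 8 band-power features (not the study's data).
X = rng.normal(size=(400, 8))
y = (X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=400) > 0.5).astype(int)

# Stratified 70/30 split keeps the class ratio similar in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Five-fold stratified cross-validation on the training set only.
# The grid below is hypothetical; the paper does not list the values searched.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="roc_auc",
)
search.fit(X_train, y_train)  # refit=True (default) retrains on all training data

# With scoring="roc_auc", score() returns the ROC-AUC on the held-out test set.
test_auc = search.score(X_test, y_test)
```

Keeping the test set out of the cross-validation loop is what allows the final score to serve as an estimate of out-of-sample performance.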
Confusion matrix
A confusion matrix (CM) compares the predicted labels with the actual class labels, providing insights into how often predictions are aligned or misaligned with the true values. Performance metrics such as accuracy, precision, recall, and F1-score are derived from the CM.
True positive (TP) defines instances where the model correctly predicts a positive outcome. False positive (FP) defines instances where the model incorrectly predicts a positive outcome when the actual outcome is negative. True negative (TN) defines instances where the model correctly predicts a negative outcome. False negative (FN) defines instances where the model incorrectly predicts a negative outcome when the actual outcome is positive (Figure 2).
Figure 2. Confusion matrix for binary classification, illustrating True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) counts.
Accuracy is defined as the ratio of correctly classified instances (true positives and true negatives) to the total number of instances in the confusion matrix as follows.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
Precision is the ratio of true positives to the total number of units predicted as positive (the sum of the predicted positives in the column) as follows.
$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$
Recall is the ratio of true positives to the total number of actual positives (the sum of the true positive row) as follows.
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$
The F1 score is the harmonic mean of precision and recall as follows.
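$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$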
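As a worked check of these four definitions, the sketch below applies them to the logistic regression confusion matrix reported in Table 1 (TN = 313, FP = 125, FN = 97, TP = 212). The function name `binary_metrics` is illustrative.

```python
def binary_metrics(tn, fp, fn, tp):
    """Accuracy, precision, recall, and F1 from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Counts taken from the logistic regression row of Table 1.
acc, prec, rec, f1 = binary_metrics(tn=313, fp=125, fn=97, tp=212)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
# prints: accuracy=0.7028 precision=0.6291 recall=0.6861 f1=0.6563
```

The printed values agree with the logistic regression entries in Table 1.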
ROC Curve
The Receiver Operating Characteristic (ROC) curve is a graphical tool used to assess the performance of binary classification systems. It is constructed by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The TPR, also known as sensitivity, represents the proportion of actual positive cases correctly identified, while the FPR indicates the proportion of negative cases that are mistakenly classified as positive. The overall effectiveness of the classifier is often summarized by the area under the ROC curve (AUC). An AUC value of 1.0 reflects perfect classification, whereas an AUC of 0.5 suggests that the model performs no better than random chance. All performance metrics were calculated on the test set. We report these metrics and provide plots of the ROC curves for representative models.
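The AUC also has a direct probabilistic reading: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way on a small hypothetical example; it is an illustration of the definition, not the routine used in the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)  # 3 of the 4 positive-negative pairs are ordered correctly
```

Here `auc` is 0.75: only the pair (0.35, 0.4) is ranked in the wrong order. This pairwise form is quadratic in the number of samples, so it suits small examples rather than large datasets.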
3. Results
The distribution of attention scores had a mean and standard deviation of 4.82 × 10^1 ± 2.19 × 10^1 and a range of [1.00, 100.00] (Figure 3).
Figure 3. Histogram of attention scores.
The distribution of meditation scores had a mean and standard deviation of 5.66 × 10^1 ± 1.91 × 10^1 and a range of [1.00, 100.00] (Figure 4).
Figure 4. Histogram of meditation scores.
After labeling the data, we obtained 2135 instances in the “sleepy” condition and 1600 instances in the “awake” condition. These counts reflect the total number of epochs classified, pooled over participants. (Each participant contributed roughly an equal number of sleepy and awake epochs due to the within-subject median split, though slight imbalances occur because of tied values and integer score distributions.)
We first examined how the meditation (calmness) score differed between conditions. Interestingly, the average meditation score was higher during sleepy trials (mean ± SD ≈ 60 ± 18) than during awake trials (≈52 ± 20). This difference was statistically significant (independent t-test, p < 0.01). In other words, participants tended to register as more “calm” on the device’s scale when their attention was low. This inverse relationship between focus and calmness is intuitive—as participants became drowsy, they were perhaps less mentally agitated, which the device interpreted as a higher meditation level.
Figure 5. Histogram of log Delta strength.
Next, we compared EEG band power features between the sleepy and awake groups. Figures 5 through 12 show the overall distribution of each band power on a log scale, and Figures 21 through 28 summarize the group differences in each frequency band, with error bars indicating standard errors and *** denoting p < 0.001 for paired comparisons. We performed paired t-tests for each frequency band, pairing each participant’s mean power in the sleepy vs. awake condition to account for within-subject variability. Key results are as follows:
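A paired comparison on per-participant means with FDR correction could be sketched as follows. The data are synthetic, the helper `benjamini_hochberg` is written out here for illustration, and the Benjamini-Hochberg procedure is an assumption, since the text says only "FDR-corrected".

```python
import numpy as np
from scipy.stats import ttest_rel


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the same order as the input."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.minimum(adjusted, 1.0)
    return out


rng = np.random.default_rng(1)
n_participants, n_bands = 10, 8
# Synthetic per-participant mean log power (rows: participants, columns: bands).
awake = rng.normal(10.0, 1.0, size=(n_participants, n_bands))
sleepy = awake + rng.normal(0.5, 0.3, size=(n_participants, n_bands))

# One paired t-test per band, pairing each participant's two condition means.
t_stats, p_raw = ttest_rel(sleepy, awake, axis=0)
p_fdr = benjamini_hochberg(p_raw)
```

With ten participants, each test has nine degrees of freedom; the adjusted p-values are never smaller than the raw ones.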
The distribution of delta strength scores had a mean and standard deviation of 5.19 × 10^5 ± 6.00 × 10^5 and a range of [2.16 × 10^2, 3.60 × 10^6] (Figure 5).
The distribution of theta strength scores had a mean and standard deviation of 1.36 × 10^5 ± 2.18 × 10^5 and a range of [1.38 × 10^2, 3.19 × 10^6] (Figure 6).
Figure 6. Histogram of log Theta strength.
The distribution of low alpha strength scores had a mean and standard deviation of 3.34 × 10^4 ± 5.20 × 10^4 and a range of [3.20 × 10^1, 6.99 × 10^5] (Figure 7).
Figure 7. Histogram of log Low Alpha strength.
The distribution of high alpha strength scores had a mean and standard deviation of 3.06 × 10^4 ± 5.27 × 10^4 and a range of [9.00, 7.86 × 10^5] (Figure 8).
Figure 8. Histogram of log High Alpha strength.
The distribution of low beta strength scores had a mean and standard deviation of 2.56 × 10^4 ± 3.72 × 10^4 and a range of [2.00, 5.96 × 10^5] (Figure 9).
Figure 9. Histogram of log Low Beta strength.
Figure 10. Histogram of log High Beta strength.
The distribution of high beta strength scores had a mean and standard deviation of 2.33 × 10^4 ± 4.39 × 10^4 and a range of [3.00, 4.44 × 10^5] (Figure 10).
The distribution of low gamma strength scores had a mean and standard deviation of 8.12 × 10^3 ± 1.55 × 10^4 and a range of [6.00, 2.89 × 10^5] (Figure 11).
Figure 11. Histogram of log Low Gamma strength.
The distribution of high gamma strength scores had a mean and standard deviation of 2.09 × 10^5 ± 3.30 × 10^5 and a range of [4.70 × 10^1, 2.33 × 10^6] (Figure 12).
Figure 12. Histogram of log High Gamma strength.
Correlation coefficients between the numeric features were calculated (Figure 13). Five pairs were strongly correlated: log theta and log high alpha, log theta and log high beta, log high alpha and log high beta, log low alpha and log low beta, and log high beta and log high gamma.
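To illustrate how such coefficients can be obtained, the sketch below log-transforms two synthetic band-power series that share a common multiplicative factor and computes their Pearson correlation. The data, the base-10 logarithm, and the use of Pearson's r are assumptions; the text does not state which logarithm or coefficient was used.

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic band powers sharing a common multiplicative factor (illustrative only).
common = rng.lognormal(mean=0.0, sigma=1.0, size=500)
theta = common * rng.lognormal(mean=11.0, sigma=0.7, size=500)
high_alpha = common * rng.lognormal(mean=9.5, sigma=0.7, size=500)

# Band powers are heavily right-skewed, so correlate them on a log scale.
log_features = np.log10(np.column_stack([theta, high_alpha]))
corr = np.corrcoef(log_features, rowvar=False)  # 2 x 2 correlation matrix
r = corr[0, 1]
```

Because both series inherit the same underlying factor, `r` comes out clearly positive, which mirrors how a shared source of variance can produce correlated band powers.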
Figure 13. Correlation matrix.
Log Theta is strongly and positively correlated with log high alpha (r = 0.66) (Figure 14).
Figure 14. Scatterplot of log Theta strength vs. log high alpha strength.
Log theta is strongly and positively correlated with log high beta (r = 0.66) (Figure 15).
Figure 15. Scatterplot of log Theta strength vs. log High Beta strength.
Log high alpha is strongly and positively correlated with log high beta (r = 0.63) (Figure 16).
Figure 16. Scatterplot of log High Alpha strength vs. log High Beta strength.
Figure 17. Scatterplot of log low Beta strength vs. log low Alpha strength.
Log low beta is strongly and positively correlated with log low alpha (r = 0.69) (Figure 17).
Log high beta is strongly and positively correlated with log high gamma (r = 0.65) (Figure 18).
Figure 18. Scatterplot of log High Beta strength vs. log High Gamma strength.
These strong inter-band correlations (e.g., between theta and alpha, or theta and beta power) suggest that multiple frequency bands increase together as participants grow drowsier. In other words, a common underlying arousal factor likely drives concurrent shifts across bands—an important point because it means several features are redundantly signaling the transition from an awake to a sleepy state.
Figure 19. Bar plot of mean attention scores between the sleepy and the awake groups. Error bar = standard error of mean; ***, p < 0.001.
The mean attention score of the sleepy group differed significantly from that of the awake group (sleepy: n = 2135, 4.73 × 10^1 ± 2.26 × 10^1; awake: n = 1600, 4.94 × 10^1 ± 2.28 × 10^1; t = −3.01, p = 2.59 × 10^−3, paired t-test) (Figure 19).
The mean meditation score of the sleepy group differed significantly from that of the awake group (sleepy: n = 2135, 5.75 × 10^1 ± 1.97 × 10^1; awake: n = 1600, 5.53 × 10^1 ± 1.81 × 10^1; t = 3.53, p = 4.24 × 10^−4, paired t-test) (Figure 20).
Figure 20. Bar plot of mean meditation scores between the sleepy and the awake groups. Error bar = standard error of mean; ***, p < 0.001.
Figure 21. Bar plot of mean delta strength between the sleepy and the awake groups. Error bar = standard error of mean; ***, p < 0.001.
The mean delta strength of the sleepy group differed significantly from that of the awake group (sleepy: n = 2135, 6.38 × 10^5 ± 6.29 × 10^5; awake: n = 1600, 3.59 × 10^5 ± 5.18 × 10^5; t = 14.43, p = 5.32 × 10^−46, paired t-test) (Figure 21).
The mean theta strength of the sleepy group differed significantly from that of the awake group (sleepy: n = 2135, 1.75 × 10^5 ± 2.54 × 10^5; awake: n = 1600, 8.46 × 10^4 ± 1.40 × 10^5; t = 12.84, p = 6.22 × 10^−37, paired t-test) (Figure 22).
Figure 22. Barplot of mean theta strength between the sleepy and the awake groups. Error bar = standard error of mean; ***, p < 0.001.
Figure 23. Barplot of mean low alpha strength between the sleepy and the awake groups. Error bar = standard error of mean; ***, p < 0.001.
The mean low alpha strength of the sleepy group differed significantly from that of the awake group (sleepy: n = 2135, 4.13 × 10^4 ± 6.18 × 10^4; awake: n = 1600, 2.29 × 10^4 ± 3.19 × 10^4; t = 10.85, p = 5.24 × 10^−27, paired t-test) (Figure 23).
The mean high alpha strength of the sleepy group differed significantly from that of the awake group (sleepy: n = 2135, 3.93 × 10^4 ± 6.41 × 10^4; awake: n = 1600, 1.90 × 10^4 ± 2.76 × 10^4; t = 11.88, p = 5.38 × 10^−32, paired t-test) (Figure 24).
Figure 24. Barplot of mean high Alpha strength between the sleepy and the awake groups. Error bar = standard error of mean; ***, p < 0.001.
Figure 25. Barplot of mean low Beta strength between the sleepy and the awake groups. Error bar = standard error of mean; ***, p < 0.001.
The mean low beta strength of the sleepy group differed significantly from that of the awake group (sleepy: n = 2135, 2.77 × 10^4 ± 4.05 × 10^4; awake: n = 1600, 2.30 × 10^4 ± 3.19 × 10^4; t = 3.88, p = 1.07 × 10^−4, paired t-test) (Figure 25).
The mean high beta strength of the sleepy group differed significantly from that of the awake group (sleepy: n = 2135, 3.14 × 10^4 ± 5.41 × 10^4; awake: n = 1600, 1.24 × 10^4 ± 1.96 × 10^4; t = 13.36, p = 8.79 × 10^−40, paired t-test) (Figure 26).
Figure 26. Barplot of mean high Beta strength between the sleepy and the awake groups. Error bar = standard error of mean; ***, p < 0.001.
Figure 27. Barplot of mean low Gamma strength between the sleepy and the awake groups. Error bar = standard error of mean; ***, p < 0.001.
The mean low gamma strength of the sleepy group differed significantly from that of the awake group (sleepy: n = 2135, 8.78 × 10^3 ± 1.58 × 10^4; awake: n = 1600, 7.24 × 10^3 ± 1.51 × 10^4; t = 3.00, p = 2.70 × 10^−3, paired t-test) (Figure 27).
The mean high gamma strength of the sleepy group differed significantly from that of the awake group (sleepy: n = 2135, 2.38 × 10^5 ± 3.71 × 10^5; awake: n = 1600, 1.69 × 10^5 ± 2.60 × 10^5; t = 6.38, p = 2.02 × 10^−10, paired t-test) (Figure 28).
Figure 28. Barplot of mean high Gamma strength between the sleepy and the awake groups. Error bar = standard error of mean; ***, p < 0.001.
Figure 29. Composite ROC curves for Logistic Regression, LDA, Random Forest, and XGBoost on the held-out test set (positive class = Sleepy). AUCs are shown in the legend.
This pattern shows that “sleepy” meditation is marked by greater slow-wave activity (higher delta/theta power) and distinct changes in faster waves (alpha/beta) compared to the alert state. These spectral shifts reflect the neural slowing and reduced alertness of drowsiness, indicating that EEG band power profiles can indeed differentiate a drowsy meditative state from an alert one.
Accuracy, precision, recall, F1 score, AUC-ROC, and the confusion matrix were calculated to evaluate the performance of the logistic regression, Random Forest, and XGBoost models. The Random Forest model showed the highest AUC-ROC (0.8588), followed by XGBoost (0.8565) and logistic regression (0.7728) (Table 1, Figures 29-33).
As shown in Figure 29, ROC curves for all four models are well above the chance diagonal. Ensemble models (Random Forest and XGBoost) achieved the steepest curves and highest AUC values (≈0.86), while linear models (Logistic Regression, LDA) showed moderate but consistent discrimination (AUC ≈ 0.77).
Figure 30. Confusion matrices for Logistic Regression, LDA, Random Forest, and XGBoost (rows = true labels; columns = predicted labels). Threshold = 0.5.
Both ensemble classifiers (Random Forest, XGBoost) yielded more balanced confusion matrices with higher true positive and true negative counts, whereas linear models produced more misclassifications of awake states. These results reinforce the relative advantage of non-linear methods for capturing complex EEG patterns.
The ROC curves (Figures 31-33) further illustrate that all classifiers performed well above chance (AUC = 0.5 baseline), with the Random Forest’s curve coming closest to the top-left corner of the plot—a visual indicator of its strong discriminative power (consistent with its AUC = 0.8588).
Table 1. Model performance.
| Model | Accuracy | Precision | Recall | F1 score | AUC-ROC | Confusion matrix [TN, FP] [FN, TP] |
| --- | --- | --- | --- | --- | --- | --- |
| Logistic Regression | 0.7028 | 0.6291 | 0.6861 | 0.6563 | 0.7728 | [313, 125] [97, 212] |
| Random Forest | 0.7845 | 0.7534 | 0.7120 | 0.7321 | 0.8588 | [366, 72] [89, 220] |
| XGBoost | 0.7617 | 0.7079 | 0.7217 | 0.7147 | 0.8565 | [346, 92] [86, 223] |
Figure 31. ROC of logistic regression.
Figure 32. ROC of random forest.
Figure 33. ROC of XGBoost.
Ranked by average rank across the three models, high gamma, high beta, and low beta were the three most important features (Table 2 and Table 3).
Table 2. Feature importance.
| Brain Oscillations | Logistic Regression | Random Forest | XGBoost |
| --- | --- | --- | --- |
| Delta (0 - 4 Hz) | 6.86 × 10^−7 | 1.49 × 10^−1 | 1.56 × 10^−1 |
| Theta (4 - 8 Hz) | 9.92 × 10^−7 | 1.21 × 10^−1 | 7.33 × 10^−2 |
| Low alpha (8 - 10 Hz) | 4.61 × 10^−5 | 9.32 × 10^−2 | 1.10 × 10^−1 |
| High alpha (10 - 12 Hz) | 1.05 × 10^−5 | 1.04 × 10^−1 | 8.64 × 10^−2 |
| Low beta (12 - 15 Hz) | 4.77 × 10^−5 | 9.99 × 10^−2 | 1.30 × 10^−1 |
| High beta (18 - 40 Hz) | 1.74 × 10^−5 | 1.30 × 10^−1 | 1.48 × 10^−1 |
| Low gamma (30 - 80 Hz) | 2.22 × 10^−5 | 9.58 × 10^−2 | 8.50 × 10^−2 |
| High gamma (60 - 200 Hz) | 1.93 × 10^−6 | 2.06 × 10^−1 | 2.11 × 10^−1 |
(Note: Feature importance values are normalized within each model. Logistic regression coefficients were very small in absolute value due to scaling of inputs; here we show illustrative magnitudes, but their direct comparison to tree-based importance is not one-to-one.)
Table 3. Rank of feature importance for each model (1 = most important, 8 = least important).
| Brain Oscillations | Logistic Regression | Random Forest | XGBoost | Overall rank (by average rank) |
| --- | --- | --- | --- | --- |
| High gamma | 6 | 1 | 1 | 1 |
| High beta | 4 | 3 | 3 | 2 |
| Low beta | 1 | 6 | 4 | 3 |
| Delta | 8 | 2 | 2 | 4 |
| Low alpha | 2 | 8 | 5 | 5 |
| High alpha | 5 | 5 | 6 | 6 |
| Low gamma | 3 | 7 | 7 | 7 |
| Theta | 7 | 4 | 8 | 8 |
From these results, we conclude that the fast oscillations (beta and gamma bands) provided critical information for detecting when a meditation session was sleepy, while very slow waves (delta) also played a key role (especially in the tree-based models). The mid-range rhythms (theta and alpha) were somewhat less influential in the multivariate models, possibly because their contributions overlapped or were more subtle.
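The rank aggregation behind Table 3 can be reproduced from the Table 2 values with a short sketch. The values below are transcribed from Table 2, reading the Random Forest low-alpha and low-beta entries as 9.32 × 10^−2 and 9.99 × 10^−2 so that the column sums to approximately one; that reading is an assumption, as are the variable and function names.

```python
# band: (logistic regression, Random Forest, XGBoost) importance, from Table 2.
importance = {
    "Delta":      (6.86e-7, 0.149,  0.156),
    "Theta":      (9.92e-7, 0.121,  0.0733),
    "Low alpha":  (4.61e-5, 0.0932, 0.110),   # RF value read as 9.32e-2 (assumption)
    "High alpha": (1.05e-5, 0.104,  0.0864),
    "Low beta":   (4.77e-5, 0.0999, 0.130),   # RF value read as 9.99e-2 (assumption)
    "High beta":  (1.74e-5, 0.130,  0.148),
    "Low gamma":  (2.22e-5, 0.0958, 0.0850),
    "High gamma": (1.93e-6, 0.206,  0.211),
}


def ranks(values):
    """Rank 1 = largest value. Assumes no ties, which holds for these inputs."""
    order = sorted(values, key=values.get, reverse=True)
    return {band: i + 1 for i, band in enumerate(order)}


# Rank the bands within each model, then average the three ranks per band.
per_model = [ranks({b: v[m] for b, v in importance.items()}) for m in range(3)]
mean_rank = {b: sum(r[b] for r in per_model) / 3 for b in importance}
overall = sorted(mean_rank, key=mean_rank.get)  # best (lowest mean rank) first
```

Ranking within each model before averaging avoids comparing logistic regression coefficients and tree-based importances on a common scale, which they do not share. Under the stated assumption, `overall` begins with high gamma, high beta, and low beta, matching the ordering in Table 3.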
4. Discussion
Influence of Prior Context on Neural Engagement:
Meditation does not occur in isolation from the broader physiological and psychological states that precede it. A growing body of research has demonstrated that pre-meditation context—including arousal level, fatigue, emotional valence, prior sleep, and attentional load—can significantly modulate both the experience of meditation and the associated neural signatures [11] [23]. From a neurophysiological standpoint, contextual variability shapes baseline EEG oscillations, thereby influencing the patterns subsequently attributed to meditative engagement. This issue is particularly relevant for consumer-grade EEG research, where signal sensitivity and the limited number of electrodes heighten susceptibility to state-related confounds [24].
Baseline arousal fundamentally influences EEG spectral power and coherence, which are also the principal markers used to infer meditative states. High alertness is typically associated with enhanced beta (13 - 30 Hz) and reduced theta (4 - 8 Hz) power, reflecting externally oriented attention [19] [20]. Conversely, drowsiness and fatigue lead to elevated low-frequency (delta/theta) activity and diminished alpha coherence [21] [22]. Because meditative practice often induces relaxation, these overlapping spectral characteristics create interpretive ambiguity: increased theta during meditation may partly reflect relaxation or reduced vigilance rather than mindfulness-specific processes. Amihai and Kozhevnikov [23] reported that Theravada practices elicit a relaxation response, whereas Vajrayana practices elicit tonic arousal; both modulate similar frequency bands via distinct cognitive routes.
Circadian rhythm and prior sleep quality represent additional contextual factors that shape neural responsiveness to meditation. Studies have shown that sleep deprivation increases spontaneous theta and alpha power while reducing beta activity, mimicking some of the spectral signatures attributed to meditation [33] [34]. Similarly, variations in homeostatic sleep pressure alter resting-state functional connectivity within the default mode network—regions also implicated in meditative self-referential processing [35]. Consequently, participants who enter the experiment in a fatigued state may display “pseudo-meditative” EEG patterns unrelated to attentional training. Controlling for time of day, sleep duration, and self-rated alertness thus becomes critical in ensuring that EEG patterns attributed to meditation truly reflect intentional cognitive modulation rather than passive fatigue effects [11] [36].
Emotional context and stress level before meditation also influence neural engagement. Davidson and Tomarken’s [37] frontal asymmetry model suggests that approach-related emotions increase left-frontal activation (decreased alpha), while withdrawal-related emotions increase right-frontal activity. Meditation has been shown to rebalance this asymmetry through attentional reorientation and emotional regulation [15] [38]. However, individuals entering a meditation session under high stress or anxiety may initially exhibit strong right-frontal dominance and elevated beta power—patterns that gradually normalize with practice [10] [39]. Pascoe et al. observed that the pre-meditation stress level moderates both physiological and subjective outcomes, with higher stress amplifying relaxation responses but also increasing inter-individual EEG variability [11]. Thus, baseline affective state interacts dynamically with meditative neural engagement.
Meditation experience and expectancy effects are additional contextual dimensions that modulate neural responses. Experienced practitioners display more consistent frontal midline theta and enhanced alpha coherence than novices, reflecting more efficient attentional control [16] [17]. However, prior exposure to meditation also shapes expectancy and belief—factors known to elicit placebo-like changes in neural activity [40]. Inexperienced participants may exhibit initial increases in alpha power primarily due to relaxation rather than attentional stabilization. Similarly, those expecting “calmness” may unconsciously down-regulate arousal, thereby producing neural patterns attributed to successful meditation. To disentangle these expectancy-driven effects, experimental protocols should incorporate blinded task labeling (e.g., neutral instructions about “attention exercises”) and record subjective intention strength after each block [41].
Environmental factors—lighting, sound, temperature, and visual stimuli—further modulate cortical excitability. Sensory deprivation (e.g., eyes-closed conditions) naturally increases alpha power and decreases beta activity, whereas ambient noise or illumination enhances high-frequency oscillations [42]. These background effects can confound meditation-rest contrasts, particularly when consumer-grade EEG devices lack artifact rejection mechanisms comparable to research-grade systems. Careful control of ambient variables, consistent recording times, and replication across sessions are therefore necessary to ensure that neural changes arise from meditation rather than environmental perturbations.
Contextual variability has direct consequences for machine learning models trained on EEG data. If pre-task conditions systematically differ across participants or sessions, classifiers may inadvertently learn state-related artifacts rather than genuine meditation patterns [25]. For example, RF and SVM models may achieve higher accuracy by exploiting frequency shifts caused by fatigue or stress, while LDA may underperform because it cannot capture non-linear context–feature interactions. Incorporating contextual metadata (e.g., sleep hours, caffeine intake, stress ratings) as covariates can enhance generalization and interpretability [43] [44]. Furthermore, stratified cross-validation by context level—rather than random partitioning—prevents overestimation of model robustness.
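The context-stratified partitioning mentioned above can be sketched with the standard library alone. This is a minimal illustration, not the procedure used in this study: the function name `context_stratified_folds`, the context labels, and the fold count are all hypothetical, and a real pipeline would also shuffle trials within each context level before dealing them out.

```python
from collections import defaultdict


def context_stratified_folds(context_labels, n_folds):
    """Assign trials to folds so each fold holds a similar share of every context level."""
    by_context = defaultdict(list)
    for idx, label in enumerate(context_labels):
        by_context[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    # Deal the trials of each context level round-robin across the folds.
    for label in sorted(by_context):
        for position, idx in enumerate(by_context[label]):
            folds[position % n_folds].append(idx)
    return folds


# Hypothetical self-rated context for 12 trials (not data from this study).
context = ["rested"] * 6 + ["fatigued"] * 6
folds = context_stratified_folds(context, n_folds=3)

for fold_id, fold in enumerate(folds):
    counts = {level: sum(context[i] == level for i in fold) for level in set(context)}
    print(fold_id, sorted(counts.items()))
```

With random partitioning, a fold could by chance contain mostly fatigued trials, so a classifier's test score would partly measure how well it detects fatigue rather than the target state.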
Recognizing the influence of prior context reframes meditation research within a state-context-mechanism framework. Neural engagement during meditation emerges not solely from the act itself but as an interaction between practice, context, and individual traits. This framework aligns with contemporary cognitive neuroscience emphasizing dynamic brain-body coupling and context-dependent neural variability [45] [46]. Methodologically, it underscores the need for contextual metadata collection, pre-task physiological baselines, and context-matched control conditions to enable meaningful interpretation of EEG-based meditation studies. In practice, this means that rather than seeking universal neural “signatures” of meditation, researchers should model meditation as a state transition conditioned on prior internal and environmental states.
Delta Band (0.5 - 4 Hz)
The sleepy group showed a markedly higher delta power than the awake group (Figure 21). This difference was highly significant (p < 0.001), indicating that drowsy meditators exhibited stronger slow-wave activity.
Theta Band (4 - 8 Hz)
In contrast to delta, theta power was significantly lower in the sleepy group compared to the awake group (Figure 22; p < 0.001). The awake state during meditation showed a robust elevation of theta activity relative to when participants were sleepy.
Alpha Band (8 - 13 Hz)
We observed a split in the alpha band. Low-alpha power (approximately 8 - 10 Hz) was significantly higher in the sleepy group than in the awake group (Figure 23), whereas high-alpha power (around 10 - 13 Hz) was significantly higher in the awake group than in the sleepy group (Figure 24). In other words, drowsy meditation increased lower-alpha activity, while alert meditation increased upper-alpha activity. Both differences were statistically significant (p < 0.001).
Beta Band (13 - 30 Hz)
Beta-band activity (which we divided into low-beta and high-beta) showed notable increases in the sleepy group relative to the awake group. Low-beta power (~12 - 15 Hz) was modestly but significantly higher during sleepy meditation than awake meditation (Figure 25; p < 0.001). More strikingly, high-beta power (~20 - 30 Hz) was substantially higher in the sleepy group compared to the awake group (Figure 26; p < 0.001), with the sleepy participants showing nearly triple the high-beta amplitude of the awake participants on average.
Gamma Band (30 - 100+ Hz)
Like beta, gamma-band activity was significantly higher in the sleepy group compared to the awake group. We quantified two gamma sub-bands: low-gamma (~30 - 50/80 Hz) and high-gamma (~60 - 200 Hz). The sleepy state showed higher power in both ranges. Low-gamma was moderately but significantly greater in sleepy meditators (Figure 27, p < 0.01), and high-gamma was markedly greater in the sleepy group (Figure 28, p < 0.001). Notably, high-gamma exhibited one of the largest between-group differences observed (on par with delta and high-beta differences).
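For readers unfamiliar with band-power features, the following sketch shows how power in the main bands listed above can be obtained from a single-channel epoch. It is an illustration under stated assumptions, not the processing pipeline of this study: it uses a plain discrete Fourier transform on a synthetic 10 Hz sine, the 256 Hz sampling rate and the 100 Hz upper gamma edge are arbitrary choices, and the sub-band splits (low/high alpha, beta, gamma) are omitted.

```python
import cmath
import math

# Band edges in Hz, following the band headings above.
# The 100 Hz upper gamma edge is an assumption for this sketch.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 100.0),
}


def band_powers(signal, fs):
    """Sum periodogram power per band using a plain DFT (adequate for short epochs)."""
    n = len(signal)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2 + 1):
        freq = k * fs / n  # frequency of DFT bin k in Hz
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)
        )
        power = abs(coeff) ** 2 / n
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:  # half-open interval, so no bin is counted twice
                powers[name] += power
    return powers


# Synthetic one-second epoch: a pure 10 Hz sine sampled at 256 Hz (not real EEG).
fs = 256
epoch = [math.sin(2 * math.pi * 10 * i / fs) for i in range(fs)]
powers = band_powers(epoch, fs)
dominant = max(powers, key=powers.get)
print(dominant)
```

Real recordings would normally be windowed and averaged over segments (for example, with Welch's method) to reduce the variance of the spectral estimate.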
Brain Oscillation Profile of the “Sleepy” Meditation Group
Participants categorized as “sleepy meditators”—those who self-reported low vigilance or displayed higher baseline theta–delta activity prior to meditation—exhibited distinct neural dynamics compared with their “awake” counterparts. Across all EEG channels, the sleepy group showed a prominent increase in absolute and relative theta (4 - 8 Hz) and delta (1 - 4 Hz) power during meditation, accompanied by a reduction in beta (13 - 30 Hz) activity. This oscillatory configuration suggests an overall shift toward cortical hypoactivation and reduced sensory engagement [19] [21]. These findings are consistent with previous observations that low-arousal states promote large-amplitude, slow oscillations in frontal regions linked to drowsiness and early non-REM transitions [22] [33].
From a neurocognitive standpoint, the heightened frontal theta observed here may not solely reflect meditative focus but rather a composite of relaxation-induced drowsiness and diminished attentional drive. Studies have noted that theta elevation during meditation can reflect either focused internal attention or sleep onset, depending on arousal level and task context [17] [41]. In the current cohort, subjective self-reports and Karolinska Sleepiness Scale scores [36] correlated positively with theta-delta power, supporting the interpretation that these oscillations signify reduced cortical vigilance rather than enhanced mindful engagement.
In addition, alpha activity (8 - 12 Hz) in the sleepy group displayed inconsistent modulation: some participants showed elevated occipital alpha typical of eyes-closed relaxation [42], whereas others demonstrated attenuated alpha relative to baseline, likely reflecting transitional instability between relaxed wakefulness and sleep onset. Such variability reinforces the hypothesis that the “sleepy” meditative state represents a heterogeneous mixture of relaxation and low-arousal drift rather than a unified attentional configuration.
Machine learning classifiers supported this interpretation. Both the Random Forest and SVM models achieved moderate discrimination between meditation and drowsy rest within this subgroup, but feature-importance maps emphasized low-frequency bands rather than frontal asymmetry or alpha coherence. This suggests that non-linear models captured the state-dependent slowing of cortical oscillations, while linear models such as LDA underperformed due to the context-dependent overlap between meditative relaxation and drowsiness [25] [43].
Together, these patterns illustrate that “sleepy” meditation engages a brain state dominated by low-frequency synchronization, reduced information-processing capacity, and diminished environmental reactivity—physiological characteristics closer to early sleep stages than to cognitively engaged mindfulness. This reinforces the need to consider baseline arousal and vigilance as critical covariates when interpreting meditation-related EEG changes, particularly when using consumer-grade devices susceptible to state drift [24].
Brain Oscillation Profile of the “Awake” Meditation Group
In contrast, the “awake meditation group”—participants reporting high alertness and low drowsiness scores—displayed a markedly different oscillatory profile characterized by enhanced alpha (8 - 12 Hz) and frontal midline theta (4 - 8 Hz) power accompanied by suppressed beta (13 - 30 Hz) and minimal delta activity. This spectral configuration aligns with the canonical pattern of internally directed attention with preserved vigilance, often described as the “relaxed yet alert” neural signature of meditation [3] [16] [41].
The increased frontal midline theta (Fmθ) observed across AF7/AF8 sites corresponds to engagement of the anterior cingulate cortex and dorsolateral prefrontal regions responsible for executive attention and cognitive control [47] [48]. This oscillation is commonly reported during tasks involving sustained focus and working-memory maintenance, suggesting that awake meditators successfully maintained cognitive engagement throughout the session. Simultaneously, parietal and occipital alpha synchronization indicates sensory disengagement from external stimuli [42] [49], consistent with the suppression of external distractions during focused-attention practice.
Compared with the sleepy group, the awake meditators exhibited lower delta–theta coupling and greater intra-individual coherence in the alpha band, suggesting more stable cortical networks and higher attentional fidelity. The reduction in high-beta activity (20 - 30 Hz) further supports a decrease in task-irrelevant motor planning and anxiety-related arousal [38] [39]. Importantly, these spectral dynamics mirror those documented in experienced meditators, who demonstrate efficient transitions between externally and internally oriented networks [15] [41].
The machine learning analysis corroborated these neurophysiological findings. Within the awake group, Random Forest (RF) and SVM (RBF) models achieved the highest classification accuracy (≈85% - 88%) distinguishing meditation from rest, largely driven by frontal theta and occipital alpha ratios. LDA, though slightly less accurate, provided interpretable feature weights confirming increased alpha power and reduced beta as the most stable discriminants. These results support a two-tiered interpretive model: non-linear methods effectively detect subtle multi-band interactions, while linear methods clarify the specific oscillatory features underlying meditative attention [30] [43].
Taken together, the “awake” meditative brain is characterized by enhanced synchronization of mid-frequency rhythms that balance alertness and relaxation. The coexistence of frontal theta and posterior alpha suggests a top-down regulation of attention, enabling internal focus without loss of awareness. Unlike the sleepy group, whose low-frequency dominance reflects reduced cortical engagement, the awake group demonstrates efficient neurocognitive control and sensory decoupling—a hallmark of proficient meditation.
These findings resonate with prior work indicating that effective mindfulness practice entails maintaining high awareness while downregulating limbic reactivity [10] [12]. Importantly, the divergence between sleepy and awake meditators underscores that meditation is not a singular brain state but a dynamic process shaped by arousal, intent, and experience [11] [23]. Distinguishing these subtypes is essential for accurate interpretation of EEG-based meditation research and for designing ML classifiers that generalize across heterogeneous participant states.
Limitations and Future Directions
While this study offers novel insights into the neural dynamics of meditation, several methodological limitations warrant discussion. First, the sample size (N = 10) restricts generalizability. A post-hoc power analysis on the delta-band effect (Cohen’s d = 0.48) suggested adequate power for large effects but insufficient power to detect smaller ones. Thus, results should be regarded as exploratory, and future studies should expand sample size and diversity to test the stability of these EEG patterns across age groups, meditation experience, and diurnal cycles [50].
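To illustrate why an effect of this size is hard to detect in a small sample, the sketch below approximates the power of a two-sided, two-sample comparison with a normal approximation. It is not the power analysis reported above: the function name `approx_power_two_sample` and the group sizes are hypothetical, the study's actual design (within-subject, number of trials) is not modeled, and the negligible contribution of the opposite tail is ignored.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(d, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided, two-sample mean comparison.

    d is the standardized effect size (Cohen's d); n_per_group is hypothetical.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    noncentrality = d * sqrt(n_per_group / 2)
    return NormalDist().cdf(noncentrality - z_crit)


# Hypothetical group sizes, chosen only to show how power grows with n.
for n in (5, 20, 70):
    print(n, round(approx_power_two_sample(0.48, n), 2))
```

Under these assumptions, power for d = 0.48 stays low at small group sizes and rises only as the sample grows, which is consistent with treating the present results as exploratory.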
Second, the classification analyses primarily used within-subject validation, where training and testing data came from the same participants. Although this approach helps control individual differences, it may inflate performance estimates. EEG features are highly idiosyncratic, so cross-subject validation (e.g., leave-one-subject-out) or independent test cohorts will be essential for assessing generalizability to unseen individuals.
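The leave-one-subject-out scheme suggested above can be sketched as follows. The generator name `leave_one_subject_out` and the participant identifiers are hypothetical; the point is only that every trial from the held-out participant is excluded from training.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) with no subject in both sets."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical trial-to-participant mapping (3 participants, 2 trials each).
subjects = ["P01", "P01", "P02", "P02", "P03", "P03"]
splits = list(leave_one_subject_out(subjects))

for held_out, train, test in splits:
    print(held_out, train, test)
```

Accuracy averaged over these splits estimates performance on an unseen individual, whereas within-subject validation estimates performance on new trials from a participant the model has already seen.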
A third limitation involves the labeling and construct validity of the mental-state indices. Trial labels were derived from the proprietary eSense “attention/meditation” metrics of the NeuroSky device and operationalized via a within-subject median split. These indices are black-box proxies for attentional state and should be validated against independent behavioral or physiological markers—such as reaction-time probes, eyelid-closure measures, or observer ratings [51]. Furthermore, unmeasured pre-session factors—stress, arousal, fatigue, or time-on-task—likely modulated EEG dynamics. Incorporating baseline psychophysiological assessments (e.g., heart-rate variability, subjective stress ratings, or resting-state EEG) could help dissociate meditation-related changes from prior-state influences.
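The within-subject median split described above can be sketched as follows. The direction of the mapping (scores above a participant's own median labeled "awake"), the handling of ties, and the example values are assumptions made for illustration; the source does not specify them.

```python
from statistics import median


def median_split_labels(scores_by_subject):
    """Label each trial relative to that participant's own median index score."""
    labels = {}
    for subject, scores in scores_by_subject.items():
        cutoff = median(scores)
        # Assumption: scores strictly above the median count as "awake";
        # scores at or below the median count as "sleepy".
        labels[subject] = ["awake" if s > cutoff else "sleepy" for s in scores]
    return labels


# Hypothetical index values for two participants (not data from this study).
scores = {"P01": [20, 40, 60, 80], "P02": [70, 75, 90, 95]}
labels = median_split_labels(scores)
print(labels)
```

In this example a score of 75 is labeled "sleepy" for P02 while a score of 60 is labeled "awake" for P01, which shows that the labels are relative to each participant and do not correspond to a fixed level of alertness.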
Notably, flexible tree-based classifiers such as Random Forest may overfit idiosyncratic noise or person-specific patterns in the EEG, which would inflate within-subject accuracy. For any practical deployment, models must generalize across individuals, which underscores the need for cross-subject validation to ensure that the classifier captures genuine, generalizable neural signatures rather than quirks of individual participants.
The use of a single-channel EEG headset (FP1 electrode) inherently limits the spatial resolution of our data. With only one sensor, we cannot localize brain regions or network interactions, so any spatially specific neural patterns are undetectable. This lack of spatial information may also contribute to overfitting, because the model might latch onto noise or artifacts from that one electrode.
Future research should therefore integrate multi-channel, research-grade EEG to capture the spatial topography of meditation-related neural activity and to verify that our findings hold in a more robust, spatially resolved dataset. Multi-electrode data would enable source localization and connectivity analyses (e.g., coherence or phase synchrony) to investigate how distinct brain regions interact during drowsy versus alert meditation. Additionally, analyses of raw EEG dynamics beyond power spectra, such as microstates, event-related potentials, or entropy-based features, could reveal temporal and non-stationary signatures overlooked by conventional band-power metrics.
It is also important to triangulate EEG findings with validated behavioral and psychometric measures of attention and arousal. Administering the Karolinska Sleepiness Scale or psychomotor vigilance tasks could quantify true alertness levels, while comparisons between the NeuroSky attention index and medical-grade EEG or standardized attention assessments would clarify the consumer device’s construct validity [29].
Collectively, these improvements—larger and more diverse samples, multi-channel EEG, validated labels, and robust cross-subject evaluation—would strengthen the reproducibility and interpretability of machine-learning models for meditation-related brain states. As a long-term goal, demonstrating cross-participant generalization could enable practical applications such as adaptive meditation guidance or alertness monitoring tools that detect when focus wanes and re-engagement is needed.
5. Conclusions
In conclusion, this study compared linear and non-linear modeling approaches for classifying drowsy (“sleepy”) and alert (“awake”) meditative states based on single-channel EEG recordings collected via the NeuroSky MindWave. Both statistical analyses and machine learning classifiers converged on the finding that sleepy trials were characterized by elevated low-frequency (delta) and high-frequency (beta/gamma) power, whereas awake trials emphasized moderate-frequency bands. These observations, though correlational, suggest that baseline arousal significantly shapes meditation-related EEG signatures. The results align with prior evidence that pre-meditation arousal or fatigue modulates neural oscillatory patterns [11] [23], supporting the interpretation that our classifiers captured a state–context interaction rather than a pure meditative signature.
Among the models tested, Random Forest achieved the best overall accuracy, outperforming linear baselines such as Logistic Regression and LDA. This indicates that non-linear ensemble approaches can capture complex EEG-state relationships that simpler models may overlook, consistent with general EEG classification literature [25]. Nonetheless, all models exhibited limited generalizability due to the small sample size (N = 10) and within-subject validation strategy, which should be addressed in future cross-participant studies.
Importantly, because meditation trials were embedded within a video-based paradigm, the resulting EEG patterns likely reflect an interaction between the meditative task and the immediately preceding sensory context. This supports the notion of “state integrity”, wherein the neural correlates of meditation are modulated by prior cognitive and perceptual input [23]. Thus, the patterns identified here should be interpreted as context-dependent correlates of meditative focus rather than universal biomarkers of meditation.
Despite these limitations, the study demonstrates the feasibility of using low-cost consumer EEG devices combined with machine learning to quantify subtle cognitive states. The approach provides a methodological benchmark for future large-scale, cross-subject, and multimodal research aimed at developing real-world biofeedback or attention-monitoring applications. Future work should incorporate validated behavioral measures of alertness and research-grade EEG recordings to further refine these computational models and ensure interpretability and reproducibility [29] [51].
Ultimately, the study underscores that accessible EEG technology and transparent model comparison can meaningfully contribute to the growing intersection of neurophysiology, data science, and mindfulness research, providing a reproducible framework for analyzing meditation-related brain dynamics while maintaining a cautious and empirically grounded interpretation.