Physiological Signals as Predictors of Mental Workload: Evaluating Single Classifier and Ensemble Learning Models


Nailul Izzah
Auditya Purwandini Sutarto
Ade Hendi
Maslakhatul Ainiyah
Muhammad Nubli bin Abdul Wahab


mental workload, heart rate variability, machine learning, prediction, cognitive, support vector machine


With the growing emphasis on cognitive processing in occupational tasks and the prevalence of wearable sensing devices, understanding and managing mental workload has broad implications for safety, efficiency, and well-being. This study aims to develop machine learning (ML) models that predict mental workload from heart rate variability (HRV), a physiological signal that reflects autonomic nervous system (ANS) activity. A laboratory experiment involving 34 participants was conducted to collect the dataset. All participants were measured during baseline, two cognitive tests, and recovery, and these phases were then grouped into two classes (rest vs. workload). Several ML algorithms were evaluated, including single classifiers (Support Vector Machine (SVM) and Naïve Bayes) and ensemble learning classifiers (Gradient Boosting and AdaBoost), in combination with different feature selection and validation approaches. The findings indicate that most HRV features differ significantly between periods of mental workload and rest phases. The SVM classifier with domain-knowledge feature selection and leave-one-out cross-validation was the best-performing model (68.385). These findings highlight the potential to predict mental workload through interpretable features and individualized approaches, even with a relatively simple model. The study contributes a new dataset for specific populations (such as Indonesia) and has potential implications for maintaining human cognitive capabilities. It represents a further step toward a mental workload recognition system that could improve decision-making in situations where cognitive readiness is limited and human error is more likely.
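The evaluation design described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn and a purely synthetic stand-in for the study's data. The feature names (mean RR, SDNN, RMSSD), the helper functions, and the hyperparameters are hypothetical and are not taken from the study. The sketch assumes leave-one-sample-out cross-validation; the study may have split the data differently (for example, by participant).

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_hrv(n_participants=34, seed=0):
    """Build a synthetic dataset (NOT the study's data): 4 phases per participant."""
    rng = np.random.default_rng(seed)
    # Phases: baseline, cognitive test 1, cognitive test 2, recovery.
    # Binary labels: rest = 0, workload = 1.
    phase_labels = np.array([0, 1, 1, 0])
    y = np.tile(phase_labels, n_participants)
    # Three hypothetical HRV features (e.g. mean RR, SDNN, RMSSD);
    # workload rows are shifted downward to create a weak class signal.
    X = rng.normal(size=(y.size, 3)) - 0.8 * y[:, None]
    return X, y


def compare_classifiers(X, y):
    """Return leave-one-out accuracy for two single and two ensemble classifiers."""
    models = {
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "Naive Bayes": GaussianNB(),
        "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
        "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
    }
    loo = LeaveOneOut()
    # Each fold holds out one sample, so the mean of the fold scores
    # is the overall leave-one-out accuracy.
    return {
        name: cross_val_score(model, X, y, cv=loo, scoring="accuracy").mean()
        for name, model in models.items()
    }


if __name__ == "__main__":
    X, y = make_synthetic_hrv()
    for name, acc in compare_classifiers(X, y).items():
        print(f"{name}: {acc:.3f}")
```

Because the data here are random, the printed accuracies say nothing about the study's results; the sketch only shows how single and ensemble classifiers can be compared under the same validation scheme.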



