Department of Neuropsychiatry, Seoul National University Hospital, Seoul, Republic of Korea.
Department of Psychiatry, Seoul National University College of Medicine, Seoul, Republic of Korea.
J Med Internet Res. 2024 Nov 13;26:e65994. doi: 10.2196/65994.
Assessing the complex and multifaceted symptoms of patients with acute psychiatric disorders is a significant challenge for clinicians. Moreover, staff in acute psychiatric wards face high work intensity and a risk of burnout, yet research on introducing digital technologies in this field remains limited. Combining continuous, objective wearable sensor data from patients with deep learning techniques has the potential to overcome the limitations of traditional psychiatric assessments and support clinical decision-making.
This study aimed to develop and validate wearable-based deep learning models to comprehensively predict patient symptoms across various acute psychiatric wards in South Korea.
Participants diagnosed with schizophrenia and mood disorders were recruited from 4 wards across 3 hospitals and prospectively observed using wrist-worn wearable devices during their admission period. Trained raters conducted periodic clinical assessments using the Brief Psychiatric Rating Scale, Hamilton Anxiety Rating Scale, Montgomery-Asberg Depression Rating Scale, and Young Mania Rating Scale. Wearable devices collected patients' heart rate, accelerometer, and location data. Deep learning models were developed to predict psychiatric symptoms using 2 distinct approaches: single symptoms individually (Single) and multiple symptoms simultaneously via multitask learning (Multi). These models further addressed 2 problems: within-subject relative changes (Deterioration) and between-subject absolute severity (Score). Four configurations were consequently developed for each scale: Single-Deterioration, Single-Score, Multi-Deterioration, and Multi-Score. Data of participants recruited before May 1, 2024, underwent cross-validation, and the resulting fine-tuned models were then externally validated using data from the remaining participants.
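The date-based split and the two prediction targets described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: only the May 1, 2024 cutoff comes from the abstract. The record layout, the function names, and in particular the definition of deterioration as "score higher than the same participant's previous assessment" are assumptions made for the example, and the toy records are not study data.

```python
from datetime import date

# Cutoff stated in the abstract; everything else below is illustrative.
CUTOFF = date(2024, 5, 1)


def split_by_recruitment(participants):
    """Split participants into a cross-validation pool and an external set."""
    cv_pool = [p for p in participants if p["recruited"] < CUTOFF]
    external = [p for p in participants if p["recruited"] >= CUTOFF]
    return cv_pool, external


def score_targets(assessments):
    """Between-subject target: the absolute scale score at each assessment."""
    return [a["score"] for a in assessments]


def deterioration_targets(assessments):
    """Within-subject target (ASSUMED definition, not from the paper):
    1 if the score is higher than the same participant's previous
    assessment, else 0. The first assessment has no predecessor, so the
    result is one element shorter than the input."""
    scores = [a["score"] for a in assessments]
    return [int(curr > prev) for prev, curr in zip(scores, scores[1:])]


# Hypothetical toy records for a single rating scale.
participants = [
    {"id": "A", "recruited": date(2024, 3, 10),
     "assessments": [{"score": 20}, {"score": 25}, {"score": 18}]},
    {"id": "B", "recruited": date(2024, 6, 2),
     "assessments": [{"score": 12}, {"score": 12}, {"score": 15}]},
]

cv_pool, external = split_by_recruitment(participants)
for p in participants:
    print(p["id"], score_targets(p["assessments"]),
          deterioration_targets(p["assessments"]))
```

In a Single configuration, one model would be trained per scale on targets like these. In a Multi configuration, a single model would predict the targets for all four scales at once.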
Of the 244 enrolled participants, 191 (78.3%; 3954 person-days) were included in the final analysis after applying the exclusion criteria. The demographic and clinical characteristics of participants, as well as the distribution of sensor data, varied considerably across wards and hospitals. Data from 139 participants were used for cross-validation, and data from 52 participants were used for external validation. The Single-Deterioration and Multi-Deterioration models achieved similar overall accuracy values of 0.75 in cross-validation and 0.73 in external validation. The Single-Score and Multi-Score models attained overall R² values of 0.78 and 0.83 in cross-validation and 0.66 and 0.74 in external validation, respectively, with the Multi-Score model demonstrating superior performance.
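For readers less familiar with the two metrics reported above, the sketch below shows the standard definitions of classification accuracy and the coefficient of determination (R²), together with a check of the participant counts. How the study aggregated these into "overall" values across scales is not stated in the abstract, so the functions and toy inputs here are illustrative assumptions only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Counts reported in the abstract.
enrolled, included = 244, 191
cross_validation_n, external_n = 139, 52
included_pct = round(100 * included / enrolled, 1)

# Hypothetical toy predictions, not study data.
toy_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
toy_r2 = r_squared([10, 20, 30, 40], [12, 18, 33, 39])

print(included_pct, toy_accuracy, round(toy_r2, 3))
```

The reported counts are internally consistent: 191 of 244 is 78.3%, and the 139 cross-validation and 52 external-validation participants sum to 191.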
Deep learning models based on wearable sensor data effectively classified symptom deterioration and predicted symptom severity in participants in acute psychiatric wards. Despite lower computational costs, Multi models demonstrated performance equivalent or superior to that of Single models, suggesting that multitask learning is a promising approach for comprehensive symptom prediction. However, significant variations were observed across wards, which presents a key challenge for developing clinical decision support systems in acute psychiatric wards. Future studies may benefit from recurring local validation or federated learning to address generalizability issues.