Ji Jun, Dong Wentian, Li Jiaqi, Peng Jingzhu, Feng Chaonan, Liu Rujia, Shi Chuan, Ma Yantao
College of Computer Science and Technology, Qingdao University, Qingdao, China.
Beijing Wanling Pangu Science and Technology Ltd., Beijing, China.
Front Neurol. 2024 Jul 4;15:1394210. doi: 10.3389/fneur.2024.1394210. eCollection 2024.
Depressive and manic states contribute significantly to the global social burden, but objective detection tools are still lacking. This study investigates the feasibility of using voice as a biomarker to detect these mood states.
Methods: We extracted 22 features from real-world emotional journal voice recordings, 21 of which showed significant differences among mood states. We then applied a leave-one-subject-out strategy to train and validate four classification models: Chinese-speech-pretrain-GRU, Gated Recurrent Unit (GRU), Bi-directional Long Short-Term Memory (BiLSTM), and Linear Discriminant Analysis (LDA).
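As an illustration of the leave-one-subject-out strategy named above, the following is a minimal sketch of how such folds could be constructed. It is not the authors' code: the function name `leave_one_subject_out` and the toy subject identifiers are hypothetical, and only the Python standard library is used. In each fold, every recording from one subject is held out for validation and the model would be trained on the recordings of all other subjects, so no speaker appears on both sides of a split.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject.

    subject_ids[i] is the subject who produced recording i (hypothetical layout).
    """
    # Group recording indices by the subject who produced them.
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    # Hold out each subject in turn; train on everyone else.
    for held_out, test_idx in by_subject.items():
        train_idx = [i for i, sid in enumerate(subject_ids) if sid != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical toy data: six recordings from three subjects.
subjects = ["s1", "s1", "s2", "s2", "s3", "s3"]
folds = list(leave_one_subject_out(subjects))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Splitting by subject rather than by recording matters for voice data, because recordings from the same speaker are highly correlated and a per-recording split could let a model recognize the speaker instead of the mood state.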
Our results indicated that the Chinese-speech-pretrain-GRU model performed the best, achieving sensitivities of 77.5% and 54.8% and specificities of 86.1% and 90.3% for detecting depressive and manic states, respectively, with an overall accuracy of 80.2%.
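To make the reported metrics concrete, here is a minimal sketch of how per-state sensitivity and specificity can be computed in a one-vs-rest fashion, alongside overall accuracy. The labels and predictions are invented toy values, not the study's data; the third label `"other"` and the function name `one_vs_rest_metrics` are assumptions made only for illustration.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Return (sensitivity, specificity), treating `positive` as the target state."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    # Sensitivity: share of true target-state samples that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of non-target samples correctly not flagged.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels and predictions (not from the study).
y_true = ["depressive"] * 4 + ["manic"] * 2 + ["other"] * 4
y_pred = ["depressive", "depressive", "depressive", "other",
          "manic", "other",
          "other", "other", "depressive", "other"]

dep_sens, dep_spec = one_vs_rest_metrics(y_true, y_pred, "depressive")
man_sens, man_spec = one_vs_rest_metrics(y_true, y_pred, "manic")
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"depressive: sensitivity={dep_sens:.3f}, specificity={dep_spec:.3f}")
print(f"manic:      sensitivity={man_sens:.3f}, specificity={man_spec:.3f}")
print(f"overall accuracy={accuracy:.3f}")
```

Reporting sensitivity and specificity per state, as the abstract does, shows differences that a single accuracy figure hides: a model can reach high overall accuracy while still missing a large share of the rarer state.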
These findings show that machine learning can reliably differentiate between depressive and manic mood states via voice analysis, allowing for a more objective and precise approach to mood disorder assessment.