Hashim Nik Wahidah, Wilkes Mitch, Salomon Ronald, Meggs Jared, France Daniel J
Department of Electrical Engineering, Vanderbilt University, Nashville, Tennessee.
Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, Tennessee.
J Voice. 2017 Mar;31(2):256.e1-256.e6. doi: 10.1016/j.jvoice.2016.06.006. Epub 2016 Jul 26.
The aim of the present study was to determine if acoustic measures of voice, characterizing specific spectral and timing properties, predict clinical ratings of depression severity measured in a sample of patients using the Hamilton Depression Rating Scale (HAMD) and Beck Depression Inventory (BDI-II).
This is a prospective study.
Voice samples and clinical depression scores were collected prospectively from consenting adult patients who were referred to psychiatry from the adult emergency department or primary care clinics. The patients were audio-recorded as they read a standardized passage in a nearly closed-room environment. The mean absolute error (MAE) between actual and predicted depression scores was used as the primary outcome measure.
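For readers unfamiliar with the outcome measure, the following is a minimal sketch of how an MAE between actual and predicted scores can be computed. The function name and the sample values are hypothetical and chosen only for illustration; they are not data from the study, and the abstract does not describe how the authors implemented the calculation.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Return the average of |actual_i - predicted_i| over all paired scores."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one pair of scores is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up example scores (NOT study data), used only to show the arithmetic.
example_actual = [18, 22, 9, 14]
example_predicted = [16, 25, 10, 13]

# Absolute errors are 2, 3, 1, 1, so the MAE is 7 / 4.
mae = mean_absolute_error(example_actual, example_predicted)
print(f"MAE = {mae:.2f} points")
```

An MAE of about two points therefore means that, on average, a predicted score differed from the clinician-rated or self-reported score by two scale points in either direction.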
The average MAE between predicted and actual HAMD scores was approximately two points for both men and women. The MAE for the BDI-II scores was approximately one point for men and eight points for women. Timing features were predictive of HAMD scores in female patients, while a combination of timing and spectral features was predictive of those scores in male patients. Timing features were predictive of BDI-II scores in male patients.
Voice acoustic features extracted from read speech demonstrated variable effectiveness in predicting clinical depression scores in men and women. Voice features were highly predictive of HAMD scores in both men and women, and of BDI-II scores in men. The methodology is feasible for diagnostic applications in diverse clinical settings, as it can be implemented during a standard clinical interview in a normal closed room and without strict control over the recording environment.