A Deep Learning Model for the Accurate and Reliable Classification of Disc Degeneration Based on MRI Data.

Author information

From the Institute for Orthopaedic Research and Biomechanics, University Hospital Ulm, Ulm, Germany.

IRCCS Istituto Ortopedico Galeazzi, Milan, Italy.

Publication information

Invest Radiol. 2021 Feb 1;56(2):78-85. doi: 10.1097/RLI.0000000000000709.

Abstract

OBJECTIVES

Although magnetic resonance imaging-based formalized grading schemes for intervertebral disc degeneration offer improved reproducibility compared with purely subjective ratings, their intrarater and interrater reliability are not nearly good enough to be able to detect small to medium effects in clinical longitudinal studies. The aim of this study thus was to develop a method that enables automatic and therefore reproducible and reliable evaluation of disc degeneration based on conventional clinical image data and Pfirrmann's grading scheme.

MATERIALS AND METHODS

We propose a classifier based on a deep convolutional neural network that we trained on a large, manually evaluated data set of 1599 patients (7948 intervertebral discs). To improve upon the status quo, we focused on the quality of the training data and performed extensive hyperparameter optimization. We assessed the potential benefits of optimizing loss functions beyond common cross-entropy loss, such as soft kappa loss, ordinal cross-entropy loss, or regression losses. We furthermore experimented with ways to mitigate class imbalance by pooling classes or using class-weighted loss functions. During model development and hyperparameter optimization, we used a fixed 90%/10% training/validation set split. To estimate real-world prediction performance, we performed 10-fold cross-validation.
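The class-weighted loss mentioned above can be illustrated with a minimal sketch. The abstract does not say how the weights were chosen or how the loss was normalized, so the inverse-frequency weighting and the per-sample averaging below are assumptions, and the function names `inverse_frequency_weights` and `weighted_cross_entropy` are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes=5):
    """Assumed scheme: weight class c by N / (K * count_c).

    Rare Pfirrmann grades (here 1..num_classes) receive larger weights.
    Classes absent from `labels` get weight 0.0.
    """
    counts = Counter(labels)
    n = len(labels)
    return [
        n / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(1, num_classes + 1)
    ]


def weighted_cross_entropy(probs, labels, weights):
    """Mean over samples of -w[y] * log(p[y]).

    probs[i] is a predicted probability distribution over grades 1..5,
    labels[i] is the true grade (1-based), weights is a per-class list.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        total += -weights[y - 1] * math.log(p[y - 1])
    return total / len(labels)


# Toy grade distribution with underrepresented grades 1 and 5.
toy_labels = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]
toy_weights = inverse_frequency_weights(toy_labels)
```

With uniform weights the function reduces to ordinary cross-entropy, which makes it easy to compare a weighted and an unweighted run on the same predictions.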

RESULTS

The evaluated image data results in a Gaussian degeneration grade distribution, and thus grades 1 and 5 are slightly underrepresented in the training set. Our default cross-entropy-based classifier achieves a reliability of κ = 0.92 (Cohen κ), an average sensitivity of 90.2%, and an average precision of 92.5%. In 99.2% of validation cases, the network's prediction deviates by at most 1 Pfirrmann grade from the ground truth. Framed as an ordinal regression problem, the mean absolute error between the ground truth and the prediction is 0.08 Pfirrmann grade with a correlation of r = 0.96. The results of the 10-fold cross-validation confirm those performance estimates, indicating no substantial overfitting. More sophisticated loss functions, class-based loss weighting, or class pooling did not lead to improved classification performance overall.
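For readers who want to see how agreement figures of this kind are computed, here is a minimal sketch of three of the reported metrics. It assumes unweighted Cohen κ (the abstract does not state whether a weighted variant was used), and the function names and toy labels are illustrative, not taken from the study.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement, p_e the agreement expected by chance
    from the two marginal label distributions. Undefined (division by
    zero) when both raters use a single identical label throughout.
    """
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    count_true, count_pred = Counter(y_true), Counter(y_pred)
    p_e = sum(count_true[k] * count_pred[k] for k in count_true) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference in Pfirrmann grades."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def within_one_grade(y_true, y_pred):
    """Fraction of predictions that deviate by at most one grade."""
    return sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy example: eight discs, one prediction off by a single grade.
truth = [1, 2, 3, 4, 5, 3, 3, 2]
pred = [1, 2, 3, 4, 5, 3, 2, 2]
kappa = cohen_kappa(truth, pred)
mae = mean_absolute_error(truth, pred)
close = within_one_grade(truth, pred)
```

Because Pfirrmann grades are ordinal, the mean absolute error and the within-one-grade fraction complement κ: they distinguish a near miss from a prediction that is several grades off, which unweighted κ does not.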

CONCLUSIONS

With a reliability of κ > 0.9, our system clearly outperforms average human interrater as well as intrarater reliability. With an average sensitivity of more than 90%, our classifier also surpasses state-of-the-art machine learning solutions for automatically grading disc degeneration.
