Dohmen Melanie, Klemens Mark A, Baltruschat Ivo M, Truong Tuan, Lenga Matthias
Bayer AG, Radiology, Berlin, Germany.
Sci Rep. 2025 Jan 31;15(1):3853. doi: 10.1038/s41598-025-87358-0.
Image-to-image translation can have a large impact in medical imaging, as images can be synthetically transformed to other modalities, sequence types, higher resolutions or lower noise levels. To ensure patient safety, these methods should be validated by human readers, which requires considerable time and cost. Quantitative metrics can effectively complement such studies and provide a reproducible and objective assessment of synthetic images. If a reference is available, the similarity of MR images is frequently evaluated with the SSIM and PSNR metrics, even though these metrics are either insensitive or overly sensitive to specific distortions. When no reference images are available for comparison, non-reference quality metrics can reliably detect specific distortions, such as blurriness. To provide an overview of distortion sensitivity, we quantitatively analyze 11 similarity (reference) and 12 quality (non-reference) metrics for assessing synthetic images. We additionally include a metric based on a downstream segmentation task. We investigate the sensitivity to 11 kinds of distortions and typical MR artifacts, and analyze the influence of different normalization methods on each metric and distortion. Finally, we derive recommendations for the effective use of the analyzed similarity and quality metrics in the evaluation of image-to-image translation models.
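To make the distinction between reference and non-reference metrics concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It computes PSNR (a reference metric named in the abstract) with and without min-max normalization, and a Laplacian-variance score as a generic non-reference blur indicator. The Laplacian-variance score, the synthetic random test image, the 3x3 box blur, the noise levels, and all function names are assumptions chosen for illustration; the abstract does not state which normalization methods or quality metrics were used.

```python
import numpy as np


def minmax_normalize(img):
    """Rescale intensities to [0, 1] (one possible normalization; an assumption)."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)


def psnr(reference, test, data_range):
    """Peak signal-to-noise ratio in dB; requires a reference image."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


def laplacian_variance(img):
    """Non-reference blur indicator: variance of a 4-neighbour Laplacian.

    Lower values suggest a blurrier image. Illustrative only; not claimed
    to be one of the metrics analyzed in the paper.
    """
    img = img.astype(np.float64)
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
           - 4.0 * img[1:-1, 1:-1])
    return float(lap.var())


def box_blur(img):
    """3x3 mean filter with edge padding, used here as a stand-in distortion."""
    padded = np.pad(img.astype(np.float64), 1, mode="edge")
    h, w = img.shape
    acc = np.zeros((h, w), dtype=np.float64)
    for i in range(3):
        for j in range(3):
            acc += padded[i:i + h, j:j + w]
    return acc / 9.0


# Synthetic stand-in for an MR slice with an arbitrary intensity scale.
rng = np.random.default_rng(0)
reference = rng.random((64, 64)) * 1000.0
blurred = box_blur(reference)
noisy_low = reference + rng.normal(0.0, 10.0, reference.shape)
noisy_high = reference + rng.normal(0.0, 100.0, reference.shape)

# Reference metric: PSNR falls as the noise level rises.
psnr_low = psnr(reference, noisy_low, data_range=1000.0)
psnr_high = psnr(reference, noisy_high, data_range=1000.0)

# After per-image min-max normalization, PSNR no longer depends on a global
# intensity rescaling of the image pair.
psnr_norm = psnr(minmax_normalize(reference), minmax_normalize(noisy_high), 1.0)
psnr_norm_scaled = psnr(minmax_normalize(2.0 * reference),
                        minmax_normalize(2.0 * noisy_high), 1.0)

# Non-reference metric: no reference image is needed to flag the blur.
sharpness_reference = laplacian_variance(reference)
sharpness_blurred = laplacian_variance(blurred)
```

In this sketch the normalization step changes what PSNR measures: per-image min-max scaling removes the dependence on the absolute intensity scale, but it also couples the score to each image's extreme values. That is one plausible reason the choice of normalization can influence a metric's sensitivity to a given distortion.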