Montaha Sidratul, Azam Sami, Bhuiyan Md Rahad Islam, Chowa Sadia Sultana, Mukta Md Saddam Hossain, Jonkman Mirjam
Department of Computer Science, University of Calgary, Calgary, Canada.
Faculty of Science and Technology, Charles Darwin University, Casuarina, Australia.
Digit Health. 2024 May 15;10:20552076241251660. doi: 10.1177/20552076241251660. eCollection 2024 Jan-Dec.
Early diagnosis of breast cancer can lead to effective treatment, possibly increase long-term survival rates, and improve quality of life. The objective of this study is to present an automated analysis and classification system for breast cancer using clinical markers such as tumor shape, orientation, margin, and surrounding tissue. The novelty of the study lies in its use of medical features based on the diagnoses of radiologists.
Using the clinical markers, a graph is generated in which each feature is represented by a node and each connection between features is represented by an edge derived from Pearson's correlation. A graph convolutional network (GCN) model is proposed to classify breast tumors as benign or malignant using the graph data. Several statistical tests are performed to assess the importance of the proposed features. The performance of the proposed GCN model is improved by experimenting with different layer configurations and hyperparameter settings.
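The graph construction and a single graph convolution can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the correlation threshold of 0.5, the synthetic feature matrix, the symmetric normalization, and the function names `build_feature_graph`, `normalize_adjacency`, and `gcn_layer` are all assumptions made for illustration.

```python
import numpy as np

def build_feature_graph(X, threshold=0.5):
    """Adjacency over feature nodes from absolute Pearson correlation.

    X has shape (n_samples, n_features); the threshold is an assumed value.
    """
    corr = np.corrcoef(X, rowvar=False)          # (n_features, n_features)
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)                   # self-loops are added during normalization
    return adj

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, h, w):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    return np.maximum(a_norm @ h @ w, 0.0)

# Synthetic stand-in for the clinical feature table (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=100)   # make features 0 and 1 correlated

adj = build_feature_graph(X)
a_norm = normalize_adjacency(adj)
h = X[0].reshape(-1, 1)        # one sample: each node carries its own feature value
w = rng.normal(size=(1, 8))    # untrained weights, for shape illustration only
out = gcn_layer(a_norm, h, w)  # node embeddings of shape (4, 8)
```

In a full model, several such layers would be stacked, the weights learned, and the node embeddings pooled into a benign/malignant prediction.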
Results show that the proposed model achieves a test accuracy of 98.73%. Its performance is compared with that of a graph attention network, a one-dimensional convolutional neural network, five transfer learning models, ten machine learning models, and three ensemble learning models. The model is further assessed on three supplementary breast cancer ultrasound image datasets, with accuracies of 91.03%, 94.37%, and 89.62% for Dataset A, Dataset B, and Dataset C (a combination of Dataset A and Dataset B), respectively. Overfitting is assessed through k-fold cross-validation.
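As a rough illustration of how k-fold cross-validation can expose overfitting, the sketch below compares training and validation accuracy across folds. The nearest-centroid classifier is only a placeholder for the GCN, and the synthetic data, `k=5`, and all function names are assumptions rather than details from the study.

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train, val) index arrays for k shuffled, disjoint folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]

def fit_centroids(X, y):
    """Placeholder model: one mean vector per class."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def predict(model, X):
    """Assign each sample to the class of its nearest centroid."""
    classes, centroids = model
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

def cross_validate(X, y, k=5):
    """Return an array of (train_accuracy, val_accuracy) rows, one per fold."""
    scores = []
    for train, val in kfold_indices(len(y), k):
        model = fit_centroids(X[train], y[train])
        train_acc = (predict(model, X[train]) == y[train]).mean()
        val_acc = (predict(model, X[val]) == y[val]).mean()
        scores.append((train_acc, val_acc))
    return np.array(scores)

# Synthetic two-class data standing in for benign/malignant features (assumption).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(3, 1, (50, 4))])
y = np.array([0] * 50 + [1] * 50)

scores = cross_validate(X, y, k=5)
gap = scores[:, 0].mean() - scores[:, 1].mean()  # a large positive gap suggests overfitting
```

A small, stable gap between mean training and validation accuracy across folds is consistent with a model that generalizes rather than memorizes.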
Several variants are evaluated to provide a more rigorous and fair assessment of this work, in particular of the importance of extracting clinically relevant features. Moreover, a GCN model using graph data can be a promising solution for an automated feature-based breast image classification system.