Extension of mixture-of-experts networks for binary classification of hierarchical data.

Author information

Ng Shu-Kay, McLachlan Geoffrey J

Affiliation

Department of Mathematics, University of Queensland, Brisbane, Qld 4072, Australia.

Publication information

Artif Intell Med. 2007 Sep;41(1):57-67. doi: 10.1016/j.artmed.2007.06.001. Epub 2007 Jul 16.

Abstract

OBJECTIVE

In many applied problems in medically relevant artificial intelligence, the data collected exhibit a hierarchical or clustered structure. Ignoring the dependence among observations in hierarchical data can lead to misleading classification. In this paper, we extend mixture-of-experts (ME) networks to binary classification of hierarchical data. A further extension quantifies cluster-specific information in the data hierarchy through random effects, using the generalized linear mixed-effects model (GLMM).
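
As a rough sketch of the kind of model being described (the abstract does not give the equations, so the notation below is an assumption rather than the authors' exact formulation): a standard ME network for a binary response combines m logistic experts through softmax gating, and a GLMM-style extension adds cluster-specific random effects to the linear predictors of the gating and expert networks. The normal distribution for the random effects is the usual GLMM convention and is assumed here.

```latex
% Assumed notation (not from the abstract):
%   x_{ij}  feature vector of record i in cluster j
%   y_{ij}  binary class label in {0, 1}
%   m       number of experts
%   u_{hj}, v_{hj}  cluster-specific random effects for expert h and its gate
\begin{align}
  \Pr(y_{ij} = 1 \mid x_{ij}, u_j, v_j)
    &= \sum_{h=1}^{m} \pi_h(x_{ij}, v_j)\, \theta_h(x_{ij}, u_j), \\
  \pi_h(x_{ij}, v_j)
    &= \frac{\exp(\alpha_h^\top x_{ij} + v_{hj})}
            {\sum_{l=1}^{m} \exp(\alpha_l^\top x_{ij} + v_{lj})}, \\
  \operatorname{logit} \theta_h(x_{ij}, u_j)
    &= w_h^\top x_{ij} + u_{hj},
  \qquad u_{hj} \sim N(0, \sigma_{u,h}^2), \quad v_{hj} \sim N(0, \sigma_{v,h}^2).
\end{align}
```

Setting every random effect to zero recovers an ME network that treats all records as independent, which is the comparison model mentioned in the abstract.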

METHODS AND MATERIAL

The extension of ME networks is implemented by allowing for correlation in the hierarchical data in both the gating and expert networks via the GLMM. The proposed model is illustrated using a real thyroid disease data set. In our study, we consider 7652 thyroid diagnosis records from 1984 to early 1987 with complete information on 20 attribute values. We obtain 10 independent random splits of the data into a training set and a test set in the proportions 85% and 15%. The test sets are used to assess the generalization performance of the proposed model, based on the percentage of misclassifications. For comparison, the results obtained from the ME network with independence assumption are also included.
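
The evaluation protocol described above (repeated random 85%/15% splits scored by the percentage of test-set misclassifications) can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the `fit` callable, the helper names, the fixed seed, and the averaging of the rate over splits are all assumptions, and the majority-class rule is only a placeholder classifier standing in for the ME networks.

```python
import random


def random_split(n_records, test_fraction, rng):
    """Return (train_idx, test_idx) for one random split of the record indices."""
    indices = list(range(n_records))
    rng.shuffle(indices)
    n_test = round(n_records * test_fraction)
    return indices[n_test:], indices[:n_test]


def misclassification_rate(y_true, y_pred):
    """Percentage of records whose predicted label differs from the true label."""
    errors = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return 100.0 * errors / len(y_true)


def evaluate(fit, X, y, n_splits=10, test_fraction=0.15, seed=0):
    """Mean test misclassification rate over independent random splits.

    `fit(X_train, y_train)` is a hypothetical callable that returns a
    `predict(X_test)` function, so any classifier can be plugged in.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_splits):
        train_idx, test_idx = random_split(len(y), test_fraction, rng)
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_pred = predict([X[i] for i in test_idx])
        rates.append(misclassification_rate([y[i] for i in test_idx], y_pred))
    return sum(rates) / len(rates), rates


def fit_majority(X_train, y_train):
    """Placeholder classifier (assumption): always predict the majority class."""
    majority = 1 if 2 * sum(y_train) >= len(y_train) else 0
    return lambda X_test: [majority] * len(X_test)
```

Calling `evaluate(fit_majority, X, y)` returns the mean rate together with the ten per-split rates. With 7652 records and this rounding rule, each split would hold out 1148 records for testing; the abstract does not state the exact split sizes used in the paper.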

RESULTS

With the thyroid disease data, the misclassification rate on test sets for the extended ME network is 8.9%, compared to 13.9% for the ME network. In addition, based on model selection methods described in Section 2, a network with two experts is selected. These two expert networks can be considered as modeling two groups of patients with high and low incidence rates. Significant variation among the predicted cluster-specific random effects is detected in the patient group with low incidence rate.

CONCLUSIONS

The extended ME network outperforms the standard ME network for binary classification of hierarchical data. For the thyroid disease data, it also yields useful information on the relative log odds of patients with conditions diagnosed at different periods, which can be taken into account when assessing treatment planning for the disease. The extended ME network thus offers a more general way to incorporate the hierarchical structure of data into network modeling.
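
To make the "relative log odds" interpretation concrete, here is a hedged illustration with a single logistic expert and a cluster-specific random intercept. The symbols are assumed notation, not taken from the paper: x is a covariate vector, w the expert's weights, and u_j, u_k the random effects of two clusters (for example, two diagnosis periods). For two records with the same covariates, the difference in random effects is the log odds ratio between the clusters:

```latex
\begin{align}
  \operatorname{logit} \Pr(y = 1 \mid x, u_j) &= w^\top x + u_j, \\
  \operatorname{logit} \Pr(y = 1 \mid x, u_j)
    - \operatorname{logit} \Pr(y = 1 \mid x, u_k) &= u_j - u_k .
\end{align}
```

Under these assumptions, a predicted u_j larger than u_k means the odds of a positive diagnosis in cluster j are higher by a factor of exp(u_j - u_k). In a full mixture this holds within one expert, not for the gated combination of experts.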
