Suppr超能文献

度校正块模型的模型选择

Model selection for degree-corrected block models.

作者信息

Yan Xiaoran, Shalizi Cosma, Jensen Jacob E, Krzakala Florent, Moore Cristopher, Zdeborová Lenka, Zhang Pan, Zhu Yaojia

机构信息

Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, USA.

Statistics Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA.

出版信息

J Stat Mech. 2014 May;2014(5). doi: 10.1088/1742-5468/2014/05/P05007.

Abstract

The proliferation of models for networks raises challenging problems of model selection: the data are sparse and globally dependent, and models are typically high-dimensional and have large numbers of latent variables. Together, these issues mean that the usual model-selection criteria do not work properly for networks. We illustrate these challenges, and show one way to resolve them, by considering the key network-analysis problem of dividing a graph into communities or blocks of nodes with homogeneous patterns of links to the rest of the network. The standard tool for undertaking this is the stochastic block model, under which the probability of a link between two nodes is a function solely of the blocks to which they belong. This imposes a homogeneous degree distribution within each block; this can be unrealistic, so degree-corrected block models add a parameter for each node, modulating its overall degree. The choice between ordinary and degree-corrected block models matters because they make very different inferences about communities. We present the first principled and tractable approach to model selection between standard and degree-corrected block models, based on new large-graph asymptotics for the distribution of log-likelihood ratios under the stochastic block model, finding substantial departures from classical results for sparse graphs. We also develop linear-time approximations for log-likelihoods under both the stochastic block model and the degree-corrected model, using belief propagation. Applications to simulated and real networks show excellent agreement with our approximations. Our results thus both solve the practical problem of deciding on degree correction and point to a general approach to model selection in network analysis.

摘要

网络模型的激增引发了具有挑战性的模型选择问题

数据稀疏且全局相关,模型通常是高维的且有大量潜在变量。这些问题共同意味着常用的模型选择标准在网络中不能正常工作。我们通过考虑将图划分为具有与网络其余部分链接的同质模式的社区或节点块这一关键网络分析问题,来说明这些挑战,并展示解决这些挑战的一种方法。进行此操作的标准工具是随机块模型,在该模型下,两个节点之间链接的概率仅是它们所属块的函数。这在每个块内强加了一个同质的度分布;这可能不现实,因此度校正块模型为每个节点添加一个参数,以调节其总体度。普通块模型和度校正块模型之间的选择很重要,因为它们对社区的推断非常不同。我们基于随机块模型下对数似然比分布的新的大图渐近性,提出了一种在标准块模型和度校正块模型之间进行模型选择的第一个有原则且易于处理的方法,发现与稀疏图的经典结果有很大偏差。我们还使用信念传播为随机块模型和度校正模型下的对数似然开发了线性时间近似。对模拟网络和真实网络的应用表明与我们的近似结果非常吻合。因此,我们的结果既解决了决定度校正的实际问题,又指出了网络分析中模型选择的一般方法。

相似文献

1
Model selection for degree-corrected block models.
J Stat Mech. 2014 May;2014(5). doi: 10.1088/1742-5468/2014/05/P05007.
2
Infinite-degree-corrected stochastic block model.
Phys Rev E Stat Nonlin Soft Matter Phys. 2014 Sep;90(3):032819. doi: 10.1103/PhysRevE.90.032819. Epub 2014 Sep 29.
4
Degree-corrected distribution-free model for community detection in weighted networks.
Sci Rep. 2022 Sep 7;12(1):15153. doi: 10.1038/s41598-022-19456-2.
5
A Regularized Stochastic Block Model for the robust community detection in complex networks.
Sci Rep. 2019 Sep 13;9(1):13247. doi: 10.1038/s41598-019-49580-5.
6
Combining a popularity-productivity stochastic block model with a discriminative-content model for general structure detection.
Phys Rev E Stat Nonlin Soft Matter Phys. 2013 Jul;88(1):012807. doi: 10.1103/PhysRevE.88.012807. Epub 2013 Jul 8.
7
Entropy of stochastic blockmodel ensembles.
Phys Rev E Stat Nonlin Soft Matter Phys. 2012 May;85(5 Pt 2):056122. doi: 10.1103/PhysRevE.85.056122. Epub 2012 May 30.
8
Spectral redemption in clustering sparse networks.
Proc Natl Acad Sci U S A. 2013 Dec 24;110(52):20935-40. doi: 10.1073/pnas.1312486110. Epub 2013 Nov 25.
9
Tensor Network Message Passing.
Phys Rev Lett. 2024 Mar 15;132(11):117401. doi: 10.1103/PhysRevLett.132.117401.
10
Causal Artificial Intelligence Models of Food Quality Data.
Food Technol Biotechnol. 2024 Mar;62(1):102-109. doi: 10.17113/ftb.62.01.24.8301.

引用本文的文献

1
Quantifying metadata relevance to network block structure using description length.
Commun Phys. 2024;7(1):331. doi: 10.1038/s42005-024-01819-y. Epub 2024 Oct 11.
3
An empirical Bayes approach to stochastic blockmodels and graphons: shrinkage estimation and model selection.
PeerJ Comput Sci. 2022 Jul 6;8:e1006. doi: 10.7717/peerj-cs.1006. eCollection 2022.
4
Scalable Estimation of Epidemic Thresholds via Node Sampling.
Sankhya Ser A. 2022;84(1):321-344. doi: 10.1007/s13171-021-00249-0. Epub 2021 Jul 7.
5
6
Demarcating geographic regions using community detection in commuting networks with significant self-loops.
PLoS One. 2020 Apr 29;15(4):e0230941. doi: 10.1371/journal.pone.0230941. eCollection 2020.
7
Mapping the community structure of the rat cerebral cortex with weighted stochastic block modeling.
Brain Struct Funct. 2020 Jan;225(1):71-84. doi: 10.1007/s00429-019-01984-9. Epub 2019 Nov 23.
9
Stochastic block models: A comparison of variants and inference methods.
PLoS One. 2019 Apr 23;14(4):e0215296. doi: 10.1371/journal.pone.0215296. eCollection 2019.

本文引用的文献

1
Improving Simulation-Based Algorithms for Fitting ERGMs.
J Comput Graph Stat. 2012 Dec 13;21(4):920-939. doi: 10.1080/10618600.2012.679224.
2
Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications.
Phys Rev E Stat Nonlin Soft Matter Phys. 2011 Dec;84(6 Pt 2):066106. doi: 10.1103/PhysRevE.84.066106. Epub 2011 Dec 12.
3
Inference and phase transitions in the detection of modules in sparse networks.
Phys Rev Lett. 2011 Aug 5;107(6):065701. doi: 10.1103/PhysRevLett.107.065701. Epub 2011 Aug 2.
4
The interplay between microscopic and mesoscopic structures in complex networks.
PLoS One. 2011;6(8):e21282. doi: 10.1371/journal.pone.0021282. Epub 2011 Aug 1.
5
Mixed Membership Stochastic Blockmodels.
J Mach Learn Res. 2008 Sep;9:1981-2014.
6
Stochastic blockmodels and community structure in networks.
Phys Rev E Stat Nonlin Soft Matter Phys. 2011 Jan;83(1 Pt 2):016107. doi: 10.1103/PhysRevE.83.016107. Epub 2011 Jan 21.
7
Missing and spurious interactions and the reconstruction of complex networks.
Proc Natl Acad Sci U S A. 2009 Dec 29;106(52):22073-8. doi: 10.1073/pnas.0908366106. Epub 2009 Dec 14.
8
A nonparametric view of network models and Newman-Girvan and other modularities.
Proc Natl Acad Sci U S A. 2009 Dec 15;106(50):21068-73. doi: 10.1073/pnas.0907096106. Epub 2009 Nov 23.
9
Food web models: a plea for groups.
Ecol Lett. 2009 Jul;12(7):652-62. doi: 10.1111/j.1461-0248.2009.01321.x. Epub 2009 May 6.
10
Hierarchical structure and the prediction of missing links in networks.
Nature. 2008 May 1;453(7191):98-101. doi: 10.1038/nature06830.

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验