The application of Uniform Manifold Approximation and Projection (UMAP) for unconstrained ordination and classification of biological indicators in aquatic ecology.

Author information

University of Niš, Faculty of Sciences and Mathematics, Department of Biology and Ecology, Višegradska 33, 18000 Niš, Serbia.

School for Resource and Environmental Studies, Dalhousie University, Halifax, Canada.

Publication information

Sci Total Environ. 2022 Apr 1;815:152365. doi: 10.1016/j.scitotenv.2021.152365. Epub 2021 Dec 25.

Abstract

The analysis of community structure in freshwater ecology often requires dimensionality reduction to process multivariate data. High dimensionality (number of taxa or environmental parameters × number of samples), nonlinear relationships, outliers, and high variability usually hinder the visualization and interpretation of multivariate datasets. Here, we propose a new statistical design that uses Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction and the Louvain algorithm for community partitioning to ordinate and classify the structure of aquatic biota in two-dimensional space. We demonstrate this approach on five previously published datasets for diatoms, macrophytes, chironomids (larval and subfossil), and fish. Principal Component Analysis (PCA) and Ward's clustering were also applied to compare the UMAP approach with traditional approaches to ordination and classification. The ordination of sampling sites in two-dimensional space showed a much denser, and easier to interpret, grouping with UMAP than with PCA. For data with a high number of dimensions, the classification of community structure using the Louvain algorithm in UMAP ordination space showed a higher classification strength than the cluster patterns obtained with Ward's algorithm in PCA space. Environmental gradients, presented as heat maps, were overlaid on the ordination patterns of aquatic communities, confirming that the ordinations obtained by UMAP were ecologically meaningful. This is the first study to apply UMAP with Louvain classification to ecological datasets. We show that its performance on local and global structures, together with the number of clusters being determined by the algorithm, makes this approach more powerful than traditional approaches.
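For orientation, the traditional baseline named in the abstract (PCA ordination followed by Ward's clustering) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: the Hellinger pre-treatment, the function names, and the synthetic site-by-taxon counts are all hypothetical and do not come from the paper. The UMAP and Louvain steps are not shown; they would need third-party packages such as umap-learn and a Louvain implementation (for example, the one in NetworkX).

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def hellinger(counts):
    """Square root of row-wise relative abundances (an assumed pre-treatment)."""
    counts = np.asarray(counts, dtype=float)
    return np.sqrt(counts / counts.sum(axis=1, keepdims=True))


def pca_scores(X, n_components=2):
    """Site scores on the first principal components, computed via SVD."""
    centered = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :n_components] * s[:n_components]


def ward_clusters(scores, n_clusters):
    """Cut a Ward dendrogram built on the ordination scores into n_clusters groups."""
    tree = linkage(scores, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic site-by-taxon counts (made up for illustration): 20 sites dominated
# by taxa 0-14, followed by 20 sites dominated by taxa 15-29.
rng = np.random.default_rng(0)
rates_a = np.r_[np.full(15, 20.0), np.full(15, 1.0)]
rates_b = rates_a[::-1]
counts = np.vstack([rng.poisson(rates_a, size=(20, 30)),
                    rng.poisson(rates_b, size=(20, 30))])

scores = pca_scores(hellinger(counts), n_components=2)  # 2-D ordination of sites
labels = ward_clusters(scores, n_clusters=2)            # one cluster label per site
print(labels)
```

Note that Ward's method needs the number of clusters to be chosen by the analyst (here `n_clusters=2`), whereas the abstract highlights that the Louvain algorithm determines the number of clusters itself.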
