Adonis结果P值小于0.05,一定代表两组样品物种构成差异显著吗?
前情回顧
方差分析基本概念:方差分析中的“元”和“因素”是什么?
PERMANOVA原理解釋:這個(gè)統(tǒng)計(jì)檢驗(yàn)可用于判斷PCA/PCoA等的分群效果是否顯著!
實(shí)戰(zhàn)1:畫(huà)一個(gè)帶統(tǒng)計(jì)檢驗(yàn)的PCoA分析結(jié)果
配對(duì)檢驗(yàn):畫(huà)一個(gè)帶統(tǒng)計(jì)檢驗(yàn)的PcOA分析結(jié)果 (再進(jìn)一步,配對(duì)比較)
你的adonis用對(duì)了嗎?不同因素的順序竟然對(duì)結(jié)果有很大影響
為PERMANOVA/Adonis分析保駕護(hù)航,檢驗(yàn)數(shù)據(jù)離散度
非參數(shù)檢驗(yàn)也不是什么都不需要關(guān)注,比如上面提到的因素順序和方差加和方式是一個(gè)需要注意的點(diǎn)。除此之外,非參數(shù)多元方差分析應(yīng)用時(shí)還有下面這些注意事項(xiàng):
PERMANOVA檢驗(yàn)沒(méi)有考慮變量之間的共線性關(guān)系,因此也不能夠用于探索這種關(guān)系。
嵌套或分層設(shè)計(jì) (Nested or hierarchical designs)時(shí)需要考慮合適的置換策略。
需要明確哪些樣品是真正可以交換的 (exchangeable)。
PERMANOVA有個(gè)假設(shè)是balanced designs (不同分組的樣本數(shù)相等), 非平衡設(shè)計(jì)也能處理。
如果不同組的樣品在檢測(cè)指標(biāo)構(gòu)成的空間的中心點(diǎn)沒(méi)有差別,但每個(gè)組內(nèi)檢測(cè)指標(biāo)離散度較大,也會(huì)導(dǎo)致獲得顯著性的P值。
在解釋結(jié)果時(shí),需要同時(shí)評(píng)估數(shù)據(jù)離散度的影響。
vegdist評(píng)估數(shù)據(jù)離散度,再解釋adonis的結(jié)果
前面我們用下面的代碼檢驗(yàn)了Managment對(duì)物種組成差異影響的顯著程度,獲得P-value=0.002 < 0.05,表示管理方式對(duì)物種組成有顯著影響。但這一影響是否受到每個(gè)分組里面數(shù)據(jù)離散程度的影響呢?
library(vegan) data(dune) data(dune.env) # 基于bray-curtis距離進(jìn)行計(jì)算 set.seed(1) dune.div <- adonis2(dune ~ Management, data = dune.env, permutations = 999, method="bray")dune.div## Permutation test for adonis under reduced model ## Terms added sequentially (first to last) ## Permutation: free ## Number of permutations: 999 ## ## adonis2(formula = dune ~ Management, data = dune.env, permutations = 999, method = "bray") ## Df SumOfSqs R2 F Pr(>F) ## Management 3 1.4686 0.34161 2.7672 0.002 ** ## Residual 16 2.8304 0.65839 ## Total 19 4.2990 1.00000 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1我們還需要利用betadisper評(píng)估下每組樣本物種組成的多元一致性 (Multivariate homogeneity of groups dispersions (variances))。如下代碼計(jì)算出P=0.168表示不同分組樣品檢測(cè)指標(biāo)的離散度(方差)沒(méi)有顯著差異。那么,adonis檢測(cè)出的差異就是因?yàn)槊拷M數(shù)據(jù)在空間的中心點(diǎn)不同造成的,進(jìn)一步說(shuō)明Management對(duì)物種組成有顯著影響。
# 計(jì)算加權(quán)bray-curtis距離 dune.dist <- vegdist(dune, method="bray", binary=F)# One measure of multivariate dispersion (variance) for a group of samples # is to calculate the average distance of group members to the group centroid # or spatial median in multivariate space. # To test if the dispersions (variances) of one or more groups are different, # the distances of group members to the group centroid are subject to ANOVA. # This is a multivariate analogue of Levene's test for homogeneity of variances # if the distances between group members and # group centroids is the Euclidean distance. dispersion <- betadisper(dune.dist, group=dune.env$Management) permutest(dispersion)## ## Permutation test for homogeneity of multivariate dispersions ## Permutation: free ## Number of permutations: 999 ## ## Response: Distances ## Df Sum Sq Mean Sq F N.Perm Pr(>F) ## Groups 3 0.13831 0.046104 1.9506 999 0.159 ## Residuals 16 0.37816 0.023635從下面的圖上也可以看出,4種管理方式下樣品在空間的中心點(diǎn)相距較遠(yuǎn)。(也可以參考前面如何美化這個(gè)圖)
plot(dispersion, hull=FALSE, ellipse=TRUE) ##sd ellipseQ: When running adonis (vegan package) I got an r2 = 0.45, andp = 0.001. When I ran the betadisper and ran a subsequent permutation test I got an F = 1 and p = 0.3.
A: A non-significant result in betadisper is not necessarily related to a significant/non-significant result in adonis. The two tests are testing different hypothesis. The former testshomogeneity of dispersion among groups (regions in your case), which is a condition (assumption) for adonis. The latter tests no difference in ‘location’, that is, tests whether composition among groups is similar or not. You may have the centroids of two groups in NMS at a very similar position in the ordination space, but if theirdispersions are quite different, adonis will give you a significant p-value, thus, the result is heavily influenced not by thedifference in composition between groups but bydifferences in composition within groups (heterogeneous dispersion, and thus a measure of beta diversity). In short, your results are fine, you are meeting the ‘one assumption’ for adonis (homogeneous dispersion) and thus you are certain that results from adonis are ‘real’ and not an artifact of heterogeneous dispersions. For more information you can read Anderson (2006) Biometrics 62(1):245-253 and Anderson (2006) Ecology Letters 9(6):683-693. Hope this helps!
https://stats.stackexchange.com/questions/212137/betadisper-and-adonis-in-r-am-i-interpreting-my-output-correctly
數(shù)據(jù)離散度不同而中心點(diǎn)一致,adonis也可能顯著
下面我們看一個(gè)模擬的例子,模擬出3套群體的物種豐度表,群體1、群體2、群體3的物種空間的中心點(diǎn)一致,而物種豐度的離散度依次變小,PERMANOVA檢驗(yàn)顯著,betadisper結(jié)果也顯著,這時(shí)解釋數(shù)據(jù)時(shí)就要小心。這個(gè)導(dǎo)致顯著的原因是什么。
set.seed(1) num <- 30 # 控制每個(gè)物種的均值 mean <- seq(10,120,by=10) # 控制離散度 disp <- c(5,50,200)# 模擬3組樣品的數(shù)據(jù);直接是轉(zhuǎn)置后的物種豐度表 sites.a <- as.data.frame(mapply(rnbinom, n=num, size=disp[1], mu=mean)) rownames(sites.a) <- paste('site.a', 1:num, sep=".") colnames(sites.a) <- paste('Species',letters[1:length(mean)], sep=".")sites.b <- as.data.frame(mapply(rnbinom, n=num, size=disp[1:2], mu=mean)) rownames(sites.b) <- paste('site.b', 1:num, sep=".") colnames(sites.b) <- paste('Species',letters[1:length(mean)], sep=".")sites.c <- as.data.frame(mapply(rnbinom, n=num, size=disp, mu=mean)) rownames(sites.c) <- paste('site.c', 1:num, sep=".") colnames(sites.c) <- paste('Species',letters[1:length(mean)], sep=".")otu_table_t <- rbind(sites.a,sites.b,sites.c) otu_table_t[sample(1:90,5),]## Species.a Species.b Species.c Species.d Species.e Species.f Species.g Species.h Species.i Species.j ## site.c.22 13 15 43 29 49 72 24 102 75 96 ## site.a.26 8 23 46 29 25 15 91 49 58 54 ## site.a.13 14 30 47 56 18 77 111 128 90 53 ## site.a.14 5 15 17 56 37 75 81 59 63 58 ## site.b.21 15 24 8 33 28 42 108 74 76 64 ## Species.k Species.l ## site.c.22 139 142 ## site.a.26 87 129 ## site.a.13 33 47 ## site.a.14 164 183 ## site.b.21 52 103生成Metadata數(shù)據(jù),包含樣品的分組信息。目的就是檢驗(yàn)不同組的物種構(gòu)成是否有顯著差異。
metadata <- data.frame(Sample=rownames(otu_table_t), Group=rep(c("A","B","C"), each=num)) rownames(metadata) <- metadata$Sample metadata[sample(1:90,5),,drop=F]## Sample Group ## site.a.28 site.a.28 A ## site.b.12 site.b.12 B ## site.a.20 site.a.20 A ## site.b.10 site.b.10 B ## site.a.10 site.a.10 APCoA和NMDS分析可視化不同組樣品物種組成的差異度
統(tǒng)計(jì)分析前,先直觀的看一下不同組樣本在物種定義的空間上的分布。
為什么要畫(huà)個(gè)圖:參考 - 什么是安斯庫(kù)姆四重奏?為什么統(tǒng)計(jì)分析之前必須要作圖?
# 計(jì)算加權(quán)bray-curtis距離 otu_dist <- vegdist(otu_table_t, method="bray", binary=F)otu_pcoa <- cmdscale(otu_dist, k=3, eig=T)otu_pcoa_points <- as.data.frame(otu_pcoa$points) sum_eig <- sum(otu_pcoa$eig) eig_percent <- round(otu_pcoa$eig/sum_eig*100,1)colnames(otu_pcoa_points) <- paste0("PCoA", 1:3)otu_pcoa_result <- cbind(otu_pcoa_points, metadata)從PCoA的結(jié)果上來(lái)看,A,B,C三個(gè)組在第一、第二、第三主坐標(biāo)軸沒(méi)有明顯的區(qū)分開(kāi)。
library(ggplot2) library(patchwork)ggplot(otu_pcoa_result, aes(x=PCoA1, y=PCoA2, color=Group)) +labs(x=paste("PCoA 1 (", eig_percent[1], "%)", sep=""),y=paste("PCoA 2 (", eig_percent[2], "%)", sep="")) +geom_point(size=4) + stat_ellipse(level=0.9) +theme_classic() + coord_fixed() +ggplot(otu_pcoa_result, aes(x=PCoA1, y=PCoA3, color=Group)) +labs(x=paste("PCoA 1 (", eig_percent[1], "%)", sep=""),y=paste("PCoA 3 (", eig_percent[3], "%)", sep="")) +geom_point(size=4) + stat_ellipse(level=0.9) +theme_classic() + coord_fixed()從NMDS結(jié)果看,A,B,C三組也區(qū)分不開(kāi)。
otu_mds <- metaMDS(otu_table_t, k=5) #using all the defaults## Square root transformation ## Wisconsin double standardization ## Run 0 stress 0.1131245 ## Run 1 stress 0.1131233 ## ... New best solution ## ... Procrustes: rmse 0.0003155417 max resid 0.001341899 ## ... Similar to previous best ## Run 2 stress 0.1131243 ## ... Procrustes: rmse 0.0009154324 max resid 0.00352237 ## ... Similar to previous best ## Run 3 stress 0.1131238 ## ... Procrustes: rmse 0.0002307456 max resid 0.001378836 ## ... Similar to previous best ## Run 4 stress 0.1131239 ## ... Procrustes: rmse 0.0002008885 max resid 0.0008441584 ## ... Similar to previous best ## Run 5 stress 0.1131233 ## ... Procrustes: rmse 0.0004594988 max resid 0.00248363 ## ... Similar to previous best ## Run 6 stress 0.1136538 ## Run 7 stress 0.1131231 ## ... New best solution ## ... Procrustes: rmse 6.187922e-05 max resid 0.0002788433 ## ... Similar to previous best ## Run 8 stress 0.1131234 ## ... Procrustes: rmse 0.000457399 max resid 0.002017475 ## ... Similar to previous best ## Run 9 stress 0.1131243 ## ... Procrustes: rmse 0.0003620819 max resid 0.001329571 ## ... Similar to previous best ## Run 10 stress 0.1131235 ## ... Procrustes: rmse 0.0001788438 max resid 0.0008840311 ## ... Similar to previous best ## Run 11 stress 0.1131248 ## ... Procrustes: rmse 0.0004674201 max resid 0.001960981 ## ... Similar to previous best ## Run 12 stress 0.1131231 ## ... New best solution ## ... Procrustes: rmse 0.0003807188 max resid 0.001578129 ## ... Similar to previous best ## Run 13 stress 0.1131238 ## ... Procrustes: rmse 0.0004016239 max resid 0.002178598 ## ... Similar to previous best ## Run 14 stress 0.113123 ## ... New best solution ## ... Procrustes: rmse 0.0001931854 max resid 0.0007886561 ## ... Similar to previous best ## Run 15 stress 0.1176584 ## Run 16 stress 0.1131244 ## ... Procrustes: rmse 0.000621146 max resid 0.002339344 ## ... Similar to previous best ## Run 17 stress 0.1131237 ## ... Procrustes: rmse 0.0004553297 max resid 0.0019548 ## ... Similar to previous best ## Run 18 stress 0.1131236 ## ... Procrustes: rmse 0.000454603 max resid 0.001894929 ## ... Similar to previous best ## Run 19 stress 0.1131241 ## ... Procrustes: rmse 0.0005855289 max resid 0.002455173 ## ... Similar to previous best ## Run 20 stress 0.113124 ## ... Procrustes: rmse 0.0005247607 max resid 0.001899271 ## ... Similar to previous best ## *** Solution reachedotu_mds_scores <- as.data.frame(scores(otu_mds)) otu_mds_scores <- cbind(otu_mds_scores, metadata)library(ggplot2) ggplot(data=otu_mds_scores, aes(x=NMDS1,y=NMDS2,colour=Group)) + geom_point(size=4) + stat_ellipse(level = 0.9) +theme_classic()進(jìn)行Adonis檢驗(yàn)和數(shù)據(jù)離散度評(píng)估
adonis結(jié)果顯示Pr(>F)<0.05,統(tǒng)計(jì)顯著;不同組之間的物種組成存在顯著差異。這與PCoA和NMDS的結(jié)果還是有些不一致的。那這個(gè)統(tǒng)計(jì)差異是怎么來(lái)的呢?
library(vegan) adon.results<-adonis(otu_dist ~ Group, data=metadata, perm=999) print(adon.results)## ## Call: ## adonis(formula = otu_dist ~ Group, data = metadata, permutations = 999) ## ## Permutation: free ## Number of permutations: 999 ## ## Terms added sequentially (first to last) ## ## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F) ## Group 2 0.10752 0.053760 2.4707 0.05375 0.001 *** ## Residuals 87 1.89300 0.021759 0.94625 ## Total 89 2.00052 1.00000 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1betadisper檢驗(yàn)Pr(>F)<0.05表明不同組的數(shù)據(jù)在空間分布的離散度顯著不同。這是導(dǎo)致adonis結(jié)果顯著的主要原因。不同分組之間物種的構(gòu)成的顯著不同不是體現(xiàn)在物種空間中心點(diǎn)的變化,而是物種空間離散度的變化。
mod <- betadisper(otu_dist, metadata$Group) permutest(mod)## ## Permutation test for homogeneity of multivariate dispersions ## Permutation: free ## Number of permutations: 999 ## ## Response: Distances ## Df Sum Sq Mean Sq F N.Perm Pr(>F) ## Groups 2 0.157498 0.078749 80.188 999 0.001 *** ## Residuals 87 0.085439 0.000982 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1用一組可視化來(lái)展示這個(gè)差異的成因
把每組樣本抽提出來(lái),分別繪制下PCoA的樣品分布,可以看出,每組樣品在PCoA定義的空間上中心點(diǎn)是很相近的,而樣品分散程度不同。也就是說(shuō)分組內(nèi)樣品的多樣性反應(yīng)到了不同分組的物種構(gòu)成差異上了,這個(gè)“顯著”的差異是不是我們關(guān)注的,需要自己來(lái)判斷了。
# extract the centroids and the site points in multivariate space. centroids<-data.frame(grps=rownames(mod$centroids),data.frame(mod$centroids)) vectors<-data.frame(group=mod$group,data.frame(mod$vectors))# to create the lines from the centroids to each point we will put it in a format that ggplot can handle seg.data<-cbind(vectors[,1:3],centroids[rep(1:nrow(centroids),as.data.frame(table(vectors$group))$Freq),2:3]) names(seg.data)<-c("group","v.PCoA1","v.PCoA2","PCoA1","PCoA2")# create the convex hulls of the outermost points grp1.hull<-seg.data[seg.data$group=="A",1:3][chull(seg.data[seg.data$group=="A",2:3]),] grp2.hull<-seg.data[seg.data$group=="B",1:3][chull(seg.data[seg.data$group=="B",2:3]),] grp3.hull<-seg.data[seg.data$group=="C",1:3][chull(seg.data[seg.data$group=="C",2:3]),] all.hull<-rbind(grp1.hull,grp2.hull,grp3.hull)library(gridExtra)panel.a<-ggplot() +geom_polygon(data=all.hull[all.hull=="A",],aes(x=v.PCoA1,y=v.PCoA2),colour="black",alpha=0,linetype="dashed") +geom_segment(data=seg.data[1:30,],aes(x=v.PCoA1,xend=PCoA1,y=v.PCoA2,yend=PCoA2),alpha=0.30) + geom_point(data=centroids[1,1:3], aes(x=PCoA1,y=PCoA2),size=4,colour="red",shape=16) + geom_point(data=seg.data[1:30,], aes(x=v.PCoA1,y=v.PCoA2),size=2,shape=16) +labs(title="A",x="",y="") +coord_cartesian(xlim = c(-0.2,0.2), ylim = c(-0.25,0.2)) +theme_classic() + theme(legend.position="none")panel.b<-ggplot() + geom_polygon(data=all.hull[all.hull=="B",],aes(x=v.PCoA1,y=v.PCoA2),colour="black",alpha=0,linetype="dashed") +geom_segment(data=seg.data[31:60,],aes(x=v.PCoA1,xend=PCoA1,y=v.PCoA2,yend=PCoA2),alpha=0.30) + geom_point(data=centroids[2,1:3], aes(x=PCoA1,y=PCoA2),size=4,colour="red",shape=17) + geom_point(data=seg.data[31:60,], aes(x=v.PCoA1,y=v.PCoA2),size=2,shape=17) +labs(title="B",x="",y="") +coord_cartesian(xlim = c(-0.2,0.2), ylim = c(-0.25,0.2)) +theme_classic() + theme(legend.position="none")panel.c<-ggplot() + geom_polygon(data=all.hull[all.hull=="C",],aes(x=v.PCoA1,y=v.PCoA2),colour="black",alpha=0,linetype="dashed") +geom_segment(data=seg.data[61:90,],aes(x=v.PCoA1,xend=PCoA1,y=v.PCoA2,yend=PCoA2),alpha=0.30) +geom_point(data=centroids[3,1:3], aes(x=PCoA1,y=PCoA2),size=4,colour="red",shape=15) + geom_point(data=seg.data[61:90,], aes(x=v.PCoA1,y=v.PCoA2),size=2,shape=15) + labs(title="C",x="",y="") +coord_cartesian(xlim = c(-0.2,0.2), ylim = c(-0.25,0.2)) +theme_classic() + theme(legend.position="none")panel.d<-ggplot() + geom_polygon(data=all.hull,aes(x=v.PCoA1,y=v.PCoA2),colour="black",alpha=0,linetype="dashed") +geom_segment(data=seg.data,aes(x=v.PCoA1,xend=PCoA1,y=v.PCoA2,yend=PCoA2),alpha=0.30) + geom_point(data=centroids[,1:3], aes(x=PCoA1,y=PCoA2,shape=grps),size=4,colour="red") + geom_point(data=seg.data, aes(x=v.PCoA1,y=v.PCoA2,shape=group),size=2) + labs(title="All",x="",y="") +coord_cartesian(xlim = c(-0.2,0.2), ylim = c(-0.25,0.2)) +theme_classic() + theme(legend.position="none")grid.arrange(panel.a,panel.b,panel.c,panel.d,nrow=1)PERMANOVA的作者對(duì)這個(gè)問(wèn)題的看法
Marti Anderson: “[…] Although there is also no explicit assumption regarding the homogeneity of spread within each group, PERMANOVA, like ANOSIM (Clarke 1993), will be sensitive to differences in spread (variability) among groups. Thus, if a significant difference between groups is detected using PERMANOVA, then this could be due to differences in location, differences in spread, or a combinationof the two. Perhaps the best approach is to perform a separate test for homogeneity (e.g., using the program PERMDISP) including pair-wise comparisons, as well as examining the average within and between-group distances and associated MDS plots. This will help to determine the nature of the difference between any pair of groups, whether it be due to location, spread, or a combination of the two. […]”
參考
https://www.scribbr.com/frequently-asked-questions/one-way-vs-two-way-anova/
MANOVA的前提假設(shè) https://www.real-statistics.com/multivariate-statistics/multivariate-analysis-of-variance-manova/manova-assumptions/ ?https://www.statology.org/manova-assumptions/
https://statistics.laerd.com/statistical-guides/one-way-anova-statistical-guide.php
https://www.yunbios.net/h-nd-570.html
https://mp.weixin.qq.com/s/v_k4Yhe9rBWM9y9A3P3wQw
https://mp.weixin.qq.com/s?__biz=MzUzMjA4Njc1MA==&mid=2247484678&idx=1&sn=f95418a311e639704e9848545efc7fd7&scene=21#wechat_redirect
https://chrischizinski.github.io/rstats/vegan-ggplot2/
https://chrischizinski.github.io/rstats/adonis/
https://chrischizinski.github.io/rstats/ordisurf/
https://www.rdocumentation.org/packages/vegan/versions/1.11-0/topics/adonis
https://www.jianshu.com/p/dfa689f7cafd
https://stats.stackexchange.com/questions/312302/adonis-in-vegan-order-of-variables-non-nested-with-one-degree-of-freedom-for
https://stats.stackexchange.com/questions/188519/adonis-in-vegan-order-of-variables-or-use-of-strata?noredirect=1
https://github.com/vegandevs/vegan/issues/229
https://stats.stackexchange.com/questions/476256/adonis-vs-adonis2
清晰解釋Type I, Type II, Type III https://mcfromnz.wordpress.com/2011/03/02/anova-type-iiiiii-ss-explained/
清晰解釋Type I, Type II, Type III https://stats.stackexchange.com/questions/60362/choice-between-type-i-type-ii-or-type-iii-anova
https://thebiobucket.blogspot.com/2011/08/two-way-permanova-adonis-with-custom.html#more
adonis的前提條件 https://thebiobucket.blogspot.com/2011/04/assumptions-for-permanova-with-adonis.html#more
作者的論文 https://static1.squarespace.com/static/580e3c475016e191c523a0e2/t/5813ba8b5016e1a5b61f454a/1477687949842/Anderson_et_al-2013-ANOSIM+vs.+PERMANOVA.pdf
離散度 adonis https://chrischizinski.github.io/rstats/adonis/
往期精品(點(diǎn)擊圖片直達(dá)文字對(duì)應(yīng)教程)
機(jī)器學(xué)習(xí)
后臺(tái)回復(fù)“生信寶典福利第一波”或點(diǎn)擊閱讀原文獲取教程合集
創(chuàng)作挑戰(zhàn)賽新人創(chuàng)作獎(jiǎng)勵(lì)來(lái)咯,堅(jiān)持創(chuàng)作打卡瓜分現(xiàn)金大獎(jiǎng)總結(jié)
以上是生活随笔為你收集整理的Adonis结果P值小于0.05,一定代表两组样品物种构成差异显著吗?的全部?jī)?nèi)容,希望文章能夠幫你解決所遇到的問(wèn)題。
- 上一篇: linux图形界面编程基本知识
- 下一篇: su官网