Gaussian Mixture Models (GMM): Theory and Python Implementation
- https://github.com/Sean16SYSU/MachineLearningImplement
Overview
A Gaussian mixture model describes the distribution of the data as a combination of several Gaussian components.
The parameters of each component are estimated iteratively with the EM algorithm.
There are plenty of theoretical derivations of EM online; I recommend working through the algorithm outline here first, and then reading the theory.
Update rules
- $\mu_i' = \dfrac{\sum_{j=1}^m \eta_{ji}\, x_j}{\sum_{j=1}^m \eta_{ji}}$
- $\Sigma_i' = \dfrac{\sum_{j=1}^m \eta_{ji}\,(x_j - \mu_i')(x_j - \mu_i')^T}{\sum_{j=1}^m \eta_{ji}}$
- $\alpha_i' = \dfrac{\sum_{j=1}^m \eta_{ji}}{m}$
Here $m$ is the number of samples, and $\eta_{ji} = P(z_j = i \mid x_j)$ is the posterior probability that sample $x_j$ was generated by the $i$-th Gaussian component.
Iterating these updates until convergence, the final $\eta_{ji}$ determine the clustering result:
each sample is assigned to the Gaussian component with the highest posterior probability.
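The update rules above can be sketched directly in NumPy. This is a minimal, self-contained illustration of one E-step and one M-step on random toy data; the variable names and the use of `scipy.stats.multivariate_normal` are my own choices here, not taken from the implementation below:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))          # m = 100 samples, 2 features
k = 3
mus = rng.normal(size=(k, 2))
sigmas = np.array([np.eye(2)] * k)     # start from identity covariances
alphas = np.ones(k) / k                # uniform mixing weights

# E-step: eta[j, i] = P(z_j = i | x_j), proportional to alpha_i * N(x_j; mu_i, Sigma_i)
dens = np.column_stack([
    multivariate_normal.pdf(X, mean=mus[i], cov=sigmas[i]) for i in range(k)
])
eta = alphas * dens
eta /= eta.sum(axis=1, keepdims=True)  # each row now sums to 1

# M-step: the three update rules above
N_i = eta.sum(axis=0)                              # sum_j eta_ji per component
mus_new = (eta.T @ X) / N_i[:, None]               # weighted means
sigmas_new = np.stack([
    ((X - mus_new[i]).T * eta[:, i]) @ (X - mus_new[i]) / N_i[i]
    for i in range(k)
])                                                 # weighted covariances
alphas_new = N_i / len(X)                          # new mixing weights
```

Repeating the two steps until the parameters stop changing gives the EM fit.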
Python Implementation
```python
from sklearn import datasets
from scipy import stats
import numpy as np
import matplotlib.pyplot as plt

iris = datasets.load_iris()

def GMMs(X, k=3, steps=10):
    # Density of the multivariate Gaussian N(mu, sigma) at point x
    def p(x, mu, sigma):
        n = len(x)
        div = (2 * np.pi) ** (n / 2) * (abs(np.linalg.det(sigma)) ** 0.5)
        expOn = -0.5 * np.dot((x - mu).T, np.dot(np.linalg.inv(sigma), (x - mu)))
        return np.exp(expOn) / div

    # Random initialization of the k means, covariances and mixing weights
    def init(X):
        _, n = X.shape
        return (np.random.rand(k, n),
                2 * np.random.rand(k, n, n) + 1,
                np.random.rand(k))

    mus, sigmas, alphas = init(X)
    mat = np.zeros((len(X), k))  # mat[j][i] = eta_ji = P(z_j = i | x_j)
    for times in range(steps):
        # E-step: compute the responsibilities eta_ji
        for j, x in enumerate(X):
            for i in range(k):
                mat[j][i] = alphas[i] * p(x, mus[i], sigmas[i])
            mat[j] /= mat[j].sum()  # normalize by the weighted sum so each row sums to 1
        # M-step: re-estimate mus, sigmas and alphas from the responsibilities
        for i in range(k):
            mus[i] = np.dot(mat[:, i].T, X) / sum(mat[:, i])
            temp = np.zeros(sigmas[0].shape)
            for j in range(len(X)):
                data = (X[j] - mus[i]).reshape(-1, 1)  # column vector (-1 instead of a hard-coded 4)
                temp += mat[j][i] * np.dot(data, data.T)
            sigmas[i] = temp / sum(mat[:, i])
            alphas[i] = sum(mat[:, i]) / len(X)
    # Assign each sample to the component with the highest responsibility
    Ans = np.zeros(len(X))
    for j, x in enumerate(X):
        for i in range(k):
            mat[j][i] = alphas[i] * p(x, mus[i], sigmas[i])
        Ans[j] = np.argmax(mat[j])
    return Ans

test_y = GMMs(iris.data, steps=20)
```
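As a cross-check, scikit-learn ships the same algorithm as `sklearn.mixture.GaussianMixture`; a quick sanity run on the same dataset might look like this:

```python
import numpy as np
from sklearn import datasets
from sklearn.mixture import GaussianMixture

iris = datasets.load_iris()
# 3 components, mirroring k=3 in the hand-written version
gm = GaussianMixture(n_components=3, random_state=0).fit(iris.data)
labels = gm.predict(iris.data)  # one cluster index per sample
print(np.bincount(labels))      # cluster sizes
```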
To visualize the clustering, project the data to 2-D with PCA and color by cluster:

```python
from sklearn.decomposition import PCA

X_reduced = PCA(n_components=2).fit_transform(iris.data)
plt.scatter(X_reduced[:, 0], X_reduced[:, 1], c=test_y, cmap=plt.cm.Set1)
```
```python
def evaluate(y, t):
    # Pairwise counts over all sample pairs:
    # a = same cluster in both labelings, b = same in y only,
    # c = same in t only, d = different in both
    a, b, c, d = 0, 0, 0, 0
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            if y[i] == y[j] and t[i] == t[j]:
                a += 1
            elif y[i] == y[j] and t[i] != t[j]:
                b += 1
            elif y[i] != y[j] and t[i] == t[j]:
                c += 1
            else:
                d += 1
    return a, b, c, d

def external_index(a, b, c, d, m):
    JC = a / (a + b + c)                       # Jaccard coefficient
    FMI = np.sqrt(a**2 / ((a + b) * (a + c)))  # Fowlkes-Mallows index
    RI = 2 * (a + d) / (m * (m - 1))           # Rand index: (a + d) / C(m, 2)
    return JC, FMI, RI

def evaluate_it(y, t):
    a, b, c, d = evaluate(y, t)
    return external_index(a, b, c, d, len(y))
```
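The pair counts are easy to verify by hand on a tiny example. This standalone sketch (the helper name `pair_indices` is mine) recomputes the three external indices, using the standard $\binom{m}{2} = m(m-1)/2$ pair count in the Rand index:

```python
import numpy as np
from itertools import combinations

def pair_indices(y, t):
    # a: pair in same cluster under both y and t; b: same under y only;
    # c: same under t only; d: different under both
    a = b = c = d = 0
    for i, j in combinations(range(len(y)), 2):
        same_y, same_t = y[i] == y[j], t[i] == t[j]
        if same_y and same_t:
            a += 1
        elif same_y:
            b += 1
        elif same_t:
            c += 1
        else:
            d += 1
    m = len(y)
    JC = a / (a + b + c)
    FMI = np.sqrt(a * a / ((a + b) * (a + c)))
    RI = 2 * (a + d) / (m * (m - 1))  # (a + d) / C(m, 2)
    return JC, FMI, RI

# Two identical labelings: every pair agrees, so all three indices are 1
print(pair_indices([0, 0, 1, 1], [0, 0, 1, 1]))  # -> (1.0, 1.0, 1.0)
```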
On the iris data, one run produced the following index values:

| Index | Value |
| --- | --- |
| JC | 0.8187638512681605 |
| FMI | 0.9003627122239571 |
| RI | 0.921766004415011 |