當(dāng)前位置：首頁 > 编程语言 > python >内容正文

python

FCM算法实现Python（简洁版）

發(fā)布時(shí)間：2025/4/16 python 15 豆豆

生活随笔收集整理的這篇文章主要介紹了 FCM算法实现Python（简洁版）小編覺得挺不錯(cuò)的,現(xiàn)在分享給大家,幫大家做個(gè)參考.

FCM算法

關(guān)于聚類的所有項(xiàng)目都放到了這個(gè)上面

https://github.com/Sean16SYSU/MachineLearningImplement/tree/master/Clustering

全名為Fuzzy C-Means，是一種聚類算法。

Fuzzy c-means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters. This method (developed by Dunn in 1973 and improved by Bezdek in 1981) is frequently used in pattern recognition. It is based on minimization of the following objective function:
FCM是一種聚類的方法，可以允許一個(gè)數(shù)據(jù)屬于兩個(gè)或以上的類。這種方法（由Dunn在1973年提出，由Bezdek在1981年改進(jìn)。）被頻繁地用于模式識(shí)別。

J. C. Dunn (1973): “A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters”, Journal of Cybernetics 3: 32-57
J. C. Bezdek (1981): “Pattern Recognition with Fuzzy Objective Function Algoritms”, Plenum Press, New York

基于最小化下面的目標(biāo)函數(shù):

$Jm=∑i=1N∑j=1Cuijm∣∣xi?cj∣∣2J_m = \sum^{N}_{i=1}{\sum^{C}_{j=1}{u_{ij}^m||x_i-c_j||^2}}$

m是任何大于1的實(shí)數(shù)
$u_{ij}$ 是 $x_i$ 在類j上隸屬度
其中 $x_i$ 是第i個(gè)d維測(cè)量向量
$c_j$ 是一個(gè)d維的類中心

FCM主要通過迭代更新u和c來得到最終的解。

迭代終點(diǎn)一般選取兩個(gè)：

當(dāng)u變化的無窮范數(shù)小于某個(gè)值時(shí)
當(dāng)?shù)螖?shù)達(dá)到某個(gè)極限時(shí)候

這個(gè)過程可能會(huì)收斂到局部最小值或者是鞍點(diǎn)上

算法流程

初始化矩陣U （描述每個(gè)點(diǎn)在不同的類中的概率）

通過U算類中心

Cj=∑i=1Nuijm?xi∑i=1NuijmC_j=\frac{\sum^{N}_{i=1}{u_{ij}^m * x_i}}{\sum^{N}_{i=1}{u_{ij}^m}}

更新U，通過之前算出來的類中心

uij=1∑k=1C(∣∣xi?cj∣∣∣∣xi?ck∣∣)2m?1u_{ij} = \frac{1}{\sum^{C}_{k=1}{(\frac{||x_i-c_j||}{||x_i-c_k||})^{\frac{2}{m-1}}}}

如果u變化的無窮范小于某個(gè)值的時(shí)候，就停下來，否者就轉(zhuǎn)到第2步

之前也寫過這個(gè)算法，當(dāng)時(shí)是用其他用途有點(diǎn)趕，就沒有仔細(xì)研究。
現(xiàn)在自己仔細(xì)研究了一輪之后，實(shí)現(xiàn)了這個(gè)算法，大概只用20行左右就可以完成這個(gè)算法FCM，相比之前的在github上看到的100行左右的代碼，這個(gè)簡(jiǎn)直可以說是改進(jìn)大了很多。

Python 實(shí)現(xiàn)

import numpy as npdef FCM(X, c_clusters=3, m=2, eps=10):membership_mat = np.random.random((len(X), c_clusters))membership_mat = np.divide(membership_mat, np.sum(membership_mat, axis=1)[:, np.newaxis])while True:working_membership_mat = membership_mat ** mCentroids = np.divide(np.dot(working_membership_mat.T, X), np.sum(working_membership_mat.T, axis=1)[:, np.newaxis])n_c_distance_mat = np.zeros((len(X), c_clusters))for i, x in enumerate(X):for j, c in enumerate(Centroids):n_c_distance_mat[i][j] = np.linalg.norm(x-c, 2)new_membership_mat = np.zeros((len(X), c_clusters))for i, x in enumerate(X):for j, c in enumerate(Centroids):new_membership_mat[i][j] = 1. / np.sum((n_c_distance_mat[i][j] / n_c_distance_mat[i]) ** (2 / (m - 1)))if np.sum(abs(new_membership_mat - membership_mat)) < eps:breakmembership_mat = new_membership_matreturn np.argmax(new_membership_mat, axis=1)

終止條件那個(gè)可以再優(yōu)化下，但基本框架就是這樣。

使用：

導(dǎo)入數(shù)據(jù)：

from sklearn import datasetsiris = datasets.load_iris()

評(píng)估

def evaluate(y, t):a, b, c, d = [0 for i in range(4)]for i in range(len(y)):for j in range(i+1, len(y)):if y[i] == y[j] and t[i] == t[j]:a += 1elif y[i] == y[j] and t[i] != t[j]:b += 1elif y[i] != y[j] and t[i] == t[j]:c += 1elif y[i] != y[j] and t[i] != t[j]:d += 1return a, b, c, ddef external_index(a, b, c, d, m):JC = a / (a + b + c)FMI = np.sqrt(a**2 / ((a + b) * (a + c)))RI = 2 * ( a + d ) / ( m * (m + 1) )return JC, FMI, RIdef evaluate_it(y, t):a, b, c, d = evaluate(y, t)return external_index(a, b, c, d, len(y))

測(cè)試

test_y = FCM(iris.data)

得到評(píng)估

evaluate_it(iris.target, test_y)

對(duì)于這三個(gè)指數(shù)來說，效果高了很多。

(0.7487508922198429, 0.856326530612245, 0.8935099337748345)

畫圖

from sklearn.decomposition import PCA import numpy as np import matplotlib.pyplot as pltX_reduced = PCA(n_components=2).fit_transform(iris.data) plt.scatter(X_reduced[:, 0], X_reduced[:, 1], c=test_y, cmap=plt.cm.Set1)

原圖：

plt.scatter(X_reduced[:, 0], X_reduced[:, 1], c=iris.target, cmap=plt.cm.Set1)

圖片上也能看到效果比其他的都要好，而且，經(jīng)過測(cè)試發(fā)現(xiàn)，FCM算法會(huì)比KMeans更加穩(wěn)定（很多，很多倍）。

總結(jié)

以上是生活随笔為你收集整理的FCM算法实现Python（简洁版）的全部?jī)?nèi)容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網(wǎng)站內(nèi)容還不錯(cuò)，歡迎將生活随笔推薦給好友。

上一篇： PCA主成分分析以及Python实现（阅
下一篇：【解决方案】调用multiprocess