當前位置：首頁 > 编程资源 > 编程问答 >内容正文

编程问答

SURF算法与源码分析、上

發布時間：2025/7/25 编程问答 14 豆豆

生活随笔收集整理的這篇文章主要介紹了 SURF算法与源码分析、上小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

FROM：http://www.cnblogs.com/ronny/p/4045979.html

如果說SIFT算法中使用DOG對LOG進行了簡化，提高了搜索特征點的速度，那么SURF算法則是對DoH的簡化與近似。雖然SIFT算法已經被認為是最有效的，也是最常用的特征點提取的算法，但如果不借助于硬件的加速和專用圖像處理器的配合，SIFT算法以現有的計算機仍然很難達到實時的程度。對于需要實時運算的場合，如基于特征點匹配的實時目標跟蹤系統，每秒要處理8-24幀的圖像，需要在毫秒級內完成特征點的搜索、特征矢量生成、特征矢量匹配、目標鎖定等工作，這樣SIFT算法就很難適應這種需求了。SURF借鑒了SIFT中簡化近似的思想，把DoH中的高斯二階微分模板進行了簡化，使得模板對圖像的濾波只需要進行幾個簡單的加減法運算，并且，這種運算與濾波器的尺度無關。實驗證明，SURF算法較SIFT在運算速度上要快3倍左右。

1. 積分圖像

SURF算法中要用到積分圖像的概念。借助積分圖像，圖像與高斯二階微分模板的濾波轉化為對積分圖像的加減運算。

積分圖像中任意一點(i,j)的值ii(i,j)，為原圖像左上角到點(i,j)相應的對角線區域灰度值的總和，即

ii(i,j)=∑r≤i,?c≤jp(r,c)

式中，p(r,c)表示圖像中點(r,c)的灰度值，ii(i,j)可以用下面兩式迭代計算得到

S(i,j)=S(i,j?1)+p(i,j)

ii(i,j)=ii(i?1,j)+S(i,j)

式中，S(i,j)表示一列的積分，且S(i,?1)=0,ii(?1,j)=0。求積分圖像，只需要對原圖像所有像素進行一遍掃描。

OpenCV中提供了用于計算積分圖像的接口

/* * src :輸入圖像，大小為M*N * sum: 輸出的積分圖像，大小為（M+1）*（N+1） * sdepth：用于指定sum的類型，-1表示與src類型一致 */ void integral(InputArray src, OutputArray sum, int sdepth = -1);

值得注意的是OpenCV里的積分圖大小比原圖像多一行一列，那是因為OpenCV中積分圖的計算公式為：

ii(i,j)=∑r<i,?c<jp(r,c)

一旦積分圖計算好了，計算圖像內任何矩形區域的像素值的和只需要三個加法，如上圖所示。

2. DoH近似

在斑點檢測這篇文章中已經提到過，我們可以利用Hessian矩陣行列式的極大值檢測斑點。下面我們給出Hessian矩陣的定義。

給定圖像I中的一個點x(i,j)，在點x處，尺度為σ的Hessian矩陣H(x,σ)定義如下：

H(x,σ)=[Lxx(x,σ)Lxy(x,σ)Lxy(x,σ)Lyy(x,σ)]

式中，Lxx(x,σ)是高斯二階微分?2g(σ)?x2在點x處與圖像I的卷積，Lx,y(x,σ)和Lyy(x,σ)具有類似的含義。

下面顯示的是上面三種高斯微分算子的圖形。

但是利用Hessian行列式進行圖像斑點檢測時，有一個缺點。由于二階高斯微分被離散化和裁剪的原因，導致了圖像在旋轉奇數倍的π/4時，即轉換到模板的對角線方向時，特征點檢測的重復性降低（也就是說，原來特征點的地方，可能檢測不到特征點了）。而在π/2時，特征點檢測的重現率真最高。但這一小小的不足不影響我們使用Hessian矩陣進行特征點的檢測。

為了將模板與圖產像的卷積轉換為盒子濾波運算，我們需要對高斯二階微分模板進行簡化，使得簡化后的模板只是由幾個矩形區域組成，矩形區域內填充同一值，如下圖所示，在簡化模板中白色區域的值為正數，黑色區域的值為負數，灰度區域的值為0。

對于σ=1.2的高斯二階微分濾波器，我們設定模板的尺寸為9×9的大小，并用它作為最小尺度空間值對圖像進行濾波和斑點檢測。我們使用Dxx、Dxy和Dyy表示模板與圖像進行卷積的結果。這樣，便可以將Hessian矩陣的行列式作如下的簡化。

Det(H)=LxxLyy–LxyLxy=DxxLxxDxxDyyLyyDyy?DxyLxyDxyDxyLxyDxy≈DxxDyy–(wDxy)2

濾波器響應的相關權重w是為了平衡Hessian行列式的表示式。這是為了保持高斯核與近似高斯核的一致性。

w=|Lxy(σ)|F|Dxx(σ)F||Lxx(σ)|F|Dxy(σ)F|=0.912

其中|X|F為Frobenius范數。理論上來說對于不同的σ的值和對應尺寸的模板尺寸，w值是不同的，但為了簡化起見，可以認為它是同一個常數。

使用近似的Hessian矩陣行列式來表示圖像中某一點x處的斑點響應值，遍歷圖像中所有的像元點，便形成了在某一尺度下琉璃點檢測的響應圖像。使用不同的模板尺寸，便形成了多尺度斑點響應的金字塔圖像，利用這一金字塔圖像，就可以進行斑點響應極值點的搜索，其過程完全與SIFT算法類同。

3. 尺度空間表示

通常想要獲取不同尺度的斑點，必須建立圖像的尺度空間金字塔。一般的方法是通過不同σ的高斯函數，對圖像進行平滑濾波，然后重采樣圖像以獲得更高一層的金字塔圖像。SIFT特征檢測算法中就是通過相鄰兩層圖像金字塔相減得到DoG圖像，然后再在DoG圖像上進行斑點和邊緣檢測工作的。

由于采用了盒子濾波和積分圖像，所以，我們并不需要像SIFT算法那樣去直接建立圖像金字塔，而是采用不斷增大盒子濾波模板的尺寸的間接方法。通過不同尺寸盒子濾波模板與積分圖像求取Hessian矩陣行列式的響應圖像。然后在響應圖像上采用3D非最大值抑制，求取各種不同尺度的斑點。

如前所述，我們使用9×9的模板對圖像進行濾波，其結果作為最初始的尺度空間層（此時，尺度值為s=1.2，近似σ=1.2的高斯微分），后續的層將通過逐步放大濾波模板尺寸，以及放大后的模板不斷與圖像進行濾波得到。由于采用盒子濾波和積分圖像，濾波過程并不隨著濾波模板尺寸的增加而使運算工作量增加。

與SIFT算法類似，我們需要將尺度空間劃分為若干組（Octaves）。一個組代表了逐步放大的濾波模板對同一輸入圖像進行濾波的一系列響應圖。每個組又由若干固定的層組成。由于積分圖像離散化的原因，兩個層之間的最小尺度變化量是由高斯二階微分濾波器在微分方向上對正負斑點響應長度l0決定的，它是盒子濾波器模板尺寸的1/3。對于9×9的模板，它的l0=3。一下層的響應長度至少應該在l0的基礎上增加2個像素，以保證一邊一個像素，即l0=5。這樣模板的尺寸就為15×15。以此類推，我們可以得到一個尺寸增大模板序列，它們的尺寸分別為：9×9，15×15，21×21，27×27，黑色、白色區域的長度增加偶數個像素，以保證一個中心像素的存在。

????????

采用類似的方法來處理其他幾組的模板序列。其方法是將濾波器尺寸增加量翻倍（6，12，24，38）。這樣，可以得到第二組的濾波器尺寸，它們分別為15，27，39，51。第三組的濾波器尺寸為27，51，75，99。如果原始圖像的尺寸仍然大于對應的濾波器尺寸，尺度空間的分析還可以進行第四組，其對應的模板尺寸分別為51，99，147和195。下圖顯示了第一組至第三組的濾波器尺寸變化。

在通常尺度分析情況下，隨著尺度的增大，被檢測到的斑點數量迅速衰減。所以一般進行3-4組就可以了，與此同時，為了減少運算量，提高計算的速度，可以考慮在濾波時，將采樣間隔設為2。

對于尺寸為L的模板，當用它與積分圖運算來近似二維高斯核的濾波時，對應的二維高斯核的參數σ=1.2×L9，這一點至關重要，尤其是在后面計算描述子時，用于計算鄰域的半徑時。

4. 興趣點的定位

為了在圖像及不同尺寸中定位興趣點，我們用了3×3×3鄰域非最大值抑制。具體的步驟基本與SIFT一致，而且Hessian矩陣行列式的最大值在尺度和圖像空間被插值。

下面顯示了我們用的快速Hessian檢測子檢測到的興趣點。

5. SURF源碼解析

這份源碼來自OpenCV nonfree模塊。

這里先介紹SURF特征點定位這一塊，關于特征點的描述下一篇文章再介紹。

5.1 主干函數 fastHessianDetector

特征點定位的主干函數為fastHessianDetector，該函數接受一個積分圖像，以及尺寸相關的參數，組數與每組的層數，檢測到的特征點保存在vector<KeyPoint>類型的結構中。

static void fastHessianDetector(const Mat& sum, const Mat& msum, vector<KeyPoint>& keypoints,int nOctaves, int nOctaveLayers, float hessianThreshold) {/*first Octave圖像采樣的步長，第二組的時候加倍，以此內推增加這個值，將會加快特征點檢測的速度，但是會讓特征點的提取變得不穩定*/const int SAMPLE_STEP0 = 1;int nTotalLayers = (nOctaveLayers + 2)*nOctaves; // 尺度空間的總圖像數int nMiddleLayers = nOctaveLayers*nOctaves; // 用于檢測特征點的層的總數，也就是中間層的總數 vector<Mat> dets(nTotalLayers); // 每一層圖像對應的 Hessian行列式的值vector<Mat> traces(nTotalLayers); // 每一層圖像對應的 Hessian矩陣的跡的值vector<int> sizes(nTotalLayers); // 每一層用的 Harr模板的大小vector<int> sampleSteps(nTotalLayers); // 每一層用的采樣步長 vector<int> middleIndices(nMiddleLayers); // 中間層的索引值 keypoints.clear();// 為上面的對象分配空間，并賦予合適的值int index = 0, middleIndex = 0, step = SAMPLE_STEP0;for (int octave = 0; octave < nOctaves; octave++){for (int layer = 0; layer < nOctaveLayers + 2; layer++){/*這里sum.rows - 1是因為 sum是積分圖，它的大小是原圖像大小加1*/dets[index].create((sum.rows - 1) / step, (sum.cols - 1) / step, CV_32F); // 這里面有除以遍歷圖像用的步長traces[index].create((sum.rows - 1) / step, (sum.cols - 1) / step, CV_32F);sizes[index] = (SURF_HAAR_SIZE0 + SURF_HAAR_SIZE_INC*layer) << octave;sampleSteps[index] = step;if (0 < layer && layer <= nOctaveLayers)middleIndices[middleIndex++] = index;index++;}step *= 2;}// Calculate hessian determinant and trace samples in each layerfor (int i = 0; i < nTotalLayers; i++){calcLayerDetAndTrace(sum, sizes[i], sampleSteps[i], dets[i], traces[i]);}// Find maxima in the determinant of the hessianfor (int i = 0; i < nMiddleLayers; i++){int layer = middleIndices[i];int octave = i / nOctaveLayers;findMaximaInLayer(sum, msum, dets, traces, sizes, keypoints, octave, layer, hessianThreshold, sampleSteps[layer]);}std::sort(keypoints.begin(), keypoints.end(), KeypointGreater()); }

5.2 計算Hessian矩陣的行列式與跡calcLayerDetAndTrace

這個函數首先定義了尺寸為9的第一層圖像的三個模板。模板分別為一個3×5、3×5、4×5的二維數組表示，數組的每一行表示一個黑白塊的位置參數。函數里只初始化了第一層圖像的模板參數，后面其他組其他層的Harr模板參數都是用resizeHaarPattern這個函數來計算的。這個函數返回的是一個SurfHF的結構體，這個結構體由兩個點及一個權重構成。

struct SurfHF {int p0, p1, p2, p3;float w;SurfHF() : p0(0), p1(0), p2(0), p3(0), w(0) {} };

resizeHaarPattern這個函數非常的巧妙，它把模板中的點坐標。轉換到在積分圖中的相對（模板左上角點）坐標。

static void resizeHaarPattern(const int src[][5], SurfHF* dst, int n, int oldSize, int newSize, int widthStep) {float ratio = (float)newSize / oldSize;for (int k = 0; k < n; k++){int dx1 = cvRound(ratio*src[k][0]);int dy1 = cvRound(ratio*src[k][1]);int dx2 = cvRound(ratio*src[k][2]);int dy2 = cvRound(ratio*src[k][3]);/*巧妙的坐標轉換*/dst[k].p0 = dy1*widthStep + dx1; // 轉換為一個相對距離，距離模板左上角點的在積分圖中的距離 !!important!!dst[k].p1 = dy2*widthStep + dx1; dst[k].p2 = dy1*widthStep + dx2;dst[k].p3 = dy2*widthStep + dx2;dst[k].w = src[k][4] / ((float)(dx2 - dx1)*(dy2 - dy1));// 原來的+1,+2用覆蓋的所有像素點平均。 } }

在用積分圖計算近似卷積時，用的是calcHaarPattern函數。這個函數比較簡單，只用知道左上與右下角坐標即可。

inline float calcHaarPattern(const int* origin, const SurfHF* f, int n) {/*orgin即為積分圖，n為模板中黑白塊的個數 */double d = 0;for (int k = 0; k < n; k++)d += (origin[f[k].p0] + origin[f[k].p3] - origin[f[k].p1] - origin[f[k].p2])*f[k].w;return (float)d; }

最終我們可以看到了整個calcLayerDetAndTrack的代碼

static void calcLayerDetAndTrace(const Mat& sum, int size, int sampleStep,Mat& det, Mat& trace) {const int NX = 3, NY = 3, NXY = 4;const int dx_s[NX][5] = { { 0, 2, 3, 7, 1 }, { 3, 2, 6, 7, -2 }, { 6, 2, 9, 7, 1 } };const int dy_s[NY][5] = { { 2, 0, 7, 3, 1 }, { 2, 3, 7, 6, -2 }, { 2, 6, 7, 9, 1 } };const int dxy_s[NXY][5] = { { 1, 1, 4, 4, 1 }, { 5, 1, 8, 4, -1 }, { 1, 5, 4, 8, -1 }, { 5, 5, 8, 8, 1 } };SurfHF Dx[NX], Dy[NY], Dxy[NXY];if (size > sum.rows - 1 || size > sum.cols - 1)return;resizeHaarPattern(dx_s, Dx, NX, 9, size, sum.cols);resizeHaarPattern(dy_s, Dy, NY, 9, size, sum.cols);resizeHaarPattern(dxy_s, Dxy, NXY, 9, size, sum.cols);/* The integral image 'sum' is one pixel bigger than the source image */int samples_i = 1 + (sum.rows - 1 - size) / sampleStep; // 最大能遍歷到的行坐標，因為要減掉一個模板的尺寸int samples_j = 1 + (sum.cols - 1 - size) / sampleStep; // 最大能遍歷到的列坐標/* Ignore pixels where some of the kernel is outside the image */int margin = (size / 2) / sampleStep;for (int i = 0; i < samples_i; i++){/*坐標為(i,j)的點是模板左上角的點，所以實際現在模板分析是的i+margin,j+margin點處的響應*/const int* sum_ptr = sum.ptr<int>(i*sampleStep);float* det_ptr = &det.at<float>(i + margin, margin); // 左邊空隙為 marginfloat* trace_ptr = &trace.at<float>(i + margin, margin);for (int j = 0; j < samples_j; j++){float dx = calcHaarPattern(sum_ptr, Dx, 3);float dy = calcHaarPattern(sum_ptr, Dy, 3);float dxy = calcHaarPattern(sum_ptr, Dxy, 4);sum_ptr += sampleStep;det_ptr[j] = dx*dy - 0.81f*dxy*dxy;trace_ptr[j] = dx + dy;}} }

5.3 局部最大值搜索findMaximaInLayer

這里算法思路很簡單，值得注意的是里面的一些坐標的轉換很巧妙，里面比較重的函數就是interpolateKeypoint函數，通過插值計算最大值點。

/* * Maxima location interpolation as described in "Invariant Features from * Interest Point Groups" by Matthew Brown and David Lowe. This is performed by * fitting a 3D quadratic to a set of neighbouring samples. * * The gradient vector and Hessian matrix at the initial keypoint location are * approximated using central differences. The linear system Ax = b is then * solved, where A is the Hessian, b is the negative gradient, and x is the * offset of the interpolated maxima coordinates from the initial estimate. * This is equivalent to an iteration of Netwon's optimisation algorithm. * * N9 contains the samples in the 3x3x3 neighbourhood of the maxima * dx is the sampling step in x * dy is the sampling step in y * ds is the sampling step in size * point contains the keypoint coordinates and scale to be modified * * Return value is 1 if interpolation was successful, 0 on failure. */static int interpolateKeypoint(float N9[3][9], int dx, int dy, int ds, KeyPoint& kpt) {Vec3f b(-(N9[1][5] - N9[1][3]) / 2, // Negative 1st deriv with respect to x-(N9[1][7] - N9[1][1]) / 2, // Negative 1st deriv with respect to y-(N9[2][4] - N9[0][4]) / 2); // Negative 1st deriv with respect to s Matx33f A(N9[1][3] - 2 * N9[1][4] + N9[1][5], // 2nd deriv x, x(N9[1][8] - N9[1][6] - N9[1][2] + N9[1][0]) / 4, // 2nd deriv x, y(N9[2][5] - N9[2][3] - N9[0][5] + N9[0][3]) / 4, // 2nd deriv x, s(N9[1][8] - N9[1][6] - N9[1][2] + N9[1][0]) / 4, // 2nd deriv x, yN9[1][1] - 2 * N9[1][4] + N9[1][7], // 2nd deriv y, y(N9[2][7] - N9[2][1] - N9[0][7] + N9[0][1]) / 4, // 2nd deriv y, s(N9[2][5] - N9[2][3] - N9[0][5] + N9[0][3]) / 4, // 2nd deriv x, s(N9[2][7] - N9[2][1] - N9[0][7] + N9[0][1]) / 4, // 2nd deriv y, sN9[0][4] - 2 * N9[1][4] + N9[2][4]); // 2nd deriv s, s Vec3f x = A.solve(b, DECOMP_LU);bool ok = (x[0] != 0 || x[1] != 0 || x[2] != 0) &&std::abs(x[0]) <= 1 && std::abs(x[1]) <= 1 && std::abs(x[2]) <= 1;if (ok){kpt.pt.x += x[0] * dx;kpt.pt.y += x[1] * dy;kpt.size = (float)cvRound(kpt.size + x[2] * ds);}return ok; }static void findMaximaInLayer(const Mat& sum, const Mat& mask_sum,const vector<Mat>& dets, const vector<Mat>& traces,const vector<int>& sizes, vector<KeyPoint>& keypoints,int octave, int layer, float hessianThreshold, int sampleStep) {// Wavelet Dataconst int NM = 1;const int dm[NM][5] = { { 0, 0, 9, 9, 1 } };SurfHF Dm;int size = sizes[layer];// 當前層圖像的大小int layer_rows = (sum.rows - 1) / sampleStep;int layer_cols = (sum.cols - 1) / sampleStep;// 邊界區域大小，考慮的下一層的模板大小int margin = (sizes[layer + 1] / 2) / sampleStep + 1;if (!mask_sum.empty())resizeHaarPattern(dm, &Dm, NM, 9, size, mask_sum.cols);int step = (int)(dets[layer].step / dets[layer].elemSize());for (int i = margin; i < layer_rows - margin; i++){const float* det_ptr = dets[layer].ptr<float>(i);const float* trace_ptr = traces[layer].ptr<float>(i);for (int j = margin; j < layer_cols - margin; j++){float val0 = det_ptr[j]; // 中心點的值if (val0 > hessianThreshold){// 模板左上角的坐標int sum_i = sampleStep*(i - (size / 2) / sampleStep);int sum_j = sampleStep*(j - (size / 2) / sampleStep);/* The 3x3x3 neighbouring samples around the maxima.The maxima is included at N9[1][4] */const float *det1 = &dets[layer - 1].at<float>(i, j);const float *det2 = &dets[layer].at<float>(i, j);const float *det3 = &dets[layer + 1].at<float>(i, j);float N9[3][9] = { { det1[-step - 1], det1[-step], det1[-step + 1],det1[-1], det1[0], det1[1],det1[step - 1], det1[step], det1[step + 1] },{ det2[-step - 1], det2[-step], det2[-step + 1],det2[-1], det2[0], det2[1],det2[step - 1], det2[step], det2[step + 1] },{ det3[-step - 1], det3[-step], det3[-step + 1],det3[-1], det3[0], det3[1],det3[step - 1], det3[step], det3[step + 1] } };/* Check the mask - why not just check the mask at the center of the wavelet? */if (!mask_sum.empty()){const int* mask_ptr = &mask_sum.at<int>(sum_i, sum_j);float mval = calcHaarPattern(mask_ptr, &Dm, 1);if (mval < 0.5)continue;}/* 檢測val0，是否在N9里極大值，？？為什么不檢測極小值呢*/if (val0 > N9[0][0] && val0 > N9[0][1] && val0 > N9[0][2] &&val0 > N9[0][3] && val0 > N9[0][4] && val0 > N9[0][5] &&val0 > N9[0][6] && val0 > N9[0][7] && val0 > N9[0][8] &&val0 > N9[1][0] && val0 > N9[1][1] && val0 > N9[1][2] &&val0 > N9[1][3] && val0 > N9[1][5] &&val0 > N9[1][6] && val0 > N9[1][7] && val0 > N9[1][8] &&val0 > N9[2][0] && val0 > N9[2][1] && val0 > N9[2][2] &&val0 > N9[2][3] && val0 > N9[2][4] && val0 > N9[2][5] &&val0 > N9[2][6] && val0 > N9[2][7] && val0 > N9[2][8]){/* Calculate the wavelet center coordinates for the maxima */float center_i = sum_i + (size - 1)*0.5f;float center_j = sum_j + (size - 1)*0.5f;KeyPoint kpt(center_j, center_i, (float)sizes[layer],-1, val0, octave, CV_SIGN(trace_ptr[j]));/* 局部極大值插值，用Hessian，類似于SIFT里的插值，里面沒有迭代5次，只進行了一次查找，why? */int ds = size - sizes[layer - 1];int interp_ok = interpolateKeypoint(N9, sampleStep, sampleStep, ds, kpt);/* Sometimes the interpolation step gives a negative size etc. */if (interp_ok){/*printf( "KeyPoint %f %f %d\n", point.pt.x, point.pt.y, point.size );*/keypoints.push_back(kpt);}}}}} }

6. 總結

總體來說，如果理解了SIFT算法，再來看SURF算法會發現思路非常簡單。尤其是局部最大值查找方面，基本一致。關鍵還是一個用積分圖來簡化卷積的思路，以及怎么用不同的模板來近似原來尺度空間中的高斯濾波器。

這一篇主要討論分析的是SURF的定位問題，下面還有SURF特征點的方向計算與描述子的生成，將在下一篇文章中詳細描述。

總結

以上是生活随笔為你收集整理的SURF算法与源码分析、上的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。