Coefficient of Determination Derivation
After fitting a linear regression and obtaining the parameters, the goodness of fit of the regression function can be assessed with the coefficient of determination $R^2$, defined as:

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

where $SSR = \sum\limits_{i=1}^m (\hat y_i - \bar y)^2$ is the regression sum of squares, $SSE = \sum\limits_{i=1}^m (y_i - \hat y_i)^2$ is the error (residual) sum of squares, and $SST = \sum\limits_{i=1}^m (y_i - \bar y)^2$ is the total sum of squares over the $m$ samples. The closer $R^2$ is to 1, the better the fit. The two expressions for $R^2$ agree because $SST = SSR + SSE$, i.e.:

$$\sum\limits_{i=1}^m (y_i - \bar y)^2 = \sum\limits_{i=1}^m (\hat y_i - \bar y)^2 + \sum\limits_{i=1}^m (y_i - \hat y_i)^2$$
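The definitions above translate directly into code. The following sketch fits a least-squares line to some hypothetical data (the values are illustrative, not from the text) and checks that the two expressions for $R^2$ coincide:

```python
import numpy as np

# Hypothetical data: a noisy linear trend (illustrative values only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Fit a simple least-squares line y_hat = theta0 + theta1 * x
theta1, theta0 = np.polyfit(x, y, 1)
y_hat = theta0 + theta1 * x
y_bar = y.mean()

SST = np.sum((y - y_bar) ** 2)      # total sum of squares
SSR = np.sum((y_hat - y_bar) ** 2)  # regression sum of squares
SSE = np.sum((y - y_hat) ** 2)      # error (residual) sum of squares

r2 = 1 - SSE / SST
print(r2, SSR / SST)  # the two expressions for R^2 agree
```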
To understand $R^2$, it helps to first recall the general form of linear regression:

$$\begin{cases} \hat y_i = f(x_i) = \theta_0 + \sum\limits_{j=1}^n \theta_j x_i^j \\ y_i = \hat y_i + \epsilon_i \end{cases}$$
Here each observation $y_i$ decomposes into the fitted value $\hat y_i$ and a residual $\epsilon_i$, and $\hat y_i$ varies with $x_i$. Letting $x_i^0 = 1$, the model $\hat y_i = \theta_0 + \sum\limits_{j=1}^n \theta_j x_i^j$ can be written compactly as $\hat y_i = \theta^T x_i$. In matrix form:

$$\begin{cases}
\begin{bmatrix} 1 & x_1^1 & x_1^2 & \dots & x_1^n \\ 1 & x_2^1 & x_2^2 & \dots & x_2^n \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_m^1 & x_m^2 & \dots & x_m^n \end{bmatrix}
\begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{bmatrix} =
\begin{bmatrix} \hat y_1 \\ \hat y_2 \\ \vdots \\ \hat y_m \end{bmatrix} \\[2ex]
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} =
\begin{bmatrix} \hat y_1 \\ \hat y_2 \\ \vdots \\ \hat y_m \end{bmatrix} +
\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_m \end{bmatrix}
\end{cases}$$
When $\theta \neq \mathbf 0$, $\hat Y$ is a linear combination of the columns of $X$, i.e. $\hat Y$ lies in the column space of $X$. For first-order linear regression, the column space of $X$ is a hyperplane, and $\hat Y$ is a vector lying in that plane (the projection of $Y$ onto it). To minimize the residual, $\epsilon$ must be the component of $Y$ perpendicular to the plane. The three-dimensional geometric picture is shown in the figure below ($\theta$ in this article corresponds to $\beta$ in the figure, and $X_i$ denotes a column vector of $X$; the figure is reproduced from an external source).
因為?\epsilon?垂直于XXX的列空間,所以?\epsilon?垂直于XXX的所有列向量,即XT?=0X^T \epsilon = \mathbf 0XT?=0。又因?=Y?Xθ\epsilon = Y - X\theta?=Y?Xθ,得:
XT(Y?Xθ)=0XTY=XTXθθ=(XTX)?1XTYY^=Xθ=X(XTX)?1XTYX^T(Y - X\theta) = \mathbf 0 \\ X^TY = X^TX\theta \\ \theta = (X^TX)^{-1}X^TY \\ \hat Y = X\theta = X(X^TX)^{-1}X^TY XT(Y?Xθ)=0XTY=XTXθθ=(XTX)?1XTYY^=Xθ=X(XTX)?1XTY
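As a sanity check, the normal-equation solution above can be computed directly and compared against NumPy's least-squares solver. The design matrix and coefficients below are hypothetical, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: m = 50 samples, n = 3 features, plus an intercept column of ones
m, n = 50, 3
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, n))])
true_theta = np.array([1.0, 2.0, -1.0, 0.5])
Y = X @ true_theta + rng.normal(scale=0.1, size=m)

# Normal equations: theta = (X^T X)^{-1} X^T Y
# (solve is preferred over explicitly inverting X^T X)
theta = np.linalg.solve(X.T @ X, X.T @ Y)

# Cross-check against NumPy's least-squares solver
theta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(theta, theta_lstsq))
```

Using `np.linalg.solve` rather than forming $(X^TX)^{-1}$ explicitly is the standard numerically safer choice; both implement the same normal equations.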
From $\hat Y = X\theta = X(X^TX)^{-1}X^TY$ we obtain the projection matrix $P = X(X^TX)^{-1}X^T$. Since $\hat Y = PY$, multiplying $Y$ by $P$ yields $\hat Y$, the component of $Y$ lying in the column space of $X$. Two properties of the projection matrix are worth noting:
$$P^T = (X(X^TX)^{-1}X^T)^T = X((X^TX)^{-1})^TX^T = X((X^TX)^T)^{-1}X^T = X(X^TX)^{-1}X^T = P$$
$$P^2 = P^TP = X(X^TX)^{-1}\underbrace{X^TX(X^TX)^{-1}}_{=\,I}X^T = X(X^TX)^{-1}X^T = P$$
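Both properties, symmetry ($P^T = P$) and idempotence ($P^2 = P$), can be verified numerically on a small hypothetical design matrix:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical design matrix with an intercept column
X = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 2))])

# Projection (hat) matrix P = X (X^T X)^{-1} X^T
P = X @ np.linalg.inv(X.T @ X) @ X.T

sym = np.allclose(P, P.T)      # P^T = P  (symmetry)
idem = np.allclose(P @ P, P)   # P^2 = P  (idempotence: projecting twice changes nothing)
print(sym, idem)
```

Idempotence has a clear geometric reading: once a vector has been projected into the column space, projecting it again leaves it unchanged.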
We can now derive the identity $SST = SSR + SSE$ (with $\mathbf 1 \in \mathbb R^m$ the all-ones vector):
$$\begin{aligned}
SST &= \sum\limits_{i=1}^m (y_i - \bar y)^2 = \sum\limits_{i=1}^m [(y_i - \hat y_i) + (\hat y_i - \bar y)]^2 \\
&= \sum\limits_{i=1}^m (\hat y_i - \bar y)^2 + \sum\limits_{i=1}^m (y_i - \hat y_i)^2 + 2\sum\limits_{i=1}^m (y_i - \hat y_i)(\hat y_i - \bar y) \\
&= \sum\limits_{i=1}^m (\hat y_i - \bar y)^2 + \sum\limits_{i=1}^m (y_i - \hat y_i)^2 + 2\epsilon^T(\hat Y - \bar y\,\mathbf 1) \\
&= \sum\limits_{i=1}^m (\hat y_i - \bar y)^2 + \sum\limits_{i=1}^m (y_i - \hat y_i)^2 + 2\epsilon^T(PY - \bar y\,\mathbf 1) \\
&= \sum\limits_{i=1}^m (\hat y_i - \bar y)^2 + \sum\limits_{i=1}^m (y_i - \hat y_i)^2 + 2\epsilon^T\hat Y - 2\bar y\,\epsilon^T\mathbf 1
\end{aligned}$$
因為?\epsilon?垂直于XXX的列空間,且Y^\hat YY^屬于XXX的列空間,所以?TY^=0\epsilon^T \hat Y = 0?TY^=0;又因為1=xi0∈Rm\mathbf 1 = x_i^0 \in R^m1=xi0?∈Rm(1\mathbf 11屬于XXX的列空間),所以?T1=0\epsilon^T \mathbf 1 = 0?T1=0。因此:
$$SST = \sum\limits_{i=1}^m (\hat y_i - \bar y)^2 + \sum\limits_{i=1}^m (y_i - \hat y_i)^2 + \underbrace{2\epsilon^T\hat Y - 2\bar y\,\epsilon^T\mathbf 1}_{=\,0} = SSR + SSE$$
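The whole derivation can be verified end to end: fit a regression on hypothetical data (an intercept column included, since that is what makes $\epsilon^T\mathbf 1 = 0$ hold), then check that both cross terms vanish and that $SST = SSR + SSE$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data; the intercept column of ones is essential for eps^T 1 = 0
m = 100
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, 2))])
Y = X @ np.array([0.5, 1.5, -2.0]) + rng.normal(size=m)

theta = np.linalg.solve(X.T @ X, X.T @ Y)  # normal equations
Y_hat = X @ theta
eps = Y - Y_hat
y_bar = Y.mean()

SST = np.sum((Y - y_bar) ** 2)
SSR = np.sum((Y_hat - y_bar) ** 2)
SSE = np.sum(eps ** 2)

# Cross terms vanish: eps is orthogonal to the column space of X
print(abs(eps @ Y_hat) < 1e-6, abs(eps.sum()) < 1e-6)
print(np.isclose(SST, SSR + SSE))
```

Note that without the intercept column, $\epsilon^T\mathbf 1 = 0$ generally fails and the decomposition $SST = SSR + SSE$ no longer holds, which is why $R^2$ is only meaningful for models fitted with an intercept.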