The proof that the chi-square statistic follows a chi-square distribution
The chi-square test (the principle used in C4.5's CVP pruning),
also called the chi-square statistic,
also called the chi-square goodness-of-fit test.
Here is the $r\times s$ contingency table; the notation used below is:
- $X_{ij}$: the observed count in cell $(i,j)$, for $1\le i\le r$, $1\le j\le s$
- $N_{i\cdot}=\sum_{j} X_{ij}$: the $i$-th row total
- $N_{\cdot j}=\sum_{i} X_{ij}$: the $j$-th column total
- $n=\sum_{i}\sum_{j} X_{ij}$: the grand total
The target is to prove:
$$\sum_{i=1}^{r}\sum_{j=1}^{s}\frac{\left[X_{ij}-N_{i\cdot}\frac{N_{\cdot j}}{n}\right]^2}{N_{i\cdot}\frac{N_{\cdot j}}{n}}\sim\chi^2\left[(r-1)(s-1)\right]\qquad ①$$
Note:
- the left-hand side of ① is a statistic built from discrete counts;
- the right-hand side is a continuous distribution;
so the claim is asymptotic (it holds in the large-sample limit).
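As a quick numerical sanity check of ① (a minimal sketch, not from the original post: the 2×3 table below is made up, and `scipy.stats.chi2_contingency` is used only for comparison):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x3 contingency table (r = 2 rows, s = 3 columns); made-up counts
X = np.array([[20, 30, 25],
              [35, 25, 40]], dtype=float)

n = X.sum()                              # grand total
N_row = X.sum(axis=1, keepdims=True)     # row totals N_{i.}
N_col = X.sum(axis=0, keepdims=True)     # column totals N_{.j}

expected = N_row * (N_col / n)           # N_{i.} * (N_{.j} / n)
stat = ((X - expected) ** 2 / expected).sum()

# The same statistic from SciPy (Yates continuity correction disabled)
chi2, p, dof, exp = chi2_contingency(X, correction=False)

print(stat, chi2)   # the two values agree
print(dof)          # (r - 1) * (s - 1) = 2
```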
----------------------------------------------
Let's review the multivariate normal distribution. According to [1]:
$$X\sim N(\mu,\Sigma)$$
$$\mu=\left[E[X_1],E[X_2],\cdots,E[X_s]\right]^T$$
$$\Sigma:=\left[\operatorname{Cov}[X_i,X_j]\right],\quad 1\le i,j\le s$$
-----------------------------------------------------------------------------------------------
$$\sum_{j=1}^{s}\frac{\left[X_{ij}-N_{i\cdot}\frac{N_{\cdot j}}{n}\right]^2}{N_{i\cdot}\frac{N_{\cdot j}}{n}}
= N_{i\cdot}\sum_{j=1}^{s}\frac{\left[\frac{X_{ij}}{N_{i\cdot}}-\frac{N_{\cdot j}}{n}\right]^2}{\frac{N_{\cdot j}}{n}}$$
$$= N_{i\cdot}\left\{\sum_{j=1}^{s-1}\frac{\left[\frac{X_{ij}}{N_{i\cdot}}-\frac{N_{\cdot j}}{n}\right]^2}{\frac{N_{\cdot j}}{n}}
+\frac{\left[\frac{X_{is}}{N_{i\cdot}}-\frac{N_{\cdot s}}{n}\right]^2}{\frac{N_{\cdot s}}{n}}\right\}$$
Since $\sum_{j=1}^{s}\left(\frac{X_{ij}}{N_{i\cdot}}-\frac{N_{\cdot j}}{n}\right)=0$, the $s$-th deviation equals minus the sum of the first $s-1$ deviations, so
$$= N_{i\cdot}\left\{\sum_{j=1}^{s-1}\frac{\left[\frac{X_{ij}}{N_{i\cdot}}-\frac{N_{\cdot j}}{n}\right]^2}{\frac{N_{\cdot j}}{n}}
+\frac{\left[\sum_{j=1}^{s-1}\left(\frac{X_{ij}}{N_{i\cdot}}-\frac{N_{\cdot j}}{n}\right)\right]^2}{\frac{N_{\cdot s}}{n}}\right\}$$
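The folding of the $s$-th term uses only the fact that the deviations sum to zero. A minimal numerical check of this identity for a single row (the counts and proportions below are hypothetical):

```python
import numpy as np

# One row of a contingency table (fixed i) and the column proportions p_j = N_{.j}/n
x_row = np.array([20., 30., 25.])   # X_{i1}, ..., X_{is}
N_i = x_row.sum()                   # N_{i.}
p = np.array([0.30, 0.35, 0.35])    # hypothetical p_j, summing to 1

# Form on the first line: sum over all s columns
lhs = (((x_row - N_i * p) ** 2) / (N_i * p)).sum()

# Rewritten form: first s-1 columns plus the folded last term
d = x_row / N_i - p                 # deviations; they sum to 0
rhs = N_i * ((d[:-1] ** 2 / p[:-1]).sum() + d[:-1].sum() ** 2 / p[-1])

print(lhs, rhs)   # identical up to floating-point rounding
```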
Let’s set
$$p^*=\left(\frac{N_{\cdot 1}}{n},\dots,\frac{N_{\cdot (s-1)}}{n}\right)^T$$
$$\overline{X}^*=\left(\frac{X_{i1}}{N_{i\cdot}},\cdots,\frac{X_{i(s-1)}}{N_{i\cdot}}\right)^T$$
So,
$$N_{i\cdot}\sum_{j=1}^{s}\frac{\left[\frac{X_{ij}}{N_{i\cdot}}-\frac{N_{\cdot j}}{n}\right]^2}{\frac{N_{\cdot j}}{n}}
= N_{i\cdot}\,(\overline{X}^*-p^*)^T(\Sigma^*)^{-1}(\overline{X}^*-p^*)$$
where, writing $p_j=\frac{N_{\cdot j}}{n}$,
$$\Sigma^*=\begin{bmatrix} p_1 & 0 & \cdots & 0 \\ 0 & p_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & p_{s-1} \end{bmatrix}
-\begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{s-1} \end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{s-1} \end{bmatrix}^T$$
According to the Sherman–Morrison formula:
$$(\Sigma^*)^{-1}=\begin{bmatrix} \frac{1}{p_1} & 0 & \cdots & 0 \\ 0 & \frac{1}{p_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{p_{s-1}} \end{bmatrix}
+\frac{1}{p_s}\begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1 \end{bmatrix}$$
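Here $\Sigma^*$ is the covariance matrix of one multinomial trial restricted to its first $s-1$ categories, and the displayed inverse is its standard closed form. A short numpy check (the probability vector is hypothetical):

```python
import numpy as np

p = np.array([0.2, 0.3, 0.15, 0.35])   # hypothetical p_1 ... p_s, summing to 1
p_head, p_s = p[:-1], p[-1]            # first s-1 components and p_s

sigma_star = np.diag(p_head) - np.outer(p_head, p_head)   # Sigma*

# Closed-form inverse: diag(1/p_j) + (1/p_s) * (matrix of ones)
inv_closed = np.diag(1.0 / p_head) + np.ones((p_head.size, p_head.size)) / p_s

print(np.allclose(np.linalg.inv(sigma_star), inv_closed))  # True
```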
Let's set
$$Y_i=\sqrt{N_{i\cdot}}\,(\Sigma^*)^{-1/2}(\overline{X}^*-p^*)\qquad ②$$
According to [3]:
------------------------the following are from wikipedia-------------------------------
$$\begin{bmatrix}X_{1(1)}\\\vdots \\X_{1(k)}\end{bmatrix}+\begin{bmatrix}X_{2(1)}\\\vdots \\X_{2(k)}\end{bmatrix}+\cdots +\begin{bmatrix}X_{n(1)}\\\vdots \\X_{n(k)}\end{bmatrix}=\begin{bmatrix}\sum_{i=1}^{n}X_{i(1)}\\\vdots \\\sum_{i=1}^{n}X_{i(k)}\end{bmatrix}=\sum_{i=1}^{n}\mathbf{X}_{i}$$
and the average is
$$\frac{1}{n}\sum_{i=1}^{n}\mathbf{X}_{i}=\frac{1}{n}\begin{bmatrix}\sum_{i=1}^{n}X_{i(1)}\\\vdots \\\sum_{i=1}^{n}X_{i(k)}\end{bmatrix}=\begin{bmatrix}\bar{X}_{(1)}\\\vdots \\\bar{X}_{(k)}\end{bmatrix}=\overline{\mathbf{X}}_{n}$$
and therefore
$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left[\mathbf{X}_{i}-\operatorname{E}(X_{i})\right]=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}(\mathbf{X}_{i}-\boldsymbol{\mu})=\sqrt{n}\left(\overline{\mathbf{X}}_{n}-\boldsymbol{\mu}\right).$$
The multivariate central limit theorem states that
$$\sqrt{n}\left(\overline{\mathbf{X}}_{n}-\boldsymbol{\mu}\right)\ \xrightarrow{D}\ N_{k}(\mathbf{0},\boldsymbol{\Sigma})$$
------------------------the above are from wikipedia-------------------------------
So, for ②, applying the multivariate CLT with $\mu=p^*$ and $\boldsymbol{\Sigma}=\Sigma^*$, we get (asymptotically, as $N_{i\cdot}\to\infty$)
$$Y_i\sim N_{s-1}(\mathbf{0},I_{s-1})\qquad ③$$
where
$$\mathbf{0}=[0,0,\dots,0]^T$$
and $I_{s-1}$ denotes the $(s-1)\times(s-1)$ identity matrix.
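A small Monte Carlo sketch of ② → ③ for a single row $i$, treating the column proportions as known (all parameters below are hypothetical): draw the row from a multinomial, form $Y_i$, and check that its sample mean and covariance are close to $\mathbf{0}$ and $I_{s-1}$.

```python
import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.2, 0.3, 0.15, 0.35])   # hypothetical column proportions (treated as known)
N_i = 2000                             # row total N_{i.}
reps = 20000                           # number of simulated rows

p_head, p_s = p[:-1], p[-1]
sigma_star = np.diag(p_head) - np.outer(p_head, p_head)

# (Sigma*)^{-1/2} via the symmetric eigendecomposition
eigval, eigvec = np.linalg.eigh(sigma_star)
sigma_inv_sqrt = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T

counts = rng.multinomial(N_i, p, size=reps)            # simulated rows X_{i1..is}
x_bar = counts[:, :-1] / N_i                           # \bar{X}^* for each replication
Y = np.sqrt(N_i) * (x_bar - p_head) @ sigma_inv_sqrt   # Y_i per replication (matrix is symmetric)

print(Y.mean(axis=0))   # close to the zero vector
print(np.cov(Y.T))      # close to the (s-1)x(s-1) identity
```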
Then, for ①,
$$\sum_{i=1}^{r}\sum_{j=1}^{s}\frac{\left[X_{ij}-N_{i\cdot}\frac{N_{\cdot j}}{n}\right]^2}{N_{i\cdot}\frac{N_{\cdot j}}{n}}=\sum_{i=1}^{r}Y_i^T Y_i$$
Because of ③,
$$\sum_{i=1}^{r}Y_i^T Y_i\sim\chi^2\left[(s-1)(r-1)\right]$$
(Strictly, $r$ independent $\chi^2(s-1)$ terms would sum to $\chi^2[r(s-1)]$; the reduction to $(r-1)(s-1)$ degrees of freedom comes from the column proportions $\frac{N_{\cdot j}}{n}$ being estimated from the same table, which imposes $s-1$ linear constraints and makes the row terms dependent; see [2], [6], [7] for rigorous treatments.)
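Finally, a Monte Carlo sketch of the full claim ① (hypothetical margins, with independence holding by construction): simulate many tables, compute the statistic, and compare its empirical distribution with $\chi^2[(r-1)(s-1)]$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
row_p = np.array([0.4, 0.6])             # hypothetical row probabilities (r = 2)
col_p = np.array([0.2, 0.3, 0.5])        # hypothetical column probabilities (s = 3)
cell_p = np.outer(row_p, col_p).ravel()  # independence: p_ij = p_i * p_j
n, reps = 500, 5000

chi2_vals = np.empty(reps)
for k in range(reps):
    X = rng.multinomial(n, cell_p).reshape(2, 3).astype(float)
    expected = X.sum(axis=1, keepdims=True) * X.sum(axis=0, keepdims=True) / n
    chi2_vals[k] = ((X - expected) ** 2 / expected).sum()

dof = (2 - 1) * (3 - 1)
print(chi2_vals.mean(), dof)                            # sample mean is close to dof
print(stats.kstest(chi2_vals, "chi2", args=(dof,)))     # chi^2(dof) is not rejected
```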
The chi-square statistic was introduced by Pearson [8].
References:
[1] https://en.wikipedia.org/wiki/Multivariate_normal_distribution
[2] "Seven different proofs for the Pearson independence test"
[3] https://en.wikipedia.org/wiki/Central_limit_theorem
[4] https://ocw.mit.edu/courses/mathematics/18-443-statistics-for-applications-fall-2003/lecture-notes/lec23.pdf
[5] https://arxiv.org/pdf/1808.09171.pdf
[6] https://www.math.utah.edu/~davar/ps-pdf-files/Chisquared.pdf
[7] http://personal.psu.edu/drh20/asymp/fall2006/lectures/ANGELchpt07.pdf
[8] https://download.csdn.net/download/appleyuchi/10834144