Statistics: Khan Academy
- Introduction to Statistics: Khan Academy
- Sample and population
- Summary of Population and Sample
- Law of Large Numbers
- Central limit theorem
- Sampling distribution
- What is a sampling distribution?
- Sampling Distribution of Sample Proportion
- Normal conditions for sampling distributions of sample proportion
Introduction to Statistics: Khan Academy
Sample and population
A sample is a part of a population, selected so that it represents the entire group. Further reading:
Sample Vs Population
Difference between sample and population
Summary of Population and Sample
Measures of central tendency of a data set: mean, median
Measures of dispersion: variance, standard deviation
| Statistic | Definition | Sample | Population |
| --- | --- | --- | --- |
| Mean | The mean of a sample or a population is computed by adding all of the observations and dividing by the number of observations. | $\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$ | $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$ |
| Variance | In a population, the variance is the average squared deviation from the population mean. In a sample, the sum of squared deviations from the sample mean is divided by $n - 1$. | $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}$ | $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$ |
| Standard Deviation | The standard deviation is the square root of the variance. | $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}}$ | $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$ |
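The sample and population formulas differ only in the divisor, which is easy to check numerically. The sketch below uses Python's standard `statistics` module; the data set is hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical observations, used only to illustrate the formulas.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)            # sum of observations / number of observations
pop_var = statistics.pvariance(data)    # population variance: divides by N
sample_var = statistics.variance(data)  # sample variance: divides by n - 1
pop_sd = statistics.pstdev(data)        # square root of the population variance
sample_sd = statistics.stdev(data)      # square root of the sample variance

print("mean:", mean)
print("population variance:", pop_var, "sample variance:", sample_var)
print("population sd:", pop_sd, "sample sd:", sample_sd)
```

Treating the same numbers as a sample always gives a slightly larger variance than treating them as a full population, because the divisor is smaller.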
An alternative formula for the variance:
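A commonly used alternative form, which is presumably the one meant here, follows from expanding the square in the population variance and using $\sum_{i=1}^{N} x_i = N\mu$:

```latex
\sigma^2
  = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
  = \frac{\sum_{i=1}^{N} x_i^2 - 2\mu \sum_{i=1}^{N} x_i + N\mu^2}{N}
  = \frac{\sum_{i=1}^{N} x_i^2}{N} - \mu^2
```

In words: the population variance is the mean of the squares minus the square of the mean.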
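Law of Large Numbers
See also an earlier post: The Law of Large Numbers and the Central Limit Theorem
Yihui Xie: The Law of Large Numbers and the Central Limit Theorem
In probability theory, the law of large numbers describes how the arithmetic mean of a sequence of random variables converges to the arithmetic mean of their expectations. When a random event is repeated a large number of times, an almost certain regularity tends to emerge, and that regularity is the law of large numbers. Informally, the theorem says that when an experiment is repeated many times under unchanged conditions, the relative frequency of a random event is close to its probability.
Formally, let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed (i.i.d.) integrable random variables with $E[X_n] = \mu$, and let $\bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$. Then $\bar{X}_n \to \mu$.
If the convergence is in probability, the result is called the weak law of large numbers; if the convergence is almost sure, it is called the strong law of large numbers.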
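The informal statement (relative frequency approaches probability) can be illustrated with a small simulation. This is a minimal sketch, assuming a fair coin with `p = 0.5`; the function name `relative_frequency` and the flip counts are illustrative choices, not part of the original notes.

```python
import random

random.seed(0)  # fixed seed so repeated runs give the same numbers

p = 0.5  # assumed probability of heads (hypothetical fair coin)

def relative_frequency(num_flips):
    """Return the fraction of heads in num_flips independent flips."""
    heads = sum(1 for _ in range(num_flips) if random.random() < p)
    return heads / num_flips

# As the number of flips grows, the relative frequency settles near p.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, relative_frequency(n))
```

For small `n` the printed frequencies can be far from 0.5; for large `n` they cluster tightly around it.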
Central limit theorem
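The central limit theorem describes the convergence of a sequence of random variables to a normal distribution.
That is, the mean of a simple random sample drawn from a population with a finite mean and variance is approximately normally distributed (as a rule of thumb, when n >= 30).
The figure shows the probability distribution of a random variable. Suppose the sample size is 4: draw samples repeatedly and compute the mean of each sample.
Plotting a frequency histogram of those means shows that as the sample size n increases, the distribution of the sample mean (itself a random variable) gets closer and closer to a normal distribution.
As the sample size $n$ tends to infinity, the frequency distribution of the means of samples of size $n$ tends to a normal distribution. Nothing is required of the shape of the original population distribution, beyond the finite mean and variance noted above: whatever the population distribution, the distribution of the sample means approaches a normal distribution as the sample size grows, as in the figure above. This normal distribution is centered on the population mean, and its variance is $\frac{\sigma^2}{n}$, where $\sigma$ is the population standard deviation. Note that the sampling must be repeated many times; a single sample of size $n$ yields one mean and cannot form a distribution.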
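The two numerical claims (the sample means center on the population mean, with variance $\sigma^2 / n$) can be checked by simulation. The sketch below assumes an exponential population with rate 1, so its mean and variance are both 1; that population, the sample sizes, and the number of repetitions are all illustrative assumptions.

```python
import random
import statistics

random.seed(1)  # fixed seed so repeated runs give the same numbers

def draw_population_value():
    # Assumed population: exponential with rate 1 (mean 1, variance 1),
    # deliberately far from normal because it is strongly right-skewed.
    return random.expovariate(1.0)

def sample_means(n, num_samples=5_000):
    """Draw num_samples samples of size n and return the mean of each."""
    return [
        statistics.fmean([draw_population_value() for _ in range(n)])
        for _ in range(num_samples)
    ]

for n in (4, 30, 100):
    means = sample_means(n)
    # Compare the simulated mean and variance with mu = 1 and sigma^2 / n = 1 / n.
    print(n, statistics.fmean(means), statistics.pvariance(means), 1.0 / n)
```

Plotting a histogram of `means` for each `n` would show the skew fading as `n` grows; the printed variances should track `1 / n`.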
The difference between the central limit theorem and the law of large numbers
Sampling distribution
What is a sampling distribution?
What is the distribution of the values that we could get for the statistic?
With what frequency do we get each possible value of the statistic that is trying to estimate the parameter?
That distribution is the sampling distribution.
A sampling distribution for the sample mean with sample size of 2
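For a small population, this sampling distribution can be enumerated exactly instead of simulated. The sketch below assumes a hypothetical population of three equally likely values and ordered sampling with replacement; both are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 3]  # hypothetical population, each value equally likely
n = 2                   # sample size

# Every ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Count how often each sample mean occurs, then convert counts to probabilities.
counts = Counter(Fraction(sum(s), n) for s in samples)
sampling_distribution = {
    mean: Fraction(count, len(samples)) for mean, count in sorted(counts.items())
}

for mean, prob in sampling_distribution.items():
    print(float(mean), prob)
```

The resulting distribution is symmetric around the population mean even though each individual sample mean may miss it.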
Sampling Distribution of Sample Proportion
Draw balls from a bucket in which the proportion of yellow balls is p = 0.6.
p=0.6:
p=0.1:
p=0.9:
sample size n=10:
sample size n=50 (tighter distribution):
The larger the sample size, the smaller the standard deviation of the sampling distribution.
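The effect of sample size on the spread can be checked against the standard formula for the standard deviation of a sample proportion, $\sqrt{p(1-p)/n}$. The sketch below reuses p = 0.6 and the sample sizes 10 and 50 from above; the helper names and the number of repetitions are illustrative assumptions.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so repeated runs give the same numbers

p = 0.6  # proportion of yellow balls in the bucket

def sample_proportion(n):
    """Draw n balls with replacement and return the fraction that are yellow."""
    return sum(random.random() < p for _ in range(n)) / n

def simulated_sd(n, num_samples=10_000):
    """Standard deviation of the sample proportion over many repeated samples."""
    return statistics.pstdev([sample_proportion(n) for _ in range(num_samples)])

for n in (10, 50):
    theoretical = math.sqrt(p * (1 - p) / n)
    print(n, simulated_sd(n), theoretical)
```

Multiplying the sample size by 5 divides the standard deviation by about $\sqrt{5}$, which is why the n = 50 histogram is visibly tighter.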
Normal conditions for sampling distributions of sample proportion
Under which conditions does the sampling distribution of the sample proportion look roughly normal, right-skewed, or left-skewed?
The mean of the sampling distribution of the sample proportion is the same as the population proportion p.
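A widely taught rule of thumb, assumed here rather than stated in these notes, is that the distribution looks roughly normal when both the expected number of successes `n * p` and the expected number of failures `n * (1 - p)` are at least 10. When the rule fails, a small p pushes the distribution against 0 (right-skewed) and a large p pushes it against 1 (left-skewed). The function name and the threshold parameter below are illustrative.

```python
def shape_of_sampling_distribution(n, p, threshold=10):
    """Classify the rough shape of the sampling distribution of a sample proportion.

    Uses the rule of thumb n*p >= threshold and n*(1-p) >= threshold
    (an assumption of this sketch, not a strict theorem).
    """
    eps = 1e-9  # tolerance for floating-point rounding at the boundary
    expected_successes = n * p
    expected_failures = n * (1 - p)
    if expected_successes >= threshold - eps and expected_failures >= threshold - eps:
        return "roughly normal"
    if p < 0.5:
        return "right-skewed"  # values pile up near 0, long tail to the right
    if p > 0.5:
        return "left-skewed"   # values pile up near 1, long tail to the left
    return "symmetric, but sample too small for the rule of thumb"

for n, p in [(50, 0.6), (50, 0.1), (50, 0.9), (10, 0.6)]:
    print(n, p, shape_of_sampling_distribution(n, p))
```

The cases correspond to the p = 0.6, p = 0.1, and p = 0.9 panels above: only the first satisfies both conditions at n = 50.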