Python econometrics functions: is there a tutorial on econometrics with Python?
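Multiple linear regression
Model assumption:
Suppose that in 2013 the relationship between per-capita cash consumption expenditure ($Y$) in each region of China and wage income ($X_1$) and other income ($X_2$) is:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \mu$
The regression is estimated with Python's statsmodels library: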
import statsmodels.api as sm
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import model_selection
# Use a Chinese font so the Chinese column names render in the plots
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Load the regional data and fit Y on X1 and X2 by ordinary least squares
data = pd.read_excel(r'./計量經濟學數據.xlsx', sheet_name='Sheet1')
fit = sm.formula.ols(formula='現金消費支出Y ~ 工資性收入X1 + 其他收入X2', data=data).fit()
print(fit.summary())
# Scatter plot of Y against X1 with a fitted line (no confidence band)
sns.lmplot(x='工資性收入X1', y='現金消費支出Y', data=data, ci=None)
plt.show()
# Pairwise scatter plots of the three variables
sns.pairplot(data.loc[:, ['現金消費支出Y', '工資性收入X1', '其他收入X2']])
# Show the figure
plt.show()
                            OLS Regression Results
==============================================================================
Dep. Variable:             現金消費支出Y   R-squared:                       0.922
Model:                            OLS   Adj. R-squared:                  0.917
Method:                 Least Squares   F-statistic:                     166.6
Date:                Sun, 26 May 2019   Prob (F-statistic):           2.84e-16
Time:                        13:43:41   Log-Likelihood:                -260.68
No. Observations:                  31   AIC:                             527.4
Df Residuals:                      28   BIC:                             531.7
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept   2599.1455    827.342      3.142      0.004     904.412    4293.879
工資性收入X1     0.4865      0.058      8.448      0.000       0.369       0.604
其他收入X2      0.6017      0.104      5.772      0.000       0.388       0.815
==============================================================================
Omnibus:                        1.082   Durbin-Watson:                   1.915
Prob(Omnibus):                  0.582   Jarque-Bera (JB):                0.556
Skew:                           0.327   Prob(JB):                        0.757
Kurtosis:                       3.064   Cond. No.                     8.50e+04
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 8.5e+04. This might indicate that there are
strong multicollinearity or other numerical problems.
Model tests:
$H_0: \beta_j = 0 \quad (j = 1, \dots, k)$
$H_1$: the $\beta_j$ are not all zero
Goodness of fit:
The regression fits the data well, with coefficient of determination $R^2 = 0.922$.
F-test:
The F statistic is 166.6. From the table, the critical value is $F_{\alpha}(k, n-k-1) = F_{0.05}(2, 28) = 3.34$, where $k = 2$ and $n = 31$. Since $F > F_{\alpha}(k, n-k-1)$, the linear relationship is jointly significant at the 5% level, so the null hypothesis is rejected.
t-test:
$|t_1| = 8.448,\ |t_2| = 5.772,\ t_{\alpha/2}(n-k-1) = t_{0.025}(28) = 2.048$
Since $|t_j| > t_{\alpha/2}(n-k-1)$ for both coefficients, the null hypothesis $\beta_j = 0$ is rejected for each of them.
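The critical values used in the F-test and t-test can also be computed instead of looked up in a table. The sketch below assumes SciPy is installed (the original code does not import it) and plugs in the statistics reported in the summary; the variable names are illustrative only.

```python
from scipy import stats

alpha = 0.05
k, n = 2, 31              # regressors and observations, from the summary
df_resid = n - k - 1      # 28 residual degrees of freedom

# Upper-tail critical values at the 5% level
f_crit = stats.f.ppf(1 - alpha, k, df_resid)
t_crit = stats.t.ppf(1 - alpha / 2, df_resid)

# Statistics reported by statsmodels for this regression
f_stat = 166.6
t_stats = [8.448, 5.772]

reject_f = f_stat > f_crit
reject_t = [abs(t) > t_crit for t in t_stats]
print(round(f_crit, 2), round(t_crit, 3), reject_f, reject_t)
```

The same two calls work for the later models once `k` and `n` are changed to match each summary.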
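In summary, the estimated relationship between per-capita cash consumption expenditure and wage income and other income across China's regions in 2013 is:
$Y = 2599.1455 + 0.4865 X_1 + 0.6017 X_2$
$\beta_1 < \beta_2$: the estimated coefficient on other income (0.6017) is larger than the one on wage income (0.4865).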
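Predicting the dependent variable
After a model has been built and tested, it is sometimes also necessary to compare the actual and predicted values to judge how usable the model is.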
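# Reload the data and hold out 20% of the regions as a test set
data4 = pd.read_excel(r'./計量經濟學數據.xlsx', sheet_name='Sheet1')
train, test = model_selection.train_test_split(data4, test_size=0.2, random_state=1234)
# Fit the same model on the training set only
fit4 = sm.formula.ols(formula='現金消費支出Y ~ 工資性收入X1 + 其他收入X2', data=train).fit()
# Drop the dependent variable, then predict it for the held-out regions
test_X = test.drop(labels='現金消費支出Y', axis=1)
pred = fit4.predict(exog=test_X)
print('Predicted vs. actual values:\n', pd.DataFrame({'prediction': pred, 'real': test.現金消費支出Y}))
Predicted vs. actual values:
       prediction     real
7    13874.648201  14161.7
10   25068.272118  23257.2
4    16645.508042  19249.1
1    21539.239415  21711.9
29   15077.077324  15321.1
8    28477.482744  28155.0
3    15073.999588  13166.2
Comparing predicted and actual values, some predictions are noticeably off (rows 4 and 3 miss by roughly 2,600 and 1,900), but overall the predictions are fairly close to the actual values, which to some extent supports the usability of the model.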
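To put a number on "fairly close", one option is to summarize the prediction errors with a few standard metrics. The sketch below copies the seven printed prediction/actual pairs by hand; the choice of MAE, RMSE and MAPE is an assumption of this example, not something the original analysis reports.

```python
# Predicted and actual values copied from the printed comparison
prediction = [13874.648201, 25068.272118, 16645.508042, 21539.239415,
              15077.077324, 28477.482744, 15073.999588]
real = [14161.7, 23257.2, 19249.1, 21711.9, 15321.1, 28155.0, 13166.2]

errors = [p - r for p, r in zip(prediction, real)]

# Mean absolute error, root mean squared error, mean absolute percentage error
mae = sum(abs(e) for e in errors) / len(errors)
rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
mape = 100 * sum(abs(e) / r for e, r in zip(errors, real)) / len(errors)

print(f"MAE={mae:.1f}  RMSE={rmse:.1f}  MAPE={mape:.1f}%")
```

On these seven rows the mean absolute error is roughly 1,050 and the mean absolute percentage error is about 6%. With only seven held-out observations these figures are a rough indication at best.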
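A nonlinear model that can be linearized
Model assumption:
In the Cobb-Douglas production function $Y = A K^{\beta_1} L^{\beta_2}$, $A$ represents the given level of technology, and $\beta_1$ and $\beta_2$ are the output elasticities of capital and labor. When $\beta_1 + \beta_2 = 1$ there are constant returns to scale; a sum greater than or less than 1 indicates increasing or decreasing returns to scale. For ease of comparison, the model is linearized by taking logarithms of both sides, so the relationship between total output and factor inputs across China's manufacturing industries in 2010 is assumed to be:
$\ln Y = \beta_0 + \beta_1 \ln K + \beta_2 \ln L + \mu$, where $\beta_0 = \ln A$.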
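# Industry-level data: gross industrial output, capital input, average annual employment
data2 = pd.read_excel(r'./計量經濟學數據.xlsx', sheet_name='Sheet2')
# Regress log output on log capital and log labor
fit2 = sm.formula.ols(formula='np.log(工業總產值) ~ np.log(資本投入) + np.log(年均從業人員)', data=data2).fit()
# Pairwise scatter plots of the untransformed variables
sns.pairplot(data2.loc[:, ['工業總產值', '資本投入', '年均從業人員']])
print(fit2.summary())
plt.show()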
                            OLS Regression Results
==============================================================================
Dep. Variable:          np.log(工業總產值)   R-squared:                       0.941
Model:                            OLS   Adj. R-squared:                  0.938
Method:                 Least Squares   F-statistic:                     286.3
Date:                Sun, 26 May 2019   Prob (F-statistic):           7.86e-23
Time:                        13:43:42   Log-Likelihood:                -12.793
No. Observations:                  39   AIC:                             31.59
Df Residuals:                      36   BIC:                             36.58
Df Model:                           2
Covariance Type:            nonrobust
==================================================================================
                      coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
Intercept           1.8003      0.401      4.493      0.000       0.988       2.613
np.log(資本投入)       0.6778      0.081      8.344      0.000       0.513       0.843
np.log(年均從業人員)     0.2911      0.086      3.395      0.002       0.117       0.465
==============================================================================
Omnibus:                       37.173   Durbin-Watson:                   1.263
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              165.957
Skew:                          -2.018   Prob(JB):                     9.18e-37
Kurtosis:                      12.264   Cond. No.                         75.3
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Model tests:
$H_0: \beta_j = 0 \quad (j = 1, \dots, k)$
$H_1$: the $\beta_j$ are not all zero
Goodness of fit:
The regression fits the data well, with coefficient of determination $R^2 = 0.941$.
F-test:
The F statistic is 286.3. From the table, the critical value is $F_{\alpha}(k, n-k-1) = F_{0.05}(2, 36) = 3.26$, where $k = 2$ and $n = 39$. Since $F > F_{\alpha}(k, n-k-1)$, the linear relationship is jointly significant at the 5% level, so the null hypothesis is rejected.
t-test:
$|t_1| = 8.344,\ |t_2| = 3.395,\ t_{\alpha/2}(n-k-1) = t_{0.025}(36) = 2.028$
Since $|t_j| > t_{\alpha/2}(n-k-1)$ for both coefficients, the null hypothesis $\beta_j = 0$ is rejected for each of them.
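In summary, the estimated relationship between total output and factor inputs across China's manufacturing industries in 2010 is:
$\ln Y = 1.8003 + 0.6778 \ln K + 0.2911 \ln L$, with $0.6778 + 0.2911 = 0.9689$.
These results indicate that in 2010 the output elasticity of capital in Chinese industry was 0.6778: with other factors unchanged, a 1% increase in capital raises total output by 0.6778%. Likewise, with other factors unchanged, a 1% increase in labor input raises total output by 0.2911%. Capital input therefore contributed more to the growth of total industrial output. The two elasticities sum to slightly less than 1, which points toward mildly decreasing returns to scale, although the restriction $\beta_1 + \beta_2 = 1$ is not formally tested here.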
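Dummy variables
Some variables in a dataset cannot be measured quantitatively, yet they often have a large effect on the model's results, so they cannot simply be dropped. "Dummy variables" (also called indicator variables) are introduced to quantify them. Below, a model is built for per-capita wage income, other income, and living consumption expenditure of rural and urban households in China in 2013, with rural residents as the reference category (the fitted dummy term is the urban indicator `C(農村or城鎮)[T.城鎮居民]`).
The model is assumed to be:
$Y = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + \alpha_3 D + \mu$, where $D = 1$ for urban residents and $D = 0$ for rural residents.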
data3 = pd.read_excel(r'./計量經濟學數據.xlsx', sheet_name='Sheet3')
# C(...) marks the column as categorical, so the formula expands it into a dummy variable
fit3 = sm.formula.ols(formula='生活消費 ~ 工資收入 + 其他收入 + C(農村or城鎮)', data=data3).fit()
print(fit3.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                   生活消費   R-squared:                       0.975
Model:                            OLS   Adj. R-squared:                  0.974
Method:                 Least Squares   F-statistic:                     758.1
Date:                Sun, 26 May 2019   Prob (F-statistic):           1.81e-46
Time:                        13:43:42   Log-Likelihood:                -513.02
No. Observations:                  62   AIC:                             1034.
Df Residuals:                      58   BIC:                             1043.
Df Model:                           3
Covariance Type:            nonrobust
=====================================================================================
                         coef    std err          t      P>|t|      [0.025      0.975]
-------------------------------------------------------------------------------------
Intercept           1783.7377    345.728      5.159      0.000    1091.687    2475.788
C(農村or城鎮)[T.城鎮居民]   140.8608    483.598      0.291      0.772    -827.166    1108.888
工資收入                   0.5477      0.039     13.978      0.000       0.469       0.626
其他收入                   0.5589      0.073      7.666      0.000       0.413       0.705
==============================================================================
Omnibus:                        0.360   Durbin-Watson:                   1.733
Prob(Omnibus):                  0.835   Jarque-Bera (JB):                0.086
Skew:                           0.082   Prob(JB):                        0.958
Kurtosis:                       3.079   Cond. No.                     6.19e+04
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 6.19e+04. This might indicate that there are
strong multicollinearity or other numerical problems.
Model tests:
$H_0: \beta_j = 0 \quad (j = 1, \dots, k)$
$H_1$: the $\beta_j$ are not all zero
Goodness of fit:
The regression fits the data well, with coefficient of determination $R^2 = 0.975$.
F-test:
The F statistic is 758.1. The critical value is $F_{\alpha}(k, n-k-1) = F_{0.05}(3, 58) \approx 2.76$, where $k = 3$ and $n = 62$. Since $F > F_{\alpha}(k, n-k-1)$, the linear relationship is jointly significant at the 5% level, so the null hypothesis is rejected.
t-test:
$|t_1| = 0.291$ (urban dummy), $|t_2| = 13.978$ (wage income), $|t_3| = 7.666$ (other income), $t_{\alpha/2}(n-k-1) = t_{0.025}(58) = 2.002$
Since $|t_2|$ and $|t_3|$ exceed $t_{\alpha/2}(n-k-1)$, the null hypothesis is rejected for wage income and other income. For the urban dummy, $|t_1| = 0.291$ is below the critical value ($p = 0.772$), so the null hypothesis that its coefficient is zero cannot be rejected.
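In summary, the estimated relationship between per-capita wage income, other income, and living consumption expenditure of rural and urban households in China in 2013 is:
$Y = 1783.7377 + 0.5477 X_1 + 0.5589 X_2 + 140.8608 D$, where $D = 1$ for urban residents (城鎮居民) and $D = 0$ for rural residents.
Taken at face value, with other factors unchanged, the average consumption expenditure of urban residents is 140.8608 yuan higher than that of rural residents. This difference is not statistically significant ($p = 0.772$), however, so the data do not establish an urban-rural gap once income is controlled for.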
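Which group serves as the baseline depends on how the categorical column is coded. The sketch below uses synthetic data with hypothetical column names (`group`, `income`, `spend`) and assumes the patsy-style `Treatment(reference=...)` syntax that current statsmodels formulas accept; it names the reference level explicitly instead of relying on the default ordering of the labels.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (NOT the article's dataset); column names are hypothetical
rng = np.random.default_rng(1)
n = 120
df = pd.DataFrame({
    "group": np.where(np.arange(n) % 2 == 0, "rural", "urban"),
    "income": rng.uniform(5_000, 30_000, size=n),
})
# True model: urban households spend 500 more at the same income
df["spend"] = (1_500 + 0.6 * df["income"]
               + 500 * (df["group"] == "urban")
               + rng.normal(0, 200, size=n))

# Name the reference level explicitly so the dummy is 1 for "urban"
res = smf.ols("spend ~ income + C(group, Treatment(reference='rural'))", data=df).fit()
print(res.params)

# The dummy's coefficient is the urban-minus-rural gap at equal income
urban_term = [name for name in res.params.index if name.endswith("[T.urban]")][0]
print(urban_term, round(res.params[urban_term], 1), round(res.pvalues[urban_term], 4))
```

Swapping the reference to `'urban'` would flip the sign of the dummy coefficient and shift the intercept, while leaving the fitted values unchanged.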
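Restricted regression
When building a regression model, economic theory sometimes requires constraints on the relationship between the coefficients of the explanatory variables, for example a condition on two regression coefficients $\beta_1$ and $\beta_2$ such that $\beta_1 + \beta_2 = 1$ or $\beta_1 = \beta_2$. A regression estimated under such constraints is called a restricted regression.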
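As a worked illustration of how a linear restriction is imposed, consider a generic two-regressor model (an assumption for this example, not the egg-demand model estimated below). Substituting $\beta_2 = 1 - \beta_1$ turns the restricted model into an ordinary regression on transformed variables:

```latex
\begin{aligned}
Y &= \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \mu, \qquad \beta_1 + \beta_2 = 1 \\
Y &= \beta_0 + \beta_1 X_1 + (1 - \beta_1) X_2 + \mu \\
Y - X_2 &= \beta_0 + \beta_1 (X_1 - X_2) + \mu
\end{aligned}
```

Regressing $Y - X_2$ on $X_1 - X_2$ by OLS then gives the restricted estimate of $\beta_1$, and $\beta_2$ follows as $1 - \beta_1$.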
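First, build the unrestricted regression model:
$\ln(Q) = \beta_0 + \beta_1 \ln(X/P_0) + \beta_2 (P_1/P) + \beta_3 (P_2/P) + \beta_4 P_{01} + \beta_5 P_{02} + \beta_6 P_{03} + \mu$
Here $Q$ is egg consumption (kg), $X$ is per-capita consumption expenditure (yuan), $P_0$ is the consumer price index, $P$, $P_1$ and $P_2$ are the price indices for eggs, meat and poultry, and aquatic products, and $P_{01}$, $P_{02}$ and $P_{03}$ are the price indices for grain, oils and fats, and vegetables.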
import statsmodels.api as sm
data4 = pd.read_excel(r'./計量經濟學數據.xlsx', sheet_name='Sheet4')
# Pull out the series used in the model
Q = data4["蛋類消費量Q(千克)"]
X = data4["人均消費支出X(元)"]
P0 = data4["居民消費價格指數P0"]
P = data4["蛋類P(價格指數)"]
P1 = data4["肉禽類P1(價格指數)"]
P2 = data4["水產類P2(價格指數)"]
P01 = data4["糧食P3(價格指數)"]
P02 = data4["油脂P4(價格指數)"]
P03 = data4["蔬菜P5(價格指數)"]
# Build the regressors: log real expenditure, relative prices, and the other price indices
df = pd.DataFrame({"log(X/P0)": np.log(X / P0),
                   "P1/P": P1 / P,
                   "P2/P": P2 / P,
                   "P01": P01,
                   "P02": P02,
                   "P03": P03})
df = sm.add_constant(df)  # add the intercept column
fit = sm.OLS(np.log(Q), df).fit()
print(fit.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:          蛋類消費量Q(千克)   R-squared:                       0.527
Model:                            OLS   Adj. R-squared:                  0.409
Method:                 Least Squares   F-statistic:                     4.462
Date:                Tue, 05 Jan 2021   Prob (F-statistic):            0.00361
Time:                        18:00:28   Log-Likelihood:                -19.335
No. Observations:                  31   AIC:                             52.67
Df Residuals:                      24   BIC:                             62.71
Df Model:                           6
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        -11.1535     12.746     -0.875      0.390     -37.461      15.154
log(X/P0)      1.3283      0.326      4.078      0.000       0.656       2.001
P1/P          -1.4528      4.210     -0.345      0.733     -10.141       7.235
P2/P           5.1265      2.281      2.248      0.034       0.419       9.834
P01            0.0150      0.077      0.196      0.846      -0.144       0.174
P02            0.0051      0.076      0.068      0.946      -0.151       0.161
P03            0.0101      0.033      0.310      0.759      -0.057       0.078
==============================================================================
Omnibus:                        0.203   Durbin-Watson:                   1.391
Prob(Omnibus):                  0.904   Jarque-Bera (JB):                0.405
Skew:                          -0.094   Prob(JB):                        0.817
Kurtosis:                       2.473   Cond. No.                     2.63e+04
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 2.63e+04. This might indicate that there are
strong multicollinearity or other numerical problems.
Model tests:
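$H_0: \beta_j = 0 \quad (j = 1, \dots, k)$
$H_1$: the $\beta_j$ are not all zero
Goodness of fit:
The adjusted coefficient of determination is $\bar{R}^2 = 0.409$. This regression is not meant to be used as a forecasting model, however, so the coefficient of determination does not need much attention.
F-test:
The F statistic is 4.462 with $P = 0.00361$, so the linear relationship is jointly significant at the 5% level and the null hypothesis is rejected.
t-test:
Only log(X/P0) and P2/P reject the null hypothesis at the 5% significance level; none of the other variables passes the t-test. With other conditions unchanged, higher rural per-capita consumption expenditure significantly increases egg consumption. In addition, when aquatic product prices rise faster than egg prices, rural consumers tend to consume more eggs; that is, in rural consumption patterns aquatic products and eggs are substitutes to some degree.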
Next, build the restricted regression model
The restriction is $H_0: \beta_2 = \beta_4 = \beta_5 = \beta_6 = 0$, so the regression model becomes $\ln(Q) = \beta_0 + \beta_1 \ln(X/P_0) + \beta_3 (P_2/P) + \mu$.
                            OLS Regression Results
==============================================================================
Dep. Variable:          蛋類消費量Q(千克)   R-squared:                       0.517
Model:                            OLS   Adj. R-squared:                  0.482
Method:                 Least Squares   F-statistic:                     14.98
Date:                Tue, 05 Jan 2021   Prob (F-statistic):           3.77e-05
Time:                        18:12:23   Log-Likelihood:                -19.671
No. Observations:                  31   AIC:                             45.34
Df Residuals:                      28   BIC:                             49.64
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         -8.9767      2.364     -3.797      0.001     -13.819      -4.134
log(X/P0)      1.2843      0.288      4.456      0.000       0.694       1.875
P2/P           4.8805      2.062      2.367      0.025       0.657       9.104
==============================================================================
Omnibus:                        0.178   Durbin-Watson:                   1.377
Prob(Omnibus):                  0.915   Jarque-Bera (JB):                0.313
Skew:                          -0.156   Prob(JB):                        0.855
Kurtosis:                       2.619   Cond. No.                         152.
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
These results show that under the restriction both the test of the linear relationship (F-test) and the tests of the individual coefficients (t-tests) become more significant at the 5% level: the F statistic rises from 4.462 to 14.98, and the t statistics for log(X/P0) and P2/P rise from 4.078 and 2.248 to 4.456 and 2.367. The case for rejecting the null hypothesis is therefore stronger, which further supports the conclusions drawn from the unrestricted regression.
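Whether the four restrictions themselves are supported by the data can be checked with the usual F-test that compares the two fits. The sketch below is an addition to the original analysis: it uses the $R^2$ form of the statistic, $F = \frac{(R_U^2 - R_R^2)/q}{(1 - R_U^2)/(n - k_U - 1)}$, which applies here because both regressions share the same dependent variable. The inputs are the rounded values from the two summaries, so the result is approximate, and SciPy is assumed to be installed.

```python
from scipy import stats

# Values read off the unrestricted and restricted summaries (rounded)
r2_unrestricted, r2_restricted = 0.527, 0.517
n, k_unrestricted = 31, 6     # observations, regressors in the unrestricted model
q = 4                         # restrictions: beta_2 = beta_4 = beta_5 = beta_6 = 0

df_resid = n - k_unrestricted - 1     # 24 residual degrees of freedom
f_stat = ((r2_unrestricted - r2_restricted) / q) / ((1 - r2_unrestricted) / df_resid)

# 5% critical value and p-value from the F(q, df_resid) distribution
f_crit = stats.f.ppf(0.95, q, df_resid)
p_value = stats.f.sf(f_stat, q, df_resid)
print(f"F = {f_stat:.3f}, critical value = {f_crit:.2f}, p = {p_value:.3f}")
```

The statistic comes out far below the 5% critical value, so the restrictions are not rejected, which is consistent with dropping the four insignificant variables.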