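Experiment Title: Diagnosis and Correction of Heteroskedasticity
I. Purpose and Requirements:
Requirements: 1. Use graphical methods to make a preliminary check for heteroskedasticity, then test for it with the White test;
2. Correct the heteroskedasticity with weighted least squares (WLS).
II. Experiment Content
Using 1998 data on sales profit and sales revenue for China's major manufacturing industries, run a regression in EViews, check the model for heteroskedasticity with graphical methods and the White test, and, if heteroskedasticity is present, correct it with weighted least squares.
III. Experiment Procedure: (procedure, all parameters and indicators used, theoretical basis, etc.)
(1) Model specification
To examine whether sales profit and sales revenue in China's major manufacturing industries are related, assume that the two satisfy a linear relationship. The theoretical model is then specified as:
$Y_i = \beta_1 + \beta_2 X_i + u_i$
where $Y_i$ denotes sales profit and $X_i$ denotes sales revenue. The 1998 sales revenue and sales profit data for China's major manufacturing industries are shown in Table 1: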
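Table 1. Sales revenue and sales profit of China's major manufacturing industries, 1998 (unit: 100 million yuan)
Industry | Sales profit Y | Sales revenue X |
Food processing | 187.25 | 3180.44 |
Food manufacturing | 111.42 | 1119.88 |
Beverage manufacturing | 205.42 | 14. |
Tobacco processing | 183.87 | 1328.59 |
Textiles | 316.79 | 3862.9 |
Garment manufacturing | 157.7 | 1779.1 |
Leather and down products | 81.73 | 1081.77 |
Timber processing | 35.67 | 443.74 |
Furniture manufacturing | 31.06 | 226.78 |
Paper and paper products | 134.4 | 1124.94 |
Printing | 90.12 | 499.83 |
Cultural, educational and sporting goods | 54.4 | 504.44 |
Petroleum processing | 194.45 | 2363.8 |
Chemical raw materials and products | 502.61 | 4195.22 |
Pharmaceutical manufacturing | 238.71 | 12.1 |
Chemical fiber manufacturing | 81.57 | 779.46 |
Rubber products | 77.84 | 692.08 |
Plastic products | 144.34 | 1345 |
Non-metallic mineral products | 339.26 | 2866.14 |
Ferrous metal smelting | 367.47 | 3868.28 |
Non-ferrous metal smelting | 144.29 | 1535.16 |
Metal products | 201.42 | 1948.12 |
General machinery manufacturing | 354.69 | 2351.68 |
Special-purpose equipment manufacturing | 238.16 | 1714.73 |
Transportation equipment | 511.94 | 4011.53 |
Electronic machinery manufacturing | 409.83 | 3286.15 |
Electronic and communication equipment | 508.15 | 4499.19 |
Instruments and meters | 72.46 | 663.68 |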
(2) Parameter estimation
Estimating the model by ordinary least squares in EViews gives the following output:
Dependent Variable: Y
Method: Least Squares
Date: 10/19/05 Time: 15:27
Sample: 1 28
Included observations: 28
Variable | Coefficient | Std. Error | t-Statistic | Prob. |
C | 12.035 | 19.51779 | 0.616650 | 0.5428 |
X | 0.104393 | 0.008441 | 12.36670 | 0.0000 |
R-squared | 0.854696 | Mean dependent var | 213.4650 |
Adjusted R-squared | 0.849107 | S.D. dependent var | 146.45 |
S.E. of regression | 56.90368 | Akaike info criterion | 10.935 |
Sum squared resid | 84188.74 | Schwarz criterion | 11.08450 |
Log likelihood | -151.8508 | F-statistic | 152.9353 |
Durbin-Watson stat | 1.212795 | Prob(F-statistic) | 0.000000 |
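The estimated equation is:
$\hat{Y}_i = 12.035 + 0.104393 X_i$
s.e. = (19.51779) (0.008441)
t = (0.616650) (12.36670)
$R^2 = 0.854696$, $\bar{R}^2 = 0.849107$, S.E. = 56.90368, DW = 1.212795, F = 152.9353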
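The quantities in this output (coefficients, standard errors, t-statistics, $R^2$, F) all follow from the closed-form formulas for a two-variable regression. The following is a minimal sketch in plain Python, not the procedure used in this report: the function name `ols_simple` and the five data points are hypothetical and serve only to illustrate the calculation; the Table 1 data are not used.

```python
import math

def ols_simple(x, y):
    """OLS of y on a constant and x; returns a dict of summary statistics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)            # sum of squared residuals
    sst = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares
    s2 = sse / (n - 2)                         # residual variance estimate
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "se_slope": se_slope,
        "t_intercept": intercept / se_intercept,
        "t_slope": slope / se_slope,
        "r2": 1.0 - sse / sst,
        "f": (sst - sse) / s2,
        "resid": resid,
    }

# Hypothetical illustrative data (not the Table 1 data).
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
res = ols_simple(x_demo, y_demo)
print("intercept = %.4f, slope = %.4f" % (res["intercept"], res["slope"]))
print("R^2 = %.6f, F = %.4f, t(slope) = %.4f" % (res["r2"], res["f"], res["t_slope"]))
```

In a regression with a single explanatory variable the F-statistic equals the square of the slope's t-statistic, which the sketch reproduces.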
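This indicates that, holding other factors constant, each 1 yuan increase in sales revenue raises sales profit by 0.104393 yuan on average.
$R^2 = 0.854696$, so the fit is reasonably good. At $\alpha = 0.05$, $t = 12.36670 > t_{0.025}(26) = 2.056$, so the null hypothesis is rejected: sales revenue has a significant effect on sales profit. $F = 152.9353 > F_{0.05}(1, 26) = 4.23$, so the equation as a whole is significant.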
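The reported statistics are tied together by standard identities, so they can be cross-checked by hand. The short sketch below uses only numbers quoted in the OLS output; the variable names are illustrative, and small discrepancies come from rounding in the printed figures.

```python
n, k = 28, 2                      # observations; parameters (intercept and slope)
coef_x, se_x, t_x = 0.104393, 0.008441, 12.36670
r2, adj_r2, f_stat = 0.854696, 0.849107, 152.9353
t_crit, f_crit = 2.056, 4.23      # critical values quoted in the text

t_from_ratio = coef_x / se_x                          # t = coefficient / std. error
adj_r2_from_r2 = 1 - (1 - r2) * (n - 1) / (n - k)     # adjusted R^2
f_from_r2 = (r2 / (k - 1)) / ((1 - r2) / (n - k))     # F from R^2
f_from_t = t_x ** 2                                   # F = t^2 with one regressor

print("t from ratio   : %.4f (reported %.5f)" % (t_from_ratio, t_x))
print("adj. R^2       : %.6f (reported %.6f)" % (adj_r2_from_r2, adj_r2))
print("F from R^2     : %.4f (reported %.4f)" % (f_from_r2, f_stat))
print("F from t^2     : %.4f" % f_from_t)
print("slope significant:", t_x > t_crit, "| equation significant:", f_stat > f_crit)
```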
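(3) Testing the model for heteroskedasticity
※ (1) Graphical method
6. Assessment
Figure 3 shows that the dependent variable Y becomes progressively more dispersed as the explanatory variable X increases.
Likewise, Figure 4 shows that the scatter of squared residuals against X lies mainly in the lower triangle of the plot, and the squared residuals tend to increase with $X_i$. The model therefore very likely exhibits heteroskedasticity, but whether it actually does should be confirmed by a formal test.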
※ (2) White test
White test results
White Heteroskedasticity Test:
F-statistic | 3.607218 | Probability | 0.042036 |
Obs*R-squared | 6.270612 | Probability | 0.043486 |
Test Equation:
Dependent Variable: RESID^2
Method: Least Squares
Date: 10/19/05 Time: 15:29
Sample: 1 28
Included observations: 28
Variable | Coefficient | Std. Error | t-Statistic | Prob. |
C | -3279.779 | 2857.117 | -1.147933 | 0.2619 |
X | 5.670634 | 3.109363 | 1.823728 | 0.0802 |
X^2 | -0.000871 | 0.000653 | -1.334000 | 0.1942 |
R-squared | 0.223950 | Mean dependent var | 3006.741 |
Adjusted R-squared | 0.161866 | S.D. dependent var | 5144.470 |
S.E. of regression | 4709.744 | Akaike info criterion | 19.85361 |
Sum squared resid | 5.55E+08 | Schwarz criterion | 19.99635 |
Log likelihood | -274.9506 | F-statistic | 3.607218 |
Durbin-Watson stat | 1.479908 | Prob(F-statistic) | 0.042036 |
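From the table above, $nR^2 = 6.270612$. For the White test at $\alpha = 0.05$, the $\chi^2$ table gives the critical value $\chi^2_{0.05}(2) = 5.99147$. Since $nR^2 = 6.270612 > \chi^2_{0.05}(2) = 5.99147$, the null hypothesis of homoskedasticity is rejected, which indicates that the model exhibits heteroskedasticity.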
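The White test used here regresses the squared OLS residuals on a constant, $X$ and $X^2$, and compares $nR^2$ from that auxiliary regression with a $\chi^2(2)$ critical value. The following is a minimal sketch of that procedure in plain Python, under stated assumptions: the helper names (`solve_linear`, `fit`, `white_test`) and the synthetic data are hypothetical, and the p-value uses the closed form $e^{-s/2}$, which holds only for 2 degrees of freedom.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit(columns, y):
    """OLS of y on the given regressor columns; returns (residuals, R^2)."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * yi for p, yi in zip(columns[i], y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(beta[j] * columns[j][i] for j in range(k)) for i in range(len(y))]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return resid, 1.0 - sum(e * e for e in resid) / sst

def white_test(x, y):
    """White test for y = b1 + b2*x + u; returns (n*R^2, p-value with 2 df)."""
    n = len(x)
    ones = [1.0] * n
    resid, _ = fit([ones, list(x)], y)
    e2 = [e * e for e in resid]
    # Standardise x before squaring: [1, z, z^2] spans the same space as
    # [1, x, x^2], so R^2 is unchanged but the normal equations are better scaled.
    x_bar = sum(x) / n
    sd = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / n)
    z = [(xi - x_bar) / sd for xi in x]
    _, r2_aux = fit([ones, z, [zi * zi for zi in z]], e2)
    stat = n * r2_aux
    return stat, math.exp(-stat / 2.0)  # chi-square(2) upper-tail probability

# Synthetic data whose error spread grows with x (hypothetical, for illustration).
x_demo = [float(i) for i in range(1, 21)]
y_demo = [2.0 + 3.0 * xi + (xi if i % 2 == 0 else -xi)
          for i, xi in enumerate(x_demo, start=1)]
stat_demo, p_demo = white_test(x_demo, y_demo)
print("nR^2 = %.4f, p-value = %.6f" % (stat_demo, p_demo))
```

The degrees of freedom equal the number of regressors in the auxiliary regression excluding the constant, which is 2 here ($X$ and $X^2$).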
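(4) Correcting the heteroskedasticity
In the weighted least squares estimation, three candidate weights were tried: $w_{1i} = 1/X_i$, $w_{2i} = 1/X_i^2$, and $w_{3i} = 1/\sqrt{X_i}$.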
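Weighted least squares with a single regressor also has a closed form. The sketch below is illustrative only: the function name `wls_simple` and the demo data are hypothetical, the weight forms $1/X$, $1/X^2$ and $1/\sqrt{X}$ are taken as assumptions, and it assumes the convention in which each observation is multiplied by its weight, so the criterion minimised is $\sum (w_i e_i)^2$. Some libraries instead expect weights proportional to the inverse error variance, so the convention should be checked before comparing results.

```python
import math

def wls_simple(x, y, w):
    """WLS of y on a constant and x, minimising sum((w_i * e_i)**2).

    Returns (intercept, slope).
    """
    v = [wi * wi for wi in w]                      # effective weights on e_i^2
    sv = sum(v)
    x_bar = sum(vi * xi for vi, xi in zip(v, x)) / sv   # weighted means
    y_bar = sum(vi * yi for vi, yi in zip(v, y)) / sv
    sxx = sum(vi * (xi - x_bar) ** 2 for vi, xi in zip(v, x))
    sxy = sum(vi * (xi - x_bar) * (yi - y_bar) for vi, xi, yi in zip(v, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical illustrative data (not the Table 1 data).
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]

candidate_weights = {
    "w1 = 1/X": [1.0 / xi for xi in x_demo],
    "w2 = 1/X^2": [1.0 / xi ** 2 for xi in x_demo],
    "w3 = 1/sqrt(X)": [1.0 / math.sqrt(xi) for xi in x_demo],
}
for name, w in candidate_weights.items():
    a, b = wls_simple(x_demo, y_demo, w)
    print("%-15s intercept = %.4f, slope = %.4f" % (name, a, b))
```

With equal weights the function reduces to ordinary least squares, which gives a quick sanity check on any implementation.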
Results with weight $w_1$
Dependent Variable: Y
Method: Least Squares
Date: 10/22/10 Time: 00:13
Sample: 1 28
Included observations: 28
Weighting series: W1
Variable | Coefficient | Std. Error | t-Statistic | Prob. |
C | 5.988351 | 6.403392 | 0.935184 | 0.3583 |
X | 0.108606 | 0.008155 | 13.31734 | 0.0000 |
Weighted Statistics
R-squared | 0.032543 | Mean dependent var | 123.4060 |
Adjusted R-squared | -0.004667 | S.D. dependent var | 31.99659 |
S.E. of regression | 32.07117 | Akaike info criterion | 9.842541 |
Sum squared resid | 26742.56 | Schwarz criterion | 9.937699 |
Log likelihood | -135.7956 | F-statistic | 177.3515 |
Durbin-Watson stat | 1.465148 | Prob(F-statistic) | 0.000000 |
Unweighted Statistics
R-squared | 0.853095 | Mean dependent var | 213.4650 |
Adjusted R-squared | 0.847445 | S.D. dependent var | 146.45 |
S.E. of regression | 57.21632 | Sum squared resid | 85116.40 |
Durbin-Watson stat | 1.261469 |
Results with weight $w_2$
Dependent Variable: Y
Method: Least Squares
Date: 10/22/10 Time: 00:16
Sample: 1 28
Included observations: 28
Weighting series: W2
Variable | Coefficient | Std. Error | t-Statistic | Prob. |
C | 6.496703 | 3.486526 | 1.863374 | 0.0737 |
X | 0.1062 | 0.010991 | 9.725260 | 0.0000 |
Weighted Statistics
R-squared | 0.922715 | Mean dependent var | 67.92129 |
Adjusted R-squared | 0.919743 | S.D. dependent var | 75.51929 |
S.E. of regression | 21.39439 | Akaike info criterion | 9.032884 |
Sum squared resid | 11900.72 | Schwarz criterion | 9.128041 |
Log likelihood | -124.4604 | F-statistic | 94.58068 |
Durbin-Watson stat | 1.905670 | Prob(F-statistic) | 0.000000 |
Unweighted Statistics
R-squared | 0.854182 | Mean dependent var | 213.4650 |
Adjusted R-squared | 0.848573 | S.D. dependent var | 146.45 |
S.E. of regression | 57.00434 | Sum squared resid | 84486.88 |
Durbin-Watson stat | 1.242212 |
Results with weight $w_3$
Dependent Variable: Y
Method: Least Squares
Date: 10/22/10 Time: 00:17
Sample: 1 28
Included observations: 28
Weighting series: W3
Variable | Coefficient | Std. Error | t-Statistic | Prob. |
C | 8.0341 | 11.18733 | 0.772333 | 0.4469 |
X | 0.106153 | 0.007746 | 13.70473 | 0.0000 |
Weighted Statistics
R-squared | 0.611552 | Mean dependent var | 165.8420 |
Adjusted R-squared | 0.596612 | S.D. dependent var | 67.13044 |
S.E. of regression | 42.636 | Akaike info criterion | 10.41205 |
Sum squared resid | 472.56 | Schwarz criterion | 10.50720 |
Log likelihood | -143.7686 | F-statistic | 187.8197 |
Durbin-Watson stat | 1.275429 | Prob(F-statistic) | 0.000000 |
Unweighted Statistics
R-squared | 0.854453 | Mean dependent var | 213.4650 |
Adjusted R-squared | 0.848855 | S.D. dependent var | 146.45 |
S.E. of regression | 56.95121 | Sum squared resid | 84329.44 |
Durbin-Watson stat | 1.233545 |
Comparing the three regressions, weight $w_2$ gives the best results (its EViews output appears above under weighting series W2). The estimated equation is:
$\hat{Y}_i = 6.496703 + 0.1062 X_i$
t = (1.863374) (9.725260)
$R^2 = 0.922715$, DW = 1.905670, F = 94.58068
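The figures in parentheses are t-statistics.
As the results show, after weighted least squares is used to remove the heteroskedasticity, the slope coefficient is significant on the t-test, the coefficient of determination rises from 0.854696 to 0.922715, and the F-test is also significant. The estimate indicates that each 1 yuan increase in sales revenue raises sales profit by 0.1062 yuan on average.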
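IV. Results Report:
1. Preliminary graphical check for heteroskedasticity: the dependent variable Y becomes progressively more dispersed as the explanatory variable X increases. Likewise, the scatter of squared residuals against X lies mainly in the lower triangle of the plot, and the squared residuals tend to increase with $X_i$. The model therefore very likely exhibits heteroskedasticity, but whether it actually does should be confirmed by a formal test.
White test for heteroskedasticity: since $nR^2 = 6.270612 > \chi^2_{0.05}(2) = 5.99147$, the null hypothesis of homoskedasticity is rejected, which indicates that the model exhibits heteroskedasticity.
2. Correcting the heteroskedasticity with weighted least squares:
Weight $w_2$ gives the best results, and the estimated equation is:
$\hat{Y}_i = 6.496703 + 0.1062 X_i$
t = (1.863374) (9.725260)
$R^2 = 0.922715$, DW = 1.905670, F = 94.58068
The figures in parentheses are t-statistics.
From the above, $R^2 = 0.922715$, so the fit is good. At $\alpha = 0.05$, $t = 9.725260 > t_{0.025}(26) = 2.056$, so the null hypothesis is rejected: sales revenue has a significant effect on sales profit.
$F = 94.58068 > F_{0.05}(1, 26) = 4.23$, so the equation as a whole is significant.
After applying weighted least squares, the slope coefficient is significant on the t-test, the coefficient of determination rises from 0.854696 to 0.922715, and the F-test is also significant. The estimate indicates that each 1 yuan increase in sales revenue raises sales profit by 0.1062 yuan on average.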
3. Re-testing the corrected model for heteroskedasticity with the White test:
White test results
White Heteroskedasticity Test:
F-statistic | 3.144597 | Probability | 0.060509 |
Obs*R-squared | 5.628058 | Probability | 0.059963 |
Test Equation:
Dependent Variable: STD_RESID^2
Method: Least Squares
Date: 10/22/10 Time: 00:17
Sample: 1 28
Included observations: 28
Variable | Coefficient | Std. Error | t-Statistic | Prob. |
C | 1927.346 | 675.2246 | 2.854378 | 0.0085 |
X | -1.456613 | 0.734838 | -1.982223 | 0.0585 |
X^2 | 0.000245 | 0.000154 | 1.586342 | 0.1252 |
R-squared | 0.201002 | Mean dependent var | 425.0258 |
Adjusted R-squared | 0.137082 | S.D. dependent var | 1198.210 |
S.E. of regression | 1113.057 | Akaike info criterion | 16.96857 |
Sum squared resid | 30972414 | Schwarz criterion | 17.11130 |
Log likelihood | -234.5599 | F-statistic | 3.144597 |
Durbin-Watson stat | 2.559506 | Prob(F-statistic) | 0.060509 |
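From the table above, $nR^2 = 5.628058$. At $\alpha = 0.05$, the critical value is $\chi^2_{0.05}(2) = 5.99147$.
Since $nR^2 = 5.628058 < \chi^2_{0.05}(2) = 5.99147$, the null hypothesis of homoskedasticity cannot be rejected at the 5% level, which indicates that the corrected model no longer shows significant heteroskedasticity.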
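The two White test decisions can be checked from the reported $nR^2$ values alone, because the $\chi^2$ distribution with 2 degrees of freedom has the closed-form upper-tail probability $e^{-s/2}$. The sketch below relies on that identity; the function and variable names are illustrative.

```python
import math

def chi2_sf_df2(stat):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)

alpha = 0.05
critical = -2.0 * math.log(alpha)   # chi-square(2) critical value at level alpha

reported = {
    "before WLS": 6.270612,   # Obs*R-squared from the first White test
    "after WLS": 5.628058,    # Obs*R-squared from the re-test
}
print("critical value at alpha = %.2f: %.5f" % (alpha, critical))
for label, stat in reported.items():
    p = chi2_sf_df2(stat)
    decision = "reject H0" if stat > critical else "fail to reject H0"
    print("%-10s nR^2 = %.6f, p = %.6f -> %s" % (label, stat, p, decision))
```

The computed p-values agree with the Obs*R-squared probabilities in the two EViews outputs (0.043486 and 0.059963). The p-value after correction is only slightly above 0.05, so the conclusion holds at the 5% level but would not hold at the 10% level.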