
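Cluster Analysis
Lab Report
Experiment: Cluster Analysis
Location:      Laboratory:
School:      Year/Major/Class:
Student name:      Student ID:
Date completed:
Instructor's comments:
Course term:      to      academic year, semester
| Grade | |
| Instructor's signature | |
| Date reviewed |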
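I. Objectives
1. Become proficient in the relevant operations in SPSS 17.0 or STATISTICA 6.0;
2. Master the basic steps of cluster analysis;
3. Understand what cluster analysis is used for;
4. Become familiar with the use of dummy variables;
5. Apply the results to solve practical problems.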
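II. Principle
Use cluster analysis to solve real-world economic and social problems.
III. Content
1. Data collection;
2. Data organization, entry, and saving;
3. Cluster analysis in SPSS 17.0 or STATISTICA 6.0;
4. Analysis of the results.
IV. Instruments and Materials
SPSS
V. Procedure
VI. Raw Experimental Records and Processing (data, charts, calculations, etc.)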
Table 4-2. Exceedance multiples of environmental factors at each monitoring site
| Site No. | Air | Surface water | Soil | Site No. | Air | Surface water | Soil |
| 1 | 3.66 | 2.54 | 2.21 | 11 | 12.53 | 3.28 | 1.48 |
| 2 | 3.34 | 2.27 | 2.12 | 12 | 3.02 | 1.58 | 1.43 |
| 3 | 3.29 | 5.71 | 1.90 | 13 | 0. | 1.10 | 1.04 |
| 4 | 6. | 1.30 | 1.90 | 14 | 3.66 | 1.32 | 1.17 |
| 5 | 3. | 1.31 | 1.52 | 15 | 3.17 | 2.80 | 1.15 |
| 6 | 8.65 | 1.07 | 3.50 | 16 | 3.84 | 1.08 | 1.01 |
| 7 | 4.55 | 6.16 | 4.25 | 17 | 3.96 | 1.36 | 1.09 |
| 8 | 4.75 | 5.60 | 2.75 | 18 | 3.42 | 1.68 | 1.25 |
| 9 | 5. | 1.39 | 1.23 | 19 | 3.66 | 0. | 1.10 |
| 10 | 4.05 | 3.45 | 2.51 | 20 | 1.18 | 0.78 | 1.24 |
VII. Results and Analysis
I. Background
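Rapid economic growth in China has taken a toll on the environment, and the wastewater and waste gas discharged by factories are an especially serious source of pollution. Twenty monitoring sites were set up in an industrial zone to monitor its air, surface water, and soil. For each environmental factor, the daily mean concentration of a representative pollutant was divided by the corresponding environmental quality standard (which removes the effect of differing units), giving the exceedance multiples in Table 4-2. Cluster analysis is used to determine which monitoring sites can be grouped into one class, so as to reduce costs.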
II. Procedure
1. Open SPSS and define the variables, as shown in the figure:
Enter the data into SPSS, as shown in the figure:
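2. Run the cluster analysis
(1) Click Analyze → Classify → Hierarchical Cluster to open the Hierarchical Cluster Analysis dialog, as shown in the figure:
(2) Move "Air", "Surface water", and "Soil" into Variables, and move "Site No." into Label Cases by, as shown in the figure:
(3) Click Statistics and select the options shown in the figure:
(4) Click Continue, then Plots, and select the options shown in the figure:
(5) Click Continue, then Method, and select the options shown in the figure:
(6) Click Continue, then Save, as shown in the figure:
(7) Click Continue, then OK to obtain the results.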
III. Results
Table 1:
Proximity Matrix (Squared Euclidean Distance)
| Case | 1:1 | 2:2 | 3:3 | 4:4 | 5:5 | 6:6 | 7:7 | 8:8 | 9:9 | 10:10 | 11:11 | 12:12 | 13:13 | 14:14 | 15:15 | 16:16 | 17:17 | 18:18 | 19:19 | 20:20 |
| 1:1 | .000 | .051 | 3.661 | 2.008 | 1.146 | 6.659 | 10.027 | 3.830 | 2.445 | .428 | 12.802 | 1.162 | 3.859 | 1.904 | 1.496 | 2.592 | 2.105 | 1.446 | 2.528 | 3.220 |
| 2:2 | .051 | .000 | 4.204 | 2.042 | .829 | 7.214 | 11.319 | 4.690 | 2.270 | .758 | 13.686 | .791 | 3.076 | 1.485 | 1.306 | 2.109 | 1.704 | 1.090 | 2.012 | 2.474 |
| 3:3 | 3.661 | 4.204 | .000 | 8.509 | 7.016 | 15.165 | 7.372 | 1.251 | 8.131 | 2.351 | 15.237 | 6.2 | 9.449 | 7.448 | 3.685 | 8.562 | 7.530 | 6.227 | 8.971 | 9.739 |
| 4:4 | 2.008 | 2.042 | 8.509 | .000 | 1.331 | 3.904 | 15.990 | 7.937 | .662 | 3.111 | 6.858 | 2.297 | 6.418 | 2.028 | 3.332 | 2.218 | 1.929 | 2.163 | 2.223 | 5.172 |
| 5:5 | 1.146 | .829 | 7.016 | 1.331 | .000 | 8.467 | 17.827 | 8.488 | .716 | 2.860 | 12.679 | .151 | 1.911 | .165 | 1.031 | .351 | .238 | .175 | .295 | 1.312 |
| 6:6 | 6.659 | 7.214 | 15.165 | 3.904 | 8.467 | .000 | 12.336 | 10.208 | 7.778 | 6.444 | 9.209 | 10.375 | 17.465 | 10.738 | 12.661 | 11.434 | 10.790 | 10.749 | 11.150 | 15.020 |
| 7:7 | 10.027 | 11.319 | 7.372 | 15.990 | 17.827 | 12.336 | .000 | 2.992 | 19.6 | 6.479 | 22.368 | 17.8 | 24.453 | 20.447 | 16.526 | 22.529 | 20.883 | 18.724 | 22.526 | 23.436 |
| 8:8 | 3.830 | 4.690 | 1.251 | 7.937 | 8.488 | 10.208 | 2.992 | .000 | 9.355 | 1.766 | 13.123 | 8.338 | 13.387 | 9.783 | 6.395 | 11.147 | 9.910 | 8.523 | 11.426 | 12.979 |
| 9:9 | 2.445 | 2.270 | 8.131 | .662 | .716 | 7.778 | 19.6 | 9.355 | .000 | 4.093 | 8.015 | 1.313 | 4.255 | .760 | 1.826 | .733 | .590 | .955 | .863 | 3.494 |
| 10:10 | .428 | .758 | 2.351 | 3.111 | 2.860 | 6.444 | 6.479 | 1.766 | 4.093 | .000 | 12.270 | 2.876 | 6.459 | 3.907 | 2.630 | 4.849 | 4.108 | 3.186 | 4.859 | 5.806 |
| 11:11 | 12.802 | 13.686 | 15.237 | 6.858 | 12.679 | 9.209 | 22.368 | 13.123 | 8.015 | 12.270 | .000 | 14.727 | 23.346 | 13.397 | 13.503 | 13.426 | 12.621 | 13.547 | 14.113 | 21.794 |
| 12:12 | 1.162 | .791 | 6.2 | 2.297 | .151 | 10.375 | 17.8 | 8.338 | 1.313 | 2.876 | 14.727 | .000 | 1.134 | .172 | .625 | .415 | .299 | .069 | .368 | .783 |
| 13:13 | 3.859 | 3.076 | 9.449 | 6.418 | 1.911 | 17.465 | 24.453 | 13.387 | 4.255 | 6.459 | 23.346 | 1.134 | .000 | 1.421 | 1.998 | 1.554 | 1.698 | 1.346 | 1.403 | .131 |
| 14:14 | 1.904 | 1.485 | 7.448 | 2.028 | .165 | 10.738 | 20.447 | 9.783 | .760 | 3.907 | 13.397 | .172 | 1.421 | .000 | .804 | .058 | .022 | .062 | .071 | 1.041 |
| 15:15 | 1.496 | 1.306 | 3.685 | 3.332 | 1.031 | 12.661 | 16.526 | 6.395 | 1.826 | 2.630 | 13.503 | .625 | 1.998 | .804 | .000 | 1.129 | .825 | .461 | 1.317 | 2.039 |
| 16:16 | 2.592 | 2.109 | 8.562 | 2.218 | .351 | 11.434 | 22.529 | 11.147 | .733 | 4.849 | 13.426 | .415 | 1.554 | .058 | 1.129 | .000 | .038 | .226 | .028 | 1.172 |
| 17:17 | 2.105 | 1.704 | 7.530 | 1.929 | .238 | 10.790 | 20.883 | 9.910 | .590 | 4.108 | 12.621 | .299 | 1.698 | .022 | .825 | .038 | .000 | .113 | .091 | 1.318 |
| 18:18 | 1.446 | 1.090 | 6.227 | 2.163 | .175 | 10.749 | 18.724 | 8.523 | .955 | 3.186 | 13.547 | .069 | 1.346 | .062 | .461 | .226 | .113 | .000 | .256 | 1.044 |
| 19:19 | 2.528 | 2.012 | 8.971 | 2.223 | .295 | 11.150 | 22.526 | 11.426 | .863 | 4.859 | 14.113 | .368 | 1.403 | .071 | 1.317 | .028 | .091 | .256 | .000 | .962 |
| 20:20 | 3.220 | 2.474 | 9.739 | 5.172 | 1.312 | 15.020 | 23.436 | 12.979 | 3.494 | 5.806 | 21.794 | .783 | .131 | 1.041 | 2.039 | 1.172 | 1.318 | 1.044 | .962 | .000 |
This is a dissimilarity matrix.
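As a rough illustration of how such a matrix can be produced, the sketch below standardizes each variable to Z scores and then computes squared Euclidean distances between sites. Z-score standardization is an assumption here: the Method settings are only given as a screenshot, although the final coefficient of 57.000 = (20 - 1) × 3 in the Agglomeration Schedule is consistent with it. Only six rows of Table 4-2 whose values are fully legible are used, so the numbers will not match the matrix above. The names `zscore_columns` and `squared_euclidean` are illustrative.

```python
from statistics import mean, stdev

# Rows of Table 4-2 whose values are fully legible: (air, surface water, soil).
sites = {
    1: (3.66, 2.54, 2.21),
    2: (3.34, 2.27, 2.12),
    3: (3.29, 5.71, 1.90),
    6: (8.65, 1.07, 3.50),
    7: (4.55, 6.16, 4.25),
    8: (4.75, 5.60, 2.75),
}


def zscore_columns(rows):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]  # sample SD (divides by n - 1)
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sds)) for row in rows]


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


ids = list(sites)
z = dict(zip(ids, zscore_columns([sites[i] for i in ids])))

# Full symmetric dissimilarity matrix, stored as {(i, j): distance}.
d2 = {(i, j): squared_euclidean(z[i], z[j]) for i in ids for j in ids}

for i in ids:
    print(i, " ".join(f"{d2[(i, j)]:7.3f}" for j in ids))
```

A quick sanity check on the standardization: for Z-scored data, the sum of all pairwise squared distances divided by the number of cases equals (number of variables) × (number of cases - 1), which is 3 × 5 = 15 for this six-site subset and 3 × 19 = 57 for the full data set.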
Agglomeration Schedule
| Stage | Cluster Combined: Cluster 1 | Cluster Combined: Cluster 2 | Coefficients | Stage Cluster First Appears: Cluster 1 | Stage Cluster First Appears: Cluster 2 | Next Stage |
| 1 | 14 | 17 | .011 | 0 | 0 | 5 |
| 2 | 16 | 19 | .025 | 0 | 0 | 5 |
| 3 | 1 | 2 | .051 | 0 | 0 | 10 |
| 4 | 12 | 18 | .085 | 0 | 0 | 7 |
| 5 | 14 | 16 | .137 | 1 | 2 | 8 |
| 6 | 13 | 20 | .203 | 0 | 0 | 13 |
| 7 | 5 | 12 | .300 | 0 | 4 | 8 |
| 8 | 5 | 14 | .615 | 7 | 5 | 12 |
| 9 | 4 | 9 | .946 | 0 | 0 | 14 |
| 10 | 1 | 10 | 1.332 | 3 | 0 | 14 |
| 11 | 3 | 8 | 1.958 | 0 | 0 | 15 |
| 12 | 5 | 15 | 2.666 | 8 | 0 | 13 |
| 13 | 5 | 13 | 4.581 | 12 | 6 | 17 |
| 14 | 1 | 4 | 7.411 | 10 | 9 | 17 |
| 15 | 3 | 7 | 10.657 | 11 | 0 | 18 |
| 16 | 6 | 11 | 15.262 | 0 | 0 | 18 |
| 17 | 1 | 5 | 20.108 | 14 | 13 | 19 |
| 18 | 3 | 6 | 33.484 | 15 | 16 | 19 |
| 19 | 1 | 3 | 57.000 | 17 | 18 | 0 |
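The coefficients in this schedule appear to be cumulative within-cluster sums of squares, which is what Ward's method reports. This is an inference from the numbers, since the Method dialog settings are not reproduced in the text. The sketch below checks the first five stages using distances copied from the Proximity Matrix, relying on the identity that a cluster's within-cluster sum of squares equals the sum of its pairwise squared Euclidean distances divided by the cluster size. `within_ss` is an illustrative helper.

```python
# Squared Euclidean distances copied from the Proximity Matrix,
# keyed by (smaller case number, larger case number).
d2 = {
    (1, 2): 0.051,
    (12, 18): 0.069,
    (14, 16): 0.058,
    (14, 17): 0.022,
    (14, 19): 0.071,
    (16, 17): 0.038,
    (16, 19): 0.028,
    (17, 19): 0.091,
}


def within_ss(cluster):
    """Within-cluster sum of squares from pairwise squared distances."""
    members = sorted(cluster)
    pair_sum = sum(
        d2[(a, b)] for i, a in enumerate(members) for b in members[i + 1:]
    )
    return pair_sum / len(members)


# Non-singleton clusters that exist after each of stages 1 to 5
# (singletons contribute zero to the total).
stages = [
    [{14, 17}],
    [{14, 17}, {16, 19}],
    [{14, 17}, {16, 19}, {1, 2}],
    [{14, 17}, {16, 19}, {1, 2}, {12, 18}],
    [{14, 16, 17, 19}, {1, 2}, {12, 18}],
]
reported = [0.011, 0.025, 0.051, 0.085, 0.137]  # Coefficients column

computed = [sum(within_ss(c) for c in clusters) for clusters in stages]
for stage, (c, r) in enumerate(zip(computed, reported), start=1):
    print(f"stage {stage}: computed {c:.4f}, reported {r:.3f}")
```

Agreement to within rounding of the three-decimal table entries supports, but does not prove, the reading that Ward's method was used.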
Cluster Membership
| Case | 11 Clusters | 10 Clusters | 9 Clusters |
| 1:1 | 1 | 1 | 1 |
| 2:2 | 1 | 1 | 1 |
| 3:3 | 2 | 2 | 2 |
| 4:4 | 3 | 3 | 3 |
| 5:5 | 4 | 4 | 4 |
| 6:6 | 5 | 5 | 5 |
| 7:7 | 6 | 6 | 6 |
| 8:8 | 7 | 7 | 2 |
| 9:9 | 3 | 3 | 3 |
| 10:10 | 8 | 1 | 1 |
| 11:11 | 9 | 8 | 7 |
| 12:12 | 4 | 4 | 4 |
| 13:13 | 10 | 9 | 8 |
| 14:14 | 4 | 4 | 4 |
| 15:15 | 11 | 10 | 9 |
| 16:16 | 4 | 4 | 4 |
| 17:17 | 4 | 4 | 4 |
| 18:18 | 4 | 4 | 4 |
| 19:19 | 4 | 4 | 4 |
| 20:20 | 10 | 9 | 8 |
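To read the classes off this table, the labels in one column can be grouped by value. The sketch below does this for the "11 Clusters" column. The list is copied from the table in case order, and `group_cases` is an illustrative helper name.

```python
from collections import defaultdict

# "11 Clusters" column of the Cluster Membership table, cases 1 to 20 in order.
labels_11 = [1, 1, 2, 3, 4, 5, 6, 7, 3, 8, 9, 4, 10, 4, 11, 4, 4, 4, 4, 10]


def group_cases(labels):
    """Map each cluster label to the list of case numbers carrying it."""
    groups = defaultdict(list)
    for case, label in enumerate(labels, start=1):
        groups[label].append(case)
    return dict(groups)


groups = group_cases(labels_11)
for label, cases in sorted(groups.items()):
    print(f"cluster {label}: sites {cases}")
```

The same function can be applied to the "10 Clusters" and "9 Clusters" columns to compare solutions.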
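The dendrogram and the 11-cluster column of the Cluster Membership table give the following grouping: sites 1 and 2 form one class; sites 5, 12, 14, 16, 17, 18, and 19 form one class; sites 4 and 9 form one class; sites 13 and 20 form one class; and sites 3, 6, 7, 8, 10, 11, and 15 each form a class of their own. Sites within a class give similar monitoring results, so a single site can be chosen from each class to carry out the monitoring. This reduces the manpower and material required and lowers costs.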
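The report does not say how the representative site in each class should be chosen. One possible rule, offered here only as a suggestion, is to take the medoid: the site whose total squared distance to the other members of its class is smallest. The sketch below applies this to cluster 4 of the 11-cluster solution in the Cluster Membership table (sites 5, 12, 14, 16, 17, 18, 19), using distances copied from the Proximity Matrix. `dist` and `medoid` are illustrative names.

```python
# Squared Euclidean distances copied from the Proximity Matrix for the seven
# sites in cluster 4, keyed by (smaller case number, larger case number).
d2 = {
    (5, 12): 0.151, (5, 14): 0.165, (5, 16): 0.351, (5, 17): 0.238,
    (5, 18): 0.175, (5, 19): 0.295,
    (12, 14): 0.172, (12, 16): 0.415, (12, 17): 0.299, (12, 18): 0.069,
    (12, 19): 0.368,
    (14, 16): 0.058, (14, 17): 0.022, (14, 18): 0.062, (14, 19): 0.071,
    (16, 17): 0.038, (16, 18): 0.226, (16, 19): 0.028,
    (17, 18): 0.113, (17, 19): 0.091,
    (18, 19): 0.256,
}


def dist(i, j):
    """Look up the squared distance between two sites (0 for a site and itself)."""
    return 0.0 if i == j else d2[(min(i, j), max(i, j))]


def medoid(members):
    """Return the member with the smallest total distance to the others."""
    return min(members, key=lambda i: sum(dist(i, j) for j in members))


cluster = [5, 12, 14, 16, 17, 18, 19]
for site in cluster:
    total = sum(dist(site, other) for other in cluster)
    print(f"site {site}: total squared distance {total:.3f}")
print("suggested representative:", medoid(cluster))
```

Other criteria, such as ease of access or instrument quality at a site, may reasonably override this purely numerical choice.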
