
Logistic Regression and SAS Programs (Comprehensive)

Source: 动视网  Editor: 小OO  Date: 2025-09-29 06:44:08
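Conditions for use:

⏹ The dependent variable Y is binary, taking the values 0 and 1.

⏹ The independent variables are X1, X2, ..., Xm.

⏹ P denotes the probability that the event occurs under the influence of the m independent variables.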
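
For reference, these conditions correspond to the standard binary logistic model. The coefficient symbols β0, ..., βm are notation assumed here; the source does not name them.

```latex
\[
\operatorname{logit}(P) = \ln\frac{P}{1-P}
  = \beta_0 + \beta_1 X_1 + \cdots + \beta_m X_m ,
\qquad
P = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \cdots + \beta_m X_m)}}
\]
```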

Figure: (not reproduced here)

Program:

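/* Generic template: binary response y with 18 candidate predictors x1-x18 */
data ceshi;
   input x1-x18 y;
   cards;
……
;
/* DESCENDING models the probability that y=1; stepwise selection chooses the predictors */
proc logistic descending;
   model y=x1-x18/selection=stepwise;
run;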

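Example:

There are three drugs (drug, coded 0-2), and disease severity (degree) has two levels, severe and mild (0-1). The dependent variable response is the treatment outcome, effective or ineffective (1-0).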

Data ex12_1;
   Input drug degree response count;
   Datalines;
0 1 1 38
0 1 0 64
0 0 1 10
0 0 0 82
1 1 1 95
1 1 0 18
1 0 1 50
1 0 0 35
2 1 1 88
2 1 0 26
2 0 1 34
2 0 0 37
;
Proc logistic data=ex12_1 descending;
   Freq count;                          /* each row stands for count subjects */
   Class drug/param=ref descending;     /* reference coding; drug=0 is the reference level */
   Model response=drug degree/rsq scale=n aggregate;
Run;

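RSQ requests the generalized R-square (R²) measures.

SCALE= specifies the method used to correct for overdispersion; SCALE=N (NONE) means that no correction is applied.

AGGREGATE defines the subpopulations (unique covariate profiles) on which the Pearson chi-square and deviance goodness-of-fit statistics are computed.

The CLASS statement converts a categorical variable into dummy (design) variables; the three drugs are represented by two dummy variables.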
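
To make the dummy coding concrete, the following is a minimal Python sketch (Python is used here only as an illustration outside SAS). `reference_coding` is a hypothetical helper written for this note, not a SAS or library function; it assumes the levels are sorted in descending order and the last one serves as the reference, as with PARAM=REF DESCENDING.

```python
def reference_coding(levels, descending=True):
    """Map each level of a factor to a tuple of 0/1 design variables.

    Hypothetical helper: the last level in sort order is the reference
    and is coded as all zeros.
    """
    ordered = sorted(levels, reverse=descending)
    non_reference = ordered[:-1]  # every level except the reference
    return {
        level: tuple(int(level == other) for other in non_reference)
        for level in ordered
    }


# Three drugs coded 0, 1, 2 need only two design variables.
design = reference_coding([0, 1, 2])
for level, row in design.items():
    print(level, row)
```

With drug=0 as the reference, each drug coefficient in the model compares that drug with drug 0.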

The LOGISTIC Procedure

Model Information
Data Set                      WORK.EX12_1
Response Variable             response
Number of Response Levels     2
Frequency Variable            count
Model                         binary logit
Optimization Technique        Fisher's scoring

Number of Observations Read          12
Number of Observations Used          12
Sum of Frequencies Read             577
Sum of Frequencies Used             577

Response Profile
Ordered                      Total
  Value     response     Frequency
      1            1           315
      2            0           262

Probability modeled is response=1.

Class Level Information
                     Design
Class     Value     Variables
drug      2          1      0
          1          0      1
          0          0      0

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Deviance and Pearson Goodness-of-Fit Statistics
Criterion      Value     DF     Value/DF     Pr > ChiSq
Deviance      0.3749      2       0.1874         0.8291
Pearson       0.3689      2       0.1844         0.8316

These are the goodness-of-fit tests for the model; the large p-values give no evidence of lack of fit.

Number of unique profiles: 6
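
The degrees of freedom and the deviance p-value can be checked by hand. The sketch below assumes the usual rule that the goodness-of-fit degrees of freedom equal the number of unique profiles minus the number of estimated parameters, and uses the closed form exp(-x/2) for the upper tail of a chi-square distribution with 2 degrees of freedom. The function name is illustrative.

```python
import math


def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-x / 2.0)


unique_profiles = 6   # from the output
n_parameters = 4      # intercept, two drug dummies, degree
df = unique_profiles - n_parameters

deviance = 0.3749     # from the goodness-of-fit table
p_deviance = chi2_upper_tail_df2(deviance)

print("DF:", df)
print("Pr > ChiSq for the deviance:", round(p_deviance, 4))
```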

Model Fit Statistics
                               Intercept
                Intercept            and
Criterion            Only     Covariates
AIC               797.017        641.326
SC                801.375        658.757
-2 Log L          795.017        633.326

R-Square    0.2444    Max-rescaled R-Square    0.3268
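
The AIC and SC columns follow directly from -2 Log L. A small sketch, assuming the standard definitions AIC = -2 Log L + 2k and SC = -2 Log L + k ln(n), with k the number of estimated parameters and n the sum of frequencies (577):

```python
import math


def aic(neg2_log_l, k):
    """Akaike information criterion from -2 Log L and k parameters."""
    return neg2_log_l + 2 * k


def sc(neg2_log_l, k, n):
    """Schwarz criterion from -2 Log L, k parameters and n observations."""
    return neg2_log_l + k * math.log(n)


n = 577  # Sum of Frequencies Used
models = {
    "intercept only": (795.017, 1),
    "intercept and covariates": (633.326, 4),
}
for name, (neg2_log_l, k) in models.items():
    print(name, round(aic(neg2_log_l, k), 3), round(sc(neg2_log_l, k, n), 3))
```

For the model with covariates this gives an AIC of 633.326 + 8 = 641.326.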

Testing Global Null Hypothesis: BETA=0
Test                 Chi-Square     DF     Pr > ChiSq
Likelihood Ratio       161.6907      3         <.0001
Score                  148.1598      3         <.0001
Wald                   118.1394      3         <.0001

These test the null hypothesis that all model coefficients are 0; rejecting it means the model is meaningful.
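
The likelihood ratio statistic is the drop in -2 Log L between the intercept-only model and the fitted model, and its degrees of freedom are the number of parameters added. A minimal check using the values from the Model Fit Statistics table:

```python
neg2_log_l_null = 795.017  # intercept only
neg2_log_l_full = 633.326  # intercept and covariates

lr_chi_square = neg2_log_l_null - neg2_log_l_full
lr_df = 4 - 1  # two drug dummies and degree are added to the intercept

print(round(lr_chi_square, 3), lr_df)
```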

Type 3 Analysis of Effects
                      Wald
Effect     DF     Chi-Square     Pr > ChiSq
drug        2        95.0859         <.0001
degree      1        47.4607         <.0001

Analysis of Maximum Likelihood Estimates
                                 Standard          Wald
Parameter        DF   Estimate      Error    Chi-Square    Pr > ChiSq
Intercept         1    -1.9594     0.2229       77.2441        <.0001
drug        2     1     1.8342     0.2406       58.0936        <.0001
drug        1     1     2.2850     0.2479       84.9472        <.0001
degree            1     1.3806     0.2004       47.4607        <.0001

Parameter estimates and their tests.

Odds Ratio Estimates
                    Point          95% Wald
Effect           Estimate      Confidence Limits
drug   2 vs 0       6.260       3.906      10.033
drug   1 vs 0       9.826       6.044      15.974
degree              3.977       2.685       5.891
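
Each odds ratio is the exponential of the corresponding parameter estimate, and the Wald limits are exp(estimate ± z × standard error). The sketch below recomputes them from the rounded estimates, so the last digit can differ slightly from the SAS listing; the function name is illustrative.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def odds_ratio_with_ci(estimate, std_error, z=Z_975):
    """Return (point estimate, lower limit, upper limit) of the odds ratio."""
    point = math.exp(estimate)
    lower = math.exp(estimate - z * std_error)
    upper = math.exp(estimate + z * std_error)
    return point, lower, upper


# (estimate, standard error) from the maximum likelihood table
estimates = {
    "drug 2 vs 0": (1.8342, 0.2406),
    "drug 1 vs 0": (2.2850, 0.2479),
    "degree": (1.3806, 0.2004),
}
odds_ratios = {name: odds_ratio_with_ci(b, se) for name, (b, se) in estimates.items()}
for name, (point, lower, upper) in odds_ratios.items():
    print(f"{name}: {point:.3f} ({lower:.3f}, {upper:.3f})")
```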

Association of Predicted Probabilities and Observed Responses
Percent Concordant     72.2    Somers' D    0.568
Percent Discordant     15.4    Gamma        0.649
Percent Tied           12.4    Tau-a        0.282
Pairs                 82530    c            0.784
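
The association measures are simple functions of the concordant, discordant and tied percentages. A sketch using the standard definitions (Somers' D = C - D, Gamma = (C - D)/(C + D), c = C + T/2, Tau-a = (nc - nd)/(N(N - 1)/2)), with inputs taken from the listing; because the percentages are rounded to one decimal, Gamma is reproduced only approximately.

```python
n_event, n_nonevent = 315, 262          # from the Response Profile
pairs = n_event * n_nonevent            # pairs with different responses

concordant, discordant, tied = 0.722, 0.154, 0.124  # as proportions of pairs

somers_d = concordant - discordant
gamma = (concordant - discordant) / (concordant + discordant)
c_statistic = concordant + 0.5 * tied

n_total = n_event + n_nonevent
tau_a = (concordant - discordant) * pairs / (0.5 * n_total * (n_total - 1))

print(pairs, round(somers_d, 3), round(gamma, 3), round(tau_a, 3), round(c_statistic, 3))
```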

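Example: ingot production. Each ingot is subjected to a heating time (Heat) and a soaking time (Soak); n is the number of ingots, and r is the number of ingots not ready for rolling.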

data ingots;
   input Heat Soak r n @@;   /* @@ reads several observations per line */
   datalines;
7 1.0 0 10  14 1.0 0 31  27 1.0 1 56  51 1.0 3 13
7 1.7 0 17  14 1.7 0 43  27 1.7 4 44  51 1.7 0  1
7 2.2 0  7  14 2.2 2 33  27 2.2 0 21  51 2.2 0  1
7 2.8 0 12  14 2.8 0 31  27 2.8 1 22  51 4.0 0  1
7 4.0 0  9  14 4.0 0 19  27 4.0 1 16
;
proc logistic data=ingots;
   model r/n=Heat Soak;      /* events/trials syntax */
run;

The LOGISTIC Procedure

Model Information
Data Set                       WORK.INGOTS
Response Variable (Events)     r
Response Variable (Trials)     n
Model                          binary logit
Optimization Technique         Fisher's scoring

n is the number of trials and r is the number of events.

Number of Observations Read          19
Number of Observations Used          19
Sum of Frequencies Read             387
Sum of Frequencies Used             387

Response Profile
Ordered     Binary           Total
  Value     Outcome      Frequency
      1     Event               12
      2     Nonevent           375

Response summary: the event occurred 12 times and did not occur 375 times.
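
The event and nonevent totals can be recovered from the data lines themselves. The following sketch reads the values four at a time, which mimics what the trailing @@ does in the INPUT statement:

```python
raw = """
 7 1.0 0 10  14 1.0 0 31  27 1.0 1 56  51 1.0 3 13
 7 1.7 0 17  14 1.7 0 43  27 1.7 4 44  51 1.7 0  1
 7 2.2 0  7  14 2.2 2 33  27 2.2 0 21  51 2.2 0  1
 7 2.8 0 12  14 2.8 0 31  27 2.8 1 22  51 4.0 0  1
 7 4.0 0  9  14 4.0 0 19  27 4.0 1 16
"""

values = raw.split()
# Group the stream of values into (Heat, Soak, r, n) records.
records = [values[i:i + 4] for i in range(0, len(values), 4)]

events = sum(int(r) for _heat, _soak, r, _n in records)
trials = sum(int(n) for _heat, _soak, _r, n in records)
nonevents = trials - events

print(len(records), events, nonevents, trials)
```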

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
                               Intercept
                Intercept            and
Criterion            Only     Covariates
AIC               108.988        101.346
SC                112.947        113.221
-2 Log L          106.988         95.346

These criteria are used to choose among candidate models; smaller values indicate a better model.

Testing Global Null Hypothesis: BETA=0
Test                 Chi-Square     DF     Pr > ChiSq
Likelihood Ratio        11.6428      2         0.0030
Score                   15.1091      2         0.0005
Wald                    13.0315      2         0.0015

Model tests: three are reported, the likelihood ratio test, the score test, and the Wald test.

Analysis of Maximum Likelihood Estimates
                            Standard          Wald
Parameter   DF   Estimate      Error    Chi-Square    Pr > ChiSq
Intercept    1    -5.5592     1.1197       24.6503        <.0001
Heat         1     0.0820     0.0237       11.9454        0.0005
Soak         1     0.0568     0.3312        0.0294        0.8639

Tests of the individual coefficients.
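
Each Wald chi-square is the squared ratio of the estimate to its standard error, referred to a chi-square distribution with 1 degree of freedom. The sketch below recomputes it from the rounded table values, so small differences from the listing are expected (most visibly for Heat); the p-value uses the identity P(chi-square with 1 df > x) = erfc(sqrt(x/2)).

```python
import math

# (estimate, standard error) from the table
estimates = {
    "Intercept": (-5.5592, 1.1197),
    "Heat": (0.0820, 0.0237),
    "Soak": (0.0568, 0.3312),
}


def wald_test(estimate, std_error):
    """Return the Wald chi-square (1 df) and its p-value."""
    chi_square = (estimate / std_error) ** 2
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


wald = {name: wald_test(b, se) for name, (b, se) in estimates.items()}
for name, (chi_square, p_value) in wald.items():
    print(f"{name}: chi-square={chi_square:.4f}, p={p_value:.4f}")
```

Heat is clearly significant while Soak is not, which matches the conclusion drawn from the listing.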

Odds Ratio Estimates
             Point          95% Wald
Effect    Estimate      Confidence Limits
Heat         1.085       1.036       1.137
Soak         1.058       0.553       2.026

Association of Predicted Probabilities and Observed Responses
Percent Concordant     64.4    Somers' D    0.460
Percent Discordant     18.4    Gamma        0.555
Percent Tied           17.2    Tau-a        0.028
Pairs                  4500    c            0.730

Using the parameter estimates, you can calculate the estimated logit of $p$ as

$$\operatorname{logit}(p) = \log\frac{p}{1-p} = -5.5592 + 0.0820 \times \mathrm{Heat} + 0.0568 \times \mathrm{Soak}$$

If Heat=7 and Soak=1, then logit(p) = -4.9284. Using this logit estimate, you can calculate $p$ as follows:

$$p = \frac{1}{1 + e^{4.9284}} = 0.0072$$
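
The same calculation as a short sketch, with the coefficients taken from the parameter estimates of the ingots model; the function name and its default arguments are illustrative.

```python
import math


def predicted_probability(heat, soak, b0=-5.5592, b_heat=0.0820, b_soak=0.0568):
    """Return (logit, probability) of an ingot not being ready for rolling."""
    logit = b0 + b_heat * heat + b_soak * soak
    probability = 1.0 / (1.0 + math.exp(-logit))
    return logit, probability


logit, p = predicted_probability(heat=7, soak=1)
print(round(logit, 4), round(p, 4))
```

The intermediate logit is -5.5592 + 0.574 + 0.0568 = -4.9284, which gives a predicted probability of about 0.0072.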

Y indicates the mode of commuting (Y=1 bike, Y=0 bus); X1 is age, X2 is monthly income, and X3 is sex (1 male, 0 female).

X3    X1    X2      y
0     18     850    0
0     21    1200    0
0     23     850    1
0     23     950    1
0     28    1200    1
0     31     850    0
0     36    1500    1
0     42    1000    1
0     46     950    1
0     48    1200    0
0     55    1800    1
0     56    2100    1
0     58    1800    1
1     18     850    0
1     20    1000    0
1     25    1200    0
1     27    1300    0
1     28    1500    0
1     30     950    1
1     32    1000    0
1     33    1800    0
1     33    1000    0
1     38    1200    0
1     41    1500    0
1     45    1800    1
1     48    1000    0
1     52    1500    1
1     56    1800    1
Data p256;
   Input X3 X1 X2 y;
   Datalines;
0 18 850 0
0 21 1200 0
0 23 850 1
0 23 950 1
0 28 1200 1
0 31 850 0
0 36 1500 1
0 42 1000 1
0 46 950 1
0 48 1200 0
0 55 1800 1
0 56 2100 1
0 58 1800 1
1 18 850 0
1 20 1000 0
1 25 1200 0
1 27 1300 0
1 28 1500 0
1 30 950 1
1 32 1000 0
1 33 1800 0
1 33 1000 0
1 38 1200 0
1 41 1500 0
1 45 1800 1
1 48 1000 0
1 52 1500 1
1 56 1800 1
;
/* DESCENDING is omitted so that the procedure models y=0, as in the listing that follows */
Proc logistic data=p256;
   Model y=x1-x3;
   /* save predicted probabilities, confidence limits, and cross-validated individual probabilities */
   output out=pred p=phat lower=lcl upper=ucl
          predprobs=(individual crossvalidate);
run;
proc print data=pred;
run;

The LOGISTIC Procedure

Model Information
Data Set                      WORK.P256
Response Variable             y
Number of Response Levels     2
Model                         binary logit
Optimization Technique        Fisher's scoring

Number of Observations Read          28
Number of Observations Used          28

Response Profile
Ordered                      Total
  Value            y     Frequency
      1            0            15
      2            1            13

Probability modeled is y=0.

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
                               Intercept
                Intercept            and
Criterion            Only     Covariates
AIC                40.673         33.971
SC                 42.005         39.299
-2 Log L           38.673         25.971

Testing Global Null Hypothesis: BETA=0
Test                 Chi-Square     DF     Pr > ChiSq
Likelihood Ratio        12.7026      3         0.0053
Score                   10.4135      3         0.0154
Wald                     6.5331      3         0.0884

Analysis of Maximum Likelihood Estimates
                            Standard          Wald
Parameter   DF   Estimate      Error    Chi-Square    Pr > ChiSq
Intercept    1     3.6547     2.0911        3.0545        0.0805
X1           1    -0.0822     0.0521        2.4853        0.1149
X2           1   -0.00152    0.00187        0.6613        0.4161
X3           1     2.5016     1.1578        4.6689        0.0307
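
Because the listing states that the probability modeled is y=0, these estimates describe the probability of commuting by bus. The sketch below evaluates the fitted model for two hypothetical commuters who differ only in sex; the age and income values are chosen purely for illustration.

```python
import math

# Estimates from the maximum likelihood table (model for y=0, i.e. bus)
COEF = {"Intercept": 3.6547, "X1": -0.0822, "X2": -0.00152, "X3": 2.5016}


def prob_bus(age, income, male):
    """Estimated probability that y=0 (commuting by bus) for one person."""
    logit = (COEF["Intercept"] + COEF["X1"] * age
             + COEF["X2"] * income + COEF["X3"] * male)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical profiles: age 30, monthly income 950, male versus female.
p_male = prob_bus(age=30, income=950, male=1)
p_female = prob_bus(age=30, income=950, male=0)
print(round(p_male, 3), round(p_female, 3))
```

The positive X3 coefficient means that, at the same age and income, a man is estimated to be more likely than a woman to take the bus.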

Odds Ratio Estimates
             Point          95% Wald
Effect    Estimate      Confidence Limits
X1           0.921       0.832       1.020
X2           0.998       0.995       1.002
X3          12.203       1.262     118.014

Association of Predicted Probabilities and Observed Responses
Percent Concordant     87.2    Somers' D    0.744
Percent Discordant     12.8    Gamma        0.744
Percent Tied            0.0    Tau-a        0.384
Pairs                   195    c            0.872

No.    Sample size W    Number owning a house    Income (thousand yuan)
1           10.0                 1.5                     2.0
2           20.0                 3.2                     3.0
3           25.0                 4.0                     4.0
4           30.0                 5.0                     5.0
5           40.0                 8.0                     6.0
6           50.0                12.0                     8.0
7           60.0                18.0                    10.0
8           80.0                28.0                    13.0
9          100.0                45.0                    15.0
10          70.0                36.0                    20.0
11          65.0                39.0                    25.0
12          50.0                33.0                    30.0
13          40.0                30.0                    35.0
14          25.0                20.0                    40.0
15          30.0                27.0                    50.0
16          40.0                38.0                    60.0
17          50.0                48.0                    70.0
18          60.0                58.0                    80.0
Data ex1;
   Input no n n1 x;   /* n = sample size, n1 = number owning a house, x = income */
   Datalines;
1 10.0 1.5 2.0
2 20.0 3.2 3.0
3 25.0 4.0 4.0
4 30.0 5.0 5.0
5 40.0 8.0 6.0
6 50.0 12.0 8.0
7 60.0 18.0 10.0
8 80.0 28.0 13.0
9 100.0 45.0 15.0
10 70.0 36.0 20.0
11 65.0 39.0 25.0
12 50.0 33.0 30.0
13 40.0 30.0 35.0
14 25.0 20.0 40.0
15 30.0 27.0 50.0
16 40.0 38.0 60.0
17 50.0 48.0 70.0
18 60.0 58.0 80.0
;
Proc logistic data=ex1;
   Model n1/n=x;      /* events/trials syntax */
Run;
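
Before fitting the events/trials model it can help to look at the observed proportions n1/n and their empirical logits, which the fitted model tries to describe as a linear function of income x. A minimal sketch with the data above (no model is fitted here):

```python
import math

# (n, n1, x): sample size, number owning a house, income in thousand yuan
rows = [
    (10.0, 1.5, 2.0), (20.0, 3.2, 3.0), (25.0, 4.0, 4.0), (30.0, 5.0, 5.0),
    (40.0, 8.0, 6.0), (50.0, 12.0, 8.0), (60.0, 18.0, 10.0), (80.0, 28.0, 13.0),
    (100.0, 45.0, 15.0), (70.0, 36.0, 20.0), (65.0, 39.0, 25.0), (50.0, 33.0, 30.0),
    (40.0, 30.0, 35.0), (25.0, 20.0, 40.0), (30.0, 27.0, 50.0), (40.0, 38.0, 60.0),
    (50.0, 48.0, 70.0), (60.0, 58.0, 80.0),
]

summary = []
for n, n1, x in rows:
    proportion = n1 / n
    empirical_logit = math.log(proportion / (1.0 - proportion))
    summary.append((x, proportion, empirical_logit))

for x, proportion, empirical_logit in summary:
    print(f"x={x:5.1f}  proportion={proportion:.3f}  logit={empirical_logit:6.3f}")
```

The observed proportion rises from 0.15 at the lowest income to about 0.97 at the highest.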
