Statistical Modeling and R Software: Chapter 9 Answers

Chapter 9

9.1

(1) Principal component analysis of the 8 indicators gives 4 principal components (cumulative contribution 94.68%); see Figure (21).

(2) Sorting the scores on each of the 4 principal components with order() gives Table (26). Dynamic clustering with kmeans() gives the following classes:

Class 1: building materials (6), forestry (7), food (8), textiles (9), leather (11);
Class 2: machinery (5);
Class 3: electric power (2), coal (3), sewing (10), paper (12);
Class 4: metallurgy (1), chemicals (4), cultural, educational and art supplies (13).

Component          Ordering of the 13 industries
First component    5 1 3 2 4 6 13 11 9 7 12 10 8
Second component   5 8 4 9 10 1 13 12 7 11 6 2 3
Third component    8 1 5 3 9 12 7 10 2 6 11 4 13
Fourth component   11 6 5 7 10 13 12 9 1 8 3 2 4

Table (26): Industries sorted by principal component score

Figure (21): Scree plot of the principal components
Figure (22): Scatter plot on the first and second principal components

Exercise program and results:

> industry<-data.frame(
+ X1=c(90342,4903,6735,49454,139190,12215,2372,11062,17111,1206,2150,5251,14341),
+ X2=c(52455,1973,21139,36241,203505,16219,6572,23078,23907,3930,5704,6155,13203),
+ X3=c(101091,2035,3767,81557,215898,10351,8103,54935,52108,6126,6200,10383,19396),
+ X4=c(19272,10313,1780,22504,10609,6382,12329,23804,21796,15586,10870,16875,14691),
+ X5=c(82.0,34.2,36.1,98.1,93.2,62.5,184.4,370.4,221.5,330.4,184.2,146.4,94.6),
+ X6=c(16.1,7.1,8.2,25.9,12.6,8.7,22.2,41.0,21.5,29.5,12.0,27.5,17.8),
+ X7=c(197435,592077,726396,348226,139572,145818,20921,65486,63806,1840,8913,78796,6354),
+ X8=c(0.172,0.003,0.003,0.985,0.628,0.066,0.152,0.263,0.276,0.437,0.274,0.151,1.574)
+ )
> industry.pr<-princomp(industry,cor=T)
> summary(industry.pr) # PCA: 4 components give a cumulative contribution of 94.68%
Importance of components:
                          Comp.1    Comp.2    Comp.3     Comp.4     Comp.5
Standard deviation     1.7620762 1.7021873 0.9644768 0.80132532 0.55143824
Proportion of Variance 0.3881141 0.3621802 0.1162769 0.08026528 0.03801052
Cumulative Proportion  0.3881141 0.7502943 0.8665712 0.94683649 0.98484701
                           Comp.6      Comp.7       Comp.8
Standard deviation     0.29427497 0.179400062 0.0494143207
Proportion of Variance 0.01082472 0.004023048 0.0003052219
Cumulative Proportion  0.99567173 0.999694778 1.0000000000
> load<-loadings(industry.pr) # loadings matrix
> load

Loadings:
   Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
X1 -0.477 -0.296 -0.104 0.184 0.758
   0.245 -0.518 0.527 -0.174 -0.781
X2 -0.473 -0.278 -0.163 -0.174 -0.305
X3 -0.424 -0.378 -0.156
X4 0.213 -0.451 0.516 0.539 0.288 -0.249 0.220
X5 0.388 -0.331 -0.321 -0.199 -0.450 0.582 0.233
X6 0.352 -0.403 -0.145 0.279 -0.317 -0.714
X7 -0.215 0.377 -0.140 0.758 -0.418 0.194
X8 -0.273 0.891 -0.322 0.122

               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
Proportion Var  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
Cumulative Var  0.125  0.250  0.375  0.500  0.625  0.750  0.875  1.000
> plot(load[,1:2])
> text(load[,1],load[,2],adj=c(-0.4,-0.3))
> screeplot(industry.pr,npcs=4,type="lines") # scree plot of the principal components
> biplot(industry.pr) # scatter plot on the first and second principal components
> p<-predict(industry.pr) # compute the component scores and store them in p
> order(p[,1]);order(p[,2]);order(p[,3]);order(p[,4]); # sort the scores on components 1 to 4
 [1]  5  1  3  2  4  6 13 11  9  7 12 10  8
 [1]  5  8  4  9 10  1 13 12  7 11  6  2  3
 [1]  8  1  5  3  9 12  7 10  2  6 11  4 13
 [1] 11  6  5  7 10 13 12  9  1  8  3  2  4
> kmeans(scale(p),4) # standardize the scores and split them into 4 clusters
K-means clustering with 4 clusters of sizes 5, 1, 4, 3

Cluster means:
      Comp.1      Comp.2     Comp.3     Comp.4     Comp.5      Comp.6
1  0.5132590 -0.03438438 -0.3405983 -0.5130031  0.2355151  0.22441040
2 -2.5699693 -1.32913757 -0.4848689 -0.9460127 -0.9000187 -0.06497950
3  0.2381581  0.72871986 -0.2995918  0.3126036 -0.4744091 -0.19709710
4 -0.3163193 -0.47127333  1.1287426  0.7535380  0.5400265 -0.08956137
       Comp.7     Comp.8
1 -0.38197798 -0.7474855
2 -0.67500209  0.4569548
3  0.09063069  0.9826915
4  0.74078975 -0.2167643

Clustering vector:
 [1] 4 3 3 4 2 1 1 1 1 3 1 3 4

Within cluster sum of squares by cluster:
[1] 19.41137  0.00000 24.49504 16.61172
 (between_SS / total_SS =  37.0 %)

Available components:
[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"

9.2

# Enter the data as a data frame
sale<-data.frame(
  X1=c(82.9,88.0,99.9,105.3,117.7,131.0,148.2,161.8,174.2,184.7),
  X2=c(92,93,96,94,100,101,105,112,112,112),
  X3=c(17.1,21.3,25.1,29.0,34.0,40.0,44.0,49.0,51.0,53.0),
  X4=c(94,96,97,97,100,101,104,109,111,111),
  Y=c(8.4,9.6,10.4,11.4,12.2,14.2,15.8,17.9,19.6,20.8)
)
# Fit the linear regression
lm.sol<-lm(Y~X1+X2+X3+X4,data=sale)
summary(lm.sol)

Output:

Call:
lm(formula = Y ~ X1 + X2 + X3 + X4, data = sale)

Residuals:
0.024803 0.079476 0.012381 -0.007025 -0.288345 0.216090 -0.158360 -0.135964 0.082310

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -17.66768    5.94360  -2.973  0.03107 *
X1            0.09006    0.02095   4.298  0.00773 **
X2           -0.23132    0.07132  -3.243  0.02287 *
X3            0.01806    0.03907   0.462  0.66328
X4            0.42075    0.11847   3.552  0.01636 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2037 on 5 degrees of freedom
Multiple R-squared: 0.9988, Adjusted R-squared: 0.9978
F-statistic: 1021 on 4 and 5 DF, p-value: 1.827e-07

The model passes the F test, and every coefficient except that of X3 (p = 0.663) passes the t test at the 0.05 level. The fitted regression equation is:

Y = -17.66768 + 0.09006 X1 - 0.23132 X2 + 0.01806 X3 + 0.42075 X4

Y is the sales volume, X1 is household disposable income, and X2 is the average price index of this class of consumer goods. The fitted coefficients are hard to reconcile with the practical meaning of the variables. The cause is multicollinearity among the 4 variables, so we use principal component regression, starting with a principal component analysis.

# Principal component analysis
sale.pr<-princomp(~X1+X2+X3+X4,data=sale,cor=TRUE)
summary(sale.pr,loadings=TRUE)

Importance of components:
                          Comp.1      Comp.2     Comp.3       Comp.4
Standard deviation     1.9859037 0.199906992 0.11218966 0.0603085506
Proportion of Variance 0.9859534 0.009990701 0.00314663 0.0009092803
Cumulative Proportion  0.9859534 0.995944090 0.99909072 1.0000000000

Loadings:
   Comp.1 Comp.2 Comp.3 Comp.4
X1 -0.502 -0.237  0.579  0.598
X2 -0.500  0.493 -0.610  0.367
X3 -0.498 -0.707 -0.368 -0.342
X4 -0.501  0.449  0.396 -0.626

The smallest eigenvalue is λ4 = 0.0603085506^2 ≈ 0, so the variables are multicollinear.

Next comes the principal component regression. First compute the principal component scores of the sample and store the scores on the first and second components in the data frame sale, then regress Y on those components. The commands are:

# Compute the sample principal component scores and run the principal component regression
pre<-predict(sale.pr)
sale$Z1<-pre[,1]; sale$Z2<-pre[,2]
lm.sol<-lm(Y~Z1+Z2,data=sale)
summary(lm.sol)

Call:
lm(formula = Y ~ Z1 + Z2, data = sale)

Residuals:
     Min       1Q   Median       3Q      Max
-0.74323 -0.29223  0.01746  0.30807  0.80849

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 14.03000    0.17125  81.927 1.06e-11 ***
Z1          -2.06119    0.08623 -23.903 5.70e-08 ***
Z2          -0.62409    0.85665  -0.729     0.49
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5415 on 7 degrees of freedom
Multiple R-squared: 0.9879, Adjusted R-squared: 0.9845
F-statistic: 285.9 on 2 and 7 DF, p-value: 1.945e-07

The model passes the F test. Z1 is highly significant, while Z2 is not (p = 0.49). The regression equation is:

Y = 14.0300 - 2.06119 Z1* - 0.62409 Z2*

# Transform back to obtain the relationship in the original coordinates
beta<-coef(lm.sol); A<-loadings(sale.pr)
x.bar<-sale.pr$center; x.sd<-sale.pr$scale
coef<-(beta[2]*A[,1]+beta[3]*A[,2])/x.sd
beta0<-beta[1]-sum(x.bar*coef)
c(beta0,coef)

Output:

 (Intercept)           X1           X2           X3           X4
-16.88460655   0.03420968   0.09376460   0.11954881   0.12360237

Hence the regression equation is:

Y = -16.88460655 + 0.03420968 X1 + 0.09376460 X2 + 0.11954881 X3 + 0.12360237 X4

All predictor coefficients in this equation are positive, which is more reasonable than the original equation.

9.3

Put the data into a matrix to form the correlation matrix r, and run a principal component analysis on r. The scree plot and the cumulative contribution rate show that only the two components Comp.1 and Comp.2 need to be kept, so the number of factors is set to 2. Next run the factor analysis. In the factor loadings matrix, Factor1 has loadings close to 1 on height x1, arm length x2, upper limb length x3, and lower limb length x4 (the "length" group), and Factor2 has large loadings on weight x5, neck circumference x6, chest circumference x7, and chest width x8 (the "width" group).

Figure (25): Scree plot

Exercise program and results:

> x<- c(1.000, 0.846, 0.805, 0.859, 0.473, 0.398, 0.301, 0.382,
+ 0.846, 1.000, 0.881, 0.826, 0.376, 0.326, 0.277, 0.277,
+ 0.805, 0.881, 1.000, 0.801, 0.380, 0.319, 0.237, 0.345,
+ 0.859, 0.826, 0.801, 1.000, 0.436, 0.329, 0.327, 0.365,
+ 0.473, 0.376, 0.380, 0.436, 1.000, 0.762, 0.730, 0.629,
+ 0.398, 0.326, 0.319, 0.329, 0.762, 1.000, 0.583, 0.577,
+ 0.301, 0.277, 0.237, 0.327, 0.730, 0.583, 1.000, 0.539,
+ 0.382, 0.415, 0.345, 0.365, 0.629, 0.577, 0.539, 1.000)
> names<-c("height x1", "arm length x2", "upper limb length x3", "lower limb length x4", "weight x5",
+ "neck circumference x6", "chest circumference x7", "chest width x8")
> r<-matrix(x, nrow=8, dimnames=list(names, names)) # build the correlation matrix
# Principal component analysis to choose the number of components;
# by the cumulative contribution rate, keep only Comp.1 and Comp.2
> r.pr<-princomp(r,cor=T)
> summary(r.pr)
Importance of components:
                          Comp.1     Comp.2    Comp.3    Comp.4      Comp.5
Standard deviation     2.5668691 0.80719257 0.6497815 0.4481795 0.279924016
Proportion of Variance 0.8236021 0.08144498 0.0527770 0.0251081 0.009794682
Cumulative Proportion  0.8236021 0.90504709 0.9578241 0.9829322 0.992726878
                           Comp.6      Comp.7       Comp.8
Standard deviation     0.19139613 0.146807702 1.144334e-08
Proportion of Variance 0.00457906 0.002694063 1.636875e-17
Cumulative Proportion  0.99730594 1.000000000 1.000000e+00
> r.load<-loadings(r.pr);r.load # loadings matrix

Loadings:
                       Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
height x1              -0.374 -0.210 -0.373 -0.352 0.613 -0.390 0.124
arm length x2          -0.383 0.305 0.397 0.768
upper limb length x3   -0.382 -0.108 0.617 -0.209 -0.310 0.566
lower limb length x4   -0.377 -0.374 -0.470 -0.592 0.300 0.225
weight x5              0.344 0.364 -0.230 -0.708 0.372 -0.102 -0.213
neck circumference x6  0.330 0.291 -0.697 0.309 -0.198 0.171 0.396
chest circumference x7 0.341 0.317 0.619 -0.109 0.222 0.109 0.568
chest width x8         0.286 -0.812 -0.140 -0.333 0.110 0.175 0.294

               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
Proportion Var  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
Cumulative Var  0.125  0.250  0.375  0.500  0.625  0.750  0.875  1.000
> screeplot(r.pr,type="lines") # draw the scree plot as a line chart
# Factor analysis with 2 factors
> factanal(factors=2,covmat=r)

Call:
factanal(factors = 2, covmat = r)

Uniquenesses:
             height x1          arm length x2   upper limb length x3   lower limb length x4
                 0.166                  0.110                  0.166                  0.196
             weight x5  neck circumference x6 chest circumference x7         chest width x8
                 0.099                  0.360                  0.414                  0.538

Loadings:
                       Factor1 Factor2
height x1                0.869   0.282
arm length x2            0.929   0.164
upper limb length x3     0.896   0.174
lower limb length x4     0.862   0.247
weight x5                0.244   0.917
neck circumference x6    0.201   0.774
chest circumference x7   0.141   0.752
chest width x8           0.222   0.643

               Factor1 Factor2
SS loadings      3.333   2.618
Proportion Var   0.417   0.327
Cumulative Var   0.417   0.744

The degrees of freedom for the model is 13 and the fit was 0.2495
# Factor1: height x1, arm length x2, upper limb length x3, lower limb length x4 all load close to 1 (the "length" group)
# Factor2: weight x5, neck circumference x6, chest circumference x7, chest width x8 all load heavily (the "width" group)

9.4

# Enter the data
X<-data.frame(
  x1=c(99,99,100,93,100,90,75,93,87,95,76,85),
  x2=c(94,88,98,88,91,78,73,84,73,82,72,75),
  x3=c(93,96,81,88,72,82,88,83,60,90,43,50),
  x4=c(100,99,96,99,96,75,97,68,76,62,67,34),
  x5=c(100,97,100,96,78,97,89,88,84,39,78,37)
)
fa<-factanal(X,factors=2); fa

Output:

Call:
factanal(x = X, factors = 2)

Uniquenesses:
   x1    x2    x3    x4    x5
0.005 0.141 0.494 0.005 0.346

Loadings:
   Factor1 Factor2
x1   0.992   0.104
x2   0.854   0.360
x3   0.497   0.509
x4   0.284   0.956
x5   0.132   0.798

               Factor1 Factor2
SS loadings      2.059   1.950
Proportion Var   0.412   0.390
Cumulative Var   0.412   0.802

Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 0.28 on 1 degree of freedom.
The p-value is 0.598

# Using the correlation matrix
r=cor(X)
fa<-factanal(factors=2,covmat=r); fa

Output:

Call:
factanal(factors = 2, covmat = r)

Uniquenesses:
   x1    x2    x3    x4    x5
0.005 0.141 0.494 0.005 0.346

Loadings:
   Factor1 Factor2
x1   0.992   0.104
x2   0.854   0.360
x3   0.497   0.509
x4   0.284   0.956
x5   0.132   0.798

               Factor1 Factor2
SS loadings      2.059   1.950
Proportion Var   0.412   0.390
Cumulative Var   0.412   0.802

The degrees of freedom for the model is 1 and the fit was 0.0387

The two approaches, one starting from the correlation matrix and one from the raw data, give the same result. In the output, factor f1 has loadings close to 1 on the first two variables (x1, x2); these are grouped as "liberal arts", so f1 is called the liberal arts factor. Factor f2 has loadings close to 1 on the last two variables (x4, x5); these are grouped as "science", so f2 is called the science factor.

fa<-factanal(X,factors=2,scores='regression')
plot(fa$scores[,1:2],type='n')
text(fa$scores[,1],fa$scores[,2])

The plot shows that students 1, 2, 3, and 5 have outstanding overall results. Student 7 is very strong in science but weak in liberal arts, a marked imbalance between subjects. Student 12 does not stand out in any subject.

9.5

In the three plots, the points of the first plot lie close to a straight line, because its canonical correlation is 0.9423286, close to 1. The second plot is somewhat more scattered but can still be regarded as lying near a line, with a correlation of 0.4754312. The third plot is fully scattered, because its correlation is 0.104134, close to 0.

Scatter plot with the first pair of canonical variates as coordinates
Scatter plot with the second pair of canonical variates as coordinates
Scatter plot with the third pair of canonical variates as coordinates

> index<-data.frame(
+ x1=c(140.6,135.7,140.2,152.1,132.2,147.1,147.5,130.6,154.9,142.4,
+ 136.5,162.0,148.9,136.3,159.5,165.9,134.5,152.5,138.2,144.2,128.1,
+ 127.5,140.7,150.4,151.5,151.3,150.2,139.4,150.8,140.6,135.7,140.2,
+ 152.1,132.2,147.1,147.5,130.6,154.9,142.4,136.5),
+ x2=c(43.7,39.5,48.0,52.3,36.7,45.2,47.4,38.4,48.2,42.6,38.4,58.7,
+ 42.4,33.1,49.1,55.7,41.6,53.4,35.5,42.0,37.3,32.0,44.7,49.7,48.5,
+ 47.2,48.1,33.6,45.6,46.7,47.5,48.0,50.3,43.7,41.2,45.4,38.4,48.2,
+ 42.6,40.4),
+ x3=c(77.9,63.9,75.0,88.1,62.4,78.9,76.2,61.8,87.2,74.1,69.6,95.6,
+ 80.6,68.3,87.7,93.5,61.9,83.2,66.1,76.2,57.0,57.9,73.7,82.4,81.3,
+ 84.3,85.8,67.0,84.9,67.9,57.9,71.0,88.1,62.4,78.9,76.2,65.8,91.2,
+ 83.1,69.6),
+ y1=c(2.67,2.08,2.62,2.89,2.14,2.86,3.14,2.03,2.91,2.33,1.98,3.29,
+ 2.74,2.44,2.98,3.17,2.25,2.96,2.13,2.52,1.92,2.02,2.64,2.87,2.71,
+ 2.92,2.79,2.27,2.86,2.67,2.38,2.62,2.89,2.14,2.66,2.75,2.13,2.91,
+ 2.63,2.01),
+ y2=c(7.00,6.98,6.17,10.42,7.47,9.25,8.78,5.31,10.69,11.15,7.77,3.35,
+ 10.11,7.82,11.77,13.14,8.75,6.60,6.62,5.59,5.81,6.42,8.00,9.09,10.20,
+ 6.16,9.50,8.92,12.03,7.00,6.98,6.17,10.42,7.47,9.25,8.78,5.31,10.69,
+ 11.15,7.77),
+ Y3=c(108.0,91.7,101.8,112.5,97.5,92.4,95.4,77.2,80.8,76.7,49.9,58.0,
+ 82.4,76.5,88.1,110.3,75.1,71.5,105.4,82.0,92.7
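
The point made in 9.5, that the scatter of each pair of canonical variates tightens as its canonical correlation approaches 1, can be illustrated with a minimal sketch using cancor() from R's stats package. The data here are synthetic, because the exercise's own data frame is cut off above, and the names sim_x, sim_y, latent, U, V, and pair_cor are illustrative assumptions rather than anything from the original answer.

```r
# Sketch only: synthetic data, not the measurements from the exercise.
set.seed(1)
n <- 40
latent <- rnorm(n)                       # shared signal linking the two variable sets
sim_x <- cbind(x1 = latent + rnorm(n, sd = 0.3),
               x2 = rnorm(n),
               x3 = rnorm(n))
sim_y <- cbind(y1 = latent + rnorm(n, sd = 0.3),
               y2 = rnorm(n),
               y3 = rnorm(n))

ca <- cancor(sim_x, sim_y)               # canonical correlation analysis

# Canonical variates: centered data multiplied by the coefficient matrices
U <- scale(sim_x, center = ca$xcenter, scale = FALSE) %*% ca$xcoef
V <- scale(sim_y, center = ca$ycenter, scale = FALSE) %*% ca$ycoef

# The correlation of the k-th pair (U_k, V_k) equals the k-th canonical correlation
pair_cor <- sapply(1:3, function(k) cor(U[, k], V[, k]))
print(round(cbind(cancor = ca$cor, pair = pair_cor), 4))

# One scatter plot per pair, mirroring the three figures described in 9.5
if (interactive()) {
  op <- par(mfrow = c(1, 3))
  for (k in 1:3) {
    plot(U[, k], V[, k], xlab = paste0("U", k), ylab = paste0("V", k))
  }
  par(op)
}
```

With the real data, the same steps would be applied to the x and y columns of index. Because the canonical correlations come out in decreasing order, the first plot is the tightest and the last the most diffuse.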
