
Lab Report: Cluster Analysis

Principles: K-means clustering, K-medoids clustering, hierarchical clustering, and EM-algorithm clustering.
Task: perform a cluster analysis of the iris data set.
Requirements: explore the basic characteristics of the iris data, apply several clustering methods, and state and briefly explain the main conclusions.

Analysis report:

> data(iris)
> rm(list=ls())
> gc()
         used  gc trigger  max used
Ncells 431730      929718    607591
Vcells 787605     8388608   1592403
(the Mb columns were lost in extraction)
> data(iris)
> data <- iris
> head(data)
(output: the first six rows, all with Species setosa; the numeric columns were lost in extraction)

# K-means clustering
> newiris <- iris
> newiris$Species <- NULL
> (kc <- kmeans(newiris, 3))
K-means clustering with 3 clusters of sizes 62, 50, 38

Cluster means:
(values lost in extraction)

Clustering vector:
(150 labels, garbled in extraction; observations 1 to 50 are all in cluster 2, consistent with the cross-tabulation below)

Within cluster sum of squares by cluster:
(values lost in extraction)
 (between_SS / total_SS = [value lost] %)

Available components:
[1] "cluster" "centers" "totss" "withinss" "tot.withinss" "betweenss" "size" "iter" "ifault"
> table(iris$Species, kc$cluster)
              1  2  3
  setosa      0 50  0
  versicolor 48  0  2
  virginica  14  0 36
# The column names in the next two calls were lost in extraction; they are reconstructed from the Sepal axis label of the figure.
> plot(newiris[c("Sepal.Length", "Sepal.Width")], col = kc$cluster)
> points(kc$centers[, c("Sepal.Length", "Sepal.Width")], col = 1:3, pch = 8, cex = 2)
[Figure: scatter plot of the two sepal measurements, points colored by k-means cluster, cluster centers marked with asterisks (pch = 8)]
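The cluster numbering in the k-means output depends on the random starting centers, so a rerun may label the clusters differently. A minimal sketch of a reproducible run follows, assuming base R only; the seed value and `nstart = 25` are illustrative choices that do not come from the report.

```r
# Reproducible k-means on the four numeric iris columns.
# The seed and nstart values are illustrative assumptions, not taken from the report.
set.seed(42)
x <- iris[, 1:4]                      # drop the Species label before clustering
kc <- kmeans(x, centers = 3, nstart = 25)

# Cross-tabulate true species against cluster labels (label order may differ between runs).
confusion <- table(iris$Species, kc$cluster)
print(confusion)

# Share of the total sum of squares explained by the partition
# (the between_SS / total_SS line of the printed kmeans object).
ratio <- kc$betweenss / kc$totss
print(round(100 * ratio, 1))
```

Using several random starts makes it more likely that the reported partition is the best one k-means can find, not a poor local optimum.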

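# K-medoids clustering
> install.packages("cluster")
> library(cluster)
# The object name was lost in extraction; iris_pam is a placeholder.
# iris still contains the Species column here. pam() converts it to numeric codes, so the species labels take part in the clustering.
> iris_pam <- pam(iris, 3)
> table(iris$Species, iris_pam$clustering)
              1  2  3
  setosa     50  0  0
  versicolor  0  3 47
  virginica   0 49  1
> layout(matrix(c(1,2),1,2))
> plot(iris_pam)
[Figure, two panels: "clusplot(pam(x = iris, k = 3))", x axis Component 1; "Silhouette plot of pam(x = iris, k = 3)", n = 150, 3 clusters]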
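Because `pam()` is applied to the whole data frame, the Species column appears to enter the distance computation as integer codes, which would explain the near-perfect agreement with the labels. The sketch below, which assumes the `cluster` package is installed, shows that conversion and repeats the clustering on the four numeric columns only. The names `codes` and `pam_num` are introduced here for illustration.

```r
library(cluster)  # usually installed with R as a recommended package

# data.matrix() turns the Species factor into the integer codes 1, 2, 3.
codes <- data.matrix(iris)[, "Species"]
print(table(codes))

# K-medoids on the measurements alone, so the labels cannot leak into the distances.
pam_num <- pam(iris[, 1:4], k = 3)
print(table(iris$Species, pam_num$clustering))

# Average silhouette width of this label-free solution.
print(pam_num$silinfo$avg.width)
```

Comparing this table with the one from `pam(iris, 3)` gives a fairer picture of how well K-medoids separates versicolor from virginica.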

[Figure note: the two components shown in the clusplot explain 95.02 % of the point variability; the silhouette plot's x axis is the silhouette width s_i; average silhouette width: 0.57]
> layout(matrix(1))

# Hierarchical clustering
> hc <- hclust(dist(iris[, 1:4]))
> plot(hc, hang = -1)
# The source calls plclust() here, which is defunct in current R; plot() draws the same tree.
> plot(hc, labels = FALSE, hang = -1)
> re <- rect.hclust(hc, k = 3)
# The object name on the next line was lost in extraction; id is a placeholder.
> id <- cutree(hc, 3)
[Figure: cluster dendrogram of dist(iris[, 1:4]), hclust (*, "complete")]
# Use the pruning function cutree() with its h argument to output the dendrogram classes at height = 18 (the call above cuts by k = 3 instead)
> sapply(unique(id),
+        function(g) iris$Species[id == g])
[[1]]
50 observations: setosa (50)
[[2]]
72 observations: versicolor (23), virginica (49)
[[3]]
28 observations: versicolor (27), virginica (1)
Levels: setosa versicolor virginica
> plot(hc)   # arguments lost in extraction
> rect.hclust(hc, k=4, border="light grey")   # outline the 4-cluster solution in light grey
> rect.hclust(hc, k=3, border="dark grey")   # outline the 3-cluster solution in dark grey
> rect.hclust(hc, k=7, which=c(2,6), border="dark grey")   # outline clusters 2 and 6 of the 7-cluster solution
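The list printed by `sapply()` can be summarized more compactly as a cross-tabulation, and a cut by height can be checked against a cut by number of clusters. A minimal sketch, assuming base R only; the names `id_k`, `h_cut` and `id_h` are illustrative, and the cut height is derived from the tree itself, not taken from the report.

```r
# Complete-linkage hierarchical clustering on the four numeric columns.
hc <- hclust(dist(iris[, 1:4]), method = "complete")

# Cut into three clusters and compare with the species labels.
id_k <- cutree(hc, k = 3)
print(table(iris$Species, id_k))

# Cutting by height: any h between the merge that leaves three clusters
# and the following merge gives the same partition as k = 3.
n <- length(hc$order)                        # number of observations
h_cut <- mean(hc$height[c(n - 3, n - 2)])    # midpoint between those two merge heights
id_h <- cutree(hc, h = h_cut)
print(all(id_k == id_h))
```

Reading the cut height off `hc$height` avoids guessing a threshold that may lie above or below every merge in the tree.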

# DBSCAN: density-based clustering
> install.packages("fpc")
> library(fpc)
> ds1=dbscan(iris[,1:4],eps=1,MinPts=5)   # radius parameter eps = 1, density threshold MinPts = 5
> ds1
dbscan Pts=150 MinPts=5 eps=1
        1   2
border  0   1
seed   50  99
total  50 100
[Figure: scatter-plot matrix of Sepal.Length, Sepal.Width, Petal.Length and Petal.Width]
> ds2=dbscan(iris[,1:4],eps=4,MinPts=5)
> ds3=dbscan(iris[,1:4],eps=4,MinPts=2)
> ds4=dbscan(iris[,1:4],eps=8,MinPts=2)
> par(mfcol=c(2,2))
> plot(ds1,iris[,1:4],main="1: MinPts=5 eps=1")
> plot(ds3,iris[,1:4],main="3: MinPts=2 eps=4")
> plot(ds2,iris[,1:4],main="2: MinPts=5 eps=4")
> plot(ds4,iris[,1:4],main="4: MinPts=2 eps=8")
[Figure: panel titled "4: MinPts=2 eps=8"]
> d=dist(iris[,1:4])   # compute the distance matrix d of the data set
> max(d);min(d)   # largest and smallest distance between samples
[1] (value lost in extraction)
[1] 0
> install.packages("ggplot2")
> library(ggplot2)
> interval=cut_interval(d,30)
> table(interval)
interval
(interval labels lost in extraction; the counts below are regrouped from fused digits, and only 29 of the 30 survive)
 88 585 876 891 831 688 543 369 379 339 335 406 458 459 465 480 468 505 349 385 321 291 187 138  97  92  78  50  18
# The function name on the next line was lost in extraction; which.max is an assumption, and its output was also lost.
> which.max(table(interval))
> for(i in 3:5)
+ { for(j in 1:10)
+ { ds=dbscan(iris[,1:4],eps=i,MinPts=j)
+ print(ds)
+ }
+ }
dbscan Pts=150 MinPts=1 eps=3
        1
seed  150
total 150
(the remaining 29 runs, MinPts = 1 to 10 for each of eps = 3, 4, 5, print the same single cluster: seed 150, total 150)
# Clustering results of the 30 dbscan runs
> ds5=dbscan(iris[,1:4],eps=3,MinPts=2)
> ds6=dbscan(iris[,1:4],eps=4,MinPts=5)
> ds7=dbscan(iris[,1:4],eps=5,MinPts=9)
> par(mfcol=c(1,3))
> plot(ds5,iris[,1:4],main="1: MinPts=2 eps=3")
> plot(ds6,iris[,1:4],main="3: MinPts=5 eps=4")
> plot(ds7,iris[,1:4],main="2: MinPts=9 eps=5")
[Figure: panel titled "2: MinPts=9 eps=5"; scatter-plot matrix of Sepal.Length, Sepal.Width, Petal.Length and Petal.Width]
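The runs with eps = 3, 4 and 5 all return a single cluster of 150 points, which suggests that these radii are large relative to the pairwise distances in the data. The sketch below inspects that spread with base R and then, only if the `fpc` package is available, tries a few smaller radii. `cut()` is used here as a stand-in for `cut_interval()`, and the eps grid is an illustrative assumption.

```r
# Pairwise Euclidean distances between the 150 observations.
d <- as.vector(dist(iris[, 1:4]))
print(length(d))                # number of pairs, choose(150, 2)
print(range(d))

# Thirty equal-width bins, similar in spirit to ggplot2::cut_interval(d, 30).
bins <- table(cut(d, breaks = 30))
print(names(which.max(bins)))   # the most populated distance interval

# Optional: smaller radii, run only when fpc is installed.
if (requireNamespace("fpc", quietly = TRUE)) {
  for (eps in c(0.4, 0.6, 1)) {
    fit <- fpc::dbscan(iris[, 1:4], eps = eps, MinPts = 5)
    cat("eps =", eps, "\n")
    print(table(fit$cluster))   # cluster 0 collects noise points
  }
}
```

Choosing eps near the lower mode of the distance distribution, not near its maximum, is what lets DBSCAN separate dense groups at all.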

# EM (expectation-maximization) clustering
> install.packages("mclust")
> library(mclust)
> fit_EM=Mclust(iris[,1:4])
fitting ...
(progress bar) 100%
> summary(fit_EM)
Gaussian finite mixture model fitted by EM algorithm

Mclust VEV (ellipsoidal, equal shape) model with 2 components:

(n = 150, df = 26; the log-likelihood, BIC and ICL values were lost in extraction)

Clustering table:
  1   2
 50 100
> summary(fit_EM,parameters=TRUE)
(same header and clustering table as above, followed by)
Mixing probabilities:
(values lost in extraction)
Means:
(values lost in extraction)
Variances:
[,,1] (values lost in extraction)
[,,2] (values lost in extraction)
> plot(fit_EM)   # plot the EM clustering result
Model-based clustering plots:

1: BIC
2: classification
3: uncertainty
4: density

Selection:   (each option is shown below)
# Selection 1
[Figure: BIC against number of components, 1 to 9]
# Selection 2
[Figure: classification, scatter-plot matrix of Sepal.Length, Sepal.Width, Petal.Length and Petal.Width]
# Selection 3
[Figure: uncertainty]
# Selection 4
[Figure: density]
Selection: 0
> iris_BIC=mclustBIC(iris[,1:4])
fitting ...
(progress bar) 100%
> iris_BICsum=summary(iris_BIC,data=iris[,1:4])
> iris_BICsum   # BIC values of the iris data set under each model and number of components
Best BIC values:
(columns: VEV,2  VEV,3  VVV,2; the BIC and BIC diff rows were lost in extraction)

Classification table for model (VEV,2):
  1   2
 50 100
> iris_BIC
Bayesian Information Criterion (BIC):
(columns: EII VII EEI VEI EVI VVI EEE EVE VEE VVE EEV VEV EVV VVV; values lost in extraction, some entries NA)

Top 3 models based on the BIC criterion:
VEV,2 VEV,3 VVV,2
> par(mfcol=c(1,1))
> plot(iris_BIC,G=1:7,col="yellow")
[Figure: BIC against number of components for each model]
> mclust2Dplot(iris[,1:2],
+ classification=iris_BICsum$classification,
+ parameters=iris_BICsum$parameters,col="yellow")
[Figure: classification plot, x axis Sepal.Length]
> iris_Dens=densityMclust(iris[,1:2])   # estimate the density for every sample
fitting ...
> iris_Dens
'densityMclust' model object: (VEV,2)

Available components:
(output truncated in the source)
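Mclust selects the model and the number of components by BIC, which here favors two components although the data contain three species. A hedged sketch of how the selected model can be compared with the labels, and how a three-component fit can be forced through the `G` argument, assuming the `mclust` package is installed. The object names `fit_auto` and `fit_3` are illustrative.

```r
# Runs only when mclust is installed. library() is used because Mclust()
# is normally called with the package attached.
if (requireNamespace("mclust", quietly = TRUE)) {
  library(mclust)
  x <- iris[, 1:4]

  # Model and number of components chosen by BIC.
  fit_auto <- Mclust(x, verbose = FALSE)
  cat("selected:", fit_auto$modelName, "with", fit_auto$G, "components\n")
  print(table(iris$Species, fit_auto$classification))

  # Force three components and compare again with the species labels.
  fit_3 <- Mclust(x, G = 3, verbose = FALSE)
  print(table(iris$Species, fit_3$classification))

  # Largest classification uncertainty under the three-component fit.
  print(max(fit_3$uncertainty))
}
```

Fixing `G` trades the BIC-optimal fit for a partition that can be compared directly with the three-cluster results of the other methods.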
