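Examination format: independently set paper
Foshan University (佛山科学技术学院), 2008-2009 academic year, first semester
Final examination for the course "Data Analysis", Paper A
Major and class:        Name:        Student ID:
Question no.:  1  2  3  4  5  6  7  8  9  10  12  Total score
Score:
Instructions: 1. Read each question carefully and carry out the required computations by programming in the SAS system.
2. Copy the SAS programs and the relevant output to the end of the exam paper as your answer.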
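I. (12 points) Short-answer questions on SAS:
1. In SAS running under the Windows operating system, what are the three parts of the SAS interface?
The Log window, the Editor window, and the Output window.
2. How is a non-numeric (character) variable read in?
Put "$" after the name of the non-numeric variable in the INPUT statement.
3. Unlike fixed-format input, what marker should be added when data are entered in free format?
Add "@@" at the end of the INPUT statement.
4. Write down the formula for the trimean.
$\hat{M} = \tfrac{1}{4}Q_1 + \tfrac{1}{2}M + \tfrac{1}{4}Q_3$, where $M$ is the median and $Q_1$, $Q_3$ are the lower and upper quartiles.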
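II. (15 points) The year-on-year GDP growth data of Beijing for 1978-1995 are as follows:
100.00 107.57 112.42 96.21 121.58 107.21 117.16 116.19 101.37
109.78 112.83 104.37 105.40 109.50 111.60 112.10 113.50 112.40
(1) Compute the mean, variance, standard deviation, coefficient of variation, skewness, and kurtosis;
(2) Compute the median, the upper and lower quartiles, and the interquartile range;
(3) Draw the histogram, QQ plot, stem-and-leaf plot, and box plot;
(4) Carry out the W test for normality (take α = 0.05).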
III. (15 points) The following data are given:
x1    x2    x3   x4
16.7  26.7  6.4  35.0
18.2  28.0  3.2  29.7
16.7  26.7  2.1  34.9
18.1  26.7  4.3  31.5
16.7  26.0  3.0  32.7
18.1  30.2  7.0  34.9
20.2  30.5  4.8  34.4
20.2  29.5  5.5  36.2
21.5  31.5  5.8  36.5
18.8  30.6  5.4  35.4
21.6  27.8  5.4  34.1
21.3  29.5  5.8  35.8
(1) Compute the covariance matrix and the Pearson correlation matrix;
(2) Analyze the correlations among the variables (take α = 0.10).
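IV. (15 points) The output y of a factory, the number of workers x1, and the cost x2 are given below:
No.  y    x1   x2
1    169  265  3782
2    81   98   3008
3    192  330  2450
4    116  195  2137
5    55   53   2560
6    162  274  2450
7    120  180  3254
8    223  375  3802
9    131  205  2838
10   67   86   2347
(1) Find the regression equation and give a practical interpretation of each parameter;
(2) Report the analysis of variance and the parameter estimates.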
V. (13 points) The following data are given:
x1    x2    x3    x4    x5    x6     x7
12.5  16.4  16.7  22.8  29.3  3.017  26.6
7.8   9.9   10.2  12.6  17.6  0.841  10.6
13.4  10.9  9.9   10.9  13.9  1.772  17.8
19.1  19.8  19.0  29.7  39.6  2.449  35.8
8.0   9.8   8.9   11.9  16.2  0.789  13.7
9.7   4.2   4.2   4.6   6.5   0.874  3.9
0.6   0.7   0.7   0.8   1.1   0.056  1.0
13.9  9.4   9.3   9.8   13.3  2.126  17.1
9.1   11.3  9.5   12.2  16.4  1.327  11.6
Carry out a principal component analysis of the sample above and give the resulting principal components.
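VI. (15 points) The following data are given:
No.  Class             x1     x2     x3     x4     x5     x6    x7
1    1                 36.05  7.13   7.75   16.67  11.68  2.38  12.88
2    1                 37.69  7.01   8.94   16.15  11.08  0.83  11.67
3    1                 38.69  6.01   8.82   14.79  11.44  1.74  13.23
4    1                 37.75  9.61   8.49   13.15  9.76   1.28  11.28
5    1                 35.71  8.04   8.31   15.13  7.76   1.41  13.25
6    1                 39.77  8.49   12.94  19.27  11.05  2.04  13.29
7    1                 40.91  7.32   8.94   17.60  12.75  1.14  14.80
8    1                 33.70  7.59   10.98  18.82  14.73  1.78  10.10
9    1                 35.02  4.72   6.28   10.03  7.15   1.93  10.39
10   2                 52.41  7.70   9.98   12.53  11.70  2.31  14.69
11   2                 52.65  3.84   9.16   13.03  15.26  1.98  14.57
12   2                 55.85  5.50   7.45   9.55   9.52   2.21  16.30
13   2                 44.68  7.32   14.51  17.13  12.08  1.26  11.57
14   2                 45.79  7.66   10.36  16.56  12.86  2.75  11.69
15   2                 50.37  11.35  13.30  19.25  14.59  2.75  14.87
16   to be classified  64.34  8.00   22.22  20.06  15.12  0.72  22.89
(1) Find the three covariance matrices;
(2) Use distance discrimination to obtain the linear discriminant functions, and compute the misclassification rate by cross-validation;
(3) Decide which class the sample to be classified belongs to.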
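VII. (15 points) Carry out a cluster analysis of the data from the previous question (16 samples in total):
(1) Single-linkage (shortest distance) method: write out the clustering process and draw the dendrogram (take nclusters=4);
(2) Complete-linkage (longest distance) method: write out the clustering process, draw the dendrogram (take nclusters=4), and report the four clustering statistics;
(3) Give the result of fast clustering into 3 clusters and plot the classification in a plane coordinate system.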
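I. (1) The SAS interface consists of
the Output window, the Log window, and the Editor window.
(2) Put the "$" symbol after the non-numeric variable.
(3) Add the "@@" marker when data are entered in free format.
(4) The formula for the trimean: $\hat{M} = \tfrac{1}{4}Q_1 + \tfrac{1}{2}M + \tfrac{1}{4}Q_3$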
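As an illustration of formula (4), which the original answer does not include, the trimean can be evaluated for the data of Question II. The values Q1 = 105.40, M = 110.69, and Q3 = 112.83 are assumed here to be the quartiles of that sample under SAS's default percentile definition.

```latex
\hat{M} = \tfrac{1}{4}Q_1 + \tfrac{1}{2}M + \tfrac{1}{4}Q_3
        = \tfrac{1}{4}(105.40) + \tfrac{1}{2}(110.69) + \tfrac{1}{4}(112.83)
        = 26.35 + 55.345 + 28.2075
        = 109.9025
```

The result lies close to the sample mean of about 109.51, as one would expect for a roughly symmetric sample.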
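Program:
/* Read the 18 values; the trailing @@ lets several observations share one data line */
data t1;
  input x @@;
cards;
100.00 107.57 112.42 96.21 121.58 107.21 117.16 116.19 101.37
109.78 112.83 104.37 105.40 109.50 111.60 112.1 113.50 112.40
;
/* Moments, quantiles, stem-and-leaf plot, box plot, and normality tests */
proc univariate data=t1 plot normal;
  var x;
run;
/* Histogram and QQ plot with a normal reference (PROC CAPABILITY belongs to SAS/QC) */
proc capability data=t1 graphics normal;
  histogram x / normal;
  qqplot x / normal(....);   /* the options inside normal( ) are elided in the source */
run;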
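If only the statistics named in parts (1) and (2) are wanted, a shorter listing can be requested from PROC MEANS. The following is a minimal sketch, not part of the original answer. It assumes the data set `t1` with variable `x` created by the program above, and `maxdec=4` is an arbitrary display choice.

```sas
/* Sketch: assumes data set t1 (variable x) from the program above */
proc means data=t1 n mean var std cv skewness kurtosis
           median q1 q3 qrange maxdec=4;
  var x;   /* QRANGE is the interquartile range Q3 - Q1 */
run;
```

PROC UNIVARIATE is still needed for the plots and the normality tests.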
Moments
N                18            Sum Weights        18
Mean             109.510556    Sum Observations   1971.19
Std Deviation    6.36948929    Variance           40.5703938
Skewness         -0.3324812    Kurtosis           0.05978054
Uncorrected SS   216555.809    Corrected SS       689.696694
Coeff Variation  5.81632451    Std Error Mean     1.50130302
(1) From the output above:
Mean: 109.510556    Variance: 40.5703938
Standard deviation: 6.36948929
Coefficient of variation: 5.81632451    Kurtosis: 0.05978054
Skewness: -0.3324812
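The derived quantities in the moments output can be checked by hand from the standard deviation, the mean, the corrected sum of squares, and n = 18. The lines below are a worked check using the standard definitions, with the coefficient of variation expressed in percent.

```latex
s^2 = \frac{\mathrm{CSS}}{n-1} = \frac{689.696694}{17} \approx 40.5704
\qquad
\mathrm{CV} = 100 \cdot \frac{s}{\bar{x}} = 100 \cdot \frac{6.36948929}{109.510556} \approx 5.816
\qquad
\mathrm{SE}(\bar{x}) = \frac{s}{\sqrt{n}} = \frac{6.36948929}{\sqrt{18}} \approx 1.5013
```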
(2)
Median: 110.69
Upper quartile (Q3, 75%): 112.83
Lower quartile (Q1, 25%): 105.40
Interquartile range: 7.43
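(3) Histogram, QQ plot, stem-and-leaf plot, and box plot
Histogram: [figure not reproduced]
QQ plot: [figure not reproduced]
Stem-and-leaf plot:
Stem  Leaf      #
12    2         1
11    67        2
11    00222234  8
10    578       3
10    014       3
 9    6         1
Multiply Stem.Leaf by 10**+1
Box plot: [figure not reproduced]
Extreme observations (fragment of the "Lowest" list): value 104.37 (obs. 12), value 105.40 (obs. 13)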
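(4) W test for normality (α = 0.05)
Tests for Normality
Test                Statistic          p Value
Shapiro-Wilk        W     0.978265     Pr < W     0.9304
Kolmogorov-Smirnov  D     0.128559     Pr > D     >0.1500
Cramer-von Mises    W-Sq  0.044882     Pr > W-Sq  >0.2500
Anderson-Darling    A-Sq  0.247567     Pr > A-Sq  >0.2500
From the output, W0 = 0.978265 and P = 0.9304 > α = 0.05,
so the null hypothesis H0 (the data come from a normal distribution) cannot be rejected; the sample is consistent with normality.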
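III.
data t2;
  input x1-x4;
cards;
16.7 26.7 6.4 35.0
18.2 28.0 3.2 29.7
16.7 26.7 2.1 34.9
18.1 26.7 4.3 31.5
16.7 26.0 3.0 32.7
18.1 30.2 7.0 34.9
20.2 30.5 4.8 34.4
20.2 29.5 5.5 36.2
21.5 31.5 5.8 36.5
18.8 30.6 5.4 35.4
21.6 27.8 5.4 34.1
21.3 29.5 5.8 35.8
;
/* Covariance matrix and Pearson correlations with their p-values */
proc corr data=t2 cov pearson;
  var x1-x4;
run;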
(1) Covariance matrix and Pearson correlation matrix
Covariance matrix (degrees of freedom = 11):
     x1           x2           x3           x4
x1   3.577196970  2.333257576  1.226439394  1.542196970
x2   2.333257576  3.524469697  1.573106061  2.067348485
x3   1.226439394  1.573106061  2.168106061  1.643257576
x4   1.542196970  2.067348485  1.643257576  4.064469697
Pearson correlation matrix (the second line of each row gives the p-value):
     x1       x2       x3       x4
x1   1.00000  0.65712  0.44039  0.40445
              0.0202   0.1519   0.1922
x2   0.65712  1.00000  0.56908  0.54622
     0.0202            0.0535   0.0662
x3   0.44039  0.56908  1.00000  0.55356
     0.1519   0.0535            0.0619
x4   0.40445  0.54622  0.55356  1.00000
     0.1922   0.0662   0.0619
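(2) Correlations among the variables (α = 0.10)
In the upper triangle of the Pearson correlation matrix, the p-values of r13 (0.1519) and r14 (0.1922) exceed 0.10,
so the correlations of x1 with x3 and of x1 with x4 are not significant at this level. The remaining pairs (r12, r23, r24, r34, with p-values 0.0202, 0.0535, 0.0662, 0.0619) are significantly correlated at α = 0.10.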
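Each entry of the correlation matrix follows from the covariance matrix, and its p-value comes from a t statistic with n - 2 degrees of freedom. The worked check below for the pair (x1, x2) uses n = 12 and the standard formulas; it is an illustration and not part of the original answer.

```latex
r_{12} = \frac{s_{12}}{\sqrt{s_{11}\,s_{22}}}
       = \frac{2.333257576}{\sqrt{3.577196970 \times 3.524469697}}
       \approx 0.65712

t = \frac{r_{12}\sqrt{n-2}}{\sqrt{1-r_{12}^{2}}}
  = \frac{0.65712\,\sqrt{10}}{\sqrt{1-0.65712^{2}}}
  \approx 2.757, \qquad \mathrm{df} = 10
```

A two-sided p-value of about 0.02 for this t value agrees with the 0.0202 reported in the output.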
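IV.
data t4;
  input num $ y x1 x2;
cards;
1 169 265 3782
2 81 98 3008
3 192 330 2450
4 116 195 2137
5 55 53 2560
6 162 274 2450
7 120 180 3254
8 223 375 3802
9 131 205 2838
10 67 86 2347
;
/* Linear regression of y on x1 and x2; the I option also prints the inverse of X'X */
proc reg data=t4;
  model y = x1-x2 / i;
run;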
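(1) Regression equation and practical interpretation of each parameter
Variable   DF  Parameter Estimate
Intercept  1   4.14260
x1         1   0.49482
x2         1   0.00890
From the output above,
$\hat{\beta}_0 = 4.14260$, $\hat{\beta}_1 = 0.49482$, $\hat{\beta}_2 = 0.00890$.
The regression equation is $\hat{y} = 4.14260 + 0.49482\,x_1 + 0.00890\,x_2$.
$\beta_0$ is the baseline output. With the cost x2 held fixed, each additional unit of workers x1 raises the output y by 0.49482 units;
likewise, with the number of workers x1 held fixed, each additional unit of cost x2 raises the output y by 0.00890 units.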
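As a quick plausibility check of the fitted equation (an illustration that is not part of the original answer), the fitted value for the first observation, with x1 = 265, x2 = 3782, and observed y = 169, can be computed from the rounded coefficients:

```latex
\hat{y}_1 = 4.14260 + 0.49482 \times 265 + 0.00890 \times 3782
          = 4.14260 + 131.1273 + 33.6598
          \approx 168.93,
\qquad
e_1 = y_1 - \hat{y}_1 \approx 169 - 168.93 = 0.07
```

The small residual is in line with the very high R-square reported in part (2).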
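(2) Analysis of variance and parameter estimates
Analysis of variance:
Analysis of Variance
Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
Model            2   27272           13636        2935.52  <.0001
Error            7   32.51607        4.64515
Corrected Total  9   27304
From the analysis of variance table:
$\hat{\sigma}^2 = 4.64515$
$R^2 = SSM/SST = 27272/27304 = 0.9988$
The F value is 2935.52.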
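The quantities read from the analysis of variance table are linked by the usual identities. The lines below check them against the printed (rounded) sums of squares and add the adjusted R-square, which the original answer does not report.

```latex
F = \frac{MSM}{MSE} = \frac{27272/2}{32.51607/7} = \frac{13636}{4.64515} \approx 2935.5

R^2 = 1 - \frac{SSE}{SST} = 1 - \frac{32.51607}{27304} \approx 0.9988

R^2_{\mathrm{adj}} = 1 - \frac{SSE/(n-p-1)}{SST/(n-1)}
                   = 1 - \frac{4.64515}{27304/9} \approx 0.9985
```

The small difference from the printed F = 2935.52 is presumably due to rounding in the displayed sums of squares.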
Parameter estimates:
Parameter Estimates
Variable   DF  Parameter Estimate  Standard Error  t Value  Pr > |t|
Intercept  1   4.14260             3.55511         1.17     0.2821
x1         1   0.49482             0.00734         67.43    <.0001
x2         1   0.00890             0.00133         6.70     0.0003
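V.
data t5;
  input x1-x7;
cards;
12.5 16.4 16.7 22.8 29.3 3.017 26.6
7.8 9.9 10.2 12.6 17.6 0.841 10.6
13.4 10.9 9.9 10.9 13.9 1.772 17.8
19.1 19.8 19.0 29.7 39.6 2.449 35.8
8.0 9.8 8.9 11.9 16.2 0.789 13.7
9.7 4.2 4.2 4.6 6.5 0.874 3.9
0.6 0.7 0.7 0.8 1.1 0.056 1.0
13.9 9.4 9.3 9.8 13.3 2.126 17.1
9.1 11.3 9.5 12.2 16.4 1.327 11.6
;
/* Principal component analysis based on the correlation matrix (the default) */
proc princomp data=t5;
  var x1-x7;
run;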
Eigenvalues of the Correlation Matrix
   Eigenvalue  Difference  Proportion  Cumulative
1  6.36880695  5.97088220  0.9098      0.9098
2  0.39792475  0.23754034  0.0568      0.9667
3  0.16038442  0.11495709  0.0229      0.9896
4  0.04542733  0.02301248  0.0065      0.9961
5  0.02241485  0.01766603  0.0032      0.9993
6  0.00474882  0.00445593  0.0007      1.0000
7  0.00029289              0.0000      1.0000
Eigenvectors
    Prin1     Prin2     Prin3     Prin4     Prin5     Prin6     Prin7
x1  0.348824  0.612363  0.682050  0.133246  0.136972  -.013602  -.037959
x2  0.390078  -.176727  0.002006  0.456233  -.590520  0.506231  0.058911
x3  0.391810  -.169297  -.110655  0.344580  -.130939  -.813353  -.090308
x4  0.385562  -.349634  0.020863  -.102818  0.402936  0.226117  -.710356
x5  0.383622  -.375484  0.096918  -.047799  0.464939  0.077911  0.691324
x6  0.353720  0.549207  -.715337  0.029717  0.204180  0.127268  0.052690
x7  0.389491  0.016038  0.031998  -.801013  -.441775  -.092768  0.040273
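Eigenvalues:
λ1 = 6.36880695, λ2 = 0.39792475, λ3 = 0.16038442, λ4 = 0.04542733, λ5 = 0.02241485, λ6 = 0.00474882,
λ7 = 0.00029289.
The contribution rates (Proportion) and cumulative contribution rates (Cumulative) are:
Proportion  Cumulative
0.9098      0.9098
0.0568      0.9667
0.0229      0.9896
0.0065      0.9961
0.0032      0.9993
0.0007      1.0000
0.0000      1.0000
Principal components: since the contribution rate of the first component already exceeds 90%, only the first principal component is retained:
w1 = 0.348824 x1 + 0.390078 x2 + 0.391810 x3 + 0.385562 x4 + 0.383622 x5 + 0.353720 x6 + 0.389491 x7
Because the analysis is based on the correlation matrix, x1, ..., x7 in this expression denote the standardized variables.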
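The contribution rates follow directly from the eigenvalues. For a correlation-matrix analysis of seven variables the eigenvalues sum to 7, so the first two rates can be checked as follows (a worked illustration of the standard definition):

```latex
\text{Proportion}_k = \frac{\lambda_k}{\sum_{j=1}^{7}\lambda_j} = \frac{\lambda_k}{7},
\qquad
\frac{\lambda_1}{7} = \frac{6.36880695}{7} \approx 0.9098,
\qquad
\frac{\lambda_1 + \lambda_2}{7} = \frac{6.36880695 + 0.39792475}{7} \approx 0.9667
```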
VI.
/* Training sample: xy is the class label (1 or 2) */
data t6;
  input xy $ x1-x7;
cards;
1 36.05 7.13 7.75 16.67 11.68 2.38 12.88
1 37.69 7.01 8.94 16.15 11.08 0.83 11.67
1 38.69 6.01 8.82 14.79 11.44 1.74 13.23
1 37.75 9.61 8.49 13.15 9.76 1.28 11.28
1 35.71 8.04 8.31 15.13 7.76 1.41 13.25
1 39.77 8.49 12.94 19.27 11.05 2.04 13.29
1 40.91 7.32 8.94 17.60 12.75 1.14 14.80
1 33.70 7.59 10.98 18.82 14.73 1.78 10.10
1 35.02 4.72 6.28 10.03 7.15 1.93 10.39
2 52.41 7.70 9.98 12.53 11.70 2.31 14.69
2 52.65 3.84 9.16 13.03 15.26 1.98 14.57
2 55.85 5.50 7.45 9.55 9.52 2.21 16.30
2 44.68 7.32 14.51 17.13 12.08 1.26 11.57
2 45.79 7.66 10.36 16.56 12.86 2.75 11.69
2 50.37 11.35 13.30 19.25 14.59 2.75 14.87
;
/* Sample to be classified */
data t61;
  input x1-x7;
cards;
64.34 8.00 22.22 20.06 15.12 0.72 22.89
;
/* Linear discriminant analysis with equal priors; CROSSLIST prints the cross-validation results */
proc discrim data=t6 testdata=t61
             out=a1
             outstat=a2 outcross=a3
             testout=a4 method=normal
             list crosslist testlist all;
  class xy;
  var x1-x7;
  priors equal;
run;
(1) The three covariance matrices
The first matrix below is labelled S1 in the original answer, but each of its entries equals 13 times the corresponding entry of the pooled matrix S (for example, 136.3561056 = 13 × 10.48893120). It is therefore the pooled within-class SSCP matrix (13 degrees of freedom), not the covariance matrix of class 1, which is not reproduced in the source.
Pooled within-class SSCP matrix (shown as S1 in the source) =
Variable  x1           x2           x3           x4           x5          x6          x7
x1        136.3561056  -12.7039611  -32.1020333  -43.9701278  -4.7449722  -0.2789778  61.7996722
x2        -12.7039611  47.6857056   32.4559333   47.6934722   9.3323944   1.9405222   -0.4706611
x3        -32.1020333  32.4559333   63.9501333   77.4960000   29.5911333  -1.5631000  -11.4411667
x4        -43.9701278  47.6934722   77.4960000   131.1099722  63.9872611  1.9098222   -6.8091944
x5        -4.7449722   9.3323944    29.5911333   63.9872611   65.8556389  1.1910111   -1.2275389
x6        -0.2789778   1.9405222    -1.5631000   1.9098222    1.1910111   3.4526222   0.9389556
x7        61.7996722   -0.4706611   -11.4411667  -6.8091944   -1.2275389  0.9389556   37.3281722
S2 (covariance matrix of class 2) =
Variable  x1            x2           x3           x4            x5           x6           x7
x1        18.54121667   -3.74661667  -8.57356667  -11.76273000  -2.16989667  0.52238000   7.93868333
x2        -3.74661667   6.37465667   4.28286667   6.66303000    0.83049667   0.63964000   -0.64302333
x3        -8.57356667   4.28286667   6.95838667   8.26832000    1.92554667   -0.42338000  -3.00631333
x4        -11.76273000  6.66303000   8.26832000   12.81671000   4.33151000   0.26400000   -4.10899000
x5        -2.16989667   0.83049667   1.92554667   4.33151000    4.32841667   0.20144000   -0.75466333
x6        0.52238000    0.63964000   -0.42338000  0.26400000    0.20144000   0.30972000   0.29376000
x7        7.93868333    -0.64302333  -3.00631333  -4.10899000   -0.75466333  0.29376000   3.61457667
S (pooled covariance matrix) =
Variable  x1           x2           x3           x4           x5           x6           x7
x1        10.48893120  -0.97722778  -2.46938718  -3.38231752  -0.36499786  -0.02145983  4.75382094
x2        -0.97722778  3.66813120   2.49661026   3.66872863   0.71787650   0.14927094   -0.03620470
x3        -2.46938718  2.49661026   4.91924103   5.96123077   2.27624103   -0.12023846  -0.88008974
x4        -3.38231752  3.66872863   5.96123077   10.08538248  4.92209701   0.14690940   -0.52378419
x5        -0.36499786  0.71787650   2.27624103   4.92209701   5.06581838   0.09161624   -0.09442607
x6        -0.02145983  0.14927094   -0.12023846  0.14690940   0.09161624   0.26558632   0.07222735
x7        4.75382094   -0.03620470  -0.88008974  -0.52378419  -0.09442607  0.07222735   2.87139786
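The three matrices are tied together by the pooling identity, which also shows how the class-1 covariance matrix can be recovered. The first matrix listed above (leading entry 136.3561056) equals 13 S, since 13 × 10.48893120 = 136.3561056, so it plays the role of the pooled within-class SSCP matrix W. The sketch below assumes n1 = 9 and n2 = 6, as in the data step, and works out only the (x1, x1) entry as an illustration.

```latex
S = \frac{(n_1-1)\,S_1 + (n_2-1)\,S_2}{n_1+n_2-2}
\quad\Longrightarrow\quad
S_1 = \frac{13\,S - 5\,S_2}{8} = \frac{W - 5\,S_2}{8}

(S_1)_{11} = \frac{136.3561056 - 5 \times 18.54121667}{8}
           = \frac{43.6500222}{8} \approx 5.4563
```

The same value results from computing the sample variance of x1 directly over the nine class-1 observations.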
(2) Linear discriminant functions by distance discrimination, and the misclassification rate by cross-validation
Linear Discriminant Function for xy
Variable  1           2
Constant  -206.18758  -382.57458
x1        16.60240    23.14210
x2        -2.77150    -3.89531
x3        -5.80267    -5.94472
x4        14.17359    17.23215
x5        -8.00073    -10.19191
x6        7.49174     12.60276
x7        -22.87514   -32.83581
From the output above, the linear discriminant functions are:
W1 = -206.18758 + 16.60240 x1 - 2.77150 x2 - 5.80267 x3 + 14.17359 x4 - 8.00073 x5 + 7.49174 x6 - 22.87514 x7
W2 = -382.57458 + 23.14210 x1 - 3.89531 x2 - 5.94472 x3 + 17.23215 x4 - 10.19191 x5 + 12.60276 x6 - 32.83581 x7
Posterior Probability of Membership in xy
Obs  From xy  Classified into xy  1       2
1    1        1                   1.0000  0.0000
2    1        1                   1.0000  0.0000
3    1        1                   1.0000  0.0000
4    1        1                   1.0000  0.0000
5    1        1                   1.0000  0.0000
6    1        2 *                 0.0000  1.0000
7    1        1                   1.0000  0.0000
8    1        1                   1.0000  0.0000
9    1        1                   1.0000  0.0000
10   2        2                   0.0000  1.0000
11   2        2                   0.0000  1.0000
12   2        2                   0.0000  1.0000
13   2        1 *                 1.0000  0.0000
14   2        2                   0.0000  1.0000
15   2        2                   0.0000  1.0000
* Misclassified observation
Misclassification rate by cross-validation: P = 2/15 = 13.33%
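The figure 2/15 counts misclassified observations over the whole sample. With `priors equal`, the total error rate that PROC DISCRIM prints is, as far as I can tell, the prior-weighted average of the per-class rates, which gives a slightly different number. The comparison below uses class sizes 9 and 6 and one misclassified observation in each class (obs. 6 and obs. 13).

```latex
\hat{P}_{\text{count}} = \frac{2}{15} \approx 0.1333

\hat{P}_{1} = \frac{1}{9} \approx 0.1111, \qquad
\hat{P}_{2} = \frac{1}{6} \approx 0.1667, \qquad
\hat{P}_{\text{equal priors}} = \tfrac{1}{2}\hat{P}_{1} + \tfrac{1}{2}\hat{P}_{2} \approx 0.1389
```

Either figure can be defended, provided the weighting used is stated.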
(3) Classification of the sample to be classified
Posterior Probability of Membership in xy
Obs  Classified into xy  1       2
1    2                   0.0000  1.0000
The sample to be classified belongs to class 2.
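The assignment to class 2 can be reproduced approximately by evaluating the two linear discriminant functions W1 and W2 from part (2) at the test sample x = (64.34, 8.00, 22.22, 20.06, 15.12, 0.72, 22.89). The check below uses the rounded coefficients printed in the output, so the values are approximate and are not part of the original answer.

```latex
W_1(x) \approx -206.19 + 1068.20 - 22.17 - 128.94 + 284.32 - 120.97 + 5.39 - 523.61 \approx 356.0

W_2(x) \approx -382.57 + 1488.96 - 31.16 - 132.09 + 345.68 - 154.10 + 9.07 - 751.61 \approx 392.2

P(2 \mid x) = \frac{e^{W_2(x)}}{e^{W_1(x)} + e^{W_2(x)}} = \frac{1}{1 + e^{W_1(x) - W_2(x)}} \approx 1.0000
```

Since W2(x) exceeds W1(x) by roughly 36, the posterior probability of class 2 is 1.0000 to four decimals, matching the output.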
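VII. (15 points) Carry out a cluster analysis of the data from the previous question (16 samples in total):
(1) Single-linkage (shortest distance) method: write out the clustering process and draw the dendrogram (take nclusters=4);
(2) Complete-linkage (longest distance) method: write out the clustering process, draw the dendrogram (take nclusters=4), and report the four clustering statistics;
(3) Give the result of fast clustering into 3 clusters and plot the classification in a plane coordinate system.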
data t7;
  input x1-x7;
cards;
36.05 7.13 7.75 16.67 11.68 2.38 12.88
37.69 7.01 8.94 16.15 11.08 0.83 11.67
38.69 6.01 8.82 14.79 11.44 1.74 13.23
37.75 9.61 8.49 13.15 9.76 1.28 11.28
35.71 8.04 8.31 15.13 7.76 1.41 13.25
39.77 8.49 12.94 19.27 11.05 2.04 13.29
40.91 7.32 8.94 17.60 12.75 1.14 14.80
33.70 7.59 10.98 18.82 14.73 1.78 10.10
35.02 4.72 6.28 10.03 7.15 1.93 10.39
52.41 7.70 9.98 12.53 11.70 2.31 14.69
52.65 3.84 9.16 13.03 15.26 1.98 14.57
55.85 5.50 7.45 9.55 9.52 2.21 16.30
44.68 7.32 14.51 17.13 12.08 1.26 11.57
45.79 7.66 10.36 16.56 12.86 2.75 11.69
50.37 11.35 13.30 19.25 14.59 2.75 14.87
64.34 8.00 22.22 20.06 15.12 0.72 22.89
;
/* (1) Single linkage on standardized variables */
proc cluster data=t7 method=single std nonorm outtree=tree1;
  var x1-x7;
run;
proc tree data=tree1 graphics horizontal out=c1 nclusters=4;
run;
proc print data=c1;
run;
/* (2) Complete linkage on standardized variables */
proc cluster data=t7 method=complete std nonorm outtree=tree2;
  var x1-x7;
run;
proc tree data=tree2 graphics horizontal out=c2 nclusters=4;
run;
proc print data=c2;
run;
/* (3) Fast clustering into 3 clusters; data=t7 (not t6) so that all 16 samples are used */
proc fastclus data=t7 maxc=3 distance list cluster=c out=d;
  var x1-x7;
run;
proc plot data=d;
  plot x2*x1=c;
run;
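Part (2) also asks for "the four clustering statistics", which the program above does not request. The sketch below assumes that these are R-square, semipartial R-square, pseudo F, and pseudo t-square, which is an interpretation on my part and is not stated in the source. It also assumes the data set `t7` from the program above.

```sas
/* Sketch: assumes data set t7 (x1-x7) from the program above.
   RSQUARE prints R-square and semipartial R-square;
   PSEUDO prints the pseudo F and pseudo t-square statistics. */
proc cluster data=t7 method=complete std nonorm rsquare pseudo outtree=tree2;
  var x1-x7;
run;
```

These statistics are then read from the additional columns of the cluster history table.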
(1) Single-linkage method: clustering process and dendrogram (nclusters=4)
Cluster History (the "Clusters Joined" and FREQ columns are illegible in the source; only the number of clusters and the minimum distance are recoverable)
NCL  Min Dist
15   1.3976
14   1.4581
13   1.525
12   1.5721
11   1.6783
10   1.8356
9    1.8609
8    1.865
7    1.9501
6    2.0097
5    2.126
4    2.6429
3    2.707
2    2.7151
1    5.0941
Dendrogram: [figure not reproduced; legible leaf order: OB1, OB3, OB14, OB2, OB7, OB4, OB5, OB6, OB13, OB10, OB12, OB8, OB11]