
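Examination type: independently set paper

Foshan University (佛山科學技術學院), 2008-2009 academic year, first semester

Data Analysis: Final Examination, Paper A

Major and class:        Name:        Student ID:

Question:  1  2  3  4  5  6  7  8  9  10  11  12  Total

Score:

Instructions: 1. Read each question carefully and carry out the required computations by programming in SAS.

2. Copy the SAS programs and the relevant output to the end of the paper as your answers.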

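Question 1 (12 points). Short-answer questions on SAS:

1. Under the Windows operating system, which three parts make up the SAS interface?

The Log window, the Editor window, and the Output window.

2. How is a non-numeric (character) variable read in?

Put "$" after the name of the non-numeric variable.

3. Unlike fixed-format input, what marker should be added for free-format data input?

Add "@@".

4. Write the formula for the trimean.

$\hat{M} = \frac{1}{4}Q_1 + \frac{1}{2}M + \frac{1}{4}Q_3$, where $Q_1$ and $Q_3$ are the lower and upper quartiles and $M$ is the median.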

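Question 2 (15 points). The year-on-year GDP growth figures for Beijing, 1978-1995, are as follows:

100.00 107.57 112.42 96.21 121.58 107.21 117.16 116.19 101.37
109.78 112.83 104.37 105.40 109.50 111.60 112.10 113.50 112.40

(1) Compute the mean, variance, standard deviation, coefficient of variation, skewness, and kurtosis;

(2) Compute the median, the upper and lower quartiles, and the interquartile range;

(3) Draw the histogram, QQ plot, stem-and-leaf plot, and box plot;

(4) Carry out the W test of normality (take α = 0.05).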


Question 3 (15 points). Given the following data:

(1) Compute the covariance matrix and the Pearson correlation matrix;

(2) Analyze the correlation among the variables (take α = 0.10).

x1     x2     x3    x4
16.7   26.7   6.4   35.0
18.2   28.0   3.2   29.7
16.7   26.7   2.1   34.9
18.1   26.7   4.3   31.5
16.7   26.0   3.0   32.7
18.1   30.2   7.0   34.9
20.2   30.5   4.8   34.4
20.2   29.5   5.5   36.2
21.5   31.5   5.8   36.5
18.8   30.6   5.4   35.4
21.6   27.8   5.4   34.1
21.3   29.5   5.8   35.8

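Question 4 (15 points). The output y of a factory, the number of workers x1, and the cost x2 are given below:

(1) Find the regression equation and give a practical interpretation of each parameter;

(2) Give the analysis of variance and the parameter estimates.

No.   y     x1    x2
1     169   265   3782
2     81    98    3008
3     192   330   2450
4     116   195   2137
5     55    53    2560
6     162   274   2450
7     120   180   3254
8     223   375   3802
9     131   205   2838
10    67    86    2347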

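Question 5 (13 points). Given the following data:

x1     x2     x3     x4     x5     x6      x7
12.5   16.4   16.7   22.8   29.3   3.017   26.6
7.8    9.9    10.2   12.6   17.6   0.841   10.6
13.4   10.9   9.9    10.9   13.9   1.772   17.8
19.1   19.8   19.0   29.7   39.6   2.449   35.8
8.0    9.8    8.9    11.9   16.2   0.789   13.7
9.7    4.2    4.2    4.6    6.5    0.874   3.9
0.6    0.7    0.7    0.8    1.1    0.056   1.0
13.9   9.4    9.3    9.8    13.3   2.126   17.1
9.1    11.3   9.5    12.2   16.4   1.327   11.6

Carry out a principal component analysis of the sample above and find the corresponding principal components.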


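Question 6 (15 points). Given the following data:

No.   Class              x1      x2      x3      x4      x5      x6     x7
1     1                  36.05   7.13    7.75    16.67   11.68   2.38   12.88
2     1                  37.69   7.01    8.94    16.15   11.08   0.83   11.67
3     1                  38.69   6.01    8.82    14.79   11.44   1.74   13.23
4     1                  37.75   9.61    8.49    13.15   9.76    1.28   11.28
5     1                  35.71   8.04    8.31    15.13   7.76    1.41   13.25
6     1                  39.77   8.49    12.94   19.27   11.05   2.04   13.29
7     1                  40.91   7.32    8.94    17.60   12.75   1.14   14.80
8     1                  33.70   7.59    10.98   18.82   14.73   1.78   10.10
9     1                  35.02   4.72    6.28    10.03   7.15    1.93   10.39
10    2                  52.41   7.70    9.98    12.53   11.70   2.31   14.69
11    2                  52.65   3.84    9.16    13.03   15.26   1.98   14.57
12    2                  55.85   5.50    7.45    9.55    9.52    2.21   16.30
13    2                  44.68   7.32    14.51   17.13   12.08   1.26   11.57
14    2                  45.79   7.66    10.36   16.56   12.86   2.75   11.69
15    2                  50.37   11.35   13.30   19.25   14.59   2.75   14.87
16    to be classified   64.34   8.00    22.22   20.06   15.12   0.72   22.89

(1) Find the three covariance matrices;

(2) Use distance discrimination to obtain the linear discriminant functions, and compute the misclassification rate by cross-validation;

(3) Determine which class the sample to be classified belongs to.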

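Question 7 (15 points). Carry out a cluster analysis of the data from the previous question (16 samples in total):

(1) Single linkage (shortest distance) method: write out the clustering process and draw the dendrogram (take nclusters=4);

(2) Complete linkage (longest distance) method: write out the clustering process, draw the dendrogram (take nclusters=4), and find the four clustering statistics;

(3) Give the result of the fast clustering method with 3 clusters, and plot the classification in a plane coordinate system.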


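Question 1.

(1) The SAS interface consists of the Output window, the Log window, and the Editor.

(2) Add the "$" symbol after a non-numeric variable.

(3) Free-format data input should carry the "@@" marker.

(4) The trimean is computed as $\hat{M} = \frac{1}{4}Q_1 + \frac{1}{2}M + \frac{1}{4}Q_3$.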
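The trimean formula can be illustrated with a few lines of Python. This is only a sketch: the function name `trimean` and the example quartiles are hypothetical values chosen to make the arithmetic easy to follow, not figures from the exam.

```python
def trimean(q1: float, median: float, q3: float) -> float:
    """Trimean: weights 1/4, 1/2, 1/4 on the lower quartile, median, upper quartile."""
    return 0.25 * q1 + 0.5 * median + 0.25 * q3


# Hypothetical quartiles, used only to show the arithmetic.
example = trimean(q1=2.0, median=4.0, q3=10.0)
print(example)  # 0.5 + 2.0 + 2.5 = 5.0
```

Because the median carries half of the weight, the trimean stays close to the median while still reflecting how far each quartile lies from it.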

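Question 2. Program:

data t1;
  input x @@;   /* @@ holds the line so that several values are read per record */
cards;
100.00 107.57 112.42 96.21 121.58 107.21 117.16 116.19 101.37
109.78 112.83 104.37 105.40 109.50 111.60 112.10 113.50 112.40
;
proc univariate data=t1 plot normal;   /* moments, quantiles, text plots, normality tests */
run;
proc capability data=t1 graphics normal;
  histogram x / normal;
  qqplot x / normal(....);   /* distribution options are elided in the original */
run;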

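N                 18            Sum Weights        18
Mean              109.510556    Sum Observations   1971.19
Std Deviation     6.36948929    Variance           40.5703938
Skewness          -0.3324812    Kurtosis           0.05978054
Uncorrected SS    216555.809    Corrected SS       689.696694
Coeff Variation   5.81632451    Std Error Mean     1.50130302

(1) From the output above:

Mean: 109.510556    Variance: 40.5703938    Standard deviation: 6.36948929

Coefficient of variation: 5.81632451 (percent)    Kurtosis: 0.05978054

Skewness: -0.3324812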
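As a cross-check outside SAS, the moments can be recomputed with plain Python. The sketch below assumes the bias-adjusted sample formulas (divisor n - 1 for the variance, and the usual small-sample corrections for skewness and kurtosis), which is what the SAS output above appears to use; the variable names are illustrative.

```python
import math

# Beijing GDP growth figures from Question 2.
x = [100.00, 107.57, 112.42, 96.21, 121.58, 107.21, 117.16, 116.19, 101.37,
     109.78, 112.83, 104.37, 105.40, 109.50, 111.60, 112.10, 113.50, 112.40]

n = len(x)
mean = sum(x) / n
css = sum((v - mean) ** 2 for v in x)   # corrected sum of squares
variance = css / (n - 1)                # sample variance, divisor n - 1
std = math.sqrt(variance)
cv = 100 * std / mean                   # coefficient of variation in percent

z = [(v - mean) / std for v in x]       # standardized values
# Bias-adjusted sample skewness and excess kurtosis (assumed to match SAS).
skewness = n / ((n - 1) * (n - 2)) * sum(t ** 3 for t in z)
kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(t ** 4 for t in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

print(f"mean={mean:.6f} variance={variance:.6f} std={std:.6f} cv={cv:.6f}")
print(f"skewness={skewness:.6f} kurtosis={kurtosis:.6f}")
```

The kurtosis here is the excess kurtosis, so a value near zero, together with the mild negative skewness, is what one would expect from roughly normal data.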

(2)

Median: 110.69

Upper quartile (Q3): 112.83

Lower quartile (Q1): 105.40

Interquartile range: 7.43
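The quartiles can be reproduced by hand with the rule "average the two neighbouring order statistics when n·p is an integer, otherwise take the next order statistic". The sketch below assumes this is the percentile definition SAS applies by default (PCTLDEF=5); the helper name `percentile` is hypothetical.

```python
import math

x = [100.00, 107.57, 112.42, 96.21, 121.58, 107.21, 117.16, 116.19, 101.37,
     109.78, 112.83, 104.37, 105.40, 109.50, 111.60, 112.10, 113.50, 112.40]


def percentile(data, p):
    """Empirical-distribution percentile with averaging, for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    s = sorted(data)
    position = len(s) * p
    j = math.floor(position)
    if position == j:
        # n*p is an integer: average the j-th and (j+1)-th order statistics.
        return (s[j - 1] + s[j]) / 2
    # Otherwise take the (j+1)-th order statistic (index j in 0-based terms).
    return s[j]


q1 = percentile(x, 0.25)
median = percentile(x, 0.50)
q3 = percentile(x, 0.75)
iqr = q3 - q1
print(q1, round(median, 2), q3, round(iqr, 2))
```

With n = 18, n·p is 4.5, 9, and 13.5, so Q1 and Q3 are the 5th and 14th sorted values and the median is the average of the 9th and 10th.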

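(3) Histogram, QQ plot, stem-and-leaf plot, and box plot

Histogram: [figure not reproduced]

QQ plot: [figure not reproduced; vertical axis from 95 to 125, horizontal axis (normal quantiles) from -2 to 2]

Stem-and-leaf plot:

Stem  Leaf       #
12    2          1
11    67         2
11    00222234   8
10    578        3
10    014        3
 9    6          1
      ----+----+----+----+
Multiply Stem.Leaf by 10**+1

Box plot: [figure not reproduced]

Lowest values (fragment of the extreme-observations table): 104.37 (observation 12), 105.40 (observation 13)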

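(4) W test of normality (take α = 0.05).

Test                  Statistic          p Value
Shapiro-Wilk          W      0.978265    Pr < W      0.9304
Kolmogorov-Smirnov    D      0.128559    Pr > D      >0.1500
Cramer-von Mises      W-Sq   0.044882    Pr > W-Sq   >0.2500
Anderson-Darling      A-Sq   0.247567    Pr > A-Sq   >0.2500

From the output above, W0 = 0.978265 and p = 0.9304 > α = 0.05;

so the null hypothesis H0 (the data come from a normal distribution) cannot be rejected: the departure from normality is not significant.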

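Question 3:

data t2;
  input x1-x4;
cards;
16.7 26.7 6.4 35.0
18.2 28.0 3.2 29.7
16.7 26.7 2.1 34.9
18.1 26.7 4.3 31.5
16.7 26.0 3.0 32.7
18.1 30.2 7.0 34.9
20.2 30.5 4.8 34.4
20.2 29.5 5.5 36.2
21.5 31.5 5.8 36.5
18.8 30.6 5.4 35.4
21.6 27.8 5.4 34.1
21.3 29.5 5.8 35.8
;
proc corr data=t2 cov pearson;   /* covariance matrix and Pearson correlations with p-values */
run;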

(1) Covariance matrix and Pearson correlation matrix

Covariance matrix, DF = 11:

      x1            x2            x3            x4
x1    3.577196970   2.333257576   1.226439394   1.542196970
x2    2.333257576   3.524469697   1.573106061   2.067348485
x3    1.226439394   1.573106061   2.168106061   1.643257576
x4    1.542196970   2.067348485   1.643257576   4.064469697

Pearson correlation matrix (the second line of each row gives the p-value for H0: ρ = 0):

      x1        x2        x3        x4
x1    1.00000   0.65712   0.44039   0.40445
                0.0202    0.1519    0.1922
x2    0.65712   1.00000   0.56908   0.54622
      0.0202              0.0535    0.0662
x3    0.44039   0.56908   1.00000   0.55356
      0.1519    0.0535              0.0619
x4    0.40445   0.54622   0.55356   1.00000
      0.1922    0.0662    0.0619

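(2) Correlation among the variables (take α = 0.10)

In the upper triangle of the Pearson output, the p-values for r13 (0.1519) and r14 (0.1922) are greater than 0.10, so x1 is not significantly correlated with x3 or x4. The p-values for the remaining pairs (r12: 0.0202, r23: 0.0535, r24: 0.0662, r34: 0.0619) are below 0.10, so those correlations are significant at the 0.10 level.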
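The link between the covariance matrix, the correlation matrix, and the significance decision can be sketched in Python. The t statistic t = r·sqrt((n-2)/(1-r²)) with n - 2 degrees of freedom is the standard test of H0: ρ = 0; the critical value 1.812 (two-sided, α = 0.10, 10 degrees of freedom) is hard-coded here as an assumption taken from a t table, and the function names are illustrative.

```python
import math

# Question 3 data: columns x1..x4.
rows = [
    (16.7, 26.7, 6.4, 35.0), (18.2, 28.0, 3.2, 29.7), (16.7, 26.7, 2.1, 34.9),
    (18.1, 26.7, 4.3, 31.5), (16.7, 26.0, 3.0, 32.7), (18.1, 30.2, 7.0, 34.9),
    (20.2, 30.5, 4.8, 34.4), (20.2, 29.5, 5.5, 36.2), (21.5, 31.5, 5.8, 36.5),
    (18.8, 30.6, 5.4, 35.4), (21.6, 27.8, 5.4, 34.1), (21.3, 29.5, 5.8, 35.8),
]


def covariance_matrix(data):
    """Sample covariance matrix with divisor n - 1."""
    n, p = len(data), len(data[0])
    means = [sum(r[j] for r in data) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in data) / (n - 1)
             for j in range(p)] for i in range(p)]


def correlation_matrix(cov):
    """Pearson correlations: r_ij = s_ij / sqrt(s_ii * s_jj)."""
    p = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(p)]
            for i in range(p)]


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


cov = covariance_matrix(rows)
corr = correlation_matrix(cov)
T_CRITICAL = 1.812  # assumed table value: two-sided alpha = 0.10, 10 df

significant = [(i + 1, j + 1) for i in range(4) for j in range(i + 1, 4)
               if abs(t_statistic(corr[i][j], len(rows))) > T_CRITICAL]
print(significant)  # pairs (i, j) whose correlation is significant at the 0.10 level
```

Comparing |t| with the critical value is equivalent to comparing the p-value printed by PROC CORR with α.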

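Question 4:

data t4;
  input num $ y x1 x2;
cards;
1 169 265 3782
2 81 98 3008
3 192 330 2450
4 116 195 2137
5 55 53 2560
6 162 274 2450
7 120 180 3254
8 223 375 3802
9 131 205 2838
10 67 86 2347
;
proc reg data=t4;
  model y = x1 x2 / i;   /* the i option also prints the inverse of X'X */
run;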

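(1) Regression equation and practical interpretation of the parameters

              Parameter
Variable   DF   Estimate
Intercept   1    4.14260
x1          1    0.49482
x2          1    0.00890

From the output above, $\hat{\beta}_0 = 4.14260$, $\hat{\beta}_1 = 0.49482$, $\hat{\beta}_2 = 0.00890$.

The regression equation is $\hat{y} = 4.14260 + 0.49482 x_1 + 0.00890 x_2$,

where y is the factory output, x1 the number of workers, and x2 the cost.

β0 is the baseline output. With the cost x2 held fixed, each one-unit increase in the number of workers x1 raises the output y by 0.49482 units; likewise, with the number of workers x1 held fixed, each one-unit increase in the cost x2 raises the output y by 0.00890 units.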

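(2) Analysis of variance and parameter estimates

Analysis of variance:

Analysis of Variance

                         Sum of       Mean
Source            DF    Squares     Square    F Value    Pr > F
Model              2      27272      13636    2935.52    <.0001
Error              7   32.51607    4.64515
Corrected Total    9      27304

From the analysis of variance table:

$\hat{\sigma}^2 = 4.64515$ (the error mean square)

$R^2 = SSM/SST = 27272/27304 = 0.9988$

The F value is 2935.52, with p < 0.0001.

Parameter estimates:

Parameter Estimates

                 Parameter   Standard
Variable   DF     Estimate      Error   t Value   Pr > |t|
Intercept   1      4.14260    3.55511      1.17     0.2821
x1          1      0.49482    0.00734     67.43     <.0001
x2          1      0.00890    0.00133      6.70     0.0003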
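The regression output can be cross-checked by solving the normal equations X'X b = X'y directly. The following is a minimal sketch in plain Python; the helper `solve` (Gaussian elimination with partial pivoting) and the variable names are illustrative and are not how PROC REG is implemented internally.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


# Question 4 data as (y, x1, x2).
data = [(169, 265, 3782), (81, 98, 3008), (192, 330, 2450), (116, 195, 2137),
        (55, 53, 2560), (162, 274, 2450), (120, 180, 3254), (223, 375, 3802),
        (131, 205, 2838), (67, 86, 2347)]

X = [[1.0, float(x1), float(x2)] for _, x1, x2 in data]   # design matrix with intercept
y = [float(v) for v, _, _ in data]
n, k = len(y), 2                                          # k = number of predictors

xtx = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]
beta = solve(xtx, xty)                                    # [b0, b1, b2]

fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
y_bar = sum(y) / n
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))    # error sum of squares
sst = sum((yi - y_bar) ** 2 for yi in y)                  # corrected total sum of squares
ssm = sst - sse                                           # model sum of squares
r2 = ssm / sst
f_value = (ssm / k) / (sse / (n - k - 1))

print([round(b, 5) for b in beta], round(sse, 5), round(r2, 4), round(f_value, 2))
```

The same quantities appear in the SAS tables: `sse / (n - k - 1)` is the error mean square, and `f_value` is the ratio of the model mean square to the error mean square.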

Question 5:

data t5;
  input x1-x7;
cards;
12.5 16.4 16.7 22.8 29.3 3.017 26.6
7.8 9.9 10.2 12.6 17.6 0.841 10.6
13.4 10.9 9.9 10.9 13.9 1.772 17.8
19.1 19.8 19.0 29.7 39.6 2.449 35.8
8.0 9.8 8.9 11.9 16.2 0.789 13.7
9.7 4.2 4.2 4.6 6.5 0.874 3.9
0.6 0.7 0.7 0.8 1.1 0.056 1.0
13.9 9.4 9.3 9.8 13.3 2.126 17.1
9.1 11.3 9.5 12.2 16.4 1.327 11.6
;
proc princomp data=t5;   /* analyzes the correlation matrix by default */
run;

Eigenvalues of the Correlation Matrix

     Eigenvalue    Difference    Proportion   Cumulative
1    6.36880695    5.97088220    0.9098       0.9098
2    0.39792475    0.23754034    0.0568       0.9667
3    0.16038442    0.11495709    0.0229       0.9896
4    0.04542733    0.02301248    0.0065       0.9961
5    0.02241485    0.01766603    0.0032       0.9993
6    0.00474882    0.00445593    0.0007       1.0000
7    0.00029289                  0.0000       1.0000

Eigenvectors

      Prin1      Prin2      Prin3      Prin4      Prin5      Prin6      Prin7
x1    0.348824   0.612363   0.682050   0.133246   0.136972   -.013602   -.037959
x2    0.390078   -.176727   0.002006   0.456233   -.590520   0.506231   0.058911
x3    0.391810   -.169297   -.110655   0.344580   -.130939   -.813353   -.090308
x4    0.385562   -.349634   0.020863   -.102818   0.402936   0.226117   -.710356
x5    0.383622   -.375484   0.096918   -.047799   0.464939   0.077911   0.691324
x6    0.353720   0.549207   -.715337   0.029717   0.204180   0.127268   0.052690
x7    0.389491   0.016038   0.031998   -.801013   -.441775   -.092768   0.040273

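Eigenvalues:

λ1 = 6.36880695, λ2 = 0.39792475, λ3 = 0.16038442, λ4 = 0.04542733, λ5 = 0.02241485, λ6 = 0.00474882, λ7 = 0.00029289.

The contribution rates and cumulative contribution rates are:

Component   Proportion   Cumulative
1           0.9098       0.9098
2           0.0568       0.9667
3           0.0229       0.9896
4           0.0065       0.9961
5           0.0032       0.9993
6           0.0007       1.0000
7           0.0000       1.0000

Principal components: the first component alone already has a contribution rate above 90% (0.9098), so only the first principal component is retained:

w1 = 0.348824 x1 + 0.390078 x2 + 0.391810 x3 + 0.385562 x4 + 0.383622 x5 + 0.353720 x6 + 0.389491 x7,

where x1, ..., x7 are the standardized variables, because the analysis is based on the correlation matrix.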
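The contribution rates follow directly from the eigenvalues: each proportion is an eigenvalue divided by the sum of all eigenvalues, which equals the number of variables when the correlation matrix is analyzed. A short sketch in Python, where the 90% threshold mirrors the criterion used in the answer and the variable names are illustrative:

```python
# Eigenvalues of the correlation matrix, copied from the PROC PRINCOMP output.
eigenvalues = [6.36880695, 0.39792475, 0.16038442, 0.04542733,
               0.02241485, 0.00474882, 0.00029289]
# Loadings of the first principal component (column Prin1).
prin1 = [0.348824, 0.390078, 0.391810, 0.385562, 0.383622, 0.353720, 0.389491]

total = sum(eigenvalues)                      # about 7, the number of variables
proportion = [v / total for v in eigenvalues]

cumulative = []
running = 0.0
for p in proportion:
    running += p
    cumulative.append(running)

# Smallest number of components whose cumulative contribution reaches 90%.
THRESHOLD = 0.90
n_components = next(i + 1 for i, c in enumerate(cumulative) if c >= THRESHOLD)

norm_squared = sum(a * a for a in prin1)      # eigenvectors are scaled to unit length

print([round(p, 4) for p in proportion])
print([round(c, 4) for c in cumulative])
print(n_components, round(norm_squared, 4))
```

A first component with nearly equal positive loadings on all seven variables, as here, is usually read as an overall size factor.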

Question 6:

data t6;
  input xy $ x1-x7;   /* xy is the class label */
cards;
1 36.05 7.13 7.75 16.67 11.68 2.38 12.88
1 37.69 7.01 8.94 16.15 11.08 0.83 11.67
1 38.69 6.01 8.82 14.79 11.44 1.74 13.23
1 37.75 9.61 8.49 13.15 9.76 1.28 11.28
1 35.71 8.04 8.31 15.13 7.76 1.41 13.25
1 39.77 8.49 12.94 19.27 11.05 2.04 13.29
1 40.91 7.32 8.94 17.60 12.75 1.14 14.80
1 33.70 7.59 10.98 18.82 14.73 1.78 10.10
1 35.02 4.72 6.28 10.03 7.15 1.93 10.39
2 52.41 7.70 9.98 12.53 11.70 2.31 14.69
2 52.65 3.84 9.16 13.03 15.26 1.98 14.57
2 55.85 5.50 7.45 9.55 9.52 2.21 16.30
2 44.68 7.32 14.51 17.13 12.08 1.26 11.57
2 45.79 7.66 10.36 16.56 12.86 2.75 11.69
2 50.37 11.35 13.30 19.25 14.59 2.75 14.87
;
data t61;             /* the sample to be classified */
  input x1-x7;
cards;
64.34 8.00 22.22 20.06 15.12 0.72 22.89
;
proc discrim data=t6 testdata=t61
     out=a1
     outstat=a2 outcross=a3
     testout=a4 method=normal
     list crosslist testlist all;
  class xy;
  var x1-x7;
  priors equal;
run;

(1) The three covariance matrices

The first matrix below, labeled S1 in the original answer, is the pooled within-class SSCP matrix rather than the covariance matrix of class 1: each of its entries is 13 times the corresponding entry of the pooled covariance matrix S.

Pooled within-class SSCP matrix (labeled S1 in the original answer):

Variable   x1            x2            x3            x4            x5            x6           x7
x1         136.3561056   -12.7039611   -32.1020333   -43.9701278   -4.7449722    -0.2789778   61.7996722
x2         -12.7039611   47.6857056    32.4559333    47.6934722    9.3323944     1.9405222    -0.4706611
x3         -32.1020333   32.4559333    63.9501333    77.4960000    29.5911333    -1.5631000   -11.4411667
x4         -43.9701278   47.6934722    77.4960000    131.1099722   63.9872611    1.9098222    -6.8091944
x5         -4.7449722    9.3323944     29.5911333    63.9872611    65.8556389    1.1910111    -1.2275389
x6         -0.2789778    1.9405222     -1.5631000    1.9098222     1.1910111     3.4526222    0.9389556
x7         61.7996722    -0.4706611    -11.4411667   -6.8091944    -1.2275389    0.9389556    37.3281722

S2 (covariance matrix of class 2, DF = 5):

Variable   x1             x2            x3            x4             x5            x6            x7
x1         18.54121667    -3.74661667   -8.57356667   -11.76273000   -2.16989667   0.52238000    7.93868333
x2         -3.74661667    6.37465667    4.28286667    6.66303000     0.83049667    0.63964000    -0.64302333
x3         -8.57356667    4.28286667    6.95838667    8.26832000     1.92554667    -0.42338000   -3.00631333
x4         -11.76273000   6.66303000    8.26832000    12.81671000    4.33151000    0.26400000    -4.10899000
x5         -2.16989667    0.83049667    1.92554667    4.33151000     4.32841667    0.20144000    -0.75466333
x6         0.52238000     0.63964000    -0.42338000   0.26400000     0.20144000    0.30972000    0.29376000
x7         7.93868333     -0.64302333   -3.00631333   -4.10899000    -0.75466333   0.29376000    3.61457667

S (pooled within-class covariance matrix, DF = 13):

Variable   x1            x2            x3            x4            x5            x6            x7
x1         10.48893120   -0.97722778   -2.46938718   -3.38231752   -0.36499786   -0.02145983   4.75382094
x2         -0.97722778   3.66813120    2.49661026    3.66872863    0.71787650    0.14927094    -0.03620470
x3         -2.46938718   2.49661026    4.91924103    5.96123077    2.27624103    -0.12023846   -0.88008974
x4         -3.38231752   3.66872863    5.96123077    10.08538248   4.92209701    0.14690940    -0.52378419
x5         -0.36499786   0.71787650    2.27624103    4.92209701    5.06581838    0.09161624    -0.09442607
x6         -0.02145983   0.14927094    -0.12023846   0.14690940    0.09161624    0.26558632    0.07222735
x7         4.75382094    -0.03620470   -0.88008974   -0.52378419   -0.09442607   0.07222735    2.87139786
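The relation between the class covariance matrices and the pooled matrix, S = ((n1 - 1)S1 + (n2 - 1)S2) / (n1 + n2 - 2), can be checked with a short Python sketch. The function names are illustrative, and the class labels are taken from the data step of the answer (the first nine rows are class 1, the next six are class 2).

```python
# Training data from Question 6 (x1..x7), split by class.
class1 = [
    (36.05, 7.13, 7.75, 16.67, 11.68, 2.38, 12.88),
    (37.69, 7.01, 8.94, 16.15, 11.08, 0.83, 11.67),
    (38.69, 6.01, 8.82, 14.79, 11.44, 1.74, 13.23),
    (37.75, 9.61, 8.49, 13.15, 9.76, 1.28, 11.28),
    (35.71, 8.04, 8.31, 15.13, 7.76, 1.41, 13.25),
    (39.77, 8.49, 12.94, 19.27, 11.05, 2.04, 13.29),
    (40.91, 7.32, 8.94, 17.60, 12.75, 1.14, 14.80),
    (33.70, 7.59, 10.98, 18.82, 14.73, 1.78, 10.10),
    (35.02, 4.72, 6.28, 10.03, 7.15, 1.93, 10.39),
]
class2 = [
    (52.41, 7.70, 9.98, 12.53, 11.70, 2.31, 14.69),
    (52.65, 3.84, 9.16, 13.03, 15.26, 1.98, 14.57),
    (55.85, 5.50, 7.45, 9.55, 9.52, 2.21, 16.30),
    (44.68, 7.32, 14.51, 17.13, 12.08, 1.26, 11.57),
    (45.79, 7.66, 10.36, 16.56, 12.86, 2.75, 11.69),
    (50.37, 11.35, 13.30, 19.25, 14.59, 2.75, 14.87),
]


def sscp(rows):
    """Sums of squares and cross products about the group mean."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows)
             for j in range(p)] for i in range(p)]


def scale(matrix, divisor):
    """Divide every entry of a matrix by the same number."""
    return [[v / divisor for v in row] for row in matrix]


w1, w2 = sscp(class1), sscp(class2)
s1 = scale(w1, len(class1) - 1)                    # class 1 covariance, DF = 8
s2 = scale(w2, len(class2) - 1)                    # class 2 covariance, DF = 5
within = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(w1, w2)]
pooled = scale(within, len(class1) + len(class2) - 2)   # pooled covariance, DF = 13

print(round(s1[0][0], 6), round(s2[0][0], 6), round(pooled[0][0], 6))
```

Here `within` is the pooled within-class SSCP matrix and `s1` is the covariance matrix of class 1, which does not appear in the printed answer.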

(2) Linear discriminant functions by distance discrimination, and the misclassification rate by cross-validation

Linear Discriminant Function for xy

Variable    1            2
Constant    -206.18758   -382.57458
x1          16.60240     23.14210
x2          -2.77150     -3.89531
x3          -5.80267     -5.94472
x4          14.17359     17.23215
x5          -8.00073     -10.19191
x6          7.49174      12.60276
x7          -22.87514    -32.83581

From the output above, the linear discriminant functions are:

W1 = -206.18758 + 16.60240 x1 - 2.77150 x2 - 5.80267 x3 + 14.17359 x4 - 8.00073 x5 + 7.49174 x6 - 22.87514 x7

W2 = -382.57458 + 23.14210 x1 - 3.89531 x2 - 5.94472 x3 + 17.23215 x4 - 10.19191 x5 + 12.60276 x6 - 32.83581 x7

Posterior Probability of Membership in xy

       From   Classified
Obs    xy     into xy      1        2
1      1      1            1.0000   0.0000
2      1      1            1.0000   0.0000
3      1      1            1.0000   0.0000
4      1      1            1.0000   0.0000
5      1      1            1.0000   0.0000
6      1      2 *          0.0000   1.0000
7      1      1            1.0000   0.0000
8      1      1            1.0000   0.0000
9      1      1            1.0000   0.0000
10     2      2            0.0000   1.0000
11     2      2            0.0000   1.0000
12     2      2            0.0000   1.0000
13     2      1 *          1.0000   0.0000
14     2      2            0.0000   1.0000
15     2      2            0.0000   1.0000

* Misclassified observation

Misclassification rate by cross-validation: P = 2/15 = 13.33%

(3) Classification of the sample to be classified

Posterior Probability of Membership in xy

       Classified
Obs    into xy      1        2
1      2            0.0000   1.0000

The sample to be classified belongs to class 2.
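The classification of the new sample can be reproduced by evaluating both linear discriminant functions and choosing the larger score. The sketch below uses the coefficients printed in the output above; with equal priors and a pooled covariance matrix the posterior probabilities are proportional to exp(W_k), which is the assumption behind the `posterior` helper. All function names are illustrative.

```python
import math

# Coefficients from the "Linear Discriminant Function for xy" table:
# the constant first, then the coefficients of x1..x7.
COEFFICIENTS = {
    1: [-206.18758, 16.60240, -2.77150, -5.80267, 14.17359, -8.00073, 7.49174, -22.87514],
    2: [-382.57458, 23.14210, -3.89531, -5.94472, 17.23215, -10.19191, 12.60276, -32.83581],
}


def scores(x):
    """Value of each linear discriminant function at the observation x."""
    return {k: c[0] + sum(b * v for b, v in zip(c[1:], x))
            for k, c in COEFFICIENTS.items()}


def posterior(x):
    """Posterior probabilities under equal priors (softmax of the scores)."""
    s = scores(x)
    top = max(s.values())                      # subtract the maximum for numerical stability
    weights = {k: math.exp(v - top) for k, v in s.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def classify(x):
    """Assign x to the class with the largest discriminant score."""
    s = scores(x)
    return max(s, key=s.get)


new_sample = (64.34, 8.00, 22.22, 20.06, 15.12, 0.72, 22.89)   # sample to be classified
first_obs = (36.05, 7.13, 7.75, 16.67, 11.68, 2.38, 12.88)     # training observation 1

print(classify(new_sample), classify(first_obs))
print({k: round(p, 4) for k, p in posterior(new_sample).items()})
```

Evaluating a training observation this way is a resubstitution check, so it need not agree with the cross-validation table, in which each observation is classified by functions fitted without it.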

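Question 7 (15 points). Carry out a cluster analysis of the data from the previous question (16 samples in total):

(1) Single linkage (shortest distance) method: write out the clustering process and draw the dendrogram (take nclusters=4);

(2) Complete linkage (longest distance) method: write out the clustering process, draw the dendrogram (take nclusters=4), and find the four clustering statistics;

(3) Give the result of the fast clustering method with 3 clusters, and plot the classification in a plane coordinate system.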

data t7;
  input x1-x7;
cards;
36.05 7.13 7.75 16.67 11.68 2.38 12.88
37.69 7.01 8.94 16.15 11.08 0.83 11.67
38.69 6.01 8.82 14.79 11.44 1.74 13.23
37.75 9.61 8.49 13.15 9.76 1.28 11.28
35.71 8.04 8.31 15.13 7.76 1.41 13.25
39.77 8.49 12.94 19.27 11.05 2.04 13.29
40.91 7.32 8.94 17.60 12.75 1.14 14.80
33.70 7.59 10.98 18.82 14.73 1.78 10.10
35.02 4.72 6.28 10.03 7.15 1.93 10.39
52.41 7.70 9.98 12.53 11.70 2.31 14.69
52.65 3.84 9.16 13.03 15.26 1.98 14.57
55.85 5.50 7.45 9.55 9.52 2.21 16.30
44.68 7.32 14.51 17.13 12.08 1.26 11.57
45.79 7.66 10.36 16.56 12.86 2.75 11.69
50.37 11.35 13.30 19.25 14.59 2.75 14.87
64.34 8.00 22.22 20.06 15.12 0.72 22.89
;
/* (1) single linkage on standardized variables */
proc cluster data=t7 method=single std nonorm outtree=tree1;
  var x1-x7;
run;
proc tree data=tree1 graphics horizontal out=c1 nclusters=4;
run;
proc print data=c1;
run;
/* (2) complete linkage on standardized variables */
proc cluster data=t7 method=complete std nonorm outtree=tree2;
  var x1-x7;
run;
proc tree data=tree2 graphics horizontal out=c2 nclusters=4;
run;
proc print data=c2;
run;
/* (3) fast clustering into 3 clusters; t7 holds all 16 samples */
proc fastclus data=t7 maxc=3 distance list cluster=c out=d;
  var x1-x7;
run;
proc plot data=d;
  plot x2*x1=c;
run;

(1) Single linkage method: clustering process and dendrogram (nclusters=4)

Cluster History (the Clusters Joined and FREQ columns are illegible in the source)

NCL   Min Dist
15    1.3976
14    1.4581
13    1.525
12    1.5721
11    1.6783
10    1.8356
9     1.8609
8     1.865
7     1.9501
6     2.0097
5     2.126
4     2.6429
3     2.707
2     2.7151
1     5.0941

Dendrogram: [figure not reproduced; legible leaf labels, top to bottom: OB1, OB3, OB14, OB2, OB7, OB4, OB5, OB6, OB13, OB10, OB12, OB8, OB11]
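The clustering process itself can be illustrated with a naive single-linkage implementation in Python. This is a sketch under stated assumptions: the variables are standardized to mean 0 and standard deviation 1 (divisor n - 1), which is what the `std` option is assumed to do, and distances are plain Euclidean. It is not the algorithm PROC CLUSTER uses internally, and the function names are illustrative.

```python
import math

# The 16 samples from Question 6 (x1..x7), in the order of the data step.
rows = [
    (36.05, 7.13, 7.75, 16.67, 11.68, 2.38, 12.88), (37.69, 7.01, 8.94, 16.15, 11.08, 0.83, 11.67),
    (38.69, 6.01, 8.82, 14.79, 11.44, 1.74, 13.23), (37.75, 9.61, 8.49, 13.15, 9.76, 1.28, 11.28),
    (35.71, 8.04, 8.31, 15.13, 7.76, 1.41, 13.25), (39.77, 8.49, 12.94, 19.27, 11.05, 2.04, 13.29),
    (40.91, 7.32, 8.94, 17.60, 12.75, 1.14, 14.80), (33.70, 7.59, 10.98, 18.82, 14.73, 1.78, 10.10),
    (35.02, 4.72, 6.28, 10.03, 7.15, 1.93, 10.39), (52.41, 7.70, 9.98, 12.53, 11.70, 2.31, 14.69),
    (52.65, 3.84, 9.16, 13.03, 15.26, 1.98, 14.57), (55.85, 5.50, 7.45, 9.55, 9.52, 2.21, 16.30),
    (44.68, 7.32, 14.51, 17.13, 12.08, 1.26, 11.57), (45.79, 7.66, 10.36, 16.56, 12.86, 2.75, 11.69),
    (50.37, 11.35, 13.30, 19.25, 14.59, 2.75, 14.87), (64.34, 8.00, 22.22, 20.06, 15.12, 0.72, 22.89),
]


def standardize(data):
    """Scale every column to mean 0 and standard deviation 1 (divisor n - 1)."""
    n, p = len(data), len(data[0])
    means = [sum(r[j] for r in data) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in data) / (n - 1)) for j in range(p)]
    return [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in data]


def single_linkage(points):
    """Return the merge history as (clusters remaining, merged members, distance)."""
    clusters = {i: [i] for i in range(len(points))}
    history = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for ai in range(len(keys)):
            for bi in range(ai + 1, len(keys)):
                a, b = keys[ai], keys[bi]
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[a] = clusters[a] + clusters.pop(b)
        history.append((len(clusters), sorted(i + 1 for i in clusters[a]), d))
    return history


history = single_linkage(standardize(rows))
for n_clusters, members, distance in history:
    print(n_clusters, members, round(distance, 4))
```

Each printed line corresponds to one row of a cluster history: the number of clusters left after the merge, the observations in the newly formed cluster, and the merge distance. Replacing `min` with `max` in the inner distance gives complete linkage.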
