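Chapter 7: Principal Component Analysis (PCA)

Outline
The basic idea of PCA
General model and algorithm
PCA in SPSS
Applications of PCA

Origins of PCA
Pearson (1901) first introduced the method.
Hotelling (1933) developed it into its mature form.
One of the most widely used multivariate techniques.

I. The basic idea of PCA

Principal component analysis is a multivariate statistical method that uses dimension reduction to turn many indicators into a few composite indicators. The composite indicators are usually expressed as linear combinations of the original indicators and, so that the information they carry does not overlap, the new indicators are required to be mutually uncorrelated.

What does PCA do?
Original data matrix, say n by p.
New data matrix, say n by q, with q < p.
Example: to study the track performance of athletes from 55 countries using results in 8 track events, the original data matrix is $X_{55\times 8}$. PCA gives a new data matrix $Z_{55\times 2}$: two principal components are kept, the first representing overall ability and the second sprint ability. The transformation is an orthogonal rotation. Which country's athletes are the strongest?

What about the new data?
Each new variable is some linear combination of all the old variables, so it pools the information in the original indicators:
$$z_1^s = 0.132x_1^s + 0.059x_2^s + 0.151x_3^s + 0.158x_4^s + 0.161x_5^s + 0.158x_6^s + 0.159x_7^s + 0.149x_8^s$$
$$z_2^s = 0.375x_1^s + 0.821x_2^s + 0.086x_3^s - 0.026x_4^s - 0.067x_5^s - 0.176x_6^s - 0.183x_7^s - 0.265x_8^s$$
New variables are chosen so as to capture most of the variability in the original variables.
New variables are uncorrelated! The original indicators are strongly correlated; the new indicators are mutually uncorrelated.
The new variables are called "scores" or "principal components".

The essence of PCA: simplifying the data
Use as few variables (principal components) as possible to reflect as much of the information in the original data as possible, so as to simplify the data and bring out what matters most.
The quantity that reflects the characteristics of the original data: variance (dispersion).
Principal component: an optimally weighted linear combination of the original variables.
Optimal weighting:
First principal component: find a linear combination of the original data with the largest variance (the direction in which the data are most dispersed).
Second principal component: find a linear combination of the original data with the next-largest variance that is uncorrelated with the first principal component.
……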
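[Figure: scatter plot of Zscore(profit) against Zscore(electricity sales); axes x1 and x2]

A simple two-variable example
Electricity sales and profit: the power supply bureaus are widely dispersed on both indicators. Ignoring either indicator would bias the evaluation considerably!

[Figure: standardized electricity sales against standardized profit, with the rotated axes (labelled m1 and m2) and the scores Z1 and Z2]

Orthogonal rotation: the distances between samples are unchanged.
First principal component: find a linear combination of the original data with the largest variance (the direction in which the data are most dispersed).
The first principal component clearly carries more information than the second, so little information is lost by ignoring the second.

[Figure: scatter plot of run200m against run100m]

Using each country's 100 m and 200 m results, compute the variance of the following variables:
X100m and X200m
0.707*X100m + 0.707*X200m
0.167*X100m + 0.986*X200m

Descriptive Statistics
                     N     Std. Deviation   Variance
run100m              55    .35143           .124
run200m              55    1.37541          1.892
p707                 55    1.1065480        1.224
p167                 55    1.3835182        1.914
Valid N (listwise)   55

II. Model and algorithm of PCA

Let x denote standardized variables.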
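Original data matrix: $X_s = [x_1, x_2, \dots, x_p]$.
Goal of PCA: find the linear combination of the original data with the largest variance.
Let the combination coefficients be $\mu_{p\times 1} = [m_1, m_2, \dots, m_p]^T$. That is, find a $\mu$ such that
$$z = X_s\mu = m_1x_1 + m_2x_2 + \dots + m_px_p$$
has the largest variance.
$$\operatorname{var}(z) = \frac{1}{n-1}z'z = \frac{1}{n-1}\mu'X_s'X_s\mu, \qquad \frac{1}{n-1}X_s'X_s = R \quad\therefore\quad \operatorname{var}(z) = \mu'R\mu$$
For standardized variables, the sample covariance matrix equals the sample correlation matrix.
$$\max_{\mu}\ \mu'R\mu \qquad \text{s.t.}\quad \mu'\mu = 1$$
Form the Lagrangian
$$L = \mu'R\mu - \lambda(\mu'\mu - 1)$$
$$\frac{\partial L}{\partial\mu} = 2R\mu - 2\lambda\mu = 0 \quad\Rightarrow\quad R\mu = \lambda\mu \quad\text{or}\quad (R - \lambda I)\mu = 0$$
The solutions are the eigenvalues of R, $\lambda_1 > \lambda_2 > \dots > \lambda_p$. The corresponding eigenvectors $\mu_1, \mu_2, \dots, \mu_p$ are the weight vectors, and multiplying $X_s$ by them gives the vectors $z_1 = X_s\mu_1$, $z_2 = X_s\mu_2$, …, $z_p = X_s\mu_p$, which are the principal components of $X_s$, with
$$\operatorname{var}(z_i) = \mu_i'R\mu_i = \mu_i'\lambda_i\mu_i = \lambda_i\mu_i'\mu_i = \lambda_i$$
The principal component $z_1$ that belongs to the largest eigenvalue $\lambda_1$ is called the first principal component and has the largest variance; next come the second principal component $z_2$, the third principal component $z_3$, and so on. With p variables there are p principal components.
Notes:
R is a symmetric matrix, so eigenvectors that belong to two different eigenvalues are orthogonal.
The variance of the i-th principal component is the corresponding eigenvalue. Because the first eigenvalue is the largest, the first principal component has the largest variance.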
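The claim that the variance of a linear combination is a quadratic form, and that the eigenvector of the largest eigenvalue maximizes it over unit-length weights, can be checked numerically on the run100m/run200m table given earlier. The sketch below is only an illustration under stated assumptions: it works with the 2x2 covariance matrix S rather than the correlation matrix R (the argument is the same), and the covariance between the two events is not printed in the source, so it is backed out from the reported variance of the 0.707/0.707 combination. The function names are hypothetical.

```python
import math

# Variances of run100m and run200m, from the descriptive statistics table.
VAR_100M, VAR_200M = 0.124, 1.892
# Assumption: the covariance is not printed in the source. It is backed out
# from var(0.707*X100m + 0.707*X200m) = 1.224 in the same table.
COV = (1.224 / 0.707 ** 2 - VAR_100M - VAR_200M) / 2.0


def combo_variance(w1, w2):
    """Variance of w1*X100m + w2*X200m, i.e. w'Sw for the 2x2 covariance S."""
    return w1 * w1 * VAR_100M + w2 * w2 * VAR_200M + 2.0 * w1 * w2 * COV


def top_eigenpair():
    """Largest eigenvalue of S and its unit eigenvector (2x2 closed form)."""
    trace = VAR_100M + VAR_200M
    det = VAR_100M * VAR_200M - COV * COV
    lam = (trace + math.sqrt(trace * trace - 4.0 * det)) / 2.0
    # The first row of (S - lam*I)v = 0 is satisfied by v = (COV, lam - VAR_100M).
    v1, v2 = COV, lam - VAR_100M
    norm = math.hypot(v1, v2)
    return lam, (v1 / norm, v2 / norm)


def best_on_grid(steps=3600):
    """Largest w'Sw over unit vectors w = (cos t, sin t) on a grid of angles."""
    return max(
        combo_variance(math.cos(math.pi * k / steps), math.sin(math.pi * k / steps))
        for k in range(steps)
    )


if __name__ == "__main__":
    lam, vec = top_eigenpair()
    print("var(p707) =", round(combo_variance(0.707, 0.707), 3))
    print("var(p167) =", round(combo_variance(0.167, 0.986), 3))
    print("largest eigenvalue =", round(lam, 3), "direction =", vec)
    print("best variance found on the grid =", round(best_on_grid(), 3))
```

Under these assumptions the variance of the 0.167/0.986 combination comes out close to the tabulated 1.914, and no unit-length weight vector on the grid exceeds the largest eigenvalue.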
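Let $U = (\mu_1, \mu_2, \dots, \mu_p)_{p\times p}$ and $Z = (z_1, z_2, \dots, z_p)_{n\times p}$. Then $Z = X_sU$, where U is an orthogonal matrix and Z is the matrix of principal component scores, and
$$\operatorname{var}(Z) = U'RU = \Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_p)$$
Because the principal components are mutually uncorrelated, the variance of their sum equals the sum of all the eigenvalues:
$$\operatorname{var}(z_1 + z_2 + \dots + z_p) = \lambda_1 + \lambda_2 + \dots + \lambda_p$$
$$\sum_{i=1}^{p}\lambda_i = \operatorname{trace}(U'RU) = \operatorname{trace}(RUU') = \operatorname{trace}(R) = p \qquad [\operatorname{trace}(ABC) = \operatorname{trace}(BCA)]$$
Together, the principal components account for the entire variance of the original data.

How to find eigenvalues and eigenvectors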
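$$S = \begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix}$$
$$|S - \lambda I| = \begin{vmatrix} 1-\lambda & 1 \\ 1 & 3-\lambda \end{vmatrix} = (1-\lambda)(3-\lambda) - 1 = 0$$
that is, $\lambda^2 - 4\lambda + 2 = 0$, so the eigenvalues of S are $\lambda_1 = 3.414$ and $\lambda_2 = 0.586$.
For $\lambda_1 = 3.414$, the equation $S\mu_1 = \lambda_1\mu_1$ reads
$$\mu_{11} + \mu_{21} = 3.414\,\mu_{11}, \qquad \mu_{11} + 3\mu_{21} = 3.414\,\mu_{21}$$
so $\mu_{11} = 0.414\,\mu_{21}$. Taking $\mu_{21} = 1$ gives $\mu_{11} = 0.414$, and the eigenvector of S for $\lambda_1$ is $(0.414, 1)'$ up to scale.
In general, $S\mu_i = \lambda_i\mu_i$.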
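The hand calculation above can be reproduced with a few lines of code. This is a minimal sketch that assumes the matrix in the worked example is S = [[1, 1], [1, 3]], as read from the two scalar equations; the helper names are hypothetical, and the closed form only covers symmetric 2x2 matrices.

```python
import math

# Assumption: the matrix of the worked example, as read from its equations.
S = [[1.0, 1.0],
     [1.0, 3.0]]


def eigenvalues_2x2(m):
    """Roots of det(m - lam*I) = lam^2 - trace*lam + det = 0, largest first."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def eigenvector_2x2(m, lam):
    """Eigenvector for lam, scaled so that its second entry equals 1."""
    # First row of (m - lam*I)v = 0:  (m00 - lam)*v1 + m01*v2 = 0 with v2 = 1.
    return (-m[0][1] / (m[0][0] - lam), 1.0)


if __name__ == "__main__":
    lam1, lam2 = eigenvalues_2x2(S)
    print("eigenvalues:", round(lam1, 3), round(lam2, 3))
    print("eigenvector for lam1:", eigenvector_2x2(S, lam1))
```

Dividing the eigenvector by its length gives the unit-norm weight vector that PCA uses.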
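Correlation Matrix
        X1      X2      X3
X1      1.000   .562    .704
X2      .562    1.000   .304
X3      .704    .304    1.000

Eigenvalues and eigenvectors, $U = (\mu_1, \mu_2, \mu_3)$:
        μ1      μ2      μ3
x1      0.646   -0.095  -0.757
x2      0.505   0.797   0.331
x3      0.572   -0.596  0.563
λ       2.063   0.706   0.231

[Figure: three-dimensional scatter plot of X1, X2 and X3]

Example: three standardized variables x1, x2, x3; n = 300.
The three principal components:
$$z_1 = 0.646x_1 + 0.505x_2 + 0.572x_3$$
$$z_2 = -0.095x_1 + 0.797x_2 - 0.596x_3$$
$$z_3 = -0.757x_1 + 0.331x_2 + 0.563x_3$$

[Figure: three-dimensional scatter plot of the scores z1, z2 and z3]

$\lambda_1 + \lambda_2 + \lambda_3 = 3$
Principal component scores: $Z = X_sU$. Note: the principal component scores here are unstandardized.

Descriptive Statistics
                     N     Minimum   Maximum   Mean       Std. Deviation   Variance
x1                   300   -2.738    3.031     -.00002    .999990          1.000
x2                   300   -2.803    3.033     -.00002    .999995          1.000
x3                   300   -2.340    3.056     -.00001    1.000012         1.000
z1                   300   -3.58     4.33      .0000      1.43609          2.062
z2                   300   -2.32     2.23      .0000      .83973           .705
z3                   300   -1.35     1.56      .0000      .48065           .231
z577                 300   -3.4689   4.2681    -.000027   1.4298346        2.044
Valid N (listwise)   300

Principal component analysis applies an orthogonal rotation to the original data matrix so that the first principal component retains the largest variance of the original data, the second retains the next largest, and so on, with the principal components mutually uncorrelated. Components with a small variance contribution can then be ignored, which achieves the dimension reduction.
$\lambda_i / \sum_{j=1}^{p}\lambda_j$: the variance contribution rate of the i-th principal component. The first principal component explains the largest share of the variance.
$\sum_{i=1}^{k}\lambda_i / \sum_{j=1}^{p}\lambda_j$: the cumulative contribution rate of the first k principal components.

                               z1      z2      z3
Eigenvalue                     2.063   0.706   0.231
Variance contribution rate     68.3%   24.0%   7.7%
Cumulative contribution rate   68.3%   92.3%   100%

Keeping one principal component explains 68.3% of the variance; keeping two explains 92.3%.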
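The eigenvalues, eigenvectors and contribution rates of the three-variable example can be recomputed from the printed correlation matrix. The sketch below uses power iteration with deflation so that it needs only the standard library; this is a stand-in chosen for self-containment, not the method used to produce the slides, and in practice a library routine such as `numpy.linalg.eigh` would normally be used. Eigenvector signs are arbitrary, so a column may come out with all signs flipped. Note also that the rates computed from the rounded eigenvalues (about 68.8%, 23.5% and 7.7%) differ slightly from the 68.3% and 24.0% quoted in the text; the source does not explain the difference.

```python
import math

# Correlation matrix of x1, x2, x3 as printed in the text.
R = [[1.000, 0.562, 0.704],
     [0.562, 1.000, 0.304],
     [0.704, 0.304, 1.000]]


def mat_vec(m, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def power_iteration(m, iterations=2000):
    """Dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    v = [1.0, 0.5, 0.25]  # arbitrary start vector
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(a * b for a, b in zip(v, mat_vec(m, v)))  # Rayleigh quotient
    return lam, v


def deflate(m, lam, v):
    """Subtract lam * v v' so that the next eigenpair becomes dominant."""
    n = len(v)
    return [[m[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]


def principal_components(m):
    """All eigenpairs of m, largest eigenvalue first."""
    pairs, work = [], [row[:] for row in m]
    for _ in range(len(m)):
        lam, v = power_iteration(work)
        pairs.append((lam, v))
        work = deflate(work, lam, v)
    return pairs


def contribution_rates(eigvals):
    """Share of the total variance carried by each component."""
    total = sum(eigvals)
    return [lam / total for lam in eigvals]


if __name__ == "__main__":
    pairs = principal_components(R)
    rates = contribution_rates([lam for lam, _ in pairs])
    for (lam, vec), rate in zip(pairs, rates):
        print(round(lam, 3), [round(x, 3) for x in vec], f"{rate:.1%}")
```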
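[Figure: conceptual model]

Principal component loading matrix
The loading matrix is the matrix of correlations between the standardized principal components $Z_s$ and the original data $X_s$. It shows how strongly each principal component is correlated with each original variable x, which helps in interpreting the components.
$$Z_s = Z\Lambda^{-1/2}$$
$$F = \operatorname{corr}(X_s, Z_s) = \frac{1}{n-1}X_s'Z_s = \frac{1}{n-1}X_s'Z\Lambda^{-1/2} = \frac{1}{n-1}X_s'X_sU\Lambda^{-1/2} = RU\Lambda^{-1/2} = U\Lambda\Lambda^{-1/2} = U\Lambda^{1/2}$$
$$F = U\Lambda^{1/2} = \left(\sqrt{\lambda_1}\,\mu_1,\ \sqrt{\lambda_2}\,\mu_2,\ \dots,\ \sqrt{\lambda_p}\,\mu_p\right)$$
Communality of a variable: $\sum_{j=1}^{c}f_{ij}^2$, the sum of squares of the first c elements in row i of the loading matrix. It measures how much of the variance of $x_i$ is explained by the first c principal components.

Principal component loadings
        z1      z2      z3
x1      .9279   -.0798  -.3641
x2      .7255   .6696   .1590
x3      .8222   -.5008  .2706

From the loading matrix: z1 is positively correlated with every x, so it acts as an overall evaluation; z2 is positively correlated with x2 and negatively correlated with x3, so it reflects the gap between a sample's values on these two indicators.
$.7255^2 + .6696^2 = 0.975$: the first two principal components account for 97.5% of the variance of x2.

III. PCA in SPSS

SPSS procedure:
1) Analyze → Data Reduction → Factor… (click the menu items in order to open the Factor Analysis dialog).
2) Specify the variables to analyse in the dialog, as shown in Figure 1. PCA is carried out with the factor analysis procedure in SPSS.
[Figure 1]
3) In the dialog of Figure 1, click the "Extraction…" button to open the dialog of Figure 2, and choose the principal components extraction method.
[Figure 2]
Default options: analyse the correlation matrix; display the unrotated factor solution; retain components with eigenvalue ≥ 1.
Callouts in Figure 2: factor extraction method; whether to analyse the correlation matrix or the covariance matrix; scree plot; rule for the number of components (select by the size of the eigenvalues, or specify the number directly).
4) Click "Scores" to set up the principal component scores: save each component score as a variable; display the factor (component) score coefficient matrix.
Note: what is saved here are the standardized principal components, that is, $Z_s$.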
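$$Z_s = Z\Lambda^{-1/2} = \left(\frac{z_1}{\sqrt{\lambda_1}},\ \frac{z_2}{\sqrt{\lambda_2}},\ \dots,\ \frac{z_p}{\sqrt{\lambda_p}}\right)$$
Also, $Z_s = Z\Lambda^{-1/2} = X_sU\Lambda^{-1/2}$.

Component Score Coefficient Matrix
        Component
        1       2       3
x1      .450    -.113   -1.575
x2      .352    .949    .688
x3      .398    -.710   1.171
Extraction method: principal component analysis.

Results:

Descriptive Statistics
                                     N     Minimum    Maximum   Mean       Std. Deviation   Variance
z1                                   300   -3.58      4.33      .0000      1.43609          2.062
z2                                   300   -2.32      2.23      .0000      .83973           .705
z3                                   300   -1.35      1.56      .0000      .48065           .231
REGR factor score 1 for analysis 1   300   -2.49212   3.01248   .0000000   1.00000000       1.000
REGR factor score 2 for analysis 1   300   -2.76353   2.65771   .0000000   1.00000000       1.000
REGR factor score 3 for analysis 1   300   -2.80741   3.23832   .0000000   1.00000000       1.000
Valid N (listwise)                   300

The three principal components, unstandardized:
$$z_1 = 0.646x_1 + 0.505x_2 + 0.572x_3$$
$$z_2 = -0.095x_1 + 0.797x_2 - 0.596x_3$$
$$z_3 = -0.757x_1 + 0.331x_2 + 0.563x_3$$
and standardized:
$$z_1^s = 0.450x_1 + 0.352x_2 + 0.398x_3$$
$$z_2^s = -0.113x_1 + 0.949x_2 - 0.710x_3$$
$$z_3^s = -1.575x_1 + 0.688x_2 + 1.171x_3$$
$$z_i^s = \frac{z_i}{\sqrt{\lambda_i}}$$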
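The loading matrix, the communalities and the score coefficient matrix all follow from U and the eigenvalues by column scaling, so the printed tables can be checked against each other. The sketch below takes the rounded eigenvectors and eigenvalues of the three-variable example as given; the function names are hypothetical, and small differences in the last digit are to be expected because the inputs are rounded.

```python
import math

# Eigenvalues and eigenvectors of the three-variable example (rounded values
# from the text). Rows of U are x1, x2, x3; columns are mu1, mu2, mu3.
EIGENVALUES = [2.063, 0.706, 0.231]
U = [[0.646, -0.095, -0.757],
     [0.505, 0.797, 0.331],
     [0.572, -0.596, 0.563]]


def loadings(u, eigvals):
    """F = U Lambda^(1/2): column j of U scaled by sqrt(lambda_j)."""
    return [[x * math.sqrt(lam) for x, lam in zip(row, eigvals)] for row in u]


def score_coefficients(u, eigvals):
    """U Lambda^(-1/2): weights that give the standardized scores Z_s."""
    return [[x / math.sqrt(lam) for x, lam in zip(row, eigvals)] for row in u]


def communality(loading_row, c):
    """Sum of squared loadings of one variable on the first c components."""
    return sum(f * f for f in loading_row[:c])


if __name__ == "__main__":
    F = loadings(U, EIGENVALUES)
    B = score_coefficients(U, EIGENVALUES)
    print("loadings:", [[round(f, 4) for f in row] for row in F])
    print("score coefficients:", [[round(b, 3) for b in row] for row in B])
    print("communality of x2 on two components:", round(communality(F[1], 2), 3))
```

The same scaling explains why the saved SPSS scores have unit variance: dividing each column of U by the square root of its eigenvalue divides the variance of the corresponding score by that eigenvalue.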
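[SPSS output screenshots: extracting two principal components; communalities; loading matrix]

PCA example 1: principal component analysis of the track results of 55 countries
Steps:
First run: compute the eigenvalues and decide how many principal components to extract.
Extract the principal components and compute the component scores.
Use the component scores to analyse the samples: overall evaluation; outlier analysis.

How many principal components should be kept? (Remarks 1)
Eigenvalue criterion: eigenvalue > 1.
Cumulative variance criterion: for example, cumulative variance contribution rate $\sum_{i=1}^{c}\lambda_i \big/ \sum_{i=1}^{p}\lambda_i > 95\%$.
Scree criterion: the scree plot shows the eigenvalues against their order; look for the elbow where the curve flattens out.

[Figure: scree plot of eigenvalue against component number, with the eigenvalue > 1 criterion and the scree criterion marked]
Keeping 2 eigenvalues gives a cumulative contribution rate of 88%; keeping 4 gives 97%.

Final result:
$$U\Lambda^{-1/2} = \left(\frac{\mu_1}{\sqrt{\lambda_1}},\ \frac{\mu_2}{\sqrt{\lambda_2}}\right)$$
$$z_1^s = 0.132x_1^s + 0.059x_2^s + 0.151x_3^s + 0.158x_4^s + 0.161x_5^s + 0.158x_6^s + 0.159x_7^s + 0.149x_8^s$$
$$z_2^s = 0.375x_1^s + 0.821x_2^s + 0.086x_3^s - 0.026x_4^s - 0.067x_5^s - 0.176x_6^s - 0.183x_7^s - 0.265x_8^s$$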
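The eigenvalue criterion and the cumulative variance criterion are simple enough to state as code. The eigenvalues of the eight track events are not given in the text, so this sketch falls back on the eigenvalues of the three-variable example purely as illustrative input; the function names and the 90% threshold in the usage lines are assumptions, not part of the source.

```python
def kaiser_rule(eigvals):
    """Number of components whose eigenvalue exceeds 1."""
    return sum(1 for lam in eigvals if lam > 1.0)


def cumulative_rates(eigvals):
    """Cumulative contribution rates, largest eigenvalue first."""
    ordered = sorted(eigvals, reverse=True)
    total = sum(ordered)
    rates, running = [], 0.0
    for lam in ordered:
        running += lam
        rates.append(running / total)
    return rates


def components_for_threshold(eigvals, threshold=0.95):
    """Smallest k whose cumulative contribution rate exceeds the threshold."""
    for k, rate in enumerate(cumulative_rates(eigvals), start=1):
        if rate > threshold:
            return k
    return len(eigvals)


if __name__ == "__main__":
    # Illustrative input: eigenvalues of the three-variable example.
    example = [2.063, 0.706, 0.231]
    print("eigenvalue > 1 keeps:", kaiser_rule(example))
    print("95% rule keeps:", components_for_threshold(example, 0.95))
    print("90% rule keeps:", components_for_threshold(example, 0.90))
```

The two rules need not agree, which is why the scree plot is usually inspected as well.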
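$$Z_s = Z\Lambda^{-1/2} = X_sU\Lambda^{-1/2}$$
Note: the component score coefficient matrix here is not U.

Interpreting the principal components: analysing the loading matrix
The first principal component is highly correlated with every variable except the 200 m. Judging by the coefficients, the variables carry similar weights, so it can be read as an overall score. The second principal component mainly reflects the 200 m (sprint) result.
[SPSS output: loadings and communalities $\sum_{j=1}^{c}f_{ij}^2$]
[Figure: the two principal components used to compare the athletes of each country]

Extraction method: correlation matrix vs variance-covariance matrix (Remarks 2)
Analysing the correlation matrix: PCA is carried out on the standardized variables.
Analysing the variance-covariance matrix: PCA is carried out on the unstandardized variables, $z_i = (X - \bar{X})\mu_i$.
The two differ, because PCA seeks to maximize variance.
Covariance matrix: variables whose variances are of a large order of magnitude swamp variables whose variances are of a small order of magnitude (even though the dispersion of the latter may not be small). The elephant and the rabbit.
Correlation matrix: overcomes the problems caused by different units of measurement and very different magnitudes, and reflects the variance pattern better. Drawback: it may inflate the influence of unimportant variables.
Which to use depends on the nature and purpose of the research question and on the economic interpretation.

Earlier example: principal components from the correlation matrix (R) and from the covariance matrix (S)

Descriptive Statistics
            Mean       Std. Deviation   Analysis N
RUN100M     10.4711    .3514            55
RUN200M     21.1038    1.3754           55
RUN400M     46.4387    1.4570           55
RUN800M     1.7933     6.368E-02        55
RUN1500M    3.6982     .1559            55
RUN5000M    13.8278    .7917            55
RUN10000    28.9967    1.8017           55
MARATHON    136.6240   9.2270           55

Component Score Coefficient Matrix
            Component
            1       2
run100m     .132    .375
run200m     .059    .821
run400m     .151    .086
run800m     .158    -.026
run1500m    .161    -.067
run5000m    .158    -.176
run10000    .159    -.183
marathon    .149    -.265

PCA case study: a comprehensive analysis of the development of the telecommunications industry across the regions of Guangdong Province in 2003
Choice of subjects: 2003 telecommunications development data for the 21 prefecture-level cities of Guangdong Province.
Seven main indicators:
X1: total telecommunications business volume (10,000 yuan)
X2: fixed-line telephones per 100 people (lines)
X3: mobile telephones per 100 people (units)
X4: international Internet users (10,000 households)
X5: Internet usage time (10,000 minutes)
X6: long-distance call volume (10,000 calls)
X7: long-distance call duration (10,000 minutes)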
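Component Matrix(a)
                                                             Component
                                                             1       2
X1 total telecommunications business volume (10,000 yuan)    .745    -.483
X2 fixed-line telephones per 100 people (lines)              .179    .916
X3 mobile telephones per 100 people (units)                  .574    .802
X4 international Internet users (10,000 households)          .962    -.150
X5 Internet usage time (10,000 minutes)                      .978    -.174
X6 long-distance call volume (10,000 calls)                  .971    .010
X7 long-distance call duration (10,000 minutes)              .967    .043
Extraction Method: Principal Component Analysis.
a. 2 components extracted.

Component Score Coefficient Matrix
                                                             Component
                                                             1       2
X1 total telecommunications business volume (10,000 yuan)    .159    -.273
X2 fixed-line telephones per 100 people (lines)              .038    .517
X3 mobile telephones per 100 people (units)                  .123    .453
X4 international Internet users (10,000 households)          .206    -.085
X5 Internet usage time (10,000 minutes)                      .209    -.098
X6 long-distance call volume (10,000 calls)                  .208    .006
X7 long-distance call duration (10,000 minutes)              .207    .024
Extraction Method: Principal Component Analysis. Component Scores.

First principal component: a total-volume factor, the scale of a city's telecommunications business.
Second principal component: a per-capita component, how widespread telephones are per head.

[Figure: scatter plot of REGR factor score 1 for analysis 1 against REGR factor score 2 for analysis 1, with each point labelled by city]

Evaluating the telecommunications development level of each city: ranking

IV. Applications of PCA
Principal components regression: when the explanatory variables of a regression are multicollinear, or when there are too many explanatory variables relative to the sample size
Overall evaluation
Dimension reduction or simplification of variables
Grouping structure
Screening data to find outliers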
1. Principal Components Regression
Standard regression problem with response y and regressors X1, X2, …, Xp.
X1, X2, …, Xp may be exactly collinear or nearly so.
Least squares estimates of the regression coefficients are not possible, or not reliable, in that case.
Principal components can be used to address the problem.
$$y_i = \beta_0 + \beta_1x_{1i} + \beta_2x_{2i} + \mu_i$$
$$y_i = \alpha_0 + \alpha_1z_{1i} + \nu_i, \qquad z_1 = \mu_1x_1 + \mu_2x_2$$
Original model: $Y = X\beta + \varepsilon$
Outer relation: $Z = XU$
$$Y = ZA + \varepsilon = XUA + \varepsilon$$
Regress Y on $XU$ to estimate A, then use $UA$ as the estimate of $\beta$.
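The two-step recipe (regress on the component scores, then map the slope back through U) can be sketched for the two-regressor case written out above. Everything numeric here is hypothetical: the six observations are made-up, nearly collinear data, and the function names are illustrative. The sketch standardizes the regressors, so it corresponds to PCA on the correlation matrix, for which the leading eigenvector of a 2x2 matrix has the closed form (1, sign(r))/sqrt(2).

```python
import math

# Hypothetical, nearly collinear data (not from the source): y is roughly x1 + x2.
X1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X2 = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
Y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Sample covariance with the n - 1 divisor."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def standardize(values):
    m, s = mean(values), math.sqrt(cov(values, values))
    return [(v - m) / s for v in values]


def pcr_two_regressors(x1, x2, y):
    """Regress y on the first principal component of (x1, x2), then map back.

    Returns (intercept, b1, b2) in the original units of x1 and x2.
    """
    x1s, x2s = standardize(x1), standardize(x2)
    r = cov(x1s, x2s)  # correlation between x1 and x2
    # For R = [[1, r], [r, 1]] the leading eigenvector is (1, sign(r)) / sqrt(2).
    u = (1.0 / math.sqrt(2.0), math.copysign(1.0, r) / math.sqrt(2.0))
    z1 = [u[0] * a + u[1] * b for a, b in zip(x1s, x2s)]  # first column of X_s U
    a1 = cov(z1, y) / cov(z1, z1)  # slope of y on z1: the estimate of A
    beta_s = (u[0] * a1, u[1] * a1)  # U A, coefficients on standardized regressors
    b1 = beta_s[0] / math.sqrt(cov(x1, x1))
    b2 = beta_s[1] / math.sqrt(cov(x2, x2))
    intercept = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return intercept, b1, b2


if __name__ == "__main__":
    c, b1, b2 = pcr_two_regressors(X1, X2, Y)
    print("intercept, b1, b2:", round(c, 3), round(b1, 3), round(b2, 3))
```

Because the discarded second component carries almost no variance when the regressors are nearly collinear, the two recovered coefficients are forced to be similar, which is exactly the stabilizing effect principal components regression is used for.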
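Eigenvalues and multicollinearity diagnostics

Collinearity Diagnostics(a)
                                                        Variance Proportions
Model   Dimension   Eigenvalue   Condition Index   (Constant)   Unemployment rate x2 (%)   Expected inflation rate x3 (%)
1       1           2.924        1.000             .00          .00                        .01
        2           6.058E-02    6.947             .26          .01                        .66
        3           1.591E-02    13.556            .73          .99                        .33
a. Dependent Variable: inflation rate y (%)

An exam question: in multiple regression analysis, the condition index (the ratio of the largest to the smallest eigenvalue of the covariance matrix of the explanatory variables) is commonly used to detect multicollinearity; the larger the condition index, the more severe the multicollinearity. Use principal component analysis to explain why this is reasonable.
$$\text{CI} = \frac{\lambda_{\max}}{\lambda_{\min}}$$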
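The Condition Index column of the collinearity table can be reproduced from the Eigenvalue column. The sketch below assumes the SPSS column is the square root of the largest eigenvalue divided by each eigenvalue, which is what the printed numbers are consistent with; the ratio form used in the exam question is simply the square of the last entry. Function names are hypothetical.

```python
import math

# Eigenvalues from the Collinearity Diagnostics table.
EIGENVALUES = [2.924, 6.058e-2, 1.591e-2]


def condition_indices(eigvals):
    """sqrt(lambda_max / lambda_i) for every eigenvalue."""
    largest = max(eigvals)
    return [math.sqrt(largest / lam) for lam in eigvals]


def eigenvalue_ratio(eigvals):
    """lambda_max / lambda_min, the ratio form used in the exam question."""
    return max(eigvals) / min(eigvals)


if __name__ == "__main__":
    print("condition indices:", [round(ci, 3) for ci in condition_indices(EIGENVALUES)])
    print("lambda_max / lambda_min:", round(eigenvalue_ratio(EIGENVALUES), 1))
```

Since the variance of a principal component equals its eigenvalue, a very small smallest eigenvalue means that some unit-length combination of the regressors hardly varies at all, that is, the regressors nearly satisfy a linear relation. Both forms of the index grow as that happens.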
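Example: hospital productivity
Dependent variable: manhrs
Explanatory variables: load, xray, beddays, stay, elgpop
$$\text{manhrs} = \beta_0 + \beta_1\,\text{load} + \beta_2\,\text{xray} + \beta_3\,\text{beddays} + \beta_4\,\text{stay} + \beta_5\,\text{elgpop} + \mu$$
[Figure: scatter plots]

Coefficients(a)
                 Unstandardized coefficients   Standardized                        Collinearity statistics
Model            B            Std. Error       Beta      t         Sig.    Tolerance   VIF
1 (Constant)     1964.358     1071.472                   1.833     .094
  load           -15.524      97.661           -.450     -.159     .877    .000        9598.207
  xray           .056         .021             .214      2.631     .023    .126        7.940
  beddays        1.579        3.092            1.394     .511      .620    .000        8933.554
  elgpop         -4.235       7.177            -.082     -.590     .567    .043        23.292
  stay           -394.635     209.666          -.112     -1.882    .087    .234        4.280
a. Dependent variable: manhrs

Principal components regression (on the covariance matrix): first find the principal components of the explanatory variables, then regress the dependent variable on the principal components.
(1) PCA on the covariance matrix

Total Variance Explained
                         Initial Eigenvalues(a)                        Extraction Sums of Squared Loadings
           Component     Total        % of Variance   Cumulative %     Total        % of Variance   Cumulative %
Raw        1             472781680    99.143          99.143           472781680    99.143          99.143
           2             4087792.8    .857            100.000          4087792.8    .857            100.000
           3             1238.090     .000            100.000
           4             3.626        7.603E-07       100.000
           5             .555         1.164E-07       100.000
Rescaled   1             472781680    99.143          99.143           3.724        74.475          74.475
           2             4087792.8    .857            100.000          .769         15.375          89.850
           3             1238.090     .000            100.000
           4             3.626        7.603E-07       100.000
           5             .555         1.164E-07       100.000

[Figure: scree plot of the eigenvalues]
How many principal components should be kept?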
147278168099.14399.1433.72474.47574.47524087792.8.857100.000.76915.37589.85031238.090.000100.00043.6267.603E-07100.0005.5551.164E-07100.0000.0E01.0E82.0E83.0E84.0E85.0E8Eigenvalue保留幾個(gè)主成分?Total
Variance
ExplainedComponentInitial
EigenvaluesaExtraction
Sums
of
Squared
LoadingsTotal%
of
VarianceCumulative%Total%
of
VarianceCumulative%Raw
147278168099.14399.14347278168099.143998.857100.0004087792.8.857100.00031238.0902.596E-04100.00043.6267.603E-07100.0005.5551.164E-07100.000成分矩陣a原始重新標(biāo)度成分成分1212load147.42464.767.915.402xray21274.007-417.8831.000-.020beddays4490.7481976.946.915.403elgpop98.81025.583.915.237stay.727.987.459.623提取方法:主成分分析法。a.已提取了2個(gè)成分。公因子方差原始重新標(biāo)度初始提取初始提取load25933.54225928.6251.0001.000xray4527579894527579891.0001.000beddays24075136240751361.0001.000elgpop11654.09910418.0191.000.894stay2.5091.5021.000.599提取方法:主成分分析。成分矩陣a原始重新標(biāo)度成分成分11load147.424.915xray21274.0071.000beddays4490.748.915elgpop98.810.915stay.727.459公因子方差原始重新標(biāo)度初始提取初始提取load25933.54221733.8351.000.838xray4527579894525833621.0001.000beddays24075136201668201.000.838elgpop11654.0999763.5121.000.838stay2.509.5291.000.211提取方法:主成分分析。成分矩陣a原始重新標(biāo)度成分成分1212load147.42464.767.915.402xray21274.007-417.8831.000-.020beddays4490.7481976.946.915.403elgpop98.81025.583.915.237stay.727.987.459.623提取方法:主成分分析法。m1
m2load0.0067800.032034xray0.978406-.206686beddays0.2065320.977801elgpop0.0045440.012654stay0.0000330.000488Descriptive
StatisticsMeanStd.DeviationAnalysis
NLOAD148.2764161.038917XRAY18163.23521278.110517BEDDAYS4480.61824906.642117ELGPOP106.3176107.954217STAY5.89351.584117求特征向量m1、m2:z1
=
.00678load
d
+.97841xrayd
+.20653bed
d
+.00454
popd
+.000033stay
dz2
=
.03203load
d
-.20669
xrayd
+.97780bed
d
+.01265
popd
+.000488
stay
d注:load
d
=
load
-
load\
U
=
FΛ-1
2
F
=
UΛ1
2
,計(jì)算主成分得分zi=faci-i×l1/2注意:SPSS給出的主成分得分是標(biāo)準(zhǔn)化分,需轉(zhuǎn)化成非標(biāo)準(zhǔn)化分.方法二:用SPSS的標(biāo)準(zhǔn)化主成分得分計(jì)算主成分得分模型摘要模型
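The near-unit weight of xray in the first eigenvector is the "elephant and rabbit" effect at work: on the covariance scale, almost all of the total variance belongs to one variable. The sketch below makes that visible using only the standard deviations from the Descriptive Statistics table; the function names are hypothetical.

```python
# Standard deviations from the Descriptive Statistics table of the hospital data.
STD_DEV = {
    "load": 161.0389,
    "xray": 21278.1105,
    "beddays": 4906.6421,
    "elgpop": 107.9542,
    "stay": 1.5841,
}


def variance_shares(std_devs):
    """Share of the total variance (trace of S) contributed by each variable."""
    variances = {name: sd * sd for name, sd in std_devs.items()}
    total = sum(variances.values())
    return {name: var / total for name, var in variances.items()}


def standardized_shares(std_devs):
    """After standardization every variance is 1, so all shares are equal."""
    return {name: 1.0 / len(std_devs) for name in std_devs}


if __name__ == "__main__":
    for name, share in variance_shares(STD_DEV).items():
        print(f"{name:8s} covariance scale: {share:.6f}   standardized: "
              f"{standardized_shares(STD_DEV)[name]:.2f}")
```

With about 95% of the trace coming from xray and about 5% from beddays, a variance-maximizing rotation on the covariance scale has little choice but to align its first two axes with these two variables; stay contributes practically nothing, whatever its relevance to the response.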
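Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .993(a)   .987       .985                685.16107
a. Predictors: (Constant), z2, z1.

ANOVA(b)
Model            Sum of Squares   df   Mean Square   F         Sig.
1 Regression     488140301        2    244070150     519.911   .000(a)
  Residual       6572239.7        14   469445.69
  Total          494712540        16
a. Predictors: (Constant), z2, z1.
b. Dependent variable: manhrs

(2) Principal components regression: regress manhrs on the principal components z1 and z2
$$\text{manhrs} = \beta_0 + \beta_1z_1 + \beta_2z_2 + e$$
$$\widehat{\text{manhrs}} = 4978.48 + 0.243z_1 + 0.789z_2$$
se: (166.176) (0.008) (0.085); F = 519.9; $R^2 = 0.987$; D.W. = 2.874
Substituting the expressions for z1 and z2 gives the equation in the original variables:
$$\widehat{\text{manhrs}} = c_0 + 0.027\,\text{load} + 0.075\,\text{xray} + 0.822\,\text{bed} + 0.011\,\text{pop} + 0.00039\,\text{stay}$$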
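The substitution step is the matrix product UA: each original variable's coefficient is its weight in z1 times the slope on z1 plus its weight in z2 times the slope on z2. The sketch below redoes that arithmetic from the rounded eigenvector weights and slopes printed in the text; the dictionary layout and function name are illustrative, and the intercept is left out because the source does not give it.

```python
# Eigenvector weights (columns mu1, mu2) of the covariance-matrix PCA.
U = {
    "load": (0.00678, 0.03203),
    "xray": (0.97841, -0.20669),
    "bed": (0.20653, 0.97780),
    "pop": (0.00454, 0.01265),
    "stay": (0.000033, 0.000488),
}
# Slopes of manhrs on z1 and z2 from the principal components regression.
A = (0.243, 0.789)


def back_transform(weights, slopes):
    """Coefficients on the original variables: beta = U A."""
    return {
        name: sum(w * a for w, a in zip(row, slopes))
        for name, row in weights.items()
    }


if __name__ == "__main__":
    for name, coef in back_transform(U, A).items():
        print(f"{name:5s} {coef:.5f}")
```

The results agree with the printed equation to the number of digits shown there.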
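Coefficients(a)
                 Unstandardized coefficients   Standardized                        Collinearity statistics
Model            B           Std. Error        Beta     t         Sig.    Tolerance   VIF
1 (Constant)     4978.480    166.176                    29.959    .000
  z1             .243        .008              .951     30.872    .000    1.000       1.000
  z2             .789        .085              .287     9.315     .000    1.000       1.000

Extracting principal components from the correlation matrix

Total Variance Explained
             Initial Eigenvalues                           Extraction Sums of Squared Loadings
Component    Total       % of Variance   Cumulative %      Total    % of Variance   Cumulative %
1            4.197       83.942          83.942            4.197    83.942          83.942
2            .667        13.350          97.292            .667     13.350          97.292
3            .095        1.893           99.185
4            .041        .814            99.999
5            5.397E-05   .001            100.000

Communalities
             Initial   Extraction
load         1.000     .988
xray         1.000     .937
beddays      1.000     .987
elgpop       1.000     .956
stay         1.000     .995
Extraction method: principal component analysis.

Component Matrix(a)
             Component
             1       2
load         .994    -.002
xray         .929    -.274
beddays      .994    -.001
elgpop       .944    -.254
stay         .684    .727

Component Score Coefficient Matrix
             Component
             1       2
load         .237    -.002
xray         .221    -.411
beddays      .237    -.001
elgpop       .225    -.380
stay         .163    1.088

$$\text{fac}_1 = 0.237\,\text{load}_s + 0.221\,\text{xray}_s + 0.237\,\text{bed}_s + 0.225\,\text{pop}_s + 0.163\,\text{stay}_s$$
$$\text{fac}_2 = -0.002\,\text{load}_s - 0.411\,\text{xray}_s - 0.001\,\text{bed}_s - 0.380\,\text{pop}_s + 1.088\,\text{stay}_s$$
The coefficient structure is completely different from that of the covariance-matrix principal components.

Principal components regression: first standardize manhrs, then regress the standardized $\text{manhrs}_s$ on the standardized principal components.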
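$$\widehat{\text{manhrs}}_s = 0.982\,\text{fac}_1 - 0.120\,\text{fac}_2 \tag{1}$$
$$\text{fac}_1 = 0.237\,\text{load}_s + 0.221\,\text{xray}_s + 0.237\,\text{bed}_s + 0.225\,\text{pop}_s + 0.163\,\text{stay}_s$$
$$\text{fac}_2 = -0.002\,\text{load}_s - 0.411\,\text{xray}_s - 0.001\,\text{bed}_s - 0.380\,\text{pop}_s + 1.088\,\text{stay}_s \tag{2}$$
Substituting (2) into (1):
$$\widehat{\text{manhrs}}_s = 0.233\,\text{load}_s + 0.266\,\text{xray}_s + 0.233\,\text{bed}_s + 0.267\,\text{pop}_s + 0.0295\,\text{stay}_s \tag{3}$$
Converting (3) back to unstandardized form, with $b_i = \dfrac{s(y)}{s(x_i)}\,\text{Beta}_i$:
$$\widehat{\text{manhrs}} = c + 8.045\,\text{load} + 0.070\,\text{xray} + 0.264\,\text{bed} + 13.75\,\text{pop} + 103.56\,\text{stay}$$
What does the coefficient 0.233 mean?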
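Both steps, substituting the component scores and then undoing the standardization, can be checked from numbers already in the text. In the sketch below the standard deviation of manhrs is not taken from a printed table; it is derived from the total sum of squares and degrees of freedom in the ANOVA table (494712540 on 16 df), which is an assumption worth flagging. The function names are hypothetical, and the results differ from the printed coefficients only in the last digits because the inputs are rounded.

```python
import math

# Component score coefficients (fac1, fac2) from the correlation-matrix PCA.
SCORE_COEF = {
    "load": (0.237, -0.002),
    "xray": (0.221, -0.411),
    "bed": (0.237, -0.001),
    "pop": (0.225, -0.380),
    "stay": (0.163, 1.088),
}
# Slopes of standardized manhrs on fac1 and fac2.
PCR_SLOPES = (0.982, -0.120)
# Standard deviations of the regressors from the descriptive statistics.
STD_X = {"load": 161.0389, "xray": 21278.1105, "bed": 4906.6421,
         "pop": 107.9542, "stay": 1.5841}
# Assumption: s(y) derived from the ANOVA total sum of squares and its df.
STD_Y = math.sqrt(494712540 / 16)


def standardized_betas(score_coef, slopes):
    """Coefficients on the standardized regressors: equation (3)."""
    return {
        name: sum(c * a for c, a in zip(row, slopes))
        for name, row in score_coef.items()
    }


def unstandardize(betas, std_x, std_y):
    """b_i = s(y) / s(x_i) * Beta_i for every regressor."""
    return {name: std_y / std_x[name] * beta for name, beta in betas.items()}


if __name__ == "__main__":
    betas = standardized_betas(SCORE_COEF, PCR_SLOPES)
    raw = unstandardize(betas, STD_X, STD_Y)
    for name in betas:
        print(f"{name:5s} Beta = {betas[name]:.4f}   b = {raw[name]:.3f}")
```

Read this way, the standardized coefficient 0.233 says that a one standard deviation increase in load goes with a predicted increase of 0.233 standard deviations in manhrs, the other regressors held fixed.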
Coefficients(a,b)
                                         Unstandardized coefficients   Standardized                       Collinearity statistics
Model                                    B        Std. Error           Beta     t         Sig.    Tolerance   VIF
1 REGR factor score 1 for analysis 1     .982     .038                 .982     25.825    .000    1.000       1.000
  REGR factor score 2 for analysis 1     -.120    .038                 -.120    -3.149    .007    1.000       1.000
a. Dependent variable: Zscore(manhrs)
b. Linear regression through the origin

The ordinary approach: dropping variables

Coefficients(a), all five regressors
Model            B            Std. Error   Beta      t         Sig.    Tolerance   VIF
1 (Constant)     1964.358     1071.472               1.833     .094
  load           -15.524      97.661       -.450     -.159     .877    .000        9598.207
  xray           .056         .021         .214      2.631     .023    .126        7.940
  beddays        1.579        3.092        1.394     .511      .620    .000        8933.554
  elgpop         -4.235       7.177        -.082     -.590     .567    .043        23.292
  stay           -394.635     209.666      -.112     -1.882    .087    .234        4.280

Coefficients(a), load dropped
Model            B            Std. Error   Beta      t         Sig.    Tolerance   VIF
1 (Constant)     2032.188     942.075                2.157     .052
  xray           .056         .020         .215      2.755     .017    .126        7.926
  beddays        1.088        .153         .960      7.095     .000    .042        23.927
  elgpop         -5.004       5.081        -.097     -.985     .344    .079        12.706
  stay           -410.083     178.078      -.117     -2.303    .040    .298        3.361
a. Dependent variable: manhrs

Coefficients(a), load and elgpop dropped
Model            B            Std. Error   Beta      t         Sig.    Tolerance   VIF
1 (Constant)     1523.389     786.898                1.936     .075
  xray           .053         .020         .203      2.637     .021    .129        7.737
  beddays        .978         .105         .863      9.305     .000    .089        11.269
  stay           -320.951     153.192      -.091     -2.095    .056    .401        2.493
a. Dependent variable: manhrs

Model comparison
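Covariance-matrix principal components regression:
$$\widehat{\text{manhrs}} = c + 0.027\,\text{load} + 0.075\,\text{xray} + 0.822\,\text{bed} + 0.011\,\text{pop} + 0.00039\,\text{stay}$$
Ordinary least squares, all five regressors:
$$\widehat{\text{manhrs}} = 1964 - 15.524\,\text{load} + 0.0559\,\text{xray} + 1.579\,\text{bed} - 4.235\,\text{pop} - 394.6\,\text{stay}$$
Ordinary least squares, three regressors:
$$\widehat{\text{manhrs}} = 1523 + 0.053\,\text{xray} + 0.978\,\text{bed} - 320.95\,\text{stay}$$
Correlation-matrix principal components regression:
$$\widehat{\text{manhrs}} = c + 8.045\,\text{load} + 0.070\,\text{xray} + 0.264\,\text{bed} + 13.75\,\text{pop} + 103.56\,\text{stay}$$

Descriptive Statistics
                     N    Minimum   Maximum    Mean         Std. Deviation
load                 17   15.57     510.22     148.2764     161.03895
xray                 17   2048.00   86533.00   18163.2353   21278.11055
beddays              17   472.92    15524.00   4480.6182    4906.64206
elgpop               17   9.50      371.60     106.3176     107.95415
stay                 17   3.90      10.78      5.8935       1.58407
Valid N (listwise)   17

[Figure: manhrs for the 17 observations plotted with the series reg5pre, reg3pre, prinvpre and prinrpre, apparently the fitted values of the four models above]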
Important comparison

Reflecting on PCR
All about summarizing the variability in the regressor space.
No attention paid to the intended use for the variates.

Reflecting on OLS
No attention paid to summarizing the variability in the regressor space.
All about attention paid to the intended use for the variates (maximizing correlation).

Upshot (summary versus purpose): clearly in need of a compromise.
2. Intelligent Index Formation
Building a composite evaluation index from survey or experimental data.
You may have answers to p questions, say X1, X2, …, Xp, and you may want to summarize these p responses with one number (an "index") that best captures the diversity in the responses.
For example, it is common to add the responses, or to average them, perhaps taking care with questions that are reverse coded.
It should already be clear that simple averaging may not be the best way to summarize the original p questions.

Composite evaluation example: a person's "need for cognition"
Need for cognition: the degree to which a person enjoys, or is absorbed in, thinking about problems and solving them.
How can one judge whether a person has a "need for cognition"? Usually by a test: the subject answers a set of questions, and a judgement is made from the answers.
Cacioppo,