Class 5: ANOVA (Analysis of Variance) and F-tests
I. What is ANOVA
What is ANOVA? ANOVA is the short name for the Analysis of Variance. The essence of ANOVA is to decompose the total variance of the dependent variable into two additive components, one for the structural part, and the other for the stochastic part, of a regression. Today we are going to examine the easiest case.
II. ANOVA: An Introduction
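Let the model be
$$y = X\beta + \varepsilon.$$
Assuming $x_i$ is a column vector (of length $p$) of independent variable values for the $i$th observation,
$$y_i = x_i'\beta + \varepsilon_i.$$
Then $x_i'b$ is the predicted value.
Sum of squares total:
$$SST = \sum_i (y_i - \bar{y})^2 = \sum_i \left[(y_i - x_i'b) + (x_i'b - \bar{y})\right]^2$$
$$= \sum_i (y_i - x_i'b)^2 + \sum_i (x_i'b - \bar{y})^2 + 2\sum_i (y_i - x_i'b)(x_i'b - \bar{y})$$
$$= \sum_i e_i^2 + \sum_i (x_i'b - \bar{y})^2$$
because $\sum_i (y_i - x_i'b)(x_i'b - \bar{y}) = \sum_i e_i (x_i'b - \bar{y}) = 0$.
This is always true by OLS (with an intercept in the model).
$$SST = SSE + SSR$$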
Important: the total variance of the dependent variable is decomposed into two additive parts: SSE, which is due to errors, and SSR, which is due to regression.
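The decomposition can be checked numerically. The following is a minimal sketch, assuming a one-predictor OLS fit with an intercept; the data values and variable names are made up for illustration and do not come from the lecture.

```python
# Minimal sketch: verify SST = SSE + SSR for an OLS fit with an intercept.
# The data below are invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.7]
n = len(y)

x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept for a one-predictor model.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, y_hat)]

sst = sum((yi - y_bar) ** 2 for yi in y)        # total
sse = sum(e ** 2 for e in resid)                # error
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)    # regression
# Cross-product term that OLS forces to zero.
cross = sum(e * (fi - y_bar) for e, fi in zip(resid, y_hat))

print(f"SST={sst:.4f}  SSE={sse:.4f}  SSR={ssr:.4f}  cross={cross:.2e}")
```

The cross term vanishes only up to floating-point error, which is why the check uses a tolerance rather than exact equality.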
Geometric interpretation: [blackboard]
Decomposition of Variance
If we treat X as a random variable, we can decompose total variance into the between-group portion and the within-group portion in any population:
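$$V(y_i) = V(x_i'\beta) + V(\varepsilon_i) \qquad (1)$$
Proof:
$$V(y_i) = V(x_i'\beta + \varepsilon_i) = V(x_i'\beta) + V(\varepsilon_i) + 2\,\mathrm{Cov}(x_i'\beta, \varepsilon_i)$$
$$= V(x_i'\beta) + V(\varepsilon_i)$$
(by the assumption that $\mathrm{Cov}(x_k, \varepsilon) = 0$ for all possible $k$.)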
The ANOVA table is to estimate the three quantities of equation (1) from the sample. As the sample size gets larger and larger, the ANOVA table will approach the equation closer and closer.
In a sample, decomposition of estimated variance is not strictly true. We thus need to separately decompose sums of squares and degrees of freedom. Is ANOVA a misnomer?
III. ANOVA in Matrix Form
I will try to give a simplified representation of ANOVA as follows:
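$$SST = \sum_i (y_i - \bar{y})^2 = \sum_i y_i^2 - 2\bar{y}\sum_i y_i + n\bar{y}^2 = y'y - n\bar{y}^2$$
(because $\sum_i y_i = n\bar{y}$)
(in your textbook, this has a monster look)
$$SSE = e'e$$
$$SSR = \sum_i (x_i'b - \bar{y})^2 = b'X'Xb - 2\bar{y}\sum_i x_i'b + n\bar{y}^2 = b'X'Xb - n\bar{y}^2$$
(because $\sum_i x_i'b = \sum_i y_i = n\bar{y}$, that is, $\sum_i e_i = 0$, as always)
(in your textbook, this has a monster look)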
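As a rough check on the matrix shortcuts, the sketch below assumes the usual forms SST = y'y - n*ybar^2, SSR = b'X'Xb - n*ybar^2 and SSE = e'e, written out by hand for X = [1, x] so that no matrix library is needed. The data and names are illustrative assumptions, not part of the lecture.

```python
# Sketch: matrix shortcuts for the sums of squares, one predictor plus intercept.
# Data are invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.7]
n = len(y)

# Elements of X'X and X'y when X = [1, x].
sx = sum(x)
sxx = sum(v * v for v in x)
sy = sum(y)
sxy = sum(a * b for a, b in zip(x, y))
yty = sum(v * v for v in y)          # y'y

# Solve the 2x2 normal equations (X'X) b = X'y by Cramer's rule.
det = n * sxx - sx * sx
b0 = (sy * sxx - sx * sxy) / det
b1 = (n * sxy - sx * sy) / det

y_bar = sy / n
# b'X'Xb expanded for the 2x2 case.
b_xtx_b = n * b0 * b0 + 2 * b0 * b1 * sx + b1 * b1 * sxx

sst = yty - n * y_bar ** 2
ssr = b_xtx_b - n * y_bar ** 2
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e * e for e in resid)      # e'e

print(f"SST={sst:.4f}  SSR={ssr:.4f}  SSE={sse:.4f}")
```

With more than one predictor the same expressions hold, but solving the normal equations by hand stops being practical and a linear algebra routine would normally be used instead.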
IV. ANOVA Table
| SOURCE | SS | DF | MS | F | with DF |
|---|---|---|---|---|---|
| Regression | SSR | DF(R) | MSR | MSR/MSE | DF(R), DF(E) |
| Error | SSE | DF(E) | MSE | | |
| Total | SST | DF(T) | | | |
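Let us use a real example. Assume that we have a regression estimated to be
y = -1.70 + 0.840x
ANOVA Table
| SOURCE | SS | DF | MS | F | with DF |
|---|---|---|---|---|---|
| Regression | 6.44 | 1 | 6.44 | 6.44/0.19 = 33.89 | 1, 18 |
| Error | 3.40 | 18 | 0.19 | | |
| Total | 9.84 | 19 | | | |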
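We know $\sum x_i = 100$, $\sum y_i = 50$, $\sum x_i^2 = 509.12$, $\sum y_i^2 = 134.84$, and $\sum x_i y_i = 257.66$. If we know that DF for SST = 19, what is n?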
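n = 20
$SSR = b'X'Xb - n\bar{y}^2 = 20 \times 1.7 \times 1.7 + 0.84 \times 0.84 \times 509.12 - 2 \times 1.7 \times 0.84 \times 100 - 125.0$
$= 6.44$
SSE = SST - SSR = 9.84 - 6.44 = 3.40
DF (degrees of freedom): demonstration. Note: discount the intercept when calculating the DF for SST, so DF(T) = n - 1 = 19, DF(R) = 1, and DF(E) = n - 2 = 18.
MS = SS/DF
p = 0.000 [ask students]. What does the p-value say?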
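The arithmetic behind the example table can be retraced in a few lines. This is a sketch that takes the coefficients, the sums and SST quoted in the example as given; the variable names are my own. The p-value is left out because it needs an F distribution routine that the standard library does not provide.

```python
# Sketch: rebuild the example ANOVA table from the quantities quoted in the text.
n = 20
b0, b1 = -1.70, 0.840            # estimated intercept and slope
sum_x, sum_x2 = 100.0, 509.12    # sum of x and sum of x squared
n_ybar_sq = 125.0                # n * ybar^2
sst = 9.84                       # total sum of squares, taken as given

# SSR = b'X'Xb - n * ybar^2, expanded for X = [1, x].
ssr = n * b0 ** 2 + b1 ** 2 * sum_x2 + 2 * b0 * b1 * sum_x - n_ybar_sq
sse = sst - ssr

df_r, df_e, df_t = 1, n - 2, n - 1
msr = ssr / df_r
mse = sse / df_e
f_stat = msr / mse

print(f"SSR={ssr:.2f}  SSE={sse:.2f}  MSR={msr:.2f}  MSE={mse:.2f}  F={f_stat:.1f}")
```

Carrying full precision gives an F of about 34.0; the 33.89 in the table comes from dividing by MSE already rounded to 0.19.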
V. F-Tests
F-tests are more general than t-tests; t-tests can be seen as a special case of F-tests. If you have difficulty with F-tests, please ask your GSIs to review F-tests in the lab. An F-test takes the form of a ratio of two MS's.
An F statistic has two degrees of freedom associated with it: the degrees of freedom in the numerator, and the degrees of freedom in the denominator.
An F statistic is usually larger than 1. The F statistic asks whether the variance explained under the alternative hypothesis could be due to chance. In other words, the null hypothesis is that the explained variance is due to chance, or that all the coefficients (other than the intercept) are zero.
The larger an F statistic, the stronger the evidence against the null hypothesis. There is a table in the back of your book from which you can find exact probability values.
In our example, the F is 34, which is highly significant.
VI. $R^2$
$R^2 = SSR/SST$
The proportion of variance explained by the model.
In our example, $R^2 = 6.44/9.84 = 65.4\%$.
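VII. What Happens If We Add More Independent Variables
1. SST stays the same.
2. SSR always increases (strictly speaking, it never decreases).
3. SSE always decreases (strictly speaking, it never increases).
4. $R^2$ always increases (strictly speaking, it never decreases).
5. MSR usually increases.
6. MSE usually decreases.
7. The F statistic usually increases.
Exceptions to 5 and 7: irrelevant variables may not explain the variance but take up degrees of freedom. We really need to look at the results.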
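The list above can be explored with a small experiment. The sketch below fits a model with one predictor and then adds a second, arbitrary predictor; the data, the helper names `fit_sse` and `anova_row`, and the choice of a hand-written normal-equations solver are all illustrative assumptions. Only the guaranteed facts (same SST, SSE not larger, $R^2$ not smaller) are certain in advance; MSR, MSE and F have to be read off the output.

```python
# Sketch: compare ANOVA quantities before and after adding a predictor.
def fit_sse(columns, y):
    """Return (SSE, number of parameters) for y on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[r] * row[c] for row in X) for c in range(p)]
         + [sum(row[r] * yi for row, yi in zip(X, y))]
         for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * ak for a, ak in zip(A[r], A[k])]
    b = [A[r][p] / A[r][r] for r in range(p)]
    fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)), p


def anova_row(columns, y):
    """Collect SST, SSR, SSE, R2, MSR, MSE and F for one model."""
    n = len(y)
    y_bar = sum(y) / n
    sst = sum((v - y_bar) ** 2 for v in y)
    sse, p = fit_sse(columns, y)
    ssr = sst - sse
    msr, mse = ssr / (p - 1), sse / (n - p)
    return {"SST": sst, "SSR": ssr, "SSE": sse, "R2": ssr / sst,
            "MSR": msr, "MSE": mse, "F": msr / mse}


# Invented data: x1 drives y, z is an arbitrary extra column.
x1 = [float(i) for i in range(1, 13)]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 7.2, 7.8, 9.1, 9.9, 11.2, 11.8, 13.1]
z = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]

small = anova_row([x1], y)
big = anova_row([x1, z], y)
for key in small:
    print(f"{key:>4}: {small[key]:12.4f} -> {big[key]:12.4f}")
```

Solving the normal equations directly is fine for a toy example like this; for real data a numerically safer least-squares routine would be the usual choice.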
VIII. Important: General Ways of Hypothesis Testing with F-Statistics
All tests in linear regression can be performed with F-test statistics. The trick is to run "nested models."
Two models are nested if the independent variables in one model are a subset, or linear combinations of a subset, of the independent variables in the other model.
That is to say, if model A has independent variables $(1, x_1, x_2)$, and model B has independent variables $(1, x_1, x_2, x_3)$, A and B are nested. A is called the restricted model; B is called the less restricted or unrestricted model. We call A restricted because A implies that $\beta_3 = 0$. This is a restriction.
Another example: C has independent variables $(1, x_1, x_2 + x_3)$, D has $(1, x_2 + x_3)$.
C and A are not nested.
C and B are nested. One restriction in C: $\beta_2 = \beta_3$.
C and D are nested. One restriction in D: $\beta_1 = 0$.
D and A are not nested.
D and B are nested: two restrictions in D: $\beta_2 = \beta_3$; $\beta_1 = 0$.
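We can always test hypotheses implied in the restricted models. Steps: run two regressions for each hypothesis, one for the restricted model and one for the unrestricted model. The SST should be the same across the two models. What is different is SSE and SSR. That is, what is different is $R^2$. Let
$SSE_r$ and $SSR_r$ be the sums of squares from the restricted model; $SSE_u$ and $SSR_u$ those from the unrestricted model.
Use the following formulas:
$$F = \frac{(SSE_r - SSE_u)/[df(SSE_r) - df(SSE_u)]}{SSE_u/df(SSE_u)}$$
or
$$F = \frac{(SSR_u - SSR_r)/[df(SSR_u) - df(SSR_r)]}{SSE_u/df(SSE_u)}$$
(proof: use SST = SSE + SSR)
Note, $df(SSE_r) - df(SSE_u) = df(SSR_u) - df(SSR_r) = \Delta df$,
where $\Delta df$ is the number of constraints (not number of parameters) implied by the restricted model,
or
$$F = \frac{(R_u^2 - R_r^2)/\Delta df}{(1 - R_u^2)/df(SSE_u)}$$
Note that
$$F_{1,\,df} = t_{df}^2$$
That is, for 1 df tests, you can either do an F-test or a t-test. They yield the same result. Another way to look at it is that the t-test is a special case of the F-test, with the numerator DF being 1.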
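The two ways of writing the nested-model F statistic can be compared directly. The sketch below assumes the SSE form and the $R^2$ form of the formula and applies them to the earlier example (SST = 9.84, SSE = 3.40 with 18 df), treating the constant-only model as the restricted model so that its SSE equals SST with 19 df. The function names are hypothetical.

```python
# Sketch: nested-model F statistic computed from SSE and from R-squared.
def f_from_sse(sse_r, df_r, sse_u, df_u):
    """F from restricted/unrestricted SSE and their error degrees of freedom."""
    return ((sse_r - sse_u) / (df_r - df_u)) / (sse_u / df_u)


def f_from_r2(r2_r, r2_u, n_constraints, df_u):
    """Same F written in terms of R-squared; requires a common SST."""
    return ((r2_u - r2_r) / n_constraints) / ((1.0 - r2_u) / df_u)


sst = 9.84
sse_u, df_u = 3.40, 18      # unrestricted: y on (1, x)
sse_r, df_r = sst, 19       # restricted: constant only, so SSE = SST

r2_u = 1.0 - sse_u / sst
r2_r = 0.0                  # a constant-only model explains nothing

f1 = f_from_sse(sse_r, df_r, sse_u, df_u)
f2 = f_from_r2(r2_r, r2_u, df_r - df_u, df_u)
print(f"F (SSE form) = {f1:.2f}, F (R2 form) = {f2:.2f}")
```

Both forms agree because SST is shared by the two models, which is why the $R^2$ version should not be used when the dependent variable differs across models.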
IX. Assumptions of F-tests
What assumptions do we need to make an ANOVA table work?
Not much of an assumption. All we need is the assumption that (X'X) is not singular, so that the least squares estimate b exists.
The assumption of $\mathrm{Cov}(x_k, \varepsilon) = 0$ is needed if you want the ANOVA table to be an unbiased estimate of the true ANOVA (equation 1) in the population. Reason: we want b to be an unbiased estimator of $\beta$, and the covariance between b and $\varepsilon$ to disappear.
For reasons I discussed earlier, the assumptions of homoscedasticity and non-serial correlation are necessary for the estimation of $V(\varepsilon_i)$.
The normality assumption that $\varepsilon_i$ is distributed in a normal distribution is needed for small samples.
X. The Concept of Increment
Every time you put one more independent variable into your model, you get an increase in $R^2$. We sometimes call the increase "incremental $R^2$." What it means is that more variance is explained, or SSR is increased and SSE is reduced. What you should understand is that the incremental $R^2$ attributed to a variable is usually smaller than the $R^2$ when other variables are absent.
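XI. Consequences of Omitting Relevant Independent Variables
Say the true model is the following:
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \varepsilon_i.$$
But for some reason we only collect or consider data on $y$, $x_1$, and $x_2$. Therefore, we omit $x_3$ in the regression. That is, we omit $\beta_3 x_{3i}$ in our model. We briefly discussed this problem before. The short story is that we are likely to have a bias due to the omission of a relevant variable in the model. This is so even though our primary interest is to estimate the effect of $x_1$ or $x_2$ on $y$.
Why? We will have a formal presentation of this problem.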
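Ahead of the formal presentation, a small numerical illustration may help. The sketch below is a simplified version with only two regressors and no error term, both of which are my assumptions: y is generated from an included variable and an omitted one, and the slope from the short regression is compared with the true coefficient. All names and numbers are hypothetical.

```python
# Sketch: bias in a slope when a correlated regressor is left out.
def slope(x, y):
    """OLS slope from a simple regression of y on x with an intercept."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


# Invented data: the omitted variable is correlated with the included one.
x_incl = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x_omit = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]

beta0, beta_incl, beta_omit = 1.0, 2.0, 3.0
# True model with no error term, so any discrepancy is pure omitted-variable bias.
y = [beta0 + beta_incl * a + beta_omit * b for a, b in zip(x_incl, x_omit)]

short_slope = slope(x_incl, y)      # regression that leaves x_omit out
delta = slope(x_incl, x_omit)       # how x_omit moves with x_incl

print(f"true coefficient  = {beta_incl}")
print(f"short regression  = {short_slope:.4f}")
print(f"beta + beta_omit*delta = {beta_incl + beta_omit * delta:.4f}")
```

In this setup the short-regression slope equals the true coefficient plus the omitted coefficient times the slope of the omitted variable on the included one, so the bias disappears only if the two regressors are uncorrelated or the omitted coefficient is zero.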
XII. Measures of Goodness-of-Fit
There are different ways to assess the goodness-of-fit of a model.
A. $R^2$
$R^2$ is a heuristic measure for the overall goodness-of-fit. It does not have an associated test statistic.
$R^2$ measures the proportion of the variance in the dependent variable that is "explained" by the model:
$R^2 = SSR/SST = 1 - SSE/SST$
B. Model F-test
The model F-test tests the joint hypothesis that all the model coefficients except for the constant term are zero.
Degrees of freedom associated with the model F-test:
Numerator: p - 1
Denominator: n - p.
C. t-tests for individual parameters
A t-test for an individual parameter tests the hypothesis that a particular coefficient is equal to a particular number (commonly zero).
$t_k = (b_k - \beta_{k0})/SE_k$, where $SE_k$ is the square root of the (k, k) element of $MSE \cdot (X'X)^{-1}$, with degrees of freedom = n - p.
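For the earlier one-predictor example, the t statistic for the slope can be worked out by hand. The sketch below assumes the standard error is the square root of the slope's diagonal element of MSE times the inverse of X'X, and reuses n = 20, the sums of x quoted in the example, the slope 0.840 and SSE = 3.40 with 18 df. Variable names are my own.

```python
import math

# Sketch: t-test for the slope in the one-predictor example.
n = 20
sum_x, sum_x2 = 100.0, 509.12
b1 = 0.840
mse = 3.40 / 18                      # SSE / DF(E)

# For X = [1, x], X'X = [[n, sum_x], [sum_x, sum_x2]].
# The slope's diagonal element of the inverse is n / det(X'X).
det = n * sum_x2 - sum_x ** 2
inv_slope_diag = n / det

se_b1 = math.sqrt(mse * inv_slope_diag)
t_stat = (b1 - 0.0) / se_b1          # null hypothesis: slope equals zero

print(f"SE(b1) = {se_b1:.4f}, t = {t_stat:.2f}, t^2 = {t_stat ** 2:.2f}")
```

The squared t comes out close to the model F of about 34; the small gap reflects rounding in the quoted coefficients and sums of squares.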
D. Incremental $R^2$
Relative to a restricted model, the gain in $R^2$ for the unrestricted model:
$\Delta R^2 = R_u^2 - R_r^2$
E. F-tests for Nested Models
It is the most general form of F-tests and t-tests.
$$F = \frac{(SSE_r - SSE_u)/[df(SSE_r) - df(SSE_u)]}{SSE_u/df(SSE_u)}$$
It is equivalent to a t-test if the unrestricted and restricted models differ only by one single parameter.
It is equal to the model F-test if we set the restricted model to the constant-only model.
[Ask students] What are SST, SSE, and SSR, and their associated degrees of freedom, for the constant-only model?
Numerical Example
A sociological study is interested in understanding the social determinants of mathematical achievement among high school students. You are now asked to answer a series of questions. The data are real but have been tailored for educational purposes. The total number of observations is 400. The variables are defined as:
y: math score
x1: father's education
x2: mother's education
x3: family's socioeconomic status
x4: number of siblings
x5: class rank
x6: parents' total education (note: x6 = x1 + x2)
For the following regression models, we know:
Table 1
| Model | SST | SSR | SSE | DF | $R^2$ |
|---|---|---|---|---|---|
| (1) y on (1 x1 x2 x3 x4) | 34863 | 4201 | | | |
| (2) y on (1 x6 x3 x4) | 34863 | | | 396 | .1065 |
| (3) y on (1 x6 x3 x4 x5) | 34863 | 10426 | 24437 | 395 | .2991 |
| (4) x5 on (1 x6 x3 x4) | | | 269753 | 396 | .0210 |
1. Please fill the missing cells in Table 1.
2. Test the hypothesis that the effects of father's education (x1) and mother's education (x2) on math score are the same after controlling for x3 and x4.
3. Test the hypothesis that x6, x3 and x4 in Model (2) all have a zero effect on y.
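One possible way to work through the three questions is sketched below. It assumes the usual identities SST = SSR + SSE, $R^2$ = SSR/SST and DF = n - p, and the nested-model F formula with Model (2) as the restricted version of Model (1). Variable names are mine, and because the $R^2$ values in the table are rounded, the results are approximate.

```python
# Sketch: fill Table 1 and compute the F statistics for questions 2 and 3.
n = 400
sst = 34863.0

# Model (1): y on (1, x1, x2, x3, x4); SSR is given, 5 parameters.
ssr1 = 4201.0
sse1 = sst - ssr1
df1 = n - 5
r2_1 = ssr1 / sst

# Model (2): y on (1, x6, x3, x4); R2 and DF are given.
r2_2, df2 = 0.1065, 396
ssr2 = r2_2 * sst
sse2 = sst - ssr2

# Model (4): x5 on (1, x6, x3, x4); SSE and R2 are given.
sse4, r2_4 = 269753.0, 0.0210
sst4 = sse4 / (1.0 - r2_4)
ssr4 = sst4 - sse4

# Question 2: equal effects of x1 and x2. Since x6 = x1 + x2,
# Model (2) is Model (1) with that single restriction imposed.
f_q2 = ((sse2 - sse1) / (df2 - df1)) / (sse1 / df1)

# Question 3: model F-test for Model (2), 3 slopes against the constant-only model.
f_q3 = (ssr2 / 3) / (sse2 / df2)

print(f"Model 1: SSE={sse1:.0f}, DF={df1}, R2={r2_1:.4f}")
print(f"Model 2: SSR={ssr2:.0f}, SSE={sse2:.0f}")
print(f"Model 4: SST={sst4:.0f}, SSR={ssr4:.0f}")
print(f"Q2: F(1, {df1}) = {f_q2:.2f}")
print(f"Q3: F(3, {df2}) = {f_q3:.2f}")
```

The F values still have to be compared with the critical values from the F table for the stated degrees of freedom before drawing a conclusion.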