Quantitative Data Analysis:

Statistics

Sherlock Holmes
"... while man is an insoluble puzzle, in the aggregate he becomes a mathematical certainty. You can, for example, never foretell what any one man will do, but you can say with precision what an average number will be up to. Individuals vary, but percentages remain constant. So says the statistician."

Overview
General Statistics
The Normal Distribution
Z-Tests
Confidence Intervals
T-Tests

General Statistics

~ THE GOLDEN RULE ~
Statistics NEVER replace the judgment of the expert.

Approach to Statistical Research
Formulate a hypothesis
State predictions of the hypothesis
Perform experiments or observations
Interpret experiments or observations
Evaluate results with respect to the hypothesis
Refine the hypothesis and start again
(Basically the same as all other research)

Hypothesis Testing
H0:

Null Hypothesis, status quo
HA: Alternative Hypothesis, research question
So, either: "The data does not support H0" or "We fail to reject H0"

Types of Data
Continuous: height, age, time
Discrete: # of days worked this week, # of leaves on a tree
Ordinal: {Good, O.K., Bad}
Nominal: {Yes/No}, {Teacher/Chemist/Haberdasher}

Picturing The Data

Pie Charts
Nominal/Ordinal
Only suitable for data that adds up to 1
Hard to compare values in the chart

Bar Charts
Nominal/Ordinal
Easier to compare values than in a pie chart
Suitable for a wider range of data

Dot Plots
Nominal/Ordinal

Represents all the data
Difficult to read

Box Plots
Nominal/Ordinal
1 IQR, 3 IQR
Outliers

Scatter Plots
Excellent for examining association between two variables

Histograms
Continuous data
Divide data into ranges

Time-Series Plots
Time-related data, e.g. stock prices

Question 1
In a telephone survey of 68 households, when asked whether they have pets, the responses were:
16: No pets
28: Dogs
32: Cats
Draw the appropriate graphic to illustrate the results!

Question 1 - Solution
Total number surveyed = 68
Number with no pets = 16
=> Total with pets = (68 - 16) = 52
But total 28 dogs + 32 cats = 60
=> So some people have both cats and dogs
How many? It must be (60 - 52) = 8 people

No pets = 16
Dogs only = 20
Cats only = 24
Both = 8
-------------------------
Total = 68

Graphic: pie chart or bar chart

The Literary Digest Poll
1936 US Presidential Election
Alf Landon (R) vs. Franklin D. Roosevelt (D)

The Literary Digest Poll
Literary Digest had been conducting successful presidential election polls since 1916.
They had correctly predicted the outcomes of the 1916, 1920, 1924, 1928, and 1932 elections by conducting polls.
These polls were a lucrative venture for the magazine: readers liked them; newspapers played them up; and each "ballot" included a subscription blank.

The Literary Digest Poll
They sent out 10 million ballots to two groups of people:
prospective subscribers, "who were chiefly upper- and middle-income people"
a list designed to "correct for bias" from the first list, consisting of names selected from telephone books and motor vehicle registries

The Literary Digest Poll
Response rate: approximately 25%, or 2,376,523 responses
Result: Landon in a landslide (predicted 57% of the vote, Roosevelt predicted 40%)
Election result: Roosevelt received approximately 60% of the vote

The Literary Digest Poll
POSSIBLE CAUSES OF ERROR
Selection Bias: By taking names and addresses from telephone directories, the survey systematically excluded poor voters.
Republicans were markedly overrepresented in 1936; Democrats did not have as many phones, were not as likely to drive cars, and did not read the Literary Digest.
The "sampling frame" is the actual population of individuals from which a sample is drawn. Selection bias results when the sampling frame is not representative of the population of interest.

The Literary Digest Poll
POSSIBLE CAUSES OF ERROR
Non-response Bias: Because only about 25% of the 10 million people returned surveys, non-respondents may have had different preferences from respondents.
Indeed, respondents favored Landon.
Higher response rates reduce the risk of a biased sample.

Terminology
Population: a set of entities concerning which statistical inferences are to be drawn.
Sample: a number of independent observations from the same probability distribution.
Parameter: a value that indexes a family of probability distributions. The distribution of a random variable is treated as belonging to such a family, whose members are distinguished from each other by the values of a finite number of parameters.
Bias: a factor that causes some members of the population to be less represented in a statistical sample than others.

Outliers (and their treatment)
An "outlier" is an observation that does not fit the pattern in the rest of the data.
Check the data.
Check with the measurer.
If there is reason to believe it is NOT real, change it if possible; otherwise leave it out (but note it).
If there is reason to believe it is real, leave it in and note it.

The Mean (Arithmetic)
The mean is defined as the sum of all the elements, divided by the number of elements.
The statistical mean of a set of observations is the average of the measurements in a set of data:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

The Variance
But there can be a lot of variance in individual elements, e.g. teacher salaries:
Average = €22,000
Lowest = €12,000
Difference = 12,000 - 22,000 = -10,000

The Variance
The deviations (sample - average) always sum to 0, so they cannot measure spread on their own; thus we need to define the variance.
The variance of a set of data is a cumulative measure of the squares of the differences of all the data values from the mean, divided by the sample size minus one:
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

Standard Deviation
The standard deviation of a set of data is the positive square root of the variance:
$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$

Question 2
Find the mean and variance of the following sample values: 36, 41, 43, 44, 46

Question 2
Mean: (36 + 41 + 43 + 44 + 46) / 5 = 42
Variance

Difference        Square
36 - 42 = -6      36
41 - 42 = -1      1
43 - 42 = 1       1
44 - 42 = 2       4
46 - 42 = 4       16
----------------------------------------
                  58
58 / (5 - 1) = 58 / 4 = 14.5

The Normal Distribution

Density Curves: Properties

The Normal Distribution
The graph has a single peak at the center; this peak occurs at the mean.
The graph is symmetrical about the mean.
The graph never touches the horizontal axis.
The area under the graph is equal to 1.

Characterization
A normal distribution is bell-shaped and symmetric.
The distribution is determined by the mean mu, μ, and the standard deviation sigma, σ.
The mean μ controls the center and σ controls the spread.

The Normal Distribution
If a variable is normally distributed, then:
within one standard deviation of the mean there will be approximately 68% of the data
within two standard deviations of the mean there will be approximately 95% of the data
within three standard deviations of the mean there will be approximately 99.7% of the data

Why?
One reason the normal distribution is important is that many psychological and organisational variables are distributed approximately normally. Measures of reading ability, introversion, job satisfaction, and memory are among the many psychological variables approximately normally distributed. Although the distributions are only approximately normal, they are usually quite close.

Why?
A second reason the normal distribution is so important is that it is easy for mathematical statisticians to work with. This means that many kinds of statistical tests can be derived for normal distributions. Almost all statistical tests discussed in this text assume normal distributions. Fortunately, these tests work very well even if the distribution is only approximately normal. Some tests work well even with very wide deviations from normality.

One Tail / Two Tail
Imagine we undertook an experiment where we measured staff productivity before and after we introduced a computer system to help record solutions to common issues of work.
Average productivity before = 6.4
Average productivity after = 9.2

One Tail / Two Tail
[Figure: productivity scale from 0 to 10 with Before = 6.4 and After = 9.2 marked, and σ intervals marked along the axis]
Is this a significant difference, or is it more likely a sampling variation?
How many standard deviations from the mean is this, and is it statistically significant?

One Tail / Two Tail
One-Tailed
H0: μ1 >= μ2
HA: μ1 < μ2
Two-Tailed
H0: μ1 = μ2
HA: μ1 ≠ μ2

STANDARD NORMAL DISTRIBUTION
A normal distribution is written N(mean, (std dev)^2).
The standard normal distribution is N(0, 1^2).

STANDARD NORMAL DISTRIBUTION
Using the following formula:
$Z = \frac{X - \mu}{\sigma}$
will convert a normally distributed variable into a standard normal variable, so that standard normal tables can be used.

Exercise
If the average IQ in a given population is 100, and the standard deviation is 15, what percentage of the population has an IQ of 145 or higher?

Answer
P(X >= 145)
= P(Z >= (145 - 100) / 15)
= P(Z >= 3)
From tables: 99.87% are less than 3
=> 0.13% of the population

Trends in Statistical Tests used in Research Papers
Historically -> Currently
Approach       Method                    Results in
Testing        Hypothesis Tests          Accept/Reject
Testing        Quoting P-Values          p-Value
Estimation     Confidence Intervals      Approx. Mean

Confidence Intervals

A confidence interval is used to express the uncertainty in a quantity being estimated. There is uncertainty because inferences are based on a random sample of finite size from a population or process of interest. To judge the statistical procedure we can ask what would happen if we were to repeat the same study, over and over, getting different data (and thus different confidence intervals) each time.

Confidence Intervals

If we know the true population mean and sample n individuals, we know that, if the data is normally distributed, the mean of these n samples has a 95% chance of falling into the interval
$\mu \pm 1.96 \times SE$

Confidence Intervals

where the standard error for a 95% CI may be calculated as follows:
$SE = \sigma / \sqrt{n}$
(for a sample proportion p, $SE = \sqrt{p(1-p)/n}$, as used in the examples)

Example 1
Does FF-PD-G have more of the popular vote than FG-L?
In a random sample of 721 respondents:
382 FF-PD-G
339 FG-L
Can we conclude that FF-PD-G has more than 50% of the popular vote?

Example 1 - Solution
Sample proportion = p = 382/721 = 0.53
Sample size = n = 721
Standard Error = sqrt(p(1 - p)/n) = 0.02
95% Confidence Interval
0.53 +/- 1.96(0.02)
0.53 +/- 0.04
[0.49, 0.57]
Thus, we cannot conclude that FF-PD-G had more of the popular vote, since this interval spans 50%.
So, we say: "the data are consistent with the hypothesis that there is no difference"
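The calculation in Example 1 can be sketched in a few lines of Python. This is only an illustration, assuming the normal-approximation interval used on the slide; the function name `proportion_ci` is hypothetical and not part of the lecture. Because the sketch does not round the standard error to 0.02, its interval is slightly narrower than the slide's, but it still spans 50%.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a sample proportion.

    Hypothetical helper for illustration: p +/- z * sqrt(p(1-p)/n).
    """
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)          # standard error of the proportion
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return p - z * se, p + z * se


# Example 1: 382 of 721 respondents chose FF-PD-G
low, high = proportion_ci(382, 721)
print(f"95% CI: [{low:.3f}, {high:.3f}]")
print("Interval spans 50%:", low <= 0.5 <= high)
```

Checking whether 0.5 lies inside the interval is the same decision rule the slide applies by eye.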

Example 2
Did Obama have more of the popular vote than McCain?
In a random sample of 1000 respondents:
532 Obama
468 McCain
Can we conclude that Obama had more than 50% of the popular vote?

Example 2 - 95% CI
Sample proportion = p = 532/1000 = 0.532
Sample size = n = 1000
Standard Error = sqrt(p(1 - p)/n) = 0.016
95% Confidence Interval
0.532 +/- 1.96(0.016)
0.532 +/- 0.03136
[0.5006, 0.56336]
Thus, we can conclude that Obama had more of the popular vote, since this interval does not span 50%.
So, we say: "the data are consistent with the hypothesis that there is a difference at the 95% confidence level"

Example 2 - 99% CI
Sample proportion = p = 532/1000 = 0.532
Sample size = n = 1000
Standard Error = sqrt(p(1 - p)/n) = 0.016
99% Confidence Interval
0.532 +/- 2.58(0.016)
0.532 +/- 0.041
[0.491, 0.573]
Thus, we cannot conclude that Obama had more of the popular vote, since this interval does span 50%.
So, we say: "the data are consistent with the hypothesis that there is no difference at the 99% confidence level"
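The same question can also be framed as a two-tailed z-test of H0: p = 0.5, which gives a p-value instead of an interval. The sketch below is an assumption-laden illustration rather than part of the lecture: the function name `one_proportion_z_test` is hypothetical, and it uses the standard error under H0, sqrt(p0(1 - p0)/n), which differs slightly from the sample-based standard error used for the intervals.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0=0.5):
    """Two-tailed z-test for a single proportion (hypothetical helper).

    H0: p = p0, HA: p != p0. Returns the z statistic and the p-value.
    """
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)       # standard error assuming H0 is true
    z = (p_hat - p0) / se0
    # Probability of a result at least this extreme in either tail
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Example 2: 532 of 1000 respondents chose Obama
z, p_value = one_proportion_z_test(532, 1000)
print(f"z = {z:.2f}, two-tailed p-value = {p_value:.3f}")
```

A p-value between 0.01 and 0.05 would agree with a 95% interval that excludes 50% and a 99% interval that includes it.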

Example 2 - 99.99% CI
Sample proportion = p = 532/1000 = 0.532
Sample size = n = 1000
Standard Error = sqrt(p(1 - p)/n) = 0.016
99.99% Confidence Interval
0.532 +/- 3.89(0.016)
0.532 +/- 0.06
[0.472, 0.592]
Thus, we cannot conclude that Obama had more of the popular vote, since this interval does span 50%.
So, we say: "the data are consistent with the hypothesis that there is no difference at the 99.99% confidence level"
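The three versions of Example 2 differ only in the multiplier applied to the standard error. As a rough sketch, that multiplier can be computed from the standard normal distribution for each confidence level rather than read from tables. The variable names are illustrative, and the standard error is not rounded here, so the endpoints differ slightly from the slide values.

```python
import math
from statistics import NormalDist

successes, n = 532, 1000
p = successes / n
se = math.sqrt(p * (1 - p) / n)              # standard error of the proportion

results = {}
for confidence in (0.95, 0.99, 0.9999):
    # Two-sided critical value for this confidence level
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    low, high = p - z * se, p + z * se
    results[confidence] = (z, low, high)
    print(f"{confidence:.2%}: z = {z:.2f}, CI = [{low:.3f}, {high:.3f}], "
          f"spans 50%: {low <= 0.5 <= high}")
```

The higher the confidence level, the wider the interval, which is why the same sample supports the conclusion at 95% but not at 99% or 99.99%.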

T-Tests

One Tail / Two Tail
T-test
Z-test

T-Tests
A powerful parametric test for calculating the significance of a small sample mean
Necessary for small samples because their distributions are not normal
One first has to calculate the "degrees of freedom"

T-Tests
The t-test is often called the Student's t-test. It was created by a chief brewer named William S. Gosset, who worked for the Guinness Brewery. He discovered this statistic as part of his work in the brewery to compare the different brewing processes for changing raw materials into beer. Guinness did not allow its employees to publish results, but the management decided to allow Gosset to publish it under a pseudonym: Student. Hence we have th
