Quantitative Data Analysis: Statistics

Sherlock Holmes
"... while man is an insoluble puzzle, in the aggregate he becomes a mathematical certainty. You can, for example, never foretell what any one man will do, but you can say with precision what an average number will be up to. Individuals vary, but percentages remain constant. So says the statistician"

Overview
- General Statistics
- The Normal Distribution
- Z-Tests
- Confidence Intervals
- T-Tests

General Statistics
~ THE GOLDEN RULE ~
Statistics NEVER replace the judgment of the expert.

Approach to Statistical Research
- Formulate a hypothesis
- State predictions of the hypothesis
- Perform experiments or observations
- Interpret experiments or observations
- Evaluate results with respect to hypothesis
- Refine hypothesis and start again
(Basically the same as all other research)

Hypothesis Testing
- H0: Null Hypothesis, status quo
- HA: Alternative Hypothesis, research question
- So, either: "The data does not support H0" or "We fail to reject H0"

Types of Data
- Continuous: height, age, time
- Discrete: # of days worked this week, # leaves on a tree
- Ordinal: {Good, O.K., Bad}
- Nominal: {Yes/No}, {Teacher/Chemist/Haberdasher}

Picturing The Data

Pie Charts
- Nominal/Ordinal
- Only suitable for data that adds up to 1
- Hard to compare values in the chart

Bar Charts
- Nominal/Ordinal
- Easier to compare values than pie chart
- Suitable for a wider range of data

Dot Plots
- Nominal/Ordinal
- Represents all the data
- Difficult to read

Box Plots
- Nominal/Ordinal
- 1 IQR, 3 IQR
- Outliers

Scatter Plots
- Excellent for examining association between two variables

Histograms
- Continuous data
- Divide data into ranges

Time-Series Plots
- Time-related data, e.g. stock prices

Question 1
In a telephone survey of 68 households, when asked whether they have pets, the following were the responses:
- 16: No Pets
- 28: Dogs
- 32: Cats
Draw the appropriate graphic to illustrate the results!!

Question 1 - Solution
- Total number surveyed = 68
- Number with no pets = 16 => Total with pets = (68 - 16) = 52
- But total 28 dogs + 32 cats = 60 => So some people have both cats and dogs
- How many? It must be (60 - 52) = 8 people

  No pets    = 16
  Dogs only  = 20
  Cats only  = 24
  Both       = 8
  -------------------------
  Total      = 68

- Graphic: Pie Chart or Bar Chart

The Literary Digest Poll
- 1936 US Presidential Election
- Alf Landon (R) vs. Franklin D. Roosevelt (D)
- Literary Digest had been conducting successful presidential election polls since 1916
- They had correctly predicted the outcomes of the 1916, 1920, 1924, 1928, and 1932 elections by conducting polls.
- These polls were a lucrative venture for the magazine: readers liked them; newspapers played them up; and each "ballot" included a subscription blank.
- They sent out 10 million ballots to two groups of people:
  - prospective subscribers, "who were chiefly upper- and middle-income people"
  - a list designed to "correct for bias" from the first list, consisting of names selected from telephone books and motor vehicle registries
- Response rate: approximately 25%, or 2,376,523 responses
- Result: Landon in a landslide (predicted 57% of the vote, Roosevelt predicted 40%)
- Election result: Roosevelt received approximately 60% of the vote

The Literary Digest Poll: POSSIBLE CAUSES OF ERROR
Selection Bias:
- By taking names and addresses from telephone directories, the survey systematically excluded poor voters.
- Republicans were markedly overrepresented in 1936; Democrats did not have as many phones, were not as likely to drive cars, and did not read the Literary Digest.
- The "Sampling Frame" is the actual population of individuals from which a sample is drawn. Selection bias results when the sampling frame is not representative of the population of interest.
Non-response Bias:
- Because only about a quarter of the 10 million people returned surveys, non-respondents may have different preferences from respondents
- Indeed, respondents favored Landon
- Greater response rates reduce the odds of biased samples

Terminology
- Population: a set of entities concerning which statistical inferences are to be drawn.
- Sample: a number of independent observations from the same probability distribution.
- Parameter: a numerical value that identifies one member of a family of probability distributions; the members of the family are distinguished from each other by the values of a finite number of parameters.
- Bias: a factor that causes a statistical sample of a population to have some examples of the population less represented than others.

Outliers (and their treatment)
An "outlier" is an observation that does not fit the pattern in the rest of the data.
- Check the data
- Check with the measurer
- If there is reason to believe it is NOT real, change it if possible, otherwise leave it out (but note).
- If there is reason to believe it is real, leave it in and note.

The Mean
- The mean (arithmetic) is defined as the sum of all the elements, divided by the number of elements.
- The statistical mean of a set of observations is the average of the measurements in a set of data.

The Variance
- But there can be a lot of variance in individual elements, e.g. teacher salaries
- Average = €22,000, Lowest = €12,000
- Difference = 12,000 - 22,000 = -10,000
- Sum of (Sample - Average) = 0, thus we need to define variance.
- The variance of a set of data is a cumulative measure of the squares of the difference of all the data values from the mean, divided by sample size minus one:

  Variance = Sum((x - mean)^2) / (n - 1)

Standard Deviation
- The standard deviation of a set of data is the positive square root of the variance:

  Std dev = SqRt(Sum((x - mean)^2) / (n - 1))

Question 2
Find the mean and variance of the following sample values: 36, 41, 43, 44, 46

Mean: (36 + 41 + 43 + 44 + 46) / 5 = 42
Variance:
  Difference       Square
  36 - 42 = -6     36
  41 - 42 = -1     1
  43 - 42 = 1      1
  44 - 42 = 2      4
  46 - 42 = 4      16
  ----------------------------------------
  Sum of squares   58
  58 / (5 - 1) = 58 / 4 = 14.5

The Normal Distribution

Density Curves: Properties

The Normal Distribution
- The graph has a single peak at the center; this peak occurs at the mean
- The graph is symmetrical about the mean
- The graph never touches the horizontal axis
- The area under the graph is equal to 1

Characterization
- A normal distribution is bell-shaped and symmetric.
- The distribution is determined by the mean mu, μ, and the standard deviation sigma, σ.
- The mean mu controls the center and sigma controls the spread.

The Normal Distribution
If a variable is normally distributed, then:
- within one standard deviation of the mean there will be approximately 68% of the data
- within two standard deviations of the mean there will be approximately 95% of the data
- within three standard deviations of the mean there will be approximately 99.7% of the data

Why?
- One reason the normal distribution is important is that many psychological and organisational variables are distributed approximately normally. Measures of reading ability, introversion, job satisfaction, and memory are among the many psychological variables approximately normally distributed. Although the distributions are only approximately normal, they are usually quite close.
- A second reason the normal distribution is so important is that it is easy for mathematical statisticians to work with. This means that many kinds of statistical tests can be derived for normal distributions. Almost all statistical tests discussed in this text assume normal distributions. Fortunately, these tests work very well even if the distribution is only approximately normal. Some tests work well even with very wide deviations from normality.

One Tail / Two Tail
Imagine we undertook an experiment where we measured staff productivity before and after we introduced a computer system to help record solutions to common issues of work.
- Average productivity before = 6.4
- Average productivity after = 9.2
[Figure: Before = 6.4 and After = 9.2 marked on a scale from 0 to 10, with standard deviations (σ) marked from the mean]
- Is this a significant difference?
- Or is it more likely a sampling variation?
- How many standard deviations from the mean is this?
- And is it statistically significant?

One Tail / Two Tail
- One-Tailed: H0: μ1 >= μ2, HA: μ1 < μ2
- Two-Tailed: H0: μ1 = μ2, HA: μ1 <> μ2

STANDARD NORMAL DISTRIBUTION
- Normal Distribution is defined as N(mean, (Std dev)^2)
- Standard Normal Distribution is defined as N(0, (1)^2)
- Using the following formula:

  Z = (X - mean) / (Std dev)

  will convert a normally distributed value X into a standard normal value Z, so that a single standard normal table can be used.

Exercise
If the average IQ in a given population is 100, and the standard deviation is 15, what percentage of the population has an IQ of 145 or higher?

Answer
  P(X >= 145)
  = P(Z >= ((145 - 100) / 15))
  = P(Z >= 3)
From tables: 99.87% are less than 3 => 0.13% of population

Trends in Statistical Tests used in Research Papers

  Historically                                  Currently
  Testing                                       Estimation
  Hypothesis Tests       Quoting P-Values       Confidence Intervals
  Results in:            Results in:            Results in:
  Accept/Reject          p-Value                Approx. Mean

Confidence Intervals
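As a bridge from the standard normal exercise to the intervals discussed here, the table lookups can also be reproduced in code. The sketch below is an illustration rather than part of the original slides: it assumes Python's standard-library `statistics.NormalDist`, and the helper name `upper_tail` is hypothetical. It recomputes the IQ answer P(Z >= 3), the 68/95/99.7 rule, and the 97.5th percentile of the standard normal distribution, which is where the multiplier 1.96 in a 95% interval comes from.

```python
from statistics import NormalDist

# Standard normal distribution N(0, 1^2), as defined on the slides.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def upper_tail(x, mean, std_dev):
    """Return P(X >= x) for X ~ N(mean, std_dev^2) using the z-score."""
    z = (x - mean) / std_dev
    return 1.0 - STANDARD_NORMAL.cdf(z)


# IQ exercise: mean 100, standard deviation 15, threshold 145 (z = 3).
iq_tail = upper_tail(145, mean=100, std_dev=15)
print(f"P(IQ >= 145) = {iq_tail:.2%}")

# Share of the data within 1, 2 and 3 standard deviations of the mean.
within = {k: STANDARD_NORMAL.cdf(k) - STANDARD_NORMAL.cdf(-k) for k in (1, 2, 3)}
for k, share in within.items():
    print(f"within {k} standard deviation(s): {share:.1%}")

# Two-sided 95% multiplier: 2.5% is left in each tail.
z_95 = STANDARD_NORMAL.inv_cdf(0.975)
print(f"z for a 95% interval = {z_95:.2f}")
```

The same `inv_cdf` call gives the multiplier for any other confidence level, which avoids reading values off a printed table.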
A confidence interval is used to express the uncertainty in a quantity being estimated. There is uncertainty because inferences are based on a random sample of finite size from a population or process of interest. To judge the statistical procedure we can ask what would happen if we were to repeat the same study, over and over, getting different data (and thus different confidence intervals) each time.
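The "repeat the same study over and over" idea can be tried out with a small simulation. This is a minimal sketch under assumptions that are not in the slides: the data are drawn from a normal distribution with a known standard deviation, the population values (mean 100, standard deviation 15) and the sample size of 25 are purely illustrative, and the function name `coverage` is hypothetical. It counts how often the interval built from each repeated sample contains the true mean.

```python
import random
from statistics import NormalDist, fmean


def coverage(true_mean, std_dev, n, trials, level=0.95, seed=0):
    """Fraction of repeated studies whose interval contains true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * std_dev / n ** 0.5  # z times the standard error
    hits = 0
    for _ in range(trials):
        # One "study": draw n observations and build its interval.
        sample = [rng.gauss(true_mean, std_dev) for _ in range(n)]
        sample_mean = fmean(sample)
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials


rate = coverage(true_mean=100, std_dev=15, n=25, trials=2000)
print(f"Intervals containing the true mean: {rate:.1%}")
```

Each run produces different intervals, but under these assumptions the share that contains the true mean should settle close to the stated confidence level.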
If we know the true population mean and sample n individuals, then, if the data is normally distributed, the mean of these n samples has a 95% chance of falling into the interval

  mean +/- 1.96 (Standard Error)
where the standard error of the mean may be calculated as follows:

  Standard Error = (Std dev) / SqRt(n)

Example 1
Does FF-PD-G have more of the popular vote than FG-L? In a random sample of 721 respondents:
- 382 FF-PD-G
- 339 FG-L
Can we conclude that FF-PD-G has more than 50% of the popular vote?

Example 1 - Solution
- Sample proportion = p = 382/721 = 0.53
- Sample size = n = 721
- Standard Error = (SqRt((p(1-p)/n))) = 0.02
- 95% Confidence Interval: 0.53 +/- 1.96(0.02) = 0.53 +/- 0.04 = [0.49, 0.57]
Thus, we cannot conclude that FF-PD-G had more of the popular vote, since this interval spans 50%. So, we say: "the data are consistent with the hypothesis that there is no difference"
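The arithmetic in the Example 1 solution can be checked with a short script. This is a sketch, not the lecturer's own code: the function name `proportion_ci` is hypothetical, and it uses the exact normal multiplier from the standard library instead of the rounded 1.96, so the endpoints differ slightly from the hand calculation before rounding.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, level=0.95):
    """Normal-approximation confidence interval for a sample proportion."""
    p = successes / n
    standard_error = sqrt(p * (1 - p) / n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided multiplier
    return p - z * standard_error, p + z * standard_error


# Example 1: 382 of 721 respondents chose FF-PD-G.
low, high = proportion_ci(382, 721)
print(f"95% CI: [{low:.3f}, {high:.3f}]")
print("Interval spans 50%:", low <= 0.5 <= high)
```

Rounded to two decimals the interval agrees with the hand calculation, and it still spans 50%.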

Example 2
Did Obama have more of the popular vote than McCain? In a random sample of 1000 respondents:
- 532 Obama
- 468 McCain
Can we conclude that Obama had more than 50% of the popular vote?

Example 2 - 95% CI
- Sample proportion = p = 532/1000 = 0.532
- Sample size = n = 1000
- Standard Error = (SqRt((p(1-p)/n))) = 0.016
- 95% Confidence Interval: 0.532 +/- 1.96(0.016) = 0.532 +/- 0.03136 = [0.50064, 0.56336]
Thus, we can conclude that Obama had more of the popular vote, since this interval does not span 50%. So, we say: "the data are consistent with the hypothesis that there is a difference at the 95% confidence level"
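The same question can also be posed as a one-tailed Z-test, in the spirit of the One Tail / Two Tail slides. The sketch below rests on assumptions the slides do not state: the hypotheses are taken to be H0: p <= 0.5 and HA: p > 0.5, the standard error is computed under H0 (p = 0.5), and the function name `one_tailed_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_z_test(successes, n, p0=0.5):
    """Z statistic and upper-tail p-value for H0: p <= p0 vs HA: p > p0."""
    p_hat = successes / n
    se_null = sqrt(p0 * (1 - p0) / n)  # standard error if H0 were true
    z = (p_hat - p0) / se_null
    p_value = 1.0 - NormalDist().cdf(z)  # area in the upper tail only
    return z, p_value


# Example 2: 532 of 1000 respondents chose Obama.
z, p_value = one_tailed_z_test(532, 1000)
print(f"z = {z:.2f}, one-tailed p-value = {p_value:.4f}")
```

Under these assumptions the sample proportion sits roughly two standard errors above 50%. A one-tailed p-value is half the two-tailed one for the same data, so it should not be compared directly with a two-sided interval.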

Example 2 - 99% CI
- Sample proportion = p = 532/1000 = 0.532
- Sample size = n = 1000
- Standard Error = (SqRt((p(1-p)/n))) = 0.016
- 99% Confidence Interval: 0.532 +/- 2.58(0.016) = 0.532 +/- 0.041 = [0.491, 0.573]
Thus, we cannot conclude that Obama had more of the popular vote, since this interval does span 50%. So, we say: "the data are consistent with the hypothesis that there is no difference at the 99% confidence level"

Example 2 - 99.99% CI
- Sample proportion = p = 532/1000 = 0.532
- Sample size = n = 1000
- Standard Error = (SqRt((p(1-p)/n))) = 0.016
- 99.99% Confidence Interval: 0.532 +/- 3.89(0.016) = 0.532 +/- 0.06 = [0.472, 0.592]
Thus, we cannot conclude that Obama had more of the popular vote, since this interval does span 50%. So, we say: "the data are consistent with the hypothesis that there is no difference at the 99.99% confidence level"
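The three versions of Example 2 differ only in the multiplier, so they can be reproduced in one loop. This is a sketch under assumptions: the function name `proportion_interval` is hypothetical, and the multipliers are computed from the standard normal distribution rather than read from a table, so the endpoints differ slightly from the rounded hand calculations.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, level):
    """Return (multiplier, low, high) for a normal-approximation interval."""
    p = successes / n
    standard_error = sqrt(p * (1 - p) / n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided multiplier
    return z, p - z * standard_error, p + z * standard_error


# Example 2: 532 of 1000 respondents, at three confidence levels.
results = {}
for level in (0.95, 0.99, 0.9999):
    z, low, high = proportion_interval(532, 1000, level)
    results[level] = (z, low, high)
    verdict = "spans 50%" if low <= 0.5 <= high else "excludes 50%"
    print(f"{level:.2%} CI: z = {z:.2f}, [{low:.3f}, {high:.3f}] {verdict}")
```

The pattern matches the slides: the more confidence is demanded, the wider the interval becomes, and only the 95% interval excludes 50%.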

T-Tests

One Tail / Two Tail
[Figure: T-test and Z-test]

T-Tests
- powerful parametric test for calculating the significance of a small sample mean
- necessary for small samples because their distributions are not normal
- one first has to calculate the "degrees of freedom"

T-Tests
The t-test is often called the Student's t-test. It was created by a chief brewer named William S. Gosset who worked for the Guinness Brewery. He discovered this statistic as part of his work in the brewery to compare the different brewing processes for changing raw materials into beer. Guinness did not allow its employees to publish results, but the management decided to allow Gosset to publish it under a pseudonym - Student. Hence we have th