




版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請進(jìn)行舉報或認(rèn)領(lǐng)
文檔簡介
1、MCMC Estimation for Random Effect Modelling The MLwiN experienceDr William J. BrowneSchool of Mathematical SciencesUniversity of NottinghamContents Random effect modelling, MCMC and MLwiN. Methods comparison Guatemalan child health example. Extendibility of MCMC algorithms: Cross classified and mult
2、iple membership models. Artificial insemination and Danish chicken examples. Further Extensions.Random effect models Models that account for the underlying structure in the dataset. Originally developed for nested structures (multilevel models), for example in education, pupils nested within schools
3、. An extension of linear modelling with the inclusion of random effects. A typical 2-level model is Here i indexes pupils and j indexes schools.), 0(), 0(2210eijujijjijijNeNueuxyMLwiN Software package designed specifically for fitting multilevel models. Developed by a team led by Harvey Goldstein an
4、d Jon Rasbash at the Institute of Education in London over past 15 years or so. Earlier incarnations ML2, ML3, MLN. Originally contained classical IGLS estimation methods for fitting models. MLwiN launched in 2019 also included MCMC estimation. My role in team was as developer of MCMC functionality
5、in MLwiN during 4.5 years at the IOE.Estimation Methods for Multilevel ModelsDue to additional random effects no simple matrix formulae exist for finding estimates in multilevel models.Two alternative approaches exist:Iterative algorithms e.g. IGLS, RIGLS, EM in HLM that alternate between estimating
6、 fixed and random effects until convergence. Can produce ML and REML estimates.Simulation-based Bayesian methods e.g. MCMC that attempt to draw samples from the posterior distribution of the model.MCMC Algorithm Consider the 2-level model MCMC algorithms work in a Bayesian framework and so we need t
7、o add prior distributions for the unknown parameters. Here there are 4 sets of unknown parameters: We will add prior distributions ), 0(), 0(2210eijujijjijijNeNueuxy22,euu)(),(),(22eupppMCMC Algorithm (2)The algorithm for this model then involves simulating in turn from the 4 sets of conditional dis
8、tributions. Such an algorithm is known as Gibbs Sampling. MLwiN uses Gibbs sampling for all normal response models.Firstly we set starting values for each group of unknown parameters, Then sample from the following conditional distributions, firstly To get .)0(2)0(2)0()0(,euu),|(2)0(2)0()0(euuyp)1(M
9、CMC Algorithm (3)We next sample fromto get , thento get , then finallyTo get . We have then updated all of the unknowns in the model. The process is then simply repeated many times, each time using the previously generated parameter values to generate the next set),|(2)0(2)0()1(euyup)1(u),|(2)0()1()
10、1(2euuyp2)1(u),|(2)1()1()1(2ueuyp2)1(eBurn-in and estimatesBurn-in: It is general practice to throw away the first n values to allow the Markov chain to approach its equilibrium distribution namely the joint posterior distribution of interest. These iterations are known as the burn-in.Finding Estima
11、tes: We continue generating values at the end of the burn-in for another m iterations. These m values are then average to give point estimates of the parameter of interest. Posterior standard deviations and other summary measures can also be obtained from the chains.Methods for non-normal responses
12、When the response variable is Binomial or Poisson then different algorithms are required. IGLS/RIGLS methods give quasilikelihood estimates e.g. MQL, PQL. MCMC algorithms including Metropolis Hastings sampling and Adaptive Rejection sampling are possible. Numerical Quadrature can give ML estimates b
13、ut is not without problems. So why use MCMC? Often gives better estimates for non-normal responses. Gives full posterior distribution so interval estimates for derived quantities are easy to produce. Can easily be extended to more complex problems. Potential downside 1: Prior distributions required
14、for all unknown parameters. Potential downside 2: MCMC estimation is much slower than the IGLS algorithm.The Guatemalan Child Health dataset.This consists of a subsample of 2,449 respondents from the 1987 National Survey of Maternal and Child Helath, with a 3-level structure of births within mothers
15、 within communities.The subsample consists of all women from the chosen communities who had some form of prenatal care during pregnancy. The response variable is whether this prenatal care was modern (physician or trained nurse) or not.Rodriguez and Goldman (2019) use the structure of this dataset t
16、o consider how well quasi-likelihood methods compare with considering the dataset without the multilevel structure and fitting a standard logistic regression.They perform this by constructing simulated datasets based on the original structure but with known true values for the fixed effects and vari
17、ance parameters.They consider the MQL method and show that the estimates of the fixed effects produced by MQL are worse than the estimates produced by standard logistic regression disregarding the multilevel structure!The Guatemalan Child Health dataset.Goldstein and Rasbash (2019) consider the same
18、 problem but use the PQL method. They show that the results produced by PQL 2nd order estimation are far better than for MQL but still biased.The model in this situation is In this formulation i,j and k index the level 1, 2 and 3 units respectively.The variables x1,x2 and x3 are composite scales at
19、each level because the original model contained many covariates at each level.Browne and Draper (2019) considered the hybrid Metropolis-Gibbs method in MLwiN and two possible variance priors (Gamma-1(,) and Uniform.)., 0( and ), 0( where )(logit with)(Bernouilli223322110vkujkkjkkjkijkijkijkijkNvNuvu
20、xxxppySimulation ResultsThe following gives point estimates (MCSE) for 4 methods and 500 simulated datasets.Parameter (True)MQL1PQL2GammaUniform0 (0.65)0.474 (0.01)0.612 (0.01)0.638 (0.01)0.655 (0.01)1 (1.00)0.741 (0.01)0.945 (0.01)0.991 (0.01)1.015 (0.01)2 (1.00)0.753 (0.01)0.958 (0.01)1.006 (0.01)
21、1.031 (0.01)3 (1.00)0.727 (0.01)0.942 (0.01)0.982 (0.01)1.007 (0.01)2v (1.00)0.550 (0.01)0.888 (0.01)1.023 (0.01)1.108 (0.01)2u (1.00)0.026 (0.01)0.568 (0.01)0.964 (0.02)1.130 (0.02)Simulation ResultsThe following gives interval coverage probabilities (90%/95%) for 4 methods and 500 simulated datase
22、ts.Parameter (True)MQL1PQL2GammaUniform0 (0.65)67.6/76.886.2/92.086.8/93.288.6/93.61 (1.00)56.2/68.690.4/96.292.8/96.492.2/96.42 (1.00)13.2/17.684.6/90.888.4/92.688.6/92.83 (1.00)59.0/69.685.2/89.886.2/92.288.6/93.62v (1.00)0.6/2.470.2/77.689.4/94.487.8/92.22u (1.00)0.0/0.021.2/26.884.2/88.688.0/93.
23、0Summary of simulationsThe Bayesian approach yields excellent bias and coverage results.For the fixed effects, MQL performs badly but the other 3 methods all do well.For the random effects, MQL and PQL both perform badly but MCMC with both priors is much better.Note that this is an extreme scenario
24、with small levels 1 in level 2 yet high level 2 variance and in other examples MQL/PQL will not be so bad.Extension 1: Cross-classified modelsFor example, schools by neighbourhoods. Schools will draw pupils from many different neighbourhoods and the pupils of a neighbourhood will go to several schoo
25、ls. No pure hierarchy can be found and pupils are said to be contained within a cross-classification of schools by neighbourhoods : nbhd 1nbhd 2Nbhd 3School 1xxxSchool 2xxSchool 3xxxSchool 4xxxxSchool S1 S2 S3 S4Pupil P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12Nbhd N1 N2 N3NotationWith hierarchical model
26、s we use a subscript notation that has one subscript per level and nesting is implied reading from the left. For example, subscript pattern ijk denotes the ith level 1 unit within the jth level 2 unit within the kth level 3 unit.If models become cross-classified we use the term classification instea
27、d of level. With notation that has one subscript per classification, that captures the relationship between classifications, notation can become very cumbersome. We propose an alternative notation introduced in Browne et al. (2019) that only has a single subscript no matter how many classifications
28、are in the model.Single subscript notationSchool S1 S2 S3 S4Pupil P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12Nbhd N1 N2 N3inbhd(i)sch(i)111221311422512622723833934102411341234) 1 ()3()()2()(0iischinbhdieuuyWe write the model as1)3(4)2(30111)3(1)2(101euuyeuuyWhere classification 2 is neighbourhood and cla
29、ssification 3 is school. Classification 1 always corresponds to the classification at which the response measurements are made, in this case patients. For pupils 1 and 11 equation (1) becomes:Classification diagramsSchoolPupilNeighbourhoodSchoolPupilNeighbourhoodNested structure where schools are co
30、ntained within neighbourhoodsCross-classified structure where pupils from a school come from many neighbourhoods and pupils from a neighbourhood attend several schools.In the single subscript notation we lose information about the relationship(crossed or nested) between classifications. A useful way
31、 of conveying this information is with the classification diagram. Which has one node per classification and nodes linked by arrows have a nested relationship and unlinked nodes have a crossed relationship.Example : Artificial insemination by donor Women w1 w2 w3 Cycles c1 c2 c3 c4 c1 c2 c3 c4 c1 c2
32、 c3 c4 Donations d1 d2 d1 d2 d3 d1 d2 Donors m1 m2 m3 1901 women279 donors 1328 donations12100 ovulatory cyclesresponse is whether conception occurs in a given cycle In terms of a unit diagram:DonorWomanCycleDonationOr a classification diagram:Model for artificial insemination data), 0(), 0(), 0()()
33、logit(), 1 (2)4()4()(2)3()3()(2)2()2()()4()()3()()2()(iuidonoruidonationuiwomanidonoridonationiwomaniiiNuNuNuuuuXBinomialyWe can write the model as2)4(u012345672)2(u2) 3(uParameterDescriptionEstimate(se)intercept-4.04(2.30)azoospermia *0.22(0.11)semen quality0.19(0.03)womens age35-0.30(0.14)sperm co
34、unt0.20(0.07)sperm motility0.02(0.06)insemination to early-0.72(0.19)insemination to late-0.27(0.10)women variance1.02(0.21)donation variance0.644(0.21)donor variance0.338(0.07)Results:Note cross-classified models can be fitted in IGLS but are far easier to fit using MCMC estimation.Extension 2: Mul
35、tiple membership models When level 1 units are members of more than one higher level unit we describe a model for such data as a multiple membership model.For example, Pupils change schools/classes and each school/class has an effect on pupil outcomes. Patients are seen by more than one nurse during
36、 the course of their treatment.Notation), 0()2(), 0()(22)2()2()()2()2(,eiujinursejijjiiiNeNueuwXByNote that nurse(i) now indexes the set of nurses that treat patient i and w(2)i,j is a weighting factor relating patient i to nurse j. For example, with four patients and three nurses, we may have the f
37、ollowing weights: n1(j=1)n2(j=2)n3(j=3)p1(i=1)0.500.5p2(i=2)100p3(i=3)00.50.5p4(i=4)0.50.504)2(2)2(1443)2(3)2(2332)2(1221)2(3)2(1115 . 05 . 0)(5 . 05 . 0)(1)(5 . 05 . 0)(euuXByeuuXByeuXByeuuXByHere patient 1 was seen by nurse 1 and 3 but not nurse 2 and so on. If we substitute the values of w(2)i,j
38、, i and j. from the table into (2) we get the series of equations :Classification diagrams for multiple membership relationshipsDouble arrows indicate a multiple membership relationship between classifications.patientnurseWe can mix multiple membership, crossed and hierarchical structures in a singl
39、e model.patientnursehospitalGP practiceHere patients are multiple members of nurses, nurses are nested within hospitals and GP practice is crossed with both nurse and hospital. Example involving nesting, crossing and multiple membership Danish chickensProduction hierarchy10,127 child flocks 725 hous
40、es 304 farmsBreeding hierarchy10,127 child flocks200 parent flocks farm f1 f2 Houses h1 h2 h1 h2 Child flocks c1 c2 c3 c1 c2 c3. c1 c2 c3. c1 c2 c3. Parent flock p1 p2 p3 p4 p5. Child flockHouseFarmParent flockAs a unit diagram:As a classification diagram:Model and results), 0(), 0(), 0()()logit(),
41、1 (2)4()4()(2)3()3()(2)2()2()(.)4()()3()()2()2(,iuifarmuihouseujiflockpjiifarmihousejjiiiiNuNuNueuuuwXBBinomialy0123452)2(u2)3(u2)4(uParameterDescriptionEstimate(se)intercept-2.322(0.213)2019-1.239(0.162)2019-1.165(0.187)hatchery 2-1.733(0.255)hatchery 3-0.211(0.252)hatchery 4-1.062(0.388)parent flo
42、ck variance0.895(0.179)house variance0.208(0.108)farm variance0.927(0.197)Results:Note multiple membership models can be fitted in IGLS and this model/dataset represents roughly the most complex model that the method can handle.Such models are far easier to fit using MCMC estimation.Further Extensio
43、ns / Work in progressMultilevel factor modelsResponse variables at different levelsMissing data and multiple imputationESRC grant: Sample size calculations, MCMC efficiency & Model identifiabilityWellcome Fellowship grant for Martin GreenMultilevel factor analysis modelling In sample surveys the
44、re are often many responses for each individual. Techniques like factor analysis are often used to identify underlying latent traits amongst these responses. Multilevel factor analysis allows factor analysis modelling to identify factors at various levels/classifications in the dataset so we can ide
45、ntify shared latent traits as well as individual level traits. Due to the nature of MCMC algorithms by adding a step to allow for multilevel factor models in MLwiN, cross-classified models can also be fitted without any additional programming! See Goldstein and Browne (2019,2019) for more detail.Res
46、ponses at different levels In a medical survey some responses may refer to patients in a hospital while others may refer to the hospital itself. Models that combine these responses can be fitted using the IGLS algorithm in MLwiN and shouldnt pose any problems to MCMC estimation. The Centre for Multi
47、level modelling in Bristol are investigating such models as part of their LEMMA node in the ESRC research methods program. I am a named collaborator for the Lemma project. They are also looking at MCMC algorithms for latent growth models. Missing data and multiple imputation Missing data is prolifer
48、ate in survey research. One popular approach to dealing with missing data is multiple imputation (Rubin 1987) where several imputed datasets are created and then the model of interest is fitted to each dataset and the estimates combined. Using a multivariate normal response multilevel model to gener
49、ate the imputations using MCMC in MLwiN is described in chapter 17 of Browne (2019) James Carpenter (LSHTM) has begun work on macros in MLwiN that automate the multiple imputation procedure. Sample size calculations Another issue in data collection is how big a sample do we need to collect? Such sam
50、ple size calculations have simple formulae if we can assume that an independent sample can be generated. If however we wish to account for the data structure in the calculation then things are more complex. One possibility is a simulation-based approach similar to that used in the model comparisons
51、described earlier where many datasets are simulated to look at the power for a fixed sample size. Mousa Golalizadeh Lehi will be joining me in February on an ESRC grant looking at such an approach. A 4th year MMath. student (Lynda Leese) is looking at the approach for nested models.Efficient MCMC al
52、gorithms In MLwiN we have tended to use the simplest, most generally applicable MCMC algorithms for multilevel models. For particular models there are many approaches that may improve the performance / mixing of the MCMC algorithm. We will also investigate some of these methods in the ESRC grant. Br
53、owne (2019) looked at some reparameterisation methods for cross-classified models in a bird nesting dataset. A second 4th year MMath. student (Francis Bourchier) is looking at MCMC methods based around the IGLS representation of nested models which are interesting.Model Identifiability The final par
54、t of the ESRC grant is to look at whether a model is identifiable/estimable given a particular set of data. Cross-classified datasets where there are few level 1 units per higher level unit can result in each observation being factored into several random effects with very few observations being use
55、d to estimate each random effect. We are interested in establishing whether we can really estimate all parameters in such models. An example where we cant would be a dataset of patients who are attended by doctors in wards. Now if there is only one doctor per ward and likewise one ward per doctor th
56、en we cannot tease out doctor and ward effects. Again this work was motivated by a bird nesting dataset.Wellcome Fellowship Martin Green has been successful in obtaining 4 years of funding from Wellcome to come and work with me. The project is entitled Use of Bayesian statistical methods to investig
57、ate farm management strategies, cow traits and decision-making in the prevention of clinical and sub-clinical mastitis in dairy cows. Martin is a qualified veterinary and a specialist farm animal surgeon. He completed a PhD in 2019 at the University of Warwick veterinary epidemiology and used MCMC t
58、o fit binary response multilevel models in both MLwiN and WinBUGS to look at various factors that affect clinical mastitis in dairy cows. Wellcome Fellowship In the 4 years we are aiming to analyse a huge dataset that Martin has been collecting in a Milk Development Council grant. In particular we w
59、ill look at how farm management practices, environmental conditions and cow characteristics influence the risk of mastitis during the dry period. We will hope to get interesting applied results and also some novel statistical methodology from the grant. MCMC will again play a big part. Martin has be
60、en appointed as a professor in the new vet school and will be working there 1 day a fortnight during the grant before moving there full time after the four years.Conclusions In this talk we have attempted to show the flexibility of MCMC methods in attempting to match the complexity of data structures fo
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負(fù)責(zé)。
- 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時也不承擔(dān)用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。
最新文檔
- T-ZSM 0049-2024“領(lǐng)跑者”評價技術(shù)要求 機(jī)織兒童服裝
- 二零二五年度高效節(jié)能大棚租賃及能源管理協(xié)議
- 二零二五年度個人環(huán)保項目貸款抵押擔(dān)保合同
- 二零二五年度汽車銷售區(qū)域代理退出協(xié)議
- 二零二五年度街道辦事處社區(qū)工作者績效激勵聘用合同
- 二零二五年度智能交通管理系統(tǒng)知識產(chǎn)權(quán)授權(quán)協(xié)議
- 2025年度車輛質(zhì)押融資服務(wù)協(xié)議
- 二零二五年度高新技術(shù)園區(qū)建設(shè)資金委托墊資合同
- 2025年度終止供貨協(xié)議函模板與合同終止后的利益平衡
- 企業(yè)采購管理流程改進(jìn)調(diào)研報告
- 豬場消防安全培訓(xùn)
- 歐式古典風(fēng)格-室內(nèi)設(shè)計風(fēng)67課件講解
- 2024解析:第十章 浮力綜合應(yīng)用-基礎(chǔ)練(解析版)
- 【MOOC】社會調(diào)查與研究方法-北京大學(xué) 中國大學(xué)慕課MOOC答案
- 汽車維護(hù)課件 1.3 舉升機(jī)的使用
- 醫(yī)院培訓(xùn)課件:《民法典》“醫(yī)療損害責(zé)任”逐條解讀
- 自身免疫性腦炎護(hù)理常規(guī)
- 《信息技術(shù)基礎(chǔ)》高職全套教學(xué)課件
- GB/T 19077-2024粒度分析激光衍射法
- 露天礦山開采施工組織方案
- 北京市西城區(qū)2022-2023學(xué)年高三上學(xué)期1月期末考試歷史試題 附答案
評論
0/150
提交評論