Quantitative Genetics 12-13: Population Structure and Association Analysis
Population structure

Population structure means the "makeup" or composition of a population. By population structure, population geneticists mean that, instead of a single, simple population, populations are subdivided in some way. The overall "population of populations" is often called a metapopulation, while the individual component populations are often called subpopulations, local populations, or demes.

In fact, in many real populations there may not be any obvious individual populations or substructure at all, and the populations are continuous. However, even in effectively continuous populations, different areas can have different gene frequencies, because the whole metapopulation is not panmictic (randomly mating). For instance, among humans, Scotland, the North of England, and London have some quite major language differences, suggesting substructure, but you would be hard put to find an exact boundary where there is a changeover. Such populations are structured, but continuously, in space.

A very good definition of population structure is that populations show deviations from Hardy-Weinberg proportions, or deviations from panmixia. If there is inbreeding, or selection, or if migration is important, then populations can be said to be structured in some way.

Hardy-Weinberg law

Gene frequencies and genotype ratios in a randomly breeding population remain constant from generation to generation.
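As a hedged illustration of the Hardy-Weinberg law, the sketch below computes allele frequencies from genotype counts at one biallelic locus, the genotype counts expected under Hardy-Weinberg proportions (p², 2pq, q²), and a chi-square statistic for the departure from them. The function names and the genotype counts are hypothetical, chosen only for this example, and the code assumes both alleles are present in the sample (otherwise an expected count is zero).

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q): frequencies of alleles A and a from genotype counts."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    return p, 1.0 - p


def hardy_weinberg_expected(n_AA, n_Aa, n_aa):
    """Expected counts of AA, Aa, aa under Hardy-Weinberg proportions."""
    n = n_AA + n_Aa + n_aa
    p, q = allele_frequencies(n_AA, n_Aa, n_aa)
    return p * p * n, 2 * p * q * n, q * q * n


def chi_square_hwe(n_AA, n_Aa, n_aa):
    """Chi-square statistic for departure from Hardy-Weinberg proportions."""
    observed = (n_AA, n_Aa, n_aa)
    expected = hardy_weinberg_expected(n_AA, n_Aa, n_aa)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def next_generation_p(p):
    """Frequency of allele A after one round of random mating."""
    q = 1.0 - p
    # Offspring genotypes occur at p^2, 2pq, q^2; A is all of AA and half of Aa.
    return p * p + 0.5 * (2 * p * q)


if __name__ == "__main__":
    counts = (360, 480, 160)  # made-up counts of AA, Aa, aa
    print("allele frequencies:", allele_frequencies(*counts))
    print("expected counts:", hardy_weinberg_expected(*counts))
    print("chi-square:", chi_square_hwe(*counts))
```

A large chi-square value (compared with a chi-square distribution with one degree of freedom) indicates a departure from Hardy-Weinberg proportions, which is one of the signals of population structure discussed in these notes. `next_generation_p` returns its input unchanged up to rounding, which is the "no change" statement of the law.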
Evolution involves changes in the gene pool. A population in Hardy-Weinberg equilibrium shows no change. If recessive alleles were continually tending to disappear, the population would soon become homozygous. Under Hardy-Weinberg conditions, genes that have no present selective value will nonetheless be retained.

When the Hardy-Weinberg law fails to apply

Mutation.

Gene flow. Members of one population may breed with occasional immigrants from an adjacent population of the same species. This can introduce new genes or alter existing gene frequencies in the residents. In many plants and some animals, gene flow can occur between different (but still related) species (hybridization/introgression). In either case, gene flow increases the variability of the gene pool.

Genetic drift. Allele frequencies change simply by chance: not every member of the population will become a parent, and not every set of parents will produce the same number of offspring.

Nonrandom mating. One of the cornerstones of the Hardy-Weinberg equilibrium is that mating in the population must be random. If individuals (usually females) are choosy in their selection of mates, the gene frequencies may become altered. Darwin called this sexual selection. Nonrandom mating seems to be quite common.

Methods for testing for population structure

A standard approach involves sampling DNA from members of a number of potential source populations and using these samples to estimate allele frequencies in each population at a series of unlinked loci. Using the estimated allele frequencies, it is then possible to compute the likelihood that a given genotype originated from each population. Individuals of unknown origin can then be assigned to populations, on the basis of their genetic information, according to these likelihoods.

Testing for structure matters in practice. For example, when association mapping is used to find disease genes, the presence of undetected population structure can lead to spurious associations and thus invalidate standard tests.

Ancestry models

There are four main models for the ancestry of individuals.

No-admixture model. Each individual comes purely from one of the K populations. The output reports the posterior probability that individual i is from population k. The prior probability for each population is 1/K. This model is appropriate for studying fully discrete populations and is often more powerful than the admixture model at detecting subtle structure.

A note on terminology: a prior probability describes a variable in the absence of some piece of evidence, whereas a posterior probability is the conditional probability after that evidence has been taken into account. A prior is often a purely subjective estimate made by an experienced expert. For example, for Obama's level of support p in a US presidential election, a prior probability can be used to express the uncertainty about p before any opinion poll is conducted.
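The likelihood-based assignment and the prior/posterior distinction can be sketched as follows. This is a minimal illustration, not the MCMC method that structure implements: it assumes the allele frequencies of each candidate population are already known and strictly positive, that loci are unlinked, and that genotypes follow Hardy-Weinberg proportions within each population. The function names and all frequencies are made up for the example.

```python
import math


def genotype_log_likelihood(genotype, locus_freqs):
    """Log-likelihood of a diploid multilocus genotype in one population.

    genotype: list of (allele1, allele2) tuples, one per locus.
    locus_freqs: list of dicts mapping allele -> frequency, one per locus.
    """
    total = 0.0
    for (a, b), freqs in zip(genotype, locus_freqs):
        prob = freqs[a] * freqs[b]
        if a != b:
            prob *= 2  # a heterozygote can arise in two orders
        total += math.log(prob)  # unlinked loci: probabilities multiply
    return total


def assignment_posterior(genotype, populations):
    """Posterior probability of each population under a uniform 1/K prior."""
    k = len(populations)
    log_post = [math.log(1.0 / k) + genotype_log_likelihood(genotype, pop)
                for pop in populations]
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_post)
    weights = [math.exp(x - m) for x in log_post]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical allele frequencies at two loci in two populations.
    pop1 = [{"A": 0.9, "a": 0.1}, {"B": 0.8, "b": 0.2}]
    pop2 = [{"A": 0.2, "a": 0.8}, {"B": 0.3, "b": 0.7}]
    individual = [("A", "A"), ("B", "b")]
    print(assignment_posterior(individual, [pop1, pop2]))
```

With a uniform prior the posterior is simply the normalized likelihood, so the individual is assigned mostly to the population in which its genotype is most probable.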
Admixture model. Each individual draws some fraction of his/her genome from each of the K populations, so individuals may have mixed ancestry. This is modeled by saying that individual i has inherited some fraction of his/her genome from ancestors in population k. The output records the posterior mean estimates of these proportions. Conditional on the ancestry vector, q(i), the origin of each allele is independent.

We recommend this model as a starting point for most analyses. It is a reasonably flexible model
for dealing with many of the complexities of real populations. Admixture is a common feature of real data, and you probably won't find it if you use the no-admixture model.

Linkage model. This is essentially a generalization of the admixture model to deal with "admixture linkage disequilibrium", i.e., the correlations that arise between linked markers in recently admixed populations. The basic model is that, t generations in the past, there was an admixture event that mixed the K populations. If you consider an individual chromosome, it is composed of a series of "chunks" that are inherited as discrete units from ancestors at the time of the admixture. Admixture LD arises because linked alleles are often on the same chunk, and therefore come from the same ancestral population. The sizes of the chunks are assumed to be independent exponential random variables with mean length 1/t (in Morgans). In practice we estimate a "recombination rate" r from the data that corresponds to the rate of switching from the present chunk to a new chunk. Each chunk in individual i is derived independently from population k with probability q(i)k, where q(i)k is the proportion of that individual's ancestry from population k.

Using prior population information. By default, structure uses only genetic information to learn about population structure. However, there is often other information that might be relevant to the clustering (e.g., physical characteristics of sampled individuals or geographic sampling location). At present, structure can use this information in two ways. First, the user might find that the pre-defined groups (e.g., sampling locations) correspond almost exactly to structure clusters. Second, prior information may be introduced through the use of learning samples: some individuals are of known origin, and are used to classify individuals of unknown origin. For example, Beaumont et al. (2001) wanted to learn about the ancestry of Scottish wildcats (many of which are hybridized with feral domestic cats). They had genetic data from a set of pet house cats, which were defined as being in one population, and they inferred Q for the wildcats (with K = 2). Use of this sort of prior information will normally improve the accuracy of the inference.

Allele frequency models

There are two basic models. One model assumes that the allele frequencies in each population are independent draws from a distribution that is specified by a parameter called λ; λ = 1 is the default setting. The other model uses correlated allele frequencies: it says that frequencies in the different populations are likely to be similar (probably due to migration or shared ancestry).

The independent model works well for many data sets. Roughly speaking, this prior says that we expect allele frequencies in different populations to be reasonably different from each other. The correlated frequencies model says that they may actually be quite similar. This often improves clustering for closely related populations, but may increase the risk of overestimating K. If one population is quite divergent from the others, the correlated model can sometimes achieve better inference if that population is removed.

Estimating λ. Fixing λ = 1 is a good idea for most data, but in some situations (e.g., SNP data where most minor alleles are rare) smaller values may work better. For this reason, you can get the program to estimate λ for your data. You may want to do this once, perhaps for K = 1, and then fix λ at the estimated value thereafter, because there seem to be some problems with non-identifiability when trying to estimate too many of the hyperparameters (λ, α, F) at the same time.

Estimation of K (the number of populations)

Estimates of K should be treated with care for two reasons: (1) it is computationally difficult to obtain accurate estimates of Pr(X|K), and our method merely provides an ad hoc approximation, and (2) the biological interpretation of K may not be straightforward.

The procedure for estimating K generally works well in data sets with a small number of discrete populations. However, many real-world data sets do not conform precisely to the structure model (e.g., due to isolation by distance or inbreeding). In those cases there may not be a natural answer to what is the "correct" value of K. Perhaps for this kind of reason, it is not infrequent that in real data the value of our model choice criterion continues to increase with increasing K. Then it usually makes sense to focus on values of K that capture most of the structure in the data and that seem biologically sensible.

Steps in estimating K

1) (Command-line version) Set COMPUTEPROBS and INFERALPHA to 1 in the file extraparams. (Front End version) Make sure that α is allowed to vary.

2) Run the MCMC scheme for different values of MAXPOPS (K). At the end it will output a line "Estimated Ln Prob of Data". This is the estimate of ln Pr(X|K). You should run several independent runs for each K, in order to verify that the estimates are consistent across runs. If the variability across runs for a given K is substantial compared to the variability of estimates obtained for different K, you may need to use longer runs or a longer burn-in period. If ln Pr(X|K) appears to be bimodal or multimodal, the MCMC scheme may be finding different solutions. You can check for this by comparing the Q for different runs at a single K.

3) Compute posterior probabilities of K. For example, for a data set where the true K was 2 (Data Set 2A), we got

K    ln Pr(X|K)
1    -4356
2    -3983
3    -3982
4    -3983
5    -4006

We can start by assuming a uniform prior on K = 1, ..., 5. Then from Bayes' Rule, Pr(K = 2) is given by

$$\Pr(K=2) = \frac{e^{-3983}}{e^{-4356} + e^{-3983} + e^{-3982} + e^{-3983} + e^{-4006}} \approx 0.21.$$

Mild departures from the model can lead to overestimating K

When there is real population structure, this leads to LD among unlinked loci and departures from Hardy-Weinberg proportions. But some departures from the model can also lead to Hardy-Weinberg or linkage disequilibrium. Beginning in Version 2, we have suggested that the correlated allele frequency model should be used as a default because it often achieves better performance on difficult problems, but the user should be aware that this may make it easier to overestimate K in such settings than under the independent frequencies model (Falush et al., 2003a).

How to decide whether inferred structure is real

Informal pointers for choosing K: is the structure real?

There are a couple of informal pointers which might be helpful in selecting K. The first is that it's often the situation that Pr(K) is very small for K less than the appropriate value (effectively zero), and then more-or-less plateaus for larger K, as in the example of Data Set 2A shown above. In this sort of situation, where several values of K give similar estimates of log Pr(X|K), it seems that the smallest of these is often "correct".

It is a bit difficult to provide a firm rule for what we mean by "more-or-less plateaus". For small data sets, this might mean that the values of log Pr(X|K) are within 5-10. In very big data sets, the difference between K = 3 and K = 4 may be 50, but if the difference between K = 3 and K = 2 is 5,000, then I would definitely choose K = 3. We may not always be able to know the TRUE value of K, but we should aim for the smallest value of K that captures the major structure in the data.

The second pointer is that if there really are separate populations, there is typically a lot of information about the value of α, and once the Markov chain converges, α will normally settle down to be relatively constant (often with a range of perhaps 0.2 or less). However, if there isn't any real structure, α will usually vary greatly during the course of the run.

Suppose that you have a situation with two clear populations, but you are trying to decide whether one of these is further subdivided (i.e., the value of Pr(X|K=3) is similar to, or perhaps a little larger than, Pr(X|K=2)). Then one thing you could try is to run structure using only the individuals in the population that you suspect might be subdivided, and see whether there is a strong signal as described above.

In summary, you should be skeptical about population structure inferred on the basis of small differences in Pr(K) if there is no clear biological interpretation for the assignments, and the assignments are roughly symmetric to all populations and no individuals are strongly assigned.

Isolation by distance data

Isolation by distance refers to the idea that individuals may be spatially distributed across some region, with local dispersal. In this situation, allele frequencies vary gradually across the region. The underlying structure model is not well suited to data from this kind of scenario. When this occurs, the inferred value of K, and the corresponding allele frequencies in each group, can be rather arbitrary. Depending on the sampling scheme, most individuals may have mixed membership in multiple groups. That is, the algorithm will attempt to model the allele frequencies across the region using weighted averages of K distinct components. In such situations, interpreting the results may be challenging.

When FST is significant, but structure finds no structure

We occasionally get the following sort of question: "I have genotype data for individuals sampled from n locations. Tests of allele frequency differences indicate small but significant FST
between at least some locations. However structure does not find any differences. How do I interpret these results?"

When the predefined populations correspond closely to genetic populations, testing for frequency differences between predefined groups can be more powerful than applying structure. This is because the basic structure models aim to solve a much harder statistical problem, i.e., identifying population clusters without being told the likely subgroups in advance. For this reason there is a part of parameter space where there is not quite enough data for structure to get the "right" answer, even though a test of FST using the predefined labels detects population differentiation.

Association mapping

Association mapping refers to significant association of a molecular marker with a phenotypic trait. LD (special case: LA) refers to non-random association between two markers, or two genes/QTLs, or between a gene/QTL and a marker locus. Thus, association mapping is actually one of the several uses of LD. In a statistical sense, association refers to covariance of a marker polymorphism and a trait of interest, while LD represents covariance of polymorphisms exhibited by two molecular markers/genes.

How is LD measured?

A variety of statistics have been used to measure LD. To describe the two most common statistics for measuring LD, $D'$ and $r^2$, consider a pair of loci with alleles A and a at locus one, and B and b at locus two, with allele frequencies $p_A$, $p_a$, $p_B$, and $p_b$, respectively. The resulting haplotype frequencies are $p_{AB}$, $p_{Ab}$, $p_{aB}$, and $p_{ab}$. (A haplotype is a combination of alleles at multiple linked loci that are transmitted together.) The basic component of all LD statistics is the difference between the observed and expected haplotype frequencies,

$$D = p_{AB} - p_A\,p_B.$$

In their standard forms the two statistics are

$$D' = \frac{|D|}{D_{\max}}, \qquad D_{\max} = \begin{cases} \min(p_A p_b,\; p_a p_B) & \text{if } D > 0,\\ \min(p_A p_B,\; p_a p_b) & \text{if } D < 0, \end{cases} \qquad r^2 = \frac{D^2}{p_A\,p_a\,p_B\,p_b}.$$

$D'$ is scaled based on the observed allele frequencies, so it will range between 0 and 1 even if allele frequencies differ between the loci.

Figure panels (LD under different scenarios): (A) no recombination (mutations at two linked loci not separated in time); (B) independent assortment (mutations at two loci not separated in time); (C) no recombination (only mutations separated in time); (D) low recombination (mutations at two loci not separated in time).

LD decay

Factors affecting LD

The factors which lead to an increase in LD include inbreeding, small population size, genetic isolation between lineages, population subdivision, low recombination rate, population admixture, natural and artificial selection, balancing selection, etc. Admixture results in the introduction of chromosomes of different ancestry and allele frequencies. Often, the resulting LD extends to unlinked sites, even on different chromosomes, but breaks down rapidly with random mating. A reduction in population size (bottleneck) with accompanying extreme genetic drift also increases LD: during a bottleneck, only few allelic combinations are passed on to future generations.

Some other factors, which lead to a decrease/disruption in LD, include outcrossing, high recombination rate, high mutation rate, etc. Generally, LD decays more rapidly in outcrossing species as compared to selfing species. This is because recombination is less effective in selfing species, where individuals are more likely to be homozygous, than in outcrossing species.

There are other factors which may lead to either an increase or a decrease in LD, or may increase LD between some pairs of alleles and decrease LD between other pairs. For instance, mutations will disrupt LD between pairs involving wild alleles, and will promote LD between pairs involving mutant alleles. Similarly, genomic rearrangements may disrupt LD between genes separated due to rearrangement, but LD may increase between new gene combinations in the vicinity of breakpoints due to suppression of local recombination.

Gene conversion is a nonreciprocal transfer of genetic information in DNA genetic recombination, which occurs during meiotic division. It is a process by which DNA sequence information is transferred from one DNA helix (which remains unchanged) to another DNA helix, whose sequence is altered. It is one of the ways a gene may be mutated. Gene conversion may lead to non-Mendelian inheritance and has often been recorded in fungal crosses.

Other factors affecting LD include population structure, epistasis, gene conversion and ascertainment bias. Ascertainment bias (AB) is the bias introduced by the criteria used to select individuals and/or loci in which genetic variation is assayed, so that it leads to
inaccurate estimates of LD. Ascertainment is the way individuals with a trait are selected or found for genetic studies, and bias is a difference between the estimated and true value of LD in a statistical sample.

Mutation provides the raw material for producing polymorphisms that will be in LD. Recombination is the main phenomenon that weakens intrachromosomal LD, whereas interchromosomal LD is broken down by independent assortment. Population size also plays an important role. In small populations, the effects of genetic drift result in the consistent loss of rare allelic combinations, which increases LD levels. When genetic drift and recombination are at equilibrium,

$$E(r^2) = \frac{1}{1 + 4Nc},$$

where N is the effective population size and c is the recombination fraction between sites.

In any organism, LD can be used for identifying genomic regions which have been the targets of natural selection (directional and balancing selection) during the evolutionary process.

Natural selection. The process in nature by which, according to Darwin's theory of evolution, only the organisms best adapted to their environment tend to survive and transmit their genetic characteristics in increasing numbers to succeeding generations, while those less adapted tend to be eliminated.

Positive selection. Positive natural selection is the force that drives the increase in prevalence of advantageous traits, and it has played a central role in our development as a species. It is the selective process that fixes alleles which raise individual fitness because they carry advantageous mutations. Positive selection (also called Darwinian selection or adaptive selection) is the process by which new advantageous genetic variants sweep a population.

Genetic hitchhiking. The process by which an evolutionarily neutral, or in some cases deleterious, allele or mutation may spread through the gene pool by virtue of being linked to a gene that is positively selected.

Directional selection. Occurs when natural selection favors a single phenotype and therefore allele frequency continuously shifts in one direction. Under directional selection, the advantageous allele will increase in frequency independently of its dominance relative to other alleles (i.e., even if the advantageous allele is recessive, it will eventually become fixed). Directional selection stands in contrast to balancing selection, where selection may favor multiple alleles, and is closely related to purifying selection, which removes deleterious mutations from a population.

Purifying selection. Purifying selection refers to selection against nonsynonymous substitutions at the DNA level. In this case, the evolutionary distance based on synonymous substitutions is expected to be greater than the distance based on nonsynonymous substitutions.

Balancing selection. Balancing selection refers to a number of selective processes by which multiple alleles (different versions of a gene) are actively maintained in the gene pool of a population at frequencies above that of gene mutation.

Note: a non-synonymous mutation can be under positive selection pressure when it first appears.

Structure 2.0 (population structure)

/structure.html

Department of Human Genetics, University of Chicago

The program structure implements a model-based clustering method for inferring population structure using genotype data consisting of unlinked markers. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. It can be applied to most of the commonly used genetic markers, including SNPs, microsatellites, RFLPs and AFLPs.

A Java Runtime Environment (JRE version > 1.5.0) by Sun Microsystems is required before structure installation. The compatible JRE for various operating systems can be downloaded free from /download.

Front End (analysis start screen)
The front end organizes data analysis into "projects". Each project is connected to a single data file. When creating a project, the user also provides information that specifies how to read the data file (number of loci, number of individuals, etc.). These are characteristics of the data file, and are always the same within this project.

Parameters in file extraparams

The parameters in the file extraparams allow the user to refine the model in various ways.

Boolean options (yes/no type): type 1 for "Yes" or "Use this option"; 0 for "No" or "Don't use this option".

Program options
NOADMIX (Boolean) Assume the model without admixture (each individual is assumed to be completely from one of the K populations). In the output, instead of printing the average value of Q as in the admixture case, the program prints the posterior probability that each individual is from each population. 1 = no admixture; 0 = model with admixture.

LINKAGE (Boolean) Use the linkage model. RLOG10START sets the initial value of the recombination rate r per unit distance. RLOG10MIN and RLOG10MAX set the minimum and maximum allowed values for log10 r. RLOG10PROPSD sets the size of the proposed changes to log10 r in each update.

Each project also contains one or more "parameter sets". These allow the user to specify the details of the MCMC runs, including the number of repetitions, burn-in length, etc., as well as specifying the model of analysis (e.g., whether to allow admixture, models of allele frequencies, etc.). The user can then run the Markov chain at chosen values of K, for a given parameter set. The front end stores various summaries of the results, including a number of graphical plots, described below.

Building a project

First you need to construct an input file. Now, click on File > New Project. This opens up a wizard to import the data (Figure 2). The data are copied from the specified input file into the work directory chosen for the project. The wizard consists of four frames:

1. Specify the project directory, project name, and input data file.
2. Specify the basic characteristics of the data file: number of individuals, ploidy of the data (enter '2' for diploid organisms), number of loci, and the value that is used to indicate missing data. Click on "Show data file format" to get a summary of the lengths and number of lines in the data file.

Format for the data file

Essentially, the entire data set is arranged as a matrix in a single file, in which the data for individuals are in rows, and the loci are in columns. For a diploid organism, data for each individual can be stored either as 2 consecutive rows, where each locus is in one column, or in one row, where each locus is in two consecutive columns. Unless you plan to use the linkage model (see below) the order of the alleles for a single individual does not matter.

Genotype data (required; integer). Each allele at a given locus should be coded by a unique integer (e.g., microsatellite repeat score). The front end requires returns at the ends of each row, and does not allow returns within rows; the command-line version of structure treats returns in the same way as spaces or tabs.

Figure labels (data file layout): marker names; recessive alleles; inter-marker distances (in map order within linkage groups; -1 = unlinked); phase information; individual ID; population data; flag indicating whether the population data may be used; population from which the sample came.

Sample data file: POPDATA=1, NUMINDS=7, NUMLOCI=5, and MISSING=-9. Also, POPFLAG=0, PHENOTYPE=0, EXTRACOLS=0. The second column shows the geographic sampling location of individuals. There is no need to write out the label.

3. (Rows) Specify which, if any, of the optional extra row data are present: row of marker names; row of inter-marker distances; and a row of phase data after each individual. Also tick the "single line" box if data for each individual are stored in a single row, instead of in the standard format of two rows per individual.
4. (Columns) Specify which of the optional column data are there: individual ID (LABEL); population of origin (POPDATA); USEPOPINFO flag, which says to use the POPDATA information for certain individuals when using the prior population information model; phenotype data (for use in association mapping (Pritchard et al., 2000b)); other extra columns of data prior to the genotype data that should be ignored by structure.

When you've finished these steps, you'll get a summary of the data format; if this looks correct, click on 'proceed'. The program will now attempt to load the data file and create the new project.

Configuring a parameter set

Once you've successfully loaded a data file, you are ready to start running structure. You will create o...
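The data file layout described in these notes (two consecutive rows per diploid individual, one column per locus, optional label and population columns, and an integer code for missing data) can be sketched with a small reader. This is an illustrative parser only, not part of structure: the function name, its arguments, and the sample rows are hypothetical, and it handles only the two-row format with no marker-name or distance rows.

```python
def parse_two_row_format(text, num_loci, has_label=True, has_popdata=True,
                         missing=-9):
    """Parse genotype data stored as two consecutive rows per individual.

    Returns a list of dicts with keys "label", "pop" and "genotype", where
    "genotype" is a list of (allele1, allele2) tuples and None marks
    missing data.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if len(rows) % 2 != 0:
        raise ValueError("expected two rows per diploid individual")
    n_extra = int(has_label) + int(has_popdata)

    def clean(value):
        allele = int(value)
        return None if allele == missing else allele

    individuals = []
    for first, second in zip(rows[0::2], rows[1::2]):
        for row in (first, second):
            if len(row) != n_extra + num_loci:
                raise ValueError("unexpected number of columns: %r" % row)
        label = first[0] if has_label else None
        pop = int(first[int(has_label)]) if has_popdata else None
        genotype = [(clean(a), clean(b))
                    for a, b in zip(first[n_extra:], second[n_extra:])]
        individuals.append({"label": label, "pop": pop, "genotype": genotype})
    return individuals


if __name__ == "__main__":
    # Made-up data: 2 individuals, 3 loci, label and population columns.
    sample = """
    ind1 1 120 -9 98
    ind1 1 122 140 98
    ind2 2 118 142 -9
    ind2 2 118 142 100
    """
    for record in parse_two_row_format(sample, num_loci=3):
        print(record)
```

Pairing consecutive rows, rather than grouping by label, mirrors the statement that the label column is optional; a reader for the single-row format would instead take alleles from pairs of adjacent columns.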