




Population structure

Population structure means the "makeup" or composition of a population. By population structure, population geneticists mean that, instead of a single, simple population, populations are subdivided in some way. The overall "population of populations" is often called a metapopulation, while the individual component populations are called subpopulations, local populations, or demes.

In fact, in many real populations there may not be any obvious individual populations or substructure at all, and the populations are continuous. However, even in effectively continuous populations, different areas can have different gene frequencies, because the whole metapopulation is not panmictic (randomly mating). For instance, among humans, Scotland, the North of England, and London have some quite major language differences, suggesting substructure, but you would be hard put to find an exact boundary where there is a changeover. Such populations are structured, but continuously, in space.

A very good definition of population structure is when populations have deviations from Hardy-Weinberg proportions, or deviations from panmixia. If there is inbreeding, or selection, or if migration is important, then populations can be said to be structured in some way.

Hardy-Weinberg law

Gene frequencies and genotype ratios in a randomly breeding population remain constant from generation to generation.
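Since structure is defined above as a deviation from Hardy-Weinberg proportions, a small numerical check may help. The following is a minimal sketch, assuming a single biallelic locus with genotype counts for AA, Aa and aa; the function name `hardy_weinberg_check` and the example counts are illustrative and do not come from the slides.

```python
def hardy_weinberg_check(n_AA, n_Aa, n_aa):
    """Compare observed genotype counts at one biallelic locus with
    the counts expected under Hardy-Weinberg proportions."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele a
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    # Pearson chi-square statistic (1 degree of freedom for two alleles)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Heterozygote deficit relative to expectation; positive values
    # are what inbreeding or hidden subdivision would produce
    f = 1.0 - n_Aa / expected[1] if expected[1] > 0 else 0.0
    return p, expected, chi2, f


if __name__ == "__main__":
    # Hypothetical counts with a strong heterozygote deficit
    p, expected, chi2, f = hardy_weinberg_check(40, 20, 40)
    print(p, expected, chi2, f)
```

A large chi-square value together with a positive heterozygote deficit is the kind of signal that pooling two subpopulations with different allele frequencies would produce.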
Evolution involves changes in the gene pool. A population in Hardy-Weinberg equilibrium shows no change. If recessive alleles were continually tending to disappear, the population would soon become homozygous. Under Hardy-Weinberg conditions, genes that have no present selective value will nonetheless be retained.

When the Hardy-Weinberg Law Fails to Apply

Mutation

Gene Flow
Members of one population may breed with occasional immigrants from an adjacent population of the same species. This can introduce new genes or alter existing gene frequencies in the residents. In many plants and some animals, gene flow can occur between different (but still related) species (hybridization/introgression). In either case, gene flow increases the variability of the gene pool.

Genetic Drift
Allele frequencies change simply by chance. Not every member of the population will become a parent, and not every set of parents will produce the same number of offspring.

Nonrandom Mating
One of the cornerstones of the Hardy-Weinberg equilibrium is that mating in the population must be random. If individuals (usually females) are choosy in their selection of mates, the gene frequencies may become altered. Darwin called this sexual selection. Nonrandom mating seems to be quite common.

Methods for testing for population structure

A standard approach involves sampling DNA from members of a number of potential source populations and using these samples to estimate allele frequencies in each population at a series of unlinked loci. Using the estimated allele frequencies, it is then possible to compute the likelihood that a given genotype originated from each population. Individuals of unknown origin can then be assigned to populations according to these likelihoods.

Detecting structure matters in practice. For example, when association mapping is used to find disease genes, the presence of undetected population structure can lead to spurious associations and thus invalidate standard tests.

Ancestry Models

There are four main models for the ancestry of individuals.

No-admixture model
Each individual comes purely from one of the K populations. The output reports the posterior probability that individual i is from population k. The prior probability for each population is 1/K. This model is appropriate for studying fully discrete populations and is often more powerful than the admixture model at detecting subtle structure.

Prior and posterior probability: a prior probability describes a variable in the absence of some piece of evidence, whereas a posterior probability is the conditional probability after that evidence has been taken into account. A prior is often a purely subjective estimate made by an experienced expert. For example, for Obama's support rate p in a US presidential election, the uncertainty about p before any opinion poll has been conducted can be expressed as a prior probability.
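The assignment idea described above (estimate allele frequencies per population, then compute the likelihood that a genotype came from each one) can be sketched as follows. This is a minimal illustration, assuming Hardy-Weinberg proportions within each population, independent (unlinked) loci, known allele frequencies and a uniform prior of 1/K; it is not the MCMC procedure that structure itself runs. The function name `assignment_posterior`, the frequency floor `eps` and the example frequencies are all hypothetical.

```python
import math


def assignment_posterior(genotype, freqs, eps=1e-6):
    """Posterior probability that a diploid genotype comes from each of K
    populations, given per-population allele frequencies.

    genotype: list of (allele1, allele2) tuples, one per locus
    freqs:    freqs[k][locus] is a dict mapping allele -> frequency
    eps:      floor for unseen alleles, to avoid log(0) (an assumption)
    """
    log_liks = []
    for pop in freqs:
        ll = 0.0
        for locus, (a, b) in enumerate(genotype):
            pa = max(pop[locus].get(a, 0.0), eps)
            pb = max(pop[locus].get(b, 0.0), eps)
            # Hardy-Weinberg genotype probability: p^2 or 2pq
            ll += math.log(pa * pb if a == b else 2.0 * pa * pb)
        log_liks.append(ll)
    # The uniform prior 1/K cancels; subtract the maximum for stability
    m = max(log_liks)
    weights = [math.exp(ll - m) for ll in log_liks]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Two hypothetical populations, two loci, alleles coded 1 and 2
    freqs = [
        [{1: 0.9, 2: 0.1}, {1: 0.8, 2: 0.2}],
        [{1: 0.2, 2: 0.8}, {1: 0.3, 2: 0.7}],
    ]
    print(assignment_posterior([(1, 1), (1, 2)], freqs))
```

Working in log space matters once many loci are used, because the raw likelihoods quickly underflow.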
Admixture model
Each individual draws some fraction of his/her genome from each of the K populations; individuals may have mixed ancestry. This is modeled by saying that individual i has inherited some fraction of his/her genome from ancestors in population k. The output records the posterior mean estimates of these proportions. Conditional on the ancestry vector, $q^{(i)}$, the origin of each allele is independent. We recommend this model as a starting point for most analyses. It is a reasonably flexible model
for dealing with many of the complexities of real populations. Admixture is a common feature of real data, and you probably won't find it if you use the no-admixture model.

Linkage model
This is essentially a generalization of the admixture model to deal with "admixture linkage disequilibrium", i.e., the correlations that arise between linked markers in recently admixed populations. The basic model is that, t generations in the past, there was an admixture event that mixed the K populations. If you consider an individual chromosome, it is composed of a series of "chunks" that are inherited as discrete units from ancestors at the time of the admixture. Admixture LD arises because linked alleles are often on the same chunk, and therefore come from the same ancestral population. The sizes of the chunks are assumed to be independent exponential random variables with mean length 1/t (in Morgans). In practice we estimate a "recombination rate" r from the data that corresponds to the rate of switching from the present chunk to a new chunk. Each chunk in individual i is derived independently from population k with probability $q^{(i)}_k$, where $q^{(i)}_k$ is the proportion of that individual's ancestry from population k.

Using prior population information
By default, structure uses only genetic information to learn about population structure. However, there is often other information that might be relevant to the clustering (e.g., physical characteristics of sampled individuals or geographic sampling location). At present, structure can use this information in two ways. First, it helps when the user finds that the pre-defined groups (e.g., sampling locations) correspond almost exactly to structure clusters. Second, prior information may be introduced through the use of learning samples: i.e., some individuals are of known origin, and are used to classify individuals of unknown origin. For example, Beaumont et al. (2001) wanted to learn about the ancestry of Scottish wildcats (many of which are hybridized with feral domestic cats). They had genetic data from a number of pet house cats, which were defined as being in one population, and they inferred Q for the wildcats (with K = 2). Use of this sort of prior information will normally improve the accuracy of the inference.

Allele frequency models
There are two basic models. One model assumes that the allele frequencies in each population are independent draws from a distribution that is specified by a parameter called λ; λ = 1 is the default setting. The other model uses correlated allele frequencies: it says that frequencies in the different populations are likely to be similar (probably due to migration or shared ancestry).

The independent model works well for many data sets. Roughly speaking, this prior says that we expect allele frequencies in different populations to be reasonably different from each other. The correlated frequencies model says that they may actually be quite similar. This often improves clustering for closely related populations, but may increase the risk of over-estimating K. If one population is quite divergent from the others, the correlated model can sometimes achieve better inference if that population is removed.

Estimating λ: Fixing λ = 1 is a good idea for most data, but in some situations (e.g., SNP data where most minor alleles are rare) smaller values may work better. For this reason, you can get the program to estimate λ for your data. You may want to do this once, perhaps for K = 1, and then fix λ at the estimated value thereafter, because there seem to be some problems with non-identifiability when trying to estimate too many of the hyperparameters (λ, α, F) at the same time.

Estimation of K (the number of populations)
This should be done with care, for two reasons: (1) it is computationally difficult to obtain accurate estimates of Pr(X|K), and our method merely provides an ad hoc approximation, and (2) the biological interpretation of K may not be straightforward.

The procedure for estimating K generally works well in data sets with a small number of discrete populations. However, many real-world data sets do not conform precisely to the structure model (e.g., due to isolation by distance or inbreeding). In those cases there may not be a natural answer to what is the "correct" value of K. Perhaps for this kind of reason, it is not infrequent that in real data the value of our model choice criterion continues to increase with increasing K. Then it usually makes sense to focus on values of K that capture most of the structure in the data and that seem biologically sensible.

Steps in estimating K
1) (Command-line version) Set COMPUTEPROBS and INFERALPHA to 1 in the file extraparams. (Front End version) Make sure that α is allowed to vary.
2) Run the MCMC scheme for different values of MAXPOPS (K). At the end it will output a line "Estimated Ln Prob of Data". This is the estimate of ln Pr(X|K). You should run several independent runs for each K, in order to verify that the estimates are consistent across runs. If the variability across runs for a given K is substantial compared to the variability of estimates obtained for different K, you may need to use longer runs or a longer burn-in period. If ln Pr(X|K) appears to be bimodal or multimodal, the MCMC scheme may be finding different solutions. You can check for this by comparing the Q for different runs at a single K.
3) Compute posterior probabilities of K. For example, for a data set where K was 2 (Data Set 2A), we got

K    ln Pr(X|K)
1    -4356
2    -3983
3    -3982
4    -3983
5    -4006

We can start by assuming a uniform prior on K = 1, ..., 5. Then from Bayes' Rule, Pr(K = 2) is given by

$$\Pr(K=2 \mid X) = \frac{e^{-3983}}{e^{-4356} + e^{-3983} + e^{-3982} + e^{-3983} + e^{-4006}} = \frac{e^{-1}}{e^{-374} + e^{-1} + e^{0} + e^{-1} + e^{-24}} \approx 0.21.$$

Mild departures from the model can lead to overestimating K
When there is real population structure, this leads to LD among unlinked loci and departures from Hardy-Weinberg proportions. But some departures from the model can also lead to Hardy-Weinberg or linkage disequilibrium. Beginning in Version 2, we have suggested that the correlated allele frequency model should be used as a default because it often achieves better performance on difficult problems, but the user should be aware that this may make it easier to overestimate K in such settings than under the independent frequencies model (Falush et al., 2003a).

How to decide whether inferred structure is real
Informal pointers for choosing K; is the structure real?
There are a couple of informal pointers which might be helpful in selecting K. The first is that it's often the situation that Pr(K) is very small for K less than the appropriate value (effectively zero), and then more-or-less plateaus for larger K, as in the example of Data Set 2A shown above. In this sort of situation where several values of K give similar estimates of log Pr(X|K), it seems that the smallest of these is often "correct".

It is a bit difficult to provide a firm rule for what we mean by "more-or-less plateaus". For small data sets, this might mean that the values of log Pr(X|K) are within 5-10 of each other. In very big data sets, the difference between K = 3 and K = 4 may be 50, but if the difference between K = 3 and K = 2 is 5,000, then I would definitely choose K = 3. We may not always be able to know the TRUE value of K, but we should aim for the smallest value of K that captures the major structure in the data.

The second pointer is that if there really are separate populations, there is typically a lot of information about the value of α, and once the Markov chain converges, α will normally settle down to be relatively constant (often with a range of perhaps 0.2 or less). However, if there isn't any real structure, α will usually vary greatly during the course of the run.

Suppose that you have a situation with two clear populations, but you are trying to decide whether one of these is further subdivided (i.e., the value of Pr(X|K = 3) is similar to, or perhaps a little larger than, Pr(X|K = 2)). Then one thing you could try is to run structure using only the individuals in the population that you suspect might be subdivided, and see whether there is a strong signal as described above.

In summary, you should be skeptical about population structure inferred on the basis of small differences in Pr(K) if there is no clear biological interpretation for the assignments, and the assignments are roughly symmetric to all populations and no individuals are strongly assigned.

Isolation by distance data
Isolation by distance refers to the idea that individuals may be spatially distributed across some region, with local dispersal. In this situation, allele frequencies vary gradually across the region. The underlying structure model is not well suited to data from this kind of scenario. When this occurs, the inferred value of K, and the corresponding allele frequencies in each group, can be rather arbitrary. Depending on the sampling scheme, most individuals may have mixed membership in multiple groups. That is, the algorithm will attempt to model the allele frequencies across the region using weighted averages of K distinct components. In such situations, interpreting the results may be challenging.

When $F_{ST}$ is significant, but structure finds no structure
We occasionally get the following sort of question: "I have genotype data for individuals sampled from n locations. Tests of allele frequency differences indicate small but significant $F_{ST}$
between at least some locations. However, structure does not find any differences. How do I interpret these results?"

When the predefined populations correspond closely to genetic populations, testing for frequency differences between predefined groups can be more powerful than applying structure. This is because the basic structure models aim to solve a much harder statistical problem, i.e., identifying population clusters without being told the likely subgroups in advance. For this reason there is a part of parameter space where there is not quite enough data for structure to get the "right" answer, even though a test of $F_{ST}$ using the predefined labels detects population differentiation.

Association mapping (association analysis)
Association mapping refers to significant association of a molecular marker with a phenotypic trait. LD (special case: LA) refers to non-random association between two markers, or two genes/QTLs, or between a gene/QTL and a marker locus. Thus, association mapping is actually one of the several uses of LD. In a statistical sense, association refers to covariance of a marker polymorphism and a trait of interest, while LD represents covariance of polymorphisms exhibited by two molecular markers/genes.

How Is LD Measured?
A variety of statistics have been used to measure LD; the two most common are $r^2$ and $|D'|$. Consider a pair of loci with alleles A and a at locus one, and B and b at locus two, with allele frequencies $\pi_A$, $\pi_a$, $\pi_B$, and $\pi_b$, respectively. The resulting haplotype frequencies are $\pi_{AB}$, $\pi_{Ab}$, $\pi_{aB}$, and $\pi_{ab}$. The basic component of all LD statistics is the difference between the observed and expected haplotype frequencies,

$$D = \pi_{AB} - \pi_A \pi_B.$$

The two statistics are then

$$r^2 = \frac{D^2}{\pi_A \pi_a \pi_B \pi_b}, \qquad |D'| = \frac{|D|}{D_{\max}},$$

where $D_{\max} = \min(\pi_A \pi_b, \pi_a \pi_B)$ if $D > 0$ and $D_{\max} = \min(\pi_A \pi_B, \pi_a \pi_b)$ if $D < 0$.

Haplotype: a combination of alleles at multiple linked loci that are transmitted together.

D' is scaled based on the observed allele frequencies, so it will range between 0 and 1 even if allele frequencies differ between the loci.

Figure: LD under different scenarios. (A) No recombination (mutations at two linked loci not separated in time); (B) independent assortment (mutations at two loci not separated in time); (C) no recombination (only mutations separated in time); (D) low recombination (mutations at two loci not separated in time).

LD decay

Factors affecting LD
The factors that lead to an increase in LD include inbreeding, small population size, genetic isolation between lineages, population subdivision, low recombination rate, population admixture, natural and artificial selection, balancing selection, etc.

Admixture results in the introduction of chromosomes of different ancestry and allele frequencies. Often, the resulting LD extends to unlinked sites, even on different chromosomes, but breaks down rapidly with random mating.

A reduction in population size (bottleneck) is accompanied by extreme genetic drift. During a bottleneck, only a few allelic combinations are passed on to future generations.

Some other factors that lead to a decrease/disruption in LD include outcrossing, high recombination rate, high mutation rate, etc. Generally, LD decays more rapidly in outcrossing species than in selfing species. This is because recombination is less effective in selfing species, where individuals are more likely to be homozygous, than in outcrossing species.

There are other factors that may lead to either an increase or a decrease in LD, or may increase LD between some pairs of alleles and decrease LD between other pairs. For instance, mutations will disrupt LD between pairs involving wild alleles, and will promote LD between pairs involving mutant alleles. Similarly, genomic rearrangements may disrupt LD between genes separated by the rearrangement, but LD may increase between new gene combinations in the vicinity of breakpoints due to suppression of local recombination.

Gene conversion
Gene conversion is a nonreciprocal transfer of genetic information in DNA genetic recombination, which occurs during meiotic division. It is a process by which DNA sequence information is transferred from one DNA helix (which remains unchanged) to another DNA helix, whose sequence is altered. It is one of the ways a gene may be mutated. Gene conversion may lead to non-Mendelian inheritance and has often been recorded in fungal crosses.

Other factors affecting LD include population structure, epistasis, gene conversion and ascertainment bias. Ascertainment bias (AB) is the bias introduced by the criteria used to select individuals and/or loci in which genetic variation is assayed, so that it leads to
inaccurate estimates of LD. Ascertainment is the way individuals with a trait are selected or found for genetic studies, and bias is a difference between the estimated and true value of LD in a statistical sample.

Mutation provides the raw material for producing polymorphisms that will be in LD. Recombination is the main phenomenon that weakens intrachromosomal LD, whereas interchromosomal LD is broken down by independent assortment. Population size also plays an important role. In small populations, the effects of genetic drift result in the consistent loss of rare allelic combinations, which increases LD levels. When genetic drift and recombination are at equilibrium,

$$E(r^2) = \frac{1}{1 + 4Nc},$$

where N is the effective population size and c is the recombination fraction between sites.

In any organism, LD can be used for identifying genomic regions which have been the targets of natural selection (directional and balancing selection) during the evolutionary process.

Natural selection
The process in nature by which, according to Darwin's theory of evolution, only the organisms best adapted to their environment tend to survive and transmit their genetic characteristics in increasing numbers to succeeding generations, while those less adapted tend to be eliminated.

Positive natural selection is the force that drives the increase in prevalence of advantageous traits, and it has played a central role in our development as a species. It is the form of selection that fixes alleles which raise individual fitness because they carry advantageous mutations. Positive selection (also called Darwinian selection or adaptive selection) is the process by which new advantageous genetic variants sweep a population.

Genetic hitchhiking (the hitchhiking effect) is the process by which an evolutionarily neutral or in some cases deleterious allele or mutation may spread through the gene pool by virtue of being linked to a gene that is positively selected.

Directional selection
occurs when natural selection favors a single phenotype, so that the allele frequency continuously shifts in one direction. Under directional selection, the advantageous allele will increase in frequency independently of its dominance relative to other alleles (i.e., even if the advantageous allele is recessive, it will eventually become fixed). Directional selection stands in contrast to balancing selection, where selection may favor multiple alleles. It is closely related to purifying selection, which removes deleterious mutations from a population.

Purifying selection
Purifying selection refers to selection against nonsynonymous substitutions at the DNA level. In this case, the evolutionary distance based on synonymous substitutions is expected to be greater than the distance based on nonsynonymous substitutions.

Balancing selection refers to a number of selective processes by which multiple alleles (different versions of a gene) are actively maintained in the gene pool of a population at frequencies above that of gene mutation.

Note: a non-synonymous mutation is under positive selection pressure when it first appears.

Structure 2.0 (population structure)
/structure.html
Department of Human Genetics, University of Chicago

The program structure implements a model-based clustering method for inferring population structure using genotype data consisting of unlinked markers. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. It can be applied to most of the commonly used genetic markers, including SNPs, microsatellites, RFLPs and AFLPs.

A Java Runtime Environment (JRE version > 1.5.0) by Sun Microsystems is required before structure installation. The compatible JRE for various operating systems can be downloaded free from /download.

Front End (analysis start screen)
The front end organizes data analysis into "projects". Each project is connected to a single data file. When creating a project, the user also provides information that specifies how to read the data file (number of loci, number of individuals, etc.). These are characteristics of the data file, and are always the same within this project.

Parameters in file extraparams
These allow the user to refine the model in various ways.

Boolean options (Boolean, i.e. yes/no type; tests whether an object is of a specified subclass): type 1 for "Yes", or "Use this option"; 0 for "No" or "Don't use this option".

Program options
NOADMIX (Boolean) Assume the model without admixture (each individual is assumed to be completely from one of the K populations). In the output, instead of printing the average value of Q as in the admixture case, the program prints the posterior probability that each individual is from each population. 1 = no admixture; 0 = model with admixture.

LINKAGE (Boolean) Use the linkage model. RLOG10START sets the initial value of the recombination rate r per unit distance. RLOG10MIN and RLOG10MAX set the minimum and maximum allowed values for log10 r. RLOG10PROPSD sets the size of the proposed changes to log10 r in each update.

Each project also contains one or more "parameter sets". These allow the user to specify the details of the MCMC runs, including the number of repetitions, burn-in length, etc., as well as specifying the model of analysis (e.g., whether to allow admixture, models of allele frequencies, etc.). The user can then run the Markov chain at chosen values of K, for a given parameter set. The front end stores various summaries of the results, including a number of graphical plots, described below.

Building a project
First you need to construct an input file. Now, click on File -> New Project. This opens up a wizard to import the data (Figure 2). The data are copied from the specified input file into the work directory chosen for the project. The wizard consists of four frames:

1. Specify the project directory, project name, and input data file.
2. Specify the basic characteristics of the data file: number of individuals, ploidy of the data (enter '2' for diploid organisms), number of loci, and the value that is used to indicate missing data. Click on "Show data file format" to get a summary of the lengths and number of lines in the data file.

Format for the data file
Essentially, the entire data set is arranged as a matrix in a single file, in which the data for individuals are in rows, and the loci are in columns. For a diploid organism, data for each individual can be stored either as 2 consecutive rows, where each locus is in one column, or in one row, where each locus is in two consecutive columns. Unless you plan to use the linkage model (see below), the order of the alleles for a single individual does not matter.

Genotype Data (Required; integer) Each allele at a given locus should be coded by a unique integer (e.g., microsatellite repeat score). The front end requires returns at the ends of each row, and does not allow returns within rows; the command-line version of structure treats returns in the same way as spaces or tabs.

Annotated parts of the data file:
- Marker names
- Recessive alleles
- Inter-marker distances (markers in map order within linkage groups; -1 = unlinked)
- Phase information
- Individual ID
- Population-data flag (whether the population data may be used)
- Population from which the sample was taken

Sample data file
POPDATA=1, NUMINDS=7, NUMLOCI=5, and MISSING=-9. Also, POPFLAG=0, PHENOTYPE=0, EXTRACOLS=0. The second column shows the geographic sampling location of individuals. The label need not be written out.

3. (Rows) Specify which, if any, of the optional extra row data are present: row of marker names; row of inter-marker distances; and a row of phase data after each individual. Also tick the "single line" box if data for each individual are stored in a single row, instead of in the standard format of two rows per individual.
4. (Columns) Specify which of the optional column data are there: Individual ID (LABEL); Population of origin (POPDATA); USEPOPINFO flag (a flag that says to use the POPDATA information for certain individuals when using the prior population information model); phenotype data (for use in association mapping (Pritchard et al., 2000b)); other extra columns of data prior to the genotype data that should be ignored by structure.

When you've finished these steps, you'll get a summary of the data format; if this looks correct, click on 'proceed'. The program will now attempt to load the data file and create the new project.

Configuring a parameter set
Once you've successfully loaded a data file, you are ready to start running structure. You will create o…
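To make the data file layout described above concrete, here is a minimal sketch of a reader for the standard two-rows-per-individual format, assuming an optional label column followed by an optional POPDATA column and then one integer allele per locus. The function name `parse_structure_rows`, its arguments and the sample rows are hypothetical and are not part of structure; treating a locus as missing when either allele equals the MISSING code is a simplification made here.

```python
def parse_structure_rows(text, num_loci, label=True, popdata=True, missing=-9):
    """Parse diploid genotype data stored as two consecutive rows per
    individual, with whitespace-separated integer allele codes."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if len(rows) % 2 != 0:
        raise ValueError("expected two rows per diploid individual")
    n_extra = int(label) + int(popdata)   # columns before the genotypes
    individuals = []
    for first, second in zip(rows[0::2], rows[1::2]):
        if len(first) != n_extra + num_loci or len(second) != n_extra + num_loci:
            raise ValueError("row length does not match the number of loci")
        name = first[0] if label else None
        pop = int(first[int(label)]) if popdata else None
        alleles_1 = [int(x) for x in first[n_extra:]]
        alleles_2 = [int(x) for x in second[n_extra:]]
        # A locus is reported as None if either allele is the missing code
        genotypes = [
            None if missing in (a, b) else (a, b)
            for a, b in zip(alleles_1, alleles_2)
        ]
        individuals.append({"label": name, "pop": pop, "genotypes": genotypes})
    return individuals


if __name__ == "__main__":
    # Hypothetical data: 2 individuals, 3 loci, label and POPDATA columns
    sample = (
        "ind1 1 145 66 92\n"
        "ind1 1 -9 64 94\n"
        "ind2 2 142 -9 92\n"
        "ind2 2 148 -9 94\n"
    )
    for individual in parse_structure_rows(sample, num_loci=3):
        print(individual)
```

Checking row lengths up front catches the most common formatting slip, a return inside a row, which the front end does not allow.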