Population structure

Population structure means the “makeup” or composition of a population. By population structure, population geneticists mean that, instead of a single, simple population, populations are subdivided in some way. The overall “population of populations” is often called a metapopulation, while the individual component populations are called subpopulations, local populations, or demes. In fact, in many real populations there may not be any obvious individual populations or substructure at all, and the populations are continuous. However, even in effectively continuous populations, different areas can have different gene frequencies, because the whole metapopulation is not panmictic (randomly mating). For instance, among humans, Scotland, the North of England, and London have some quite major language differences, suggesting substructure, but you would be hard put to find an exact boundary where there is a changeover. Such populations are structured, but continuously, in space.

A useful working definition: a population is structured when it shows deviations from Hardy-Weinberg proportions, or deviations from panmixia. If there is inbreeding, or selection, or if migration is important, then populations can be said to be structured in some way.

Hardy-Weinberg law
Gene frequencies and genotype ratios in a randomly-breeding population remain constant from generation to generation.
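As a rough illustration of what “deviation from Hardy-Weinberg proportions” means for a single biallelic locus, the sketch below compares observed genotype counts with the counts expected under random mating. The function names and the example counts are hypothetical, and the chi-square statistic is only one of several ways to quantify the deviation.

```python
def hardy_weinberg_expected(n_AA, n_Aa, n_aa):
    """Return (p, expected_counts) for a biallelic locus.

    p is the frequency of allele A estimated from the genotype counts;
    expected_counts are the AA, Aa and aa counts under random mating.
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no individuals sampled")
    p = (2 * n_AA + n_Aa) / (2 * n)   # each individual carries two alleles
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    return p, expected


def hw_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing observed and Hardy-Weinberg counts."""
    _, expected = hardy_weinberg_expected(n_AA, n_Aa, n_aa)
    observed = (n_AA, n_Aa, n_aa)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    # Hypothetical counts: a sample with no heterozygotes at all, as might
    # arise when two differentiated subpopulations are pooled.
    print(hardy_weinberg_expected(50, 0, 50))
    print(hw_chi_square(50, 0, 50))
```

A large statistic indicates a departure from Hardy-Weinberg proportions; it does not by itself say whether the cause is subdivision, inbreeding, selection, or genotyping error.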
Evolution involves changes in the gene pool. A population in Hardy-Weinberg equilibrium shows no change. If recessive alleles were continually tending to disappear, the population would soon become homozygous. Under Hardy-Weinberg conditions, genes that have no present selective value will nonetheless be retained.

When the Hardy-Weinberg Law Fails to Apply

Mutation

Gene Flow
Members of one population may breed with occasional immigrants from an adjacent population of the same species. This can introduce new genes or alter existing gene frequencies in the residents. In many plants and some animals, gene flow can occur between different (but still related) species (hybridization/introgression). In either case, gene flow increases the variability of the gene pool.

Genetic Drift
Allele frequencies change simply by chance: not every member of the population will become a parent, and not every set of parents will produce the same number of offspring.

Nonrandom Mating
One of the cornerstones of the Hardy-Weinberg equilibrium is that mating in the population must be random. If individuals (usually females) are choosy in their selection of mates, the gene frequencies may become altered. Darwin called this sexual selection. Nonrandom mating seems to be quite common.

Method: testing for population structure
A standard approach involves sampling DNA from members of a number of potential source populations and using these samples to estimate allele frequencies in each population at a series of unlinked loci. Using the estimated allele frequencies, it is then possible to compute the likelihood that a given genotype originated from each population. Individuals of unknown origin can then be assigned to populations, on the basis of their genetic information, according to these likelihoods.

Detecting structure matters in practice. For example, when association mapping is used to find disease genes, the presence of undetected population structure can lead to spurious associations and thus invalidate standard tests.

Ancestry Models
Four main models for the ancestry of individuals:

No admixture model. Each individual comes purely from one of the K populations. The output reports the posterior probability that individual i is from population k. The prior probability for each population is 1/K. This model is appropriate for studying fully discrete populations and is often more powerful than the admixture model at detecting subtle structure.

A prior probability describes a variable in the absence of some piece of evidence, whereas a posterior probability is the conditional probability after that evidence has been taken into account. A prior is often a purely subjective estimate made by an experienced expert. For example, before any opinion poll is conducted, the uncertainty about Obama's level of support p in a US presidential election can be expressed as a prior probability.
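The likelihood-based assignment described above can be sketched in a few lines. This is a minimal illustration, not the MCMC algorithm that structure implements: it assumes that allele frequencies in each population are already known (in practice they are estimated), that genotypes follow Hardy-Weinberg proportions within populations, that loci are unlinked, and that the prior for each population is 1/K. All function names, allele labels and frequencies below are hypothetical.

```python
import math


def genotype_log_likelihood(genotype, locus_freqs):
    """Log-likelihood of a diploid multilocus genotype in one population.

    genotype    -- list of (allele1, allele2) tuples, one per locus
    locus_freqs -- list of dicts mapping allele -> frequency, one per locus
    """
    log_lik = 0.0
    for (a1, a2), freqs in zip(genotype, locus_freqs):
        p1 = freqs.get(a1, 0.0)
        p2 = freqs.get(a2, 0.0)
        # Hardy-Weinberg: p^2 for homozygotes, 2pq for heterozygotes.
        prob = p1 * p2 if a1 == a2 else 2.0 * p1 * p2
        if prob == 0.0:
            return float("-inf")
        log_lik += math.log(prob)   # unlinked loci: probabilities multiply
    return log_lik


def posterior_assignment(genotype, pop_freqs):
    """Posterior probability of each population under a uniform 1/K prior."""
    log_liks = [genotype_log_likelihood(genotype, f) for f in pop_freqs]
    best = max(log_liks)
    if best == float("-inf"):
        raise ValueError("genotype has zero likelihood in every population")
    # The uniform prior 1/K cancels; subtracting the maximum avoids underflow.
    weights = [math.exp(ll - best) for ll in log_liks]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    pop1 = [{"A": 0.9, "a": 0.1}, {"B": 0.8, "b": 0.2}]
    pop2 = [{"A": 0.2, "a": 0.8}, {"B": 0.3, "b": 0.7}]
    individual = [("A", "A"), ("B", "b")]
    print(posterior_assignment(individual, [pop1, pop2]))
```

Working with log-likelihoods matters once many loci are involved, because the product of per-locus probabilities quickly becomes too small to represent directly.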
Admixture model. Each individual draws some fraction of his/her genome from each of the K populations; individuals may have mixed ancestry. This is modeled by saying that individual i has inherited some fraction of his/her genome from ancestors in population k. The output records the posterior mean estimates of these proportions. Conditional on the ancestry vector, $q^{(i)}$, the origin of each allele is independent. We recommend this model as a starting point for most analyses. It is a reasonably flexible model for dealing with many of the complexities of real populations. Admixture is a common feature of real data, and you probably won't find it if you use the no-admixture model.

Linkage model. This is essentially a generalization of the admixture model to deal with “admixture linkage disequilibrium”, i.e., the correlations that arise between linked markers in recently admixed populations. The basic model is that, t generations in the past, there was an admixture event that mixed the K populations. If you consider an individual chromosome, it is composed of a series of “chunks” that are inherited as discrete units from ancestors at the time of the admixture. Admixture LD arises because linked alleles are often on the same chunk, and therefore come from the same ancestral population. The sizes of the chunks are assumed to be independent exponential random variables with mean length 1/t (in Morgans). In practice we estimate a “recombination rate” r from the data that corresponds to the rate of switching from the present chunk to a new chunk. Each chunk in individual i is derived independently from population k with probability $q^{(i)}_k$, where $q^{(i)}_k$ is the proportion of that individual's ancestry from population k.

Using prior population information. In its default mode, structure uses only genetic information to learn about population structure. However, there is often other information that might be relevant to the clustering (e.g., physical characteristics of sampled individuals or geographic sampling location). At present, structure can use this information in two ways. First, the user might find that the pre-defined groups (e.g., sampling locations) correspond almost exactly to structure clusters. Second, prior information may be introduced through the use of learning samples: i.e., some individuals are of known origin, and are used to classify individuals of unknown origin. For example, Beaumont et al. (2001) wanted to learn about the ancestry of Scottish wildcats (many of which are hybridized with feral domestic cats). They had genetic data from a set of pet house cats, which were defined as being in one population, and they inferred Q for the wildcats (with K=2). Use of this sort of prior information will normally improve the accuracy of the inference.

Allele frequency models
Two basic models:

Independent allele frequencies. This model assumes that the allele frequencies in each population are independent draws from a distribution that is specified by a parameter called λ. λ = 1 is the default setting.

Correlated allele frequencies. This model says that frequencies in the different populations are likely to be similar (probably due to migration or shared ancestry).

The independent model works well for many data sets. Roughly speaking, this prior says that we expect allele frequencies in different populations to be reasonably different from each other. The correlated frequencies model says that they may actually be quite similar. This often improves clustering for closely related populations, but may increase the risk of over-estimating K. If one population is quite divergent from the others, the correlated model can sometimes achieve better inference if that population is removed.

Estimating λ: Fixing λ = 1 is a good idea for most data, but in some situations (e.g., SNP data where most minor alleles are rare) smaller values may work better. For this reason, you can get the program to estimate λ for your data. You may want to do this once, perhaps for K=1, and then fix λ at the estimated value thereafter, because there seem to be some problems with non-identifiability when trying to estimate too many of the hyperparameters (λ, α, F) at the same time.

Estimation of K (the number of populations)
Estimates of K should be treated with care for two reasons: (1) it is computationally difficult to obtain accurate estimates of Pr(X|K), and our method merely provides an ad hoc approximation, and (2) the biological interpretation of K may not be straightforward.

The procedure for estimating K generally works well in data sets with a small number of discrete populations. However, many real-world data sets do not conform precisely to the structure model (e.g., due to isolation by distance or inbreeding). In those cases there may not be a natural answer to what is the “correct” value of K. Perhaps for this kind of reason, it is not infrequent that in real data the value of our model choice criterion continues to increase with increasing K. Then it usually makes sense to focus on values of K that capture most of the structure in the data and that seem biologically sensible.

Steps in estimating K
1) (Command-line version) Set COMPUTEPROBS and INFERALPHA to 1 in the file extraparams. (Front End version) Make sure that α is allowed to vary.
2) Run the MCMC scheme for different values of MAXPOPS (K). At the end it will output a line "Estimated Ln Prob of Data". This is the estimate of ln Pr(X|K). You should run several independent runs for each K, in order to verify that the estimates are consistent across runs. If the variability across runs for a given K is substantial compared to the variability of estimates obtained for different K, you may need to use longer runs or a longer burn-in period. If ln Pr(X|K) appears to be bimodal or multimodal, the MCMC scheme may be finding different solutions. You can check for this by comparing the Q for different runs at a single K.
3) Compute posterior probabilities of K. For example, for Data Set 2A (where K was 2), we got

K    ln Pr(X|K)
1    -4356
2    -3983
3    -3982
4    -3983
5    -4006

We can start by assuming a uniform prior on K = 1, ..., 5. Then from Bayes' Rule, Pr(K=2) is given by

$$\Pr(K=2) = \frac{e^{-3983}}{e^{-4356} + e^{-3983} + e^{-3982} + e^{-3983} + e^{-4006}} \approx 0.21$$

Mild departures from the model can lead to overestimating K
When there is real population structure, this leads to LD among unlinked loci and departures from Hardy-Weinberg proportions. But some departures from the model can also lead to Hardy-Weinberg or linkage disequilibrium. Beginning in Version 2, we have suggested that the correlated allele frequency model should be used as a default because it often achieves better performance on difficult problems, but the user should be aware that this may make it easier to overestimate K in such settings than under the independent frequencies model (Falush et al., 2003a).

How to decide whether inferred structure is real
Informal pointers for choosing K; is the structure real?
There are a couple of informal pointers which might be helpful in selecting K. The first is that it's often the situation that Pr(K) is very small for K less than the appropriate value (effectively zero), and then more-or-less plateaus for larger K, as in the example of Data Set 2A shown above. In this sort of situation where several values of K give similar estimates of log Pr(X|K), it seems that the smallest of these is often “correct”. It is a bit difficult to provide a firm rule for what we mean by “more-or-less plateaus”. For small data sets, this might mean that the values of log Pr(X|K) are within 5-10. In very big data sets the difference between K=3 and K=4 may be 50, but if the difference between K=3 and K=2 is 5,000, then I would definitely choose K=3. We may not always be able to know the TRUE value of K, but we should aim for the smallest value of K that captures the major structure in the data.

The second pointer is that if there really are separate populations, there is typically a lot of information about the value of α, and once the Markov chain converges, α will normally settle down to be relatively constant (often with a range of perhaps 0.2 or less). However, if there isn't any real structure, α will usually vary greatly during the course of the run.

Suppose that you have a situation with two clear populations, but you are trying to decide whether one of these is further subdivided (i.e., the value of Pr(X|K=3) is similar to, or perhaps a little larger than, Pr(X|K=2)). Then one thing you could try is to run structure using only the individuals in the population that you suspect might be subdivided, and see whether there is a strong signal as described above.

In summary, you should be skeptical about population structure inferred on the basis of small differences in Pr(K) if there is no clear biological interpretation for the assignments, and the assignments are roughly symmetric to all populations and no individuals are strongly assigned.

Isolation by distance data
Isolation by distance refers to the idea that individuals may be spatially distributed across some region, with local dispersal. In this situation, allele frequencies vary gradually across the region. The underlying structure model is not well suited to data from this kind of scenario. When this occurs, the inferred value of K, and the corresponding allele frequencies in each group, can be rather arbitrary. Depending on the sampling scheme, most individuals may have mixed membership in multiple groups. That is, the algorithm will attempt to model the allele frequencies across the region using weighted averages of K distinct components. In such situations, interpreting the results may be challenging.

When FST is significant, but structure finds no structure
We occasionally get the following sort of question: “I have genotype data for individuals sampled from n locations. Tests of allele frequency differences indicate small but significant FST
between at least some locations. However structure does not find any differences. How do I interpret these results?” When the predefined populations correspond closely to genetic populations, testing for frequency differences between predefined groups can be more powerful than applying structure. This is because the basic structure models aim to solve a much harder statistical problem, i.e., identifying population clusters without being told the likely subgroups in advance. For this reason there is a part of parameter space where there is not quite enough data for structure to get the “right” answer, even though a test of FST using the predefined labels detects population differentiation.

Association mapping
Association mapping refers to significant association of a molecular marker with a phenotypic trait. LD (special case: LA) refers to non-random association between two markers or two genes/QTLs or between a gene/QTL and a marker locus. Thus, association mapping is actually one of the several uses of LD. In a statistical sense, association refers to covariance of a marker polymorphism and a trait of interest, while LD represents covariance of polymorphisms exhibited by two molecular markers/genes.

How Is LD Measured?
A variety of statistics have been used to measure LD. To define the two most common statistics, $r^2$ and $D'$, consider a pair of loci with alleles A and a at locus one, and B and b at locus two, with allele frequencies $\pi_A$, $\pi_a$, $\pi_B$, and $\pi_b$, respectively. The resulting haplotype frequencies are $\pi_{AB}$, $\pi_{Ab}$, $\pi_{aB}$, and $\pi_{ab}$. The basic component of all LD statistics is the difference between the observed and expected haplotype frequencies,

$$D = \pi_{AB} - \pi_A \pi_B$$

Haplotype: a combination of alleles at multiple linked loci that are transmitted together.

$r^2$ is the squared correlation between the alleles at the two loci:

$$r^2 = \frac{D^2}{\pi_A \pi_a \pi_B \pi_b}$$

$D'$ is scaled based on the observed allele frequencies, so it will range between 0 and 1 even if allele frequencies differ between the loci:

$$|D'| = \frac{|D|}{D_{\max}}, \qquad D_{\max} = \begin{cases} \min(\pi_A \pi_b,\ \pi_a \pi_B) & D > 0 \\ \min(\pi_A \pi_B,\ \pi_a \pi_b) & D < 0 \end{cases}$$

[Figure: LD under four scenarios. (A) No recombination (mutations at two linked loci not separated in time); (B) Independent assortment (mutations at two loci not separated in time); (C) No recombination (only mutations separated in time); (D) Low recombination (mutations at two loci not separated in time).]

LD decay
Factors affecting LD
The factors which lead to an increase in LD include inbreeding, small population size, genetic isolation between lineages, population subdivision, low recombination rate, population admixture, natural and artificial selection, balancing selection, etc. Admixture results in the introduction of chromosomes of different ancestry and allele frequencies. Often, the resulting LD extends to unlinked sites, even on different chromosomes, but breaks down rapidly with random mating. A reduction in population size (a bottleneck), with the accompanying extreme genetic drift, also increases LD: during a bottleneck, only a few allelic combinations are passed on to future generations.

Some other factors, which lead to a decrease/disruption in LD, include outcrossing, high recombination rate, high mutation rate, etc. Generally, LD decays more rapidly in outcrossing species as compared to selfing species. This is because recombination is less effective in selfing species, where individuals are more likely to be homozygous, than in outcrossing species.

There are other factors which may lead to either an increase or a decrease in LD, or may increase LD between some pairs of alleles and decrease LD between other pairs. For instance, mutations will disrupt LD between pairs involving wild alleles, and will promote LD between pairs involving mutant alleles. Similarly, genomic rearrangements may disrupt LD between genes separated due to rearrangement, but LD may increase between new gene combinations in the vicinity of breakpoints due to suppression of local recombination.

Gene conversion is a nonreciprocal transfer of genetic information in DNA genetic recombination, which occurs during meiotic division. It is a process by which DNA sequence information is transferred from one DNA helix (which remains unchanged) to another DNA helix, whose sequence is altered. It is one of the ways a gene may be mutated. Gene conversion may lead to non-Mendelian inheritance and has often been recorded in fungal crosses.

Other factors affecting LD include population structure, epistasis, gene conversion and ascertainment bias. Ascertainment bias (AB) is the bias introduced by the criteria used to select individuals and/or loci in which genetic variation is assayed, so that it leads to inaccurate estimates of LD. Ascertainment is the way individuals with a trait are selected or found for genetic studies, and bias is a difference between the estimated and true value of LD in a statistical sample.

Mutation provides the raw material for producing polymorphisms that will be in LD. Recombination is the main phenomenon that weakens intrachromosomal LD, whereas interchromosomal LD is broken down by independent assortment. Population size also plays an important role. In small populations, the effects of genetic drift result in the consistent loss of rare allelic combinations, which increases LD levels. When genetic drift and recombination are at equilibrium,

$$E(r^2) = \frac{1}{1 + 4Nc}$$

where N is the effective population size and c is the recombination fraction between sites.

In any organism, LD can be used for identifying genomic regions which have been the targets of natural selection (directional and balancing selection) during the evolutionary process.

Natural selection
The process in nature by which, according to Darwin's theory of evolution, only the organisms best adapted to their environment tend to survive and transmit their genetic characteristics in increasing numbers to succeeding generations, while those less adapted tend to be eliminated.

Positive natural selection is the force that drives the increase in prevalence of advantageous traits, and it has played a central role in our development as a species. It is the selective process that fixes alleles which raise individual fitness because they carry advantageous mutations. Positive selection (also called Darwinian selection or adaptive selection) is the process by which new advantageous genetic variants sweep a population.

Genetic hitchhiking is the process by which an evolutionarily neutral or in some cases deleterious allele or mutation may spread through the gene pool by virtue of being linked to a gene that is positively selected.

Directional selection occurs when natural selection favors a single phenotype and therefore allele frequency continuously shifts in one direction. Under directional selection, the advantageous allele will increase in frequency independently of its dominance relative to other alleles (i.e., even if the advantageous allele is recessive, it will eventually become fixed). Directional selection stands in contrast to balancing selection, where selection may favor multiple alleles. It is akin to purifying selection, which removes deleterious mutations from a population, in that both consistently favor one allele over the alternatives.

Purifying selection
Purifying selection refers to selection against nonsynonymous substitutions at the DNA level. In this case, the evolutionary distance based on synonymous substitutions is expected to be greater than the distance based on nonsynonymous substitutions.

Balancing selection refers to a number of selective processes by which multiple alleles (different versions of a gene) are actively maintained in the gene pool of a population at frequencies above that of gene mutation.

Note: a non-synonymous mutation is considered to be under positive selection pressure when it first arises.

Structure 2.0 (population structure)
/structure.html
Department of Human Genetics, University of Chicago

The program structure implements a model-based clustering method for inferring population structure using genotype data consisting of unlinked markers. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. It can be applied to most of the commonly-used genetic markers, including SNPs, microsatellites, RFLPs and AFLPs.

A Java Runtime Environment (JRE version > 1.5.0) by Sun Microsystems is required before structure installation. The compatible JRE for various operating systems can be downloaded free from /download.

Front End (analysis start screen)
The front end organizes data analysis into “projects”. Each project is connected to a single data file. When creating a project, the user also provides information that specifies how to read the data file (number of loci, number of individuals, etc). These are characteristics of the data file, and are always the same within this project.

Parameters in file extraparams
These parameters refine the model in various ways. Boolean options (Boolean, i.e. a yes/no type, as in testing whether an object belongs to a given subclass): type 1 for “Yes”, or “Use this option”; 0 for “No” or “Don't use this option”.

Program options
NOADMIX (Boolean) Assume the model without admixture (each individual is assumed to be completely from one of the K populations). In the output, instead of printing the average value of Q as in the admixture case, the program prints the posterior probability that each individual is from each population. 1 = no admixture; 0 = model with admixture.
LINKAGE (Boolean) Use the linkage model. RLOG10START sets the initial value of recombination rate r per unit distance. RLOG10MIN and RLOG10MAX set the minimum and maximum allowed values for log10 r. RLOG10PROPSD sets the size of the proposed changes to log10 r in each update.

Each project also contains one or more “parameter sets”. These allow the user to specify the details of the MCMC runs, including the number of repetitions, burn-in length, etc, as well as specifying the model of analysis (e.g., whether to allow admixture, models of allele frequencies, etc). The user can then run the Markov chain at chosen values of K, for a given parameter set. The front end stores various summaries of the results, including a number of graphical plots, described below.

Building a project
First you need to construct an input file. Now, click on File → New Project. This opens up a wizard to import the data (Figure 2). The data are copied from the specified input file into the work directory chosen for the project. The wizard consists of four frames:

1. Specify the project directory, project name, and input data file.
2. Specify the basic characteristics of the data file: number of individuals, ploidy of the data (enter '2' for diploid organisms), number of loci, and the value that is used to indicate missing data. Click on “Show data file format” to get a summary of the lengths and number of lines in the data file.

Format for the data file
Essentially, the entire data set is arranged as a matrix in a single file, in which the data for individuals are in rows, and the loci are in columns. For a diploid organism, data for each individual can be stored either as 2 consecutive rows, where each locus is in one column, or in one row, where each locus is in two consecutive columns. Unless you plan to use the linkage model (see below) the order of the alleles for a single individual does not matter.

Genotype Data (Required; integer) Each allele at a given locus should be coded by a unique integer (e.g., microsatellite repeat score). The front end requires returns at the ends of each row, and does not allow returns within rows; the command-line version of structure treats returns in the same way as spaces or tabs.

[Figure: sample data file, annotated with: marker names; recessive alleles; inter-marker distances (in map order within linkage groups; -1 = unlinked); phase information; individual ID; population-data flag (use or not); population of origin of the sample.]

Sample data file: POPDATA=1, NUMINDS=7, NUMLOCI=5, and MISSING=-9. Also, POPFLAG=0, PHENOTYPE=0, EXTRACOLS=0. The second column shows the geographic sampling location of individuals. Note: the label need not be written out.

3. (Rows) Specify which, if any, of the optional extra row data are present: row of marker names; row of inter-marker distances; and a row of phase data after each individual. Also tick the “single line” box if data for each individual are stored in a single row, instead of in the standard format of two rows per individual.
4. (Columns) Specify which of the optional column data are there: Individual ID (LABEL); Population of origin (POPDATA); USEPOPINFO flag (a flag that says to use the POPDATA information for certain individuals when using the prior population information model); phenotype data (for use in association mapping (Pritchard et al., 2000b)); other extra columns of data prior to the genotype data that should be ignored by structure.

When you've finished these steps, you'll get a summary of the data format; if this looks correct, click on 'proceed'. The program will now attempt to load the data file and create the new project.

Configuring a parameter set
Once you've successfully loaded a data file, you are ready to start running structure. You will create o
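As a rough illustration of the data file layout described above (two consecutive rows per diploid individual, one column per locus, integer allele codes), the sketch below reads such a file into per-individual genotypes. It is not part of structure: the function name is hypothetical, and the sketch assumes that every row starts with an individual label, optionally followed by a population column (POPDATA=1), and that -9 marks missing data as in the sample file settings.

```python
MISSING = -9  # value used for missing data (MISSING=-9 in the sample settings)


def parse_two_row_format(text, num_loci, has_popdata=True):
    """Parse a two-rows-per-individual genotype matrix.

    Returns {label: {"pop": int or None, "genotypes": [(a1, a2) or None, ...]}},
    where None marks a locus with missing data.
    """
    first_col = 2 if has_popdata else 1
    rows = {}
    for line in text.strip().splitlines():
        fields = line.split()          # spaces and tabs both separate columns
        if not fields:
            continue
        label = fields[0]
        pop = int(fields[1]) if has_popdata else None
        alleles = [int(x) for x in fields[first_col:]]
        if len(alleles) != num_loci:
            raise ValueError(f"{label}: expected {num_loci} loci, got {len(alleles)}")
        rows.setdefault(label, {"pop": pop, "rows": []})["rows"].append(alleles)

    individuals = {}
    for label, record in rows.items():
        if len(record["rows"]) != 2:
            raise ValueError(f"{label}: expected 2 rows for a diploid individual")
        first, second = record["rows"]
        genotypes = [
            None if MISSING in (a, b) else (a, b)
            for a, b in zip(first, second)
        ]
        individuals[label] = {"pop": record["pop"], "genotypes": genotypes}
    return individuals


if __name__ == "__main__":
    # Hypothetical three-locus example, not the sample file from the slides.
    sample = """
    ind1 1 145 66 0
    ind1 1 -9 64 0
    ind2 2 142 64 1
    ind2 2 145 64 0
    """
    for label, record in parse_two_row_format(sample, num_loci=3).items():
        print(label, record)
```

Checking the number of loci and rows per individual while reading catches the most common formatting mistakes before a long MCMC run is started.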