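Similar-Sequence Searching in Databases

Li Zhifeng, Institute of Radiation Medicine, Academy of Military Medical Sciences

19 March 2008

Bioinformatics Analysis Techniques

What database similarity analysis is needed for
  Identifying new genes
  Designing specific primers and universal primers
  Studying the function of new genes
  Evolutionary analysis

Outline
  Basic knowledge
  Nucleic acid similarity analysis
      Coding sequences
      Non-coding sequences
      Genomes
  Protein similarity analysis
      Sequences
      Domains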

Part 1: Basic Knowledge

Bioinformatics

Bioinformatics is a new interdisciplinary field built on mathematics, computer science, and the life sciences. To understand the biological meaning of many kinds of data, it applies mathematical, computational, and biological methods to collect, process, store, distribute, analyze, and interpret biological information.

Commonly used bioinformatics resources

  NCBI                  http:///
  EBI                   http://www.ebi.ac.uk/
  UCSC Genome Browser   http:///
  TRANSFAC              /  http://www.gene-regulation.de/
  KEGG                  http://www.genome.jp/kegg/
  GO                    /
  DIP                   /

NCBI http:///

Databases at the EBI

  EMBL Nucleotide Database: Europe's primary collection of nucleotide sequences, maintained in collaboration with GenBank (USA) and DDBJ (Japan).
  Ensembl: provides up-to-date completed metazoan genomes and the best possible automatic annotation.
  UniProt (Universal Protein Resource): the world's most comprehensive catalogue of information on proteins. It is a central repository of protein sequence and function created by joining the information contained in UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, and PIR. It comprises:
      The UniProt Knowledgebase (UniProtKB)
      The UniProt Reference Clusters (UniRef)
      The UniProt Archive (UniParc)
      The UniProt Metagenomic and Environmental Sequences (UniMES)
  InterPro: a database of protein families, domains, repeats and sites in which identifiable features found in known proteins can be applied to new protein sequences.
  GOA (Gene Ontology Annotation Database): a member of the GO Consortium since 2001. The GOA project aims to provide high-quality Gene Ontology (GO) annotations to proteins in the UniProt Knowledgebase (UniProtKB) and the International Protein Index (IPI), and is a central dataset for other major multi-species databases such as Ensembl and NCBI.
  Macromolecular Structure Database: European project for the management and distribution of data on macromolecular structures.
  ArrayExpress: gene expression data.
  IntAct: a freely available, open source database system and analysis tools for protein interaction data.

Sequence formats

  Annotated text formats
      GenBank database sequence format
      EMBL database sequence format
  FASTA sequence format
  Nucleic acid sequence motif format
  Protein sequence functional domain format

LOCUS       NM_021257               1885 bp    mRNA    linear   PRI 10-FEB-2008
DEFINITION  Homo sapiens neuroglobin (NGB), mRNA.
ACCESSION   NM_021257 XM_001129381 XM_001132327
VERSION     NM_021257.3  GI:61676205
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens

            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1885)
  AUTHORS   Nicolis,S., Monzani,E., Ciaccio,C., Ascenzi,P., Moens,L. and
            Casella,L.
  TITLE     Reactivity and endogenous modification by nitrite and hydrogen
            peroxide: does human neuroglobin act only as a scavenger?
  JOURNAL   Biochem. J. 407 (1), 89-99 (2007)
   PUBMED   17600531
  REMARK    GeneRIF: ferric-NGB acts not only as scavenger of toxic species,
            but also as a target of the self-generated reactive species
REFERENCE   2  (bases 1 to 1885)
……
COMMENT     REVIEWED REFSEQ: This record has been curated by NCBI staff. The
            reference sequence was derived from AF422797.1 and BC032509.1.
            On or before Sep 25, 2007 this sequence version replaced
            gi:113424669, gi:113424936, gi:21361878.

            Summary: This gene encodes an oxygen-binding protein that is
            distantly related to members of the globin gene family. It is
            highly conserved among other vertebrates. It is expressed in the
            central and peripheral nervous system where it may be involved
            in increasing oxygen availability and providing protection under
            hypoxic/ischemic conditions.

            Publication Note: This RefSeq record includes a subset of the
            publications that are available for this gene. Please see the
            Entrez Gene record to access additional publications.
            COMPLETENESS: full length.
PRIMARY     REFSEQ_SPAN         PRIMARY_IDENTIFIER PRIMARY_SPAN        COMP
            1-1641              AF422797.1         1-1641
            1642-1885           BC032509.1         1349-1592
FEATURES             Location/Qualifiers
     source          1..1885
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /chromosome="14"
                     /map="14q24.3"
     gene            1..1885
                     /gene="NGB"
                     /note="neuroglobin"
                     /db_xref="GeneID:58157"
                     /db_xref="HGNC:14077"
                     /db_xref="HPRD:05602"
                     /db_xref="MIM:605304"
     exon            1..464
                     /gene="NGB"
                     /inference="alignment:Splign"
                     /number=1
     CDS             376..831
                     /gene="NGB"
                     /codon_start=1
                     /product="neuroglobin"
                     /protein_id="NP_067080.1"
                     /db_xref="GI:10864065"
                     /db_xref="CCDS:CCDS9856.1"
                     /db_xref="GeneID:58157"
                     /db_xref="HGNC:14077"
                     /db_xref="HPRD:05602"
                     /db_xref="MIM:605304"
                     /translation="MERPEPELIRQSWRAVSRSPLEHGTVLFARLFALEPDLLPLFQYNCRQFS
                     SPEDCLSSPEFLDHIRKVMLVIDAAVTNVEDLSSLEEYLASLGRKHRAVG
                     VKLSSFSTVGESLLYMLEKCLGPAFTPATRAAWSQLYGAVVQAMSRGWDGE"
     exon            465..576
                     /gene="NGB"
                     /inference="alignment:Splign"
                     /number=2
     exon            577..696
                     /gene="NGB"
                     /inference="alignment:Splign"
                     /number=3
     exon            697..1876
                     /gene="NGB"
                     /inference="alignment:Splign"
                     /number=4
     STS             1623..1775
                     /gene="NGB"
                     /standard_name="D14S693E"
                     /db_xref="UniSTS:151573"
     polyA_signal    1857..1862
                     /gene="NGB"

     polyA_site      1876
                     /gene="NGB"
                     /experiment="experimental evidence, no additional details
                     recorded"
ORIGIN
        1 ttcccaggcc accatagcgg ctggcggagg gagcgcgcgc cttgctggcc tggagggggc
       61 gggggccgtg gcggctttaa agcgcccagc ccaggcgtcg cggggtgggg cggctctggc
      121 ggctgcgggg cgcagggcgc agcggccaag cggggtcccc ggaagcacag ctggggtgtc
      181 tccacctacg actggccgcg cgccttttct ctcccgcgcc agggaaggag cggctgcggc
      241 ccccgccggg cggaggcacg gggggcgtac gaggggcgga ggggaccgcg tcgcggagga
……
     1741 gtactcaggc agctggagag aagagaaggc agcagcagag gcccccgccc tcaccccagc
     1801 catctgcact tgtaccattt gctctgtgct gactgtggtc ctataaattc atgagaaata
     1861 aactggttct gtgtgcaaaa aaaaa
//

(The record ending here illustrates the GenBank database sequence format.)

EMBL database sequence format (slide note: autosomal recessive granulomatous disease)

ID   UPHOXE13 standard; DNA; HUM; 133 BP.
XX
AC   M60941;
XX
SV   M60941.1
XX
DT   10-APR-1991 (Rel. 28, Created)
DT   18-JAN-1995 (Rel. 42, Last updated, Version 4)
XX
DE   Human microbicidal oxidase (p47-phox) gene, exon 1 and partial cds.
XX
KW   autosomal chronic granulomatous disease protein; microbicidal oxidase;
KW   NADPH oxidase.
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia; Eutheria;
OC   Primates; Catarrhini; Hominidae; Homo.
XX
RN   [1]
RP   1-133
RX   MEDLINE; 91187870.
RA   Casimir C.M., Bu-Ghanim H.N., Rodaway A.R., Bentley D.L., Rowe P.,
RA   Segal A.W.;
RT   "Autosomal recessive chronic granulomatous disease caused by deletion at a
RT   dinucleotide repeat";
RL   Proc. Natl. Acad. Sci. U.S.A. 88:2753-2757(1991).
XX
CC   NCBI gi: 189948
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..133
FT                   /db_xref="taxon:9606"
FT                   /sequenced_mol="DNA"
FT                   /organism="Homo sapiens"
FT                   /tissue_type="blood"
FT   CDS             <1..33
FT                   /codon_start=1
FT                   /note="NCBI gi: 189949"
FT                   /gene="CYBB"
FT                   /map="Xp21.1"
FT                   /product="microbicidal

FT                   oxidase"
FT                   /protein_id="AAA60086.1"
FT                   /translation="FEKRFVPSQHY"
FT   intron          34..>133
FT                   /note="G00-120-513"
FT                   /gene="CYBB"
FT                   /map="Xp21.1"
XX
SQ   Sequence 133 BP; 30 A; 37 C; 45 G; 21 T; 0 other;
     tttgagaagc gcttcgtacc cagccagcac tatgtgagta gctggtggag ggcatcccgt        60
     ggggggaata cgggagggac agcacggcca cccttgcagt cccagggcca accagctcca       120
     gtgaggacta acg                                                          133
//

Nucleic acid sequences in FASTA format

FASTA format with multiple nucleic acid sequences:

>gi|10639033|emb|AJ245946.1|HSA245946 Homo sapiens mRNA for neuroglobin (NGB gene)
ATGGAGCGCCCGGAGCCCGAGCTGATCCGGCAGAGCTGGCGGGCAGTGAGCCGCAGCCCGCTGGAGCACGGCACCGTCCTGTTTGCCAGGCTGTTTGCCCTGGAGCCTGACCTGCTGCCCCTCTTCCAGTACAACTGCCGCCAGTTCTCCAGCCCAGAGGACTGTCTCTCCTCGCCTGAGTTCCTGGACCACATCAGGAAGGTGATGCTCGTGATTGATGCTGCAGTGACCAATGTGGAAGACCTGTCCTCACTGGAGGAGTACCTTGCCAGCCTGGGCAGGAAGCACCGGGCAGTGGGTGTGAAGCTCAGCTCCTTCTCGACAGTGGGTGAGTCTCTGCTCTACATGCTGGAGAAGTGTCTGGGCCCTGCCTTCACACCAGCCACACGGGCTGCCTGGAGCCAACTCTACGGGGCCGTAGTGCAGGCCATGAGTCGAGGCTGGGATGGCGAGTAAGAGGCGACCCCGCCCGGCAGCCCCCATCCATCTGTGTCTGTCTGTTGGCCTGTATCTGTTGT
>gi|10639821|emb|AJ245945.1|MMU245945 Mus musculus mRNA for neuroglobin (Ngb gene)
GCTGCATGTGCGTTGACTGCACCCACGCCTCGAGGGTCCCATCACTGCGTCCCGCGAGTCTCCTGGGAGAGAGAGCATGGAGCGCCCGGAGTCAGAGCTGATCCGGCAGAGCTGGCGGGTAGTGAGCCGCAGCCCTCTGGAACATGGCACTGTCCTGTTCGCCAGGCTCTTCGCCCTGGAACCCAGCCTGCTGCCTCTCTTCCAGTACAATGGCCGCCAGTTCTCCAGCCCTGAGGACTGTCTCTCCTCTCCAGAATTCCTGGACCACATTAGGAAGGTGATGCTAGTGATTGATGCTGCAGTGACCAACGTGGAGGACCTGTCTTCATTGGAGGAGTACCTGACCAGCTTGGGCAGGAAGCATCGGGCAGTGGGAGTGAGGCTCAGCTCCTTCTCGACAGTAGGCGAGTCCCTGCTCTACATGCTGGAGAAGTGCCTGGGTCCCGACTTTACACCAGCTACAAGGACCGCCTGGAGCCGACTCTACGGAGCTGTGGTGCAAGCCATGAGCCGAGGCTGGGATGGGGAGTAAGAGACGAGCCAGTGCCCCTATCTATGTGTGTCTGTCTGTTGATCTGCCTGTTGTAGTCTTAGCCTCTCCCCCAGGGTCTCTCTATACCTTGGTC

Protein sequences in FASTA format

FASTA format with a single protein sequence:

>gi|10639034|emb|CAC11133.1| neuroglobin [Homo sapiens]
MERPEPELIRQSWRAVSRSPLEHGTVLFARLFALEPDLLPLFQYNCRQFSSPEDCLSSPEFLDHIRKVMLVIDAAVTNVEDLSSLEEYLASLGRKHRAVGVKLSSFSTVGESLLYMLEKCLGPAFTPATRAAWSQLYGAVVQAMSRGWDGE

The header line is effectively free-form text, so the same sequence can also be written as:

>HsNGB,151aa
MERPEPELIRQSWRAVSRSPLEHGTVLFARLFALEPDLLPLFQYNCRQFSSPEDCLSSPEFLDHIRKVMLVIDAAVTNVEDLSSLEEYLASLGRKHRAVGVKLSSFSTVGESLLYMLEKCLGPAFTPATRAAWSQLYGAVVQAMSRGWDGE

FASTA format with multiple protein sequences:

>gi|10639034|emb|CAC11133.1| neuroglobin [Homo sapiens]
MERPEPELIRQSWRAVSRSPLEHGTVLFARLFALEPDLLPLFQYNCRQFSSPEDCLSSPEFLDHIRKVMLVIDAAVTNVEDLSSLEEYLASLGRKHRAVGVKLSSFSTVGESLLYMLEKCLGPAFTPATRAAWSQLYGAVVQAMSRGWDGE
>gi|10639822|emb|CAC11135.1| neuroglobin [Mus musculus]
MERPESELIRQSWRVVSRSPLEHGTVLFARLFALEPSLLPLFQYNGRQFSSPEDCLSSPEFLDHIRKVMLVIDAAVTNVEDLSSLEEYLTSLGRKHRAVGVRLSSFSTVGESLLYMLEKCLGPDFTPATRTAWSRLYGAVVQAMSRGWDGE

Nucleic acid sequence motif format

ALPHABET= ACGT

log-odds matrix: alength= 4 w= 9
  -4.275  -0.182  -4.195   1.408
  -4.296  -1.487   1.880  -0.816
  -2.160  -1.492  -4.171   1.474
  -0.810  -4.076   1.872  -2.164
   1.537  -1.487  -4.195  -4.205
   0.113   0.340  -0.237  -0.209
  -0.454   0.923   0.390  -0.834
  -1.336  -0.082   0.905   0.100
   0.674  -4.183   0.130  -0.201

log-odds matrix: alength= 4 w= 6
  -2.032   0.324   1.371  -0.781
  -0.409   0.560  -0.250   0.119
  -4.274  -0.519  -0.260   1.167
  -2.188   2.300  -4.191  -2.465
   1.265  -4.111  -0.267  -2.180
  -1.977   2.158  -1.661  -2.071

Protein sequence functional domain formats
  Pattern, profile
  Profile HMM save file format
  SELEX multiple sequence alignment file format
  PROSITE profile format

Practical software packages for sequence analysis
  EMBOSS - The European Molecular Biology Open Software Suite  /
  SMS - The Sequence Manipulation Suite  /sms2/

Part 2: Nucleic Acid Similarity Analysis

Coding sequences
  Expressed sequences: mRNA, CDS, EST
  Multiple sequence alignment

NCBI BLAST
  BLAST - Basic Local Alignment Search Tool  /blast/Blast.cgi

Multiple sequence alignment: ClustalW
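Before turning to a full aligner, the similarity of the two neuroglobin proteins from the multi-sequence FASTA example can be estimated directly, because both entries have the same length and need no gaps. The following is a minimal Python sketch, not part of the original slides: the function names `parse_fasta` and `percent_identity` are illustrative, and position-by-position comparison is only a stand-in for a real alignment when sequences differ in length.

```python
# Minimal sketch (illustrative helper names, not from the slides):
# parse a multi-sequence FASTA text and compute ungapped percent identity.

FASTA_TEXT = """\
>gi|10639034|emb|CAC11133.1| neuroglobin [Homo sapiens]
MERPEPELIRQSWRAVSRSPLEHGTVLFARLFALEPDLLPLFQYNCRQFSSPEDCLSSPEFLDHIRKVMLVIDAAVTNVEDLSSLEEYLASLGRKHRAVGVKLSSFSTVGESLLYMLEKCLGPAFTPATRAAWSQLYGAVVQAMSRGWDGE
>gi|10639822|emb|CAC11135.1| neuroglobin [Mus musculus]
MERPESELIRQSWRVVSRSPLEHGTVLFARLFALEPSLLPLFQYNGRQFSSPEDCLSSPEFLDHIRKVMLVIDAAVTNVEDLSSLEEYLTSLGRKHRAVGVRLSSFSTVGESLLYMLEKCLGPDFTPATRTAWSRLYGAVVQAMSRGWDGE
"""


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may span several lines; collect and join them.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length (align them first)")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


(human_header, human), (mouse_header, mouse) = parse_fasta(FASTA_TEXT)
identity = percent_identity(human, mouse)
print(f"{identity:.1f}% identical over {len(human)} positions")
```

For sequences of unequal length, such as the two mRNA entries, a tool like BLAST or ClustalW is needed to place gaps before identity can be counted.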

ClustalW 2.0.5 Windows installer: ftp://ftp.ebi.ac.uk/pub/software/clustalw2/clustalw-2.0.5-win.msi

Alignment results: Clustal format (*.aln); GDE format is required

Similarity issues in primer design
  Primer Premier software
  Specific primers and universal primers
  Specificity check: PSC http:///PSC/

Non-coding sequences
  Non-coding genes
  Cis-elements

Non-coding gene sequences: existing databases
  http://biobases.ibch.poznan.pl/ncRNA/
  .au/rnadb/
  miRNA: http://microrna.sanger.ac.uk/sequences/index.shtml

RNA secondary structure: Vienna RNA Package, mfold

Secondary-structure similarity analysis: Infernal http:///

Tools that predict microRNAs by sequence and structure similarity: MiRscan, miRAlign, MiRFinder

Infernal

# STOCKHOLM 1.0
tRNA1        GCGGAUUUAGCUCAGUUGGG.AGAGCGCCAGACUGAAGAUCUGGAGGUCC
tRNA2        UCCGAUAUAGUGUAAC.GGCUAUCACAUCACGCUUUCACCGUGGAGA.CC
tRNA3        UCCGUGAUAGUUUAAU.GGUCAGAAUGGGCGCUUGUCGCGUGCCAGA.UC
tRNA4        GCUCGUAUGGCGCAGU.GGU.AGCGCAGCAGAUUGCAAAUCUGUUGGUCC
tRNA5        GGGCACAUGGCGCAGUUGGU.AGCGCGCUUCCCUUGCAAGGAAGAGGUCA
#=GC SS_cons <<<<<<<..<<<<.........>>>>.<<<<<.......>>>>>.....<

tRNA1        UGUGUUCGAU
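The `#=GC SS_cons` line in the Stockholm alignment above records the consensus secondary structure: each `<` opens a base pair that a later `>` closes, and `.` marks an unpaired column. A minimal Python sketch of how such a line can be turned into a list of paired columns follows. The function name `paired_columns` is illustrative and not part of Infernal, and only the `<`, `>` and `.` symbols that appear in this excerpt are handled.

```python
# Minimal sketch (illustrative, not Infernal's own code): recover paired
# alignment columns from a consensus structure line using a stack.

def paired_columns(ss_cons: str) -> tuple[list[tuple[int, int]], list[int]]:
    """Return (pairs, unclosed) for a structure line.

    pairs    -- sorted list of 0-based (opening column, closing column)
    unclosed -- columns of '<' whose partner is not in this string, e.g.
                because the alignment continues in a later block
    """
    stack = []
    pairs = []
    for col, symbol in enumerate(ss_cons):
        if symbol == "<":
            stack.append(col)
        elif symbol == ">":
            if not stack:
                raise ValueError(f"unmatched '>' at column {col}")
            # The most recently opened bracket pairs with this one.
            pairs.append((stack.pop(), col))
        # Any other symbol (here '.') is treated as an unpaired column.
    return sorted(pairs), stack


# First block of the SS_cons line from the alignment above.
SS_CONS_BLOCK = "<<<<<<<..<<<<.........>>>>.<<<<<.......>>>>>.....<"

pairs, unclosed = paired_columns(SS_CONS_BLOCK)
print("paired columns:", pairs)
print("still open at end of block:", unclosed)
```

In this first block the two inner stems close, while the leading run of `<` stays open because its partners lie in the part of the alignment that continues after the block.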
