
Fundamentals of Bioinformatics and Applications
Li Yuqiang, 2009.09

Part: Analysis of Biomolecular Information
Chapter 10: Protein Function and Structure Prediction

Chapter contents:
• Protein family and domain databases
• Protein function prediction
• Protein structure prediction

Section 1: Protein Family and Domain Databases

Section contents:
• Protein motif and domain databases
• Protein family databases
• Protein structure databases
• Other biological macromolecule databases

1. Protein motif and domain databases
• Motifs and domains
• PROSITE database
• PRINTS database
• BLOCKS database
• ProDom database
• Pfam database
• SMART database
• InterPro database
• Conserved Domain Database
• CDART

Motifs and domains

Biologists can gain insight into protein function by identifying short consensus sequences related to known functions. These consensus sequence patterns are termed motifs and domains.

A motif is a short conserved sequence pattern associated with a distinct function of a protein or DNA. It is often associated with a distinct structural site performing a particular function. A typical motif, such as a Zn-finger motif, is ten to twenty amino acids long.

A domain is also a conserved sequence pattern, defined as an independent functional and structural unit. Domains are normally longer than motifs: a domain consists of more than 40 residues and up to 700 residues, with an average length of 100 residues. A domain may or may not include motifs within its boundaries. Examples are transmembrane domains and ligand-binding domains.

Identification of motifs and domains relies heavily on multiple sequence alignment as well as profile and hidden Markov model (HMM) construction.
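The profile construction mentioned above can be illustrated with a minimal sketch that turns an ungapped alignment into a log-odds position-specific scoring matrix (PSSM). The function names, the toy alignment, the uniform background frequency, and the pseudocount of 1 are all assumptions made for illustration; real profile tools use sequence weighting and empirically derived background frequencies.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score(pssm, segment):
    """Sum the column scores for a segment as long as the profile."""
    return sum(column[aa] for column, aa in zip(pssm, segment))


# Hypothetical four-column alignment, not taken from any database.
toy_alignment = ["CAHC", "CGHC", "CAHC", "CSHC"]
pssm = build_pssm(toy_alignment)
```

A segment that resembles the alignment, such as `score(pssm, "CAHC")`, scores higher than an unrelated one such as `score(pssm, "WWWW")`, which is the property that database searches with profiles rely on.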

PROSITE (database of protein families and domains)

The first established sequence pattern database.
/prosite/
• A database of protein families and domains that contains biologically significant sites, patterns, and statistical signatures that help identify protein families.
• The sequence patterns covered by PROSITE include enzyme catalytic sites, ligand-binding sites, metal-ion-binding residues, cysteines involved in disulfide bonds, and regions that bind small molecules or other proteins.
• PROSITE also includes statistical sequence signatures built from multiple sequence alignments, which detect more sensitively whether an (unknown) sequence has the corresponding features.
• The functional information of these patterns is primarily based on published literature.
• To search the database with a query sequence, PROSITE uses exact matches to the sequence patterns.

PRINTS (protein motif fingerprint database)

A fingerprint is a group of conserved motifs used to characterise a protein family; its diagnostic power is refined by iterative scanning of a SWISS-PROT/TrEMBL composite. Usually the motifs do not overlap but are separated along a sequence, though they may be contiguous in 3D space.
http://bioinf.man.ac.uk/dbbrowser/PRINTS/
• Provides protein homology analysis, protein motif fingerprint analysis, phylogenetic and sequence evolution analysis, and microarray analysis, and offers downloads of bioinformatics and PRINTS database data.

BLOCKS: a database of blocks

Blocks are ungapped multiple alignments derived from the most conserved, ungapped regions of homologous protein sequences. The blocks, which are usually longer than motifs, are subsequently converted to PSSMs. Because blocks often encompass motifs, the functional annotation of blocks is consistent with that of the motifs.
/blocks
• Detects and identifies protein motifs; the tools are BLOCK search, Get Blocks, and Block Maker.
• A query sequence can be aligned with the precomputed profiles in the database to select the highest-scoring matches.

ProDom: a domain database

ProDom is a comprehensive set of protein domain families automatically generated from the SWISS-PROT and TrEMBL sequence databases. The domains are built using recursive iterations of PSI-BLAST.
http://prodom.prabi.fr/prodom/current/html/home.php
• Provides similarity searches and multiple sequence alignments of related domains from SWISS-PROT.

Pfam (Protein families database of alignments and HMMs)

A database of protein domains derived from sequences in SWISS-PROT and TrEMBL. Each motif or domain is represented by an HMM profile generated from the seed alignment of a number of conserved homologous proteins.

The Pfam database is composed of two parts:
• Pfam-A involves manual alignments.
• Pfam-B involves automatic alignment in a way similar to ProDom (PSI-BLAST).

The functional annotation of motifs in Pfam-A is often related to that in PROSITE. Pfam-B contains only sequence families not covered in Pfam-A. Because of its automatic nature, Pfam-B has much larger coverage but is also more error-prone, because some HMMs are generated from unrelated sequences.
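The exact-match searching described for PROSITE earlier in this section can be sketched by translating a pattern into a regular expression. This is a minimal sketch that handles only the core pattern syntax (`x` for any residue, `[...]` for allowed residues, `{...}` for forbidden residues, `(n)` or `(n,m)` for repeats, `<` and `>` for terminal anchors). The function names, the zinc-finger-style pattern, and the query sequence are illustrative assumptions, not taken from the slides.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern string into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Separate the residue specification from an optional (n) or (n,m).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.group(1), match.group(2), match.group(3)
        if core == "x":
            core = "."                   # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # forbidden residues
        repeat = ""
        if low and high:
            repeat = "{%s,%s}" % (low, high)
        elif low:
            repeat = "{%s}" % low
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_matches(pattern, sequence):
    """Return (start index, matched text) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


# Illustrative zinc-finger-style pattern and a made-up query sequence.
zf_pattern = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"
matches = find_matches(zf_pattern, "MKCAACAAALAAAAAAAAHAAAHGG")
```

Because the match is exact, a single substitution at a fixed position (for example replacing either histidine) removes the hit entirely, which is why profile and HMM methods are more sensitive for divergent sequences.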

SMART (Simple Modular Architecture Research Tool)

Contains HMM profiles constructed from manually refined protein domain alignments.
http://smart.embl-heidelberg.de/
• Alignments in the database are built from tertiary structures whenever available, or otherwise from PSI-BLAST profiles.
• Alignments are further checked and refined by human annotators before HMM profile construction.
• Protein functions are also manually curated.
• The database may be of better quality than Pfam, with more extensive functional annotations.
• Compared to Pfam, the SMART database contains an independent collection of HMMs, with emphasis on signaling, extracellular, and chromatin-associated motifs and domains.
• Sequence searching in this database produces a graphical output of domains with well-annotated information on cellular localization, functional sites, superfamily, and tertiary structure.

InterPro: an integrated pattern database

www.ebi.ac.uk/interpro/
• The database integrates information from the PROSITE, Pfam, PRINTS, ProDom, and SMART databases.
• The sequence patterns from the five databases are further processed. Only overlapping motifs and domains in a protein sequence derived by all five databases are included.
• A popular feature of this database is a graphical output that summarizes motif matches and has links to more detailed information.

CDD (Conserved Domain Database)

A collection of multiple sequence alignments for ancient domains and full-length proteins.
http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml

The CD-Search service may be used to identify the conserved domains present in a protein query sequence:
http://www.ncb…/Structure/cdd/wrpsb.cgi
• RPS-BLAST (Reverse PSI-BLAST) is the search tool used in the CD-Search service.
• It uses a query sequence to search against a pre-computed profile database generated by PSI-BLAST. The role of the PSSM has changed from query to subject, hence the term "reverse" in RPS-BLAST.
• It performs only one iteration of regular BLAST searching against a database of PSI-BLAST profiles to find the high-scoring gapped matches.
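The role reversal described above, with the query as a plain sequence and the database holding the profiles, can be sketched as follows. This is a deliberately simplified, ungapped illustration and not how RPS-BLAST is implemented: the profile database, its scores, the default penalty of -1 for residues absent from a column, and the function names are all hypothetical.

```python
def best_window(profile, query, default=-1):
    """Slide an ungapped profile along the query; return (score, start)."""
    width = len(profile)
    best = None
    for start in range(len(query) - width + 1):
        window = query[start:start + width]
        total = sum(col.get(aa, default) for col, aa in zip(profile, window))
        if best is None or total > best[0]:
            best = (total, start)
    return best  # None when the query is shorter than the profile


def search_profiles(profile_db, query):
    """Score one query against every stored profile, best hit first."""
    hits = []
    for name, profile in profile_db.items():
        result = best_window(profile, query)
        if result is not None:
            hits.append((result[0], name, result[1]))
    return sorted(hits, reverse=True)


# Hypothetical profile database: each profile is a list of column scores.
toy_db = {
    "toy_domain_A": [{"C": 4}, {"A": 2, "G": 2}, {"H": 4}, {"C": 4}],
    "toy_domain_B": [{"W": 5}, {"W": 5}, {"P": 3}],
}
hits = search_profiles(toy_db, "MKCGHCLL")
```

The profiles are computed once and reused for every query, so each search is a single pass rather than the iterative profile rebuilding that PSI-BLAST performs.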

CDART (Conserved Domain Architecture Retrieval Tool)

A domain search program.
/BLAST/
• Combines the results from RPS-BLAST, SMART, and Pfam.
• The resulting domain architecture of a query sequence can be presented graphically along with related sequences.
• CDART is not a substitute for individual database searches because it often misses certain features that can be found in SMART and Pfam.

2. Protein family databases

COG (Clusters of Orthologous Groups)

A protein family database based on phylogenetic classification.
/COG/
• It is constructed by comparing protein sequences encoded in completely sequenced genomes.
• Unicellular clusters: searched with the COGnitor program.
• Eukaryotic clusters: searched with KOGnitor.
• A query sequence can be assigned a function if it has significant similarity matches with any member of the cluster.

ProtoNet

A database of clusters of homologous proteins, similar to COG.
tonet.cs.huji.ac.il/
• Orthologous protein sequences in the SWISS-PROT database are clustered based on pairwise sequence comparisons between all possible protein pairs using BLAST.
• Protein relatedness is defined by the E-values from the BLAST alignments.
• A query protein sequence can be submitted to the server for cluster identification and functional annotation.
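Clustering proteins from pairwise E-values, as described above, can be sketched with single-linkage grouping at a fixed cutoff. This is a simplification chosen for illustration and should not be read as ProtoNet's actual procedure; the cutoff of 1e-5, the protein identifiers, the E-values, and the function name are all made up.

```python
def cluster_by_evalue(proteins, pairwise_evalues, threshold=1e-5):
    """Group proteins linked by any chain of E-values at or below threshold."""
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), evalue in pairwise_evalues.items():
        if evalue <= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for p in proteins:
        clusters.setdefault(find(p), set()).add(p)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical identifiers and E-values, as if taken from all-vs-all BLAST.
toy_proteins = ["P1", "P2", "P3", "P4"]
toy_evalues = {("P1", "P2"): 1e-30, ("P2", "P3"): 1e-8, ("P3", "P4"): 0.5}
clusters = cluster_by_evalue(toy_proteins, toy_evalues)
```

Here P1 and P3 end up in the same cluster even though they were never compared directly, because both are significantly similar to P2. That transitivity is the main design choice in single linkage, and also its main risk when a multidomain protein bridges unrelated families.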

3. Protein structure databases

PDB (Protein Data Bank)

PDB contains three-dimensional structures of biological macromolecules determined experimentally (X-ray crystallography, nuclear magnetic resonance (NMR)):
• Proteins
• Nucleic acids
• Carbohydrates
• Other complexes
/pdb

Working with a PDB entry:
• Download the compressed PDB file.
• View the PDB-format file.
• View the three-dimensional structure with the Jmol viewer.
• Click to open and view detailed information.

Explicit sequence: in a PDB file, the keyword SEQRES marks the explicit sequence; every line beginning with this keyword carries sequence information.

Implicit sequence: the implicit sequence in a PDB file is the stereochemical data, which include the name and three-dimensional coordinates of each atom.
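The distinction between explicit and implicit sequence information can be made concrete with a minimal parser for the fixed-column PDB format. This is a sketch: the function names are hypothetical, the sample records are fabricated (built with format strings so the columns line up), and only the SEQRES and ATOM/HETATM fields needed here are read.

```python
def parse_pdb(lines):
    """Collect SEQRES residue names per chain and basic ATOM fields."""
    seqres = {}  # chain ID -> list of three-letter residue names
    atoms = []   # (atom name, residue name, chain ID, residue number, x, y, z)
    for line in lines:
        record = line[:6].strip()
        if record == "SEQRES":
            chain = line[11]                       # column 12
            seqres.setdefault(chain, []).extend(line[19:70].split())
        elif record in ("ATOM", "HETATM"):
            atoms.append((
                line[12:16].strip(),               # atom name, columns 13-16
                line[17:20].strip(),               # residue name, columns 18-20
                line[21],                          # chain ID, column 22
                int(line[22:26]),                  # residue number, columns 23-26
                float(line[30:38]),                # x, columns 31-38
                float(line[38:46]),                # y, columns 39-46
                float(line[46:54]),                # z, columns 47-54
            ))
    return seqres, atoms


def implicit_sequence(atoms, chain):
    """Residue names in order of first appearance in the coordinates."""
    seen, residues = set(), []
    for _, res_name, chain_id, res_seq, *_ in atoms:
        if chain_id == chain and res_seq not in seen:
            seen.add(res_seq)
            residues.append(res_name)
    return residues


# Fabricated sample records, for illustration only.
ATOM_FMT = "ATOM  {:>5} {:<4} {:>3} {}{:>4}    {:>8.3f}{:>8.3f}{:>8.3f}"
sample = [
    "SEQRES {:>3} {} {:>4}  {}".format(1, "A", 3, "GLY ALA SER"),
    ATOM_FMT.format(1, "N", "GLY", "A", 1, 11.104, 6.134, -6.504),
    ATOM_FMT.format(2, "CA", "GLY", "A", 1, 11.639, 6.071, -5.147),
    ATOM_FMT.format(3, "N", "ALA", "A", 2, 10.500, 4.800, -3.900),
]
seqres, atoms = parse_pdb(sample)
```

In this sample the explicit sequence lists three residues while the coordinates cover only two. The same mismatch occurs in real entries when part of the chain could not be resolved experimentally.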

SCOP (Structural Classification of Proteins)

• Provides a detailed description of the structural and evolutionary relationships among proteins of known structure, covering all entries in the PDB protein structure database.
http://scop.mrc-lmb.cam.ac.uk/scop/
• Besides structural and evolutionary relationships, SCOP includes for each protein a link to PDB, the sequence, references, and images of the structure.
• Proteins can be classified by structural and evolutionary relationships. The result is a hierarchical tree whose main levels are family, superfamily, and fold:
  Family: a clear evolutionary relationship
  Superfamily: a distant evolutionary relationship with a common evolutionary origin
  Fold: similar major structure

DSSP (database of protein secondary structure)

• For any protein in the PDB macromolecule database, DSSP derives the corresponding secondary structure from its three-dimensional structure.
http://www.sander.embl-heidelberg.de/dssp/
• Very useful for studying the relationship between protein sequence, secondary structure, and spatial structure.
• Besides secondary structure, DSSP also includes the geometric features and solvent accessibility of the protein.

HSSP (database of homologous protein sequence alignments)

• A secondary database.
http://www.sander.embl-heidelberg.de/hssp/
• Data are derived from PDB or from SWISS-PROT.
• For each protein in PDB, HSSP aligns all protein sequences homologous to it, thereby grouping proteins of similar sequence into structurally homologous families.
• HSSP helps in analyzing conserved regions of proteins, studying their evolutionary relationships, and designing proteins at the molecular level.

4. Other biological macromolecule databases

MMDB (Molecular Modeling Database)

• MMDB is part of (NCBI) Entrez; the database contains experimentally derived structural data for biological macromolecules.
/entrez/query.fcgi?db=Structure
• Compared with PDB, MMDB holds additional information for each macromolecular structure, such as the biological function of the molecule, the mechanism that produces the function, and the evolutionary history of the molecule.
• It also provides tools for displaying three-dimensional structure models of macromolecules and for structure analysis and structure comparison.
