  • Number of slides: 58

Bioinformatics: Structural and functional prediction
Master in Molecular Biotechnology, 2009-10

Outline
• Introduction
• Biological Databases
• Sequence Comparison
• 3D Structure visualization
• Functional Prediction
• Structural Prediction
http://mmb.pcb.ub.es/MBIOTEC/

Material and Evaluation
• Exercises and slides
  - Campus Virtual
  - http://mmb.pcb.ub.es/MBIOTEC
• Evaluation
  - Practical test on Campus Virtual

Bioinformatics: an intuitive definition
Computational tools that can suggest solutions to biological problems

“You really understand a system when you are able to represent it using a mathematical equation” (Lord Kelvin)
“Living organisms are the most perverse of chemical systems” (Coulson)

FEBRUARY 2001: Public Consortium, Celera Genomics
NOVEMBER 2001: Ohio State University

Sequencing, parallel / combinatorial synthesis, HT screening, separation, purification, crystallization...

DATA INFORMATION

Bioinformatics
• Genome projects
• Functional genomics
• Structural genomics
• Proteomics, systems biology
• Molecular recognition
• …

Genome projects
Genome determination, massive sequencing, genome annotation, genomics and disease

+ 2124 Virus
http://www.ncbi.nlm.nih.gov/genomes/static/gpstat.html

http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi?taxid=9606&build=previous

Functional genomics
Statistical analysis, DNA-chips, ..., expression profiles, image processing, data mining

Statistical analysis
• Clustering
• Machine learning methods
• Ontology

http://www.ncbi.nlm.nih.gov/geo/

Structural genomics
X-ray NMR Structure selection 3D Structure Homology 3D Structure-function analysis Molecular modeling Structure-function New biomolecules

Rosalind Franklin
B-DNA diffraction pattern

COX-2 FKBP ADA XO

ATP (Mg) - ACV

Dynamic properties
• Molecular recognition requires structural adjustment

Proteomics
Proteome, metabolome, systems biology

HUMAN PLASMA

http://www.imb-jena.de/jcb/ppi/

Barabasi et al. (and others), since 1999
Pazos et al., EMBO Reports 2003

Bioinformatics & prediction
• The most widely used bioinformatics tools try to predict the function or structure of macromolecules
• Sequence information is the primary entry point
• Evolutionary pressure ensures conservation
  - DNA seq < Protein 3D structure

Prediction. Possible scenarios
1. Homology can be recognized using sequence comparison tools or protein family databases (blast, clustal, pfam, ...). Structural and functional predictions are feasible.
2. Homology exists but cannot be recognized easily (psiblast, threading). Low-resolution fold predictions are possible. No functional information.
3. No homology. 1D predictions. Sequence motifs. Limited functional prediction. Ab initio prediction.
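
As a rough illustration of the three scenarios, the sketch below computes percent identity from an already aligned pair of sequences and maps it to a scenario number. The function names and the 30% / 20% cut-offs are assumptions made for this example only (they loosely follow the commonly quoted "twilight zone" of sequence similarity); the slides do not specify any thresholds, and real tools such as BLAST or PSI-BLAST rely on statistical scores rather than identity alone.

```python
# Illustrative cut-offs (assumptions, not taken from the slides).
CLEAR_HOMOLOGY = 30.0   # at or above: homology usually detectable by sequence comparison
TWILIGHT = 20.0         # between TWILIGHT and CLEAR_HOMOLOGY: homology may exist but is hard to detect


def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def scenario(identity):
    """Map a percent identity to scenario 1, 2 or 3 (toy triage, hypothetical thresholds)."""
    if identity >= CLEAR_HOMOLOGY:
        return 1
    if identity >= TWILIGHT:
        return 2
    return 3


# Made-up aligned fragments, for demonstration only.
identity = percent_identity("ACDE-FG", "ACDQKFG")
print(round(identity, 1), scenario(identity))
```

Identity is only a first filter: a low value does not prove the absence of homology, which is exactly why scenario 2 calls for profile methods or threading.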

Reminder
• Bioinformatics “suggests” answers; experimental proof is still necessary
• Bioinformatics can “save work”: hypotheses can be tested “in silico”
• Bioinformatics can do impossible experiments
• However, never trust bioinformatics

Biological databases

DNA sequence, Molecular Recognition, Protein sequence, 3D Structure

In real life, however...
DNA sequence, Molecular Recognition, Protein sequence, 3D Structure

Raw DNA sequence record (FASTA):
>gi|261252063|ref|NZ_ACZV01000005.1| Vibrio orientalis CIP 102891 VIA.Contig80, whole genome shotgun sequence
ACGCGTTAAGTAGACCGCCTGGGGAGTACGGTCGCAAGATTAAAACTCAAATGAATTGACGGGGGCCCGC
ACAAGCGGTGGAGCATGTGGTTTAATTCGATGCAACGCGAAGAACCTTACCTACTCTTGACATCCAGAGA
AGCCGGAAGAGATTCTGGTGTGCCTTCGGGAACTCTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTG
TTGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTGTTTGCCAGCGAGTAATGTCGG
GAACTCCAGGGAGACTGCCGGTGATAAACCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGGCCCTTA
CGAGTAGGGCTACACACGTGCTACAATGGCGCATACAGAGGGCAGCCAACTTGCGAAAGTGAGCGAATCC
CAAAAAGTGCGTCGTAGTCCGGATTGGAGTCTGCAACTCGACTCCATGAAGTCGGAATCGCTAGTAATCG
TGGATCAGAATGCCACGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGG
CTGCAAAAGAAGTAGTTTAACCTTCGGGAGAACGCTTACCACTTTGTGGTTCATGACTGGGGTGAA
GTCGTAACAAGGTAGCCCTAGGGGAACCTGGGGCTGGATCACCTCCTTATACGATGATTACTCACGATGA
GTGTCCACACAGATTGATATGTCTTTATTAGAGCTTTGAGGGGCTATAGCTCAGCTGGGAGAGCGCTTCG

Raw 3D structure record (PDB ATOM lines):
Record  Serial  Atom  Residue  ResNo        x        y        z
ATOM        95  CE2   TRP        115   28.381    8.071   33.915
ATOM        96  CE3   TRP        115   27.500    9.825   32.526
ATOM        97  CZ2   TRP        115   27.750    7.155   33.103
ATOM        98  CZ3   TRP        115   26.888    8.895   31.705
ATOM        99  CH2   TRP        115   27.053    7.584   32.002
ATOM       100  N     ASP        116   26.290   11.255   36.778
ATOM       101  CA    ASP        116   25.763   10.825   38.096
ATOM       102  C     ASP        116   24.689   11.802   38.607
ATOM       103  O     ASP        116   24.564   12.103   39.797
ATOM       104  CB    ASP        116   26.872   10.617   39.142
ATOM       105  CG    ASP        116   26.368   10.397   40.557
ATOM       106  OD1   ASP        116   25.812    9.294   40.721
ATOM       107  OD2   ASP        116   26.590   11.276   41.416
ATOM       108  N     PHE        117   23.915   12.348   37.709
ATOM       109  CA    PHE        117   22.766   13.148   38.156
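
Raw records like these are meant to be read by programs. The sketch below, a minimal example rather than a reference parser, reads a FASTA record (only the first two sequence lines of the entry above are included, to keep it short) and computes the distance between two alpha carbons from the ATOM coordinates. It assumes that the 7th and 15th atoms listed (serials 101 and 109) are the CA atoms of ASP 116 and PHE 117; the function names are invented for this example.

```python
import math

FASTA_EXAMPLE = """>gi|261252063|ref|NZ_ACZV01000005.1| Vibrio orientalis CIP 102891 VIA.Contig80, whole genome shotgun sequence
ACGCGTTAAGTAGACCGCCTGGGGAGTACGGTCGCAAGATTAAAACTCAAATGAATTGACGGGGGCCCGC
ACAAGCGGTGGAGCATGTGGTTTAATTCGATGCAACGCGAAGAACCTTACCTACTCTTGACATCCAGAGA
"""


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def distance(a, b):
    """Euclidean distance between two (x, y, z) points, in the units of the input (angstroms here)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


records = parse_fasta(FASTA_EXAMPLE)
header, seq = records[0]
accession = header.split("|")[3]   # fourth field of the old-style NCBI "gi|...|ref|...|" header

# Coordinates copied from the ATOM records (assumed atom assignment, see lead-in).
CA_ASP116 = (25.763, 10.825, 38.096)
CA_PHE117 = (22.766, 13.148, 38.156)
ca_ca = distance(CA_ASP116, CA_PHE117)

print(accession, len(seq), round(ca_ca, 2))
```

The distance between consecutive alpha carbons comes out close to 3.8 Å, the value expected for neighbouring residues in a protein chain, which is a quick sanity check on coordinates read from a file.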

The amount of data is huge

http://www3.ebi.ac.uk/Services/DBStats/

Biological databases
• Primary
  - Information comes from experiment
  - Database only organizes and provides the data
  - Ex. GenBank, EMBL
• Derived
  - Annotated a posteriori
  - Data is revised and corrected. Information from literature is added
  - Ex. SWISS-PROT
• Reusable experimental data
  - GEO, SRA
• Computationally derived
  - Ex. PFAM
• Specific
Molecular Database Collection 2009 update

Search strategies
• Direct access to database
  - Usually more elaborated information
• Global retrieval
  - Sequence Retrieval System (SRS), NCBI Entrez
  - Automated, uniform. Allows checking several (all) databases simultaneously
• Program access (bioXXX, Web services, Taverna)
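
Program access can be as simple as building a request URL. The sketch below assembles a request to the NCBI E-utilities `efetch` service, the programmatic interface to Entrez, for the Vibrio orientalis contig shown earlier. The helper name `efetch_url` is made up for this example, and the network call is left commented out so that the snippet runs offline; libraries of the "bioXXX" family (Biopython, BioPerl and similar) wrap the same service.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(db, accession, rettype="fasta", retmode="text"):
    """Build an efetch URL for one record (hypothetical helper, not part of any library)."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": retmode}
    return EFETCH + "?" + urlencode(params)


url = efetch_url("nuccore", "NZ_ACZV01000005.1")
print(url)

# To actually download the record (requires network access):
# from urllib.request import urlopen
# with urlopen(url) as response:
#     fasta_text = response.read().decode("utf-8")
```

Changing `rettype` (for example to `gb` for the GenBank flat file) retrieves the same entry in another format, which is the "automated, uniform" access the slide refers to.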

Origin of information
• Individual research
  - Good quality but very limited amount
• Massive sequencing projects: EST, HTS, genome projects
  - Large amount of data. Quality not assured. Frequent updates

Main sequence repositories
• DNA
  - EMBL, GenBank, DDBJ
• Protein
  - Swiss-Prot/TrEMBL, PIR

Trusted annotation, translation from DNA
http://www.expasy.org

Cross links
• Most database files contain links to other databases
  - DNA sequence to protein sequence
  - Sequence to 3D structure
  - Sequence to bibliographic data
  - ...

Warnings
• Prediction methods can fail, and sometimes their accuracy is not available
• Prediction is always based on what is already known
• Databases can contain incorrect data
• Avoid overinterpreting results