185914dc1b48e0f7569aa16a877f0c52.ppt
- Number of slides: 28
Biological databases
DNA sequence, Protein sequence, Recognition, 3D structure
14/10/2009 Genomics applied to clinical medicine
Real life, however… (DNA sequence, protein sequence, recognition, 3D structure)

DNA sequence:
>gi|261252063|ref|NZ_ACZV01000005.1| Vibrio orientalis CIP 102891 VIA.Contig80, whole genome shotgun sequence
ACGCGTTAAGTAGACCGCCTGGGGAGTACGGTCGCAAGATTAAAACTCAAATGAATTGACGGGGGCCCGC
ACAAGCGGTGGAGCATGTGGTTTAATTCGATGCAACGCGAAGAACCTTACCTACTCTTGACATCCAGAGA
AGCCGGAAGAGATTCTGGTGTGCCTTCGGGAACTCTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTG
TTGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTGTTTGCCAGCGAGTAATGTCGG
GAACTCCAGGGAGACTGCCGGTGATAAACCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGGCCCTTA
CGAGTAGGGCTACACACGTGCTACAATGGCGCATACAGAGGGCAGCCAACTTGCGAAAGTGAGCGAATCC
CAAAAAGTGCGTCGTAGTCCGGATTGGAGTCTGCAACTCGACTCCATGAAGTCGGAATCGCTAGTAATCG
TGGATCAGAATGCCACGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGG
CTGCAAAAGAAGTAGTTTAACCTTCGGGAGAACGCTTACCACTTTGTGGTTCATGACTGGGGTGAA
GTCGTAACAAGGTAGCCCTAGGGGAACCTGGGGCTGGATCACCTCCTTATACGATGATTACTCACGATGA
GTGTCCACACAGATTGATATGTCTTTATTAGAGCTTTGAGGGGCTATAGCTCAGCTGGGAGAGCGCTTCG

3D structure:
Record  Serial  Atom  Residue  ResNo       X       Y       Z
ATOM        95  CE2   TRP        115  28.381   8.071  33.915
ATOM        96  CE3   TRP        115  27.500   9.825  32.526
ATOM        97  CZ2   TRP        115  27.750   7.155  33.103
ATOM        98  CZ3   TRP        115  26.888   8.895  31.705
ATOM        99  CH2   TRP        115  27.053   7.584  32.002
ATOM       100  N     ASP        116  26.290  11.255  36.778
ATOM       101  CA    ASP        116  25.763  10.825  38.096
ATOM       102  C     ASP        116  24.689  11.802  38.607
ATOM       103  O     ASP        116  24.564  12.103  39.797
ATOM       104  CB    ASP        116  26.872  10.617  39.142
ATOM       105  CG    ASP        116  26.368  10.397  40.557
ATOM       106  OD1   ASP        116  25.812   9.294  40.721
ATOM       107  OD2   ASP        116  26.590  11.276  41.416
ATOM       108  N     PHE        117  23.915  12.348  37.709
ATOM       109  CA    PHE        117  22.766  13.148  38.156
Occupancy and B-factor columns (only partly legible on the slide): 1.00 10.00 10.00 50.00 10.00
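The DNA record on this slide is in FASTA format: a header line starting with ">" followed by sequence lines. As a rough illustration of how such a record can be read programmatically, here is a minimal sketch. The function names `parse_fasta` and `gc_fraction` are hypothetical helpers, not part of any library, and the example header is shortened from the one on the slide.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def gc_fraction(sequence):
    """Fraction of G and C bases in a nucleotide sequence (0.0 if empty)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# First two sequence lines of the contig shown on the slide (header shortened).
example = """>gi|261252063|ref|NZ_ACZV01000005.1| Vibrio orientalis CIP 102891
ACGCGTTAAGTAGACCGCCTGGGGAGTACGGTCGCAAGATTAAAACTCAAATGAATTGACGGGGGCCCGC
ACAAGCGGTGGAGCATGTGGTTTAATTCGATGCAACGCGAAGAACCTTACCTACTCTTGACATCCAGAGA
"""

records = parse_fasta(example)
header, sequence = records[0]
# In this older NCBI header style the fields are separated by "|".
accession = header.split("|")[3]
print(accession, len(sequence), round(gc_fraction(sequence), 3))
```

The sketch keeps the whole header as one string and splits it only on demand, since header conventions differ between databases.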
The amount of data is enormous
http://www3.ebi.ac.uk/Services/DBStats/
Biological databases
- Primary
  - Information comes from experiment
  - Database only organizes and provides the data
  - Ex. GenBank, EMBL
- Derived
  - Annotated a posteriori
  - Data is revised and corrected. Information from literature is added
  - Ex. SWISS-PROT
- Reusable experimental data
  - GEO, SRA
- Computationally derived
  - Ex. PFAM
- Specific issues
  - Molecular Database Collection 2009 update
Search strategies
- Direct access to database
  - Usually more elaborated information
- Global retrieval
  - Sequence Retrieval System (SRS), EBI-Eye, NCBI Entrez, MobyMiner
  - Automated, uniform. Allows checking several (all) databases simultaneously
- Program access (bioXXX, Web services, Taverna)
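As one possible illustration of program access through web services, the sketch below builds a request for the NCBI E-utilities `efetch` endpoint, which serves Entrez records over HTTP. The slide does not name this service, so its use here is an assumption. The helper names `build_efetch_url` and `fetch_record` are hypothetical, and the accession is the contig from the earlier slide, which may or may not still resolve.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities efetch service.
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nuccore", rettype="fasta"):
    """Build an efetch URL that asks for one record as plain text."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return EUTILS_EFETCH + "?" + urlencode(params)


def fetch_record(accession, db="nuccore", rettype="fasta", timeout=30):
    """Download one record as text. Requires network access."""
    url = build_efetch_url(accession, db, rettype)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Only the URL is built here; call fetch_record(...) to actually download.
url = build_efetch_url("NZ_ACZV01000005.1")
print(url)
```

Separating URL construction from the download keeps the request easy to inspect and lets the same helper serve other databases by changing `db` and `rettype`.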
Origin of information
- Individual research
  - Good quality but very limited amount
- Massive sequencing projects: EST, HTS, genome projects
  - Large amount of data. Quality not assured. Frequent updates
Main sequence repositories
- DNA
  - EMBL, GenBank, DDBJ
- Protein
  - Swiss-Prot/TrEMBL, PIR
Trusted annotation
Translation from DNA
http://www.expasy.org
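"Translation from DNA" refers to protein sequences derived computationally from nucleotide entries rather than determined directly. A minimal sketch of that step, assuming the standard genetic code and a known reading frame, is shown below. Real pipelines also need the annotated coding region, the strand, and the organism's translation table, none of which is modeled here. The name `translate` is a hypothetical helper.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate a DNA string codon by codon, starting at offset `frame`.

    Stop codons become "*"; codons with ambiguous bases become "X".
    Trailing bases that do not fill a codon are ignored.
    """
    dna = dna.upper().replace("U", "T")
    protein = []
    for start in range(frame, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[start:start + 3], "X"))
    return "".join(protein)


print(translate("ATGGCCTGA"))
```

A sequence of unknown origin would normally be translated in all six frames (three per strand), which is one reason automatically translated entries carry less trust than curated ones.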
Cross links
- Most database files contain links to other databases
  - DNA sequence to protein sequence
  - Sequence to 3D structure
  - Sequence to bibliographic data…
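To make the idea of cross links concrete, the sketch below collects database cross-references from a flat-file entry. It assumes the UniProtKB-style convention in which such links sit on lines starting with `DR`, with fields separated by "; " and terminated by a period. The entry text, accessions, and the helper name `parse_cross_references` are all invented for illustration and do not correspond to real records.

```python
from collections import defaultdict


def parse_cross_references(entry_text):
    """Map each linked database name to the primary identifiers found on DR lines."""
    links = defaultdict(list)
    for line in entry_text.splitlines():
        if not line.startswith("DR "):
            continue
        # Drop the line code and the trailing period, then split the fields.
        fields = line[2:].strip().rstrip(".").split("; ")
        database, identifiers = fields[0], fields[1:]
        if identifiers:
            links[database].append(identifiers[0])
    return dict(links)


# Invented entry fragment; none of these identifiers is a real accession.
EXAMPLE_ENTRY = """ID   EXAMPLE_PROT   Reviewed;   120 AA.
DR   EMBL; X00000; CAA00000.1; -; Genomic_DNA.
DR   PDB; 0XXX; X-ray; 2.00 A; A=1-120.
DR   Pfam; PF00000; Example_domain; 1.
"""

links = parse_cross_references(EXAMPLE_ENTRY)
print(links)
```

Here the EMBL link corresponds to "DNA sequence to protein sequence" and the PDB link to "sequence to 3D structure".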
Warnings
- Prediction methods can fail, and sometimes an accuracy estimate is not available
- Predictions are always based on cases that are already known
- Databases can contain incorrect data
- Avoid overinterpreting results