b6e4910eb35f4f421578fb10f423a51a.ppt
- Number of slides: 17
wFleaBase: Daphnia Genome Database from Common Components
Daphnia Genomic Consortium Meeting, Sept. 2003
Don Gilbert, gilbertd@indiana.edu
http://iubio.indiana.edu/daphnia
A Replicable Genome infOrmation System (Argos)
http://eugenes.org/argos | flybase.net/flybase-ng
common/
  java/ ; perl/ -- program libraries and packages
  servers/ -- major programs (BLAST, MySQL/PostgreSQL, others)
  systems/ -- OS executables of programs
daphnia/ -- implemented organism genome systems
eugenes/
flybase/
docs/ & install/ -- Argos instructions and usage
template/ -- structure for new projects
ROOT/ -- common directory of installed projects
Argos features
Common genome tool set
  - Share benefits of “best of breed” genome tools
  - Common parts are tested & maintained by others
  - Minimal IT expertise (no compiles or system management)
  - Choice of tools (existing or new genome DB use parts desired)
Flexible project packages
  - Project needs specify tool set (compare EnsEMBL, where all use one set)
  - Own look’n’feel web pages, contents, functions
  - Security for protected and public sections
Easy replication to any Unix computer
  - ‘Live’ database system replication using rsync
  - Keep remote servers up-to-date every day
  - Local cluster/grid for high-volume traffic
  - Works on common workstations, laptops
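The daily rsync replication listed above could be scripted roughly as follows. This is a minimal sketch, not the Argos implementation: the function names, the local path, and the mirror host are hypothetical. Only the rsync options used (-a, -z, --delete, --dry-run) are standard.

```python
import subprocess


def build_rsync_command(source, destination, dry_run=False):
    """Return the rsync argument list for mirroring source onto destination.

    -a keeps permissions, times and symlinks; -z compresses data in transit;
    --delete removes files on the mirror that no longer exist at the source.
    """
    cmd = ["rsync", "-az", "--delete"]
    if dry_run:
        # Report what would change without touching the mirror.
        cmd.append("--dry-run")
    cmd += [source, destination]
    return cmd


def mirror(source, destination, dry_run=False):
    """Run rsync and raise CalledProcessError if it exits with an error."""
    return subprocess.run(
        build_rsync_command(source, destination, dry_run), check=True
    )


if __name__ == "__main__":
    # Hypothetical paths: a local project tree and a remote mirror host.
    print(" ".join(build_rsync_command(
        "/ROOT/daphnia/", "mirror.example.org:/ROOT/daphnia/", dry_run=True)))
```

Calling mirror() once a day from a scheduler such as cron would keep a remote server current. The trailing slash on the source path tells rsync to copy the directory contents instead of the directory itself.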
Argos - advanced features
Data mining
  - Fulfill need to search & retrieve 1000s of genes
  - Web Services, Grid Services and LDAP for large data sets
  - Simple, computable, industry standards for query by criteria and retrieval of volumes of data
  - Bypass time-consuming web pages made for people
  - Use with personal, lab databases to keep genome links up-to-date
Argos common parts
- Java common library, Ant builds, XML tools, Web Services (Axis), Lucene for “Google”-like searches
- Perl common library of BioPerl, GBrowse, others
- Servers include
  - Apache, Tomcat web servers
  - MySQL, PostgreSQL databases
  - BLAST (NCBI)
- Systems compiled for apple-powerpc-darwin, intel-linux, sun-sparc-solaris
wFleaBase structure
Cgi-bin -- Web programs (Perl)
Common -- Link to common, shared tools
Conf -- Site configurations for web, data
Data -- Bulk data & FTP site folder
Dbs -- Project databases: blast, lucene, mysql
Indices -- Database indices
Lib -- Program libraries
Web -- Web structure and documents
  Genomics, Sequences, Maps, Literature, Stocks, Docs, other
  includes Public and Protected (project member only) parts
Webapps -- Web programs (Java)
  includes Search system, Secure web and editing
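As an illustration of the layout above, a new project skeleton could be created like this. This is a sketch under assumptions: the lowercase directory names and the create_skeleton helper are hypothetical, and "common" is created as a plain directory here, whereas the slide describes it as a link to the shared tools.

```python
from pathlib import Path

# Top-level folders named on the slide; lowercase spelling is an assumption.
PROJECT_DIRS = [
    "cgi-bin",   # web programs (Perl)
    "common",    # link to common, shared tools (plain directory in this sketch)
    "conf",      # site configurations for web, data
    "data",      # bulk data and FTP site folder
    "dbs",       # project databases: blast, lucene, mysql
    "indices",   # database indices
    "lib",       # program libraries
    "web",       # web structure and documents
    "webapps",   # web programs (Java)
]


def create_skeleton(root):
    """Create the project folders under root and return their paths."""
    root = Path(root)
    created = []
    for name in PROJECT_DIRS:
        path = root / name
        # exist_ok makes the call safe to repeat on an existing project.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

For example, create_skeleton("/tmp/myproject") would produce the nine folders under a hypothetical /tmp/myproject directory.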
Search wFleaBase
BLAST wFleaBase
Edit wFleaBase
Where to put Daphnia Genome?
Database needs
  - Automated annotation and curated updates
  - Search and retrieve data subsets
Choices
  - EnsEMBL - working now, Gramene & others use
  - GMOD: Chado - in development (FlyBase, WormBase, ChlamyGenome, TIGR, others will use)
  - Other choices?
Generic Model Organism Database Construction Set
www.gmod.org
  - Genome+ Database (more than annotations)
  - Genome visualization tools
  - Genome annotation pipeline planned
  - Literature curation and Gene Ontology tools
  - Component system (pick and choose)
  - Developing - more complete in 2004
EnsEMBL Genome Database
www.ensembl.org
  - Genome annotation database
  - Genome visualization tools
  - Genome annotation pipeline
  - Comprehensive system (all or none)
  - Production - useable now
From Shawn Hoon, Fugu Informatics Group
wFleaBase issues
- Basic web system ready for genome data?
- Start with EnsEMBL for management; move to GMOD: Chado if better choice?
- Add GMOD GBrowse; Apollo Editor with genome
- Add “Self-service” database features for?
  - Easy management by scientists
  - Genome data; stocks; research literature
- Add evolutionary, ecological, environmental data
Prototype at http://iubio.indiana.edu/daphnia/
GBrowse Maps
Apollo Annotator