d23d83b29df85754b4782821e7cc2e25.ppt
- Number of slides: 44
Methods and resources for pathway analysis PABIO 590 B Week 2
Pathways overview • Introduction to pathways and networks • Examples of pathways and networks • Review of pathway databases and tools • Representing pathways and networks • Methods of inferring pathways and networks • Pathway and cellular simulations
Pathways vs. networks Gene networks: • Clusters of genes (or gene products) with evidence of co-expression • Connections usually represent degrees of co-expression • In-depth knowledge of process is not necessary • Networks are non-predictive Biochemical pathways: • Series of chained chemical reactions • Connections represent describable (and quantifiable) relations between molecules, proteins, lipids, etc. • Enzymatic process is elucidated • Changes via perturbation are predictable downstream
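The "predictable downstream" point can be sketched in a few lines: if a pathway is stored as directed reaction steps, the compounds affected by perturbing one step are simply those reachable from it. This is a minimal sketch, assuming a toy chain loosely modeled on the first steps of glycolysis; the `downstream` helper is a hypothetical name, not part of any pathway tool.

```python
from collections import deque

# Toy pathway: each entry is one reaction step, substrate -> products.
# Compound names are illustrative (first steps of glycolysis).
pathway = {
    "glucose": ["glucose-6-phosphate"],
    "glucose-6-phosphate": ["fructose-6-phosphate"],
    "fructose-6-phosphate": ["fructose-1,6-bisphosphate"],
    "fructose-1,6-bisphosphate": [],
}


def downstream(graph, start):
    """Return every compound reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# Compounds expected to change if glucose-6-phosphate is perturbed.
affected = downstream(pathway, "glucose-6-phosphate")
print(affected)
```

A co-expression network has no such direction or mechanism on its edges, which is why the same traversal would say little there.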
Pathways vs. networks (feature: gene networks | biochemical pathways)
• Curation: relatively easy, automated and manual | difficult, mostly manual
• Nodes: genes or gene products | any general molecule
• Edges: levels of co-expression/influence | representation of possibly quantifiable mechanisms between compounds, or a qualitative relation
• Fidelity: low, usually very little detail | high, specific processes
• Predictive power: relatively low | relatively high
Pathway and network granularity [Figure: level of detail plotted against effort to curate for general interaction networks, qualitative networks, probabilistic networks, mathematical simulation models, and curated reaction pathways]
Yeast gene interaction network Tong et al., Science 303, 808 (2004)
Characteristics of the yeast gene network • Some genes (e.g., regulatory factors) act as "hubs" in a network and have many interactions – Degree of connectivity follows a power law – Hubs may make interesting anti-cancer targets • Clusters of genes with known function suggest function for hypothetical genes in the same cluster • Network characteristics can be used to predict protein interactions • Path between two genes tends to be short (average ~3.3 hops) Tong et al., Science 303, 808 (2004)
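The network measures on this slide (degree, hubs, path length in hops) can be computed directly from an edge list. The sketch below uses a made-up six-gene network with placeholder names, not the yeast data from Tong et al.; it only illustrates how the quantities are defined.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected interaction network; gene names are placeholders.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("E", "F")]

adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

# Degree = number of interaction partners; hubs have the highest degree.
degree = {gene: len(partners) for gene, partners in adjacency.items()}
hub = max(degree, key=degree.get)


def hops(source, target):
    """Shortest path length between two genes via breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nxt in adjacency[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None  # the two genes are not connected


# Average over all gene pairs (this toy network is fully connected).
pairs = list(combinations(sorted(adjacency), 2))
mean_hops = sum(hops(a, b) for a, b in pairs) / len(pairs)
print(hub, degree[hub], round(mean_hops, 2))
```

On a real network the degree values would be tallied into a histogram to check for power-law behavior; six nodes are far too few for that.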
E. coli metabolic pathway: glycolysis Karp et al., Science 293, 2040 (2001)
Pathways: E. coli metabolic map • Encompasses >791 chemical compounds in >744 noted biochemical reactions • Pathway was compiled via literature information extraction and extensive manual curation – System allows users to indicate evidence for pathway annotations – Curation is done collaboratively with numerous experts outside of EcoCyc Karp et al., Science 293, 2040 (2001)
Pathways in bioinformatics • Most resources for pathways focus on metabolic pathways (signaling and regulatory pathways are gaining prominence) • Pathways are a very specific subtype of networks – Like networks, can be made in computable (symbolic) form – Specificities in chemical reactions are more predictive – Pathways can chain together, forming larger pathways Karp et al., Science 293, 2040 (2001)
Pathway repositories • BioCyc/MetaCyc • Kyoto Encyclopedia of Genes and Genomes (KEGG) PATHWAY DB • BioCarta • BioModels database
BioCyc database http://www.biocyc.org • Pathway/genome database (PGDB) for organisms with completely sequenced genomes • 409 full genomes and pathways deposited • Species-specific pathways are inferred from MetaCyc • Query/navigation/pathway creation support through the Pathway Tools software suite
http://www.biocyc.org
MetaCyc database http://www.metacyc.org • Non-redundant reference database for metabolic pathways, reactions, enzymes and compounds • Curation through experimental verification and manual literature review • >1200 pathways from 1600+ species (mostly plants and microorganisms)
http://www.metacyc.org
Glycolysis pathway in MetaCyc http://www.metacyc.org
KEGG PATHWAY database http://www.kegg.com • Consolidated set of databases that cover genomics (GENE), chemical compounds (LIGAND) and reaction networks (PATHWAY) • Broad focus on metabolism, signal transduction, disease, etc. • Species-specific views available (but networks are static across all organisms)
http://www.kegg.com
Glycolysis pathway in KEGG http://www.kegg.com
Global Pathway Map
BioCarta database http://www.biocarta.com • Corporate-owned, publicly curated pathway database • Series of interactive, "cartoon" pathway maps • Predominantly human and mouse pathways • Contains 120,000 gene entries and 355 pathways
http://www.biocarta.com
Glycolysis pathway in BioCarta http://www.biocarta.com
BioModels database http://www.biomodels.net • Database for published, quantitative models of biochemical processes • All models/pathways curated manually, compliant with MIRIAM • Models can be output in SBML format for quantitative modeling • 86 curated models, 40 models pending curation
http://www.biomodels.net
Glycolysis pathways in BioModels http://www.biomodels.net
Comparison of pathway databases (feature: MetaCyc/BioCyc | KEGG PATHWAY | BioCarta | BioModels)
• Curation: manual and automated | automated | manual | manual
• Size: ~621+ pathways | ~289 reference pathways | ~355 pathways | ~126 models
• Nomenclature: EC, GO | EC, KO | none | GO
• Organism coverage: ~500 species | ~475 species | primarily human and mouse | various
• Visuals: species-specific custom | reference and species-specific | animated, cartoonish | non-standardized
• Primary usage: PGDB, computational biology | PGDB, pathway comparisons | human pathways, disease | simulations, modeling
Pathway formats • Extensible Markup Language (XML) • Systems Biology Markup Language (SBML) • BioPAX
Extensible Markup Language (XML) • Standard for representing information in a machine-readable way • Similar to HTML; tags can enclose or contain data
<myXMLData>
  <someTag>Some data here</someTag>
  <anotherTag>More stuff here</anotherTag>
  <attributeTag data="embedded in tag" />
</myXMLData>
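To show what "machine-readable" means in practice, the slide's example can be read with Python's standard `xml.etree.ElementTree` module. This is a small sketch, not tied to any pathway format; the tag names are the illustrative ones from the slide.

```python
import xml.etree.ElementTree as ET

# The example document from the slide (tag names are illustrative).
xml_text = """
<myXMLData>
  <someTag>Some data here</someTag>
  <anotherTag>More stuff here</anotherTag>
  <attributeTag data="embedded in tag" />
</myXMLData>
"""

root = ET.fromstring(xml_text.strip())

# Data enclosed between an opening and a closing tag.
enclosed = {child.tag: child.text for child in root if child.text}

# Data contained in the tag itself, as an attribute.
attribute_value = root.find("attributeTag").get("data")

print(root.tag)
print(enclosed)
print(attribute_value)
```

The two access patterns (element text versus attribute) correspond to the slide's "enclose or contain" distinction.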
Systems Biology Markup Language (SBML) • XML-based language for representing biochemical reactions • Oriented towards software data-sharing • Tiered, upward-compatible architecture (two levels so far, a third planned) • Primary intended use is quantitative model simulation
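Because SBML is XML, a model's species and reactions can be listed with a generic XML parser. The sketch below reads a hand-written toy document in the style of SBML Level 2; the model, species and reaction ids are made up, the document has not been validated against the specification, and real work would more likely use a dedicated SBML library.

```python
import xml.etree.ElementTree as ET

# Namespace used by SBML Level 2 (version 1) documents.
SBML_NS = "http://www.sbml.org/sbml/level2"

# Hypothetical toy model: one reaction converting species S into P.
sbml_text = f"""
<sbml xmlns="{SBML_NS}" level="2" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="S" compartment="cell" initialConcentration="10"/>
      <species id="P" compartment="cell" initialConcentration="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="S_to_P" reversible="false">
        <listOfReactants><speciesReference species="S"/></listOfReactants>
        <listOfProducts><speciesReference species="P"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

ns = {"s": SBML_NS}
root = ET.fromstring(sbml_text.strip())

# Species id -> initial concentration.
species = {
    sp.get("id"): float(sp.get("initialConcentration"))
    for sp in root.findall(".//s:species", ns)
}

# Reaction id -> (reactant ids, product ids).
reactions = {}
for rx in root.findall(".//s:reaction", ns):
    reactants = [r.get("species") for r in rx.findall("s:listOfReactants/s:speciesReference", ns)]
    products = [p.get("species") for p in rx.findall("s:listOfProducts/s:speciesReference", ns)]
    reactions[rx.get("id")] = (reactants, products)

print(species)
print(reactions)
```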
SBML
BioPAX • Like SBML, XML-based pathway representation • Tiered structure – Level 1: Metabolic pathway information – Level 2: Level 1 + molecular interaction, posttranslational modification • Intended to be a lingua franca for pathway databases
BioPAX XML representation
Inferring pathways and networks • Experimental methods – Microarray co-expression – Quantitative trait locus (QTL) mapping – Isotope-coded affinity tagging (ICAT) – Yeast two-hybrid assay – Green fluorescent protein (GFP) tagging • Computational methods – Database-driven protein-protein interactions – Expression clustering techniques – Literature-mining for specified interactions
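As one illustration of the computational side, a co-expression network can be inferred by correlating expression profiles and keeping strongly correlated gene pairs as edges. This is a minimal sketch under stated assumptions: the expression values are made up, the 0.9 cutoff is arbitrary, and Pearson correlation stands in for whatever clustering technique a given study actually uses.

```python
from itertools import combinations
from math import sqrt

# Made-up expression profiles (one value per condition); not real data.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 1.0, 3.0],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


THRESHOLD = 0.9  # arbitrary cutoff chosen for this example

# Keep a pair as an edge when |r| is high; negative r (anti-correlation)
# is kept too, as a possible sign of influence.
edges = []
for g1, g2 in combinations(sorted(expression), 2):
    r = pearson(expression[g1], expression[g2])
    if abs(r) >= THRESHOLD:
        edges.append((g1, g2, round(r, 3)))

for edge in edges:
    print(edge)
```

The resulting edges say only that two profiles move together; they carry no mechanism, which matches the earlier contrast between networks and pathways.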
Cellular simulations • Study the effect perturbation has on a pathway (and thus the organism) • Generally require extensive detail on the pathway or reactions of interest (flux equations, metabolite concentrations, etc.) • Cellular pathway simulations must manage both temporal and spatial complexity
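A perturbation study can be sketched with a very small kinetic model. The example assumes a hypothetical two-step pathway S -> I -> P with first-order kinetics and arbitrary rate constants, and uses forward Euler integration for brevity; it ignores spatial effects entirely and is not how any particular simulator is implemented.

```python
def simulate(k1, k2, s0=10.0, dt=0.01, steps=1000):
    """Forward-Euler integration of S -> I -> P with first-order kinetics.

    k1, k2 are hypothetical rate constants for the two steps;
    s0 is the starting concentration of S (arbitrary units).
    """
    s, i, p = s0, 0.0, 0.0
    for _ in range(steps):
        v1 = k1 * s  # flux through step 1 (S -> I)
        v2 = k2 * i  # flux through step 2 (I -> P)
        s += -v1 * dt
        i += (v1 - v2) * dt
        p += v2 * dt
    return s, i, p


baseline = simulate(k1=1.0, k2=0.5)
perturbed = simulate(k1=0.5, k2=0.5)  # step 1 slowed, e.g. by partial inhibition

print("baseline  S, I, P:", [round(x, 3) for x in baseline])
print("perturbed S, I, P:", [round(x, 3) for x in perturbed])
```

Comparing the two runs shows the downstream effect: with the first step slowed, less product P has accumulated by the end of the simulated interval.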
[Figure: temporal intervals (picosec., nanosec., microsec., millisec., min., yr.) plotted against spatial dimension (0.1 nm, 10 nm, 1 µm, 1 mm, 1 cm, 1 m) for quantum mechanics, molecular dynamics, cellular processes, organ systems, and organisms and physiology] Adapted from Kelly, H., http://www.fas.org/resource/05242004121456.pdf, via Neal, Yngve 2006 VHS, UW MEBI 591
Simulation methods and techniques (biological process | phenomena | computation scheme)
• Metabolism | enzymatic reaction | differential-algebraic equations, flux-based analysis
• Signal transduction | binding | differential-algebraic equations, stochastic algorithms, diffusion-reaction
• Gene expression | binding, polymerization, degradation | object-oriented modeling, differential-algebraic equations, stochastic algorithms, boolean networks
• DNA replication | binding, polymerization | object-oriented modeling, differential-algebraic equations
• Membrane transport | osmotic pressure, membrane potential | differential-algebraic equations, electrophysiology
Adapted from Tomita 2001
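Of the computation schemes listed for gene expression, a boolean network is the simplest to sketch: each gene is on (1) or off (0), and rules update all genes at once. The three-gene circuit below is hypothetical (A activates B, B activates C, C represses A) and is only meant to show the mechanics of synchronous updating and attractor detection.

```python
# Hypothetical three-gene circuit: A activates B, B activates C, C represses A.
def step(state):
    """Synchronous update: every gene reads the previous state at once."""
    a, b, c = state
    return (int(not c), a, b)


def run_until_repeat(state):
    """Iterate until a state recurs; return visited states and the attractor cycle."""
    visited = []
    while state not in visited:
        visited.append(state)
        state = step(state)
    # `state` is now the first repeated state; the cycle starts where it first appeared.
    cycle = visited[visited.index(state):]
    return visited, cycle


visited, cycle = run_until_repeat((1, 0, 0))
for s in visited:
    print(s)
print("attractor length:", len(cycle))
```

With a negative feedback loop like this one, the network settles into an oscillation rather than a fixed state, a qualitative result that needs no rate constants.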
Research in simulation and modeling • Virtual Cell (National Resource for Cell Analysis and Modeling) • MCell (the Salk Institute) • Gepasi (Virginia Tech) • E-CELL (Institute for Advanced Biosciences, Keio University) • Karyote/CellX (Indiana University)
Exercise Your task is to: • Identify the functions of proteins X, Y & Z • Identify the pathway(s) in which they are involved • Look for differences in pathways between databases • Examine the same pathway(s) in humans