
- Number of slides: 25

Developing and Deploying Bioinformatics Applications with MATLAB
MATLAB Applications in Bioinformatics
Kristen Amuzzini, Biotech, Pharmaceutical, & Medical Industry, The MathWorks, Inc.
© 2003 The MathWorks, Inc.

Presentation Layout
§ MATLAB applications in Bioinformatics
§ Customer success stories
§ MATLAB & The Bioinformatics Toolbox
  § Sequence analysis
  § Microarray analysis
§ Integrating MATLAB with other tools
  § MATLAB as computational engine for Excel
§ Questions/Answers & Wrap-up

Bioinformatics Applications
• Sequence analysis
  • Base calling algorithm design, sequence alignment, sequence building algorithms
• Microarray analysis
  • Image processing, QA/QC, data normalization, data analysis
• Proteomics
  • Mass spectrometry signal processing, protein marker identification and classification, peptide sequence identification, 2D-gel image analysis
• Systems Biology
  • Interaction network identification, simulation of metabolic pathways, flux analysis

Bioinformatics teams supporting multiple constituencies with multiple tools.
Bioinformatics Team
• Algorithm development
• Custom one-off analyses
• Programs for biologists
Research Biologists
• Prefer UI/Web based tools
• Want custom analyses
Software Engineers
• C++, Java
• Work off MATLAB prototypes
Tools in use: C/C++, Java, Perl; VB, Excel macros; SQL; GUI-based tools; freeware; S-PLUS, R, SAS, Mathematica; Web-based tools

Using MATLAB, bioinformatics teams can support multiple constituencies.
Research Biologists
• Prefer UI/Web based tools
• Want custom analyses
Software Engineers
• C++, Java
• Work off MATLAB prototypes
Bioinformatics Team
• Algorithm development
• Custom one-off analyses
• Programs for biologists
MATLAB GUIs, analyses
MATLAB prototypes/applications

User example: Genetic Sequence Base Calling
Complete draft of the human genome, accelerated by Applied Biosystems using MATLAB algorithms.
“Having one integrated package is a big advantage. Using MATLAB and the MATLAB Compiler reduced my development time by a factor of 4 or 5.”
“MATLAB has always been ideal as an algorithm prototyping tool,” Labrenz concludes, “but the MATLAB Compiler and C/C++ Math and Graphics Libraries add a whole new dimension, allowing rapid delivery of sophisticated solutions.”
Jim Labrenz, Applied Biosystems

User example: Breast Cancer Prognosis
Rosetta Inpharmatics recently developed a tool that enables clinicians to determine a breast cancer patient’s prognosis based on the gene expression profile of the primary tumor.
“Since MATLAB and the Image Processing Toolbox are fully integrated and the MATLAB platform is very good for matrix calculation, we did not have to spend time writing the low-level image processing and the basic data analysis routines like vector and matrix calculations.”
“Our research scientists are happy with the quick feedback,” Dr. Dai says. “Using MathWorks tools, we can respond to their requests very fast, and it’s easy for the scientists to use these tools. Using the GUIs that we develop in MATLAB, they can access functions without having to remember the underlying code.”
Dr. Hongyue Dai, Rosetta Inpharmatics/Merck & Company

Academic users
• Bioinformatics Teaching
  • MIT, Stanford, Cornell, Carnegie Mellon, …
• Research
  • Sequencing
    • Base calling algorithm design
  • Sequence analysis
    • Computational biolinguistics
  • Microarray analysis
    • Statistical modeling of microarrays
  • Proteomics
    • Statistical modeling of protein-protein interaction
  • Systems Biology
    • Flux analysis

Thousands of universities teach students using MathWorks products.
More than 600 textbooks for education and professional use, in 19 languages
– Biosciences
– Controls
– Signal Processing
– Image Processing
– Mechanical Engineering
– Mathematics
– Natural Sciences
– Environmental Sciences

Industry Issues & Solutions
• Integrating tools from various programming languages is difficult, closed-source tools are not customizable, and freeware is often not supported.
  • MATLAB is a supported, open-architecture, user-friendly environment for data analysis across applications, algorithm development, and deployment.
• There is no standard biological data format.
  • MATLAB and the Bioinformatics Toolbox provide file format support for common data sources (web-based, sequences, microarray, etc.).
• Applications must be easily deployable within organizations.
  • MATLAB’s deployment tools and user-interface design environment allow easy deployment of MATLAB-based applications.

MATLAB & The Bioinformatics Toolbox
Robert Henson, The MathWorks, Inc.

The MathWorks Product Family
Integrated for:
§ technical computing, data analysis and visualization
§ system modeling and simulation
§ implementation of real-time embedded software
[Diagram labels: Blocksets, Toolboxes, DAQ cards, Instruments, Databases and files, Financial Datafeeds, Stateflow, Code Generation, PC-based real-time systems, Desktop Applications, Automated Reports]

Bioinformatics Toolbox 1.0
• File I/O
  • FASTA, PDB, SCF, GPR, GAL
• Web Connectivity
  • GenBank, EMBL, PIR, PDB
• Sequence Analysis & Alignment
  • Needleman-Wunsch, Smith-Waterman
  • DNA/RNA/AA conversions, pattern searching
  • Example alignment:
      212 PYESFTFPELMRKGSYNPVTHIYTAQDVKEVIEYARLRGIR
      321 PYISRYYPELAVHGAYSE-SETYSEQDVREVAEFAKIYGVQ
• Microarray Normalization & Visualization
  • Lowess, global mean, MAD (median absolute deviation)
• Protein Visualization
  • Atomic composition, molecular weight, hydrophobicity profile
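Two of the normalization statistics named above are simple enough to sketch directly. The following is a minimal illustration in Python, not the Bioinformatics Toolbox implementation: the function names `global_mean_normalize` and `mad` and the `target` parameter are hypothetical, and "global mean" is read here as rescaling a channel so that its mean equals a chosen target.

```python
from statistics import mean, median


def global_mean_normalize(values, target=1.0):
    """Rescale values so that their mean equals `target` (assumed reading)."""
    m = mean(values)
    if m == 0:
        raise ValueError("cannot normalize values whose mean is zero")
    return [v * target / m for v in values]


def mad(values):
    """Median absolute deviation: median of |x - median(x)|."""
    center = median(values)
    return median([abs(v - center) for v in values])


if __name__ == "__main__":
    channel = [2.0, 4.0, 6.0]
    print(global_mean_normalize(channel))  # mean of the result is 1.0
    print(mad([1, 2, 3, 4, 100]))          # robust to the outlier 100
```

MAD is often preferred over the standard deviation for microarray intensities because a few saturated spots barely move it.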

MATLAB Desktop Tools
• Launchpad: Start other tools and demos
• Command Window
• Workspace Browser: See your data
• Command History

Sequence Alignment Tutorial Example
• Get human and mouse genes from GenBank
• Look for open reading frames (ORFs)
• Convert DNA sequences to amino acid sequences
• Create a dotplot of the two sequences
• Perform global alignment
• Perform local alignment
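The difference between the last two steps, global and local alignment, can be sketched with one dynamic-programming routine. This is a minimal Python illustration under simplifying assumptions, not the toolbox code: the function name `align_score` is hypothetical, the scoring is a flat match/mismatch/linear-gap scheme rather than a substitution matrix, and only the optimal score is returned, not the alignment itself.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Return the optimal alignment score of sequences a and b.

    local=False: Needleman-Wunsch (global; end gaps are penalized).
    local=True:  Smith-Waterman (local; cell scores are floored at zero).
    """
    # prev[j] is the best score for the previous row of a against b[:j].
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(diag, prev[j] + gap, curr[j - 1] + gap)
            if local:
                score = max(score, 0)    # a local alignment may restart anywhere
                best = max(best, score)  # and may end anywhere
            curr.append(score)
        prev = curr
    return best if local else prev[-1]


if __name__ == "__main__":
    print(align_score("GATTACA", "GATACA"))                          # global
    print(align_score("AAAGATTACATTT", "CCCGATTACAGGG", local=True)) # local
```

Global alignment suits sequences expected to match end to end, while local alignment finds the best-matching region inside otherwise unrelated flanks.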

Microarray Data Analysis Tutorial Example
• Plot expression profiles for genes
• Filter genes based on information content of profile
• Perform hierarchical clustering
• Perform K-means clustering
• Perform Principal Component Analysis
Reference: DeRisi, JL, Iyer, VR, Brown, PO. "Exploring the metabolic and genetic control of gene expression on a genomic scale." Science. 1997 Oct 24; 278(5338): 680-6.
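As an illustration of the K-means step, here is a minimal sketch of Lloyd's algorithm in Python. It is not the toolbox function: the name `kmeans`, the `iterations` parameter, and the choice to seed centroids with the first k profiles (which makes the result deterministic) are assumptions made for this example.

```python
def kmeans(profiles, k, iterations=100):
    """Cluster expression profiles (lists of floats) with Lloyd's algorithm.

    Returns (labels, centroids). Centroids start at the first k profiles.
    """
    if not 0 < k <= len(profiles):
        raise ValueError("k must be between 1 and the number of profiles")
    centroids = [list(p) for p in profiles[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    genes = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9]]
    labels, centroids = kmeans(genes, k=2)
    print(labels)
    print(centroids)
```

In practice the starting centroids are usually chosen at random and the run repeated several times, since Lloyd's algorithm only finds a local optimum.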

Integrating and Deploying Bioinformatics Tools with MATLAB
Robert Henson, The MathWorks, Inc.

Connecting to MATLAB
• C/C++
• Java
• Perl
• Excel / COM
• Database Toolbox
• Web
• Instrument Control
• Data Acquisition
• Image Acquisition
• File I/O

Deploying with MATLAB
• C/C++
• Stand-alone
• COM
• Excel
• Web

Data I/O: Push Data into MATLAB
• Import Excel ranges into MATLAB
• Export MATLAB data into Excel ranges
• Evaluate MATLAB statements in Excel

Computational Engine for Excel Spreadsheet Applications
• MATLAB Excel Link can be the computational engine behind your Excel applications
• Fast, scalable solution
    MLPutMatrix("data", B2:H43)
    MLPutMatrix("Genes", A2:A43)
    MLPutMatrix("TimeSteps", B1:H1)
    MLEvalString("clustergram(data, 'RowLabels', … Genes, 'ColLabels', TimeSteps)")

What else could you do?
• Bioinformatics
• Statistics
• Signal Processing
• Neural Networks
• Image Processing
• Optimization

Summary
Robert Henson, The MathWorks, Inc.

Industry Issues & Solutions
• Integrating tools from various programming languages is difficult, closed-source tools are not customizable, and freeware is often not supported.
  • MATLAB is a supported, open-architecture, user-friendly environment for data analysis across applications, algorithm development, and deployment.
• There is no standard biological data format.
  • MATLAB and the Bioinformatics Toolbox provide file format support for common data sources (web-based, sequences, microarray, etc.).
• Applications must be easily deployable within organizations.
  • MATLAB’s deployment tools and user-interface design environment allow easy deployment of MATLAB-based applications.

Further Information
• Bioinformatics Toolbox product page
  – Demos, technical literature, trial information
  – www.mathworks.com/products/bioinfo
• MATLAB Central
  – File exchange and newsgroup (comp.soft-sys.matlab) access for the MATLAB & Simulink user community
  – www.mathworks.com/matlabcentral