
- Number of slides: 44
Consorzio COMETA - Progetto PI2S2 FESR
GRID as a Bio-Informatic Tool in Plant Virology Investigations
Alessandro Lombardo 1,2, Gaetano Lanzalone 1,3, Annamaria Muoio 1, Marcello Iacono-Manno 1
1 INFN Sezione di Catania and Consorzio COMETA, Catania, IT
2 Dipartimento di Scienze e Tecnologie Fitosanitarie, Catania, IT
3 INFN LNS, Catania, IT
Grid Open Days all’Università di Palermo, 6-7.12.2007
www.consorzio-cometa.it
• 1 Introduction to the biological problem.
GenBank Release 162, October 2007: 77.5 million sequences, 81.5 billion nucleotides
Biological problem: testbeds
• CMV (Cucumber mosaic virus)
• TYLCV (Tomato yellow leaf curl virus)
• CTV (Citrus tristeza virus)
• TYLCSV (Tomato yellow leaf curl Sardinia virus)
• TSWV (Tomato spotted wilt virus)
Symptoms:
• Rapid decline and death of Citrus grafted on bitter orange (Citrus aurantium L.).
• Stem pitting, reduced yields, poor fruit quality.
• Yellowing of seedlings and leaves.
• Low growth rate.
Vectors: aphids (Toxoptera citricida, Aphis gossypii)
CTV Geographic distribution: Algeria, American Samoa, Antigua and Barbuda, Argentina, Australia, Belize, Bermuda, Bolivia, Brazil, Brunei Darussalam, Cameroon, the Central African Republic, Chad, China, Colombia, Costa Rica, Cyprus, the Dominican Republic, Ecuador, Egypt, El Salvador, Ethiopia, Fiji, French Polynesia, Gabon, Ghana, Guyana, India, Indonesia, Iran, Israel, Italy, Jamaica, Japan, Kenya, Korea Republic, Malaysia, Mauritius, Morocco, Mozambique, Nepal, Netherlands Antilles, New Caledonia, New Zealand, Nicaragua, Nigeria, Pakistan, Panama, Paraguay, Peru, the Philippines, Portugal, Puerto Rico, Saudi Arabia, Spain, Sri Lanka, Suriname, Taiwan, Tanzania, Thailand, Trinidad and Tobago, Turkey, the USA, Uganda, Uruguay, Venezuela, Vietnam, Zaire, Zambia, Western Samoa, the former Yugoslavia, Zimbabwe.
WORLD-WIDE CITRUS PRODUCTION (FAO), thousand tonnes
                          ACTUAL                           PROJECTED  GROWTH RATES (percent per year)
                          Average    Average    2001       2010       1987-89 to  1997-99 to
                          1987-1989  1997-1999                        1997-99     2010
WORLD                     62100      87598      89071      100006       3.5         1.1
DEVELOPED                 25781      28785      29138       31370       1.1         0.7
  North America           11422      14650      14846       16440       2.5         1.0
  Europe                   8481       9692       9977        9879       1.3         0.2
  Former USSR Area          318         93         76          85     -11.5        -0.8
  Oceania                   573        634        579         845       1.0         2.4
  Other developed          4987       3715       3661        4122      -2.9         0.9
DEVELOPING                36319      58813      59933       68636       4.9         1.3
  Africa                   2573       3101       2866        3474       1.9         0.9
  Latin America & Carib.  20211      30651      30602       34925       4.3         1.1
  Near East                5553       8198       8967        9566       4.0         1.3
  Far East                 7982      16863      17497       20672       7.8         1.7
Orange production is expected to settle at around 66.4 million tonnes.
CTV (Citrus tristeza virus)
• Particle dimensions: 2000 nm x 11 nm
• Genome: single-stranded, positive-sense RNA, 19.3 kb
• Genome organization: 12 open reading frames + 2 untranslated terminal regions
• Proteins produced: at least 19
• Complete genomes in GenBank: 9
GENOME: Nucleotide, RNA, … (figure: zoom on a nucleotide)
FASTA format
>gi|806738:18394-19023 Citrus tristeza virus complete genome
ATGGATAATACTAGCGGACAAACTTTCGTTTCTGTGAACCTTTCT
GACGAAAGCAACACAGCGACCACTGACGTCGAACCCGTGAGTT
CGGAAGCGGATCGCTTGGATTTTTTACAGAAAATGAATCCCATTA
TTATCGATGCTTTGATACGGAAGAATAGTTATCAGGGCGCT
TTCGCGCGAGAATAATAGGAGTGTGCGTGGATTGCGGTAGAAAA
CACGATAAGGGGTTGAAGACTGAACGTAAGTGTAAGGTCAACAA
TACGCAGTCTCAGAACGAGGTGGCGCATATGTTAATGCACGACC
>gi|678038:39184-12903 Citrus tristeza virus complete genome
CCGTTAAGTATTTAAACAAAAGCTAGAGCCTTTTCTAATG
CGGAGATATTTGCGATTTGGTTATGTACACCAAGGAAAGG
CAATTGGCTATTGATTTGGCCGCTGAAAGGGAGAAAACGAGACT
GGCTCGTAGACACCCGATGCGTTCTCCGGAAGAAACTCCGGAAT
ATTATAAATTCGGTAGGACTGCTAAAGCAATGTTACCGGACATCA
ACGCCGTAGACGTTGGTGATAACGAGGAAACTTCGTCGGAGTAT
CCAGTGAGTCTGAGTGTTTCTGGCGGAGTTCTCCGCGAACACCA
CTTCATCTGA
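The FASTA layout on this slide (a ">" header line followed by wrapped sequence lines) can be read with a few lines of Python. This is only a minimal sketch: the function name `parse_fasta` and the toy records are illustrative assumptions, not part of the original talk or of any tool it mentions.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {header: sequence} from FASTA-formatted lines."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; keep the header without the ">" marker.
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped, so concatenate them.
            records[header] += line.upper()
    return records


# Toy input with hypothetical identifiers (not real GenBank records).
example = """>seq1 toy record
ATGGATAAT
ACTAGC
>seq2 toy record
CCGTTAAG
""".splitlines()

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

Keeping the records in a plain dictionary is enough for small sets such as the 9 complete CTV genomes; a dedicated library would be a more robust choice for large files.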
• 2 Analysis tools.
ClustalW: MULTIPLE ALIGNMENT OF NUCLEOTIDE SEQUENCES
• POINT MUTATIONS
• INSERTIONS
• DELETIONS
• INVERSIONS
• RECOMBINATIONS
Similarity Plot
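A similarity plot is typically built by sliding a window along two aligned sequences and recording the percent identity in each window. The sketch below illustrates that idea under stated assumptions: the function name, the window and step sizes, and the two toy sequences are hypothetical, and gap columns are counted as mismatches.

```python
from typing import List


def window_identity(a: str, b: str, window: int = 10, step: int = 5) -> List[float]:
    """Percent identity of two aligned sequences in sliding windows."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for start in range(0, len(a) - window + 1, step):
        pairs = zip(a[start:start + window], b[start:start + window])
        # Count columns where both sequences carry the same non-gap symbol.
        matches = sum(1 for x, y in pairs if x == y and x != "-")
        scores.append(100.0 * matches / window)
    return scores


# Toy aligned pair (hypothetical): one substitution and one gap in seq_b.
seq_a = "ATGGATAATACTAGCGGACA"
seq_b = "ATGGATAATACTTGCGG-CA"
profile = window_identity(seq_a, seq_b, window=10, step=5)
print(profile)
```

A drop in the profile marks a region where the two sequences diverge, which is what the plot makes visible at genome scale.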
PHYLOGENETIC TREES
RECOMBINATION
RdRp (RNA-dependent RNA polymerase) strand switching leads to the FORMATION OF MOSAIC STRUCTURES
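The way strand switching produces a mosaic genome can be shown with a toy model: the copy follows one parental template up to a breakpoint, then continues on the other. The function name, the parental strings and the breakpoints below are purely illustrative assumptions; real recombination is of course not this regular.

```python
from typing import List


def make_mosaic(parent_a: str, parent_b: str, breakpoints: List[int]) -> str:
    """Copy alternately from two equal-length templates, switching at each breakpoint."""
    if len(parent_a) != len(parent_b):
        raise ValueError("templates must have equal length")
    templates = (parent_a, parent_b)
    current = 0  # index of the template being copied
    previous = 0  # start of the segment being copied
    pieces = []
    for bp in sorted(breakpoints) + [len(parent_a)]:
        pieces.append(templates[current][previous:bp])
        current = 1 - current  # the polymerase jumps to the other strand
        previous = bp
    return "".join(pieces)


# Toy parents (hypothetical) chosen so that the origin of each segment is obvious.
mosaic = make_mosaic("AAAAAAAA", "CCCCCCCC", [3, 6])
print(mosaic)
```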
TOPALi: IDENTIFICATION OF RECOMBINATION EVENTS
TOPALi v2.0 (BioSS, Biomathematics & Statistics Scotland)
• DSS (Difference of Sums of Squares; McGuire and Wright, 2000)
• PDM (Probabilistic Divergence Measures; Husmeier and Wright, 2001)
• HMM (Hidden Markov Model; Husmeier and McGuire, 2003)
PDM analysis time for the CTV alignment on a user PC (3.2 GHz): 44.2 h
TOPALi input
TOPALi output
SplitsTree: input.aln, output.jpg
SplitsTree algorithm: bootstrapping
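In phylogenetics, bootstrapping usually means resampling the columns of an alignment with replacement and rebuilding the tree or network on each replicate. The sketch below covers only the resampling step; the function name and the toy alignment are assumptions, and it does not reproduce SplitsTree's own implementation.

```python
import random
from typing import Dict


def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Return one bootstrap replicate: columns drawn with replacement."""
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # Draw the column indices once so every sequence keeps the same columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Toy alignment (hypothetical). A seeded generator makes the run repeatable.
toy = {"isolate_1": "ACGTACGT", "isolate_2": "ACGTTCGT"}
replicate = bootstrap_alignment(toy, random.Random(42))
for name, seq in replicate.items():
    print(name, seq)
```

The support for a split is then the fraction of replicates in which that split reappears.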
Secondary structures
Secondary structures
5’ UTR (Untranslated Terminal Region) of CTV, from Gowda et al., 2003
MPGAFold: massively parallel genetic algorithm that predicts RNA secondary structure (Computational RNA Structure Group, National Cancer Institute, USA)
Knetfold
Consensus secondary structure of 78 CTV p23 genes obtained with Knetfold.
• 3 Porting of applications
Porting of Bio-Informatic Tools for Plant Virology
• Applications: ClustalW-MPI, TOPALi, SplitsTree and Knetfold.
• ClustalW-MPI, a data analysis program for multiple alignments, runs as an MPI job on the Grid.
• TOPALi and SplitsTree run as interactive jobs on the Grid.
• Knetfold runs as a parametric job.
• MPGAFold will run as an MPI-2 job (in progress).
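A parametric job runs the same executable over a range of parameter values, one sub-job per value. The sketch below only illustrates that expansion: the `{param}` placeholder, the wrapper script name `run_fold.sh` and the input file names are hypothetical and do not come from the talk, from Knetfold, or from any Grid middleware syntax.

```python
from typing import List


def expand_parametric(template: str, start: int, stop: int, step: int = 1) -> List[str]:
    """Expand one command template into one command line per parameter value."""
    return [template.format(param=value) for value in range(start, stop, step)]


# Hypothetical wrapper script and input names, used only for illustration.
commands = expand_parametric("run_fold.sh alignment_{param}.fasta", 0, 3)
for command in commands:
    print(command)
```

Because the sub-jobs are independent of each other, they can be dispatched to different worker nodes without any communication between them, unlike the MPI jobs listed above.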
PROGRAMS WORK FLOW
• 4 Problem Solution by GENIUS.
• 5 Results and Conclusions
Results
Sequence alignment time with ClustalW
Elapsed time of ClustalW-MPI on 9 complete Citrus tristeza virus genomes as a function of the number of processors.
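Elapsed times measured at different processor counts are commonly summarized as speed-up and parallel efficiency. The sketch below shows that arithmetic; the function name is an assumption and the timings are invented placeholders, not the measurements plotted on the slide.

```python
from typing import Dict, Tuple


def speedup_and_efficiency(elapsed: Dict[int, float]) -> Dict[int, Tuple[float, float]]:
    """Map processor count -> (speed-up, efficiency) relative to the smallest run."""
    base_procs = min(elapsed)
    base_time = elapsed[base_procs]
    result = {}
    for procs, seconds in sorted(elapsed.items()):
        speedup = base_time / seconds
        # Efficiency is the speed-up divided by the increase in processors.
        efficiency = speedup / (procs / base_procs)
        result[procs] = (speedup, efficiency)
    return result


# Hypothetical timings in seconds (placeholders, not the talk's data).
timings = {1: 100.0, 2: 50.0, 4: 40.0}
summary = speedup_and_efficiency(timings)
for procs, (speedup, efficiency) in summary.items():
    print(procs, round(speedup, 2), round(efficiency, 2))
```

An efficiency well below 1 at higher processor counts indicates that communication or serial stages dominate, which helps in choosing how many processors to request for a job.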
TOPALi timing comparison
TOPALi 2 analysis times on CTV sequences (DSS method) across different computing architectures.
• 6 Future perspectives
Tools for protein folding and docking
Conclusions
• The process speed-up, together with the integration of the whole phylogenetic analysis into a coherent and easy-to-use framework, will lead to remarkable progress in such investigations.
Any questions? Thank you very much for your kind attention!