Grid Computing (Special Topics in Computer Engineering) Veera Muangsin 23 January 2004 1
Outline • High-Performance Computing • Grid Applications • Grid Architecture • Grid Middleware • Grid Services 2
High-Performance Computing 3
World’s Fastest Computers: The Top 5. mega = 10^6 (million), giga = 10^9 (billion), tera = 10^12 (trillion), peta = 10^15 (quadrillion) 4
#1 Japan’s Earth Simulator – Specifications: • Total number of processors: 5,120 • Peak performance / processor: 8 Gflops • Total number of nodes: 640 • Peak performance / node: 64 Gflops • Total peak performance: 40 Tflops • Shared memory / node: 16 GB • Total main memory: 10 TB 5
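A quick arithmetic cross-check, not on the original slide, confirms the figures are mutually consistent:
\[ \frac{5120\ \text{processors}}{640\ \text{nodes}} = 8\ \text{processors/node}, \quad 8 \times 8\ \text{Gflops} = 64\ \text{Gflops/node}, \quad 640 \times 64\ \text{Gflops} = 40.96\ \text{Tflops} \approx 40\ \text{Tflops}. \]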
Processor Cabinets 6
Earth Simulator does climate modeling [Diagram: parallel decomposition of the atmosphere between grid space (I = 3840 longitudes, J = 1920 latitudes) and spectral space, with forward and inverse FFTs distributing work across processor nodes (PN01, PN02, …)] Grid points: 3840 × 1920 × 96 7
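To make the diagram concrete, here is a minimal Python sketch of the grid-space / spectral-space round trip that a spectral-transform method performs. It is not Earth Simulator code; only the I and J sizes come from the slide, and everything else (NumPy, random data, a 1-D FFT along longitude) is assumed for illustration:

```python
# Minimal, illustrative sketch of the spectral-transform step in the diagram:
# moving one horizontal level between grid space and spectral space with an
# FFT along the longitude dimension.
import numpy as np

I, J = 3840, 1920                   # longitudes x latitudes (1 of 96 levels)
field = np.random.rand(J, I)        # field in grid space

spectral = np.fft.rfft(field, axis=1)            # grid space -> spectral space
restored = np.fft.irfft(spectral, n=I, axis=1)   # inverse FFT -> grid space

assert np.allclose(field, restored)  # the round trip reproduces the field
```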
IBM Blue Gene • Being constructed by IBM • To be completed in 2006 • Expected performance: 1 PetaFLOPS, to be no. 1 in the TOP 500 list (in 2003 the aggregated performance of all TOP 500 machines was 528 TFlops) • Applications: molecular dynamics, protein folding, drug-protein interaction (docking) 8
Clusters: the most common architecture in the TOP 500 – 7 of the top 10 – 208 of the 500 9
#2 LANL’s ASCI Q • 13.88 TFlops • 8,192-node cluster, HP AlphaServer 1.25 GHz • LANL (Los Alamos National Laboratory) • Analyze and predict the performance, safety, and reliability of nuclear weapons 11
#3 Virginia Tech’s System X • 10.28 TFlops • 1,100-node cluster, Apple G5, dual PowerPC 970 2 GHz, 4 GB memory, 160 GB disk per node (total 176 TB), Mac OS X (FreeBSD-based UNIX) • $5.2 million 12
System X’s Applications • Nanoscale Electronics • Quantum Chemistry • Computational Chemistry/Biochemistry • Computational Fluid Dynamics • Computational Acoustics • Computational Electromagnetics • Wireless Systems Modeling • Large-scale Network Emulation 13
#4 NCSA’s Tungsten • 9.81 TFlops • 1,450-node cluster, dual-processor Dell PowerEdge 1750, Intel Xeon 3.06 GHz • NCSA (National Center for Supercomputing Applications) 14
#5 PNNL’s MPP2 • 8.63 TFlops • 980-node cluster, HP Longs Peak, dual Intel Itanium-2 1.5 GHz • PNNL (Pacific Northwest National Laboratory) • Application: Molecular Science 15
The Real No. 1: 68.06 TFlops!!! 16
Statistics (Total | Last 24 Hours):
Users: 4,848,584 | 1,457 (new users)
Results received: 1,213,258,391 | 1,507,691
Total CPU time: 1,783,547.603 years | 1,324.293 years
Floating Point Operations: 4.315893e+21 | 5.879995e+18 (68.06 TeraFLOPs/sec)
Last updated: Fri Jan 23 01:33:45 2004 17
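The headline number follows from dividing the last-24-hours operation count by the seconds in a day:
\[ \frac{5.879995 \times 10^{18}\ \text{FLOP}}{86\,400\ \text{s}} \approx 6.806 \times 10^{13}\ \text{FLOP/s} = 68.06\ \text{TeraFLOPs/sec}. \]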
Science at Home 18
Evaluate AIDS drugs at home • 9,020 users (12 Jan 2004) • AutoDock: predicts how drug candidates might bind to a receptor of an HIV protein 19
Scientific Applications • Always push computer technology to its limits • Grand Challenge applications – those that cannot be completed with sufficient accuracy and timeliness to be of interest, due to limitations such as speed and memory of current computing systems • Next challenge: large-scale collaborative problems 20
E-Science: a new way to do science • Pre-electronic science – Theorize and/or experiment, in small teams • Post-electronic science – Construct and mine very large databases – Develop computer simulations & analyses – Access specialized devices remotely – Exchange information within distributed multidisciplinary teams 21
Data Intensive Science: 2000-2015 • Scientific discovery increasingly driven by IT – Computationally intensive analyses – Massive data collections – Data distributed across networks of varying capability – Geographically distributed collaboration • Dominant factor: data growth – 2000: ~0.5 Petabyte – 2005: ~10 Petabytes – 2010: ~100 Petabytes? – 2015: ~1000 Petabytes? • Storage density doubles every 12 months • Transforming entire disciplines in physical and biological sciences 22
Network • Network vs. computer performance – Computer speed doubles every 18 months – Network speed doubles every 9 months – Difference = order of magnitude per 5 years • 1986 to 2000 – Computers: ×500 – Networks: ×340,000 • 2001 to 2010 – Computers: ×60 – Networks: ×4000 23
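The "order of magnitude per 5 years" line is just the ratio of the two doubling rates over 60 months:
\[ \text{computers: } 2^{60/18} \approx 10\times, \qquad \text{networks: } 2^{60/9} \approx 100\times, \qquad \text{gap: } 2^{60/9 - 60/18} = 2^{60/18} \approx 10\times. \]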
E-Science Infrastructure [Diagram: software connecting computers, sensor nets, instruments, colleagues, and data archives] 24
Online Access to Scientific Instruments [Diagram: Advanced Photon Source – real-time collection, tomographic reconstruction, archival storage, wide-area dissemination to desktop & VR clients with shared controls] DOE X-ray grand challenge: ANL, USC/ISI, NIST, U. Chicago 25
Data Intensive Physical Sciences • High energy & nuclear physics – Including new experiments at CERN • Astronomy: Digital sky surveys • Time-dependent 3-D systems (simulation, data) – Earth Observation, climate modeling – Geophysics, earthquake modeling – Fluids, aerodynamic design – Pollutant dispersal scenarios 26
Data Intensive Biology and Medicine • Medical data – X-Ray – Digitizing patient records • X-ray crystallography • Molecular genomics and related disciplines – Human Genome, other genome databases – Proteomics (protein structure, activities, …) – Protein interactions, drug delivery • 3-D brain scans 27
Grid Computing 28
What is Grid? Google Search (Jan 2004): “grid computing” >600,000 hits; “grid computing” AND hype >20,000 hits (hype = overblown promotion) 29
From Web to Grid • 1989: Tim Berners-Lee invented the web – so physicists around the world could share documents • 1999: Grids add to the web – computing power, data management, instruments – E-Science • Commerce is not far behind 30
The Grid Opportunity: e-Science and e-Business • Physicists worldwide pool resources for peta-op analyses of petabytes of data • Engineers collaborate to design buildings, cars • An insurance company mines data from partner hospitals for fraud detection • An enterprise configures internal & external resources to support e-Business workload 31
Grid • “We will give you access to some of our computers and instruments if you give us access to some of yours.” • “Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations” 32
Grid • Grid provides the infrastructure – to dynamically manage: • compute resources • data sources (static and live) • scientific instruments (wind tunnels, telescopes, microscopes, simulators, etc.) – to build large-scale collaborative problem-solving environments that are: • cost-effective • secure 33
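As a toy illustration of resource sharing across a virtual organization, here is a minimal Python sketch of a broker that picks the least-loaded shared resource. All names, numbers, and the selection policy are invented for the sketch; real grids delegate this to middleware, as the later slides discuss:

```python
# Toy resource broker: pick the compute resource with the most free CPUs
# among those shared by a virtual organization (VO). Purely illustrative;
# real brokers also handle security, data movement, and scheduling.
from dataclasses import dataclass

@dataclass
class Resource:
    name: str          # hypothetical site name
    owner: str         # institution sharing the resource with the VO
    cpus: int
    load: float        # fraction of CPUs busy, 0.0-1.0

def pick_resource(vo_resources):
    """Return the resource with the largest number of free CPUs."""
    return max(vo_resources, key=lambda r: r.cpus * (1.0 - r.load))

vo = [Resource("cluster-a", "University X", 128, 0.90),
      Resource("cluster-b", "Lab Y", 64, 0.20),
      Resource("simulator", "Center Z", 16, 0.50)]
print(pick_resource(vo).name)   # -> cluster-b (about 51 free CPUs)
```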
Grid Applications 34
Life Sciences [Diagram: imaging instruments (data acquisition) and large databases connected over the network to computational resources (processing, analysis) and advanced visualization] 35
Biomedical applications • Data mining on genomic databases (exponential growth) • Indexing of medical databases (TB/hospital/year) • Collaborative framework for large-scale experiments • Parallel processing for – database analysis – complex 3D modelling 36
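As a toy illustration of "parallel processing for database analysis", here is a minimal Python sketch that fans a motif count out over database shards with a process pool. The motif, the stand-in shards, and the counting scheme are all invented for the example:

```python
# Toy parallel database analysis: count occurrences of a DNA motif across
# shards of a sequence database using a pool of worker processes.
from multiprocessing import Pool

MOTIF = "GATTACA"

def count_motif(shard):
    """Count non-overlapping motif occurrences in one database shard."""
    return shard.count(MOTIF)

if __name__ == "__main__":
    shards = ["...GATTACA...", "...", "...GATTACAGATTACA..."]  # stand-in data
    with Pool() as pool:
        print(sum(pool.map(count_motif, shards)))  # -> 3
```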
Digital Radiology on the Grid • 28 petabytes/year for 2000 hospitals • must satisfy privacy laws (University of Pennsylvania) 37
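A quick division, not on the slide, ties this to the previous slide's TB/hospital/year estimate:
\[ \frac{28\ \text{petabytes/year}}{2000\ \text{hospitals}} = 14\ \text{TB per hospital per year}. \]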
Brain Imaging • Biomedical Informatics Research Network (BIRN): a reference set of brains provides essential data for developing therapies for neurological disorders (Multiple Sclerosis, Alzheimer’s disease) • Pre-BIRN: – one lab, small patient base – 4 TB collection • With TeraGrid: – tens of collaborating labs – larger population sample – 400 TB data collection: more brains, higher resolution – multiple-scale data integration, analysis 38
Earth Observations – ESA missions: • about 100 Gbytes of data per day (ERS-1/2) • 500 Gbytes per day for the next ENVISAT mission 39
Particle Physics • Simulate and reconstruct complex physics phenomena millions of times 40
Whole-system Simulations [Diagram: an aircraft decomposed into coupled sub-system models] • wing models: lift capabilities, drag capabilities, responsiveness • stabilizer models: deflection capabilities, responsiveness • engine models: thrust performance, reverse thrust performance, responsiveness, fuel consumption • landing gear models: braking performance, steering capabilities, traction, dampening capabilities • airframe models • crew/human models: accuracy, perception, stamina, reaction times, SOPs. NASA Information Power Grid: coupling all sub-system simulations 41
National Airspace Simulation Environment [Diagram: the Virtual National Air Space (VNAS) combines sub-system simulations across NASA centers – 44,000 wing runs, 50,000 engine runs (GRC), 66,000 stabilizer runs (ARC), 48,000 human crew runs, 22,000 airframe impact runs (LaRC), 132,000 landing/take-off gear runs – for the 22,000 commercial US flights a day; simulation drivers: FAA ops data, weather data, airline schedule data, digital flight data, radar tracks, terrain data, surface data] NASA Information Power Grid: aircraft, flight paths, airport operations and the environment are combined to get a virtual national airspace 42
Global In-flight Engine Diagnostics [Diagram: in-flight data flows over an airline global network (e.g., SITA) to a ground station, then to the DS&S Engine Health Center and on to the data centre and maintenance centre via internet, e-mail, and pager] Distributed Aircraft Maintenance Environment: Universities of Leeds, Oxford, Sheffield & York 43
Emergency Response Teams • Bring sensors, data, simulations and experts together – wildfire: predict movement of fire & direct fire-fighters – also earthquakes, peacekeeping forces, battlefields, … [Images: National Earthquake Simulation Grid; Los Alamos National Laboratory: wildfire] 44
Grid Computing Today: DISCOM, SInRG, APGrid, IPG, … 45
Selected Major Grid Projects (Name – URL & Sponsors – Focus)
• Access Grid – www.mcs.anl.gov/FL/accessgrid; DOE, NSF – Create & deploy group collaboration systems using commodity technologies
• BlueGrid (new) – IBM – Grid testbed linking IBM laboratories
• DISCOM – www.cs.sandia.gov/discom; DOE Defense Programs – Create operational Grid providing access to resources at three U.S. DOE weapons laboratories
• DOE Science Grid (new) – sciencegrid.org; DOE Office of Science – Create operational Grid providing access to resources & applications at U.S. DOE science laboratories & partner universities
• Earth System Grid (ESG) – earthsystemgrid.org; DOE Office of Science – Delivery and analysis of large climate model datasets for the climate research community
• European Union (EU) DataGrid – eu-datagrid.org; European Union – Create & apply an operational grid for applications in high energy physics, environmental science, bioinformatics 46
Selected Major Grid Projects
• EuroGrid, Grid Interoperability (GRIP) (new) – eurogrid.org; European Union – Create tech for remote access to supercomp resources & simulation codes; in GRIP, integrate with Globus Toolkit™
• Fusion Collaboratory (new) – fusiongrid.org; DOE Off. Science – Create a national computational collaboratory for fusion research
• Globus Project™ – globus.org; DARPA, DOE, NSF, NASA, Msoft – Research on Grid technologies; development and support of Globus Toolkit™; application and deployment
• GridLab (new) – gridlab.org; European Union – Grid technologies and applications
• GridPP (new) – gridpp.ac.uk; U.K. eScience – Create & apply an operational grid within the U.K. for particle physics research
• Grid Research Integration Dev. & Support Center (new) – grids-center.org; NSF – Integration, deployment, support of the NSF Middleware Infrastructure for research & education 47
Selected Major Grid Projects
• Grid Application Dev. Software – hipersoft.rice.edu/grads; NSF – Research into program development technologies for Grid applications
• Grid Physics Network – griphyn.org; NSF – Technology R&D for data analysis in physics expts: ATLAS, CMS, LIGO, SDSS
• Information Power Grid – ipg.nasa.gov; NASA – Create and apply a production Grid for aerosciences and other NASA missions
• International Virtual Data Grid Laboratory (new) – ivdgl.org; NSF – Create international Data Grid to enable large-scale experimentation on Grid technologies & applications
• Network for Earthquake Eng. Simulation Grid (new) – neesgrid.org; NSF – Create and apply a production Grid for earthquake engineering
• Particle Physics Data Grid – ppdg.net; DOE Science – Create and apply production Grids for data analysis in high energy and nuclear physics experiments 48
Selected Major Grid Projects
• TeraGrid (new) – teragrid.org; NSF – U.S. science infrastructure linking four major resource sites at 40 Gb/s
• UK Grid Support Center (new) – grid-support.ac.uk; U.K. eScience – Support center for Grid projects within the U.K.
• Unicore – BMBFT – Technologies for remote access to supercomputers
Also many technology R&D projects: e.g., Condor, NetSolve, Ninf, NWS. See also www.gridforum.org 49
TeraGrid • 13.6 trillion calculations per second • Over 600 trillion bytes of immediately accessible data • 40 gigabit per second network speed 50
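For a sense of scale, a back-of-the-envelope transfer time for that dataset, assuming the full 40 Gb/s could be used end to end:
\[ \frac{600 \times 10^{12}\ \text{bytes} \times 8\ \text{bits/byte}}{40 \times 10^{9}\ \text{bits/s}} = 1.2 \times 10^{5}\ \text{s} \approx 33\ \text{hours}. \]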
TeraGrid 51
European DataGrid [Map: testbed sites (>40), including Lund, RAL, Estec, KNMI, Berlin, IPSL, Prague, Paris, Brno, CERN, Lyon, Santander, Grenoble, Milano, PD-LNL, Torino, Madrid, Marseille, Pisa, BO-CNAF, Lisboa, Barcelona, ESRIN, Roma, Valencia, Catania] 52
UK e-Science Grid [Map: e-Science Centers at Edinburgh, Glasgow, DL, Belfast, Newcastle, Manchester, Oxford, Cardiff, RAL, Cambridge, London, Hinxton, Soton] 53
Asia-Pacific Grid (APGrid) [Map: APAN members – Japan, Australia, USA, Canada, Korea, Thailand, Taiwan, Singapore, Malaysia] 54
Grid goes to business • IBM, HP, Oracle, Sun, … • www.ibm.com/grid • www.hp.com/techservers/grid • www.oracle.com/technologies/grid • www.sun.com/grid 55
For More Information • Globus Project™ – www.globus.org • Grid Forum – www.gridforum.org • Book (Morgan Kaufmann) – www.mkp.com/grids 56