Скачать презентацию ESMF on the GRID Workshop July 20 2005 Скачать презентацию ESMF on the GRID Workshop July 20 2005

0570d74097859c45a8b2986b38bec4e2.ppt

  • Количество слайдов: 19

ESMF on the GRID Workshop July 20, 2005 Experiences from Simulating the Global Carbon ESMF on the GRID Workshop July 20, 2005 Experiences from Simulating the Global Carbon Cycle in a Grid Computing Environment Henry Tufo Computer Science Group Head Scientific Computing Division National Center for Atmospheric Research Associate Professor Department of Computer Science University of Colorado at Boulder

Motivation: NCAR as an Integrator q It is our position that NCAR must provide Motivation: NCAR as an Integrator q It is our position that NCAR must provide integrated solutions to the community. q Scientific workflows are becoming too complicated for manual (or semi-manual) implementation. q Not reasonable to expect a scientist to: q Design simulation solutions by chaining together application software packages q Manage the data lifecycle (check out, analysis, publishing, and check in) q Do this in an evolving computational and information environment q NCAR must provide the software infrastructure to allow scientist to seamlessly (and painlessly) implement their workflows, thereby allowing them to concentrate on what they’re good at: SCIENCE! q Goal is increased scientific productivity and requires an unprecedented level of integration of both systems and software. q Long term investment: return to the organization won’t show up in the bottom line immediately. Department of Compujter Science University of Colorado at Boulder 2

Motivation: Robust Modeling Environments q Our goal is to develop a simple, production quality Motivation: Robust Modeling Environments q Our goal is to develop a simple, production quality modeling environment for NCAR and the geoscience community that insulates scientists from the technical details of the execution environment q Cyberinfrastructure q System and software integration q Data archiving q Grid-BGC is an example of such an environment and is the first of these environments developed for NCAR q Learning as we develop and deploy q Tasked by the geoscience community, but developed services are applicable to other collaborative research projects Department of Compujter Science University of Colorado at Boulder 3

Outline q Introduction q Carbon Cycle Modeling q Grid-BGC Prototype Architecture q Experiences with Outline q Introduction q Carbon Cycle Modeling q Grid-BGC Prototype Architecture q Experiences with the Grid q Grid-BGC Production Architecture q Future Work Department of Compujter Science University of Colorado at Boulder 4

Introduction: Participants q This is a collaborative project between the National Center for Atmospheric Introduction: Participants q This is a collaborative project between the National Center for Atmospheric Research (NCAR) and the University of Colorado at Boulder (CU) q NASA has provided funding for three years via the Advanced Information Systems Technology (AIST) program q Researchers: q q q q q Peter Thornton (PI), NCAR Henry Tufo (co-PI), CU Luca Cinquini, NCAR Jason Cope, CU Craig Hartsough, NCAR Rich Loft, NCAR Sean Mc. Creary, CU Don Middleton, NCAR Nate Wilhelmi, NCAR Matthew Woitaszek, CU Department of Compujter Science University of Colorado at Boulder 5

Carbon Cycle Modeling: Workflow q Scientific models q Daymet q Biome-BGC q Daymet interpolates Carbon Cycle Modeling: Workflow q Scientific models q Daymet q Biome-BGC q Daymet interpolates a high resolution grid of weather observations for a region q Biome BGC calculates carbon cycle parameters at the individual grid points for each region q Models originally intended for analysis of small geographic regions. q Analysis of larger regions is accomplished by simulating its composite regions Department of Compujter Science University of Colorado at Boulder 6

Carbon Cycle Modeling: Grid-BGC Motivation Goal: Create an easy to use computational environment for Carbon Cycle Modeling: Grid-BGC Motivation Goal: Create an easy to use computational environment for scientists running large scale carbon cycle simulations. q Requires the management of multiple simultaneously executing workflows q Execution management q Data management q Task automation q Distributed resources across multiple organizations q Data archive and front-end portal are located at NCAR q Execution resources are located at CU Department of Compujter Science University of Colorado at Boulder 7

Grid-BGC Prototype Architecture (June, 2004): Overview q Main components q Web portal GUI q Grid-BGC Prototype Architecture (June, 2004): Overview q Main components q Web portal GUI q Job. Manager q Data. Mover q Prototype goals q Create a computational grid from GT 3. 2 q Create a reliable execution environment q Simplify usage and management Department of Compujter Science University of Colorado at Boulder 8

Experiences q Security breach during Spring 2004 impacted Grid-BGC’s design q Continually evolving requirement Experiences q Security breach during Spring 2004 impacted Grid-BGC’s design q Continually evolving requirement and component q One time password authentication q Short lived certificates q Unique requirements may make implementation difficult q Reliability and Fault Tolerance q Grid-BGC is self healing q Detect and correct failures without scientist intervention q Hold uncorrectable errors for administrative action Department of Compujter Science University of Colorado at Boulder 9

Experiences: Grid Development q Grid-BGC is the first production quality computational grid developed by Experiences: Grid Development q Grid-BGC is the first production quality computational grid developed by NCAR and CU. q Prototype development began January 2004 and ended September 2004 q More difficult than we anticipated q Pleasant persistence in the face of frustration is an essential quality for developers of a grid computing environment q We found that the middleware and development tools for GT 3. 2 were not “production grade” for our environment, but were rapidly improving q Distributed, heterogeneous, and asynchronous system development and debugging are difficult tasks q Developers’ mileage with grid middleware will vary q Globus components may or may not be useful q While critical components are available, more specific components will certainly need development Department of Compujter Science University of Colorado at Boulder 10

Experiences: Analysis of the Prototype q What we did well q Proof of concept Experiences: Analysis of the Prototype q What we did well q Proof of concept grid environment was a success q Portal, grid services, and automation tools create a simplified user environment q Fault tolerant q What we needed to improve q Modularize the monolithic architecture q Break out functionality q Move towards a service oriented architecture q Re-evaluate data management policies and tools q Grid. FTP / Replica Location Service (RLS) vs. Data. Mover q Misuse of NCAR MSS as temporary storage platform q Use the appropriate Globus compliant tools q Many Globus components can be used in place third-party or in-house tools q More recent release of GT have improved the quality and usability of the components and documentation Department of Compujter Science University of Colorado at Boulder 11

Grid-BGC Production Architecture (June, 2005): Overview q Production architecture address prototypes goals q Ease Grid-BGC Production Architecture (June, 2005): Overview q Production architecture address prototypes goals q Ease of use q Efficient and productive q Additionally, production architecture addresses q Modular design q GT 4 Compliance q Restructuring execution and data management Department of Compujter Science University of Colorado at Boulder 12

Grid-BGC Production Architecture (June, 2005): Overview Department of Compujter Science University of Colorado at Grid-BGC Production Architecture (June, 2005): Overview Department of Compujter Science University of Colorado at Boulder 13

Grid-BGC Production Architecture Department of Compujter Science University of Colorado at Boulder 14 Grid-BGC Production Architecture Department of Compujter Science University of Colorado at Boulder 14

Grid-BGC Production Architecture: Improvements q Usage of more Globus Toolkit components and services q Grid-BGC Production Architecture: Improvements q Usage of more Globus Toolkit components and services q WS-GRAM q Grid. FTP and Reliable File Transfer (RFT) service q Restructuring data management q Remove of Data. Mover, replace with RFT q Use Replica Location Service (RLS) as needed q Restructuring execution management q Removal of Job. Manager, replace with WS GRAM and a custom client q Develop a simple workflow manager q Modularize components q q Break out individual components from the monolithic kernel Create a service oriented architecture Separate functional components Generic and reusable Department of Compujter Science University of Colorado at Boulder 15

Grid-BGC Production Architecture: GT 4 Experience q Improvements in Grid middleware immediately useful in Grid-BGC Production Architecture: GT 4 Experience q Improvements in Grid middleware immediately useful in our environment q My. Proxy officially supported q WS GRAM meets our expectations q RFT and Grid. FTP q Porting Grid-BGC prototype was easier than expected q Modular design limited the amount of changes needed q Improved documentation of GT components Department of Compujter Science University of Colorado at Boulder 16

Future Work: Expansion of the Grid-BGC Environment q Integrate NASA’s Columbia Supercomputer into the Future Work: Expansion of the Grid-BGC Environment q Integrate NASA’s Columbia Supercomputer into the Grid-BGC environment q Late Summer 2005 q Deploy the computational framework and components q End-to-end testing between distributed Grid-BGC components q Integrate resources provided by the system’s users Department of Compujter Science University of Colorado at Boulder 17

Future Work: CU / NCAR HTWM Department of Compujter Science University of Colorado at Future Work: CU / NCAR HTWM Department of Compujter Science University of Colorado at Boulder 18

Experiences from Simulating the Global Carbon Cycle in a Grid Computing Environment This research Experiences from Simulating the Global Carbon Cycle in a Grid Computing Environment This research was supported in part by the National Aeronautics and Space Administration (NASA) under AIST Grant AIST-02 -0036, the National Science Foundation (NSF) under ARI Grant #CDA 9601817, and NSF sponsorship of the National Center for Atmospheric Research. Questions? Ideas? Comments? Suggestions? http: //www. gridbgc. ucar. edu tufo@cs. colorado. edu http: //csc. cs. colorado. edu