APAC National Grid: A technology response to diverse requirements
Ian Atkinson, James Cook University (www.jcu.edu.au) and Queensland Cyberinfrastructure Foundation (www.qcif.edu.au)
Lindsay Hood, Australian Partnership for Advanced Computing (www.apac.edu.au)

Australian Partnership for Advanced Computing
“providing national advanced computing, data management and grid services for eResearch”
Partners:
• Australian Centre for Advanced Computing and Communications (ac3) in NSW
• CSIRO
• iVEC, The Hub of Advanced Computing in Western Australia
• Queensland Cyber Infrastructure Foundation (QCIF)
• South Australian Partnership for Advanced Computing (SAPAC)
• The Australian National University (ANU)
• The University of Tasmania (TPAC)
• Victorian Partnership for Advanced Computing (VPAC)
4500 CPUs, 3 PB storage

Recent Review
APAC in the future must be regarded not just as the National Facility, but as the sum of its component parts comprising: […] The National Grid […]
That the APAC National Grid must be the pre-eminent grid in Australia and continue extending its coverage to include capabilities wherever they exist or develop. It must also nurture and support scientific research teams, NCRIS infrastructure and international partnerships.

Concept of the APAC National Grid
APAC National Grid: a virtual system of computing, data storage and visualisation facilities
Diagram labels: Research Teams, Data Centres, Sensor Networks, Instruments, Other Grids (Institutional, International)

NCRIS - National Collaborative Research Infrastructure Strategy
• National plan to invest AU$500M in medium-scale, collaborative-access research infrastructure across 5 years, 2007-2011
• 15 investment areas of interest including:
  – bioinformatics, biosecurity, geosciences, astronomy, marine and terrestrial observation systems, structural characterization
• APAC will now be funded via NCRIS
  – APAC and the National Grid must directly support the NCRIS investment areas as a high priority
  – NCRIS investments are expected to develop and execute plans to ensure e-Research (cyberinfrastructure) tools and practices are embedded into their practices and data management
Data management is now a hot topic in Australia!

NCRIS Platforms for Collaboration Vision

APAC National Grid Core Services – built on Globus
• Portal Tools: GridSphere
• Info Services (JCU): MDS 2/4, MIP
• Security: APAC CA (PKI), Grix, MyProxy, VOMRS
• Systems: Gateways, Partners’ systems
• Network: AARNet, APAC Private Network ? (AARNet)
• Partner sites shown: QCIF, APAC National Facility, iVEC, ANU, SAPAC, ac3, VPAC, CSIRO, TPAC
<15 staff to deliver all services!

Some requirements
• Non-dedicated resources (at partner sites)
• Varied middleware requirements (many domains to support)
• Complex virtual organisation structure
• Distributed data, workflows
• Simplified interface
This turns out to be hard!

Requirements Analysis circa mid 2006

Gateways
• Rapid churn in the middleware
  – You need lots of test machines
• Different communities want different middleware
  – GT2, GT4, GridSphere, SRB …
• Minimises interaction with non-grid-dedicated production systems
• Virtualisation is well understood technology
• Xen has a nice price

The gateway concept
• Grid middleware evolving
• Security across firewalls & institutional policies are problems
• Using gateway virtual machines to isolate production compute/storage elements from all this change
  – ng1 - Globus 2
  – ng2 - Globus 4
  – ngdata - GridFTP, other data (SRB)
  – ngportal - web application portals
  – Others are easy to build and deploy
• But some parts of GT2 especially assume they are running on the cluster mgmt/head node...
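The VM naming above amounts to a small routing table from middleware stack to gateway VM. A minimal sketch of that idea follows; the VM names come from the slide, while the service keys, the `gateway_host` helper and the `<vm>.<site domain>` hostname scheme are assumptions made purely for illustration.

```python
# Gateway VM names as listed on the slide; the service keys are hypothetical labels.
GATEWAY_VMS = {
    "gt2": "ng1",         # Globus 2
    "gt4": "ng2",         # Globus 4
    "gridftp": "ngdata",  # GridFTP and other data services
    "srb": "ngdata",
    "portal": "ngportal", # web application portals
}


def gateway_host(service, site_domain):
    """Return the gateway VM hostname that fronts `service` at a site.

    The '<vm>.<site_domain>' naming scheme is an assumption, not a
    documented APAC convention.
    """
    try:
        vm = GATEWAY_VMS[service.lower()]
    except KeyError:
        raise ValueError(f"no gateway VM defined for service {service!r}") from None
    return f"{vm}.{site_domain}"
```

For example, `gateway_host("GT4", "site.example")` would route a Globus 4 request to the `ng2` VM at that (hypothetical) site, leaving the production cluster behind it untouched by middleware changes.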

Gateways on a National Backbone
• Installed gateway servers at all grid sites, using VM technology to support multiple grid stacks
• Gateways supporting GT2, GT4, LCG, grid portals, and experimental grid stacks
• High bandwidth, dedicated, secure private network between grid sites (Layer 3 private network)
• Consistency, ease of implementation, performance?
Diagram labels: Cluster, Datastore, HPC, Gateway Server

Virtual Organisations
• International use tends to be large VOs
• Australia demands small, dynamic VOs
• VOMS/VOMRS has problems
  – Admin security model
  – MyProxy interaction
  – gridmapfile – a user can be in one VO
• Adopting PRIMA/GUMS
  – Still complicated and not especially dynamic
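One reading of the gridmapfile point is that a lookup keyed only on the certificate subject (DN) can return just one local account, so a user cannot easily act under several VOs. The sketch below illustrates this with a simplified parser for `"DN" account` lines; the sample DN and account names are invented, and keeping the first entry per DN is an assumption made for illustration, not a statement about how any particular Globus release resolves duplicates.

```python
import shlex


def parse_gridmap(text):
    """Map each certificate DN to a single local account.

    Expects lines of the form:  "<quoted DN>" <account>[,<account>...]
    Only the first account of the first entry for a DN is kept
    (an illustrative choice).
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        dn, accounts = shlex.split(line)
        mapping.setdefault(dn, accounts.split(",")[0])
    return mapping


# Hypothetical entries: the same person listed under two VO accounts.
SAMPLE = '''
"/C=AU/O=Example/CN=Jane Doe" vo_chem
"/C=AU/O=Example/CN=Jane Doe" vo_geo
'''
```

Parsing `SAMPLE` yields a single entry for Jane Doe, so the second VO mapping is effectively invisible to anything that authorises on the DN alone.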

“Workflows”
• Many existing HPC users have significant shell scripts, and queue commands (PBS)
• WSGRAM, JSDL, BPEL may be human readable, but not human writable!
• Abstraction of HPC systems is tough
  – e.g. SGI’s profile.pl doesn’t handle cpusets correctly
• Working on a gsub that will take the majority of batch scripts and run them on the grid
  – User doesn’t have to learn JSDL, WSGRAM …
• Unicore-like client GUI would be neat
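The first step of a gsub-style tool would presumably be to read the `#PBS` directives a user already has in their batch script. The following is a rough sketch of that step only, not the actual gsub implementation: `parse_pbs_script` is a hypothetical helper, and it handles just the common `-N` (job name), `-q` (queue) and `-l` (resource list) options.

```python
import shlex


def parse_pbs_script(script_text):
    """Extract a neutral job description from a PBS batch script."""
    job = {"name": None, "queue": None, "resources": {}, "commands": []}
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#PBS"):
            tokens = shlex.split(stripped[len("#PBS"):])
            # Directives come as flag/value pairs, e.g. "-N plot".
            for flag, value in zip(tokens[0::2], tokens[1::2]):
                if flag == "-N":
                    job["name"] = value
                elif flag == "-q":
                    job["queue"] = value
                elif flag == "-l":
                    for item in value.split(","):
                        key, _, val = item.partition("=")
                        job["resources"][key] = val
        elif stripped and not stripped.startswith("#"):
            job["commands"].append(stripped)  # the actual work to run
    return job


EXAMPLE_SCRIPT = """#!/bin/bash
#PBS -N plot
#PBS -q normal
#PBS -l walltime=01:00:00,ncpus=4
cd $PBS_O_WORKDIR
gnuplot control.txt
"""
```

The resulting dictionary could then be translated into whatever the grid middleware expects (JSDL, WS-GRAM), which is the part the user no longer has to write by hand.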

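JSDL (1)
<jsdl:JobDefinition
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <jsdl:JobDescription>
    <jsdl:JobIdentification>
      <jsdl:JobName>My Gnuplot invocation</jsdl:JobName>
      <jsdl:Description>
        Simple application invocation: User wants to run the application
        'gnuplot' to produce a plotted graphical file based on some data
        shipped in from elsewhere (perhaps as part of a workflow). A
        front-end application will then build into an animation of
        spinning data. Front-end application knows URL for data file which
        must be staged-in. Front-end application wants to stage in a
        control file that it specifies directly which directs gnuplot to
        produce the output files. In case of error, messages should be
        produced on stderr (also to be staged on completion) and no images
        are to be transferred.
      </jsdl:Description>

JSDL (2)
    </jsdl:JobIdentification>
    <jsdl:Application>
      <jsdl:ApplicationName>gnuplot</jsdl:ApplicationName>
      <jsdl-posix:POSIXApplication>
        <jsdl-posix:Executable>/usr/local/bin/gnuplot</jsdl-posix:Executable>
        <jsdl-posix:Argument>control.txt</jsdl-posix:Argument>
        <jsdl-posix:Input>input.dat</jsdl-posix:Input>
        <jsdl-posix:Output>output1.png</jsdl-posix:Output>
      </jsdl-posix:POSIXApplication>
    </jsdl:Application>
    <jsdl:Resources>
      <jsdl:IndividualPhysicalMemory>
        <jsdl:LowerBoundedRange>2097152.0</jsdl:LowerBoundedRange>
      </jsdl:IndividualPhysicalMemory>
      <jsdl:TotalCPUCount>
        <jsdl:Exact>1.0</jsdl:Exact>
      </jsdl:TotalCPUCount>
    </jsdl:Resources>

JSDL (3)
    <jsdl:DataStaging>
      <jsdl:FileName>control.txt</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Source>
        <jsdl:URI>http://foo.bar.com/~me/control.txt</jsdl:URI>
      </jsdl:Source>
    </jsdl:DataStaging>
    <jsdl:DataStaging>
      <jsdl:FileName>input.dat</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Source>
        <jsdl:URI>http://foo.bar.com/~me/input.dat</jsdl:URI>
      </jsdl:Source>
    </jsdl:DataStaging>
    <jsdl:DataStaging>
      <jsdl:FileName>output1.png</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Target>
        <jsdl:URI>rsync://spoolmachine/userdir</jsdl:URI>
      </jsdl:Target>
    </jsdl:DataStaging>
  </jsdl:JobDescription>
</jsdl:JobDefinition>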
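Since JSDL of this kind is tedious to write by hand, a client tool would normally generate it. The sketch below builds a reduced version of the gnuplot job with Python's standard `xml.etree.ElementTree`; the `build_jsdl` helper and its parameters are hypothetical, and only a subset of the elements (job name, executable, arguments, stage-in files) is emitted.

```python
import xml.etree.ElementTree as ET

JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(name, executable, arguments, stage_in):
    """Return a JSDL document (as a string) for a simple POSIX job.

    stage_in maps local file names to the source URIs they are fetched from.
    """
    ET.register_namespace("jsdl", JSDL)
    ET.register_namespace("jsdl-posix", POSIX)

    root = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(root, f"{{{JSDL}}}JobDescription")

    ident = ET.SubElement(desc, f"{{{JSDL}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL}}}JobName").text = name

    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg

    for filename, uri in stage_in.items():
        staging = ET.SubElement(desc, f"{{{JSDL}}}DataStaging")
        ET.SubElement(staging, f"{{{JSDL}}}FileName").text = filename
        ET.SubElement(staging, f"{{{JSDL}}}CreationFlag").text = "overwrite"
        ET.SubElement(staging, f"{{{JSDL}}}DeleteOnTermination").text = "true"
        source = ET.SubElement(staging, f"{{{JSDL}}}Source")
        ET.SubElement(source, f"{{{JSDL}}}URI").text = uri

    return ET.tostring(root, encoding="unicode")
```

Calling `build_jsdl("My Gnuplot invocation", "/usr/local/bin/gnuplot", ["control.txt"], {"control.txt": "http://foo.bar.com/~me/control.txt"})` produces the skeleton of the document shown on the slides; resource requirements and stage-out targets would be added the same way.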

Portals
• Web browser is the interface to everything…
• Present simple interface to the underlying resource
• Good model for many users and applications
• GridSphere adopted as the web grid standard
  – GT4 based ngportal VM
• Java CoG Kit for standalone “portal” apps
  – Chemistry Java application with molecular editor being developed
  – Desktop job submission tool from iVEC
http://www.grid.apac.edu.au/Services/ProductionPortals

Data services
• Data is hard
  – Different communities have different needs
  – Complex access controls
• We have GridFTP between sites
  – Network consistency is interesting …
  – Data staging has to
• SRB today; iRODS later
• dCache, SRM, Gfarm as communities require
• Credible use cases for a global file system?

Registry services
• MDS2, MDS4 running
• About to deploy Modular Information Provider to present site and aggregated information more easily
• Using GLUE schema, but it’s far from satisfactory for describing real-world production HPC resources

Improved AAA Services
• NCRIS will require e-research services to a much wider community than traditional HPC
  – PKI doesn’t scale and is conceptually difficult for non-IT focused users
• Australian Access Federation funded (2007 -_
• IAM Suite from MELCOE
  – Shibboleth authentication plus appropriate attributes generates short lived certificate (www. identiy
  – Tools for users to easily create shared workspaces and manage attribute release
• Only a few people will need real certificates
• But probably a year away before being ready for prime time

IAM Suite (architecture diagram)
Components shown: Federation, Search, Login, AFS adaptor, Federation SP, Fedora (internal or external, e.g. IR), GTK Storage, GTK Cluster, GTK Specific tools, IdP, VO-WAYF, GridSphere, VO-IdP, GroupModule, ShARPE, AuthN, IM, Autograph, FedoraWeb, Presence, Calendar, AuthZ Mgnr, Equipm., Forum, Wiki, LMS, PeoplePicker, VO-SP, etc.
Flows shown: receive assertions; receive proxy cert via MyProxy
www.federation.org.au
Macquarie University’s E-Learning Centre of Excellence (MELCOE), Erik Vullings

APAC National Grid Status
• Essentially operational
  – core services implemented
    • APAC CA and MyProxy, VOMRS, GT2, GT4, GridSphere, SRB
  – some applications close to ‘production’ mode
  – See http://goc.grid.apac.edu.au; http://www.grid.apac.edu.au
• Systems coverage
  – users can access ALL systems at APAC partners
    • via gateways
    • access from the desktop is needed
  – about 4600 processors and 100s of Tbytes of disk
  – around 3 Pbytes of disk-cached HSM systems

Future Strategies
• Expand the user base
  – NCRIS, Merit Allocation Scheme, Partners
  – Open access to core grid services
• Expand the services
  – Workflow engines and tools: Kepler, Taverna
  – Data management: metadata support
• Expand the facilities
  – Include major data centres
    • data from instruments, government agencies
  – Include institutional systems and repositories
• Resulting changes:
  – Policies: acceptable service provision
  – Organisation: coordinated user support
  – Architecture: scaling gateways
  – Technologies: attribute-based authorisation

Changing User Base
• National Collaborative Research Infrastructure Strategy
  – Ambitious plan to hand out $0.5B of federal money to fund research infrastructure collaboratively
• Evolving Biomolecular Platforms and Informatics
• Integrated Biological Systems
• Characterisation
• Fabrication
• Biotechnology Products
• Networked Biosecurity Framework
• Optical and Radio Astronomy
• Integrated Marine Observing System
• Structure and Evolution of the Australian Continent
• Terrestrial Ecosystem Research Network
• Population Health and Clinical Data Linkage
Amounts as listed on the slide: $50.0M, $47.7M, $41.0M, $35.0M, $25.0M, $45.0M, $55.2M, $42.8M, $20.0M
• Platforms for Collaboration $75.0M

Summer in Australia?

New Names and Structures
• APAC Nat. Fac. >1600 cpu Altix
• Shoulder clusters
E-Research services: Interoperation and Collaboration Services (ICS)
• Old APAC Grid
Aust. Nat. Data Service (ANDS)
• Federation of Mass Data Stores
• Long term archiving and curation
National Coordination Council
National Compute Infrastructure (NCI)
Australian Access Federation (AAF), AREN - Network

Bringing it all together: real applications

Future Strategies
• Expand the user base
  – NCRIS, Merit Allocation Scheme, Partners
  – Open access to core grid services
• Expand the services
  – Workflow engines and tools: Kepler, Taverna
  – Data management: metadata support, collections registry
• Expand the facilities
  – Include major data centres
    • data from instruments, government agencies
  – Include institutional systems and repositories
• Resulting changes:
  – Policies: acceptable service provision
  – Organisation: coordinated user support
  – Architecture: scaling gateways
  – Technologies: attribute-based authorisation

Source: Office of Integrative Activities, NSF

Three “views” of the Grid
“The grid lets me run lots of jobs all over the place” – Nimrod, Gridbus
“The grid lets me build a workflow that uses distributed resources” – Kepler, Taverna
“The grid lets me scale my workstation model to a supercomputer seamlessly” – DEISA
So grid means different things to different communities
We must deliver production quality services for all of them?

Grid Collaboration
• Beyond data and compute is the AccessGrid
• Australia has had a long-term commitment to the AG
• Small, highly dispersed population
  – Queensland has the most distributed population in Aust.
• AG is still burdened by its “antique” media tools, but the concept is essential
  – Skype and IP video conferencing are insufficient

Access Grid - ATP, Sydney
1st in Australia… participating in SC-Global (Nov 2001)

Australia’s 2nd AG node - in Qld.
JCU, Townsville, April 2002
Minister Hon Paul Lucas

AccessGrid
• AccessGrid is now very widely available in Australia
  – Most universities have several nodes
  – Extensive use in teaching (AMSI)
• New tools being developed
  – HD codecs (Chris Willing, UQ)
  – SRB data grid integration with AccessGrid (Atkinson / Willing)
  – International Quality Assurance Program
  – Better multicast / unicast integration

SRB Browser: Connecting the DataGrid to the AccessGrid
• AG Shared Application
• New SRB Java/Python interface library written (now part of SRB)
• All AG clients can share in data from SRB data store
• Cross platform
• Exposes SRB metadata
Diagram labels: AG Vic video, AG Rat audio, AG VenueClient, SRB Browser; files moved from SRB to/from AG data manager
Nigel Bajema

Acknowledgment
• Lindsay Hood, APAC Grid Manager
• Rhys Francis, Former APAC Grid Manager
• David Bannon, VPAC Gateway project manager
• Rob Woodcock, CSIRO Minerals Exploration