  • Number of slides: 59

"Grid middleware is easy to install, configure, secure, debug and manage - across multiple sites"
"One can't believe impossible things"
UK OGSA Evaluation Project (UCL, Imperial, Newcastle, Edinburgh)
(Full list of project members)
Paul Brebner
University College London
P.Brebner@cs.ucl.ac.uk

Grid Complexity – The Grid will be BIG

Grid Complexity - growing

Grid Complexity – built on the internet

Grid Complexity – but more complex

Grid Simplicity – Start with something simple
• OGSA – OGSI
• GT3.2 – exemplar of a Grid SOA
• Initially evaluate installation, configuration, and security
• Then performance and scalability, deployment, architectural choices, etc.

Grid Realism – But realistic test-bed
• Heterogeneous platforms
  – Linux, Solaris, Windows
• Cross-organisational
  – Four nodes
  – Independently administered
  – Firewalls and access restrictions
• Security
  – UK e-Science CA

Grid Confusion – What is Globus?
• How is Globus intended to be used?
  – 1: Science as first-order services: middleware for building and hosting Grid applications, by exposing science code as Grid services.
  – 2: Middleware as services: a set of high-level Grid services, composed to provide new Grid functionality. Science isn’t a first-order service, but is managed by Grid services.

Grid Confusion – Science services or Grid services
[Diagram, built up over three slides. Labels: Client 1, Client 2, E=mc², D=A+2B+C²]

Grid Confusion – How to evaluate
• Do we evaluate GT3 as middleware for hosting Grid services, or as a toolkit for constructing Grid middleware?
• If the first, only need GT3 Core – just the container. If the second, need “All Services” (and more – there’s no scheduler).

Grid Simplicity – Incremental
• Start with Core Package
• Add Security
• Then try “All Services”
• Simple enough – in theory

Grid Steps – single node
[Diagram, built up over four slides. Labels: Install, Configure, Deploy, Run, GT3, OS/HW]

Grid Steps – Multiple sites
[Diagram, built up over five slides. Labels: GT3 (several nodes), Interoperate, Secure, Manage]

Grid Reality – What we found
• Port number management
• Host access
• Remote visibility of installation, container, services
• Installation by System Administrators
• Tomcat or Test container
• Compilation issues on Solaris
• Exponential increase in testing complexity as number of nodes increases.

Grid Reality – What we found
• Port number management
  – Port number conflicts (with other services)
  – What port is the container running on?

Grid Reality – What we found
• Host access
  – Is the container visible on that port externally?
  – From which machines?
  – For which users?
  – Non-trivial to test/debug if/when something goes wrong
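
The port and host-access questions above can be probed with a plain TCP connection test. The following is a minimal sketch, not part of the project's tooling: the function names are hypothetical, and it only answers "is something accepting connections on this port from this machine". It says nothing about which users are allowed, and it has to be run from each client machine to answer "from which machines?".

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts alike.
        return False


def find_listening_ports(host: str, candidates: list, timeout: float = 1.0) -> list:
    """Return the subset of candidate ports that accept TCP connections."""
    return [p for p in candidates if port_is_reachable(host, p, timeout)]
```

A failed probe does not distinguish a stopped container from a firewall rule, which is part of why the slide calls this non-trivial to debug.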

Grid Reality – What we found
• Remote visibility of installation, container, services
  – What infrastructure is installed?
  – What packages and versions?
  – How is it configured?
  – What state is it in?

Grid Reality – What we found
• Installation by System Administrators
  – Division of roles
  – Didn’t meet expectations
  – Extra effort to support multiple roles
    • System Administrators – install, configure and secure
    • Globus Administrators – test, maintain
    • Globus Developers – develop, deploy, test/use Grid services

Grid Reality – What we found
• Tomcat or Test container
  – Differences in deployment, configuration, and management
  – With Tomcat, increased potential for centralised management, and sand-boxing of run-time environment

Grid Reality – What we found
• Compilation issues on Solaris
  – Took longer than expected
  – Only Linux testing and support can be taken for granted

Grid Reality – What we found
• Exponential increase in testing complexity as number of nodes increases
  – Testing (and maintaining) interoperability between m client machines and n servers gets complicated.
  – How well will this scale for 100s, 1000s of nodes?
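
To give a feel for how the interoperability test matrix grows, here is a rough sketch that enumerates client/server combinations. The two extra dimensions (security on or off, Test container or Tomcat) are taken from topics elsewhere in these slides, but treating them as independent axes of one matrix is an assumption made for illustration. The pairwise client/server count on its own grows as m × n; each extra dimension multiplies it further.

```python
from itertools import product


def interop_test_cases(clients, servers,
                       security_modes=("none", "secure"),
                       containers=("test", "tomcat")):
    """Enumerate every (client, server, security mode, container) combination."""
    return list(product(clients, servers, security_modes, containers))


# The four test-bed sites named on the title slide, each acting as client and server.
SITES = ["UCL", "Imperial", "Newcastle", "Edinburgh"]

if __name__ == "__main__":
    print(len(interop_test_cases(SITES, SITES)), "cases for the 4-node test-bed")
    for n in (100, 1000):
        nodes = range(n)
        # Count without materialising the list: m * n * modes * containers.
        print(n, "nodes:", n * n * 2 * 2, "cases")
```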

Grid Reality – Security
• In theory just had to
  – obtain (and update) host, client, and CA certificates
  – convert
  – install
  – configure
  – generate (and update) proxies.
• However, parts of “All Services” package also needed.

Grid Security - What we found
• Interactions between security for multiple installations
• Essential to test non-secure interoperability first
• Windows client-side security
• Testing and viewing security configuration
• Debugging secure calls
• Client side security is programmatic
• Security management scalability
  – Construction and maintenance of user accounts and grid-map file entries.

Grid Security - What we found
• Interactions between security for multiple installations
  – For testing may want
    • multiple versions, or duplicates (with different configurations) of same versions.
    • One container with no security, and another container with security
  – May want test/production environments

Grid Security - What we found
• Essential to test non-secure interoperability first
  – Trying to test interoperability and security simultaneously wasn’t fun

Grid Security - What we found
• Windows client-side security
  – Still haven’t got it working
  – Not obvious exactly what parts of Globus are needed for client side code with security (no “client plus security” package).

Grid Security - What we found
• Testing and viewing security configuration
  – Need to be able to view/edit and check security configuration for containers and services
  – Confusion about hierarchical security settings
    • Virtual Organisations, clusters, servers, containers, factories, services, methods, and instances.
  – Remotely
  – Validate security deployment before run-time

Grid Security - What we found
• Debugging secure calls (or any stateful service)
  – Proxy interceptor approach (e.g. TCPMON) won’t work with stateful services
    • As the grid handle returned to the client contains the port number of the instance, not the proxy
  – But proxies are an important design pattern for SOAs…
  – GT4/WS-RF may be different
    • Handle resolvers, WS-Addressing and WS-RenewableReferences
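
The proxy problem above comes down to comparing two endpoints: the address the client first called (the interceptor) and the address embedded in the handle it got back (the instance). The sketch below only illustrates that comparison; the function name, host names, ports and paths are all hypothetical and are not taken from GT3.

```python
from urllib.parse import urlsplit


def bypasses_proxy(proxy_url: str, handle_url: str) -> bool:
    """Return True if calls made through handle_url would skip the proxy.

    A call skips the proxy when the handle's host or port differs from
    the proxy's, because the client then connects to the instance directly.
    """
    proxy, handle = urlsplit(proxy_url), urlsplit(handle_url)
    return (proxy.hostname, proxy.port) != (handle.hostname, handle.port)


# Hypothetical example: the client calls a factory through an interceptor
# listening on port 8081, but the returned handle names the container's port.
PROXY = "http://localhost:8081/services/ExampleFactory"
HANDLE = "http://localhost:8080/services/ExampleFactory/instance-1"

if __name__ == "__main__":
    print("subsequent calls bypass the interceptor:", bypasses_proxy(PROXY, HANDLE))
```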

Grid Security - What we found
• Client side security is programmatic
  – Client side code modifications required to call services/methods with required protocols
  – Should be declarative
  – Sensitive to server side security credentials

Grid Security - What we found
• Security management scalability
  – Construction and maintenance of user accounts and grid-map file entries.
  – For each server, each user needs an account, and an entry in the container gridmap file (mapping client certificate to account)
  – May also need service specific gridmap files
  – Not scalable for large numbers of users, services.
• Alternatives?
  – Tool support
  – Role based authentication
  – Shared accounts or certificates
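
As a rough illustration of the bookkeeping described above, the sketch below generates gridmap entries (a quoted certificate subject followed by a local account name) and counts how many entries have to be kept in step. The distinguished names and accounts are made up, and the counting function assumes the worst case in which every user appears in every service-specific gridmap file.

```python
def gridmap_line(subject_dn: str, local_account: str) -> str:
    """Format one gridmap entry: quoted certificate subject, then local account."""
    return f'"{subject_dn}" {local_account}'


def gridmap_for_server(users: dict) -> str:
    """Build gridmap file content from a {subject DN: local account} mapping."""
    lines = [gridmap_line(dn, account) for dn, account in sorted(users.items())]
    return "\n".join(lines) + "\n"


def entries_to_maintain(n_users: int, n_servers: int,
                        service_maps_per_server: int = 0) -> int:
    """Worst-case count of gridmap entries across all servers.

    Each server has one container gridmap plus optional service-specific
    gridmaps, and every user is assumed to appear in each of them.
    """
    return n_users * n_servers * (1 + service_maps_per_server)


# Hypothetical users; these subject names are illustrative only.
EXAMPLE_USERS = {
    "/C=UK/O=eScience/OU=Example/CN=alice example": "alice",
    "/C=UK/O=eScience/OU=Example/CN=bob example": "bob",
}

if __name__ == "__main__":
    print(gridmap_for_server(EXAMPLE_USERS), end="")
    print(entries_to_maintain(100, 4, service_maps_per_server=2), "entries")
```

The count does not include the matching local user accounts, which the slide notes are also needed on each server.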

Grid Recommendations
• If Globus is middleware, then need:
  – Platform independent, automatic, installation.
  – Tool support for configuration and deployment creation, validation, viewing and editing.
  – Management console for grid, nodes, Globus packages, containers and services.
  – Support for remote, location independent, cross-organisational, multiple role scenarios.

Grid Recommendations (continued)
• If Globus is middleware, then need:
  – Remote deployment and management of services.
  – Remote distributed debugging of grid installations, services, and applications.
  – Tool support, and more scalable processes for security.

Grid Alternatives
• Next we plan to evaluate the two architectural choices in more detail
  – Science exposed as services, vs science code managed by higher level grid services.
• Explore alternative mechanisms for:
  – Load balancing and resource management
  – Directory services (service and resource discovery)
  – Data movement approaches (e.g. SOAP Attachments vs GridFTP)

Grid Performance
• First approach (initial results)
  – Scientific benchmark (SciMark 2.0) modified to measure throughput, and invoked as a Stateful Grid Service
  – Metric is Calls Per Minute (CPM) – one call is one unit of work.
  – No data movement, just computation and memory load.
  – JVM: 512 MB heap and -server (of course)
• Good performance and scalability
  – Security has minimal overhead
  – Problem with client side timeouts as response times increase
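
The Calls Per Minute metric is simply completed calls scaled to a 60-second window. A minimal sketch of how a client-side driver might compute it follows; this is not the project's benchmark harness, and `call` stands in for whatever invokes one unit of work on the service.

```python
import time


def calls_per_minute(completed_calls: int, elapsed_seconds: float) -> float:
    """Convert a count of completed calls over an interval into CPM."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completed_calls * 60.0 / elapsed_seconds


def measure_cpm(call, duration_seconds: float) -> float:
    """Invoke call() back to back for roughly duration_seconds and report CPM.

    Only calls that return are counted; an exception propagates to the caller.
    """
    start = time.perf_counter()
    completed = 0
    while time.perf_counter() - start < duration_seconds:
        call()
        completed += 1
    elapsed = time.perf_counter() - start
    return calls_per_minute(completed, elapsed)
```

A single sequential client like this measures one client's view; aggregate throughput would need several drivers running concurrently and their CPM figures summed.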

Grid Performance
[Chart: Tomcat. Fastest: 3.6 s (Edinburgh); slowest: 25 s (UCL)]

Grid Performance
95% of predicted maximum throughput

Grid Performance
• Tomcat vs Test container
  – No difference on 3 out of 4 nodes
  – But 67% faster on one node (Newcastle, slowest Intel box)
• Attachments will work with GT3 and Tomcat
  – But not with security
  – Limit of 1 GB (DIME)
  – Bug in Axis – doesn’t clean up temporary files.

Grid Performance
• Stateful instances can be problematic
  – Intermittent unreliability
    • On some runs, 1 exception in 300 calls (reliability of 0.9967)
  – But non-repeatable, SOAP/network related?
    • What is the safe response to exceptions? Can’t just retry.
  – Possible to kill container (relies on clients being well behaved):
    • By invoking same instance/method more than once.
    • By consuming container resources
  – But instances can be passivated/activated in theory
  – Could be used to enable fine-grain (per instance) control over resource usage.
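
The reliability figure above is one minus the observed exception rate. The sketch below reproduces that arithmetic and, as an extra step not in the slides, estimates how likely a whole run is to finish without any exception. That second function assumes failures are independent and equally likely on every call, which the "non-repeatable" observation neither confirms nor rules out.

```python
def per_call_reliability(exceptions: int, calls: int) -> float:
    """Fraction of calls that completed without an exception."""
    if calls <= 0:
        raise ValueError("calls must be positive")
    return 1 - exceptions / calls


def clean_run_probability(reliability: float, calls: int) -> float:
    """Probability that `calls` consecutive calls all succeed.

    Assumes independent failures with a constant per-call reliability.
    """
    return reliability ** calls


if __name__ == "__main__":
    r = per_call_reliability(1, 300)
    print(f"per-call reliability: {r:.4f}")
    print(f"chance of a clean 300-call run: {clean_run_probability(r, 300):.2f}")
```

Under that independence assumption, a per-call reliability that looks high still leaves a 300-call run more likely than not to hit at least one exception, which is why the question of a safe response matters.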

Grid Deployment
• How to install and configure Grid infrastructure and services - scalably and securely?
• Install GT3 infrastructure and security manually
  – MMJFS allows executable code to be staged automatically (but not services - could provide a deployment service).
• Install bootstrapping code, and then install and deploy all other code and security automatically.
  – Using SmartFrog (HP) in the lab, and then test-bed.
  – Configuring GT3 security remotely is an open issue, as is “trust” with System Administrators.

Grid Dreams - Debugging
• Debugging distributed systems is tricky
  – Need better support for cross-cutting non-functional concerns such as deployment and debugging.
  – (One) problem with debugging services is not knowing the context of errors (to aid diagnosis or cure) – a service is just an interface.
• Deployment aware debugging:
  – Starting from functional work-flows, generate deployment-flows, which are executed prior to, or concurrent with, functional work-flows.
  – If there is a failure in a functional work-flow, then the corresponding deployment-flow is examined to determine likely causes, and parts are re-executed.

Grid Dreams - Debugging
• Backtrack through deployment steps (like peeling an onion)
  – Some steps will need to be reversed
  – Track dependencies, and redundant operations.
• This approach may fix an (interesting) sub-class of problems:
  – Those which can be fixed by simply redoing (or replicating) (part of) the installation, e.g.
    • Intermittent failure of container or services
    • Resource starvation or overload
    • Security problems that can be fixed with reconfiguration or refresh of certificates/proxies.
  – But not:
    • network, or all configuration and security/access problems.
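
The backtracking idea on the two debugging slides can be sketched as a loop that redoes progressively more of the deployment, starting from the most recent step, until the functional work-flow succeeds. This is one possible reading of the proposal and not an existing tool: `DeploymentStep`, `backtrack_and_retry` and the boolean `workflow` callable are hypothetical names, and dependency tracking is reduced to a simple linear order of steps.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class DeploymentStep:
    name: str
    apply: Callable[[], None]                   # (re)execute this step
    undo: Optional[Callable[[], None]] = None   # reverse it first, if needed


def backtrack_and_retry(steps: List[DeploymentStep],
                        workflow: Callable[[], bool]) -> Optional[List[str]]:
    """Peel back deployment steps one layer at a time after a failure.

    For depth 1, 2, ... the last `depth` steps are undone in reverse order,
    re-applied in forward order, and the work-flow is retried. Returns the
    names of the steps redone when the work-flow first succeeds, or None if
    redoing the whole deployment does not help.
    """
    for depth in range(1, len(steps) + 1):
        tail = steps[-depth:]
        for step in reversed(tail):
            if step.undo is not None:
                step.undo()
        for step in tail:
            step.apply()
        if workflow():
            return [step.name for step in tail]
    return None
```

A `None` result corresponds to the "but not" cases on the slide: problems such as network faults that no amount of redoing the installation will fix.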

UK OGSA Evaluation Project
• Thank you
  – Questions/Comments?
• Email: P.Brebner@cs.ucl.ac.uk
  – After November: Paul.Brebner@csiro.au
• Not (quite) the End…

Postscript – The Secret Life of Grid?
Our experiences evaluating Grid technology remind me of an Australian book (“The Secret Life of Wombats”) about a school boy who used to sneak out of his dormitory after everyone was asleep to go “wombatting”. He spent his nights secretly crawling down wombat burrows with a flashlight – a potentially lethal activity (not just from cave-ins, as wombats are ferocious when cornered!) – and wrote copious notes resulting in a substantial increase in knowledge of these “mysterious and often misunderstood creatures”.
UK OGSA Evaluation Project Report 1.0
Evaluation of Globus Toolkit 3.2 (GT3.2) Installation
http://sse.cs.ucl.ac.uk/UK-OGSA/Report1.doc