Grid System Issues
MSI-CI2 Meeting, June 29 2006
Geoffrey Fox
Computer Science, Informatics, Physics
Pervasive Technology Laboratories
Indiana University, Bloomington IN 47401
gcf@indiana.edu
http://www.infomall.org

Topics Covered
General Issues:
Relation to P2P
Types of Grids
Why use Service Oriented Architectures
Multi-core Chips
All the world’s services
Workflow
Metadata and State
Workflow
Sensors and Filters
SOAP MPI and Communication Performance
Grids of Grids
Community Tools

Web services
Web Services build loosely-coupled, distributed applications (wrapping existing codes and databases) based on SOA (service oriented architecture) principles.
Web Services interact by exchanging messages in SOAP format.
The contracts for the message exchanges that implement those interactions are described via WSDL interfaces.
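
As a rough illustration of "exchanging messages in SOAP format", the sketch below builds and then reads back a SOAP 1.1 envelope using only the Python standard library. The application namespace `urn:example:catalog`, the `LookupItem` operation and the `itemId` parameter are invented for this example; a real service would take them from its WSDL.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical application namespace, invented for this sketch.
APP_NS = "urn:example:catalog"


def build_soap_message(operation, params):
    """Wrap an application request in a SOAP envelope with an empty header."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(request, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def read_operation(message):
    """Return (operation name, parameters) from the first child of the Body."""
    envelope = ET.fromstring(message)
    request = envelope.find(f"{{{SOAP_ENV}}}Body")[0]

    def local_name(tag):
        # Drop the "{namespace}" prefix that ElementTree puts on tags.
        return tag.split("}", 1)[1]

    return local_name(request.tag), {local_name(c.tag): c.text for c in request}


message = build_soap_message("LookupItem", {"itemId": 42})
operation, params = read_operation(message)
```

The receiver only needs the message itself to discover which operation was requested, which is the sense in which the interaction is loosely coupled.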

A typical Web Service
In principle, services can be in any language (Fortran .. Java .. Perl .. Python) and the interfaces can be method calls, Java RMI messages, CGI Web invocations, or totally compiled away (inlining).
The simplest implementations involve XML messages (SOAP) and programs written in net-friendly languages like Java and Python.
[Figure labels: Web Services, WSDL interfaces, Portal Service, Security, Payment, Credit Card, Catalog, Warehouse, Shipping control]

Philosophy of Web Service Grids
Much of Distributed Computing was built by natural extensions of computing models developed for sequential machines.
This leads to the distributed object (DO) model represented by Java and CORBA
  • RPC (Remote Procedure Call) or RMI (Remote Method Invocation) for Java
Key people think this is not a good idea as it scales badly and ties distributed entities together too tightly
  • Distributed Objects replaced by Services
Note CORBA was considered too complicated in both organization and proposed infrastructure
  • and Java was considered as “tightly coupled to Sun”
  • So there were other reasons to discard
Thus replace distributed objects by services connected by “one-way” messages and not by request-response messages.
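
A minimal sketch of the "one-way" message style, assuming an in-process queue stands in for the network: the sender posts messages and never waits for a reply, in contrast to an RPC or RMI call that blocks until the remote method returns. The names `logging_service`, `inbox` and the message fields are illustrative only.

```python
import queue
import threading


def logging_service(inbox, log):
    """A service that consumes one-way messages until it sees the stop marker."""
    while True:
        message = inbox.get()
        if message is None:  # stop marker, used only to end this sketch
            break
        log.append(message["payload"])


inbox = queue.Queue()
log = []
worker = threading.Thread(target=logging_service, args=(inbox, log))
worker.start()

# One-way sends: put() returns immediately and no response is expected.
for n in range(3):
    inbox.put({"to": "logging_service", "payload": f"event-{n}"})

inbox.put(None)
worker.join()
```

Because the sender holds no reference to the receiver beyond the message queue, either side can be replaced without changing the other, which is the loose coupling the slide argues for.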

Some ideas to Remember
Grids are managed Web Services exchanging messages
P2P Networks are differently managed and architected services exchanging messages
Any computer operation involves messages; not all these messages can be isolated
  • With services all messages are explicit and can be examined
Grid Services extend WS-* Web Service Specifications
Web Service container replaces computer
Service replaces process
A stream is an ordered set of messages
Service Internet replaces Internet: messages replace packets
(Sub)Grids replace Libraries
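
One way to read "a stream is an ordered set of messages" is that each message carries a sequence number and the receiver restores the order. The sketch below assumes sequence numbers start at 0 and are never duplicated; the `(seq, payload)` tuple layout is an assumption of this example, not something the slides specify.

```python
import heapq


def in_order(messages):
    """Yield (seq, payload) pairs in sequence order, buffering early arrivals."""
    pending = []  # min-heap keyed on sequence number
    next_seq = 0
    for seq, payload in messages:
        heapq.heappush(pending, (seq, payload))
        # Release every buffered message that is now next in line.
        while pending and pending[0][0] == next_seq:
            yield heapq.heappop(pending)
            next_seq += 1


arrived = [(1, "b"), (0, "a"), (3, "d"), (2, "c")]
ordered = list(in_order(arrived))
```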

Internet Scale Distributed Services
Grids use Internet technology and are distinguished by managing or organizing sets of network connected resources
  • Classic Web allows independent one-to-one access to individual resources
  • Grids integrate together and manage multiple Internet-connected resources: people, sensors, computers, data systems
Organization can be explicit as in
  • TeraGrid which federates many supercomputers;
  • Information Retrieval Grid which federates multiple data resources;
  • CrisisGrid which federates first responders, commanders, sensors, GIS, (Tsunami) simulations, science/public data
Organization can be implicit as in Internet resources such as curated databases and simulation resources that “harmonize a community”

Typical Grid Architecture
Each blob is a computer program!
[Figure labels: Portal Services, User Services, Application Service, Middleware, System Services (repeated), “Core” Grid System Services, Raw (HPC) Resources, Database]

Classic Grid Architecture
Middle Tier becomes Web Services
[Figure labels: Resources, Database, Composition, Content Access, Netsolve, Security, Collaboration, Middle Tier, Brokers, Service Providers, Computing, Clients, Users and Devices]

Peer to Peer Grid
A democratic organization
[Figure labels: Database, Peers, Service Facing Web Service Interfaces, Event/Message Brokers, Peer to Peer Grid, User Facing Web Service Interfaces]

Different Visions of the Grid
e-Science or Cyberinfrastructure are virtual organization Grids supporting global distributed engineering and science research (note sensors, instruments and people are all distributed)
Utility Computing or X-on-demand (X = data, computer ..) is a major computer industry interest in Grids and this is a key part of enterprise or campus Grids
Skype (Kazaa) VOIP system is a Peer-to-peer Grid (and VRVS/GlobalMMCS-like Internet A/V conferencing systems are Collaboration Grids)
DoD’s vision of Network Centric Computing can be considered a Grid (linking sensors, warfighters, commanders, backend resources) and they are building the GIG (Global Information Grid)
Commercial 3G cell-phones and the DoD ad-hoc network initiative are forming mobile Grids
Support universal Globalization in life, fun, research, business

e-moreorlessanything and the Grid
e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world.
  • The growing use of outsourcing is one example
e-Science is the similar vision for scientific research with international participation in large accelerators, satellites or distributed gene analyses.
The Grid integrates the best of the Web, traditional enterprise software, high performance computing and Peer-to-peer systems to provide the information technology e-infrastructure for e-moreorlessanything.
A deluge of data of unprecedented and inevitable size must be managed and understood.
People, computers, data and instruments must be linked.
On demand assignment of experts, computers, networks and storage resources must be supported.

e-Defense and e-Crisis
Grids support Command Control and provide Global Situational Awareness
  • Link commanders and frontline troops to each other and to archival and real-time data; link to what-if simulations
  • Dynamic heterogeneous wired and wireless networks
  • Security and fault tolerance essential
System of Systems; Grid of Grids
  • The command information infrastructure of each ship is a Grid; each fleet is linked together by a Grid; the President is informed by and informs the national defense Grid
  • Grids must be heterogeneous and federated
Crisis Management and Response enabled by a Grid linking sensors, disaster managers, and first responders with decision support

1962 Licklider’s Vision
“Lick had this concept – all of the stuff linked together throughout the world, that you can use a remote computer, get data from a remote computer, or use lots of computers in your job.”
Larry Roberts – Principal Architect of the ARPANET

What is e-Science?
‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’
John Taylor, Director General of Research Councils UK, Office of Science and Technology
  • e-Science is about developing tools and technologies that allow scientists to do ‘faster, better or different’ research

Some Important Styles of Grids
Computational Grids were the origin of the concepts and link computers across the globe – high latency stops this from being used as a parallel machine
  • Typically Compute/File Grids where information (messages) is exchanged by writing and reading files
Knowledge and Information Grids link sensors and information repositories as in Virtual Observatories or Bioinformatics
Education Grids link teachers, learners, parents as a VO with learning tools, distant lectures etc.
e-Science Grids link multidisciplinary researchers across laboratories and universities
Community Grids focus on Grids involving large numbers of peers rather than focusing on linking major resources – links Grid and Peer-to-peer network concepts
Semantic Grid links the Grid and AI communities with Semantic Web (ontology/meta-data enriched resources) and Agent concepts
Collaboration Grids support the linkage of multiple people and electronic resources (often peer-to-peer architecture)

Types of Computing Grids
Running “Pleasingly Parallel Jobs” as in United Devices, Entropia (Desktop Grid) “cycle stealing systems”
Can be managed (“inside” the enterprise as in Condor) or more informal (as in SETI@Home)
Computing-on-demand in Industry where jobs spawned are perhaps very large (SAP, Oracle …)
Support distributed file systems as in Legion (Avaki), Globus with (web-enhanced) UNIX programming paradigm
  • Particle Physics will run some 30,000 simultaneous jobs
Distributed Simulation HLA style Grids (some work)
Linking Supercomputers as in TeraGrid
Pipelined applications linking data/instruments, compute, visualization
Seamless Access where Grid portals allow one to choose one of multiple resources with a common interface
Parallel Computing typically NOT suited for a Grid (latency)
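
A small sketch of why pleasingly parallel jobs tolerate Grid latency: each job depends only on its own input, so jobs can be farmed out in any order with no communication between them. A local thread pool stands in for the pool of Grid machines here, and `run_job` is a made-up work unit.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(n):
    """An independent work unit: it needs no data from any other job."""
    return sum(i * i for i in range(n))


job_inputs = [10, 20, 30]

# Each job could run on a different machine; only inputs and results travel.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_job, job_inputs))
```

A tightly coupled parallel code, by contrast, would exchange messages between workers at every step, which is where wide-area latency becomes the bottleneck.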

Old Style Metacomputing Grid
Spread a single large problem over multiple supercomputers
[Figure labels: Analysis and Visualization, Large Disks, Large Scale Parallel Computers]

Utility and Service Computing
An important business application of Grids is believed to be utility computing
Namely support a pool of computers to be assigned as needed to take up extra demand
  • Pool shared between multiple applications
Natural architecture is not a cluster of computers connected to each other but rather a “Farm of Grid Services” connected to the Internet and supporting services such as
  • Web Servers
  • Financial Modeling
  • Run SAP
  • Data-mining
  • Simulation response to crisis like forest fire or earthquake
  • Media Servers for Video-over-IP
Note classic Supercomputer use is to allow full access to do “anything” via ssh etc.
  • In service model, one pre-configures services for all programs and you access a portal to run a job with fewer security issues

UK National Grid Service
[Figure labels: Grid Operation Support Centre, Web Services based National Grid Infrastructure]

Towards an International Grid Infrastructure
All sites connected by production network (not all shown)
[Figure labels: US TeraGrid, SDSC, Starlight (Chicago), UK NGS, Leeds, Manchester, Netherlight (Amsterdam), Oxford, RAL, NCSA, PSC, UCL, UKLight, SC05, Local laptops in Seattle and UK. Legend: Computation, Steering clients, Network PoP, Service Registry]

Cyberinfrastructure At Home
  • BOINC (Berkeley Open Infrastructure for Network Computing) (http://boinc.berkeley.edu)
  • Climateprediction.net: study climate change
  • Einstein@home: search for gravitational signals emitted by pulsars
  • LHC@home: improve the design of the CERN LHC particle accelerator
  • Predictor@home: investigate protein-related diseases
  • Rosetta@home: help researchers develop cures for human diseases
  • SETI@home: look for radio evidence of extraterrestrial life
  • Etc.
SETI@Home averages 138 TFLOPS on 100,000’s of computers in 100’s of countries
[Figure label: Arecibo telescope]
Fran Berman, San Diego Supercomputer Center, University of California, San Diego

climateprediction.net
Since September 2003:
95,000 registered participants in 150 countries
Donated 8,000 years of computer time
Completed 100,000 simulations of over 4M model years

Information/Knowledge Grids
Distributed (10’s to 1000’s) of data sources (instruments, file systems, curated databases …)
Data Deluge: 1 (now) to 100’s petabytes/year (2012)
  • Moore’s law for Sensors
Possible filters assigned dynamically (on-demand)
  • Run image processing algorithm on telescope image
  • Run Gene sequencing algorithm on compiled data
Needs decision support front end with “what-if” simulations
Metadata (provenance) critical to annotate data
Integrate across experiments as in multi-wavelength astronomy
Data Deluge comes from pixels/year available
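
The idea of filters assigned dynamically can be sketched as a list of filter stages that is composed at run time and applied lazily to a data stream. The `threshold` and `rescale` filters and the sample readings are invented for illustration; real filters would be services such as the image-processing or sequencing algorithms named above.

```python
def threshold(minimum):
    """Build a filter that keeps only readings at or above `minimum`."""
    def run(stream):
        return (x for x in stream if x >= minimum)
    return run


def rescale(factor):
    """Build a filter that multiplies every reading by `factor`."""
    def run(stream):
        return (x * factor for x in stream)
    return run


def apply_filters(stream, filters):
    """Chain the filters in order; nothing is computed until the result is read."""
    for data_filter in filters:
        stream = data_filter(stream)
    return stream


readings = [0.2, 1.5, 0.9, 3.0]
filters = [threshold(1.0)]
filters.append(rescale(10))  # a second filter attached on demand
result = list(apply_filters(readings, filters))
```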

Data Deluged Science
In the past, we worried about data in the form of parallel I/O or MPI-IO, but we didn’t consider it as an enabler of new algorithms and new ways of computing
Data assimilation was not central to HPCC
DoE ASCI set up because didn’t want test data!
Now particle physics will get 100 petabytes from CERN
  • Nuclear physics (Jefferson Lab) in same situation
  • Use around 30,000 CPU’s simultaneously 24x7
Weather, climate, solid earth (EarthScope)
Bioinformatics curated databases (Biocomplexity only 1000’s of data points at present)
Virtual Observatory and SkyServer in Astronomy
Environmental Sensor nets

Data Deluged Science Computing Paradigm
[Figure labels: Data, Assimilation, Simulation, Informatics, Model, Ideas, Computational Science, Datamining, Reasoning]

Grid of Grids: Research Grid and Education Grid
[Figure labels: Repositories, Federated Databases, Database, Sensors, Streaming Data, Field Trip Database, Sensor Grid, Database Grid, Research, Compute Grid, Data Filter Services, Research Simulations, SERVOGrid, ?, GIS, Discovery Grid, Services, Education, Customization Services, From Research to Education, Analysis and Visualization, Portal, Computer Farm]

SERVOGrid Requirements
Seamless Access to Data repositories and large scale computers
Integration of multiple data sources including sensors, databases, file systems with analysis system
  • Including filtered OGSA-DAI (Grid database access)
Rich meta-data generation and access with SERVOGrid specific Schema extending openGIS (Geography as a Web service) standards and using Semantic Grid
Portals with component model for user interfaces and web control of all capabilities
Collaboration to support world-wide work
Basic Grid tools: workflow and notification
NOT metacomputing

Community Tools
e-mail and list-serves are oldest and best used
Kazaa, Instant Messengers, Skype, Napster, BitTorrent for P2P Collaboration – text, audio-video conferencing, files
del.icio.us, Connotea, Citeulike manage shared bookmarks
hotornot.com or similar sites allow you to create community resources and share them
Writely, Wikis and Blogs are powerful specialized shared document systems
ConferenceXP and WebEx share general applications
Google Scholar tells you who has cited your papers while publisher sites tell you about co-authors
Note sharing resources creates (implicit) communities
  • Social network tools study graphs to both define communities and extract their properties

Why use SOA’s
Globalization of applications: Life, Fun, Research, Business, Defense as an international collaborative activity
Globalization of Software Production: Software components including open-source made everywhere
Interoperability: in interfaces and protocol (messages) requires Web Services as only broadly supported SOA
Anti-Performance: if Moore’s law gives you a factor X, then use √X for performance, √X for improved lifecycle (re-use)
Software Engineering: Software paradigms are ways of “packaging” modules/components/objects/methods/subroutines. Services have minimal coupling and best re-use (lowest performance). 1962 Fortran easier re-use than 2006 Java
Multicore chips: requires pervasive concurrency without side effects. Even Microsoft must be able to use 32-128 way parallelism on a chip over next 5 years
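
As a worked reading of the Anti-Performance point, the hardware gain X is split into two equal multiplicative factors. The value X = 100 below is only an illustrative number, not one given in the slides.

```latex
X = \sqrt{X}\cdot\sqrt{X},
\qquad\text{e.g. } X = 100:\quad
\underbrace{10}_{\text{performance}} \times
\underbrace{10}_{\text{lifecycle (re-use)}} = 100 .
```

On this reading, a hundredfold hardware improvement would be spent as a tenfold speed-up, with the remaining factor of ten absorbed by the overhead of looser, more reusable software structure.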

Intel Fall 2005 Multicore Roadmap
March 2006: Sun T1000 8-core server at <$6,000

Peter Kogge 1997
[Chart labels: Normalized SPECINTS, Normalized SPECFLTS, Performance Per Transistor, Millions of Transistors (CPU)]
Performance data from µP vendors
Transistor count excludes on-chip caches
Performance normalized by clock rate
Conclusion: Simplest is best! (250K Transistor CPU)

The Grid and Web Service Institutional Hierarchy
4: Application or Community of Interest (CoI) Specific Services such as “Map Services”, “Run BLAST” or “Simulate a Missile” (XBML, XTCE, VOTABLE, CML, CellML)
3: Generally Useful Services and Features (OGSA and other GGF, W3C) such as “Collaborate”, “Access a Database” or “Submit a Job” (OGSA GS-* and some WS-*, GGF/W3C/…, XGSP (Collab))
2: System Services and Features (WS-* from OASIS/W3C/Industry): handlers like WS-RM, Security, UDDI Registry
1: Container and Run Time (Hosting) Environment (Apache Axis, .NET etc.)
Must set standards to get interoperability

Sources of Grid Technology
Grids support distributed collaboratories or virtual organizations integrating concepts from
  • The Web
  • Agents
  • Distributed Objects (CORBA, Java/Jini, COM)
  • Globus, Legion, Condor, NetSolve, Ninf and other High Performance Computing activities
  • Peer-to-peer Networks
With perhaps the Web and P2P networks being the most important for “Information Grids” and Globus for “Compute/File Grids”

The Essence of Grid Technology?
We will start from the Web view and assert that basic paradigm is
Meta-data rich Web Services communicating via messages
These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3/4 (Globus Toolkit 3/4)
  • These are the distributed equivalent of operating system functions as in UNIX Shell
  • Called Hosting Environment or platform
W3C standard WSDL defines IDL (Interface standard) for Web Services

What is Happening?
Grid ideas are being developed in (at least) four communities
  • Web Service – W3C, OASIS, (DMTF)
  • Global Grid Forum (High Performance Computing, e-Science)
  • Enterprise Grid Alliance (Commercial “Grid Forum” with a near term focus) merged with GGF to make Open Grid Forum
Service Standards are being debated
Grid Operational Infrastructure is being deployed
Grid Architecture and core software being developed
  • Apache has several important projects as do academia; large and small companies
Particular System Services are being developed “centrally” – OGSA framework for this in GGF; WS-* for OASIS/W3C/Microsoft-IBM
Lots of fields are setting domain specific standards and building domain specific services
USA started but now Europe is probably in the lead and Asia will soon catch USA if momentum (roughly zero for USA) continues

What do Grids Add?
Grids use all of the Web Services
They address management and deployment of large distributed systems of services
  • Internet Scale Distributed Services
  • I will use Grid more simply as a composable coordinated collection of services
They address security and management issues of virtual organizations crossing multiple administrative domains
GGF is developing specific services of relevance including job management, many aspects of data and scheduling
  • Not much on sensors, real-time, P2P
GGF has a good process for developing new higher level specifications

Technical Activities of Note
Look at different styles of Grids such as Autonomic (Robust Reliable Resilient)
New Grid architectures hard due to investment required
Program the Grid – Workflow
Access the Grid – Portals, Grid Computing Environments
Critical Services (ranging from Low Level, such as WS-*, to High Level, e.g. OGSA)
  • Security – build message based not connection based
  • Notification – event services
  • Metadata – Use Semantic Web, provenance
  • Fabric and Service Management
  • Databases and repositories – instruments, sensors
  • Computing – Submit job, scheduling, distributed file systems
  • Visualization, Computational Steering
  • Network performance

What do Web Services Prescribe?
  • They specify interfaces for system services (and generally useful services like database)
  • They specify an interface language (WSDL) for all services
  • They develop containers and frameworks to use to host services
  • They specify a message format (SOAP) for ALL messages that defines both application and system actions precisely
  • They imply a process be started to define domain specific services
  • There are multiple competing activities from Microsoft and IBM to Apache, and IU (for example) developing system and application services
  • Unlike for RTI and CORBA, services from different vendors should interoperate
[Figure labels: Container, System Processing, H1 H2 H3 H4, Body, F1 F2 F3 F4, Container Handlers, Service]
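
The split between system actions and application actions can be sketched as a container that runs one handler per header entry and then passes only the body to the service. This is a toy model using Python dictionaries rather than real SOAP; the `Container` class, the header names `security` and `sequence`, and the `echo_service` function are all assumptions made for the example.

```python
class Container:
    """Toy hosting environment: handlers process headers, the service gets the body."""

    def __init__(self, service):
        self.service = service
        self.handlers = {}

    def register(self, header_name, handler):
        self.handlers[header_name] = handler

    def dispatch(self, message):
        context = {}
        for name, value in message["header"].items():
            handler = self.handlers.get(name)
            if handler is None:
                raise ValueError(f"no handler for header entry {name!r}")
            handler(value, context)  # system-level processing
        return self.service(message["body"], context)  # application-level processing


def security_handler(value, context):
    context["user"] = value


def sequence_handler(value, context):
    context["sequence"] = value


def echo_service(body, context):
    return f"{context['user']} #{context['sequence']}: {body}"


container = Container(echo_service)
container.register("security", security_handler)
container.register("sequence", sequence_handler)

reply = container.dispatch(
    {"header": {"security": "alice", "sequence": 7}, "body": "hello"}
)
```

The service code never inspects the header, so system features can be added or swapped by registering different handlers.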

Plethora of Standards
Java is very powerful partly due to its many “frameworks” that generalize libraries e.g.
  • Java Media Framework
  • Java Database Connectivity JDBC
Web Services have corresponding collections of specifications that represent critical features of the distributed operating systems for “Grids of Simple Services”
  • About 60 WS-* specifications introduced in last 2-3 years
  • These are low level with higher level standards such as access database (OGSA-DAI) or “Submit a job” built on top of these
Many battles both between standard bodies and between companies as each tries to set standards they consider best; thus there are multiple standards for many of key Web Service functionalities
Microsoft a key player and stands to benefit as Web Services open up enterprise software space to all participants
  • e.g. MQSeries (IBM) and Tibco have to change their messaging systems to support new open standards

The Ten areas covered by the 60 core WS-* Specifications
WS-* Specification Area: Examples
1: Core Service Model: XML, WSDL, SOAP
2: Service Internet: WS-Addressing, WS-MessageDelivery; Reliable Messaging WSRM; Efficient Messaging MTOM
3: Notification: WS-Notification, WS-Eventing (Publish-Subscribe)
4: Workflow and Transactions: BPEL, WS-Choreography, WS-Coordination
5: Security: WS-Security, WS-Trust, WS-Federation, SAML, WS-SecureConversation
6: Service Discovery: UDDI, WS-Discovery
7: System Metadata and State: WSRF, WS-MetadataExchange, WS-Context
8: Management: WSDM, WS-Management, WS-Transfer
9: Policy and Agreements: WS-Policy, WS-Agreement
10: Portals and User Interfaces: WSRP (Remote Portlets)

Activities in Global Grid Forum Working Groups
GGF Area: GS-* and OGSA Standards Activities
1: Architecture: High Level Resource/Service Naming (level 2 of slide 6), Integrated Grid Architecture
2: Applications: Software Interfaces to Grid, Grid Remote Procedure Call, Checkpointing and Recovery, Interoperability to Job Submittal services, Information Retrieval
3: Compute: Job Submission, Basic Execution Services, Service Level Agreements for Resource use and reservation, Distributed Scheduling
4: Database and File: Grid access, Grid FTP, Storage Management, Data replication, Binary data specification and interface, High-level publish/subscribe, Transaction management
5: Infrastructure: Network measurements, Role of IPv6 and high performance networking, Data transport
6: Management: Resource/Service configuration, deployment and lifetime, Usage records and access, Grid economy model
7: Security: Authorization, P2P and Firewall Issues, Trusted Computing

Net-Centric Core Enterprise Services
Service: Functionality
NCES 1: Enterprise Services Management (ESM): including life-cycle management
NCES 2: Information Assurance (IA)/Security: Supports confidentiality, integrity and availability. Implies reliability and autonomic features
NCES 3: Messaging: Synchronous or asynchronous cases
NCES 4: Discovery: Searching data and services
NCES 5: Mediation: Includes translation, aggregation, integration, correlation, fusion, brokering publication, and other transformations for services and data. Possibly agents
NCES 6: Collaboration: Provision and control of sharing with emphasis on synchronous real-time services
NCES 7: User Assistance: Includes automated and manual methods of optimizing the user GiG experience (user agent)
NCES 8: Storage: Retention, organization and disposition of all forms of data
NCES 9: Application: Provisioning, operations and maintenance of applications.

The Core Features/Service Areas I
Service or Feature | WS-* | GS-* | NCES (DoD) | Comments
A: Broad Principles
FS1: Use SOA: Service Oriented Arch. | WS1 | - | - | Core Service Architecture, Build Grids on Web Services. Industry best practice
FS2: Grid of Grids | - | - | - | Distinctive Strategy for legacy subsystems and modular architecture
B: Core Services
FS3: Service Internet, Messaging | WS2 | - | NCES3 | Streams/Sensors. Team
FS4: Notification | WS3 | - | NCES3 | JMS, MQSeries.
FS5: Workflow | WS4 | - | NCES5 | Grid Programming
FS6: Security | WS5 | GS7 | NCES2 | Grid-Shib, Permis, Liberty Alliance ...
FS7: Discovery | WS6 | - | NCES4 | UDDI
FS8: System Metadata & State | WS7 | - | - | Globus MDS, Semantic Grid, WS-Context
FS9: Management | WS8 | GS6 | NCES1 | CIM
FS10: Policy | WS9 | - | - | ECS

The Core Feature/Service Areas II
Service or Feature | WS-* | GS-* | NCES | Comments
B: Core Services (Continued)
FS11: Portals and User assistance | WS10 | - | NCES7 | Portlets JSR168, NCES Capability Interfaces
FS12: Computing | - | GS3 | - | -
FS13: Data and Storage | - | GS4 | NCES8 | NCOW Data Strategy. Federation at data/information layer major research area; CGL leading role
FS14: Information | - | GS4 | - | JBI for DoD, WFS for OGC
FS15: Applications and User Services | - | GS2 | NCES9 | Standalone Services, Proxies for jobs
FS16: Resources and Infrastructure | - | GS5 | - | Ad-hoc networks
FS17: Collaboration and Virtual Organizations | - | GS7 | NCES6 | XGSP, Shared Web Service ports
FS18: Scheduling and matching of Services and Resources | - | GS3 | - | Current work only addresses scheduling “batch jobs”. Need networks and services

A List of Web Services 1
  • 1) Core Service Architecture
  • XSD XML Schema (W3C Recommendation) V1.0 February 1998, V1.1 February 2004
  • WSDL 1.1 Web Services Description Language Version 1.1 (W3C Note) March 2001
  • WSDL 2.0 Web Services Description Language Version 2.0 (W3C under development) March 2004
  • SOAP 1.1 (W3C Note) V1.1 Note May 2000
  • SOAP 1.2 (W3C Recommendation) June 24 2003

A List of Web Services 2
  • 2) Service Internet including messaging
  • WS-Addressing Web Services Addressing (BEA, IBM, Microsoft, SAP, Sun) in W3C consideration August 2004
  • WS-MessageDelivery Web Services Message Delivery (W3C Submission by Oracle, Sun ..) April 2004
  • WS-Reliability Web Services Reliable Messaging (OASIS Web Services Reliable Messaging TC) March 2004
  • WS-RM Web Services Reliable Messaging (BEA, IBM, Microsoft, Tibco) v0.992 February 2005 linked to WS-Reliability in OASIS as Web Services Reliable Exchange (WS-RX)
  • WS-RM Policy Web Services Reliable Messaging Policy Assertion (BEA, IBM, Microsoft, Tibco) March 2006
  • WS-RX Web Services Reliable Exchange (Many members) integrating previous reliability specifications
  • SOAP MTOM SOAP Message Transmission Optimization Mechanism (W3C) June 2004
  • SOAP-over-UDP Binding of SOAP to UDP (Microsoft, BEA …) September 2004
  • Many obsolete specifications like WS-Routing and Referral SOAP Routing Protocol (Microsoft) October 2001

Layered Architecture for Web Services and Grids
[Figure layers, top to bottom:]
Application Specific Grids
Generally Useful Services and Grids
Workflow WSFL/BPEL
Service Management (“Context etc.”)
Service Discovery (UDDI) / Information
Service Internet Transport Protocol
Service Interfaces WSDL
Base Hosting Environment
Protocol: HTTP, FTP, DNS …
Presentation: XDR …
Session: SSH …
Transport: TCP, UDP …
Network: IP …
Data Link / Physical
[Side labels: Higher Level Services, Service Context, Service Internet, Bit level Internet (OSI Stack)]

WS-* implies the Service Internet
We have the classic (CISCO, Juniper …) Internet routing the flood of ordinary packets in OSI stack architecture
Web Services build the “Service Internet” or IOI (Internet on Internet) with
  • Routing via WS-Addressing not IP header
  • Fault Tolerance (WS-RM not TCP)
  • Security (WS-Security/SecureConversation not IPSec/SSL)
  • Data Transmission by WS-Transfer not HTTP
  • Information Services (UDDI/WS-Context not DNS/Configuration files)
  • At message/web service level and not packet/IP address level
Software-based Service Internet possible as computers “fast”
Familiar from Peer-to-peer networks and built as a software overlay network defining Grid (analogy is VPN)
SOAP Header contains all information needed for the “Service Internet” (Grid Operating System) with SOAP Body containing information for Grid application service
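
Routing at the message level rather than the packet level can be sketched as an overlay that looks up a logical address carried in the message header. The `To` field name echoes the WS-Addressing destination property, but the `ServiceInternet` class, the `urn:example:` addresses and the dictionary message layout are assumptions of this toy model.

```python
class ServiceInternet:
    """Toy overlay: delivers messages by logical address, not by IP address."""

    def __init__(self):
        self.endpoints = {}  # logical address -> callable service

    def bind(self, address, service):
        self.endpoints[address] = service

    def send(self, message):
        # The routing decision uses only the message header.
        service = self.endpoints[message["header"]["To"]]
        return service(message["body"])


def catalog_v1(body):
    return f"v1 found {body}"


def catalog_v2(body):
    return f"v2 found {body}"


overlay = ServiceInternet()
overlay.bind("urn:example:catalog", catalog_v1)

message = {"header": {"To": "urn:example:catalog"}, "body": "item 42"}
first = overlay.send(message)

# The service moves to a new implementation or host; the sender's message is unchanged.
overlay.bind("urn:example:catalog", catalog_v2)
second = overlay.send(message)
```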

A List of Web Services 3
• 3) Notification and high-level publish/subscribe information dissemination
• WS-Eventing Web Services Eventing (BEA, Microsoft, TIBCO) August 2004
• WS-EventNotification (HP, IBM, Intel, Microsoft) March 2006 uses resources to manage subscriptions
• WS-Notification Framework for Web Services Notification with WS-Topics, WS-BaseNotification, and WS-BrokeredNotification (OASIS) OASIS Web Services Notification TC set up March 2004
• JMS Java Message Service V1.1 March 2002
• Different from using publish-subscribe to robustly support messaging between Web services – Bind SOAP to JMS or MQSeries

A List of Web Services 4
• 4) Coordination and Workflow, Transactions and Contextualization
• BPEL Business Process Execution Language for Web Services (OASIS) V1.1 May 2003, with V2.0 under development
• WS-CDL Web Services Choreography Language (W3C) V1.0 Working Draft 17 December 2004
• WSCI Web Service Choreography Interface V1.0 (W3C Note from BEA, Intalio, SAP, Sun, Yahoo)
• WSCL Web Services Conversation Language (W3C Note) HP March 2002
• Workflow is general linkage between services; transactions are a critical special case
• Concept of workflow generalizes traditional workflow processes in business
• Many competing workflow implementations and standards; many implementations “reject” current standards

Role of Workflow
(Figure: Service-1, Service-2 and Service-3 linked together)
Programming SOAP and Web Services (the Grid): Workflow describes linkage between services
• As the services are distributed, linkage must be by messages
• Linkage is two-way and has both control and data
• Apply to multi-disciplinary, multi-scale linkage, multi-program linkage, link visualization to simulation, GIS to simulations and visualization filters to each other
• The Microsoft-IBM specification BPEL is the currently preferred Web Service XML specification of workflow

Example workflow
Here a sensor feeds a data-mining application (we are extending data mining in DoD applications with Grossman from UIC)
The data-mining application drives a visualization

Example Flood Simulation workflow

SERVOGrid Codes, Relationships
(Figure; component labels as extracted: Elastic Dislocation Inversion / Viscoelastic FEM / Viscoelastic Layered BEM / Elastic Dislocation / Pattern Recognizers / Fault Model BEM)
This linkage is called Workflow in Grid/Web Service parlance

Two-level Programming I
• The Web Service (Grid) paradigm implicitly assumes a two-level Programming Model
• We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies
  – C++, Java or Fortran Monte Carlo module
  – Data streaming from a sensor or Satellite
  – Specialized (JDBC) database access
• Such services accept and produce data from users, files and databases
• The Grid is built by coordinating such services, assuming we have solved the problem of programming the service

Two-level Programming II
(Figure: Service 1, Service 2, Service 3 and Service 4 linked together)
• The Grid is discussing the composition of distributed services with the runtime interfaces to the Grid as opposed to UNIX pipes/data streams
• Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs
• Such interpretative environments are the single-processor analog of Grid Programming
• Some projects like GrADS from Rice University are looking at integration between service and composition levels, but the dominant effort looks at each level separately
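The pipe analogy can be sketched in a few lines. In this minimal, single-process illustration each "service" is just a Python function that accepts and returns a message (a dict), and the workflow is their composition, much like a shell pipeline. The service names and message fields are hypothetical, and real Grid services would exchange SOAP messages over a network rather than call each other directly.

```python
from functools import reduce


def compose(*services):
    """Chain services so each one's output message is the next one's input."""
    def workflow(message):
        return reduce(lambda msg, service: service(msg), services, message)
    return workflow


# Hypothetical stand-ins for independently programmed services (level 1).
def sensor(msg):
    return {**msg, "readings": [3.0, 4.5, 9.0]}


def datamine(msg):
    return {**msg, "peak": max(msg["readings"])}


def visualize(msg):
    return {**msg, "plot": f"peak={msg['peak']}"}


# Level 2: the workflow only composes services; it does not look inside them.
pipeline = compose(sensor, datamine, visualize)
result = pipeline({"station": "hypothetical-1"})
```

The composition level knows nothing about how each service is programmed, which is the separation of levels the slide describes.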

3 Layer Programming Model
(Figure: Web Service 1, WS 2, …, WS N-1, Web Service N on top of WS-* Infrastructure)
• Level 1 Programming inside services: Application expressed in Java, Fortran, C++, MPI etc.
• Level 2 Programming choosing services by virtualization: Application Semantics (Metadata, Ontology), Semantic Grid
• Level 3 Grid Programming composing multiple services: Service Workflow, Transactions, Mediation
• Substantial work in UK e-Science program, international semantic web community

A List of Web Services 4 - Continued
• 4) Transactions, Business Processes and Contextualization
• WS-CAF Web Services Composite Application Framework including WS-CTX, WS-CF and WS-TXM below (OASIS Web Services Composite Application Framework TC)
• WS-CTX Web Services Context (OASIS Web Services Composite Application Framework TC) V0.9.2 July 2005
• WS-CF Web Services Coordination Framework (OASIS Web Services Composite Application Framework TC) V0.1 April 2005
• WS-TXM Web Services Transaction Management (OASIS Web Services Composite Application Framework TC) including WS-ACID (V0.1 May 2005), WS-BP (Business Process V0.1 May 2005), WS-LRA (Long running action V0.1 May 2005)
• WS-Coordination Web Services Coordination (BEA, IBM, Microsoft) November 2004
• WS-AtomicTransaction Web Services Atomic Transaction (BEA, IBM, Microsoft) November 2004
• WS-BusinessActivity Web Services Business Activity Framework (BEA, IBM, Microsoft) November 2004
• BTP Business Transaction Protocol (OASIS) May 2002 with V1.1 November 2004
• ebXML BPSS Business Process (OASIS) with V2.0.1 pre-Committee Draft review 17 July 2005

A List of Web Services 5
• 5) Security Frameworks and Core Specifications
• WS-Security 2004 Web Services Security: SOAP Message Security (OASIS) Standard March 2004
• WS-I Basic Security Profile V1.0 Web Services Interoperability Organization Working Group Draft May 15 2005
• WS-Security Username Token Profile Web Services Security Username Token Profile V1.0 OASIS Standard, March 2004
• WS-Security X.509 Certificate Token Profile Web Services Security X.509 Certificate Token Profile OASIS Standard, March 2004
• WS-Security REL Profile Web Services Security Rights Expression Language (REL) Token Profile OASIS Standard: 19 December 2004
• WS-I REL Token Profile V1.0 Web Services Interoperability Organization Working Group Draft 13 May 2005
• WS-Security Kerberos Web Services Security Kerberos Binding (Microsoft) December 2003
• Web-SSO Web Single Sign-On Metadata Exchange Protocol (Microsoft, Sun) April 2005
• Web-SSO-Mex Web Single Sign-On Interoperability Profile (Microsoft, Sun) April 2005
• WS-SecurityPolicy Web Services Security Policy Language (IBM, Microsoft, RSA, Verisign) V1.1 July 2005

A List of Web Services 5 - Contd
• 5) Security Capabilities
• WS-Trust Web Services Trust Language (BEA, IBM, Microsoft, RSA, Verisign …) February 2005
• WS-SecureConversation Web Services Secure Conversation Language (BEA, IBM, Microsoft, RSA, Verisign …) February 2005
• WS-Federation Web Services Federation Language (BEA, IBM, Microsoft, RSA, Verisign) July 2003
• WS-Federation Active Requestor Profile Web Services Federation Language Active Requestor Profile V1.0 (BEA, IBM, Microsoft, RSA, Verisign) July 8, 2003
• WS-Federation Passive Requestor Profile Web Services Federation Language Passive Requestor Profile V1.0 (BEA, IBM, Microsoft, RSA, Verisign) July 8, 2003
• WS-Authorization is being developed by IBM and Microsoft and will build on WS-Trust to describe how access to particular web services is specified and managed
• WS-Privacy is being developed by IBM and Microsoft and will build on WS-Policy to describe the binding of privacy policies to Web services and their exchanged data

A List of Web Services 5 - Contd
• 5) Security Languages
• SAML Assertions and Protocols for the OASIS Security Assertion Markup Language (SAML) V2.0 OASIS Standard, 15 March 2005
• WS-Security SAML Token Profile Web Services Security SAML Token Profile OASIS Standard, 1 December 2004
• WS-I SAML Token Profile V1.0 Web Services Interoperability Organization Working Group Draft 13 May 2005
• XACML eXtensible Access Control Markup Language (OASIS) V2.0 1 February 2005

A List of Web Services 6
• 6) Service Discovery
• UDDI (Broadly Supported OASIS Standard) V3 August 2003
• WS-Discovery Web Services Dynamic Discovery (Microsoft, BEA, Intel …) February 2004
• WS-IL Web Services Inspection Language (IBM, Microsoft) November 2001
• Note WS-Context as a metadata catalog and WS-Management Catalog are examples of related services
• There are many UDDI extensions

A List of Web Services 7
• 7) Metadata and State
• RDF Resource Description Framework (W3C) Set of recommendations expanded from original February 1999 standard
• DAML+OIL combining DAML (Darpa Agent Markup Language) and OIL (Ontology Inference Layer) (W3C) Note December 2001
• OWL Web Ontology Language (W3C) Recommendation February 2004
• WS-MetadataExchange 1.1 Web Services Metadata Exchange (HP, IBM, Intel, Microsoft) March 2006
• ASAP Asynchronous Service Access Protocol (OASIS) with V1.0 working draft 2 B December 11 2004
• WS-GAF Web Service Grid Application Framework (Arjuna, Newcastle University) August 2003
• WBEM Web-Based Enterprise Management including CIM (Common Information Model) from DMTF (Distributed Management Task Force) 2004-2005

A List of Web Services 7
• 7) Metadata and State: Resource Framework
• WS-RF Web Services Resource Framework (OASIS) including
• WS-Resource Framework Web Services Resource 1.2 (OASIS) Public Review Draft 01, 10 June 2005
• WS-ResourceProperties Web Services Resource Properties V1.2 Public Review Draft 01, 10 June 2005
• WS-ResourceLifetime Web Services Resource Lifetime V1.2 Public Review Draft 01, 13 June 2005
• WS-ServiceGroup Web Services Service Group V1.2 Public Review Draft 01, 10 June 2005
• WS-BaseFaults Web Services Base Faults V1.2 Public Review Draft 01, June 13, 2005

Metadata and Service Context
Consider a collection of services working together
• Workflow tells you how to specify service interaction, but more basically there is shared information or context specifying/controlling the collection
WS-RF and WS-GAF have different approaches to contextualization – supplying a common “context” which at its simplest is a token to represent state
More generally, core shared information includes dynamic service metadata and the equivalent of configuration information
One can support such a common context either as a pool of messages or as message-based access to a “database” (Context Service)
Two services linked by a stream are perhaps the simplest example of a collection of services needing context
Note that there is a tension between storing metadata in messages and in services
• This is the shared versus distributed memory debate in parallel computing

Stateful Interactions
There are (at least) four approaches to specifying state
• OGSI uses factories to generate separate services for each session in standard distributed object fashion
• Globus GT-4 and WSRF use metadata of a resource to identify state associated with a particular session
• WS-GAF uses WS-Context to provide abstract context defining state. It has the strength and weakness that it reveals less about the nature of the session
• WS-I+ “Pure Web Service” leaves state specification to the application – e.g. put a context in the SOAP body
I think we should smile and write a great metadata service hiding all these different models for state and metadata

A List of Web Services 8
• 8) Management – original OASIS
• WS-DistributedManagement Web Services Distributed Management Framework with MUWS and MOWS below (OASIS)
• WSDM-MUWS Web Services Distributed Management: Management Using Web Services (OASIS) OASIS Standard March 9 2005
• WSDM-MOWS Web Services Distributed Management: Management of Web Services (OASIS) OASIS Standard March 9 2005

A List of Web Services 8 - Contd
• 8) Management: Microsoft Converged Stack
• WS-Management Web Services for Management (Microsoft, Intel, Sun …) August 2005
• WS-Management Catalog The WS-Management Catalog (Microsoft, Intel, Sun …) August 2005
• WS-ResourceTransfer Web Service Resource Transfer (HP, IBM, Intel, Microsoft) March 2006
• WS-Transfer Web Service Transfer (Microsoft, BEA, Sonic Software etc.) September 2004
• WS-TransferAddendum Extensions to Web Service Transfer (HP, IBM, Intel, Microsoft) March 2006
• WS-Enumeration Web Service Enumeration (Microsoft, BEA, Sonic Software etc.) September 2004

A List of Web Services 9
• 9) General Service Characteristics
• WS-PolicyFramework Web Services Policy Framework (BEA, IBM, Microsoft, SAP …) September 2004
• WS-PolicyAttachment Web Services Policy Attachment (BEA, IBM, Microsoft, SAP …) September 2004
• WS-PolicyAssertions Web Services Policy Assertions Language (BEA, IBM, Microsoft, SAP) 18 December 2002 (Superseded by WS-PolicyFramework)
• WS-Agreement Web Services Agreement Specification (GGF under development) 9 August 2004

A List of Web Services 10
• 10) User Interfaces
• WSRP Web Services for Remote Portlets (OASIS) OASIS Standard August 2003
• JSR 168: JSR-000168 Portlet Specification for Java binding (Java Community Process) October 2003
• WSRP specifies the client-service protocol while JSR 168 specifies how portlets are implemented for each supported service user-facing Web service ports inside aggregating portals like Jetspeed, GridSphere or uPortal

WS-I Interoperability
A critical underpinning of Grids and Web Services is the gradually growing set of specifications in the Web Service Interoperability Profiles
Web Services Interoperability (WS-I) Interoperability Profile 1.0a (http://www.ws-i.org) gives us XSD, WSDL 1.1, SOAP 1.1, UDDI in the basic profile and parts of WS-Security in their first security profile
We imagine the “60 Specifications” being checked out and evolved in the cauldron of the real world; occasionally best practice identifies a new specification to be added to WS-I, which gradually increases in scope
• Note only 4.5 out of 60 specifications have “made it” in this definition

(Figure: Grid of sensor, filter and other services exchanging SOAP messages. Stage labels: Raw Data, Information, Knowledge, Wisdom, Decisions. Legend: SS = Sensor Service, FS = Filter Service, OS = Other Service, MD = MetaData. Other labels: Portal, Database, Another Grid, Another Service.)

Semantic Grid and Services
Implications of SOA (Service Oriented Architectures) for SG (Semantic Grid)
• Build services to implement SG
Implications of SG for SOA
• Build metadata rich systems of services using SG
Services receive data in SOAP messages, manipulate it and produce transformed data as further messages
Meta-data is carried in SOAP messages
Meta-data controls processing and transport of SOAP Messages
Knowledge is created from data by services
The Grid enhances Web services with semantically rich system and application specific management
One must exploit and work around the different approaches to meta-data and their manipulation in Web Services

Structure of SOAP Messages
(Figure: a message with headers H1 H2 H3 H4 and a Body passes through Container Handlers F1 F2 F3 F4 to the Service; other labels: Container, Workflow)
SOAP Messages have System information in the header, including WS-Policy based meta-data defining processing options
• Processed by Handlers
Application data and meta-data are in the body (controversies here!)
• Processed by the Service itself
Some meta-data like WS-RF is logically “only in messages”
Other meta-data, like that in WS-Context or the SRB, is stored in the logical equivalent of XML databases
We only need to preserve semantic structure (XML/SOAP Infoset), so we can transport in fast XML and store in efficient relational databases

Support for Messages
Optimize XML representation and transport protocol
(Figure labels: Std. XML, Filter 1, Filter 2, XML’, XML’’, Choose Invertible Filter, Choose Protocol, Database (WS-Context), Filter 1 -1, Filter 2 -1, Std. XML)
Filters Preserve Infoset
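A minimal sketch of an invertible filter that preserves the Infoset follows. It uses zlib compression purely as a stand-in for a real fast-XML encoding such as Fast Infoset, and it compares canonical XML (available in Python 3.8 and later) as an approximation of Infoset equality. The function names and the sample message are hypothetical.

```python
import zlib
import xml.etree.ElementTree as ET


def filter1(xml_text):
    """Stand-in for a wire-level filter: compress the XML for transport."""
    return zlib.compress(xml_text.encode("utf-8"))


def filter1_inverse(wire_bytes):
    """Inverse filter applied at the receiver."""
    return zlib.decompress(wire_bytes).decode("utf-8")


def same_infoset(xml_a, xml_b):
    """Approximate Infoset equality by comparing canonical XML (C14N 2.0)."""
    return ET.canonicalize(xml_a) == ET.canonicalize(xml_b)


# Hypothetical sensor message.
original = '<reading sensor="s1" unit="m"><value>4.5</value></reading>'
wire = filter1(original)
restored = filter1_inverse(wire)
preserved = same_infoset(original, restored)
```

Canonical comparison also treats documents that differ only in attribute order or empty-element syntax as equal, which is the sense in which only semantic structure needs to be preserved.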

FI (Fast Infoset=Binary XML) v Traditional XML Messages

PDA to Web Service Optimized Communication

Requirements for MPI Messaging
(Figure: timeline of tcalc, tcomm, tcalc)
MPI and SOAP Messaging both send data from a source to a destination
• MPI supports multicast (broadcast) communication
• MPI specifies destination and a context (in comm parameter)
• MPI specifies data to send
• MPI has a tag to allow flexibility in processing in source processor
• MPI has calls to understand context (number of processors etc.)
MPI requires very low latency and high bandwidth so that tcomm/tcalc is at most 10
• BlueGene/L has bandwidth between 0.25 and 3 Gigabytes/sec/node and latency of about 5 microseconds
• Latency determined so Message Size/Bandwidth > Latency
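As a worked illustration of the condition Message Size/Bandwidth > Latency, the sketch below computes the message size at which transfer time equals latency, using the BlueGene/L figures quoted above. The function name is hypothetical, and treating 1 Gigabyte as 10^9 bytes is an assumption.

```python
def crossover_bytes(latency_s, bandwidth_bytes_per_s):
    """Message size at which transfer time (size / bandwidth) equals the latency."""
    return latency_s * bandwidth_bytes_per_s


LATENCY_S = 5e-6  # about 5 microseconds, as quoted for BlueGene/L

# Bandwidth range quoted above: 0.25 to 3 Gigabytes/sec/node (1 GB taken as 1e9 bytes).
low = crossover_bytes(LATENCY_S, 0.25e9)
high = crossover_bytes(LATENCY_S, 3e9)

print(f"Latency stops dominating above roughly {low:.0f} to {high:.0f} bytes")
```

Under these assumptions, messages need to be larger than roughly one to fifteen kilobytes before bandwidth rather than latency determines the communication time.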

Requirements for SOAP Messaging
Web Services have many of the same requirements as MPI, with two differences where MPI is more stringent than SOAP
• Latencies are inevitably 1 (local) to 100 milliseconds, which is 200 to 20,000 times that of BlueGene/L
  1) 0.000001 ms – CPU does a calculation
  2) 0.001 to 0.01 ms – MPI latency
  3) 1 to 10 ms – wake up a thread or process
  4) 10 to 1000 ms – Internet delay
• Bandwidths for many business applications are low, as one just needs to send enough information for ATM and Bank to define transactions
SOAP has MUCH greater flexibility in areas like security, fault-tolerance and “virtualizing addressing” because one can run a lot of software in 100 milliseconds
• It typically takes 1-3 milliseconds to gobble up a modest message in Java and “add value”
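The "200 to 20,000 times" figure can be checked directly against the 5 microsecond BlueGene/L latency quoted on the MPI slide. This is only a small arithmetic sketch; the function name is hypothetical.

```python
BLUEGENE_LATENCY_US = 5  # about 5 microseconds, from the MPI requirements slide


def latency_ratio(soap_latency_ms, mpi_latency_us=BLUEGENE_LATENCY_US):
    """How many times larger a SOAP latency is than the MPI latency."""
    return (soap_latency_ms * 1000) / mpi_latency_us


local_ratio = latency_ratio(1)      # local Web Service call, 1 ms
remote_ratio = latency_ratio(100)   # distant Web Service call, 100 ms
```

The two results reproduce the lower and upper ends of the range stated in the slide.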

Structure of SOAP
SOAP defines a very obvious message structure with a header and a body, just like email
The header contains information used by the “Internet operating system”
• Destination, Source, Routing, Context, Sequence Number …
The message body is partly further information used by the operating system and partly information for the application, which is not looked at by the “operating system” except to encrypt it, compress it etc.
• Note WS-Security supports separate encryption for different parts of a document
Much discussion in the field revolves around what is referenced in the header
This structure makes it possible to define VERY sophisticated messaging

MPI and SOAP Integration
Note SOAP specifies format and, through WSDL, interfaces
MPI only specifies interface, and so interoperability between different MPIs requires additional work
• IMPI http://impi.nist.gov/IMPI/
Pervasive networks can support high bandwidth (Terabits/sec soon) but the latency issue is not resolvable in a general way
One can combine MPI interfaces with SOAP messaging, but I don’t think this has been done
Just as walking, cars, planes and phones coexist with different properties, so SOAP and MPI are both good and should be used where appropriate

When is a High Performance Computer?
We might wish to consider three classes of multi-node computers
1) Classic MPP with microsecond latency and scalable internode bandwidth (tcomm/tcalc ~ 10 or so)
2) Classic Cluster, which can vary from configurations like 1) to 3) but typically has millisecond latency and modest bandwidth
3) Classic Grid or distributed systems of computers around the network
• Latencies of inter-node communication are 100’s of milliseconds, but they can have good bandwidth
All have the same peak CPU performance, but synchronization costs increase as one goes from 1) to 3)
Cost of system (dollars per gigaflop) decreases by factors of 2 at each step from 1) to 2) to 3)
One should NOT use a classic MPP if class 2) or 3) suffices, unless some security or data issue dominates over cost-performance
One should not use a Grid as a true parallel computer – it can link parallel computers together for convenient access etc.

Linking Modules
• Closely coupled Java/Python …: Module A and Module B linked by Method Calls, 0.001 to 1 millisecond
• Coarse Grain Service Model: Service A and Service B linked by Messages, 0.1 to 1000 millisecond latency
• From method based to RPC to message based to event-based publish-subscribe Message Oriented Middleware
(Figure labels for publish-subscribe: “Listener”, Subscribe to Events, Service B, Publisher, Post Events, Message Queue in the Sky, Service A)
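The event-based publish-subscribe style can be sketched with a tiny in-process broker. This is only an illustration of the pattern: the class and method names are hypothetical, and real Message Oriented Middleware would deliver events across a network, typically asynchronously.

```python
from collections import defaultdict


class Broker:
    """Toy 'message queue in the sky': listeners subscribe, publishers post events."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, topic, listener):
        self._listeners[topic].append(listener)

    def publish(self, topic, event):
        """Deliver the event to every listener on the topic; return how many received it."""
        listeners = self._listeners.get(topic, [])
        for listener in listeners:
            listener(event)
        return len(listeners)


# Hypothetical usage: one service listens, another posts, neither knows the other.
received = []
broker = Broker()
broker.subscribe("flood/alerts", received.append)
delivered = broker.publish("flood/alerts", {"level_m": 4.5})
undelivered = broker.publish("gas/alerts", {"pressure": 2})
```

The publisher and the listener share only a topic name, which is the loose coupling that distinguishes this style from a direct method call.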

What is a Simple Service?
Take any system – it has multiple functionalities
• We can implement each functionality as an independent distributed service
• Or we can bundle multiple functionalities in a single service
Whether a functionality is an independent service or one of many method calls into a “glob of software”, we can always make it a Web service by converting its interface to WSDL
Simple services are obtained by taking functionalities and making them as small as possible subject to the “rule of millisecond”
• Distributed services incur messaging overhead of one (local) to 100’s (far apart) of milliseconds to use a message rather than a method call
• Use scripting or compiled integration of functionalities ONLY when you require <1 millisecond interaction latency
The Apache web site has many (pre Web Service) projects that are multiple functionalities presented as (Java) globs and NOT (Java) Simple Services
• This makes it hard to integrate them while sharing common security, user profile, file access … services
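The "rule of millisecond" can be written down as a small decision helper. This is a sketch only: the 1 millisecond threshold comes from the slide, while the function names and the overhead estimate are assumptions added for illustration.

```python
RULE_OF_MILLISECOND_MS = 1.0  # threshold stated in the slide


def choose_linkage(required_latency_ms):
    """Bundle functionalities by method call only when <1 ms interaction is required."""
    if required_latency_ms < RULE_OF_MILLISECOND_MS:
        return "method call"
    return "service"


def messaging_overhead_fraction(work_ms, message_ms):
    """Fraction of each interaction spent on messaging rather than useful work."""
    return message_ms / (work_ms + message_ms)


tight = choose_linkage(0.1)   # inner-loop interaction
loose = choose_linkage(50)    # coarse-grain interaction
overhead = messaging_overhead_fraction(work_ms=99, message_ms=1)
```

The overhead estimate makes the trade-off concrete: a functionality that does about 99 ms of work per call loses about 1% of its time to a 1 ms local message.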

Grids of Simple Services
• Link via methods, messages, streams
• Services and Grids are linked by messages
• Internally to a service, functionalities are linked by methods
• A simple service is the smallest Grid
• We are familiar with the method-linked hierarchy: Lines of Code, Methods, Objects, Programs, Packages
(Figure labels: Methods, CPUs, Services, Clusters, MPPs, Databases, Sensor, Federated Databases, Sensor Nets, Component Grids, Compute Resource Grids, Data Resource Grids, Overlay and Compose Grids of Grids)

Component Grids?
So we build collections of Web Services which we package as component Grids
• Visualization Grid
• Sensor Grid
• Utility Computing Grid
• Collaboration Grid
• Earthquake Simulation Grid
• Control Room Grid
• Crisis Management Grid
• Drug Discovery Grid
• Bioinformatics Sequence Analysis Grid
• Intelligence Data-mining Grid
We build bigger Grids by composing component Grids using the Service Internet

Typical use of Grid Messaging in NASA
(Figure labels: Sensor Grid implemented using NB, NB, Datamining Grid, GIS Grid)

Using the Grid of Grids and Core Services to build multiple application grids re-using common components
(Figure labels as extracted)
• BioInformatics Grid; Chemical Informatics Grid
• 15: Application Services (Screening Tools, Quantum Calculations …)
• 15: Application Services (Sequencing Tools, Biocomplexity Simulations)
• Domain Specific Grids/Services
• Core Low Level Grid Services
• Services: 3: Messaging, 4: Notification, 5: Workflow, 6: Security, 7: Discovery, 8: Metadata, 9: Management, 10: Policy, 11: Portals, 12: Computing, 13: Data Access/Storage, 14: Information, 17: Collaboration, 18: Scheduling, Instrument/Sensor
• Physical Network (monitored by FS16)

Critical Infrastructure (CI) Grids built as Grids of Grids
(Figure labels as extracted)
• Flood CIGrid …, Electricity CIGrid …, Gas CIGrid
• Flood Services and Filters; Gas Services and Filters
• Collaboration Grid, Sensor Grid, GIS Grid, Visualization Grid, Compute Grid
• Core Grid Services: Registry, Security, Portals, Data Access/Storage, Notification, Workflow, Metadata, Messaging
• Physical Network

Mediation and Transformation in a Grid of Grids and Simple Services
(Figure labels: Subgrid or service, Port, Internal Interfaces, External facing Interfaces, Messaging, Mediation and Transformation Services)

Why can we build better software?
In 1962 I was punching holes in cards and paper tape to persuade tiny slow computers to manipulate words in memory, stringing together instructions like a = b + c
Now computers are much faster and languages are better, but not a lot better
• I suspect I would only be a factor of 2 or so faster programming the same program today
However a, b, c can now be resources (Bank records, Drugs, Games, Supernova) and + can be a service composition
• Objects were insufficient as they distributed ordinary programs; services express distributed independent entities (communication time is very different between and within computers)
• Services are essential for reliable modular programming

What’s wrong with old programs
They were made of instructions, methods, subroutines and libraries thereof
Languages (Java, C++) encouraged spaghetti programming that linked parts of programs together
• This leads to efficient but unmaintainable software
However now computers and networks are several orders of magnitude faster
• Optimize for modularity and maintainability, and rarely if ever optimize for performance
Old programs have the wrong optimization and by construction are hard to maintain/change

Old and New Software Regime
Web Services, Grids and P2P systems are built with
• The new software model: independent entities connected by explicit messages
All computer entities are actually connected by some form of message (traveling on a bus or from memory to register), but it is often implicit
• And they support the distributed services and resources needed for global science, fun and business
• Google, Amazon, Yahoo and perhaps Microsoft and Electronic Arts can exploit this model
Old programs have the old architecture and cannot be modified
• At best one can wrap partial functionalities as services and use them as a black box
• IBM, Oracle and the old Enterprise software companies have this noose around their necks

Delicious Applications
• http://del.icio.us purchased by Yahoo for ~$30M
• http://www.CiteULike.org
• http://www.connotea.org (Nature)
• http://www.bibsonomy.org/
These sites
• Associate metadata with Bookmarks specified by URL’s, DOI’s (Digital Object Identifiers)
• Users add comments and keywords (called tags)
• Users are linked together into groups (communities)
• Information such as title and authors is extracted automatically from some sites (PubMed, ACM, IEEE, Wiley etc.)
• Bibtex-like additional information
This is de facto Semantic Web – remarkable for its simplicity

Connotea

Connotea queried by SERVOGrid

Provenance and Delicious
?? is any field such as chemistry
All ??Data should be associated with provenance that describes its lineage
• How and when it was created
• Compiler options used in simulation
• ??XMLfrontendedDatabase query used on what ??GridNodes
Provenance is produced by computer automatically and/or by user
All ??Data can and should be labeled by a URI such as cicc://ciccnodenumber.xx.yy.whathaveyou
We can use a del.icio.us style interface to annotate ??Data with missing provenance and user comments of any type (describing quality of data or a keyword relating different data etc.)
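One possible shape for such a provenance record is sketched below: a URI-labeled entry with automatically produced lineage fields plus user tags and comments added in del.icio.us style. The class, field names and example values are hypothetical; only the URI pattern is taken from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceRecord:
    """Lineage for one data item, labeled by URI (all field names are illustrative)."""
    uri: str
    created_by: str = ""
    created_at: str = ""
    options: dict = field(default_factory=dict)   # e.g. compiler options, query used
    tags: set = field(default_factory=set)        # user keywords
    comments: list = field(default_factory=list)  # free-form user annotations

    def annotate(self, tag=None, comment=None):
        """Add a user tag and/or comment after the fact."""
        if tag:
            self.tags.add(tag)
        if comment:
            self.comments.append(comment)

    def missing(self):
        """Names of lineage fields that were not produced automatically."""
        return [name for name in ("created_by", "created_at") if not getattr(self, name)]


record = ProvenanceRecord(
    uri="cicc://ciccnodenumber.xx.yy.whathaveyou",
    created_at="2006-06-29",
    options={"compiler": "-O2"},
)
record.annotate(tag="chemistry", comment="quality looks good")
gaps = record.missing()
```

The `missing` helper points at exactly the provenance a user would be asked to fill in through the annotation interface.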

Semantic Scholar Grid
Citeseer and Google Scholar scour the Internet and analyze documents for incidental metadata
• Title, author and institution of documents
• Citations with their own metadata allowing one to match to other documents
These capabilities are sure to become more powerful and to be extended
• Give “Citation Index” in real time
• Tell you all authors of all papers that cite a paper that cites you etc. (Note it’s a small world so don’t go too far in link analysis)
• Tell you all citations of all papers in a workshop
Such high value tools will appear on “publisher” sites of the future (or else publishers will disappear)

OSCAR 2 Chemistry Document analysis
It detects “magic” chemical strings in text and then
• Stores them as metadata associated with the document
• Queries ChemInformatics repositories to tell you lots of information about identified compounds
• Tells you which other documents have this compound

?? Version of OSCAR
Some of the ??Nodes will store metadata associated with ??Data – including documents
• Note documents could be anywhere on the Internet – the ??Node may choose to store (a copy of) the document or just its metadata
• Note all ??Nodes are federated, i.e. there is no “one central” store of any type of data
Metadata will be user annotations including tags, and Citeseer style citation information for all scientific fields
Then each scientific field has its own version of OSCAR tuned to extract natural metadata for science – for Earthquake science this is GML and for Chemistry it is CML …

Document-enhanced Research Grid
(Figure labels as extracted)
• Existing Document-based Research Tools: Windows Live Academic Search, CiteULike, Google Scholar, Connotea, Citeseer, Del.icio.us, Science.gov, Bibsonomy, PubChem, PubMed
• Community Tools: CMT Conference Management, Manuscript Central etc.
• Generic Document Tools; Export: RSS, Bibtex, Endnote etc.
• Bibliographic Database; MyResearch Database
• Traditional Cyberinfrastructure
• Integration/Enhancement; Web service Wrappers; Existing User Interface
• New Document-enhanced Research Tools: Biolicious; User Interface

Integration Framework of Tools
(Figure labels as extracted)
• Tool-1 Del.icio.us, Tool-2 Connotea, Tool-3 CiteULike, …, Tool-N e.g. CiteSeer, each with its own Native UI
• Gateway WS-1, Gateway WS-2, Gateway WS-3, …, Gateway WS-N
• SSG MD Store
• SSG Domain-1 Web service, …, SSG Domain-N Web service
• Integrated User Interface; UI