Sun Clusters
Ira Pramanick, Sun Microsystems, Inc.
Outline
- Today's vs. tomorrow's clusters
  - How they are used today and how this will change
- Characteristics of future clusters
  - Clusters as general-purpose platforms
- How they will be delivered
  - Sun's Full Moon architecture
- Summary & conclusions
Clustering Today
- Mostly for HA
- Little sharing of resources
- Exposed topology
- Hard to use
- Layered on the OS
- Reactive solution
(Diagram: LAN/WAN, IP switch)
Clustering Tomorrow
(Diagram: LAN/WAN, central console, global networking, global storage)
Sun Full Moon architecture
- Turns clusters into general-purpose platforms
  - Cluster-wide file systems, devices, networking
  - Cluster-wide load-balancing and resource management
- Integrated solution
  - HW, system SW, storage, applications, support/service
- Embedded in Solaris 8
- Builds on the existing Sun Cluster line
  - Sun Cluster 2.2 -> Sun Cluster 3.0
Characteristics of tomorrow's clusters
- High availability
- Cluster-wide resource sharing: files, devices, LAN
- Flexibility & scalability
- Close integration with the OS
- Load-balancing & application management
- Global system management
- Integration of all parts: HW, SW, applications, support, HA guarantees
High Availability
- End-to-end application availability
  - What matters: applications as seen by network clients are highly available
  - Enable Service Level Agreements
- Failures will happen
  - SW, HW, operator errors, unplanned maintenance, etc.
  - Mask failures from applications as much as possible
  - Mask application failures from clients
High Availability...
- No single point of failure
  - Use multiple components for HA & scalability
- Need a strong HA foundation integrated into the OS
  - Node group membership, with quorum (see the sketch below)
  - Well-defined failure boundaries -- no shared memory
  - Communication integrated with membership
  - Storage fencing
  - Transparently restartable services
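The quorum rule above can be pictured with a minimal sketch, assuming a simple vote model in which each node and an optional quorum device contribute one vote and a partition keeps running only with a strict majority. The class and method names are illustrative, not Sun Cluster interfaces.

    // Illustrative majority-quorum check (not the Sun Cluster implementation).
    // Each node contributes one vote; a quorum device breaks the tie in a
    // two-node cluster so that exactly one partition can continue.
    import java.util.Set;

    public class QuorumCheck {
        private final int totalVotes;   // configured votes: nodes + quorum devices

        public QuorumCheck(int nodeCount, int quorumDeviceVotes) {
            this.totalVotes = nodeCount + quorumDeviceVotes;
        }

        /** A partition may continue only if it holds a strict majority of votes. */
        public boolean hasQuorum(Set<String> reachableNodes, int heldDeviceVotes) {
            return reachableNodes.size() + heldDeviceVotes > totalVotes / 2;
        }

        public static void main(String[] args) {
            QuorumCheck q = new QuorumCheck(2, 1);                 // two nodes + one quorum disk
            System.out.println(q.hasQuorum(Set.of("nodeA"), 1));   // true: holds 2 of 3 votes
            System.out.println(q.hasQuorum(Set.of("nodeB"), 0));   // false: must be fenced off
        }
    }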
High Availability...
- Applications are the key
  - Most applications are not cluster-aware
  - Mask most errors from applications
  - Restart when a node fails, with no recompile (see the restart sketch below)
  - Provide support for cluster-aware apps
  - Cluster APIs, fast communication
- Disaster recovery
  - Campus separation and geographical data replication
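As a rough illustration of restarting an unmodified application without recompiling it, the sketch below supervises a process and restarts it when it exits abnormally; the command path and retry policy are assumptions, and escalation to another node is only noted in a comment.

    // Hedged sketch: restart a non-cluster-aware application when it fails,
    // masking the error from clients. Command and retry policy are made up.
    public class RestartSupervisor {
        public static void main(String[] args) throws Exception {
            String[] command = {"/opt/app/bin/server"};   // hypothetical application binary
            int attempts = 0;
            while (attempts < 3) {                        // local retries before giving up
                Process p = new ProcessBuilder(command).inheritIO().start();
                int exit = p.waitFor();
                if (exit == 0) break;                     // clean shutdown: stop supervising
                attempts++;
                System.err.println("application exited with " + exit + ", restarting");
                Thread.sleep(5_000);                      // back off before the restart
            }
            // A real cluster framework would fail the application over to another
            // node once local retries are exhausted; that step is omitted here.
        }
    }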
Resource Sharing
- What is important to applications?
  - Ability to run on any node in the cluster
  - Uniform global access to all storage and network
  - Standard system APIs
- What to hide?
  - Hardware topology, disk interconnect, LAN adapters, hardwired physical names
Resource Sharing...
- What is needed?
  - Cluster-wide access to existing file systems, volumes, devices, tapes
  - Cluster-wide access to LAN/WAN
  - Standard OS APIs: no application rewrite/recompile
- Use the SMP model
  - Apps run on the machine (not "CPU 5, board 3, bus 2")
  - Logical resource names independent of actual path (see the naming sketch below)
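To make the idea of path-independent logical names concrete, here is a small sketch of a name service that maps a stable, cluster-wide name to whichever physical path currently backs it; the names and device paths are invented for illustration.

    // Illustrative mapping from logical, cluster-wide resource names to the
    // physical path that currently serves them. Applications resolve only the
    // stable name and never see the hardware topology.
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class GlobalNameService {
        private final Map<String, String> logicalToPhysical = new ConcurrentHashMap<>();

        /** Called by the framework when a resource moves to a different path or node. */
        public void bind(String logicalName, String physicalPath) {
            logicalToPhysical.put(logicalName, physicalPath);
        }

        /** Applications look up the logical name only. */
        public String resolve(String logicalName) {
            String path = logicalToPhysical.get(logicalName);
            if (path == null) throw new IllegalStateException("unknown resource: " + logicalName);
            return path;
        }

        public static void main(String[] args) {
            GlobalNameService names = new GlobalNameService();
            names.bind("/global/db-logs", "/dev/dsk/c2t0d0s6");   // initial location
            System.out.println(names.resolve("/global/db-logs"));
            names.bind("/global/db-logs", "/dev/dsk/c5t1d0s6");   // storage moved; name unchanged
            System.out.println(names.resolve("/global/db-logs"));
        }
    }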
Resource Sharing...
- Cluster-wide, location-independent resource access
  - Run applications on any node
  - Failover/switchover apps to any node
  - Global job/work queues, print queues, etc.
  - Change/maintain hardware topology without affecting applications
- But this need not require a fully connected SAN
  - The main interconnect can be used through software support
Flexibility
- Business needs change all the time
  - Therefore, the platform must be flexible
- System must be dynamic -- all done online
  - Resources can be added and removed
  - Dynamic reconfiguration of each node
    - Hot-plug in and out of I/O, CPUs, memory, storage, etc.
  - Dynamic reconfiguration between nodes
    - More nodes, load-balancing, application reconfiguration
Scalability
- Cluster SMP nodes
- Choose nodes as big as needed to scale the application
  - Need expansion room within nodes too
- Don't use clustering exclusively to scale applications
  - Interconnect speed is slower than backplane speed
  - Few cluster-aware applications
  - Clustering a large number of small nodes is like herding chickens
Close integration with the OS
- Currently: multi-CPU SMP support in the OS
  - Does not make sense otherwise
- Next step: cluster support in the OS
  - The next dimension of OS support: across nodes
- Clustering will become part of the OS
  - Not a loosely integrated layer
Advantages of OS integration
- Ease of use
  - Same administration model, commands, installation
- Availability
  - Integrated heartbeat, membership, fencing, etc.
- Performance
  - In-kernel support, inter-node/process messaging, etc.
- Leverage
  - All OS features/support available for clustering
Load-balancing
- Load-balancing is done at various levels
  - Built-in network load-balancing
    - For example, incoming HTTP requests; TCP/IP bandwidth
  - Transactions at the middleware level
  - Global job queues
- All nodes have access to all storage and network
  - Therefore any node can be eligible to perform the work (see the sketch below)
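Because every node can reach all storage and networking, incoming work can be spread across any of them; the sketch below shows the simplest such policy, a round-robin distributor over a node list. The node names and the policy itself are illustrative, not the built-in network load-balancing mechanism.

    // Minimal round-robin distribution of incoming requests across cluster nodes.
    import java.util.List;
    import java.util.concurrent.atomic.AtomicLong;

    public class RoundRobinBalancer {
        private final List<String> nodes;
        private final AtomicLong counter = new AtomicLong();

        public RoundRobinBalancer(List<String> nodes) {
            this.nodes = List.copyOf(nodes);
        }

        /** Pick the node that should handle the next incoming request. */
        public String nextNode() {
            int index = (int) (counter.getAndIncrement() % nodes.size());
            return nodes.get(index);
        }

        public static void main(String[] args) {
            RoundRobinBalancer lb = new RoundRobinBalancer(List.of("node1", "node2", "node3"));
            for (int i = 0; i < 6; i++) {
                System.out.println("request " + i + " -> " + lb.nextNode());
            }
        }
    }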
Resource management
- Cluster-wide resource management
  - CPU, network, interconnect, I/O bandwidth
  - Cluster-wide application priorities
- Global resource requirements guaranteed locally
  - Needs per-node resource management (see the share-allocation sketch below)
- High availability is not just making sure an application is started
  - Must guarantee the resources to finish the job
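One way to picture how a cluster-wide priority becomes a local guarantee is to convert relative application shares into the fraction of each node's CPU the application may claim. The share model below is an assumption for illustration, not a specific Solaris resource-management interface.

    // Hedged sketch: turn cluster-wide application priorities (shares) into
    // per-node CPU fractions so the global guarantee is enforced locally.
    import java.util.LinkedHashMap;
    import java.util.Map;

    public class ShareAllocator {
        /** Convert relative shares into the fraction of a node's CPU each app may use. */
        public static Map<String, Double> cpuFractions(Map<String, Integer> shares) {
            int total = shares.values().stream().mapToInt(Integer::intValue).sum();
            Map<String, Double> fractions = new LinkedHashMap<>();
            shares.forEach((app, s) -> fractions.put(app, (double) s / total));
            return fractions;
        }

        public static void main(String[] args) {
            Map<String, Integer> shares = Map.of("database", 60, "webserver", 30, "batch", 10);
            // e.g. {database=0.6, webserver=0.3, batch=0.1}, applied on every eligible node
            System.out.println(cpuFractions(shares));
        }
    }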
Global cluster management
- System management
  - Perform administrative functions once
  - Maintain the same model as a single node
  - Same tools/commands as the base OS -- minimize retraining
- Hide complexity
  - Most administrative operations should not deal with HW topology
  - But still enable low-level diagnostics and management
A Total Clustering Solution
- Integration of all components: cluster OS software, applications, system management, middleware, HA guarantee practice, servers, storage, cluster interconnect, service and support
Roadmap
- Sun Cluster 2.2: currently shipping
  - Solaris 2.6, Solaris 7, Solaris 8 3/00
  - 4 nodes
  - Year 2000 compliant
  - Choice of servers, storage, interconnects, topologies, networks
  - 10 km separation
- Sun Cluster 3.0
  - External Alpha 6/99, Beta Q1 CY'00, GA 2H CY'00
  - 8 nodes
  - Extensive set of new features: cluster file system, global devices, network load-balancing, new APIs (RGM), diskless application failover, SyMON integration
Wide Range of Applications
- Agents developed, sold, and supported by Sun
  - Databases (Oracle, Sybase, Informix XPS), SAP
  - Netscape (HTTP, news, mail, LDAP), Lotus Notes
  - NFS, DNS, Tivoli
- Sold and supported by 3rd parties
  - IBM DB2 and DB2 PE, BEA Tuxedo
- Agents developed and supported by Sun Professional Services
  - A large list, including many in-house applications
- Toolkit for agent development
  - Application management API, training, Sun PS support (see the agent sketch below)
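The agent toolkit bullet suggests the general shape of an application agent: the framework asks the agent to start, stop, and probe an otherwise unmodified application. The interface and names below are assumptions for illustration, not the actual Sun Cluster agent API.

    // Illustrative agent contract for wrapping an off-the-shelf application.
    public class AgentSketch {
        interface ApplicationAgent {
            void start() throws Exception;   // bring the service up on this node
            void stop() throws Exception;    // shut it down so it can move elsewhere
            boolean probe();                 // healthy? a failed probe triggers restart or failover
        }

        public static void main(String[] args) throws Exception {
            ApplicationAgent agent = new ApplicationAgent() {
                private volatile boolean running;
                public void start() { running = true;  System.out.println("service started"); }
                public void stop()  { running = false; System.out.println("service stopped"); }
                public boolean probe() { return running; }
            };
            agent.start();
            System.out.println("healthy: " + agent.probe());
            agent.stop();
        }
    }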
Full Moon clustering
- Embedded in Solaris 8
- Dynamic domains
- Built-in load balancing
- Global resource management
- Global networking
- Wide range of HW
- Global devices
- Global application management
- Global storage
- Cluster APIs
- Single management console
- Global file system
Summary
- Clusters as general-purpose platforms
  - Shift from a reactive to a proactive clustering solution
- Clusters must be built on a strong foundation
  - Embed into a solid operating system
  - Full Moon bakes clustering technology into Solaris
- Make clusters easy to use
  - Hide complexity, hardware details
- Must be an integrated solution
  - From platform, through service/support, to HA guarantees


