Best Practices for Setting Up Computer Hardware in a Grid Environment
Tom Keefer, Performance Analyst, SAS
Cheryl Doninger, R&D Director, SAS
Copyright © 2007, SAS Institute Inc. All rights reserved. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Recipe for Success
SAS, Grid Computing, lots of SAS users
• review different grid architectures
  - different operating systems, network connectivity, storage solutions
• show scalable throughput and sustained I/O as the number of grid nodes increases
• create reference architectures of successful grid configurations to help answer your questions
What is Grid Computing?
• “Grid computing integrates, virtualizes, and manages resources (software and hardware) to provide a much larger, powerful distributed computing infrastructure.”
Benefits of SAS on a Grid
• increases scalability
• increases availability
• facilitates provisioning
• increases flexibility
• reduces costs
= Virtual Data Center
Running SAS on a Grid
SAS Grid Manager
• Distributed Enterprise Scheduling: distribute jobs within workflows to a range of hosts.
• Workload Balancing: distribute workloads to a shared pool of resources.
• Parallelized Workload Balancing: distribute parallelized SAS workloads to a shared pool of resources.
Automatically find and use the best available resource for each job.
What products can leverage SAS Grid Manager?
SAS Grid Manager
• Distributed Enterprise Scheduling
  - SAS Data Integration Studio
  - SAS Web Report Studio
  - SAS Marketing Automation
  - SAS Marketing Optimization
  - Any SAS program
• Workload Balancing
  - Any SAS program (with wrapper), including stored processes and SAS Enterprise Guide programs
• Parallelized Workload Balancing
  - SAS Data Integration Studio
  - SAS Enterprise Miner
  - SAS Risk Dimensions
  - Any SAS program (with modification)
SAS Grid Architecture Topology
[Diagram: grid topology. Labels recoverable from the slide:]
• Grid Client: Management Console (Grid Manager plug-in); DIS or EM; SAS program
• Metadata Server
• Grid Control Machine: Platform LSF, Platform Process Mgr, Platform Grid Management Service, Base SAS, SAS/Connect, SAS Workspace Server, SAS Grid Server, SAS Data Step Batch Server
• Grid Node 1, Grid Node 2, ... Grid Node n: Platform LSF, Base SAS, SAS/Connect, SAS Grid Server, SAS Data Step Batch Server
• SASApp
• Central File Server for: job deployment directories; source and target data; SAS log files
Keys To Success - Areas To Focus
• node configuration
  - heterogeneous or homogeneous
• number and type of processors
• memory
• storage/data access
No different than a single server, just more systems.
Data Storage is The Key
• sharable
• throughput across the grid
• scalable
• locality of data
  - input files
  - output files
  - temporary files
  - external data access
Shared File System Testing Efforts

Operating System           File Sharing Technology
Red Hat Linux (RHEL 4)     EMC Celerra Multi-Path File System on iSCSI (MPFSi)
Red Hat Linux (RHEL 4)     Network Appliance (NFS)
Sun Solaris 10             Sun StorageTek QFS
Red Hat Linux (RHEL 4)*    Global File System (GFS)
Windows*                   Polyserve / HP Matrix
AIX*                       IBM General Parallel File System (GPFS)
HP-UX*                     Veritas Clustered File System (CFS)

*Efforts ongoing
Steps to Success With Grid
• determine your system requirements
  - what does your application do?
  - data flow diagram
• architect your system
• test throughput outside of SAS first
  - third-party tools
  - replicate your application's behavior (I/O pattern)
• run single-node SAS tests, then scale out
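The "test throughput outside of SAS first" step can be illustrated with a small probe. This is only a minimal sketch, not a tool named in the presentation: the function names, the probe file name, and the default sizes are all hypothetical, and it exercises only a simple sequential write followed by a sequential read. A real test should use a file larger than the node's memory and an I/O pattern that matches the application.

```python
import os
import tempfile
import time

PROBE_FILE = "grid_probe.bin"  # hypothetical scratch file name
MIB = 1024 * 1024


def write_throughput_mb_s(path, total_mb=16, block_kb=64):
    """Sequentially write about total_mb MiB in block_kb KiB chunks; return MiB/s."""
    block = b"\0" * (block_kb * 1024)
    n_blocks = max(1, (total_mb * 1024) // block_kb)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # count the time needed to reach storage
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (n_blocks * len(block) / MIB) / elapsed


def read_throughput_mb_s(path, block_kb=64):
    """Sequentially read the file back; return MiB/s.

    The data was just written, so this read may be served from the
    operating system's cache rather than from the shared storage.
    """
    size_mb = os.path.getsize(path) / MIB
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(block_kb * 1024):
            pass
    elapsed = max(time.perf_counter() - start, 1e-9)
    return size_mb / elapsed


def run_probe(directory, total_mb=16, block_kb=64):
    """Write then read a scratch file in `directory`; return both rates."""
    path = os.path.join(directory, PROBE_FILE)
    try:
        return {
            "write_mb_s": write_throughput_mb_s(path, total_mb, block_kb),
            "read_mb_s": read_throughput_mb_s(path, block_kb),
        }
    finally:
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    # Point this at a directory on the shared file system under test;
    # a temporary directory is used here only so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as scratch:
        print(run_probe(scratch))
```

Running the same probe on one node and then on several nodes at once gives a first look at how the shared file system behaves before any SAS job is involved.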
EMC MPFSi Architecture
[Diagram labels: IP traffic, switch, “The Directory”, conversion, Fibre Channel, EMC storage, /work, /data]
Notes:
• NAS
• MPFSi client on nodes
• network “managers”
• leverage existing network
EMC MPFSi Discussion Points
• based on previous “Highroad” product
• SAS data integration benchmarking scenario
• 40 Linux grid nodes
  - dual core, dual Ethernet per node for data
  - up to 160 simultaneous SAS processes
• performance tips:
  - analyze throughput from node to storage (data flow!)
  - watch placement of disk volumes for performance
  - don’t allow non-grid activity on network
  - separate client and admin network
  - monitor director and data mover throughput
Network Appliance NFS Architecture
[Diagram labels: Linux nodes, network switch, ALL Ethernet, NetApp FAS 6030 (network storage), /data, /work]
Notes:
• NAS
• NFS client on nodes
• leverage existing network
• NFS everywhere
NetApp NFS Discussion Points
• pure network file system implementation (NFS)
• SAS data integration benchmarking scenario
• 10 Linux grid nodes
  - quad core*, single Ethernet per node for data
• performance tips:
  - check throughput from node to storage (data flow!)
  - don’t allow non-grid activity on network
  - separate client and admin network
  - watch placement of disk volumes for performance
* important note: core to throughput per node ratio
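The core-to-throughput note can be made concrete with a little arithmetic. The sketch below assumes 1 Gbit/s Ethernet links and treats the stated core counts as cores per node; neither is given in the slides, and protocol overhead is ignored, so the numbers are theoretical upper bounds rather than measured results.

```python
def per_core_mb_s(links_per_node, link_gbit_s, cores_per_node):
    """Theoretical network bandwidth available to each core, in MB/s.

    Converts link speed from Gbit/s to MB/s (1 Gbit/s = 125 MB/s) and
    divides the node's total evenly among its cores.
    """
    if cores_per_node <= 0:
        raise ValueError("cores_per_node must be positive")
    node_mb_s = links_per_node * link_gbit_s * 1000 / 8
    return node_mb_s / cores_per_node


# Assumed 1 Gbit/s links (not stated in the slides).
nfs_node = per_core_mb_s(links_per_node=1, link_gbit_s=1.0, cores_per_node=4)
mpfsi_node = per_core_mb_s(links_per_node=2, link_gbit_s=1.0, cores_per_node=2)

print(f"quad core, single Ethernet: {nfs_node} MB/s per core")   # 31.25
print(f"dual core, dual Ethernet:   {mpfsi_node} MB/s per core")  # 125.0
```

Under these assumptions, adding cores to a node without adding data links reduces the bandwidth each SAS process can draw on, which is the ratio the note warns about.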
Sun QFS Architecture
[Diagram labels: server nodes, fibre channel, FC switch, Sun storage, /data, /work]
Notes:
• SAN
• QFS software on nodes
• QFS server “master”
• fibre channel from node to disk
Sun QFS Discussion Points
• pure fibre channel (SAN)
• SAS data integration benchmarking scenario
• up to 4 Solaris server nodes
  - 48 to 64 core grid nodes (144 total on grid)
  - up to 180 simultaneous SAS processes
  - up to 20 fibre channel connections per server
• performance tips:
  - check throughput from node to storage (data flow!)
  - watch placement of disk volumes for performance
  - setup of QFS master server
Other Shared File System Technologies
• SAN based (fibre channel)
  - Multi-Path File System (MPFS), NOT iSCSI
  - IBM General Parallel File System (GPFS)
  - Polyserve / HP Matrix (the only one available for Windows!)
  - Linux Global File System (GFS)
  - Veritas Clustered File System (CFS)
• NAS (Ethernet)
  - NetApp with iSCSI
SAS is continuing its testing efforts with various partners.
Overall Best Practices for Shared File Systems
• data flow diagram
  - understand your application's throughput requirements before you talk to a storage vendor
• monitoring and management tools are a must!
• test throughput OUTSIDE of SAS first!
• some technologies have volume placement limitations!
  - i.e., can you span all the arrays with a single volume?
• analyze throughput per $ before you buy
• availability... backups... future scalability...
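The "throughput per $" comparison is simple division, sketched below. Every name and figure in the example is hypothetical and chosen only to show the calculation; none comes from the benchmarks in this presentation. Use your own measured sustained throughput and quoted prices.

```python
from dataclasses import dataclass


@dataclass
class StorageOption:
    """One candidate storage configuration (all values are illustrative)."""
    name: str
    sustained_mb_s: float  # measured sustained throughput across the grid
    cost_usd: float        # total purchase cost

    @property
    def mb_s_per_kusd(self):
        """Sustained MB/s delivered per 1,000 dollars spent."""
        return self.sustained_mb_s / (self.cost_usd / 1000)


def rank_by_value(options):
    """Return the options sorted from best to worst throughput per dollar."""
    return sorted(options, key=lambda o: o.mb_s_per_kusd, reverse=True)


# Hypothetical candidates, not real products or prices.
candidates = [
    StorageOption("option A", sustained_mb_s=800, cost_usd=200_000),
    StorageOption("option B", sustained_mb_s=500, cost_usd=100_000),
]

for option in rank_by_value(candidates):
    print(f"{option.name}: {option.mb_s_per_kusd:.1f} MB/s per $1,000")
```

The faster option is not always the better value: here option B delivers less total throughput but more per dollar, so the choice depends on whether it still meets the throughput requirement from the data flow diagram.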
SAS Scalable Performance Data Server on a Grid
Each server / grid node runs its own instance of SAS and SPDS Server.
[Diagram labels: server / grid nodes, SAN or NAS, shared file systems]
SPDS directories:
• /spds/index
• /spds/meta
• /spds/data1
• /spds/data2
Bottom line: myspdslib.mysastable is available on any server!
SAS Really Scales in a Grid
• scalable I/O throughput
• lots of choices for OS, storage solution, etc.
• our work will continue...
More to See and Do...
• “A Throughput-Intensive Compute and Storage Grid Using SAS® Grid Manager”
  - Somantak Chanda, American Express
  - Tues 1:30-2:20, Northern Hemisphere E-2
• SAS Grid: demo booth #16
• IT Intelligence for Grid Optimization: demo booth #53
• Platform Computing: Alliance Café booth #87
• various storage partners: Alliance Café
For More Information...
• scalability website: http://support.sas.com/rnd/scalability/grid
• today’s presentation: http://support.sas.com/rnd/scalability/gridpapers.html