- Number of slides: 58
A Tutorial
Designing Cluster Computers and High Performance Storage Architectures
At HPC ASIA 2002, Bangalore, INDIA, December 16, 2002
By
Dheeraj Bhardwaj, Department of Computer Science & Engineering, Indian Institute of Technology, Delhi, INDIA, e-mail: dheerajb@cse.iits.ac.in
N. Seetharama Krishna, Centre for Development of Advanced Computing, Pune University Campus, Pune, INDIA, e-mail: krishna@cdacindia.com
Dheeraj Bhardwaj
Acknowledgments
• All the contributors of LINUX
• All the contributors of Cluster Technology
• All the contributors in the art and science of parallel computing
• Department of Computer Science & Engineering, IIT Delhi
• Centre for Development of Advanced Computing (C-DAC) and collaborators
Disclaimer
• The information and examples provided are based on the Red Hat Linux 7.2 installation on Intel PC platforms (our specific hardware specifications)
• Much of it should be applicable to other versions of Linux
• There is no warranty that the materials are error free
• Authors will not be held responsible for any direct, indirect, special, incidental or consequential damages related to any use of these materials
Outline
• Introduction
  – Brief history of storage technologies
  – Importance of storage subsystems
  – Recent requirements and developments
• Overview of storage components
• Overview of Storage Models
• File Systems
• I/O
• Designing the Storage Architectures
• Discussions
Introduction
Brief History of Storage Technologies
– Make 2-3 slides
Introduction
Importance of Storage Subsystems
• Greater demand from technical and commercial users for
  – Higher capacity to meet growing demands
  – Higher performance to serve an increased user base
  – Very high performance to balance compute and I/O in technical computing
Introduction
Importance of Storage Subsystems
• Greater demand from technical and commercial users for
  – Manageability, to meet the challenges of managing data
• A large user base demands
  – Large capacity
  – Ever-increasing throughput
  – Support for ever-changing application configuration needs
Introduction
• Required Capabilities
  – Meet the demands of multi-teraflop compute power
  – Scalable from 1 TF needs to 10 TF needs
  – Network-centered architecture
  – Scalable in performance and capacity
  – Centralized backup, archive and management
Introduction
• Required Capabilities
  – Built-in parallel operation
  – A design based on standard components
  – Multiple hierarchies and classes of service
  – Support for heterogeneous compute systems
  – Large file size support
  – Balanced architecture for mixed workloads
Introduction
Today's Storage Challenges
• Managing the increasing volume of data
• Providing continuous access to information
• Adopting an evolving set of storage technologies
• Investment protection on legacy resources
• Multi-vendor interoperability issues
Introduction
Today's Storage Challenges
Solution:
• An open, standards-based approach to storage management must be the rule, not the exception
• Open standards address key concerns
  – Supporting changing requirements
  – Managing heterogeneous device topologies
  – Incorporating best-of-breed products to create a complete storage solution
Objective
• To create state-of-the-art scalable, enterprise-wide, interoperable, manageable, modular and high performance storage, involving
  – Study of existing technologies
  – Sizing the requirements: capacity and performance
  – An architecture that meets the needs of the HPC and non-HPC user communities
  – Meeting mixed and ever-changing workload patterns
Objective
• To create state-of-the-art scalable, enterprise-wide, interoperable, manageable, modular and high performance storage, involving
  – A central storage facility accessible to authenticated in-house and remote users
  – A central backup facility to back up the storage as well as local clients
  – A cost-effective storage solution
Outline
• Introduction
• Overview of storage components
  – Disks
  – Interfaces
  – Protocols (SCSI, FCAL, iSCSI, FC-IP)
  – Secondary Storage (RAID)
  – Tertiary Storage (Back tapes)
• Overview of Storage Models
• File Systems
• Parallel I/O
• Storage management Software
• Security
• Designing the Storage Architectures
• Discussions
Storage Components - Disks: Please add at least one slide for one component
Storage Components - Interfaces: Please add at least one slide for one component
Storage Components - Protocols: Please add at least one slide for one component
Storage Components - Secondary Storage (RAID): Please add at least one slide for one component
Storage Components - Tertiary Storage (Tape): Please add at least one slide for one component
Outline
• Introduction
• Overview of storage components
• Overview of Storage Models
  – DAS
  – NAS
  – SAN
  – FAS (NAS & SAN coexist)
• File Systems
• Parallel I/O
• Storage management Software
• Security
• Designing the Storage Architectures
• Discussions
Overview of Storage Models - DAS: Direct Attached Storage (DAS) Model
Direct Attached Storage: Please write features, advantages and disadvantages
Network Attached Storage (NAS) Model
Network Attached Storage (NAS): Please write features, advantages and disadvantages
Storage Area Network (SAN) Model
Storage Area Network (SAN): Please write features, advantages and disadvantages
Fiber Attached Storage (FAS): NAS and SAN coexist
NAS and SAN coexist: Justify NAS and SAN coexistence (pick up from our papers)
Advantages of FAS
• Centralizing management to improve staff efficiency for monitoring and administration
• Enabling storage to be more readily available to any server on the network, making stored information a more valuable asset and increasing the utility of the network itself
• Improving the availability, usefulness, and distribution of business applications
• Making automation simpler, and reducing IT operational costs and staffing requirements
• Providing greater visibility into the availability and performance of storage components
• Facilitating continuous availability requirements
Outline
• Introduction
• Overview of storage components
• Overview of Storage Models
• File Systems
  – Overview
  – File System Calculations
  – VFS
  – CFS
  – PFS
  – HPSS
• Parallel I/O
• Storage management Software
• Security
• Designing the Storage Architectures
• Discussions
File System Calculation
Aggregate Bandwidth Rates for One Parallel Job
Teraflops    Memory Size (GB)    I/O Rates (GB/s)
1            700                 1.17 – 2
• Assumptions:
  – For the lower memory estimates, it is assumed that an n teraflops machine requires n^(3/4) TB of memory.
  – For the higher memory estimates, it is assumed that an n teraflops machine requires 2/3 * n TB of memory.
• Reference: Statement of Work: SGS File System, Report, DOE National Nuclear Security Administration, USA, April 2001
Assumptions for File System Capacity Calculations
• The lower I/O rate estimates are based on the throughput needed to store one half of the smaller memory in five minutes.
  – (1/2 * 700 GB) / (5 * 60 s) = 1.17 GB/s
• The higher I/O rate estimates assume that applications will store one byte for every 500 floating point operations. This is a commonly used rule of thumb.
  – 1 TF / 500 flops per byte = 2 GB/s
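The two I/O rate estimates above can be reproduced with a few lines of Python. This is only a sketch of the slide arithmetic; the function and parameter names are illustrative and do not come from the tutorial, and decimal units (1 GB = 10^9 bytes) are assumed.

```python
def lower_io_rate_gb_s(memory_gb, fraction=0.5, window_s=5 * 60):
    """Throughput (GB/s) needed to store `fraction` of memory in `window_s` seconds."""
    return fraction * memory_gb / window_s


def higher_io_rate_gb_s(peak_tflops, flops_per_byte=500):
    """Throughput (GB/s) if one byte is stored per `flops_per_byte` operations."""
    flops_per_second = peak_tflops * 1e12
    bytes_per_second = flops_per_second / flops_per_byte
    return bytes_per_second / 1e9  # decimal gigabytes (assumption)


# 1 TF machine with 700 GB of memory, as on the slide
print(f"lower estimate:  {lower_io_rate_gb_s(700):.2f} GB/s")
print(f"higher estimate: {higher_io_rate_gb_s(1):.2f} GB/s")
```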
Assumptions for File System Capacity Calculations
• For the number of directories, it is assumed that every user will have approximately 5000 directories.
  – 300 users * 5000 directories = 1.5 * 10^6
• For the number of files, a minimum of 25 files per directory and a maximum of 200,000 files per directory are assumed.
  – Minimum: 1.5 * 10^6 directories * 25 files = 37.5 * 10^6
  – Maximum: 1.5 * 10^6 directories * 2 * 10^5 files = 3 * 10^11
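The directory and file counts follow directly from these assumptions. The short sketch below repeats the arithmetic with integer values; the variable names are illustrative, and the maximum of 2 * 10^5 files per directory is the figure used in the slide's own multiplication.

```python
users = 300
directories_per_user = 5_000
files_per_directory_min = 25
files_per_directory_max = 200_000  # 2 * 10^5, as used in the slide arithmetic

directories = users * directories_per_user
files_min = directories * files_per_directory_min
files_max = directories * files_per_directory_max

print(f"directories: {directories:.1e}")
print(f"files (min): {files_min:.3e}")
print(f"files (max): {files_max:.1e}")
```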
Assumptions for File System Capacity Calculations
• File system size is derived using the formula
  – File system size = 1.25 * (7 to 18 * Peak Performance) TB
• Minimum: 1.25 * (7 * 1 TF) = 8.75 TB
• Minimum: 37.5 * 10^6 files * 256 K = 9.6 TB
• Maximum: 1.25 * (18 * 1 TF) = 22.5 TB
• For the number of devices/subsystem, we assume that 72 GB drives are used.
  – 8.75 TB / 72 GB ≈ 121 drives
  – 22.5 TB / 72 GB ≈ 312 drives
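As a rough check of the sizing formula, the sketch below computes the file system size and the corresponding number of 72 GB drives. The function names are illustrative and decimal units (1 TB = 1000 GB) are assumed. The slide figures of 121 and 312 drives correspond to rounding down; rounding up, which is what a purchase would need, gives one more drive in each case.

```python
import math


def file_system_size_tb(peak_tflops, factor):
    """File system size = 1.25 * (factor * peak performance), factor in 7..18."""
    return 1.25 * factor * peak_tflops


def drives_exact(size_tb, drive_gb=72):
    """Fractional number of drives needed, assuming 1 TB = 1000 GB."""
    return size_tb * 1000 / drive_gb


for factor in (7, 18):
    size = file_system_size_tb(1, factor)
    exact = drives_exact(size)
    print(f"factor {factor}: {size} TB, "
          f"{math.floor(exact)} drives rounded down, {math.ceil(exact)} rounded up")
```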
In Summary
File System Capacities
Teraflops                       1
Number of Users                 300
Number of Directories           1.5 * 10^6
Number of Files                 37.5 * 10^6 to 3 * 10^11
File System size (TB)           8.75 – 22.5
Number of devices/subsystem     121 – 312
I/O Bandwidth
• The file system maximum sustained bandwidth can be obtained from the formula
  B_fs = N * B_drives * E
  where
  B_fs     = maximum sustained bandwidth of the file system
  N        = number of drives
  B_drives = sustained bandwidth of the slowest disk
  E        = file system efficiency factor (0.85)
  – Minimum B_fs = 121 * 100 MB/s * 0.85 ≈ 10.28 GB/s
  – Maximum B_fs = 312 * 100 MB/s * 0.85 = 26.52 GB/s
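The bandwidth formula is simple enough to evaluate directly. The sketch below is one way to do so for the two drive counts; the function name and the conversion of 1000 MB to 1 GB are assumptions made for illustration.

```python
def max_sustained_bandwidth_gb_s(n_drives, drive_mb_s=100.0, efficiency=0.85):
    """B_fs = N * B_drives * E, converted from MB/s to GB/s (1 GB = 1000 MB)."""
    return n_drives * drive_mb_s * efficiency / 1000.0


for n_drives in (121, 312):
    bandwidth = max_sustained_bandwidth_gb_s(n_drives)
    print(f"{n_drives} drives -> {bandwidth:.3f} GB/s")
```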
Parallel File System
• PFS is designed as a client-server system with multiple I/O servers, each with disks/RAID attached. Each PFS file is striped across the disks on the I/O nodes.
• PFS also has a manager that handles only metadata operations, such as permission checking for file create, open, close and remove operations.
• Direct Parallel I/O
  – All participating clients access the storage directly via requests to the parallel I/O servers.
  – This provides the maximum throughput because it bypasses the overhead of intermediate file servers.
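To make the idea of striping concrete, the sketch below maps a byte offset in a file to an I/O server and an offset within that server's portion. It assumes simple round-robin striping with a fixed stripe size; the slide does not specify the actual PFS layout, so the function, the stripe size and the server count are all hypothetical.

```python
def locate(offset, stripe_size, n_servers):
    """Return (server index, local offset) for a byte offset in a striped file.

    Assumes round-robin striping: stripe 0 goes to server 0, stripe 1 to
    server 1, and so on, wrapping around after the last server.
    """
    stripe_index = offset // stripe_size
    server = stripe_index % n_servers
    stripes_before_on_server = stripe_index // n_servers
    local_offset = stripes_before_on_server * stripe_size + offset % stripe_size
    return server, local_offset


# Hypothetical layout: 64-byte stripes over 4 I/O servers
for offset in (0, 64, 200, 266):
    print(offset, locate(offset, stripe_size=64, n_servers=4))
```

Because consecutive stripes land on different servers, a large sequential read can be served by all I/O servers at once, which is where the aggregate throughput gain comes from.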
Cluster File System
Outline
• Introduction
• Overview of storage components
• Overview of Storage Models
• File Systems
• Parallel I/O
  – Introduction
  – Parallel I/O Approaches
  – (You can add some more)
• Storage management Software
• Security
• Designing the Storage Architectures
• Discussions
Introduction
Parallel & Serial I/O: Write the basic differences
I/O Approaches
• The following five I/O approaches can be used to distribute data across the participating processors in a parallel program:
  – UNIX I/O on NFS
  – Parallel I/O on NFS
  – UNIX I/O with PFS support
  – Parallel I/O with PFS support
  – Direct Parallel I/O
• UNIX I/O on NFS
  – The process with rank zero reads the input file using the standard UNIX read, partitions it and distributes it to the other processors.
  – The file is NFS mounted only on the processor running the rank-zero process.
• Parallel I/O on NFS
  – All the processors open the file concurrently and read their required data blocks by moving the offset pointer to the beginning of their corresponding data block in the input file.
  – The file is NFS mounted from the server on all the compute nodes.
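The "Parallel I/O on NFS" pattern, in which every process opens the same file and seeks to its own block, can be sketched without any parallel library. In the example below the ranks are simulated by a loop in a single process, the even block partition is an assumption, and all function names are illustrative; in a real parallel program each rank would call `read_block` with its own rank number.

```python
import os
import tempfile


def block_range(rank, n_procs, total_bytes):
    """Byte range [start, end) owned by `rank` under an even block partition."""
    base, extra = divmod(total_bytes, n_procs)
    start = rank * base + min(rank, extra)
    length = base + (1 if rank < extra else 0)
    return start, start + length


def read_block(path, rank, n_procs):
    """Open the shared file and read only this rank's block via an offset seek."""
    total = os.path.getsize(path)
    start, end = block_range(rank, n_procs, total)
    with open(path, "rb") as handle:
        handle.seek(start)
        return handle.read(end - start)


# Demo: a 10-byte input file shared by 3 simulated ranks
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(10)))
    demo_path = tmp.name

blocks = [read_block(demo_path, rank, 3) for rank in range(3)]
os.remove(demo_path)
print([list(block) for block in blocks])
```

In contrast, the "UNIX I/O on NFS" approach would have rank zero read the whole file and send each block to its owner, which serializes the read through one process.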
I/O Approaches
• UNIX I/O with PFS support
  – Define these terms
• Parallel I/O with PFS support
  – Define these terms
• Direct Parallel I/O
Outline
• Introduction
• Overview of storage components
• Overview of Storage Models
• File Systems
• Parallel I/O
• Storage management Software
  – Features
  – Details of available software and their features
  – Etc.
• Security
• Designing the Storage Architectures
• Discussions
Storage Management Software: Please make a few slides, say 8-10
Outline
• Introduction
• Overview of storage components
• Overview of Storage Models
• File Systems
• Parallel I/O
• Storage management Software
• Security
  – Overview
  – Other aspects
• Designing the Storage Architectures
• Discussions
Storage Security: Make some slides on security aspects of storage systems, e.g. Kerberos etc.
Outline
• Introduction
• Overview of storage components
• Overview of Storage Models
• File Systems
• Parallel I/O
• Storage management Software
• Security
• Designing the Storage Architectures
  – Approach
  – Traditional
  – Ideal
  – Logical
  – Proposed
  – Etc.
• Discussions
Approach on Architecture
• Compute Nodes, File Servers and File Systems
  – To support high bandwidth we have to use special purpose file systems rather than traditional file systems such as UFS and CIFS.
  – A Parallel File System (PFS) is necessary to stripe the data file across multiple disks to increase the total I/O throughput.
  – A Cluster File System (CFS) is a highly available, distributed, cache-coherent file system that allows a UFS file system to be accessed concurrently on multiple cluster nodes.
  – A set of file servers configured with a cluster file system (CFS) and a parallel file system (PFS) ensures high availability and throughput of the data to the users.
• Distribution Networks
  – As of today, there are two networks (standard Ethernet and proprietary) available to connect compute nodes to file servers for data transfer.
  – A third approach, extending the SAN directly to the compute nodes and avoiding file servers (direct parallel I/O), reduces the network bottleneck but is an expensive option.
[Figure labels: compute nodes C1, C2 ... C35 and C36, C37 ... C70; PARAM System Area Network; Gigabit Switch; cluster of file servers FS1 to FS8 running Cluster File System and Parallel File System; Fiber Switch; Storage Array]
Design of Architecture
• We propose an architecture that is a mix of DAS, NAS and SAN, connected together to the High Performance Computing cluster.
• We have chosen Direct Attached Storage connected directly to the application server to cater to its application development needs, such as compilers, tools, source code etc.
• It is advisable to keep the application and data storage spaces separate to get the best performance and to avoid a single point of failure.
• To achieve high throughput, a massively scalable storage system is built by combining multiple disk arrays, or by using a single large array with a large number of FC-AL interfaces.
• To achieve a throughput of multiple gigabytes per second at the file system level, we have to size the storage array output to twice the requirement.
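The "twice the requirement" rule can be turned into a small sizing helper. The sketch below is an illustration only: the function names are hypothetical, and the per-disk sustained rate of 100 MB/s is borrowed from the I/O bandwidth slide rather than stated here. The disk count it returns is a lower bound based on bandwidth alone and ignores RAID overhead and capacity requirements.

```python
import math


def required_array_output_gb_s(fs_target_gb_s, headroom=2.0):
    """Storage array output needed for a file-system-level target (2x rule)."""
    return headroom * fs_target_gb_s


def disks_for_output(array_output_gb_s, disk_mb_s=100.0):
    """Minimum number of disks whose combined sustained rate meets the output."""
    return math.ceil(array_output_gb_s * 1000.0 / disk_mb_s)


target = 2.0  # GB/s at file system level, e.g. the higher I/O rate for 1 TF
array_output = required_array_output_gb_s(target)
print(f"array output: {array_output} GB/s, "
      f"disks needed: {disks_for_output(array_output)}")
```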
Design of Architecture
• We also have to size the number of disks so that they can deliver the desired sustained performance.
• Our approach of keeping application data on DAS, sequential users' data on NAS and high performance computing data on SAN-attached storage automatically separates the data from each other.
• The highly automated tape library, connected to the storage array, NAS and DAS through the Fiber Channel interface and accompanied by the data acquisition backup master server, makes it possible to take online backups in a server-free and LAN-free environment.
• This frees the CPUs of the file servers from backup and restore jobs, so that they can focus on serving the high performance computing users.
Scalability
• The quantities that should scale are:
  – Access
  – Storage capacity
  – SAN
  – I/O bandwidth
• Access: parallel access to multiple devices.
• Storage capacity: this can be addressed in two ways:
  – Big monolithic storage box: supports several hundred disks, but a large disk array may have limitations in terms of bandwidth scalability and reliability.
  – Multiple RAID arrays: connected to the fiber channel SAN and configured as a single storage unit to enhance capacity without affecting bandwidth.
• SAN: chassis-based storage directors that can scale from eight ports to a few hundred ports. This provides non-blocking, full-fledged scalability in the SAN.
• I/O bandwidth: a Parallel File System (PFS) that stripes the data file across the multiple disks in the array through the I/O nodes to increase the total I/O throughput.
Typical Storage Architecture
[Figure labels: Cluster A and Cluster B, each on a System Area Network; Cluster A File System; Cluster B File System; Storage Area Network; Backup / Archive System; Tape System; NFS, CIFS Servers; NFS/CIFS Clients; Visualization; LAN; WAN to other sites]
Ideal Storage Architecture
[Figure labels: Cluster A and Cluster B on a System Area Network; File System Servers (CFS/PFS); GPFS; Storage Area Network; Gigabit; Backup Archive Server; Tape; NFS, CIFS Servers; NFS/CIFS Clients; Visualization; LAN; To other Sites]
Physical Storage Components Connectivity
[Figure labels: I/O Storage Nodes FS1 to FS8; M0 to M3 (I/O Spare, B/up Dev., Storage Mgr); 32 Port Switch - A; 32 Port Switch - B; Tape Library; two Storage Controllers, each with a Disk Subsystem and A/B paths]
Network Based Scalable High Performance Storage Architecture
[Figure labels: PARAM 20000 compute nodes C1, C2 ... C35; Cluster of File Servers running Cluster File System (FS1 to FS4); Miscellaneous Servers M0 to M3; Internet; Router; Storage Area Network; NAS Server 1 TB – 3 TB; Storage Array; Backup Library; 2 TB – 20 TB – 200 TB]
Legend:
M0 – Scheduler
M1 – Spare Server
M2 – Developmental User Nodes
M3 – Storage Mgmt. Server
M4 – Visualization Server
M5 – Gateway & Authentication Server
M6 – Backup Server
M7 – Spare Server
FS1 - FS8 – File Servers
C1 – C70 – Compute Nodes
Outline
• Introduction
• Overview of storage components
• Overview of Storage Models
• File Systems
• Parallel I/O
• Storage management Software
• Security
• Designing the Storage Architectures
• Discussions
  – Suggested technologies
  – Future
  – Other aspects
  – Conclusion
Recommended Technologies
• Disks: Min 72 GB, Dual Port FC-AL, 10000 RPM
• Protocol: SCSI
• Interface: FC-AL Interface
• Storage Connectivity: 2 Gb/s Multi-Path Fiber Switches
• Storage Array: Host Intelligence Based with Modular and Linear Scale-up Architecture
• File System Access: Direct, PFS and NFS V4 through Gigabit network
• File System: POSIX compliant (IEEE/ANSI 1003.x) Cluster File System with PFS
• Back Up: Fiber tape libraries with HSM
• Compute node: Access through NFS and PFS on Gigabit Ethernet
• Architecture: FAS based, combination of DAS, NAS & SAN
Futuristic C-DAC Enterprise File System by 2005
Suitable Architecture for GRID
[Figure labels: Special Purpose Computers; Visualization Workstation; R&D Project; SMP / NUMA Systems; PARAM 20000; DB Servers]