The RHIC Computing Facility at BNL
HEPIX-HEPNT, Vancouver, BC, Canada
October 20, 2003
Ofer Rind
RHIC Computing Facility, Brookhaven National Laboratory
RCF - Overview
• Brookhaven National Lab is a multi-disciplinary DOE research laboratory
• RCF formed in the mid-90's to provide computing infrastructure for the RHIC experiments
• Named US Atlas Tier 1 computing center in the late 90's
• Currently supports both HENP and HEP scientific computing efforts as well as various general services (backup, email, web hosting, off-site data transfer…)
• 25 FTEs (expanding soon)
• RHIC Run-3 completed in the spring; Run-4 slated to begin in Dec/Jan
RCF Structure
Mass Storage
• 4 StorageTek tape silos managed by HPSS (v4.5)
• Upgraded to 37 9940B drives (200 GB/cartridge) prior to Run-3 (~2 months to migrate the data)
• Total data store of 836 TB (~4500 TB capacity)
• Aggregate bandwidth up to 700 MB/s – expect 300 MB/s in the next run
• 9 data movers with 9 TB of disk (Future: array to be fully replaced after the next run with faster disk)
• Access via pftp and HSI, both integrated with K5 authentication (Future: authentication through Globus certificates)
Mass Storage
Centralized Disk Storage
• Large SAN served via NFS
  o Processed data store + user home directories and scratch
  o 16 Brocade switches and 150 TB of Fibre Channel RAID 5 managed by Veritas (MTI & Zzyzx peripherals)
  o 25 Sun servers (E450 & V480) running Solaris 8 (load issues with nfsd and mountd precluded an update to Solaris 9)
  o Can deliver data to the farm at up to 55 MB/sec/server
• RHIC and USAtlas AFS cells
  o Software repository + user home directories
  o Total of 11 AIX servers, 1.2 TB (RHIC) & 0.5 TB (Atlas)
  o Transarc on the server side, OpenAFS on the client side
  o RHIC cell recently renamed (standardized)
Centralized Disk Storage
[Photos: E450 servers, MTI and Zzyzx disk arrays]
The Linux Farm
• 1097 dual Intel CPU VA and IBM rackmounted servers – total of 918 kSpecInt2000
• Nodes allocated by experiment and further divided for reconstruction & analysis
• Typically 1 GB memory + 1.5 GB swap
• Combination of local SCSI & IDE disk with aggregate storage of >120 TB available to users
  o Experiments starting to make significant use of local disk through custom job schedulers, data repository managers and rootd
The Linux Farm
The Linux Farm
• Most RHIC nodes recently upgraded to the latest RH8 rev. (Atlas still at RH7.3)
• Installation of a customized image via Kickstart server
• Support for networked file systems (NFS, AFS) as well as distributed local data storage
• Support for open-source and commercial compilers (gcc, PGI, Intel) and debuggers (gdb, TotalView, Intel)
Linux Farm - Batch Management
• Central Reconstruction Farm
  o Up to now, data reconstruction was managed by a locally produced Perl-based batch system
  o Over the past year, this has been completely rewritten as a Python-based custom frontend to Condor
  o Leverages DAGMan functionality to manage job dependencies
    § User defines a task using a JDL identical to the former system's, then the Python DAG-builder creates the job and submits it to the Condor pool (see the sketch below)
    § Tk GUI provided to users to manage their own jobs
    § Job progress and file transfer status monitored via a Python interface to a MySQL backend
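For illustration only, here is a minimal Python sketch of the DAGMan pattern described above: write one Condor submit description per step, chain the steps in a .dag file, and hand it to condor_submit_dag. The RCF DAG-builder, its JDL format, and the executable and file names below are not from the slides; they are assumptions made for the example.

```python
#!/usr/bin/env python
# Minimal sketch of a DAGMan-style job builder (illustrative only; the RCF
# DAG-builder, its JDL, and the executables/paths below are hypothetical).
import subprocess

def write_submit_file(name, executable, args):
    """Write a bare-bones Condor submit description for one job step."""
    fname = "%s.sub" % name
    with open(fname, "w") as f:
        f.write("universe   = vanilla\n")
        f.write("executable = %s\n" % executable)
        f.write("arguments  = %s\n" % " ".join(args))
        f.write("output     = %s.out\n" % name)
        f.write("error      = %s.err\n" % name)
        f.write("log        = %s.log\n" % name)
        f.write("queue\n")
    return fname

def write_dag(task_name, steps):
    """Chain the steps so each depends on the previous one (stage-in ->
    reconstruct -> stage-out), the way DAGMan expresses dependencies."""
    dag_file = "%s.dag" % task_name
    with open(dag_file, "w") as f:
        for name, exe, args in steps:
            f.write("JOB %s %s\n" % (name, write_submit_file(name, exe, args)))
        for (parent, _, _), (child, _, _) in zip(steps, steps[1:]):
            f.write("PARENT %s CHILD %s\n" % (parent, child))
    return dag_file

if __name__ == "__main__":
    steps = [
        ("stage_in",    "/usr/local/bin/hpss_fetch", ["run1234.raw"]),  # hypothetical
        ("reconstruct", "/usr/local/bin/reco",       ["run1234.raw"]),  # hypothetical
        ("stage_out",   "/usr/local/bin/hpss_store", ["run1234.dst"]),  # hypothetical
    ]
    dag = write_dag("run1234", steps)
    # condor_submit_dag is the standard DAGMan entry point.
    subprocess.call(["condor_submit_dag", dag])
```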
Linux Farm - Batch Management
• Central Reconstruction Farm (cont.)
  o New system solves the scalability problems of the former system
  o Currently deployed for one experiment, with the others expected to follow prior to Run-4
Linux Farm - Batch Management
• Central Analysis Farm
  o LSF 5.1 licensed on virtually all nodes, allowing use of CRS nodes in between data reconstruction runs
    § One master for all RHIC queues, one for Atlas
    § Allows efficient use of limited hardware, including moderation of NFS server loads through (voluntary) shared resources
    § Peak dispatch rates of up to 350K jobs/week and 6K+ jobs/hour
  o Condor is being deployed and tested as a possible complement or replacement – still nascent, awaiting some features expected in an upcoming release
  o Both accepting jobs through Globus gatekeepers (see the sketch below)
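As a hedged illustration of the two submission paths mentioned above, the sketch below wraps a local LSF bsub submission and a remote submission through a Globus (GT2-era) gatekeeper running an LSF job manager. The queue name, gatekeeper host, and analysis script are hypothetical, not details of the RCF configuration.

```python
#!/usr/bin/env python
# Illustrative sketch: two ways a job might reach the analysis farm,
# locally via LSF's bsub and remotely through a Globus gatekeeper.
# Host, queue, and script names are hypothetical.
import subprocess

def submit_local(queue, job_name, command):
    """Submit directly to an LSF queue from inside the facility."""
    return subprocess.call(
        ["bsub", "-q", queue, "-J", job_name, "-o", job_name + ".out", command])

def submit_via_gatekeeper(gatekeeper, command):
    """Submit through the gatekeeper's LSF job manager from off-site."""
    return subprocess.call(
        ["globus-job-run", gatekeeper + "/jobmanager-lsf", command])

if __name__ == "__main__":
    submit_local("rhic_analysis", "ana001", "/home/someuser/run_analysis.sh")
    submit_via_gatekeeper("gatekeeper.example.org", "/bin/hostname")
```

In a standard Globus setup, the jobmanager-lsf contact string routes the grid request into LSF, so remote jobs land in the same batch system as local bsub submissions.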
Security & Authentication Two layers of firewall with limited network services and limited interactive access exclusively through secured gateways • Conversion to Kerberos 5 -based single sign-on paradigm • Simplify life by consolidating password databases (NIS/Unix, SMB, email, AFS, Web). SSH gateway authentication password-less access inside facility with automatic AFS token acquisition o RCF Status: AFS/K 5 fully integrated, Dual K 5/NIS authentication with NIS to be eliminated soon o USAtlas Status: “K 4”/K 5 parallel authentication paths for AFS with full K 5 integration on Nov. 1, NIS passwords already gone o Ongoing work to integrate K 5/AFS with LSF, solve credential forwarding issues with multihomed hosts, and implement a Kerberos certificate authority o Ofer Rind - RHIC Computing Facility Site Report
US Atlas Grid Testbed
[Diagram: grid job requests arrive over the Internet via GridFtp through a Globus client, gatekeeper and job manager, over a 70 MB/s link; other components include the giis01 information server, the LSF (Condor) pool, the amds mover, Globus RLS, the aftpexp00 server, HPSS, the AFS server aafs, and atlas02 with 17 TB of disk]
Local Grid development currently focused on monitoring and user management
Monitoring & Control • Facility monitored by a cornucopia of vendor-provided, open-source and home-grown software. . . recently, Ganglia was deployed on the entire farm, as well as the disk servers o Python-based “Farm Alert” scripts were changed from SSH push (slow), to multithreaded SSH pull (still too slow), to TCP/IP push, which finally solved the scalability issues o • Cluster management software is a requirement for linux farm purchases (VACM, x. CAT) o Console access, power up/down…really came in useful this summer! Ofer Rind - RHIC Computing Facility Site Report
The Great Blackout of '03
Future Plans & Initiatives
• Linux farm expansion this winter: addition of >100 2U servers packed with local disk
• Plans to move beyond the NFS-served SAN with more scalable solutions:
  o Panasas - file system striping at block level over distributed clients
  o dCache - potential for managing a distributed disk repository
• Continuing development of grid services, with increasing implementation by the two large RHIC experiments
• Very successful RHIC run with a large high-quality dataset!