WLCG NAGIOS
Kashif Mohammad
Deputy Technical Co-ordinator (South Grid)
University of Oxford
02/07/09
WLCG NAGIOS
• WLCG Nagios is part of EGEE SA1 Multi-Level Monitoring (MLM), which provides an integrated project-level monitoring system for EGEE III. https://twiki.cern.ch/twiki/bin/view/EGEE/MultiLevelMonitoringOverview
• It is based on the EGEE III Operations Automation Strategy and is designed to suit a future federated infrastructure such as EGI.org.
• WLCG Nagios at ROC level is supposed to replace central monitoring such as SAM in the post-EGEE era.
WLCG NAGIOS
WLCG Nagios is built from many components, including:
• Nagios Configuration Generator (NCG): a configuration tool that creates the configuration files for Nagios by querying GOCDB, the site BDII and the Metric Description Database.
• Metric Description Database: a project-level database that describes the tests which should be run against grid services at EGEE sites.
• MSG-Nagios bridge: listens on the messaging system for messages destined for this Nagios instance and pushes them to Nagios.
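To make the role of NCG concrete, here is a minimal sketch of the idea: combine a list of site services with a list of tests per service type and emit Nagios service objects. It is not NCG's actual code. The host names, metric names, `check_command` names and the `generic-service` template are all hypothetical, and the in-memory dictionaries stand in for real queries to GOCDB, the site BDII and the Metric Description Database.

```python
# Hypothetical stand-in for what would be learned from GOCDB / site BDII:
# which service types run on which hosts.
SITE_SERVICES = [
    {"host": "ce01.example.org", "type": "CE"},
    {"host": "se01.example.org", "type": "SRM"},
]

# Hypothetical stand-in for the Metric Description Database:
# which tests should be run against each service type.
METRICS_BY_TYPE = {
    "CE": ["example.CE-JobSubmit", "example.CE-CertLifetime"],
    "SRM": ["example.SRM-Put"],
}


def render_service(host, metric):
    """Render one Nagios 'define service' object as text."""
    # Hypothetical naming scheme for the check command.
    command = "check_" + metric.replace(".", "_").replace("-", "_")
    return (
        "define service {\n"
        f"    host_name            {host}\n"
        f"    service_description  {metric}\n"
        f"    check_command        {command}\n"
        "    use                  generic-service\n"
        "}\n"
    )


def generate_config(services, metrics_by_type):
    """Emit one service object per (host, metric) pair."""
    blocks = []
    for svc in services:
        # Service types with no known metrics produce no objects.
        for metric in metrics_by_type.get(svc["type"], []):
            blocks.append(render_service(svc["host"], metric))
    return "\n".join(blocks)


if __name__ == "__main__":
    print(generate_config(SITE_SERVICES, METRICS_BY_TYPE))
```

Keeping the inventory and the test list as separate inputs mirrors the split described above: the services at a site can change without touching the test descriptions, and vice versa.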
WLCG NAGIOS
WLCG Nagios uses two types of probes at the regional level.
• Remote probes: probes executed against a site by external agents. WLCG Nagios uses two such agents: SAM grid monitoring probes and ENOC network monitoring probes. In Nagios terms, these are passive service checks.
• Local probes: tests that the site monitoring service schedules itself. Most are replicas of SAM tests, written as Nagios probes and submitted through a User Interface using a grid proxy. In Nagios terms, these are active service checks.
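The difference between the two check types can be sketched in a few lines. For a passive check, something outside Nagios hands over a finished result. The sketch below formats it as a standard Nagios `PROCESS_SERVICE_CHECK_RESULT` external command. For an active check, a probe is run and its exit code is mapped to a state using the standard plugin return codes. How the WLCG components actually deliver results to Nagios is not described here, so treat the delivery path as an assumption. The host, service and probe names are hypothetical.

```python
import subprocess
import sys
import time

# Standard Nagios plugin return codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def passive_result_command(host, service, return_code, output, timestamp=None):
    """Format a passive result as a Nagios external command line.

    A line like this would be written to the Nagios external command
    file by whatever process receives the remote result.
    """
    if return_code not in STATES:
        raise ValueError(f"invalid return code: {return_code}")
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{return_code};{output}"


def run_active_check(argv):
    """Run a probe as a subprocess and interpret its exit code."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    # Exit codes outside 0-3 are treated as UNKNOWN.
    state = STATES.get(proc.returncode, "UNKNOWN")
    return state, proc.stdout.strip()


if __name__ == "__main__":
    # Passive: a remote agent already ran the test and reports the outcome.
    print(passive_result_command("se01.example.org", "example.SRM-Put", 2, "timeout"))

    # Active: a stand-in probe (a one-line Python script) is executed locally.
    fake_probe = [sys.executable, "-c", "import sys; print('all good'); sys.exit(0)"]
    print(run_active_check(fake_probe))
```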
UKI WLCG NAGIOS SETUP AT OXFORD
[Diagram labels: ENOC Server; SAM Server; LCG Grid; Local Tests; User Interface; Nagios Server; MyProxy Server; Upload Proxy; Personal Computer]
UKI WLCG NAGIOS SETUP AT OXFORD
• We have installed a WLCG Nagios instance at Oxford for UKI: www.gridppnagios.physics.ox.ac.uk/nagios
• Access is restricted to members of the dteam and ops VOs.
• Access can be granted to non-VO members who hold a grid certificate.
• A brief introduction is available at http://www.gridpp.ac.uk/wiki/UKI_Regional_Nagios (I still have to expand it!).
[Figure labels: SAM; NPM; Local]
UKI WLCG NAGIOS SETUP AT OXFORD
• You can subscribe to alarm notifications by sending me an email.
• Local tests run more frequently than SAM tests, so they can sometimes be useful. Are they?
• Which alarms are useful?
• Alarm notifications can be fine-tuned, but I need feedback.