- Number of slides: 20
Grid Infrastructure Monitoring System Based on Nagios
E. Imamagic, D. Dobrenic, SRCE
HPDC 2007, Workshop on Grid Monitoring
Overview
- Motivation
- Nagios framework
- Nagios-based grid monitoring
  - Architecture
  - Grid extensions
  - Statistics
  - Demo
- Contributions to WLCG Grid Service Monitoring WG
- Future work
- Conclusions
Motivation
- Provide site admin-centric monitoring
  - simplify grid resource operations
- Enable better resource availability
  - issue notifications as soon as a problem appears
- Support complex sensor dependencies
  - enables problem isolation
  - only relevant notifications are issued
- Visualization & management interface
  - grid resources status
- Report generation
  - availability, problem history
Nagios Framework
- Open source monitoring framework
  - widely used & actively developed
- Host and service problem detection and recovery
- Provides a wide set of basic sensors
  - easy to develop custom sensors
- Centralized vs. distributed deployment
- High configurability
  - service dependencies, fine-grained notification options
- Web interface
  - status view, administration
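As an illustration of why custom sensors are easy to develop, here is a minimal sketch of the core of a Nagios plugin. Nagios plugins report state through the process exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus one line of text on standard output. The sensor name `QUEUE`, the thresholds, and the helpers `evaluate` and `format_output` are hypothetical and do not come from the slides.

```python
# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measurement to a Nagios state; higher values are worse."""
    if value is None:
        # No measurement could be taken, so the state is unknown.
        return UNKNOWN
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_output(name, state, value):
    """Build the single status line that Nagios displays for the service."""
    return f"{name} {LABELS[state]} - value={value}"


# Hypothetical measurement: 25 queued jobs, warn at 10, critical at 20.
state = evaluate(25, warn=10, crit=20)
print(format_output("QUEUE", state, 25))
```

A real plugin would end by passing `state` to `sys.exit()` so that Nagios can read it as the exit code.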
Nagios-based Grid Monitoring
- Monitoring CRO-GRID Infrastructure (2004-2006)
  - Globus Toolkit Pre-WS & WS, UNICORE, other services
  - active recovery of services
  - still in production within CRO NGI
- Monitoring EGEE resources in Central Europe (CE)
  - core services since mid 2006
  - all CE sites for 1st line support since September 2006
  - centralized deployment - single server @ SRCE
  - http://nagios.ce-egee.org
Architecture
Grid Extensions
- Grid sensors
  - Security facilities & services
    - CA distribution, Certificate lifetime, MyProxy, VOMS Admin
  - Monitoring & information services
    - R-GMA, BDII, MDS, GridICE
  - Job management services
    - Globus Gatekeeper, RB, WMS, WMProxy, Job matching
  - File management services
    - GridFTP, SRM, DPNS, LFC
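To make one of the listed sensors concrete, the following is a rough sketch of a certificate lifetime check. It is not the authors' sensor: the thresholds (30 and 7 days) and the function names are assumptions, and the expiry date is taken as a string in the form used by Python's `ssl` module rather than read from a real certificate.

```python
import ssl
import time

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios plugin states
SECONDS_PER_DAY = 86400


def days_left(not_after, now=None):
    """Whole days until expiry; not_after looks like 'Jan  5 09:34:43 2018 GMT'."""
    expiry = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    return int((expiry - now) // SECONDS_PER_DAY)


def check_lifetime(not_after, warn_days=30, crit_days=7, now=None):
    """Return (state, remaining_days) for a certificate expiry date.

    warn_days and crit_days are hypothetical defaults, not values from the slides.
    """
    remaining = days_left(not_after, now)
    if remaining < crit_days:
        state = CRITICAL  # also covers already expired certificates
    elif remaining < warn_days:
        state = WARNING
    else:
        state = OK
    return state, remaining
```

Passing `now` explicitly keeps the check deterministic, which makes the threshold logic easy to test without waiting for a certificate to age.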
Grid Extensions
- Sensor hierarchy
- Automatic recovery
  - both local and remote services
  - security handled with sudo
- Certificate based authentication for the web interface
- NCG, SAM gatherer, Credential mgmt.
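The idea behind a sensor hierarchy can be sketched as follows: when a sensor fails, a notification is only worth sending if none of the sensors it depends on is failing too. This is a simplified illustration, not the logic Nagios itself uses for service dependencies; the sensor names and the `root_causes` helper are hypothetical.

```python
def root_causes(states, depends_on):
    """Return failing sensors whose own dependencies are all healthy.

    states: sensor name -> True if OK, False if failing
    depends_on: sensor name -> list of sensors it depends on
    """
    causes = []
    for sensor, ok in states.items():
        if ok:
            continue
        parents = depends_on.get(sensor, [])
        # Suppress the notification when any parent is already failing:
        # the parent is the more likely cause of the problem.
        if all(states.get(parent, True) for parent in parents):
            causes.append(sensor)
    return sorted(causes)


# Hypothetical hierarchy: SRM depends on GridFTP, which depends on the host ping.
depends_on = {"gridftp": ["host-ping"], "srm": ["gridftp"]}
states = {"host-ping": True, "gridftp": False, "srm": False}
print(root_causes(states, depends_on))
```

In this example only `gridftp` is reported; the `srm` failure is treated as a consequence of it.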
Statistics
- EGEE implementation statistics
  - 69 hosts
  - 570 services actively monitored
  - 1029 services' results imported from SAM
- Nagios server statistics (last month)
Demo
- EGEE implementation web interface
Contributions to WLCG Grid Service Monitoring WG
- All sensors rewritten to be compliant with Probe specification
- Developed interface to Nagios data compliant with Data exchange format
- Nagios-based prototype
  - several grid extensions used (NCG, credential management, SAM gatherer)
Future Work
- Utilizing our extensions on site level
- Distributing monitoring deployment
  - hierarchy of Nagios servers
- Migration of credential management to robot certificates
- Further sensor development
- Service check execution optimization
  - active vs. passive checks
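For context on the active vs. passive distinction: an active check is run by Nagios itself, while a passive check result is produced elsewhere and handed to Nagios, typically as a `PROCESS_SERVICE_CHECK_RESULT` line written to its external command file. The sketch below only formats such a line. The host and service names are made up, and the slides do not say whether the authors' tools submit results this way.

```python
import time


def passive_result(host, service, return_code, output, timestamp=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    ts = int(time.time()) if timestamp is None else int(timestamp)
    # A newline ends the command, so multi-line output is joined into one line.
    text = " ".join(output.splitlines())
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{return_code};{text}\n"


# Hypothetical result for an SRM service on a made-up host.
line = passive_result("se.example.org", "SRM", 2, "connection timed out",
                      timestamp=1180000000)
print(line, end="")
```

Writing the returned line to the Nagios command file is left out here, since the file path depends on the installation.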
Conclusions
- Nagios
  - highly configurable monitoring framework with notifications, service dependencies, …
  - simple, programming language-agnostic sensor API
- Grid extensions
  - integration with existing infrastructure (user certificates, VOMS, GOCDB, SAM)
  - sensors for key grid services
- Nagios @ grid
  - enables sites' better availability
  - admins get only relevant notifications
Thank You! Questions?