ARGO: Availability and reliability monitoring for e-Infrastructures
C. Kanellopoulos, GRNET
K. Kagkelidis, GRNET/AUTH
“You cannot manage what you do not monitor”
What to monitor: Usage, Responsiveness, Availability, Reliability, Profit, Performance, Cost, Status
Dynamic e-Infrastructures
• In the last decade our notion of the e-Infrastructure has changed dramatically
  • Software Defined Networks
  • Compute and data infrastructure as software components
• The boundaries and shape of the e-Infrastructure change dynamically to match the needs of the individual user
• Monitoring has to evolve in order to catch up with the dynamic nature of the systems
A model for monitoring large scale distributed infrastructures
[Diagram: Agents and External Sources connected to the Messaging Infrastructure]
Agents
[Diagram: Agents and External Sources connected to the Messaging Infrastructure]
- Agents: collect data and publish them to the Messaging Infrastructure
- Can be monitoring agents, accounting services etc.
Monitoring agents
[Diagram: per-VO Nagios instances (IT, ES, UK, FR) publishing to the Messaging Infrastructure]
• Nagios based monitoring
• Auto-configuration of monitored services
• Nagios schedules and executes probes against the monitored service end points
• Metric results are retrieved by each VO Nagios instance, which in turn publishes them to the Message Broker Network (MBN)
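As an illustration of what a published metric result might look like, the sketch below serialises one probe outcome as a JSON message body. The slides do not give the message schema, so every field name and value here (`vo`, `service_endpoint`, `metric`, `status`, `timestamp`) is a hypothetical placeholder, not the actual format used by the Nagios instances.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class MetricResult:
    # All field names are hypothetical, chosen for illustration only.
    vo: str
    service_endpoint: str
    metric: str
    status: str
    timestamp: str


def to_message(result: MetricResult) -> str:
    """Serialise a probe result as a JSON message body for the broker."""
    return json.dumps(asdict(result), sort_keys=True)


result = MetricResult(
    vo="example-vo",
    service_endpoint="ce01.example.org",
    metric="example.probe.check",
    status="OK",
    timestamp="2015-01-01T00:00:00Z",
)
message = to_message(result)
print(message)
```

A self-describing body like this lets any subscriber decode the result without knowing which agent produced it.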
Message Broker Network
[Diagram: Agents and External Sources connected to the Messaging Infrastructure]
- The distributed message broker network is the neural network of the system (publish-subscribe)
- Supports a large number of concurrent publishers and subscribers
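The publish-subscribe pattern named above can be sketched with a toy in-process broker. This is only an illustration of the pattern: `InMemoryBroker`, the topic name `metrics` and the message fields are assumptions made for the example, and a real deployment would use a networked broker rather than Python callbacks.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy stand-in for a message broker: topics map to subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a callback that receives every message sent to the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to every subscriber of the topic; return the count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = InMemoryBroker()
ar_inbox: List[dict] = []          # e.g. an A/R monitoring subscriber
accounting_inbox: List[dict] = []  # e.g. an accounting subscriber

broker.subscribe("metrics", ar_inbox.append)
broker.subscribe("metrics", accounting_inbox.append)

# The publisher does not know who is listening; both subscribers get a copy.
delivered = broker.publish("metrics", {"service": "ce01.example.org", "status": "OK"})
print(delivered, ar_inbox, accounting_inbox)
```

The point of the pattern is the decoupling: new subscribers can be added without changing any publisher.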
Messaging in ARGO
Message Broker Network
- ActiveMQ Message Brokers in Greece, Croatia and Switzerland (now only in Greece and Croatia)
- High Availability configuration. Auto-discovery through the Information System
External sources
[Diagram: Agents and External Sources connected to the Messaging Infrastructure]
- External sources: can be used as a source of information in order to provide e-Infrastructure specific details
- Can be Configuration Databases, Management Systems etc.
Subscribers
[Diagram: Agents and External Sources connected to the Messaging Infrastructure]
- Any number of concurrent subscribers.
- E.g. SLA (A/R) Monitoring, Accounting, Billing, Operations portals etc.
Abstract subscriber
[Diagram: Sources, Message Broker Network, Sync Components, Stream Consumers, Data Store, Web UI]
The ARGO Framework
[Diagram: View Layer (Web Service + Distributed Data Store), Batch Layer (Hadoop Cluster), Prefiltering, per-VO A/R Compute Engines, Sync Components, Stream Consumers, Message Broker Network, Sources, VOs (IT, ES, UK, FR)]
ARGO A/R
A/R Compute Engine
• A/R consumer:
  - tap and listen to the Message Broker Network
  - gather relevant messages
  - able to manage and keep relevant fields on each message
  - deliver data to the A/R Compute Engine
• Sync components:
  - topology information
  - established grouping of services
  - various weight information regarding grouping
  - downtime information
  - computation profiles
  - ...also deliver data to the A/R Compute Engine
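The consumer behaviour described above (gather the relevant messages, keep only the relevant fields) can be sketched as a small filter. The selection rule (match on a `vo` field) and the list of retained fields are assumptions for this example; the slides do not specify either.

```python
from typing import Iterable, Iterator

# Hypothetical field names; the real message schema is not given in the slides.
RELEVANT_FIELDS = ("service_endpoint", "metric", "status", "timestamp")


def consume(messages: Iterable[dict], vo: str) -> Iterator[dict]:
    """Yield trimmed copies of the messages that belong to the given VO."""
    for msg in messages:
        if msg.get("vo") != vo:
            continue  # not relevant for this consumer
        # Keep only the fields the compute engine needs.
        yield {key: msg[key] for key in RELEVANT_FIELDS if key in msg}


raw = [
    {"vo": "vo-a", "service_endpoint": "ce01.example.org", "metric": "check",
     "status": "OK", "timestamp": "2015-01-01T00:00:00Z", "debug": "ignored"},
    {"vo": "vo-b", "service_endpoint": "se01.example.org", "metric": "check",
     "status": "CRITICAL", "timestamp": "2015-01-01T00:05:00Z"},
]

kept = list(consume(raw, "vo-a"))
print(kept)
```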
A/R Compute Engine
• A/R Compute Engine:
  - Picks up the data delivered (both metrics and supplementary sync files)
  - Stages the data (prefiltering process)
  - Executes computations, sending them as jobs to a Hadoop Cluster, or locally if desired
  - Results are stored to MongoDB for delivery through the REST API
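The slides do not state the formulas behind the A/R numbers. A minimal sketch, assuming one commonly used convention: availability is uptime divided by the time in which the status is known, and reliability additionally excludes scheduled downtime from the denominator. The `Period` type and its field names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Period:
    # Hypothetical container for one reporting period, all values in hours.
    total_hours: float
    up_hours: float
    unknown_hours: float
    scheduled_downtime_hours: float


def availability(p: Period) -> Optional[float]:
    """Uptime over known time (assumed convention); None if nothing is known."""
    known = p.total_hours - p.unknown_hours
    return p.up_hours / known if known > 0 else None


def reliability(p: Period) -> Optional[float]:
    """Uptime over known time minus scheduled downtime (assumed convention)."""
    expected = p.total_hours - p.unknown_hours - p.scheduled_downtime_hours
    return p.up_hours / expected if expected > 0 else None


# One day: 18 h up, 4 h scheduled downtime, 2 h unscheduled outage, nothing unknown.
day = Period(total_hours=24, up_hours=18, unknown_hours=0, scheduled_downtime_hours=4)
print(availability(day), reliability(day))
```

Under this convention, announced maintenance lowers availability but not reliability, which is why the two figures are reported separately.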
A/R Compute Engine
• REST API:
  - Supports a number of requests to view the result data
  - Retrieves data from MongoDB
  - Authentication
  - Supports recomputation requests
• WebUI:
  - Provides a UI dashboard with various views on the A/R report results
  - Export of the results to various file formats
  - Transforms user interaction into specific API requests
Next steps
• Messaging Infrastructure
  - Introduce a REST API for publishing and consuming messages
  - Provide fine-grained ACL support
  - Messaging Infrastructure as a Service for end users
• ARGO
  - Standalone version of ARGO
  - Support for real-time monitoring at EGI scale
  - Improve the visualisation support
  - Custom availability profiles for Cloud and Data Infrastructures
  - Support federated access through eduGAIN
• More information at: https://github.com/argoeu
Thank you