Operational tools information
Overview
This page provides information about the operational tools available in EGI.
Quick links
Tool | Link |
---|---|
Operations portal | https://operations-portal.egi.eu/ |
Service Availability Monitoring | SAM Instances |
GOCDB | https://goc.egi.eu/ |
GGUS | https://gus.fzk.de/pages/home.php |
Accounting portal | http://accounting.egi.eu/ |
Metrics portal | http://metrics.egi.eu/ |
Gstat | http://gstat.egi.eu/ |
GridView | http://gridview.cern.ch/GRIDVIEW/same_index.php |
Network monitoring | Network |
Deployment plans
Tools
The individual operational tools are described in the sections below. Currently each tool is hosted at a different address. In the future all tools will be integrated into a single Operations portal.
Operations portal
The Operations portal consists of web pages that provide information to various actors (NGI Operations Centres, VO managers, etc.), along with related facilities such as the VO registration tool, the broadcast and downtime system, the periodic operations report submission system, and the regional dashboard. The programme of work includes tool maintenance (bug fixing and enhancement of the failover configuration).
Main links:
Documentation
- Main Documentation Page for the CIC Portal
- Main Documentation Page for the Operations Portal
- Bug/task tracking system
- Installation of a Dashboard Regional Instance
Service Availability Monitoring
The Service Availability Monitoring (SAM) system is used to monitor the resources within the production infrastructure. SAM monitoring data is used for calculation of availability and reliability of grid sites. It includes the following components:
- probes: a test execution framework (based on the open source monitoring framework Nagios) and the Nagios Configuration Generator (NCG)
- the Aggregated Topology Provider (ATP), the Metrics Description Database (MDDB), and the Metrics Results Database (MRDB)
- the message bus to publish results and a programmatic interface
- the visualization portal (MyEGI).
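As an illustration of how monitoring results can feed availability and reliability figures, the sketch below uses one common formulation: availability is uptime over all known time, and reliability is the same ratio with scheduled downtime excluded. The `StatusHours` layout, the function names, and the sample numbers are assumptions made for this example; the A/R algorithms page listed below remains the authoritative definition.

```python
from dataclasses import dataclass


@dataclass
class StatusHours:
    """Hours a site spent in each state over a reporting period (hypothetical layout)."""
    up: float
    down: float
    scheduled_downtime: float
    unknown: float


def availability(h: StatusHours) -> float:
    """Fraction of known time in which the site was up (unknown time is excluded)."""
    known = h.up + h.down + h.scheduled_downtime
    return h.up / known if known > 0 else 0.0


def reliability(h: StatusHours) -> float:
    """Like availability, but scheduled downtime is not counted against the site."""
    accountable = h.up + h.down
    return h.up / accountable if accountable > 0 else 0.0


# Made-up sample values for a 720-hour month.
month = StatusHours(up=650.0, down=30.0, scheduled_downtime=20.0, unknown=20.0)
print(f"availability = {availability(month):.3f}")
print(f"reliability  = {reliability(month):.3f}")
```

With this formulation, reliability is never lower than availability, because announced maintenance is removed from the denominator.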
Main links:
- SAM Instances
- NEW! Grid probes from org.SAM package
Documentation
Installation instructions:
- Installation instructions (new Confluence page)
- Nagios and NCG YAIM-based installation instructions (old page with YAIM variable definitions)
- SAM/Nagios reference card for site managers
- SAM Administrators FAQ
- Setting NAGIOS to Monitor Uncertified Sites
Tests lists:
Tools information pages:
- A/R algorithms
- MyEGI documentation
- Multi Level Monitoring Overview
- NCG Component Overview
- Grid Monitoring Specific Ncg Recipes
- Validate ROC or NGI Nagios Procedures
- Deployed ROC and NGI Nagios
- Main EGEE OAT wiki
- MyEGEE Documentation
- Aggregated Topology Provider (ATP)
- JIRA SAM project tracking system
GOCDB
The Grid Configuration Database (GOCDB) contains general information about the sites participating in the production Grid. It is accessed by all the project actors (end users, site managers, NGI managers, support teams, VO managers), by other tools, and by third-party middleware in order to obtain the Grid topology. The portal has a single central installation, but a regional package will be developed and deployed at interested NGIs.
Main links:
Documentation
GGUS
The Global Grid User Support (GGUS) system is the primary means by which users request support when they are using the grid, and it is the main support access point for the EGI project. GGUS creates a trouble ticket to record each request and tracks the ticket from creation through to resolution. A user can submit a request in two ways: by email or through the web interface.
Main links:
Documentation
Accounting portal
The accounting infrastructure is a complex system that involves various sensors in different regions, all publishing data to a central repository. The data is processed, summarized and displayed in the accounting portal, which acts as a common interface to the different accounting record providers. The portal presents a homogeneous view of the gathered data and gives user-friendly access to resource utilization figures.
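The summarization step can be pictured with a small sketch. The record fields (`site`, `vo`, `cpu_hours`), the site names, and the `summarise` helper are hypothetical and chosen only for illustration; they do not reflect the actual accounting record schema.

```python
from collections import defaultdict

# Hypothetical usage records, as they might arrive from sensors in different regions.
records = [
    {"site": "SITE-A", "vo": "atlas", "cpu_hours": 120.0},
    {"site": "SITE-A", "vo": "atlas", "cpu_hours": 80.0},
    {"site": "SITE-A", "vo": "biomed", "cpu_hours": 15.0},
    {"site": "SITE-B", "vo": "atlas", "cpu_hours": 40.0},
]


def summarise(records):
    """Sum CPU hours per (site, VO) pair to give one homogeneous view."""
    totals = defaultdict(float)
    for record in records:
        totals[(record["site"], record["vo"])] += record["cpu_hours"]
    return dict(totals)


summary = summarise(records)
for (site, vo), hours in sorted(summary.items()):
    print(f"{site:8} {vo:8} {hours:8.1f}")
```

Grouping by a (site, VO) key is only one possible view; the same loop could group by region or by time period instead.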
Main links:
Documentation
Metrics portal
The Metrics Portal displays a set of metrics that are used to monitor the performance of the infrastructure and the project, and to track their changes over time. The portal automatically collects all the required data and calculates the metrics before displaying them. It aggregates information from different sources such as GOCDB, GGUS, GridView, etc., using connectors provided by the data providers. These connectors translate the information gathered from the diverse producers and store it in a local database.
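The connector idea can be sketched as follows: each connector translates one source's payload into a common row format, and the rows are stored in a local database. This is a minimal sketch, not the portal's implementation. The connector functions, payload shapes, metric names, and table layout are all hypothetical, and an in-memory SQLite database stands in for the local database.

```python
import sqlite3
from typing import Callable, Dict, Iterable, Tuple

Metric = Tuple[str, str, float]  # (source, metric name, value)


def ggus_connector(raw: dict) -> Iterable[Metric]:
    # Hypothetical payload: ticket counts by state.
    yield ("GGUS", "tickets_open", float(raw["open"]))
    yield ("GGUS", "tickets_solved", float(raw["solved"]))


def gocdb_connector(raw: dict) -> Iterable[Metric]:
    # Hypothetical payload: a list of site names.
    yield ("GOCDB", "sites", float(len(raw["sites"])))


CONNECTORS: Dict[str, Callable[[dict], Iterable[Metric]]] = {
    "GGUS": ggus_connector,
    "GOCDB": gocdb_connector,
}


def collect(db: sqlite3.Connection, payloads: Dict[str, dict]) -> None:
    """Translate each source payload with its connector and store the rows locally."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS metrics (source TEXT, name TEXT, value REAL)"
    )
    for source, raw in payloads.items():
        db.executemany("INSERT INTO metrics VALUES (?, ?, ?)", CONNECTORS[source](raw))
    db.commit()


db = sqlite3.connect(":memory:")
collect(db, {
    "GGUS": {"open": 12, "solved": 30},
    "GOCDB": {"sites": ["SITE-A", "SITE-B"]},
})
rows = db.execute(
    "SELECT source, name, value FROM metrics ORDER BY source, name"
).fetchall()
for row in rows:
    print(row)
```

Keeping one connector per source means a new data provider can be added by registering another function, without touching the storage code.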
Main links:
Documentation
Network monitoring
EGI.eu coordinates a lightweight end-to-end network performance monitoring infrastructure and provides configuration support for it. These tools are used to troubleshoot network connectivity issues, such as end-to-end network performance problems affecting Grid data transfers.
Main links:
Documentation
External tools
Gstat
The main aim of GStat is to display information about grid services, the grid information system itself, and related metrics. GStat provides a way to visualize a grid infrastructure from an operational perspective, based on information found in the grid information system (BDII).
Main links:
Documentation
GridView
GridView is a monitoring and visualization tool being developed to provide a high-level view of various functional aspects of the Worldwide LHC Computing Grid (WLCG). Currently it shows statistics on data transfers, FTS file transfers, running jobs, and service availability for the WLCG.
Main links: