The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can contact the EGI SDIS team at operations @ egi.eu.

Difference between revisions of "Operational tools information"

== Overview ==
 
This page provides information about the operational tools available in EGI.
 
== Quick links ==
 
{| border="1" cellspacing="0" cellpadding="5" align="center"
!Tool
!Link
|-
| Operations portal
| https://operations-portal.egi.eu/
[http://cic.gridops.org/index.php?section=home The old CIC portal]
|-
| Service Availability Monitoring
| [[SAM Instances]]
|-
| GOCDB
| https://goc.egi.eu/
|-
| GGUS
| https://gus.fzk.de/pages/home.php
|-
| Accounting portal
| http://accounting.egi.eu/
|-
| Metrics portal
| http://metrics.egi.eu/
|-
| Gstat
| http://gstat.egi.eu/
|-
| GridView
| http://gridview.cern.ch/GRIDVIEW/same_index.php
|-
| Network monitoring
| [[Network]]
|}
 
== Tools ==
 
The individual operational tools are described in the sections below. Currently, each tool is hosted at a different address. In the future, all tools will be integrated into a single Operations portal.
 
=== Operations portal ===
 
The operations portal consists of web pages that provide information to various actors (NGI Operations Centres, VO managers, etc.), along with related facilities such as the VO registration tool, the broadcast and downtime system, the periodic operations report submission system and the regional dashboard. The programme of work includes tool maintenance (bug fixing and enhancements for the failover configuration).
 
'''Main links:'''
* [https://operations-portal.egi.eu/ Operations portal]
* [http://cic.gridops.org/index.php?section=home The old CIC portal]
 
==== Documentation ====
* [https://cic.gridops.org/index.php?section=roc&page=generaldoc Main Documentation Page for the CIC Portal]
* [https://forge.in2p3.fr/wiki/opsportaluser/ Main Documentation Page for the Operations Portal]
* [https://forge.in2p3.fr/projects/show/opsportaluser Bug/task tracking system]
* [https://cvs.in2p3.fr/operations-portal/package/installation-guide.pdf?revision=HEAD Installation of a Dashboard Regional Instance]
 
=== Service Availability Monitoring ===
 
The Service Availability Monitoring (SAM) system is used to monitor the resources within the production infrastructure. SAM monitoring data is used to calculate the availability and reliability of grid sites.
It includes the following components:
* probes: a test execution framework (based on the open source monitoring framework Nagios) and the Nagios Configuration Generator (NCG)
* the Aggregated Topology Provider (ATP), the Metrics Description Database (MDDB), and the Metrics Results Database (MRDB)
* the message bus to publish results and a programmatic interface
* the visualization portal (MyEGI).
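
As an illustration of how monitoring results can feed an availability and reliability (A/R) calculation, here is a minimal Python sketch. It assumes the commonly used convention in which availability is the uptime divided by the known time, and reliability additionally excludes scheduled downtime. The <code>StatusHours</code> structure and the sample figures are hypothetical; the authoritative definitions are those in the A/R algorithms document linked in the Documentation section.

```python
from dataclasses import dataclass


@dataclass
class StatusHours:
    """Hours a site spent in each state during one reporting period.

    Hypothetical structure for illustration; it is not a SAM data format.
    """
    up: float
    down: float
    scheduled_downtime: float
    unknown: float

    @property
    def total(self) -> float:
        return self.up + self.down + self.scheduled_downtime + self.unknown


def availability(h: StatusHours) -> float:
    """Uptime as a fraction of the time for which the status is known."""
    known = h.total - h.unknown
    return h.up / known if known > 0 else 0.0


def reliability(h: StatusHours) -> float:
    """Like availability, but scheduled downtime is excluded as well."""
    known = h.total - h.unknown - h.scheduled_downtime
    return h.up / known if known > 0 else 0.0


# Made-up figures for a 720-hour month.
month = StatusHours(up=600.0, down=60.0, scheduled_downtime=40.0, unknown=20.0)
print(f"availability = {availability(month):.3f}")  # 600 / 700
print(f"reliability  = {reliability(month):.3f}")   # 600 / 660
```

Under this convention a site is not penalised in its reliability figure for downtime that was announced in advance, which is why reliability is never lower than availability.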
 
'''Main links:'''
* [[SAM Instances]]
 
==== Documentation ====
 
'''Installation instructions:'''
* [https://tomtools.cern.ch/confluence/display/SAM/Clean+egee-NAGIOS+installation Installation instructions (new Confluence page)]
* [https://twiki.cern.ch/twiki/bin/view/EGEE/GridMonitoringNcgYaim NAGIOS and NCG YAIM-based installation instructions (old page with YAIM variable definitions)]
* [https://tomtools.cern.ch/confluence/display/SAMDOC/Service+reference+card+-+egee-NAGIOS SAM/NAGIOS Reference Card for site managers]
* [https://tomtools.cern.ch/confluence/display/SAMDOC/SAM+Administrators+FAQ SAM Administrators FAQ]
* [https://tomtools.cern.ch/confluence/display/SAM/Setting+Nagios+to+monitor+uncertified+sites Setting NAGIOS to Monitor Uncertified Sites]
 
'''Tests lists:'''
* [https://wiki.egi.eu/wiki/Operations:Operations_tests List of operations tests]
* [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of tests used for A/R calculations]
 
'''Tools information pages:'''
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf A/R algorithms]
* [https://tomtools.cern.ch/confluence/display/SAM/MyEGI/ MyEGI documentation]
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MultiLevelMonitoringOverview Multi Level Monitoring Overview]
* [https://twiki.cern.ch/twiki/bin/view/LCG/SAMProbesMetrics SAM Probes and Metrics]
* [https://twiki.cern.ch/twiki/bin/view/EGEE/GridMonitoringNcgOverview NCG Component Overview]
* [https://twiki.cern.ch/twiki/bin/view/EGEE/GridMonitoringNcgRecipes Grid Monitoring Specific Ncg Recipes]
* [https://twiki.cern.ch/twiki/bin/view/EGEE/ValidateROCNagios Validate ROC or NGI Nagios Procedures]
* [https://twiki.cern.ch/twiki/bin/view/EGEE/ExternalROCNagios Deployed ROC and NGI Nagios]
* [https://twiki.cern.ch/twiki/bin/view/EGEE/OAT_EGEE_III Main EGEE OAT wiki]
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MyEGEE MyEGEE Documentation]
* [https://twiki.cern.ch/twiki/bin/view/LCG/ATP Aggregated Topology Provider] (ATP)
* [https://tomtools.cern.ch/jira JIRA SAM project tracking system]
 
=== GOCDB ===
 
The Grid Configuration Database (GOCDB) contains general information about the sites participating in the production Grid. It is accessed by all the project actors (end users, site managers, NGI managers, support teams, VO managers), as well as by other tools and third-party middleware, in order to obtain the Grid topology. The portal has a single central installation, but a regional package will be developed and deployed at the interested NGIs.
 
'''Main links:'''
* [https://goc.egi.eu/ GOCDB]
 
====Documentation====
* [[GOCDB Documentation Index]]
* [https://savannah.cern.ch/projects/gocdb/ GOCDB project in Savannah]
 
=== GGUS ===
 
The Global Grid User Support (GGUS) system is the primary means by which users request support when they are using the grid. It is the main support access point for the EGI project. GGUS creates a trouble ticket to record each request and tracks the ticket from creation through to its solution. A user can submit a request in two ways: via email or via the web interface.
 
'''Main links:'''
* [https://gus.fzk.de/pages/home.php Helpdesk]
* [https://gus.fzk.de/stat/ttt.php GGUS ticket timeline tool]
* [https://gus.fzk.de/stat/stat.php GGUS report generator]
* [https://gus.fzk.de/pages/metrics/download_escalation_reports.php GGUS escalation reports]
 
====Documentation====
* [https://gus.fzk.de/pages/docu.php GGUS Main Documentation Page]
* [https://savannah.cern.ch/projects/esc GGUS Savannah shopping list]
* [https://gus.fzk.de/pages/owl.php GGUS wish list page]
 
=== Accounting portal ===
 
The accounting infrastructure is a complex system that involves various sensors in different regions, all publishing data to a central repository. The data is processed, summarized and displayed in the accounting portal, which acts as a common interface to the different accounting record providers and presents a homogeneous view of the gathered data, giving users an accessible way to understand resource utilization.
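
The summarization step described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the record fields (<code>site</code>, <code>vo</code>, <code>cpu_hours</code>), the site and VO names, and the <code>summarise</code> helper are all hypothetical and do not reflect the actual accounting record format or the portal's implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical usage records as regional sensors might publish them.
# Site names, VO names and the "cpu_hours" field are illustrative only.
RECORDS = [
    {"site": "SITE-A", "vo": "vo.example", "cpu_hours": 12.5},
    {"site": "SITE-A", "vo": "vo.example", "cpu_hours": 7.5},
    {"site": "SITE-B", "vo": "vo.example", "cpu_hours": 30.0},
    {"site": "SITE-B", "vo": "vo.other", "cpu_hours": 4.0},
]


def summarise(records: Iterable[dict]) -> Dict[Tuple[str, str], float]:
    """Sum CPU hours per (site, VO) pair."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for record in records:
        totals[(record["site"], record["vo"])] += record["cpu_hours"]
    return dict(totals)


summary = summarise(RECORDS)
for (site, vo), hours in sorted(summary.items()):
    print(f"{site:8} {vo:12} {hours:6.1f} CPU hours")
```

Grouping by site and VO is one possible view; the same pattern applies to any other pair of dimensions a portal might expose.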
 
'''Main links:'''
* [http://accounting.egi.eu/ Accounting portal]
 
====Documentation====
* [http://www3.egee.cesga.es/gridsite/accounting/CESGA/links/acct_ibergrid06_final14.pdf Architecture and Implementation]
* [http://www3.egee.cesga.es/gridsite/accounting/CESGA/links/accounting_portal_installation.pdf Installation Instruction]
* [http://www3.egee.cesga.es/gridsite/accounting/CESGA/links/roadmap.pdf Accounting Portal Roadmap]
 
=== Metrics portal ===
 
The Metrics Portal displays a set of metrics that are used to monitor the performance of the infrastructure and the project, and to track their changes over time. The portal automatically collects all the required data and calculates these metrics before displaying them. It aggregates information from different sources such as GOCDB, GGUS, GridView, etc. using various connectors provided by the data providers. These connectors translate the information gathered from diverse producers and store it in a local database.
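
The connector approach described above can be sketched in Python as follows. This is an illustrative outline only, not the portal's actual code: the connector functions, the metric names, the payload layouts and the <code>metrics</code> table are assumptions made for the example. Each connector turns a source-specific payload into rows of one common shape, which are then stored in a local SQLite database.

```python
import sqlite3
from typing import Callable, Dict, Iterator, Tuple

# Common row shape produced by every connector: (source, metric name, value).
Metric = Tuple[str, str, float]


def gocdb_connector(raw: dict) -> Iterator[Metric]:
    """Hypothetical connector: count the sites listed in a GOCDB-like payload."""
    yield ("GOCDB", "registered_sites", float(len(raw["sites"])))


def ggus_connector(raw: dict) -> Iterator[Metric]:
    """Hypothetical connector: count tickets that are not yet solved."""
    open_count = sum(1 for t in raw["tickets"] if t["status"] != "solved")
    yield ("GGUS", "open_tickets", float(open_count))


CONNECTORS: Dict[str, Callable[[dict], Iterator[Metric]]] = {
    "GOCDB": gocdb_connector,
    "GGUS": ggus_connector,
}


def collect(db: sqlite3.Connection, payloads: Dict[str, dict]) -> None:
    """Run each source's connector and store the results in the local database."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS metrics (source TEXT, name TEXT, value REAL)"
    )
    for source, raw in payloads.items():
        rows = list(CONNECTORS[source](raw))
        db.executemany("INSERT INTO metrics VALUES (?, ?, ?)", rows)
    db.commit()


# Made-up payloads standing in for data fetched from the real tools.
payloads = {
    "GOCDB": {"sites": ["SITE-A", "SITE-B", "SITE-C"]},
    "GGUS": {"tickets": [{"id": 1, "status": "solved"},
                         {"id": 2, "status": "assigned"}]},
}
db = sqlite3.connect(":memory:")
collect(db, payloads)
rows = db.execute(
    "SELECT source, name, value FROM metrics ORDER BY source"
).fetchall()
print(rows)
```

Keeping one connector per source means that adding a new data provider only requires a new function and an entry in the <code>CONNECTORS</code> mapping; the storage code is unchanged.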
 
'''Main links:'''
* [http://metrics.egi.eu/ Metrics portal]
 
====Documentation====
* [http://www3.egee.cesga.es/gridsite/accounting/CESGA/links/paper_metrics_iber2010.pdf Architecture and Implementation]
* [http://www3.egee.cesga.es/gridsite/accounting/CESGA/links/paper_metrics_iber2010.pdf General Documentation]
* [http://www3.egee.cesga.es/gridsite/accounting/CESGA/links/EGEE-III-SA1-Metrics_Portal_Roadmap_and_Requirements-v3.1.xls Metrics Portal Roadmap]
 
=== Network monitoring ===
 
EGI.eu coordinates a lightweight end-to-end network performance monitoring infrastructure and provides configuration support for it. These tools are used to troubleshoot network connectivity issues, such as end-to-end network performance problems affecting Grid data transfers.
 
'''Main links:'''
* [[Network| Network monitoring]]
 
====Documentation====
 
== External tools ==
 
=== Gstat ===
 
The main aim of GStat is to display information about grid services, the grid information system itself and related metrics. GStat provides a way to visualize a grid infrastructure from an operational perspective, based on information found in the grid information system (BDII).
 
'''Main links:'''
* [http://gstat.egi.eu/ Gstat]
 
====Documentation====
* [https://tomtools.cern.ch/confluence/display/IS/DM_Troubleshooting Using GStat 2.0]
 
=== GridView ===
 
GridView is a monitoring and visualization tool being developed to provide a high-level view of various functional aspects of the Worldwide LHC Computing Grid (WLCG). Currently it shows statistics on data transfers, FTS file transfers and running jobs, as well as service availability information for the WLCG.
 
'''Main links:'''
* [http://gridview.cern.ch/GRIDVIEW/same_index.php GridView availability interface]
* [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Statistics historical reports]
* [http://gvdev.cern.ch/GVPC/Excel/ Availability Excel Reports]
 
====Documentation====

Latest revision as of 17:56, 11 February 2011
