
EGI-InSPIRE:Plan 2012 SA1.4


Assessment of progress, 2011

Completed activities

Migration from gridops.org domain of EGI central tools to egi.eu domain

The migration was finalized on July 4th 2011, when the gridops.org domain was decommissioned. The decommissioning did not cause problems for the grid, the operational tools, or any external system. All EGI central tools now use the egi.eu domain; the list can be found on the following page: Tools.

Definition of procedures relevant for operational tools

Two procedures relevant for operational tools were approved at the Operations Management Board on March 15th 2011:

  • Adding new probes to SAM (PROC07)
  • Management of the EGI OPS Availability and Reliability Profile (PROC08)

One manual relevant for operational tools was accepted at the Operations Management Board on July 26th 2011:

Central MyEGI deployment

The central MyEGI instance (http://grid-monitoring.cern.ch/myegi/) was deployed in February 2011, after the SAM Update-09 release.

Regionalization of OPS VO

It was agreed that CERN will continue running the VOMRS service and that management of the VO will be transferred to EGI. At the Operations Management Board on January 25th 2011 it was agreed that the VO managers will be Emir Imamagic and Peter Solagna. Furthermore, since each NGI can have only 2 DNs registered in the OPS VO, it was decided that all VO-related operations will be performed by the two VO managers.

Mechanisms for automating maintenance of ActiveMQ brokers

CERN provided a tool (mbcg) for automatic generation of configuration files for all ActiveMQ brokers. AUTH further modified the generated packages to provide a more generic solution. In addition, AUTH provided certificate-protected wiki pages where all the sensitive data and installation instructions are stored.

Additional activities

During 2011 the SAM infrastructure was fully distributed. At the end of the year the following SAM/Nagios instances were in production:

  • 26 NGI instances covering 37 EGI partners
  • 2 ROC instances covering 2 EGI partners
  • 1 project instance covering 1 EGI partner
  • 3 external ROC instances covering the following regions: Canada, IGALC and LA.
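
The figures in the list above can be totalled with a short sketch. The dictionary names are illustrative only, and the partner total assumes that no EGI partner is counted in more than one category, which the list does not state explicitly.

```python
# Instance counts per category, taken from the list above (end of 2011).
instances = {
    "NGI": 26,
    "ROC": 2,
    "project": 1,
    "external ROC": 3,
}

# EGI partners covered per category. The external ROC instances serve
# regions (Canada, IGALC, LA) and are therefore not included here.
partners_covered = {
    "NGI": 37,
    "ROC": 2,
    "project": 1,
}

total_instances = sum(instances.values())
# Assumption: no partner appears in more than one category.
total_partners = sum(partners_covered.values())

print(f"SAM/Nagios instances in production: {total_instances}")
print(f"EGI partners covered: {total_partners}")
```

Under that assumption this gives 32 instances in production covering 40 EGI partners.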

A detailed list of SAM/Nagios instances can be found on the following page: SAM Instances.

The Metrics portal reached a stable version and was used in QR6 generation.

Ongoing activities

Monitoring of operations tools

Development of the new SAM instance for operational tools monitoring started in PQ5. The first step was the reorganization of operational tools in the GOCDB:

Additional details can be found in the following slides: https://www.egi.eu/indico/conferenceDisplay.py?confId=549. This reorganization will enable automatic bootstrapping of the SAM instance for operational tools and its integration with the MyEGI web interface and the ACE system for A/R calculation.

Security implementation in messaging system

CIC Portal decommissioning

Decommissioning of the old CIC portal (cic.egi.eu) was postponed and is on the roadmap for 2012. The postponement was caused by the development of CIC features (VO ID cards, broadcast) in the Operations Portal.

High Availability implementation for Operational Tools

The GOCDB failover implementation was postponed due to GOCDB 4.1 development. The task is on the roadmap for Dec 2011 and 2012 (now depending on Fraunhofer Institute).

SAM release Update-13 will provide the functionality for deploying a secondary instance. Secondary SAM instances will be deployed depending on NGI size and resources.

Plans for 2012

Monitoring of operations tools

Security implementation in messaging system

CIC Portal decommissioning

High Availability implementation for Operational Tools

The central GOCDB failover is expected to be in place at Fraunhofer around the end of Dec 2011/Jan 2012. A DNS switch for the 'goc.egi.eu' domain between the production server and the failover server is in place (but not yet tested). Once installed, the failover will be read-only in order to prevent data-synchronization problems.