EGI-InSPIRE:SA1.4-QR10



1. Task Meetings

Date (dd/mm/yyyy) | Indico agenda URL | Title | Outcome
30/08/2012 | https://indico.egi.eu/indico/conferenceDisplay.py?confId=1150 | Meeting regarding monitoring of brokers | The meeting defined which tests should be run against the broker network.
17/10/2012 | https://indico.egi.eu/indico/conferenceDisplay.py?confId=1245 | Message brokers status update |
24/10/2012 | https://indico.egi.eu/indico/conferenceDisplay.py?confId=1245 | Discussion on PROD Broker network |

2. Main Achievements

At the end of August there was a staff change at the messaging lead partner (AUTH). Following the change, several meetings on the message broker network were held to define future actions related to stability and security.


GOCDB version 4.4 was released on September 10th. A detailed list of new features can be found in the JRA1 section.

A read-only GOCDB failover instance is deployed at the Fraunhofer institute in Germany (https://goc.itwm.fraunhofer.de/portal). The failover is read-only to prevent data inconsistencies and is refreshed every 2 hours, when it securely downloads and installs a database dump file. It is currently one version behind and will need to be brought up to the latest version soon; this should not matter at present because the current version did not introduce any new DB/PI changes.


One new version of the Operations Portal was deployed in this quarter: 2.9.6, on September 3rd. The major achievement in the new version is the implementation of a probe for monitoring under-performing sites. A detailed list of new features can be found in the JRA1 section. At the end of the quarter there were four NGI instances: NGI_BY, NGI_CZ, NGI_GRNET and NGI_IBERGRID.

The transition from topics to virtual destinations, intended to improve synchronization between SAM instances and the Operations Portal, is in progress.


The SAM Update-17 staged rollout finished successfully on August 27th. By the end of the quarter 30 instances had been upgraded to SAM Update-17. At the end of the quarter the following SAM/Nagios instances were in production:

A detailed list of SAM/Nagios instances can be found on the following page: SAM Instances.

A new SAM instance for monitoring operational tools was deployed in October: https://ops-monitor.cern.ch/nagios. Integration with the central ACE was still in progress at the end of the quarter.

3. Issues and Mitigation

Issue Description | Mitigation Description
High availability of central operational tools is needed. | GOCDB: a dynamic load-balancing DNS setup is provided for the address goc.egi.eu. SAM: 4 SAM instances are officially using failover instances (NGI_FI, NGI_IT, NGI_RO, NGI_UK).
Monitoring of MW versions | As part of the middleware upgrade campaign, monitoring of MW versions was implemented on the security monitoring instance. In the future this monitoring should be integrated into the NGI part of the Operations Portal. Possible approaches are integration into NGI SAM instances or deployment of a dedicated SAM instance just for this purpose.

4. Plans for the next period

Track and perform the planned tests of failover configurations for the centralized tools.

The upgrade of the ActiveMQ brokers to 5.5.1-fuse-08-15 is scheduled for the beginning of November. Make progress on security improvements for the ActiveMQ brokers, in particular communication with clients (e.g. APEL, SAM) and clarification of client roadmaps.

Finalize integration of the operational tools monitoring instance with the central ACE. Define the A/R report for operational tools.
