Difference between revisions of "EGI-InSPIRE:SA1.4-QR10"
Revision as of 17:52, 6 January 2015
1. Task Meetings
|Date (dd/mm/yyyy)||Url Indico Agenda||Title||Outcome|
|30/08/2012||https://indico.egi.eu/indico/conferenceDisplay.py?confId=1150||Meeting regarding monitoring of brokers||The meeting defined which tests should be run against the broker network.|
|17/10/2012||https://indico.egi.eu/indico/conferenceDisplay.py?confId=1245||Message brokers status update|| |
|24/10/2012||https://indico.egi.eu/indico/conferenceDisplay.py?confId=1245||Discussion on PROD Broker network|| |
2. Main Achievements
At the end of August there was a staff change at the messaging lead partner (AUTH). Because of the change, several meetings on the message broker network were held to define future actions related to stability and security.
GOCDB version 4.4 was released on September 10th. A detailed list of new features can be found in the JRA1 section.
A read-only GOCDB failover instance is deployed at the Fraunhofer institute in Germany (https://goc.itwm.fraunhofer.de/portal). The failover is read-only by design, to prevent data inconsistencies, and is refreshed every 2 hours by securely downloading and installing a database dump file. It is currently one version behind and will need to be upgraded to the latest version soon; this does not matter for now because the current version did not introduce any DB/PI changes.
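The 2-hour refresh cycle suggests a simple freshness check on the failover. The following is a minimal sketch only: the function name `is_stale`, the grace period, and the idea of tracking the time of the last successful dump import are assumptions for illustration and are not described in this report.

```python
from datetime import datetime, timedelta, timezone

# Refresh interval stated in the report.
REFRESH_INTERVAL = timedelta(hours=2)


def is_stale(last_import, now, grace=timedelta(minutes=30)):
    """Return True if the last successful dump import is too old.

    `grace` is a hypothetical tolerance that allows for download and
    install time before the failover is flagged as out of date.
    """
    return now - last_import > REFRESH_INTERVAL + grace


# Example: an import that finished 3 hours ago is flagged as stale.
now = datetime(2012, 10, 31, 12, 0, tzinfo=timezone.utc)
stale = is_stale(now - timedelta(hours=3), now)
fresh = is_stale(now - timedelta(hours=1), now)
```

A check of this kind could feed the planned failover tests, since a read-only copy is only useful if its data is recent.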
One new version of the Operations Portal was deployed this quarter: 2.9.6, on September 3rd. The major achievement in the new version is the implementation of a probe for monitoring under-performing sites. A detailed list of new features can be found in the JRA1 section. At the end of the quarter there were four NGI instances: NGI_BY, NGI_CZ, NGI_GRNET and NGI_IBERGRID.
The transition from topics to virtual destinations, intended to improve synchronization between the SAM instances and the Operations Portal, is in progress.
The SAM Update-17 staged rollout finished successfully on August 27th. By the end of the quarter 30 instances had been upgraded to SAM Update-17. At the end of the quarter the following SAM/Nagios instances were in production:
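For readers unfamiliar with ActiveMQ virtual destinations, the toy model below illustrates the default naming convention: producers publish to a topic named `VirtualTopic.<name>`, and each consumer reads from its own queue named `Consumer.<consumer>.VirtualTopic.<name>`. It is an in-memory sketch, not broker code. The class `ToyBroker` and the consumer and destination names are hypothetical; the report does not list the actual destinations used by SAM or the Operations Portal.

```python
from collections import defaultdict, deque


def virtual_topic(name):
    """Topic name a producer publishes to (ActiveMQ default convention)."""
    return f"VirtualTopic.{name}"


def consumer_queue(consumer, name):
    """Per-consumer queue that receives a copy of each published message."""
    return f"Consumer.{consumer}.VirtualTopic.{name}"


class ToyBroker:
    """In-memory stand-in for a broker; illustration only."""

    def __init__(self):
        self.queues = defaultdict(deque)
        self.subscriptions = defaultdict(set)  # topic name -> queue names

    def register(self, consumer, name):
        queue = consumer_queue(consumer, name)
        self.subscriptions[virtual_topic(name)].add(queue)
        self.queues[queue]  # create the (empty) queue
        return queue

    def publish(self, name, message):
        # Every registered consumer queue gets its own copy, which stays
        # queued until that consumer reads it.
        for queue in self.subscriptions[virtual_topic(name)]:
            self.queues[queue].append(message)


# Hypothetical consumer and destination names.
broker = ToyBroker()
portal_q = broker.register("portal", "results")
archive_q = broker.register("archiver", "results")
broker.publish("results", "test result 1")
```

Because each consumer has its own queue, a consumer that is temporarily offline finds its messages waiting when it reconnects, whereas a plain non-durable topic subscription would miss them.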
- 28 NGI instances covering 39 EGI partners
- 3 ROC instances covering 3 EGI partners
- 3 external ROC instances covering the following regions: Canada, IGALC and LA.
A detailed list of SAM/Nagios instances can be found on the following page: SAM Instances.
A new SAM instance for monitoring operational tools was deployed in October: https://ops-monitor.cern.ch/nagios. Integration with the central ACE was still in progress at the end of the quarter.
3. Issues and Mitigation
|Issue Description||Mitigation Description|
|High availability of central operational tools is needed.||GOCDB: a dynamic load-balancing DNS setup is provided for the address goc.egi.eu.
SAM: 4 SAM instances officially use failover instances (NGI_FI, NGI_IT, NGI_RO, NGI_UK).|
|Monitoring of MW versions||As part of the middleware upgrade campaign, monitoring of MW versions was implemented on the security monitoring instance. In the future this monitoring should be integrated into the NGI part of the Operations Portal. Possible approaches are integration into the NGI SAM instances or deployment of a dedicated SAM instance for this purpose.|
4. Plans for the next period
Track and perform planned tests of failover configurations of centralized tools.
The upgrade of the ActiveMQ brokers to 5.5.1-fuse-08-15 is scheduled for the beginning of November. Make progress on security improvements for the ActiveMQ brokers, in particular communication with clients (e.g. APEL, SAM) and clarification of client roadmaps.
Finalize the integration of the operational tools monitoring instance with the central ACE and define the A/R report for operational tools.