EGI-InSPIRE:SA1.4-QR6



1. Task Meetings

There are no dedicated SA1.4 meetings. It was agreed that all deployment issues would be discussed with operational tool representatives at the JRA1 meetings. The JRA1 meetings at which subjects relevant to SA1.4 were discussed are listed below.

Date (dd/mm/yyyy) | Indico agenda URL | Title | Outcome
01/09/2011 | https://www.egi.eu/indico/conferenceDisplay.py?confId=577 | InSPIRE-JRA1 phone conf | Regionalization plans for all tools.
15/09/2011 | https://www.egi.eu/indico/conferenceDisplay.py?confId=608 | InSPIRE-JRA1 phone conf | Metric portal status. Technical forum planning.
20/09/2011 | https://www.egi.eu/indico/sessionDisplay.py?sessionId=78&confId=452#20110920 | Operations Tools and Availability Calculation (EGI TF) | Dedicated EGI TF session on operational tools monitoring and availability calculation.
30/09/2011 | https://www.egi.eu/indico/conferenceDisplay.py?confId=648 | A/R calculation TF session follow up | Continued discussion on availability and reliability calculation.
19/10/2011 | - | A/R probe meeting | Discussion about probe for site A/R monitoring.
20/10/2011 | https://www.egi.eu/indico/conferenceDisplay.py?confId=608 | InSPIRE-JRA1 phone conf | Status of VO SAM instance support.

2. Main Achievements

Operational tools progress

The new version of the messaging broker, ActiveMQ 5.5, was tested in October. An additional broker network was set up for testing purposes. The testing network consisted of four brokers (two at AUTH and one each at CERN and SRCE) and passed all the tests. The main issue with the new broker is the lack of proper packaging and of a Yaim module, which needs to be resolved before the production instances are upgraded.

The Metrics portal reached a stable version and was used to generate QR6.

Two new versions of the Operations portal were deployed this quarter: 2.6.3 on August 5th and 2.6.4 on September 29th. A detailed list of new features can be found in the JRA1 section. At the end of the quarter there were four NGI instances: NGI_BY, NGI_CZ, NGI_GRNET and NGI_IBERGRID. Decommissioning of the old CIC portal (cic.egi.eu) was postponed and is planned for the next quarter.

Two new versions of SAM were deployed this quarter: SAM-Update13 on September 7th and SAM-Update14 on October 22nd. Deployment of SAM/Nagios NGI instances continued. As part of the creation of NGI_UK, the UKI ROC SAM instance was switched to an NGI instance covering two NGIs: NGI_IE (Ireland) and the new NGI_UK. At the end of the quarter the following SAM/Nagios instances were in production:

A detailed list of SAM/Nagios instances can be found on the following page: SAM Instances.

Since September 12th SAM has used the new test hr.srce.CADist-Check to monitor the EGI Trust Anchor version on WNs. The new test is included in the operations tests and in the availability and reliability tests. The main new feature of the CA test is that it uses the metadata provided in the CA release, so the CA probe package no longer needs to be updated manually after each CA release.
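The idea of a metadata-driven check can be illustrated with a minimal sketch. This is not the hr.srce.CADist-Check implementation: the function names, the version strings and the way the released version reaches the probe are all assumptions made for illustration. Only the exit codes follow the usual Nagios plugin convention (0 for OK, 2 for CRITICAL).

```python
import re

# Conventional Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def parse_version(text):
    """Turn a version string such as '1.41-1' into a tuple of integers."""
    return tuple(int(part) for part in re.split(r"[.-]", text.strip()))


def check_trust_anchors(installed, released):
    """Compare the installed trust anchor version with the released one.

    `released` is assumed to be read from the release metadata, so the
    probe itself carries no hard-coded version (hypothetical design).
    """
    if parse_version(installed) >= parse_version(released):
        return OK, f"OK: trust anchors {installed} are up to date"
    return CRITICAL, (
        f"CRITICAL: trust anchors {installed} installed, "
        f"release metadata lists {released}"
    )
```

Because the expected version is an input taken from the release metadata rather than a constant inside the probe, a new CA release changes only the metadata and not the probe package.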

Monitoring of core services and operational tools

Development of the new SAM instance for monitoring operational tools started. The first step was the reorganization of operational tools in the GOCDB:

Additional details can be found in the following slides: https://www.egi.eu/indico/conferenceDisplay.py?confId=549. This reorganization will enable automatic bootstrapping of the SAM instance for operational tools and integration with the MyEGI web interface and the ACE system for A/R calculation.

A reorganization of NGI core services in the GOCDB was proposed at the OMB (https://www.egi.eu/indico/conferenceDisplay.py?confId=615). This reorganization will enable NGI-level A/R calculation.

EGI Technical Forum

During the EGI Technical Forum in Lyon several sessions related to operational tools were organized. The most important one was "Operations Tools and Availability Calculation" (https://www.egi.eu/indico/sessionDisplay.py?sessionId=78&confId=452#20110920). The main topics were:

Several side meetings were held at the EGI TF:

3. Issues and Mitigation

Issue Description | Mitigation Description
High availability of central operational tools is needed. | GOCDB: a dynamic load-balancing DNS setup is provided for the address goc.egi.eu. The secondary instance at the Fraunhofer institute is still being deployed; the delay is caused by the development and deployment of the new GOCDB version.
Monitoring of underperforming sites. | The COD team has proposed monitoring the availability and reliability (A/R) of sites. If a site's A/R decreases, an alarm would be raised against the site. This approach would enable sites to correct their A/R figures before the end of the month and stay within OLA thresholds. Discussions have started on defining the implementation details.
The ActiveMQ broker is not fully packaged and the Yaim module is missing. There is no support unit for the ActiveMQ broker. | Discussions with the EMI messaging product team have started in order to agree on a package format. Once the package format is agreed, the AUTH partner will provide additional documentation and a secure SVN repository for storing configuration files. This approach will be used only for the broker network used by operational tools. If any other EMI service requires messaging infrastructure, a proper support unit and Yaim modules will need to be provided by EMI.
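The site A/R monitoring proposed by the COD team could work roughly as in the sketch below. The implementation details were still under discussion, so everything here is an assumption: the SiteHours structure, the function names and the threshold values are hypothetical, and the formulas are the commonly used definitions in which unknown time is excluded and reliability also excludes scheduled downtime.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SiteHours:
    """Month-to-date status hours for one site (hypothetical structure)."""
    up: float
    down_unscheduled: float
    down_scheduled: float


def availability(h: SiteHours) -> Optional[float]:
    """Up time divided by all known time; None when there is no data."""
    known = h.up + h.down_unscheduled + h.down_scheduled
    return h.up / known if known > 0 else None


def reliability(h: SiteHours) -> Optional[float]:
    """Like availability, but scheduled downtime is not counted."""
    known = h.up + h.down_unscheduled
    return h.up / known if known > 0 else None


def check_site(name: str, h: SiteHours,
               min_availability: float = 0.70,   # assumed threshold
               min_reliability: float = 0.75) -> List[str]:  # assumed threshold
    """Return one alarm message per figure that is below its threshold."""
    alarms = []
    a, r = availability(h), reliability(h)
    if a is not None and a < min_availability:
        alarms.append(f"{name}: availability {a:.1%} below {min_availability:.0%}")
    if r is not None and r < min_reliability:
        alarms.append(f"{name}: reliability {r:.1%} below {min_reliability:.0%}")
    return alarms
```

Running such a check daily on month-to-date figures, rather than once at the end of the month, is what would give a site time to react before the monthly figures are final.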

4. Plans for the next period

Decommissioning of the old CIC portal (cic.egi.eu) is planned for the next quarter.

Track and perform the planned tests of the failover configurations of centralized tools.

Deployment of the new SAM instance dedicated to monitoring operational tools, with the new probes provided by operational tools developers.

Integration of DesktopGrids resources into the EGI infrastructure.
