Difference between revisions of "Agenda-12-06-2017"

(yearly review of the information registered into GOC-DB)
 
(7 intermediate revisions by 3 users not shown)
Line 1: Line 1:
{{TOC right}}  
+
{{Template:Op menubar}} {{Template:Doc_menubar}} {{TOC_right}}
 +
[[Category:Grid Operations Meetings]]
  
 
= General information  =
 
= General information  =
Line 8: Line 9:
  
 
== CMD ==
 
== CMD ==
 +
 +
CMD-OS 1.1.0 RC ready http://repository.egi.eu/sw/production/cmd-os/candidate/1/
  
 
== UMD ==
 
== UMD ==
Line 20: Line 23:
  
 
== ARGO/SAM ==
 
== ARGO/SAM ==
 +
 +
* ARC-CE probes are updated in order to mitigate the issue with missing jobs (https://ggus.eu/index.php?mode=ticket_info&ticket_id=126724)
 +
* FTS default port changed to 8446 and it is extracted from GOCDB service URL (https://ggus.eu/index.php?mode=ticket_info&ticket_id=128154)
 +
* New probes:
 +
** AAI CheckIn: HTTP checks of all URLs in GOCDB
 +
** NGI Argus: https://sccsec-egi-git.scc.kit.edu/EGI-CSIRT/nagios-plugins-egi.argus-ngi
 +
** WebDAV: https://gitlab.cern.ch/lcgdm/nagios-plugins-webdav
 +
** Internal ARGO probes: API queries, Nagios & ARC-CE monitor test freshness, Consumer & connectors, AMS
 +
* ARGO MON switched from UMD-3 to UMD-4
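
A minimal sketch of the FTS port fallback described in the list above, assuming the service URL registered in GOCDB is a full URL including the scheme. The function name and the example host names are hypothetical and are not taken from the actual ARGO probe code:

```python
from urllib.parse import urlsplit

# New default port mentioned in the FTS item above.
DEFAULT_FTS_PORT = 8446


def fts_port(service_url):
    """Return the port found in the service URL, or the default if none is given.

    Hypothetical helper: assumes `service_url` includes a scheme
    (e.g. "https://..."), otherwise urlsplit cannot locate the host and port.
    """
    port = urlsplit(service_url).port
    return port if port is not None else DEFAULT_FTS_PORT


# Hypothetical endpoints, for illustration only.
print(fts_port("https://fts.example.org:8449/"))
print(fts_port("https://fts.example.org/"))
```

The first call uses the port given explicitly in the URL; the second falls back to 8446 because the URL carries no port.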
  
 
== Testing FedCloud sites  ==
 
== Testing FedCloud sites  ==
Line 49: Line 61:
 
To track the process, a [https://wiki.egi.eu/wiki/Verify_Configuration_Records series of tickets] have been opened.  
 
To track the process, a [https://wiki.egi.eu/wiki/Verify_Configuration_Records series of tickets] have been opened.  
  
'''2017-05-15 UPDATE''':
+
'''2017-06-12 UPDATE''':
 
+
*no feedback yet by: AfricaArabia, NGI_DE, NGI_FI, NGI_IL, NGI_NL, NGI_UA;  
*no feedback yet by: AfricaArabia, NGI_DE, NGI_IL, NGI_NL, NGI_UA;
 
*still reviewing: NGI_GRNET, NGI_HR, NGI_IBERGRID, NGI_IT, NGI_PL, NGI_RO, ROC_LA.
 
 
 
'''2017-06-12 UPDATE''': *no feedback yet by: AfricaArabia, NGI_DE, NGI_FI, NGI_IL, NGI_NL, NGI_UA;  
 
 
*still reviewing: NGI_IBERGRID, NGI_IT, ROC_LA.
 
*still reviewing: NGI_IBERGRID, NGI_IT, ROC_LA.
  
Line 63: Line 71:
 
** '''AsiaPacific'''
 
** '''AsiaPacific'''
 
*** TW-NCUHEP: site-bdii unstable for network issues with ARGO https://ggus.eu/index.php?mode=ticket_info&ticket_id=128083
 
*** TW-NCUHEP: site-bdii unstable for network issues with ARGO https://ggus.eu/index.php?mode=ticket_info&ticket_id=128083
***KR-UOS-SSCC: there were srm problems, now also CREAM failures, proposed the suspension after the end of this meeting https://ggus.eu/index.php?mode=ticket_info&ticket_id=127024
+
***KR-UOS-SSCC: there were srm problems, now also CREAM failures https://ggus.eu/index.php?mode=ticket_info&ticket_id=127024
 
** '''NGI_DE''' [https://ggus.eu/index.php?mode=ticket_info&ticket_id=125430 GGUS 125430]
 
** '''NGI_DE''' [https://ggus.eu/index.php?mode=ticket_info&ticket_id=125430 GGUS 125430]
 
***LRZ https://ggus.eu/index.php?mode=ticket_info&ticket_id=128087 site-bdii unreachable, GRAM5 failures; improving
 
***LRZ https://ggus.eu/index.php?mode=ticket_info&ticket_id=128087 site-bdii unreachable, GRAM5 failures; improving
Line 78: Line 86:
 
**NGI_FI (CSC) https://ggus.eu/index.php?mode=ticket_info&ticket_id=128883 (SOLVED)
 
**NGI_FI (CSC) https://ggus.eu/index.php?mode=ticket_info&ticket_id=128883 (SOLVED)
 
**NGI_FRANCE: https://ggus.eu/index.php?mode=ticket_info&ticket_id=128884 QoS violation
 
**NGI_FRANCE: https://ggus.eu/index.php?mode=ticket_info&ticket_id=128884 QoS violation
**NGI_IBERGRID (UNICAN) https://ggus.eu/index.php?mode=ticket_info&ticket_id=128885
+
**NGI_IBERGRID (UNICAN) https://ggus.eu/index.php?mode=ticket_info&ticket_id=128885 the site has just been decommissioned
 
**NGI_IL: https://ggus.eu/index.php?mode=ticket_info&ticket_id=128886 QoS violation
 
**NGI_IL: https://ggus.eu/index.php?mode=ticket_info&ticket_id=128886 QoS violation
**NGI_IT: https://ggus.eu/index.php?mode=ticket_info&ticket_id=128887 QoS violation
+
**NGI_IT: https://ggus.eu/index.php?mode=ticket_info&ticket_id=128887 QoS violation (SOLVED)
 
**NGI_PL (IFJ-PAN-BG) https://ggus.eu/index.php?mode=ticket_info&ticket_id=128889
 
**NGI_PL (IFJ-PAN-BG) https://ggus.eu/index.php?mode=ticket_info&ticket_id=128889
 
**NGI_RO (RO-11-NIPNE, RO-14-ITIM) https://ggus.eu/index.php?mode=ticket_info&ticket_id=128890
 
**NGI_RO (RO-11-NIPNE, RO-14-ITIM) https://ggus.eu/index.php?mode=ticket_info&ticket_id=128890
Line 121: Line 129:
 
***'''NGIs/ROCs please start discussing with sites and provide suggestions for the overall plan'''
 
***'''NGIs/ROCs please start discussing with sites and provide suggestions for the overall plan'''
  
== Decommissioning of dCache 2.10 and 2.13 (to modify) ==
+
== Decommissioning of dCache 2.10 and 2.13 ==
  
 
* support for the '''dCache 2.10''' ended at December 2016, tickets opened by EGI Operations to track decommissioning
 
* support for the '''dCache 2.10''' ended at December 2016, tickets opened by EGI Operations to track decommissioning

Latest revision as of 14:26, 25 October 2017



General information

Middleware

CMD

CMD-OS 1.1.0 RC ready http://repository.egi.eu/sw/production/cmd-os/candidate/1/

UMD

Preview repository

Released on 2017-06-02:

  • Preview 1.12.0 AppDB info (sl6): ARC 15.03 u14, davix 0.6.6, DMLite 0.8.6, dpm-dsi 1.9.13, FTS 3.6.8, XRootD 4.6.1
  • Preview 2.12.0 AppDB info (CentOS 7): ARC 15.03 u14, davix 0.6.6, DMLite 0.8.6, dpm-dsi 1.9.13, FTS 3.6.8, XRootD 4.6.1, WN 4.0.5

Operations

ARGO/SAM

Testing FedCloud sites

Feedback from Helpdesk

yearly review of the information registered into GOC-DB

2017-04-07

The information registered in GOC-DB needs to be verified on a yearly basis. NGIs and RCs have been asked to check it. In particular:

  1. NGI managers should review the people registered and the roles assigned to them, and in particular check the following information:
    • E-Mail
    • ROD E-Mail
    • Security E-Mail
NGI managers should also review the status of the "not certified" RCs, according to the RC Status Workflow;
  2. RC administrators should review the people registered and the roles assigned to them, and in particular check the following information:
    • E-Mail
    • telephone numbers
    • CSIRT E-Mail
RC administrators should also review the information related to the registered service endpoints.

The process should be completed by Apr 28th.

To track the process, a series of tickets has been opened.

2017-06-12 UPDATE:

  • no feedback yet by: AfricaArabia, NGI_DE, NGI_FI, NGI_IL, NGI_NL, NGI_UA;
  • still reviewing: NGI_IBERGRID, NGI_IT, ROC_LA.

Monthly Availability/Reliability

Decommissioning EMI WMS

As discussed at the February and April/May OMBs, we are making plans for decommissioning the WMS and moving to DIRAC.

NGIs provided WMS usage statistics; in general the usage is relatively low and mainly for local testing.

Moderate usage by a few VOs:

  • NGI_CZ: eli-beams.eu
  • NGI_GRNET: see
  • NGI_IT: calet.org, compchem, theophys, virgo
  • NGI_PL: gaussian, vo.plgrid.pl, vo.nedm.cyfronet
  • NGI_UK: mice, t2k.org

EGI contacted these VOs to agree on a smooth migration of their activities to DIRAC; only some of them have replied so far:

  • compchem is already testing DIRAC
  • calet.org: discussing with the users the migration to DIRAC. Interested in a webinar on DIRAC.
  • mice: enabled on the GridPP DIRAC server

We need the VO feedback to better define the technical details and timeline:

  • NGIs with VOs using WMS (not necessarily limited to the VOs above), please contact them to ensure that these VOs have a back-up plan.

WMS servers can be decommissioned as soon as the supported VOs do not need them any more. The proposal is:

  • WMS will be removed from production starting from 1st January 2018.
    • VOs have 8 months to find alternatives or migrate to DIRAC
  • Considering that this is not an update, the decommissioning can be performed in a few weeks.

IPv6 readiness plans

    • Resource Centres: assess the IPv6 readiness of the site infrastructure (real machines, cloud managers)
      • NGIs/ROCs please start discussing with sites and provide suggestions for the overall plan

Decommissioning of dCache 2.10 and 2.13

  • support for dCache 2.10 ended in December 2016; tickets were opened by EGI Operations to track decommissioning
  • the dCache 2.13 decommissioning procedure has started: in June the probes will get CRITICAL, support from dCache ends in July, and upgrades are to be performed by August
  • please upgrade to 2.16, whose support ends in May 2018, or to 3.0
    • note that the dCache team does not support the upgrade from 2.10 directly to 2.16; only the 2.10->2.13 and 2.13->2.16 transitions are supported.
  • decommissioning campaign will be started by EGI Operations to monitor the upgrade of the dCache 2.13 instances and follow up with the NGIs/sites at the beginning of August
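
The supported transitions listed above can be sketched as a small lookup that returns the sequence of versions a site has to pass through. This is only an illustration: the function and variable names are hypothetical, and only the 2.10->2.13 and 2.13->2.16 steps stated above are encoded (the agenda does not say which upgrades to 3.0 are supported, so 3.0 is left out):

```python
# Single-step upgrades stated as supported in the agenda.
SUPPORTED_STEPS = {"2.10": "2.13", "2.13": "2.16"}


def upgrade_path(current, target="2.16"):
    """Return the versions to pass through, starting at `current`.

    Raises ValueError if no chain of supported steps leads to `target`.
    """
    path = [current]
    while path[-1] != target:
        next_version = SUPPORTED_STEPS.get(path[-1])
        if next_version is None:
            raise ValueError(
                f"no supported upgrade from {path[-1]} towards {target}"
            )
        path.append(next_version)
    return path


print(upgrade_path("2.10"))
print(upgrade_path("2.13"))
```

A site on 2.10 therefore gets a two-step path through 2.13, while a site on 2.13 can go to 2.16 in one step.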

Testing the new webdav probes

Site | Host | GGUS ID | Note
CYFRONET-LCG2 | se01.grid.cyfronet.pl | https://ggus.eu/index.php?mode=ticket_info&ticket_id=128325 | SOLVED
GRIF | node12.datagrid.cea.fr | https://ggus.eu/index.php?mode=ticket_info&ticket_id=128329 |
IGI-BOLOGNA | darkstorm.cnaf.infn.it | https://ggus.eu/index.php?mode=ticket_info&ticket_id=127930 | SOLVED
INFN-T1 | removed | https://ggus.eu/index.php?mode=ticket_info&ticket_id=128326 | SOLVED
NCG-INGRID-PT | gftp01.ncg.ingrid.pt | https://ggus.eu/index.php?mode=ticket_info&ticket_id=128327 | SOLVED
UKI-NORTHGRID-LIV-HEP | hepgrid11.ph.liv.ac.uk | https://ggus.eu/index.php?mode=ticket_info&ticket_id=128328 | SOLVED
egee.irb.hr | lorienmaster.irb.hr | |

Missing steps:

Testing of the storage accounting

As discussed during the January OMB, the APEL team would need one site per NGI for testing the storage accounting. The eligible sites are the ones providing either dCache or DPM storage elements.

More information can be found in the following wiki: https://wiki.egi.eu/wiki/APEL/Storage

List of sites available for test.

2017-06-12 UPDATE:

  • 26 sites are sending storage accounting data (only from dCache and DPM SEs). The data has to be verified before deploying the script in production.
  • After the discussion at the March OMB, we are evaluating the creation of a new service type on GOC-DB that will be used for:
    • authorising the site/SE to publish the accounting data
    • making the site/SE appear in the portal
    • monitoring that the accounting data are regularly published

Currently the accounting service types are:

  1. glite-APEL: for authorizing the sending of the messages
  2. APEL: to monitor the accounting data publication

The proposed name is "APEL-SE"

AOB

Next meeting