Agenda-2020-07-13

From EGIWiki

Revision as of 09:50, 30 June 2020

General information

Middleware

UMD

Preview repository

  • released on 2020-05-08
    • Preview 1.27.0 AppDB info (sl6): ARC 6.5.0 and 6.6.0, CVMFS 2.7.2, dCache 5.2.20, frontier-squid 4.11.2, gfal2 2.17.2, xrootd 4.11.3
    • Preview 2.27.0 AppDB info (CentOS 7): ARC 6.5.0 and 6.6.0, CVMFS 2.7.2, dCache 5.2.20, frontier-squid 4.11.2, gfal2 2.17.2, xrootd 4.11.3

Operations

ARGO/SAM

  • new metrics added to ARGO_MON_OPERATORS profile on May 27th: eu.egi.CREAMCE-JobSubmit, eu.egi.CREAMCE.WN-Csh, eu.egi.CREAMCE.WN-Softver
    • results: 177 endpoints, 15 WARNING (Timeout occurred (900 sec)), 53 CRITICAL. Success rate: 70% if the WARNING results are counted as successes (124 of 177), 61.6% if they are counted as failures (109 of 177)
  • When eu.egi.CREAMCE.WN-Softver is successful:
CREAM JobOutput OK: retrieved outputSandbox: ['std.err', 'std.out']

**** std.err ****


**** std.out ****
egee01 has UMD 3.14.4

  • When it fails:

CREAM JobOutput ERROR [DONE-OK, exitCode=1 ]: retrieved outputSandbox: ['std.err', 'std.out']

**** std.err ****
 
**** std.out ****
ERROR: unable to find glite, EMI, LCG or UMD WN version on n1037-amd
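The two success rates quoted above and the two sample outputs can be reproduced with a short sketch. This is only an illustration: the function names are hypothetical, and the line patterns are inferred from the two samples shown here, so real probe output may differ.

```python
import re


def success_rates(total, warning, critical):
    """Return (rate with WARNING counted as success, rate with WARNING
    counted as failure), both in percent, rounded to one decimal."""
    ok = total - warning - critical
    lenient = 100.0 * (ok + warning) / total
    strict = 100.0 * ok / total
    return round(lenient, 1), round(strict, 1)


# Patterns inferred from the two sample std.out lines above (an assumption,
# not a documented output format of the probe).
FOUND = re.compile(r"^(?P<host>\S+) has (?P<flavour>\S+) (?P<version>\d+(?:\.\d+)*)$")
MISSING = re.compile(r"^ERROR: unable to find .* WN version on (?P<host>\S+)$")


def parse_wn_softver(line):
    """Classify one std.out line of eu.egi.CREAMCE.WN-Softver."""
    line = line.strip()
    match = FOUND.match(line)
    if match:
        return {"found": True, **match.groupdict()}
    match = MISSING.match(line)
    if match:
        return {"found": False, "host": match.group("host")}
    return None  # line does not look like either sample


if __name__ == "__main__":
    print(success_rates(177, 15, 53))  # (70.1, 61.6)
    print(parse_wn_softver("egee01 has UMD 3.14.4"))
    print(parse_wn_softver(
        "ERROR: unable to find glite, EMI, LCG or UMD WN version on n1037-amd"))
```

The gap between the two rates comes entirely from the 15 timeouts, so the summary depends on whether a WARNING is treated as a pass or a failure.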

FedCloud

Feedback from DMSU

Verify configuration records

On a yearly basis, the information registered in GOC-DB needs to be verified. NGIs and RCs have been asked to check their records. In particular:

  1. NGI managers should review the people registered and the roles assigned to them, and in particular check the following information:
    • E-Mail
    • ROD E-Mail
    • Security E-Mail
NGI managers should also review the status of the "not certified" RCs, according to the RC Status Workflow;
  2. RC administrators should review the people registered and the roles assigned to them, and in particular check the following information:
    • E-Mail
    • telephone numbers
    • CSIRT E-Mail
RC administrators should also review the information related to the registered service endpoints.

The process should be completed by June 22nd.

List of tickets.

Monthly Availability/Reliability

IPv6 readiness plans

ARC Middleware 5 end of support, migration to ARC 6

  • Status as of 2020-06-08:
    • Number of endpoints in BDII: 75
    • Number of GGUS tickets: 42
    • Issues: some ARC endpoints publish a timestamp instead of a version like 5.X.Y; we can fairly assume they are ARC6 nightly builds, but we will close the corresponding tickets only after explicit confirmation from the site admin.
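The triage rule described in the status above (a release-style version versus anything else, such as a timestamp) could be sketched as follows. The function name, the category labels, and the sample strings are hypothetical; in particular, the exact form of the published timestamps is not given here, so the sketch treats every string that is not X.Y.Z as needing confirmation from the site admin.

```python
import re

# A release-style version such as 5.X.Y or 6.X.Y.
RELEASE = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def classify_arc_version(published):
    """Return a triage label for a version string published by an ARC endpoint."""
    match = RELEASE.match(published.strip())
    if match is None:
        # Possibly an ARC6 nightly build, but only the site admin can confirm.
        return "ask-site-admin"
    major = int(match.group(1))
    return "arc6-or-later" if major >= 6 else "needs-migration"


if __name__ == "__main__":
    # Illustrative inputs only, not values taken from the BDII.
    for value in ("5.4.4", "6.6.0", "20200605231500"):
        print(value, "->", classify_arc_version(value))
```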

LCGDM end of support and migration to / enabling of DOME

  • Deployment statistics (Jun 5th):
$ ldapsearch -x -LLL -H ldap://egee-bdii.cnaf.infn.it:2170 -b "GLUE2GroupID=grid,o=glue" '(&(objectClass=GLUE2Manager)(GLUE2ManagerProductName=DPM))' GLUE2ManagerProductVersion GLUE2ManagerID | grep GLUE2ManagerProductVersion | sort | uniq -c
     1 GLUE2ManagerProductVersion: 1.10.0
    65 GLUE2ManagerProductVersion: 1.13.0
     2 GLUE2ManagerProductVersion: 1.13.1
    11 GLUE2ManagerProductVersion: 1.13.2
     3 GLUE2ManagerProductVersion: 1.8.10
     1 GLUE2ManagerProductVersion: 1.8.9
     4 GLUE2ManagerProductVersion: 1.9.0


Liaising with WLCG to follow up on the upgrade. GGUS tickets have been opened asking the following:

  • all sites running a DPM version older than 1.12 are advised to upgrade to the latest DPM version, following the DPM upgrade guide (chapter 1 Upgrade to DPM 1.10.0 "Legacy Flavour" and chapter 2 Upgrade to DPM 1.10.0 "Dome Flavour")
    • DOME and the old LCGDM (srm protocol) will coexist
  • Monitoring: sites should enable the monitoring of the HTTP/WebDav and/or GridFTP endpoints
    • register the storage service endpoint with the webdav and/or globus-GRIDFTP service type, with the production flag disabled, providing the URL field and the Extension Properties information respectively, as explained in HOWTO21
    • check if the tests are ok
    • switch the production flag to "yes"

List of tickets


Site Ticket Notes
AEGIS03-ELEF-LEDA https://ggus.eu/index.php?mode=ticket_info&ticket_id=143152 SE marked as not production due to some issues that need to be fixed
INDIACMS-TIFR https://ggus.eu/index.php?mode=ticket_info&ticket_id=142245 new dpm headnode installed with legacy mode, downtime for migration
OBSPM https://ggus.eu/index.php?mode=ticket_info&ticket_id=143169 they asked for quattor documentation...
TASK https://ggus.eu/index.php?mode=ticket_info&ticket_id=143174 problem with starting xrootd...
UA_ICYB_ARC https://ggus.eu/index.php?mode=ticket_info&ticket_id=143178
WCSS64 https://ggus.eu/index.php?mode=ticket_info&ticket_id=143182 they need some time to gain enough knowledge for doing the upgrade....
GR-07-UOI-HEPLAB https://ggus.eu/index.php?mode=ticket_info&ticket_id=143467 still on SLC6, some problems with the upgrade; the whole site will be migrated to CentOS 7 before the end of the year
Hephy-Vienna https://ggus.eu/index.php?mode=ticket_info&ticket_id=143277 DPM will be replaced with EOS during Q1 2020
HK-LCG2 https://ggus.eu/index.php?mode=ticket_info&ticket_id=143471 April 2020
ICM https://ggus.eu/index.php?mode=ticket_info&ticket_id=143091 dpm in the newest version. Now we are setting quota tokens...
IN2P3-IPNL https://ggus.eu/index.php?mode=ticket_info&ticket_id=143082 migration to EOS, dpm should be dismissed by mid 2020
IN2P3-IRES https://ggus.eu/index.php?mode=ticket_info&ticket_id=143070 testing the upgrade on the test infrastructure
NIKHEF-ELPROD https://ggus.eu/index.php?mode=ticket_info&ticket_id=143286 We won't upgrade. We plan to migrate to dCache before the end of 2019.
PSNC https://ggus.eu/index.php?mode=ticket_info&ticket_id=143474
ru-PNPI https://ggus.eu/index.php?mode=ticket_info&ticket_id=143281
UKI-SCOTGRID-DURHAM https://ggus.eu/index.php?mode=ticket_info&ticket_id=143465 migration to CentOS7, first....
UKI-SCOTGRID-ECDF https://ggus.eu/index.php?mode=ticket_info&ticket_id=143077 upgrading to CentOS7 first without DOME
UKI-SCOTGRID-GLASGOW https://ggus.eu/index.php?mode=ticket_info&ticket_id=143076 plan in progress
UKI-SOUTHGRID-BRIS-HEP https://ggus.eu/index.php?mode=ticket_info&ticket_id=143083


SECMON failures

Several CEs are failing the job submission tests, which prevents Pakiti from checking whether the vulnerability fixes have been applied on the WNs.

AOB

Next meeting

July 13th, 2020 https://indico.egi.eu/event/4901/