
Difference between revisions of "Agenda-2021-01-11"

*** '''HK-HKU-CC-01''': migrating DPM from SL6 to CentOS 7
*** '''TW-NCUHEP''': ARC-CE failures due to outdated CAs package, performance is now good
** '''CERN-PROD''': https://ggus.eu/index.php?mode=ticket_info&ticket_id=149351
*** WebDAV failures which required a fix in the EOS services https://its.cern.ch/jira/browse/EOS-4515 ; some instability with the site-bdii
** NGI_HR: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148518
*** '''egee.irb.hr''': in the process of a major upgrade from CentOS 6 to CentOS 7, some delays.
** NGI_IT: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148957
*** '''INFN-CATANIA''': SRM problems; the SRM service will be decommissioned
*** '''INFN-CATANIA-STACK''': recovered
*** '''INFN-PADOVA''': decommissioning process
** NGI_IT: https://ggus.eu/index.php?mode=ticket_info&ticket_id=149352
*** '''INFN-LECCE''': authz failures on SRM; CREAM-CE to decommission
*** '''TRIGRID-INFN-CATANIA''': CREAM-CE to decommission
** NGI_IT: https://ggus.eu/index.php?mode=ticket_info&ticket_id=149798
*** '''INFN-ROMA1-CMS''': intermittent failures on SRM service; some failures on ARC-CE servers
** NGI_UK:
*** '''UKI-SOUTHGRID-SUSX''': https://ggus.eu/index.php?mode=ticket_info&ticket_id=144720 migration from CREAM to ARC, WN migration to CentOS 7; SRM to be decommissioned; ARC-CE was failing the IGTF test, now solved; site-bdii failures; new failures on ARC-CE.
** NGI_UA: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148958
*** '''UA-NSCMBR''': IGTF outdated; new failures with ARC-CE and SRM/WebDAV
** ROC_LA: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148515
*** '''ATLAND''': downtime due to power cut and quarantine
** ROC_LA: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148956
*** '''CBPF''': SRM failures due to information not properly published. Physical access to facilities restricted due to COVID measures; planned a DPM update in December.
** ROC_LA: https://ggus.eu/index.php?mode=ticket_info&ticket_id=149355
*** '''SUPERCOMPUTO-UNAM''': scheduled a downtime for upgrading the site.
* Under-performing sites after 3 consecutive months, under-performing NGIs, QoS violations ('''December 2020'''):
** AsiaPacific: https://ggus.eu/index.php?mode=ticket_info&ticket_id=150109

Revision as of 13:28, 8 January 2021




General information

Middleware

UMD

  • UMD4 release in preparation
    • StoRM, VOMS, BDII update, dCache
    • VERY URGENT
  • feedback on software automation from the EGI Conference

Preview repository

  • released on 2020-10-09
    • Preview 1.29.0 AppDB info (SL6): ARC 6.8.0 and 6.8.1, BDII 5.5.26, CVMFS 2.7.4, dCache 5.2.31, DMLite/DPM 1.14.0, frontier-squid 4.13.1, glite-info-update-endpoints 3.0.2, lcg-info 1.12.5, StoRM 1.11.18
    • Preview 2.29.0 AppDB info (CentOS 7): ARC 6.8.0 and 6.8.1, BDII 5.5.26, CVMFS 2.7.4, dCache 5.2.31, DMLite/DPM 1.14.0, frontier-squid 4.13.1, glite-info-update-endpoints 3.0.2, lcg-info 1.12.5, StoRM 1.11.18
  • included in the upcoming release: DPM, VOMS

Operations

ARGO/SAM

  • HTCondor-CE probes
    • working on the probe for the host certificate validity check: GGUS 147386
    • integration with secmon and pakiti: GGUS 150006
  • CREAM-CE metrics removed from ARGO_MON, ARGO_MON_OPERATIONS and ARGO_MON_CRITICAL (GGUS 149778)
    • emi.cream.CREAMCE*
    • eu.egi.CREAM*

FedCloud

Feedback from DMSU

Monthly Availability/Reliability

  • sites suspended:
    • WCSS64 (NGI_PL)

IPv6 readiness plans

CREAM-CE Decommission

VOMS upgrade to CentOS 7

  • VOMS for CentOS 7 released Nov 23rd with UMD 4.12.13
    • VOMS Admin 3.8.0, VOMS Server 2.0.15
  • VOMS endpoints registered on GOCDB as production and monitored: 41
    • Provided by 33 sites
  • list of tickets opened: GGUS
  • the VOMS servers need to be published in the BDII in order to easily collect the deployed version

ARC Middleware 5 end of support, migration to ARC 6


  • Status

Date | Number of endpoints in BDII | Number of GGUS tickets | Issues
2020-06-08 | 75 | 42 | Some ARC endpoints publish a timestamp instead of a version like 5.X.Y; we can fairly assume they are ARC6 nightly builds, but we're going to close the corresponding tickets after explicit confirmation from the site admin.
2020-07-13 | 53 | 29 | -
2020-09-14 | 34 | 18 | -
2020-10-12 | 32 | 19 | -
2020-11-16 | 26 | 16 | -
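
The 2020-06-08 entry notes that some endpoints publish a timestamp rather than a version such as 5.X.Y. A minimal sketch of how published version strings might be sorted into follow-up buckets is given here. The function name, the bucket labels and the assumption that a timestamp appears as a digits-only string of 8 to 14 characters are all hypothetical; the actual format published in the BDII may differ.

```python
import re

# Dotted release number such as "5.4.4" or "6.8" (assumed format).
DOTTED_VERSION = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?$")
# Digits-only string of 8 to 14 characters, assumed to be a build timestamp.
TIMESTAMP_LIKE = re.compile(r"^\d{8,14}$")


def classify_arc_version(published: str) -> str:
    """Return a hypothetical follow-up bucket for a published ARC version string."""
    value = published.strip()
    match = DOTTED_VERSION.match(value)
    if match:
        major = int(match.group(1))
        # ARC 5 and older are out of support and need migration to ARC 6.
        return "migrate" if major < 6 else "ok"
    if TIMESTAMP_LIKE.match(value):
        # Probably a nightly build; the ticket is closed only after the
        # site admin confirms the deployed version.
        return "confirm"
    return "unknown"


if __name__ == "__main__":
    # Illustrative sample values, not taken from real endpoints.
    for sample in ["5.4.4", "6.8.1", "20200601123045", "nightly"]:
        print(sample, "->", classify_arc_version(sample))
```

Endpoints in the "confirm" bucket correspond to the tickets that stay open until the site admin replies.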

Storage accounting

Many sites have stopped publishing storage accounting records; 57 tickets were opened to address this.

AOB

Next meeting

In 2021