Difference between revisions of "Agenda-2020-10-12"
Latest revision as of 14:14, 12 October 2020
Back to https://wiki.egi.eu/wiki/Operations_Meeting
General information
Middleware
UMD
- plans on CentOS8: ONGOING (https://wiki.egi.eu/wiki/Next_middleware_release)
- UMD-4.12.0 regular release is almost ready (testing RC)
- CVMFS 2.7.3, ARCCE 6.7.0, gfal 2.18.1, davix 0.7.6, xrootd 4.12.3
- next releases: update for VOMS on C7, StoRM on C7, BDII C7/SL6
Preview repository
- released on 2020-10-09
- Preview 1.29.0 AppDB info (sl6): ARC 6.8.0 and 6.8.1, BDII 5.5.26, CVMFS 2.7.4, dCache 5.2.31, DMLite/DPM 1.14.0, frontier-squid 4.13.1, glite-info-update-endpoints 3.0.2, lcg-info 1.12.5, STORM 1.11.18
- Preview 2.29.0 AppDB info (CentOS 7): ARC 6.8.0 and 6.8.1, BDII 5.5.26, CVMFS 2.7.4, dCache 5.2.31, DMLite/DPM 1.14.0, frontier-squid 4.13.1, glite-info-update-endpoints 3.0.2, lcg-info 1.12.5, STORM 1.11.18
Operations
ARGO/SAM
- HTCondor-CE probes included in the ARGO_MON_OPERATORS profile on May 13th: https://ggus.eu/index.php?mode=ticket_info&ticket_id=146949
- (14th Sept) 70 endpoints, 14 CRITICAL, success rate is about 80%
- Oct 1st: included in the ARGO_MON_CRITICAL profile (A/R computation): https://poem.egi.eu/ui/public_metricprofiles/ARGO_MON_CRITICAL
- (Oct 12th) 71 endpoints, success rate (including WARNING) 85.9%
- working on the probe for the host certificate validity check: GGUS 147386 (https://ggus.eu/index.php?mode=ticket_info&ticket_id=147386)
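The success rates quoted for the HTCondor-CE probes are consistent with a simple ratio of non-failing endpoints to total endpoints. A minimal sketch, assuming that is how the figures are derived; the function name is hypothetical, and the count of 10 failing endpoints on Oct 12th is inferred from 85.9% of 71 rather than stated in the agenda:

```python
def success_rate(total_endpoints: int, failing_endpoints: int) -> float:
    """Return the percentage of endpoints that are not in a failing state."""
    if total_endpoints <= 0:
        raise ValueError("total_endpoints must be positive")
    return 100.0 * (total_endpoints - failing_endpoints) / total_endpoints


# 14th Sept: 70 endpoints, 14 CRITICAL (figures from the agenda)
print(f"{success_rate(70, 14):.1f}%")  # 80.0%

# Oct 12th: 71 endpoints; 10 failing is an assumption that reproduces 85.9%
print(f"{success_rate(71, 10):.1f}%")  # 85.9%
```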
FedCloud
Feedback from DMSU
Monthly Availability/Reliability
- Under-performed sites in the past A/R reports with issues not yet fixed:
  - AsiaPacific: https://ggus.eu/index.php?mode=ticket_info&ticket_id=147748
    - HK-HKU-CC-01: migrating DPM from sl6 to CentOS7
    - TW-NCUHEP: ARC-CE failures due to outdated CAs package
  - NGI_DE: https://ggus.eu/index.php?mode=ticket_info&ticket_id=146871
    - GoeGRID: CREAM-CE intermittent failures not affecting ATLAS; failures with ARC-CE, now passing the tests
  - NGI_DE: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148519
    - LRZ-LMU: CE had problems due to the decommissioning of SharedFS; the other CE returns UNKNOWN in the IGTF test.
  - NGI_HR: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148518
    - egee.irb.hr: in the process of a major upgrade from CentOS 6 to CentOS 7, some delays.
  - NGI_PL: https://ggus.eu/index.php?mode=ticket_info&ticket_id=147311
    - WCSS64
  - NGI_UK:
    - UKI-NORTHGRID-SHEF-HEP: https://ggus.eu/index.php?mode=ticket_info&ticket_id=146455 ARC-CE re-installed, some condor problems to fix
    - UKI-SOUTHGRID-SUSX: https://ggus.eu/index.php?mode=ticket_info&ticket_id=144720 Migration from CREAM to ARC, WN migration to CentOS7; SRM to be decommissioned; ARC-CE was failing the IGTF test, then solved; site-bdii failures.
  - ROC_LA: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148515
    - ATLAND: downtime due to power cut and quarantine
- Under-performed sites after 3 consecutive months, under-performed NGIs, QoS violations (September 2020):
  - NGI_IT: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148957
    - INFN-CATANIA
    - INFN-CATANIA-STACK
    - INFN-PADOVA
  - NGI_UA: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148958
    - UA-NSCMBR: IGTF outdated
  - ROC_LA: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148956
    - CBPF
- sites suspended:
IPv6 readiness plans
- please provide updates to the IPv6 assessment (ongoing) https://wiki.egi.eu/w/index.php?title=IPV6_Assessment
- any relevant information will be summarised at the OMB
CREAM-CE Decommission
- End of Security Updates and Support: 31st Dec 2020 (Decommissioning deadline)
- Original broadcast: https://operations-portal.egi.eu/broadcast/archive/2293
- PROC16 Decommission of unsupported software
- Decommissioning start date: Oct 1st 2020
- a probe detecting CREAM-CE endpoints will be run, returning WARNING status
- GGUS ticket: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148715
- eu.egi.sec.CREAMCE
- Nov 1st: probe returns CRITICAL status, alarms created on the ROD dashboard, ROD teams start to create tickets
- 1st Jan 2021: EGI Ops will start chasing the sites still providing CREAM-CE endpoints
- By this time service end-points which couldn't be upgraded should be put into downtime by site admin or ROD:
ARC Middleware 5 end of support, migration to ARC 6
- EGI Operations Broadcast
- PROC16 Decommission of unsupported software
- deadline: end of July
- Catalin is in contact with the ARC team to arrange a webinar on ARC administration, scheduled (to be confirmed) for July 6th; please contact operations@ for information
- Status
Date | Number of endpoints in BDII | Number of GGUS tickets | Issues |
---|---|---|---|
2020-06-08 | 75 | 42 | Some ARC endpoints publish a timestamp instead of a version like 5.X.Y; we can fairly assume they are ARC6 nightly builds, but we're going to close the corresponding tickets after explicit confirmation from the site admin. |
2020-07-13 | 53 | 29 | - |
2020-09-14 | 34 | 18 | - |
2020-10-12 | 32 | 19 | - |
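The status table can be read as a trend in the number of ARC 5 endpoints still published in the BDII and in the number of open GGUS tickets. A minimal sketch that computes the change between consecutive meetings from the figures in the table; the variable and function names are hypothetical:

```python
# (date, endpoints in BDII, GGUS tickets), copied from the status table
status = [
    ("2020-06-08", 75, 42),
    ("2020-07-13", 53, 29),
    ("2020-09-14", 34, 18),
    ("2020-10-12", 32, 19),
]


def deltas(rows):
    """Return (date, endpoint change, ticket change) relative to the previous row."""
    return [
        (cur[0], cur[1] - prev[1], cur[2] - prev[2])
        for prev, cur in zip(rows, rows[1:])
    ]


for day, d_endpoints, d_tickets in deltas(status):
    print(f"{day}: endpoints {d_endpoints:+d}, tickets {d_tickets:+d}")
```

On these figures the decline slows sharply in the last interval (2 fewer endpoints, 1 more ticket), which is the kind of change worth flagging at the meeting.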
Storage accounting
Many sites stopped the publication of storage accounting records. Opened 57 tickets to fix that.
- 12 tickets not solved yet
- page for checking when the records were published: http://goc-accounting.grid-support.ac.uk/storagetest/storagesitesystems.html
- Accounting Portal Prototype view: http://accounting-devel.egi.eu/storage.php
AOB
Next meeting
Nov 16th, 2020 https://indico.egi.eu/event/5100/