Difference between revisions of "Agenda-08-02-2016"
Revision as of 14:41, 8 February 2016
General information
- the Operations meeting will be on the 2nd Monday of the month
- the EGI Operations Meeting schedule for the first half of 2016 is available on Indico: https://indico.egi.eu/indico/categoryDisplay.py?categId=32 and on the new summary page: https://wiki.egi.eu/wiki/Operations_Meeting
News from URT
UMD release
- Preparation of the UMD-4 SL6 release
Staged rollout updates
- dcache 2.13.17
- voms-admin 3.4.0 (soon)
- storm 1.11.10 (soon)
Next releases
Operational issues
Aligning FedCloud sites to the A/R procedures
- EGI Operations proposes aligning FedCloud sites with the A/R-related procedures used for the grid sites
  - based on the availability and reliability of the services monitored in cloudmon, EGI Operations will start following up with underperforming sites, as is already done for all grid sites
  - sites will NOT be suspended for A/R performance until at least the end of May
- in parallel, EGI Operations will start PROC08 (https://wiki.egi.eu/wiki/PROC08) to include cloud probes in the EGI_CRITICAL and EGI profiles used for A/R computations (IN PROGRESS)
The proposed timeline is:
- February 2016:
  - EGI Operations will check the status of the production cloud services to understand which issues (if any) each site has and to provide help to NGIs and sites;
  - Start of the integration of cloud probes into the EGI_CRITICAL profile (current set + OpenStack): to be agreed with the ARGO team; PROC08 will be followed
- June 2016:
  - Start of notifications to sites eligible for suspension
FedCloud status
Decommissioning SL5
Decommissioning dCache 2.6
- almost done, last server is se0002.m45.ihep.su @ RU-Protvino-IHEP https://ggus.eu/?mode=ticket_info&ticket_id=118256 (IN PROGRESS)
AOB
Monthly Availability/Reliability
- The report for the last three months is available in ARGO
- Problems follow-up:
  - AfricaArabia: ticket
    - Overall A/R: 12.67/12.67
    - RCs eligible for suspension: EG-ZC-T3, ZA-CHPC, ZA-UJ
  - CERN: ticket
    - Overall A/R: 33.22/33.22
    - there were problems on the regional SAM instances, solved in January
  - NGI_ARMGRID:
    - Overall A/R: 77.43/77.43
  - NGI_DE: ticket
    - the underperforming RCs (SCAI, UNI-DORTMUND) are recovering from the issues
  - NGI_GRNET:
    - RC eligible for suspension: GR-04-FORTH-ICS
  - NGI_IT: ticket
    - the underperforming RC INFN-NAPOLI-PAMELA seems to be recovering; waiting for confirmation
  - NGI_MARGI: ticket
    - no monitoring data available since January
    - RC eligible for suspension: MK-03-FINKI
  - NGI_MD:
    - Overall A/R: 61.89/61.89
    - the underperforming RC MD-02-IMI is recovering
  - ROC_LA:
    - no monitoring data available for CBPF
    - RC eligible for suspension: UFAL
- AfricaArabia: ticket