Difference between revisions of "Agenda-14-03-2016"
Revision as of 14:46, 7 March 2016
General information
- the Operations meeting will be on the 2nd Monday of the month
- the EGI Operations Meeting schedule for the first half of 2016 is available on Indico: https://indico.egi.eu/indico/categoryDisplay.py?categId=32 and on the new summary page: https://wiki.egi.eu/wiki/Operations_Meeting
News from URT
- A critical bug that causes file loss has been discovered in the new drain command of the DPM dmlite-shell, released in DPM 1.8.10. One production site has been affected: https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Dev/Dmlite/Shell#Newfunctionality:Drain
- Preparation of UMD-4 (SL6/CentOS7) is ongoing
- Preparation of the UMD for Cloud
Staged rollout updates
Next releases
Operational issues
Aligning Fedcloud sites to the A/R procedures
- EGI Operations proposal to align Fedcloud sites to the A/R-related procedures used for grid sites
- based on the availability and reliability of the services monitored in cloudmon, EGI Operations will start following up with underperforming sites, as is already done for all grid sites
- sites will NOT be suspended for A/R performance until at least the end of May
- in parallel, EGI Operations will start PROC08 to include cloud probes in the EGI_CRITICAL and EGI profiles used for A/R computations (IN PROGRESS)
The proposed timeline is:
- February 2016:
  - EGI Operations will check the status of the production cloud services to understand which issues (if any) each site has and to provide help to NGIs and sites;
  - Start of the integration of cloud probes into the EGI_CRITICAL profile (current set + OpenStack): to be agreed with the ARGO team; PROC08 will be followed
- June 2016:
  - Start of notification of sites eligible for suspension
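The follow-up step described above (identifying underperforming sites from their monthly A/R figures) can be sketched roughly as follows. This is only an illustration: the threshold values, the site names, and the figures are placeholders assumed for the example, and the agenda itself does not state which thresholds apply.

```python
from dataclasses import dataclass

# Placeholder thresholds in percent, assumed for illustration only;
# the agenda does not state the values used by EGI Operations.
MIN_AVAILABILITY = 80.0
MIN_RELIABILITY = 85.0


@dataclass
class MonthlyAR:
    site: str
    availability: float  # percent over the month
    reliability: float   # percent over the month


def underperforming(reports, min_av=MIN_AVAILABILITY, min_rel=MIN_RELIABILITY):
    """Return the names of sites below either threshold, in input order."""
    return [
        r.site
        for r in reports
        if r.availability < min_av or r.reliability < min_rel
    ]


# Hypothetical monthly figures, not real monitoring data.
reports = [
    MonthlyAR("SITE-A", 99.2, 99.5),
    MonthlyAR("SITE-B", 72.0, 90.0),
    MonthlyAR("SITE-C", 85.0, 84.0),
]
flagged = underperforming(reports)
print(flagged)
```

A site is flagged as soon as one of the two figures is below its threshold, which matches the idea of following up with a site before any suspension is considered.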
FedCloud status
Issues at cloud sites
Grouped by NGI, please follow up with sites.
- NGI_UK
  - 100IT (OpenStack)
    - vmcatcher issues https://ggus.eu/index.php?mode=ticket_info&ticket_id=116358#update#19
    - BDII and GOCDB have different endpoint URLs https://ggus.eu/index.php?mode=ticket_info&ticket_id=119002#update#5
- NGI_PL
  - CYFRONET-CLOUD (OpenStack)
- NGI_DE
  - GoeGrid (OpenNebula)
- NGI_GRNET
  - HG-09-Okeanos-Cloud (Synnefo)
    - vmcatcher: issue with large metadata, on hold (requires some development) https://ggus.eu/index.php?mode=ticket_info&ticket_id=116368
- NGI_IBERGRID
  - IFCA-LCG2 (OpenStack)
    - OCCI: the endpoint published on sBDII is missing "/occi1.1/" https://ggus.eu/index.php?mode=ticket_info&ticket_id=119004
- NGI_TR
  - TR-FC1-ULAKBIM (OpenStack)
    - Missing GLUE2DomainID and image description looks wrong https://ggus.eu/index.php?mode=ticket_info&ticket_id=119005#update#15
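Two of the issues above are mismatches between the endpoint URL registered in GOCDB and the one published in the BDII. A minimal sketch of such a consistency check is shown below. It assumes the two sets of URLs have already been exported into plain dictionaries keyed by site name; it does not query GOCDB or the BDII, and all site names and URLs are hypothetical.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to (scheme, host:port, path) for comparison.

    Scheme and host are lower-cased and a trailing slash on the path
    is ignored, so purely cosmetic differences are not reported.
    """
    parts = urlsplit(url.strip())
    return (parts.scheme.lower(), parts.netloc.lower(), parts.path.rstrip("/"))


def endpoint_mismatches(gocdb, bdii):
    """Return {site: (gocdb_url, bdii_url)} where the two sources disagree.

    A site present in only one source is also reported, with None
    standing in for the missing URL.
    """
    problems = {}
    for site in sorted(set(gocdb) | set(bdii)):
        registered, published = gocdb.get(site), bdii.get(site)
        if (
            registered is None
            or published is None
            or normalize(registered) != normalize(published)
        ):
            problems[site] = (registered, published)
    return problems


# Hypothetical data: SITE-A publishes its endpoint without the "/occi1.1" path.
gocdb = {
    "SITE-A": "https://cloud.example.org:8787/occi1.1/",
    "SITE-B": "https://occi.example.net:8787/occi1.1",
}
bdii = {
    "SITE-A": "https://cloud.example.org:8787",
    "SITE-B": "https://OCCI.example.net:8787/occi1.1/",
}
mismatches = endpoint_mismatches(gocdb, bdii)
print(mismatches)
```

Ignoring letter case in the host and a trailing slash is a design choice of this sketch; a stricter check would compare the strings verbatim.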
Getting help on issues
- vmcatcher issues
  - This page shows a small number at the bottom right of each site, giving the number of images available at that site. If the number is missing, the site very likely has issues with vmcatcher.
  - ACTION: Please check this documentation: https://wiki.egi.eu/wiki/MAN10#EGI_Image_Management_2 and https://github.com/hepix-virtualisation/vmcatcher. If you cannot resolve the problem, please contact EGI Operations through the ticket; we will forward it to the vmcatcher developers.
Updating Federated_Cloud_Operation wiki
- Please review your site's information on the Federated_Cloud_Operation wiki. The following sites are asked to reply as soon as possible:
  - GoeGrid https://ggus.eu/?mode=ticket_info&ticket_id=118882
  - MK-04-FINKICLOUD https://ggus.eu/?mode=ticket_info&ticket_id=118890
  - CYFRONET-CLOUD https://ggus.eu/?mode=ticket_info&ticket_id=118878
Decommissioning Debian
- End of support for Debian squeeze (6.0) has been reached (Feb 2016) https://www.debian.org/News/2016/20160212
Decommissioning SL5
- Tracked on SL5_retirement wiki
- No checks for dCache, DPM, ARC, UNICORE --> Action on NGIs/ROCs to follow up directly with sites
Decommissioning dCache 2.6
- DONE.