EGI-InSPIRE:Sa1 2012-11-14
SA1 weekly report
Progress of SA1 issues
Grid software maintenance and support. The Operations Management Board assessed the risks and the affected operations assets. The main risks identified are: the availability of specialized support, the commitment to timely delivery of fixes for high or critical vulnerabilities affecting the production infrastructure, and the end of support for the LFC catalogue. The UMD software provisioning processes will be revised in preparation for the decommissioning of the EMI repository. The involvement of EGI.eu in some of the global tasks delivered by EMI and IGE, such as technical and software release coordination, is being discussed with the Technology Providers.
Milestones/Deliverables
- D4.7 Operations Sustainability. Operations service portfolio completed according to FedSM guidelines. Other sections are being written.
SA1.1 Activity Management
- Meeting with JRA1 for the VO A/R statistics calculation
- Meeting with EGI CSIRT to assess the status of MW monitoring and unresponsive sites
- Federated Clouds task force meeting
- Supervision of the middleware upgrade process
- Follow up of WN vulnerability
- Preparation of a presentation for the WLCG GDB about the MW upgrade campaign
- Input to QR10
SA1.2 Security
- ongoing work on middleware migration campaign
- agreed escalation procedure for non-responsive sites
- all reachable WMS systems now fixed for vulnerabilities 4039 and 4073
- chasing the few remaining WMS systems published in BDII but not reachable
- planning for EMI-1 end of life in the first quarter of 2013
- several new issues being handled in SVG
SA1.3 Staged rollout
- Work towards UMD 2.3.0
- Staged rollout of SAM/Nagios 19
SA1.3 Integration
- preparation of Globus Task Force meeting
- Mapper: ongoing discussion on ticket workflow
SA1.4 Central tools
- GOCDB:
  - Downtime of GOCDB due to a power cut at RAL on 07/11/2012.
  - New service types approved (WLCG request): net.perfSONAR.bandwidth and net.perfSONAR.latency. More requests are being handled to support federated cloud activities: https://rt.egi.eu/rt/Ticket/Display.html?id=4625
- SAM:
  - new package made available, which will be included in Update-19 (which has just entered SR). This package contains a binary that works properly on all 64-bit WNs (SL5 & SL6) - http://www.sysadmin.hep.ac.uk/rpms/egee-SA1/centos5/x86_64/grid-monitoring-probes-org.sam-0.5.7-1.el5.noarch.rpm
  - start of staged rollout of SAM Update 19
  - SAM instance for monitoring operational tools (https://ops-monitor.cern.ch/nagios/) integrated with ACE (profile ch.cern.sam-OPS_MONITOR).
SA1.5 Accounting
Mainly reactive support work this week, plus handling the fallout from a site-wide power failure at RAL.
SA1.6 Helpdesk
- Preparing the presentation at the GDB meeting at CERN
- Working on the new features for the next release on 2012-11-28
SA1.7 Support
Software Support
GGUS #87929 may have a broader impact and will be reported to Operations.
Currently there are no open high-priority tickets with swsupport and TPs.
DMSU tickets flow Nov 4--10 | |
---|---|
assigned | 21 |
back to tpm | 1 |
reassigned to 3rd level | 17 |
solved | 3 |
open DMSU tickets status | |
---|---|
assigned | 0 |
in progress | 5 |
waiting for reply | 4 |
on hold | 2 |
Network Support
SA1.8 Availability and core services
Catch All Core Services + A/R Report
- Migrated VOMRS data for the dteam VO to the new (UMD-2-based) VOMS service endpoint
- migration of production service endpoint is scheduled for this week
- Handled 4 A/R recomputation requests for October 2012
- Published final A/R reports for October 2012
- Operation of dteam VO service
- Removed 67 entries with expired certificates (certificates signed by expired CAs)
Documentation
- "EGI wiki guide" introduced to the community
- ongoing work on:
  - EGI service portfolio
  - EGI OLA
  - Service type decommission procedure