EGI-InSPIRE:Sa1 2012-12-18

From EGIWiki
Revision as of 17:19, 6 January 2015 by Krakow (talk | contribs)






Progress of SA1 issues

Nothing new to report.

Milestones/Deliverables

  • D4.7 Operations Sustainability: started internal review

SA1.1 Activity Management

MEETINGS

  • TF-NOC meeting and contacts established with GEANT operations and EduPERT
  • OMB meeting: chairing and preparation of material
  • JRA1 meeting and requirement for change of availability computation
  • PC meeting
  • EGI-CSIRT meeting on operational implications of having central banning (according to the proposed change in the policy for service operations)
  • GDB meeting: update on status and progress of obsolete middleware decommissioning
  • Monday weekly meeting on coordination of support and tools for the mw upgrade campaign
  • EGI Champions meeting
  • contribution to TCB meeting (Tf-accounting task force, status of adoption of SSM, adoption of gridftp, advancement of actions on BDII)

ACTIVITIES

  • finalization of first D4.7 draft
  • handling of EGI.eu domain nameserver incident (15-12-2012)
  • handling of incident concerning GGUS (17-12-2012)
  • planning of activities for testbed on central allocation of resources
  • preparation activities for NGI and Global operations task sustainability, for the Evolving EGI workshop
  • assessment of IGE release support plans
  • preparation of document for TCB about classification of products (integrated, contributed, community) and changes around the software provisioning activities of EGI
  • planning of ARC CE decommissioning campaign
  • coordination of issues around DPM version monitoring for mw decommissioning
  • assessment of status of VOMS upgrade, follow-up of sites not responding to upgrade requests or failing to put service end points in downtime, and definition of the list of sites eligible for suspension this week
  • preparation work for the extension of the NGI availability monthly reports
  • Assessment of QCG information system use cases

SA1.2 Security

  • Monthly team meeting was held on Thursday 13th Dec.
  • started defining SA1.2 detailed workplan for 2013
  • Planning for best-efforts CSIRT cover during the Christmas/New Year holidays
  • Participation in WLCG security meeting at FNAL (17/18 Dec)
  • Continued work on Operations/Infrastructure (including security) track at ISGC 2013
  • An SVG advisory for a 'Low' risk issue will be released soon, as the issue is now fixed
  • Work on a procedure for handling compromised certificates
  • Organised and held meeting on central user banning (13th Dec) for presentation at OMB on 18th Dec

SA1.3 Staged rollout

  • Final release candidate of UMD 2.3.1, containing:
    • ige gridftp 5.2.2
    • dpm and lfc 1.8.5
    • dcache 2.2.5 (contains only the dcap library)
    • gridsite 1.7.24
  • Preparing the UMD 2.4.0 release, taking into account what is left from IGE 3.0 and the several EMI updates, as well as what should come out in the next emi2 updates of December 2012 and January 2013. Components already in Staged Rollout:
    • IGE.gridway.sl5.x86_64-5.12.0
    • EMI.wms.sl6.x86_64-3.4.0

SA1.3 Integration

no progress

SA1.4 Central tools

  • On Saturday from 04:00 to 10:30 (CET) the *egi.eu domain was unreachable. Among other issues, this caused a GOCDB outage. Based on the information available to me, it should not have directly caused problems for service monitoring.
  • Middleware monitoring
  • Presentation of operational tools and middleware monitoring instances at the OMB.
  • New InterNGI usage functionality released on Accounting Portal (https://operations-portal.egi.eu/broadcast/archive/id/840)
  • Central operational tools outages
    • the *egi.eu domain was unreachable on Saturday 15th from 04:00 to 10:30 (CET)
    • GGUS was unreachable on Monday 17th from 10:40 to 17:20 (CET) due to a network outage. The network failure on Monday, 17.12.2012, was caused by two independent faults that occurred almost simultaneously in the KIT network. Their interaction gave a very unclear picture of the real causes. On North Campus, there was a hardware failure in one of the core backbone routers; the redundant hardware part rebooted completely unexpectedly, without any triggering event or configuration change. On South Campus, a separate fault originated in the network of one building. Because of the large impact, localizing the cause was very difficult. The network components at fault in the wiring closets were replaced in the afternoon of Dec 17th.

    • GOCDB and APEL were unreachable on Tuesday 18th from 07:50 to 11:00 (CET) due to a network outage

SA1.5 Accounting

  • Stop of republishing of user DN for historical information, with instructions given to site administrators
  • Request to move Nikhef to SSM production deployment

SA1.6 Helpdesk

  • Shopping list meeting to prioritise requests for GGUS
  • Implementation and maintenance work on GGUS including report generator
  • Migration of GGUS mail boxes to new infrastructure, see https://rt.egi.eu/rt/Ticket/Display.html?id=4700
  • GGUS release

SA1.7 Support

  • preparation of proposal for revision of GOCDB business logic
  • preparation of a working group on the revision of Nagios probes released by EMI

Software Support

  • no report received

Network Support

  • no report received

SA1.8 Availability and core services

  • A/R Recomputation requests handling
    • GGUS 89418: informing SAM Nagios about suspended sites in the reports
    • Received final A/R reports from the SAM Nagios SU. Communication with them regarding the removal of the test profile name from the title.
    • Communication with the SAM Nagios SU regarding some issues in the reports.
  • Issues with VOMS registration procedure (regarding Dteam VO migration) sent to VOMS development team
  • Setup of EGI Catch All CA Registration Authority in Nigeria
  • Changeover of EGI Catch All CA Registration Authority in Tirana, Albania
  • Issue in certification infrastructure resolved

Documentation

Meetings

  • Evolving EGI workshop