EGI DMSU Ticket Followup

From EGIWiki
Revision as of 15:50, 24 October 2017 by Apaolini

Besides handling incoming tickets, software support also performs elementary followup of the tickets assigned to the 3rd line. Because the available effort is limited, this work is restricted to checking the high-priority categories and to basic aggregate checks on the others.

Tickets are handled differently depending on their priority:

Top Priority and Very Urgent

Experience shows that these priority levels are very rare: top priority occurs only once every several months, and there are at most a few dozen very urgent tickets per year. It is therefore feasible to give these tickets special care, including manual one-by-one followup.

When such a ticket is assigned to the 3rd line, an ETA is assigned, and the Technology Provider (TP) is obliged, within the reaction time given by the SLA, to confirm or re-negotiate the ETA. Assigning an ETA is essential for EGI Operations to plan accordingly (e.g. to decide whether to deploy emergency workarounds).

The status of these tickets is checked weekly by 2nd line support, and summary reports are provided to EGI Operations.

When the ETA is reached, EGI support checks whether the fix was delivered. If not, the TP is requested to provide a new estimate and an appropriate justification. If there are doubts, the ticket can be escalated to TCB.
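The ETA check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `HighPriorityTicket` record, its field names, and the action labels returned by `followup_action` are hypothetical and do not correspond to any actual EGI tool. Escalation to TCB depends on human judgement and is deliberately not modelled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class HighPriorityTicket:
    """Hypothetical record for a top priority or very urgent ticket."""
    ticket_id: int
    eta: date                    # ETA currently agreed with the TP
    fix_delivered: bool = False  # set once the TP has delivered the fix


def followup_action(ticket: HighPriorityTicket, today: date) -> str:
    """Return the followup step for one ticket (labels are assumptions)."""
    if ticket.fix_delivered:
        return "fix delivered"
    if today < ticket.eta:
        # ETA not reached yet: the ticket only appears in the weekly summary.
        return "report status"
    # ETA reached without a fix: ask the TP for a new estimate.
    return "request new ETA and justification"
```

Running such a check once a week over all open high-priority tickets would yield both the summary report and the list of tickets whose ETA has to be re-negotiated.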

Urgent and Less Urgent

In contrast, DMSU typically handles about 200 tickets of these priorities every quarter. At this volume, centralized one-by-one followup is not practical.

It is agreed that solving all submitted tickets may be beyond the capabilities of the TP. Therefore the Fedora approach is taken: low-priority tickets are closed after a timeout, regardless of whether a fix is available. This is a tradeoff that avoids an ever-increasing backlog of tickets. If the reported problems persist and users are still affected, they are expected to submit new tickets.

More specifically, the following is expected from TP:

  1. When a fix is available in a revision or minor release, the ticket is closed as solved.
  2. Before a release, the TP is expected to run a pre-release campaign on all open tickets.
  3. Issues that can be solved with feasible effort are fixed in this campaign and the fixes are scheduled for the upcoming major release.
  4. Tickets older than 6 months that will not be solved in the upcoming release are closed as unsolved.

At each UMD release (typically every 6 weeks), DMSU checks the open low-priority tickets:

  1. All low-priority tickets older than 6 months (counted from the date of assignment from DMSU to TP) for which the affected component has been released in the meantime are identified.
  2. A summary report is generated and sent to the TPs so that forgotten tickets can be closed as solved first.
  3. After a grace period of 2 weeks, all tickets remaining on the list are closed as unsolved.
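The selection performed by DMSU in the steps above can be sketched as follows. This is a minimal illustration, not the tooling actually used: the `Ticket` record, the `stale_tickets` and `closing_date` helpers, the lowercase priority labels, and the approximation of 6 months as 183 days are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed labels for the low-priority categories; a real tracker may differ.
LOW_PRIORITIES = {"urgent", "less urgent"}
MAX_AGE = timedelta(days=183)      # "6 months", approximated in days (assumption)
GRACE_PERIOD = timedelta(days=14)  # 2-week grace period


@dataclass
class Ticket:
    """Hypothetical ticket record; field names are illustrative only."""
    ticket_id: int
    priority: str
    component: str
    assigned_to_tp: date  # date of assignment from DMSU to the TP


def stale_tickets(tickets, component_releases, today):
    """Select low-priority tickets older than MAX_AGE whose affected
    component was released after the ticket was assigned to the TP.

    component_releases maps a component name to a list of release dates.
    """
    selected = []
    for t in tickets:
        if t.priority not in LOW_PRIORITIES:
            continue
        if today - t.assigned_to_tp <= MAX_AGE:
            continue
        releases = component_releases.get(t.component, [])
        if any(t.assigned_to_tp < r <= today for r in releases):
            selected.append(t)
    return selected


def closing_date(report_sent: date) -> date:
    """Date on which tickets still on the list are closed as unsolved."""
    return report_sent + GRACE_PERIOD
```

The list returned by `stale_tickets` corresponds to the summary report sent to the TPs; whatever is still open on `closing_date` would then be closed as unsolved.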

Altogether, the process guarantees that:

  • in the worst case, no issues older than 6 months are left open after a major release
  • if minor and revision releases occur more frequently, the number of old open issues is reduced gradually, even between major releases

The required technical support for this process is tracked in RT#3512 and RT#3518.