EGI Notebooks Availability and Continuity Plan
This page reports on the Availability and Continuity Plan for EGI Notebooks. It is the result of the risk assessment conducted for this service: a series of risks and threats has been identified and analysed, along with the corresponding countermeasures currently in place. Whenever a countermeasure is not considered satisfactory for avoiding a risk, or for reducing its likelihood or impact, a new treatment for improving the availability and continuity of the service is agreed with the service provider. The process concludes with an availability and continuity test.
|Av/Co plan and test||X||X|
The following performance targets, measured on a monthly basis, were agreed in the OLA:
- Availability: 90%
- Reliability: 90%
So far, the performance figures have not highlighted any particular Av/Co issues for Notebooks that need further investigation.
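As an illustration of how the monthly targets above could be checked, here is a minimal Python sketch. The formulas are an assumption, not taken from this page: availability is taken as the up time divided by the known time, and reliability as the up time divided by the known time excluding scheduled downtime. The definitions that actually apply are those in the OLA. The `MonthlyStatus` record and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MonthlyStatus:
    """Hours spent in each state during one month (hypothetical record)."""
    up: float
    unscheduled_down: float
    scheduled_down: float = 0.0


def availability(m: MonthlyStatus) -> float:
    # Assumed definition: up time over all time with a known status.
    known = m.up + m.unscheduled_down + m.scheduled_down
    return m.up / known if known else 0.0


def reliability(m: MonthlyStatus) -> float:
    # Assumed definition: as availability, but scheduled downtime is not counted.
    counted = m.up + m.unscheduled_down
    return m.up / counted if counted else 0.0


def meets_targets(m: MonthlyStatus,
                  availability_target: float = 0.90,
                  reliability_target: float = 0.90) -> bool:
    # Targets default to the 90% monthly values stated in the OLA.
    return (availability(m) >= availability_target
            and reliability(m) >= reliability_target)


def downtime_budget_hours(hours_in_month: float, target: float = 0.90) -> float:
    # Maximum downtime in the month that still meets the target.
    return hours_in_month * (1.0 - target)
```

Under these assumptions, a 90% target in a 30-day month (720 hours) leaves a budget of roughly 72 hours of downtime, which `downtime_budget_hours(720)` reproduces up to floating-point rounding.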
Risk assessment and management
For more details, please see the Google spreadsheet. A summary of the assessment is reported here.
|Risk id||Risk description||Affected components||Established measures||Risk level||Expected duration of downtime / time for recovery||Comment|
|1||Service unavailable / loss of data due to hardware failure||All the service components||XXX||Medium|| || |
|2||Service unavailable / loss of data due to software failure||All the service components|| ||Low|| || |
|3||Service unavailable / loss of data due to human error||All the service components|| ||Low|| || |
|4||Service unavailable due to network failure (network outage with causes external to the site)||All the service components|| ||Low|| || |
|5||Unavailability of key technical and support staff (holiday periods, sickness, ...)|| || ||Low|| || |
|6||Major disruption in the data centre, for example fire, flood or electrical failure||All the service components|| ||Low|| || |
|7||Major security incident. The system is compromised by external attackers and needs to be reinstalled and restored.||All the service components|| ||Low|| ||The measures already in place are considered satisfactory and the risk level is acceptable|
|8||(D)DoS attack. The service is unavailable because of a coordinated DDoS.||All the service components|| ||Low|| ||The measures already in place are considered satisfactory and the risk level is acceptable|
The level of all the identified risks is acceptable, and the countermeasures already adopted are considered satisfactory.
Availability and Continuity test