
Difference between revisions of "SEC01"

(49 intermediate revisions by 3 users not shown)
Line 1: Line 1:
{{Template:Op menubar}}
{{Template:Op menubar}}
{{Template:Doc_menubar}}
{{Template:Doc_menubar}}
{{Template:Under_construction}}
{{TOC_right}}  
{{TOC_right}}  
[[Category:Operations Procedures]]
[[Category:Operations Procedures]]


{{Ops_procedures
{{Ops_procedures
|Doc_title = EGI CSIRTSecurity Incident Handling Procedure  
|Doc_title = EGI CSIRT Security Incident Handling Procedure  
|Doc_link = https://documents.egi.eu/public/ShowDocument?docid=710
|Doc_link = https://wiki.egi.eu/wiki/SEC01
|Version = V5
|Version = v5, 25.02.2016
|Policy_acronym = EGI-CSIRT
|Policy_acronym = EGI-CSIRT
|Policy_name = EGI-CSIRT
|Policy_name = EGI-CSIRT
|Contact_group = irtf@mailman.egi.eu
|Contact_group = irtf@mailman.egi.eu
|Doc_status = DRAFT
|Doc_status = Approved
|Approval_date =  
|Approval_date = 31.03.2016
|Procedure_statement =  
|Procedure_statement = This procedure is aimed at minimising the impact of security incidents by encouraging post-mortem analysis and promoting cooperation between Resource Centers.
}}
}}


= Overview =
= Overview =


This procedure is aimed at minimising the impact of security incidents by encouraging post-mortem analysis and promoting cooperation between Resource Centers.<br />
This procedure is aimed at minimising the impact of security incidents by encouraging post-mortem analysis and promoting cooperation between Resource Centers.<br> It is based on the [https://documents.egi.eu/public/ShowDocument?docid=2935 Security Incident Response Policy].  
It is based on the [https://documents.egi.eu/public/ShowDocument?docid=82 Security Incident Response Policy].


This incident response procedure is aiming at complementing local security procedures. Unless specified otherwise in separate service level agreements, all times in this document refer to normal local working hours.
This incident response procedure is aiming at complementing local security procedures. Unless specified otherwise in separate service level agreements, all times in this document refer to normal local working hours.  


This document is intended for Resource Provicer site security contacts and site administrators and is primarily aimed at reporting, investigating and resolving security incidents.
This document is intended for Resource Center security contacts and administrators and is primarily aimed at reporting, investigating and resolving security incidents.  
 
 
 
[https://documents.egi.eu/public/ShowDocument?docid=710 Previous approved version of the procedure - ''approved'', July 2010 (MS405) ]&nbsp;


= Definitions =
= Definitions =
Line 36: Line 38:
* '''abuse at egi.eu''': Address to be used for reporting security Incident (In case of TLP:RED data, use GPG: A97F 3BDD F0EE 01A1 176C C13A 93BF 7F91 5696 F750)
* '''abuse at egi.eu''': Address to be used for reporting security Incident (In case of TLP:RED data, use GPG: A97F 3BDD F0EE 01A1 176C C13A 93BF 7F91 5696 F750)
* '''site-security-contacts at mailman.egi.eu''': Mailing list containing all Resource Center &quot;CSIRT E-Mail&quot; as defined in GOC-DB
* '''site-security-contacts at mailman.egi.eu''': Mailing list containing all Resource Center &quot;CSIRT E-Mail&quot; as defined in GOC-DB
* '''ngi-security-contacts at mailman.egi.eu''': Mailing list containing all NGI &quot;Security E-Mail&quot; as defined in GOC-DB


= Entities involved in the procedure =
= Entities involved in the procedure =
Line 43: Line 46:
* '''Resource Center''': RC CSIRT E-Mail as defined in GOC-DB
* '''Resource Center''': RC CSIRT E-Mail as defined in GOC-DB


= Resource Center Responsabilities =
= Resource Center Responsibilities =


The following table describes the actions to be taken when an incident potentially affecting EGI users, data, services, infrastructure is '''suspected'''. Administrators are recommended to take note of every action (with timestamp) they take, for later analysis or legal cases.
The following table describes the actions to be taken when an incident potentially affecting EGI users, data, services, infrastructure is '''suspected'''. Administrators are recommended to take note of every action (with timestamp) they take, for later analysis or legal cases.
Line 51: Line 54:
!Step
!Step
!Action
!Action
!SLA
!Deadline
|-
|-
|1
|1
|Inform your local security team, your NGI Security Officer and the EGI CSIRT via [mailto:abuse@egi.eu abuse@egi.eu]. You are encouraged to use the [[EGI_CSIRT:Incident_reporting|recommended templates]].
|Inform your local security team, your NGI Security Officer and the EGI CSIRT via [mailto:abuse@egi.eu abuse@egi.eu]. You are encouraged to use the [[EGI_CSIRT:Incident_reporting|recommended templates]].
|4 hours after discovery
|Within 4 hours of discovery
|-
|-
|2
|2
|Isolate the incident while keeping all forensic data: Do NOT reboot or power off hosts. Do NOT destroy VMs. Instead disconnect it from the network and if possible take a snapshot of the system. In case you need support, contact your local security team or the EGI CSIRT via [mailto:abuse@egi.eu abuse@egi.eu].
|In consultation with your local security team and the EGI CSIRT, act to isolate the compromised systems and contain the incident whilst preserving forensic data. Take a snapshot of affected VMs. Isolate at the network level if possible. Do NOT reboot or power off hosts. Do NOT destroy VMs. Physically disconnect systems from the network ONLY where other options are not available.
|1 day after discovery
|Within 1 day of discovery
|-
|-
|3
|3
|Confirm the incident, with assistance from your local security team and the EGI CSIRT.
|Together with your local security team and the EGI CSIRT decide if it is an incident that requires further investigation or action.
|
|
|-
|-
|4
|4
|If applicable, announce downtime for the affected services in accordance with the [https://documents.egi.eu/public/ShowDocument?docid=15 EGI Operational Procedures]
|If applicable, announce downtime for the affected services in accordance with the [https://documents.egi.eu/public/ShowDocument?docid=15 EGI Operational Procedures]
|1 day after isolation
|Within 1 day of isolation
|-
|-
|5
|5
|Perform appropriate analysis and take necessary corrective actions, see [[SEC01#Incident_Analysis_Guideline|Incident Analysis Guideline]]
|Perform appropriate analysis and take necessary corrective actions, see [[SEC01#Incident_Analysis_Guideline|Incident Analysis Guideline]]
|4 hours after any EGI CSIRT request
|Within 4 working hours of any EGI CSIRT request
|-
|-
|6
|6
|Coordinate with your local security team and the EGI CSIRT to send an incident closure report within 1 month following the incident to the EGI CSIRT via [mailto:abuse@egi.eu abuse@egi.eu], including lessons learnt and resolution. This report should be labelled AMBER or higher, according to the [[EGI_CSIRT:TLP|Traffic Light Protocol]].
|Coordinate with your local security team and the EGI CSIRT to send an incident closure report to the EGI CSIRT via [mailto:abuse@egi.eu abuse@egi.eu], including lessons learnt and resolution. This report should be labelled AMBER or RED, according to the [[EGI_CSIRT:TLP|Traffic Light Protocol]].
|1 month after incident resolution
|Within 1 month of incident resolution
|-
|-
|7
|7
Line 82: Line 85:
|}
|}


= EGI-CSIRT Responsabilities =
= EGI-CSIRT Responsibilities =


== EGI-CSIRT Security Officer on Duty ==
== EGI-CSIRT Security Officer on Duty ==
The EGI-CSIRT Security Officer on Duty tasks are:
The EGI-CSIRT Security Officer on Duty tasks are:


* Evaluate the initial incident report and determine whether it appears to be part of a multi-site incident, in particular, whether it is related to a previously known incident (e.g. do the same attacking IP addresses appear, are the attacker's tools and methodology strongly similar):
* Evaluate the initial incident report and determine whether it appears to be part of an incident covering multiple RCs, in particular, whether it is related to a previously known incident (e.g. do the same attacking IP addresses appear, are the attacker's tools and methodology strongly similar):
** If this is a new, unrelated incident, assign an identifying tag (of the format [EGI-YYYYMMDD-NN]) to the incident and announce it to all sites via [mailto:site-security-contacts@mailman.egi.eu site-security-contacts@mailman.egi.eu] using the [[EGI_CSIRT:Incident_reporting|recommended templates]].
** If this is a new, unrelated incident, assign an identifying tag (of the format [EGI-YYYYMMDD-NN]) to the incident and announce it to all RCs via [mailto:site-security-contacts@mailman.egi.eu site-security-contacts@mailman.egi.eu], all NGIs via [mailto:ngi-security-contacts@mailman.egi.eu ngi-security-contacts@mailman.egi.eu] and the EGI CSIRT via [mailto:csirt@egi.eu csirt@egi.eu] using the [[EGI_CSIRT:Incident_reporting|recommended templates]].
** If the incident is part of a multi-site incident, the incident coordinator MAY choose not to announce each incident separately, but instead issue regular updates on the overall multi-site incident.
** If the incident is part of an incident covering multiple RCs, the incident coordinator MAY choose not to announce each incident separately, but instead issue regular updates on the overall incident.
* Whenever and as often as necessary, send updated:
* Take any appropriate actions in order to:
** summary reports to all sites ([mailto:site-security-contacts@mailman.egi.eu site-security-contacts@mailman.egi.eu]), containing the status of the incident and possibly details needed to search
** Contact affected parties to obtain accurate information at an appropriate level of detail and in a timely manner.
** detailed reports to the RCs directly involved and affected by the incident, containing interesting findings or possible leads that could be used to resolve the incident
** Investigate to determine the cause and extent of the incident, what assets have been compromised (credentials etc.), and how to resolve the incident.
* Whenever an obviously malicious behaviour or policy violation can be linked to a user account or identity:
** Help involved RCs to resolve the incident by providing recommendations, promoting collaboration with other RCs and periodically checking their statuses.
** Maintain communications with any other involved parties inside and outside EGI.
* When appropriate, send updated:
** Summary reports to all RCs, NGIs and the EGI-CSIRT ([mailto:site-security-contacts@mailman.egi.eu site-security-contacts@mailman.egi.eu], [mailto:ngi-security-contacts@mailman.egi.eu ngi-security-contacts@mailman.egi.eu] and [mailto:csirt@egi.eu csirt@egi.eu]), containing the status of the incident and indicators of compromise that can be used by RCs to evaluate their implication
** Detailed reports to the RCs directly involved and affected by the incident, containing interesting findings or possible leads that could be used to resolve the incident
* If malicious behaviour or a policy violation can be linked to a user account or identity:
** Add the account or identity to the emergency suspension list following the [[SEC04|appropriate procedure]].
** Add the account or identity to the emergency suspension list following the [[SEC04|appropriate procedure]].
** If applicable, report the incident to the VO providing access
** If applicable, report the incident to the VO providing access. Coordinate any user suspension and job termination with the VO.
** Verify the legimity of the activity with the owner of the account or identity
** Without hindering the investigation, verify the legitimacy or otherwise of the activity with the owner of the account or identity
* If user credentials have been exposed or compromised, report it to the relevant credential provider. In particular, CA contacts are available on https://www.eugridpma.org/showca.
* When suspended accounts or identities no longer represent a threat, typically when the incident is resolved and compromised credentials have been re-issued, remove them from the emergency suspension list
* When suspended accounts or identities no longer represent a threat, typically when the incident is resolved and compromised credentials have been re-issued, remove them from the emergency suspension list
* When a virtual appliance is identified as being vulnerable or malicious, ensure that:
** Its endorsement is revoked on [https://appdb.egi.eu/ APP-DB]
** All instantiated and running VMs using this virtual appliance are properly handled
* Based on the incident closure report received from the affected RC, send a closure report with the relevant information to all partners.
* Based on the incident closure report received from the affected RC, send a closure report with the relevant information to all partners.
* Take any appropriate actions in order to:
** Actively stimulate and probe the affected parties to obtain accurate information at an appropriate level of detail and in a timely manner.
** Understand the exact cause and extent of the incident, what assets have been compromised (credentials etc.), and how to resolve the incident.
** Help involved RCs resolve the incident by providing recommendations, promoting collaboration with other sites and periodically checking their status.
** Maintain communications with any other involved parties inside and outside EGI.


== EGI-CSIRT Security Incident Coordinator ==
== EGI-CSIRT Security Incident Coordinator ==
Line 110: Line 116:
In addition, the EGI CSIRT appoints a security incident coordinator for each incident, responsible for:
In addition, the EGI CSIRT appoints a security incident coordinator for each incident, responsible for:


* Ensuring that the incident does not become stale due to the duty rotation.
* Ensuring that the investigation does not stall
* When the incident is closed, based on the site incident closure report and in coordination with the on-duty security officers, conduct a debriefing
* Ensuring that information is properly logged
* Conducting a debriefing after the investigation is complete.


= Incident Analysis Guideline =
= Incident Analysis Guideline =


As part of the security incident resolution process, sites are expected to produce the following information:
As part of the security incident resolution process, RCs are expected to produce the following information:


* Who/how detected or reported the incident
* Who/how detected or reported the incident
* Host(s) affected (ex: compromised hosts, hosts running suspicious user code)
* Host(s) affected (ex: compromised hosts, hosts running suspicious user code)
* If applicable, the virtual appliance used to instantiate the affected virtual machine.
* Evidence of the compromise, including timestamps (ex: suspicious files, log entry or network activity)
* Possibly affected X509 certificate DNs of the user(s), operator(s), consumer(s)
* If relevant, host(s) used as a local entry point to the site (ex: UI or WMS IP address)
* If available, Remote IP address(es) of the attacker
* Evidence of the compromise, including timestamps (ex: suspicious files,log entry or network activity)
* If available, what was lost, details of the attack (ex: compromised credentials, (root) compromised host)
* If available and relevant, the list of other sites possibly affected
* If available and relevant, possible vulnerabilities exploited by the attacker
* The actions taken to resolve the incident
* The actions taken to resolve the incident
* When applicable/available:
** Possibly affected X509 certificate DNs of the user(s), operator(s), consumer(s)
** Host(s) used as a local entry point to the RC (ex: UI or WMS IP address)
** Remote IP address(es) of the attacker
** The virtual appliance used to instantiate any affected virtual machine.
** What was lost, details of the attack (ex: compromised credentials, (root) compromised host)
** Any remote IP you suspect to be affected
** Vulnerabilities possibly exploited by the attacker
** Details of malicious jobs, in particular: submitter, submission time, start time, local job ID, hostname/IP
RCs are also expected to:
* Report any action taken to the EGI CSIRT as often as necessary
* Identify and kill suspicious process(es) as appropriate, but aim at preserving the information they could have generated, both in memory and on disk by dumping them beforehand, see [[Forensic_Howto|Forensic_Howto]]
* Identify and kill suspicious process(es) as appropriate, but aim at preserving the information they could have generated, both in memory and on disk by dumping them beforehand, see [[Forensic_Howto|Forensic_Howto]]
* If it is suspected that some grid credentials have been abused or compromised, you MUST ensure the relevant accounts have been suspended centrally according to the [https://documents.egi.eu/secure/ShowDocument?docid=1018 EGI CSIRT Operational Procedure for Compromised Certificates and Central Security Emergency suspension]. Contact the EGI CSIRT for this.
* If it is suspected that any credentials have been abused or compromised, you MUST inform the EGI CSIRT who take appropriate action. Inform the EGI CSIRT of any direct contact with the involved VO, CA or any other credential provider.
* If it is suspected that some grid credentials have been abused, you MUST ensure that the relevant VO manager(s) have been informed. VO contacts are available from: https://cic.gridops.org/index.php?section=vo. Contact the EGI CSIRT for this.
* If it is suspected that a virtual appliance used to instantiate an affected virtual machine is vulnerable or malicious, you MUST report it to the EGI CSIRT.
* If it is suspected that some grid credentials have been compromised, you MUST ensure that the relevant CA has been informed. CA contacts are available from: https://www.eugridpma.org/showca. Contact the EGI CSIRT for this.
* Seek help from your local security team, from your NGI Security Officer or from the EGI CSIRT
* If it is suspected that a virtual appliance used to instantiate the affected virtual machine, you MUST ensure that the virtual appliance is de-endorsed. Contact the EGI CSIRT for this.
* If needed, seek help from your local security team, from your NGI Security Officer or from the EGI CSIRT
* If relevant, additional reports containing suspicious patterns, IP addresses, files or evidence that may be of use to other infrastructure parties SHOULD be sent to the EGI CSIRT.
* If relevant, additional reports containing suspicious patterns, IP addresses, files or evidence that may be of use to other infrastructure parties SHOULD be sent to the EGI CSIRT.


RCs are also recommended to:
* Locally suspend any credential that has been directly reported to be violating the AUP. Unless emergencies, serious risks or damages, non-responsiveness of the VO or recommended otherwise, sites are not recommended to immediately suspend VO's pilot DNs indirectly (user payload) violating the AUP.


As part of the investigations, sites MUST be able to provide the relevant logging information produced by local services. Logging information such as IP addresses, timestamps and identities involved etc., concerning the source of any suspicious successful connection, must meet the following minimal requirements, if possible according to local laws:
 
As part of the investigations, RCs MUST be able to provide the relevant logging information produced by local services. Logging information such as IP addresses, timestamps and identities involved etc., concerning the source of any suspicious successful connection, must meet the following minimal requirements, if possible according to local laws:


* 6 months prior to the discovery of the incident for successful SSH connections against EGI services, and for the originating submission host for grid jobs or virtual machines
* 6 months prior to the discovery of the incident for successful SSH connections against EGI services, and for the originating submission host for grid jobs or virtual machines
* 3 months prior to the discovery of the incident for all other EGI related services.
* 3 months prior to the discovery of the incident for all other EGI related services.
For example, should an incident be detected and reported on 1st of September, it is expected that sites can produce the relevant logging information for suspicious SSH connections from 1st of March.


= Revision History =
= Revision History =
Line 174: Line 185:
|5
|5
|Vincent Brillault/CERN
|Vincent Brillault/CERN
|Ongoing
|25/02/2016
|Adapted to the wiki format, several adjustment
|Adapted to the wiki format, several adjustment
|}
|}

Revision as of 15:10, 11 December 2018



Title: EGI CSIRT Security Incident Handling Procedure
Document link: https://wiki.egi.eu/wiki/SEC01
Last modified: v5, 25.02.2016
Policy Group Acronym: EGI-CSIRT
Policy Group Name: EGI-CSIRT
Contact Group: irtf@mailman.egi.eu
Document Status: Approved
Approved Date: 31.03.2016
Procedure Statement: This procedure is aimed at minimising the impact of security incidents by encouraging post-mortem analysis and promoting cooperation between Resource Centers.
Owner: Owner of procedure


Overview

This procedure is aimed at minimising the impact of security incidents by encouraging post-mortem analysis and promoting cooperation between Resource Centers.
It is based on the Security Incident Response Policy.

This incident response procedure aims to complement local security procedures. Unless specified otherwise in separate service level agreements, all times in this document refer to normal local working hours.

This document is intended for Resource Center security contacts and administrators and is primarily aimed at reporting, investigating and resolving security incidents.


Previous approved version of the procedure - approved, July 2010 (MS405)  

Definitions

Please refer to the EGI Glossary for the definitions of the terms used in this procedure.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Contact points

  • abuse at egi.eu: Address to be used for reporting security incidents (in case of TLP:RED data, use GPG: A97F 3BDD F0EE 01A1 176C C13A 93BF 7F91 5696 F750)
  • site-security-contacts at mailman.egi.eu: Mailing list containing all Resource Center "CSIRT E-Mail" as defined in GOC-DB
  • ngi-security-contacts at mailman.egi.eu: Mailing list containing all NGI "Security E-Mail" as defined in GOC-DB

Entities involved in the procedure

  • EGI-CSIRT Security Officer on Duty: irtf at mailman.egi.eu
  • NGI Security Officer: NGI Security E-Mail as defined in GOC-DB
  • Resource Center: RC CSIRT E-Mail as defined in GOC-DB

Resource Center Responsibilities

The following table describes the actions to be taken when an incident potentially affecting EGI users, data, services or infrastructure is suspected. Administrators are advised to record every action they take (with a timestamp), for later analysis or legal cases.

SEC01-RC.jpg
Step | Action | Deadline
1 | Inform your local security team, your NGI Security Officer and the EGI CSIRT via abuse@egi.eu. You are encouraged to use the recommended templates. | Within 4 hours of discovery
2 | In consultation with your local security team and the EGI CSIRT, act to isolate the compromised systems and contain the incident whilst preserving forensic data. Take a snapshot of affected VMs. Isolate at the network level if possible. Do NOT reboot or power off hosts. Do NOT destroy VMs. Physically disconnect systems from the network ONLY where other options are not available. | Within 1 day of discovery
3 | Together with your local security team and the EGI CSIRT, decide if it is an incident that requires further investigation or action. | -
4 | If applicable, announce downtime for the affected services in accordance with the EGI Operational Procedures. | Within 1 day of isolation
5 | Perform appropriate analysis and take necessary corrective actions, see Incident Analysis Guideline. | Within 4 working hours of any EGI CSIRT request
6 | Coordinate with your local security team and the EGI CSIRT to send an incident closure report to the EGI CSIRT via abuse@egi.eu, including lessons learnt and resolution. This report should be labelled AMBER or RED, according to the Traffic Light Protocol. | Within 1 month of incident resolution
7 | Restore the service and, if needed, update the service documentation and procedures to prevent recurrence. | -
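
The short deadlines in the table above are counted in working hours, so a helper that skips nights and weekends can be useful when tracking them. The following is a minimal sketch and not part of the procedure: the working hours (09:00 to 17:00, Monday to Friday) are an assumption, since the procedure only says "normal local working hours", and the names `next_working_moment` and `add_working_hours` are hypothetical.

```python
from datetime import datetime, time, timedelta

# Assumed working hours; the procedure only says "normal local working hours".
DAY_START = time(9, 0)
DAY_END = time(17, 0)


def next_working_moment(t: datetime) -> datetime:
    """Move t forward to the nearest moment inside the assumed working hours."""
    while True:
        if t.weekday() >= 5:  # Saturday or Sunday: try the next day's opening
            t = datetime.combine(t.date() + timedelta(days=1), DAY_START)
        elif t.time() < DAY_START:
            t = datetime.combine(t.date(), DAY_START)
        elif t.time() >= DAY_END:
            t = datetime.combine(t.date() + timedelta(days=1), DAY_START)
        else:
            return t


def add_working_hours(start: datetime, hours: float) -> datetime:
    """Return the moment reached after `hours` of assumed working time."""
    remaining = timedelta(hours=hours)
    t = next_working_moment(start)
    while True:
        end_of_day = datetime.combine(t.date(), DAY_END)
        if t + remaining <= end_of_day:
            return t + remaining
        # Use up the rest of today, then continue on the next working day.
        remaining -= end_of_day - t
        t = next_working_moment(end_of_day)


# Step 1: report within 4 hours of discovery (discovered on a Friday afternoon).
discovery = datetime(2016, 2, 26, 15, 30)
print(add_working_hours(discovery, 4))  # 2016-02-29 11:30:00
```

Public holidays and time zones are deliberately ignored here; a site would need to adapt the sketch to its own calendar and to any service level agreement that overrides the default.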

EGI-CSIRT Responsibilities

EGI-CSIRT Security Officer on Duty

The tasks of the EGI-CSIRT Security Officer on Duty are:

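  • Evaluate the initial incident report and determine whether it appears to be part of an incident covering multiple RCs, in particular, whether it is related to a previously known incident (e.g. do the same attacking IP addresses appear, are the attacker's tools and methodology strongly similar):
    • If this is a new, unrelated incident, assign an identifying tag (of the format [EGI-YYYYMMDD-NN]) to the incident and announce it to all RCs via site-security-contacts@mailman.egi.eu, all NGIs via ngi-security-contacts@mailman.egi.eu and the EGI CSIRT via csirt@egi.eu using the recommended templates.
    • If the incident is part of an incident covering multiple RCs, the incident coordinator MAY choose not to announce each incident separately, but instead issue regular updates on the overall incident.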
  • Take any appropriate actions in order to:
    • Contact affected parties to obtain accurate information at an appropriate level of detail and in a timely manner.
    • Investigate to determine the cause and extent of the incident, what assets have been compromised (credentials etc.), and how to resolve the incident.
    • Help involved RCs to resolve the incident by providing recommendations, promoting collaboration with other RCs and periodically checking their status.
    • Maintain communications with any other involved parties inside and outside EGI.
  • When appropriate, send updated:
    • Summary reports to all RCs, NGIs and the EGI-CSIRT (site-security-contacts@mailman.egi.eu, ngi-security-contacts@mailman.egi.eu and csirt@egi.eu), containing the status of the incident and indicators of compromise that RCs can use to assess whether they are affected
    • Detailed reports to the RCs directly involved and affected by the incident, containing interesting findings or possible leads that could be used to resolve the incident
  • If malicious behaviour or a policy violation can be linked to a user account or identity:
    • Add the account or identity to the emergency suspension list following the appropriate procedure.
    • If applicable, report the incident to the VO providing access. Coordinate any user suspension and job termination with the VO.
    • Without hindering the investigation, verify the legitimacy or otherwise of the activity with the owner of the account or identity
  • If user credentials have been exposed or compromised, report it to the relevant credential provider. In particular, CA contacts are available on https://www.eugridpma.org/showca.
  • When suspended accounts or identities no longer represent a threat, typically when the incident is resolved and compromised credentials have been re-issued, remove them from the emergency suspension list
  • When a virtual appliance is identified as being vulnerable or malicious, ensure that:
    • Its endorsement is revoked on APP-DB
    • All instantiated and running VMs using this virtual appliance are properly handled
  • Based on the incident closure report received from the affected RC, send a closure report with the relevant information to all partners.
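
New incidents are identified by a tag of the format [EGI-YYYYMMDD-NN]. As an illustration only, the sketch below builds and parses such tags. It assumes that NN is a two-digit sequence number from 01 to 99 and that the date part is a calendar date; the procedure states the format but does not spell out either point, and the function names are hypothetical.

```python
import re
from datetime import date

# Matches tags such as "[EGI-20160225-03]".
TAG_PATTERN = re.compile(r"^\[EGI-(\d{4})(\d{2})(\d{2})-(\d{2})\]$")


def make_incident_tag(day: date, sequence: int) -> str:
    """Build a tag; `sequence` is assumed to be a number from 1 to 99."""
    if not 1 <= sequence <= 99:
        raise ValueError("sequence must fit in two digits (1-99)")
    return f"[EGI-{day:%Y%m%d}-{sequence:02d}]"


def parse_incident_tag(tag: str) -> tuple[date, int]:
    """Split a tag into its date and sequence number, or raise ValueError."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a valid incident tag: {tag!r}")
    year, month, day, sequence = map(int, match.groups())
    return date(year, month, day), sequence  # date() rejects impossible dates


print(make_incident_tag(date(2016, 2, 25), 3))  # [EGI-20160225-03]
```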

EGI-CSIRT Security Incident Coordinator

In addition, the EGI CSIRT appoints a security incident coordinator for each incident, responsible for:

  • Ensuring that the investigation does not stall
  • Ensuring that information is properly logged
  • Conducting a debriefing after the investigation is complete.

Incident Analysis Guideline

As part of the security incident resolution process, RCs are expected to produce the following information:

  • Who detected or reported the incident, and how
  • Host(s) affected (e.g. compromised hosts, hosts running suspicious user code)
  • Evidence of the compromise, including timestamps (e.g. suspicious files, log entries or network activity)
  • The actions taken to resolve the incident
  • When applicable/available:
    • Possibly affected X509 certificate DNs of the user(s), operator(s), consumer(s)
    • Host(s) used as a local entry point to the RC (e.g. UI or WMS IP address)
    • Remote IP address(es) of the attacker
    • The virtual appliance used to instantiate any affected virtual machine
    • What was lost, details of the attack (e.g. compromised credentials, (root) compromised host)
    • Any remote IP addresses you suspect to be affected
    • Vulnerabilities possibly exploited by the attacker
    • Details of malicious jobs, in particular: submitter, submission time, start time, local job ID, hostname/IP
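
The information listed above can be collected in a simple record so that nothing is forgotten when writing the report. This is only a sketch: the class and field names are hypothetical, and the recommended report templates remain the reference for the actual layout.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class MaliciousJob:
    """Details of one malicious job, as listed in the guideline."""
    submitter: str
    submission_time: datetime
    start_time: datetime
    local_job_id: str
    host: str  # hostname or IP address


@dataclass
class IncidentReport:
    # Always expected
    detected_by: str
    detection_method: str
    affected_hosts: list[str]
    evidence: list[str]  # e.g. suspicious files or log entries, with timestamps
    actions_taken: list[str]
    # Only when applicable/available
    affected_dns: list[str] = field(default_factory=list)
    entry_point_hosts: list[str] = field(default_factory=list)
    attacker_ips: list[str] = field(default_factory=list)
    virtual_appliances: list[str] = field(default_factory=list)
    losses: list[str] = field(default_factory=list)
    suspected_remote_ips: list[str] = field(default_factory=list)
    vulnerabilities: list[str] = field(default_factory=list)
    malicious_jobs: list[MaliciousJob] = field(default_factory=list)

    def missing_required(self) -> list[str]:
        """Names of always-expected fields that are still empty."""
        required = ("detected_by", "detection_method", "affected_hosts",
                    "evidence", "actions_taken")
        return [name for name in required if not getattr(self, name)]


# Example values are invented for illustration.
report = IncidentReport(
    detected_by="site administrator",
    detection_method="unexpected outbound traffic",
    affected_hosts=["wn042.example.org"],
    evidence=[],
    actions_taken=["host isolated at the network level"],
)
print(report.missing_required())  # ['evidence']
```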

RCs are also expected to:

  • Report any action taken to the EGI CSIRT as often as necessary
  • Identify and kill suspicious process(es) as appropriate, but first preserve the information they may have generated, both in memory and on disk, by dumping the processes beforehand, see Forensic_Howto
  • If it is suspected that any credentials have been abused or compromised, you MUST inform the EGI CSIRT, which will take appropriate action. Inform the EGI CSIRT of any direct contact with the involved VO, CA or any other credential provider.
  • If it is suspected that a virtual appliance used to instantiate an affected virtual machine is vulnerable or malicious, you MUST report it to the EGI CSIRT.
  • Seek help from your local security team, from your NGI Security Officer or from the EGI CSIRT
  • If relevant, additional reports containing suspicious patterns, IP addresses, files or evidence that may be of use to other infrastructure parties SHOULD be sent to the EGI CSIRT.

RCs are also recommended to:

  • Locally suspend any credential that has been directly reported to be violating the AUP. Except in emergencies, in cases of serious risk or damage, when the VO is unresponsive, or when advised otherwise, RCs are advised not to immediately suspend a VO's pilot DNs that violate the AUP only indirectly (through a user payload).


As part of the investigations, RCs MUST be able to provide the relevant logging information produced by local services. Logging information concerning the source of any suspicious successful connection, such as IP addresses, timestamps and identities involved, must cover at least the following periods, as far as local laws permit:

  • 6 months prior to the discovery of the incident for successful SSH connections against EGI services, and for the originating submission host for grid jobs or virtual machines
  • 3 months prior to the discovery of the incident for all other EGI related services.
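
As a worked example of these periods, an earlier revision of this procedure notes that for an incident detected and reported on 1 September, logs for suspicious SSH connections are expected from 1 March. The sketch below reproduces that calculation. The helper name and the dictionary keys are hypothetical, and clamping to the last day of a shorter month is an assumption, since the procedure does not say how to handle such dates.

```python
import calendar
from datetime import date

# Retention periods from the procedure, in months.
RETENTION_MONTHS = {
    "ssh_or_submission_host": 6,  # SSH logins, originating submission hosts
    "other_egi_service": 3,       # all other EGI related services
}


def months_before(day: date, months: int) -> date:
    """Return the date `months` calendar months before `day`."""
    total = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Assumption: clamp to the last day of the target month (e.g. 31 -> 28).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


discovered = date(2018, 9, 1)
for kind, months in RETENTION_MONTHS.items():
    print(kind, "logs needed from", months_before(discovered, months))
```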

Revision History

Version | Authors | Date | Comments
1 | Mingchao Ma/STFC | 28/06/2011 | First draft based on MS405, a few minor updates, added appendix C and appendix D
2 | Mingchao Ma/STFC | 28/07/2011 | Addressed comments from Dorine
3 | Mingchao Ma/STFC | 11/10/2011 | Corrected a reference in the incident response check list (appendix D)
4 | Sven Gabriel/Nikhef/FOM | 26/05/2015 | Adjustments for IR in EGI Cloud environments
5 | Vincent Brillault/CERN | 25/02/2016 | Adapted to the wiki format, several adjustments