EGI CSIRT:Security challenges
- 1 What is expected from sites?
- 2 General Information on SSC
- 3 Security Drills Framework
What is expected from sites?
Information to be gathered at the sites
For an initial response and first directions, answers to the following questions may be useful.
- Are there any other suspicious connections open? If so, to which IPs?
- Is network monitoring data (e.g. netflows) available?
- Does the process belong to a batch job or an interactive login?
- From where was the login/job submission done?
- In case it is a Grid job, the following questions are important:
  - To which VO is the user/certificate affiliated?
  - Which grid certificates (DN) are involved in this test incident?
    # Example: DN-1: CN=John Doe, O=<SomeInstitute>, O=<Something>, ...
  - Since when were the jobs running?
    # Example: YYYY:MM:DD hh:mm
    Date:
The sites should provide this information to the security teams as soon as possible, and at the latest within one working day. The time needed to pass this information to EGI-CSIRT by replying to the alarm mail will be measured and evaluated. Replying to the alarm mail automatically uses the RTIR system sketched on this page.
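The questions above can be collected in a small structure before replying to the alarm mail. The following is a minimal sketch, not an official EGI-CSIRT format: the class name `InitialReport` and all field names are assumptions introduced for illustration. A value of `None` means the question has not been answered yet, while an empty list means "checked, nothing found".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InitialReport:
    """Hypothetical container for a site's first answers to the alarm mail."""

    suspicious_ips: Optional[List[str]] = None   # IPs of other suspicious connections
    netflow_available: Optional[bool] = None     # is network monitoring data available?
    process_origin: Optional[str] = None         # "batch job" or "interactive login"
    submission_host: Optional[str] = None        # where the login/job submission came from
    vo: Optional[str] = None                     # Grid jobs only: VO of the user/certificate
    certificate_dns: Optional[List[str]] = None  # Grid jobs only: DNs involved
    jobs_running_since: Optional[str] = None     # e.g. "YYYY:MM:DD hh:mm"

    def missing(self) -> List[str]:
        """Return the names of the questions that are still unanswered."""
        return [name for name, value in vars(self).items() if value is None]

    def as_text(self) -> str:
        """Render all answers as plain text suitable for an e-mail body."""
        lines = []
        for name, value in vars(self).items():
            shown = "UNKNOWN" if value is None else value
            lines.append(f"{name}: {shown}")
        return "\n".join(lines)


if __name__ == "__main__":
    report = InitialReport(netflow_available=True, vo="example-vo")
    print(report.as_text())
    print("Still to be collected:", ", ".join(report.missing()))
```

Keeping unanswered questions explicit makes it easier to send a first reply quickly and follow up with the remaining details later.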
Evaluation - Report generation
General Information on SSC
Terms of reference
The Security Service Challenges (SSC) are executed under the authority of the EGI-CSIRT.
The goal of the EGI-CSIRT Security Drills is to investigate whether sufficient information is available to conduct an audit trace as part of an incident response, and to ensure that appropriate communication channels are available.
More specifically, the SSC will address the following security aspects of the Grid:
- Compliance with, and understanding of, the Audit Requirements for EGI;
- Compliance with, and understanding of, the EGI-CSIRT Incident Handling and Response Guide;
- The overall execution of the incident handling procedures.
EGI-CSIRT conducts the SSC semi-regularly across all the EGI Grid Sites. The particulars of the challenge will evolve over time. Additional information, including historical information, about the SSC is available from the SSC Wiki.
Outline of the Challenge
The test job is a program which is launched by means of the published methods applicable to the Grid. It will be submitted under unobtrusive credentials that will be retained for the duration of the test. After 72 wall-clock hours, the job will terminate itself. During this time, the job will mostly lie dormant, but it will wake up occasionally to report its presence through an out-of-band logging channel. While the job is active, the Security Service at the target Site is asked to make certain investigations and to take actions. The events are recorded.
Launching the Challenge
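The lifecycle of the test job described in the outline (mostly dormant, occasional reports through an out-of-band channel, self-termination after 72 wall-clock hours) can be sketched as follows. This is a minimal illustration and not the actual SSC job: the function name `run_test_job`, the one-hour wake-up interval, and the `report` callback standing in for the out-of-band logging channel are all assumptions.

```python
import time
from typing import Callable

# From the text: the job terminates itself after 72 wall-clock hours.
LIFETIME_SECONDS = 72 * 3600


def run_test_job(
    report: Callable[[str], None],
    wake_interval: float = 3600.0,  # assumed; the text only says "occasionally"
    lifetime: float = LIFETIME_SECONDS,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Lie dormant, wake up periodically to report presence, then exit.

    `report` is a placeholder for the out-of-band logging channel.
    Returns the number of reports that were sent.
    """
    start = clock()
    sent = 0
    while True:
        elapsed = clock() - start
        if elapsed >= lifetime:
            return sent  # self-termination once the lifetime is reached
        report(f"alive after {elapsed:.0f}s")
        sent += 1
        # Sleep until the next wake-up, but never past the end of the lifetime.
        sleep(min(wake_interval, lifetime - elapsed))


if __name__ == "__main__":
    # Short demonstration run: three seconds of lifetime, one report per second.
    count = run_test_job(print, wake_interval=1.0, lifetime=3.0)
    print(f"{count} reports sent")
```

The clock and sleep functions are injectable so that the loop can be exercised without waiting for real time to pass.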
The Test Operator (TOP) submits a Grid job to a Computing Element (CE) located on the Site under test. The job is submitted under valid credentials, i.e. Distinguished Name (DN) and Virtual Organization (VO).
Requirements on the challenged sites
The sites contacted for a challenge are asked to follow the normal security incident response procedure and to react exactly as if the incident were real, with the following two exceptions:
1. No sanctions must be applied against the Virtual Organization (VO) that was used to submit the job.
2. All "multi-destination" alerts must be addressed to the e-mail list which has been designated for the test: firstname.lastname@example.org
   DO NOT use email@example.com for Security Service Challenges. Instead, insert the originally intended "multi-destination" address(es) in the body of your message.
Alerting and Reporting
All e-mail exchanges related to the SSC incident MUST:
1. Include the text “[THIS IS A TEST][<NAME OF YOUR SITE>]” in the “Subject” field;
2. Show the following text as the first part of the message:
   This e-mail is an alert about a TEST incident. It is executed under the supervision of EGEE/LCG Operational Security Coordination Team (OSCT) as part of the OSCT Security Services Challenge (SSC). More information about the SSC can be found at https://wiki.egi.eu/wiki/EGI_CSIRT:SSC
   You are asked to follow the normal incident procedure, but you MUST NOT take any collective action against the VO of the offending user.
3. When the out-of-band log shows that the job is established at the designated CE, TOP alerts the Computer Security Incident Response Team (CSIRT) at the target Site with the following message:
   Consider any activity from the following user as malicious. The distinguished name (DN) of the user is: <The user DN>
   Please handle this test incident according to the normal incident response procedure with the two exceptions listed below:
   1. No sanctions must be applied against the Virtual Organization (VO) that was used to submit the job.
   2. All "multi-destination" alerts must be addressed to the e-mail list which has been designated for the test: firstname.lastname@example.org
      DO NOT use email@example.com for Security Challenges. Instead, insert the originally intended "multi-destination" address(es) in the body of your message.
The CSIRT at the target Site is expected to respond appropriately.
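The formatting rules for SSC e-mails (subject tag, mandatory first paragraph, test list as the only recipient, intended addresses moved into the body) can be applied mechanically. The sketch below uses Python's standard `email.message.EmailMessage`. The function name `build_ssc_message`, its parameters, and every address in the usage example are hypothetical placeholders, and the preamble is passed in by the caller so that the full required text from the list above can be supplied unchanged.

```python
from email.message import EmailMessage
from typing import List


def build_ssc_message(
    site_name: str,
    summary: str,
    preamble: str,
    body: str,
    sender: str,
    test_list: str,
    intended_recipients: List[str],
) -> EmailMessage:
    """Build an e-mail that follows the SSC alerting rules (illustrative only)."""
    msg = EmailMessage()
    msg["From"] = sender
    # Rule: multi-destination alerts go to the designated test list only.
    msg["To"] = test_list
    # Rule: the subject must carry the test tag and the site name.
    msg["Subject"] = f"[THIS IS A TEST][{site_name}] {summary}"

    # Rule: the originally intended addresses are listed in the body instead.
    intended = "\n".join(f"  {address}" for address in intended_recipients)
    text = (
        f"{preamble}\n\n"
        f"Originally intended multi-destination address(es):\n{intended}\n\n"
        f"{body}\n"
    )
    msg.set_content(text)
    return msg


if __name__ == "__main__":
    message = build_ssc_message(
        site_name="EXAMPLE-SITE",
        summary="initial findings",
        # Shortened here; use the full required text in practice.
        preamble="This e-mail is an alert about a TEST incident.",
        body="Initial findings are attached below.",
        sender="security@site.example.org",
        test_list="ssc-test-list@example.org",
        intended_recipients=["real-alert-list@example.org"],
    )
    print(message)
```

Building the message in one place reduces the risk of a test alert reaching a production mailing list by accident.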
Post processing, clean up
As part of the incident handling, Grid authorizations may have been withdrawn from the DN that was used to submit the job. When the incident response procedure is complete, TOP will explicitly request restoration of any such authorizations to their original state.
When the challenge has been completed on a representative number of Sites, TOP will ask for de-briefing input from the participating Sites. Material submitted will be used to edit a report. The report will be circulated to the contributors for comments before being presented to the EGI-CSIRT.
Parts of this article came from the OSCT wiki, which was written by the EGEE Operational Security Coordination Team.
Security Drills Framework
The framework has been developed to automate the operation of EGI security challenges.
The release of May 2011 contains the PanDA framework for job submission and a prototype of the new EGI-CSIRT ticketing system based on RTIR.
RT-IR: Alerting - Reporting - Communication
This SSC5 workflow diagram is a subset of the EGI-CSIRT Incident-Response flowchart, adapted for using RTIR in the handling of SSC5-class incidents. As the diagram shows, the SSC Launcher opens an Incident and an Investigation to alert a site about an issue that has been found there. According to our policies and the EGI Incident Response Procedure, the site should react promptly and try to discover the activities on its resources related to the reported incident. As soon as all required information is gathered, the site security officer should report it back to EGI-CSIRT and NGI-CSIRT. Because the site's NGI Security Officer is on the Cc of the Investigation, the site can ask for help if needed. The NGI Security Officer should ensure that the proper actions are taken at the sites in their NGI and that the information flow to and from EGI-CSIRT is working properly.
Analysing the information sent by the site, the EGI-CSIRT Incident-Handler will take the proper actions to mitigate and resolve the incident. For example, the EGI-CSIRT member could contact the CERT responsible for the WMS (a site hosting this service, or a VO running a VO job-submission framework) to determine whether the malicious job run at the site was also submitted to other Grid sites of the EGI, and possibly to other Grid infrastructures. If so, the EGI-CSIRT Incident-Handler will contact the likely affected sites and their NGI Security Officers, asking them to react accordingly. He or she could also ask for the actions needed to remove the DN of the job from the VO. In a real-case scenario, all sites would be asked to check for activities related to a particular user certificate and adapt their user management settings accordingly.