- Clean up formatting DONE 2012-09-19 --Ocalladw 16:44, 20 September 2012 (CEST)
- Deal with data management
Original V1 text of this policy
1 Grid Policy on the Handling of User-Level Job Accounting Data: Introduction
This document presents the minimum requirements and policy framework for the handling of user-level accounting data created, stored, transmitted, processed and analysed as a result of the execution of jobs on the Grid.
Each job executed on a Grid resource produces an accounting record. The schema for this accounting record is based on GFD.98 (http://www.ogf.org/documents/GFD.98.pdf) as defined by the Usage Record Working Group (UR-WG) in OGF. This schema includes the X.509 certificate Subject Distinguished Name of the submitting user and this information is therefore classified as personal information and has to be properly handled to meet the legal requirements related to data protection.
This document addresses the handling of accounting data resulting from the execution of jobs on the Grid. It does not cover any other forms of accounting or monitoring data.
The document is aimed at EU-based Grids and more specifically at:
- Site Managers, to allow them to share user-level job accounting data with the Grid for the purposes described below.
- VO Resource Managers and the Grid Operations personnel who have access to the user-level accounting from more than one site.
3 The Purpose and Reasons for the Collection and Storage of the Information
Accounting data is required:
- For the VOs to find out how much of their resource allocation has been used in total and by which group or role within the VO. This allows the VO to monitor, plan and control the use of their resource allocation.
- For the sites to find out how the resources they provide to the Grid are being used and by whom. This allows them to (re-)assign their resources properly and plan purchases in a timely fashion.
- For the Grid Management and/or VOs to find out whether pledged resources have indeed been provided and properly used by the VOs. This allows for better monitoring, control and planning.
- For VOs, Grid Management and Sites to report on usage to their funding bodies.
- For operational and scientific analysis, for which only anonymised and aggregated accounting data will be used.
User-level accounting is required:
- For the VOs to understand and control how many and which individuals within the VO, group or role are using resources.
- For VOs, Grid Management and Sites to report on anonymised and aggregated statistics to their funding bodies.
- For Grid Operations during operational troubleshooting and debugging.
- For Grid Security Operations in forensic analysis of security incidents.
All other uses of the accounting data are forbidden.
4 Accounting Data Storage
Each site collects and stores an accounting record for each job executed at their site. These records are stored locally at the site according to national data privacy laws.
Each site is responsible for sending its accounting records on a regular basis, e.g. daily, with at least user DNs encrypted in transport, to a central database defined by the Grid. This database is located at an Accounting Data Centre (ADC). The location of the ADC needs to be chosen carefully according to data privacy laws.
The ADC securely stores all the individual job records from each of the sites submitting such records.
The ADC generates and securely stores aggregated statistics from the data. Levels of aggregation are defined by the Grid, e.g. per Site, per VO, per Month, per User, etc.
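The aggregation levels mentioned above can be illustrated with a minimal sketch. It assumes a simplified, hypothetical record type (JobRecord, with field names chosen for this example only; the real schema follows GFD.98) and simply sums CPU time and counts jobs for whichever combination of levels is requested. It is not a description of how any particular ADC implements aggregation.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class JobRecord(NamedTuple):
    # Hypothetical, simplified record; field names are assumptions of this sketch.
    site: str
    vo: str
    month: str        # e.g. "2012-09"
    user_id: str      # anonymised user identifier, not the DN
    cpu_seconds: int


def aggregate(records: Iterable[JobRecord], *levels: str) -> dict:
    """Count jobs and sum CPU seconds for each combination of the given fields."""
    totals = defaultdict(lambda: {"jobs": 0, "cpu_seconds": 0})
    for rec in records:
        # The key is a tuple, e.g. ("vo1", "2012-09") for levels ("vo", "month").
        key = tuple(getattr(rec, level) for level in levels)
        totals[key]["jobs"] += 1
        totals[key]["cpu_seconds"] += rec.cpu_seconds
    return dict(totals)


records = [
    JobRecord("SITE-A", "vo1", "2012-09", "u-01", 3600),
    JobRecord("SITE-A", "vo1", "2012-09", "u-02", 1800),
    JobRecord("SITE-B", "vo1", "2012-09", "u-01", 600),
]

per_site = aggregate(records, "site")
per_vo_month = aggregate(records, "vo", "month")
per_user = aggregate(records, "vo", "user_id")
```

Passing the levels as arguments keeps one code path for per Site, per VO, per Month and per User summaries, so the same access control can be applied to each result separately.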
There may be more than one ADC in the Grid, e.g. one per country. Accounting records and aggregated data may be transferred between ADCs within a Grid or between ADCs belonging to different Grids, providing both Grids have adopted this policy. User DNs must always be encrypted in transport between ADCs. Whenever this policy refers to "the ADC" this should be interpreted to refer to all ADCs under the control of the Grid.
The specification of which accounting data needs to be transferred to which ADC, and the various access control requirements, is subject to agreement between the Grid and the VO.
5 Informing the User
The user is informed about the collection and handling of accounting data during their first registration or subsequent renewal with their VO. During registration users must accept the conditions of the Grid Acceptable Use Policy (AUP). This AUP has a clause on the use of logged information.
6 Control of and Access Rights to the Information
The local accounting record for a job is controlled by the site at which the job is executed. The submitting user’s DN may be unencrypted in this information and access is restricted to the local resource administrators or other authorised persons.
Copies of individual job accounting records and aggregated data in the ADC central database are controlled by the Grid.
ADC staff, according to their role or job responsibilities, may be authorised to have access to the individual job records. All other persons have no access to the database.
Access to aggregated data at the VO level may be public information if the VO agrees but otherwise appropriate access control will be required.
Access to VO group/role aggregated data, if required by the VO, is restricted to members of that VO.
The aggregated data of a user must be properly protected. All user data in the database is anonymous in the sense that the user data cannot be connected to a user name. Access to this anonymised data, if requested by the VO, must be restricted to members of that VO.
Access to a portal that allows the decoding of the anonymised name into a person’s DN is restricted to individuals in the VO appointed to be VO Resource Managers.
Appropriately authorised individuals of the Grid Operations and Grid Security Operations teams have read access to the user-level accounting data for the purposes of operational troubleshooting, debugging and/or security incident response.
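The access rules of this section can be summarised as a small lookup table. The following is a sketch only: the role and data-category names are hypothetical labels invented for this example, and the handling of VO-level aggregates without the VO's agreement (restricted here to members of that VO) is an assumption, since the policy only requires "appropriate access control" in that case.

```python
# Hypothetical role and data-category names chosen for this sketch.
VO_SCOPED_ROLES = {"vo_member", "vo_resource_manager"}

ACCESS_RULES = {
    # data category: roles allowed to read it
    "individual_job_records": {"adc_staff"},
    "user_level_data": {"grid_operations", "grid_security_operations"},
    "vo_group_role_aggregates": {"vo_member", "vo_resource_manager"},
    "anonymised_user_aggregates": {"vo_member", "vo_resource_manager"},
    "dn_decoding_portal": {"vo_resource_manager"},
}


def may_read(role, category, *, same_vo=False, vo_allows_public=False):
    """Return True if `role` may read `category` under the rules above.

    same_vo: the requester belongs to the VO that owns the data.
    vo_allows_public: the VO has agreed to publish its VO-level aggregates.
    """
    if category == "vo_level_aggregates":
        if vo_allows_public:
            return True
        # Assumption: without VO agreement, fall back to members of that VO.
        return role in VO_SCOPED_ROLES and same_vo
    allowed = ACCESS_RULES.get(category, set())
    if role not in allowed:
        return False  # everything not listed is denied
    # VO-scoped roles only ever see data belonging to their own VO.
    if role in VO_SCOPED_ROLES:
        return same_vo
    return True
```

For example, may_read("vo_resource_manager", "dn_decoding_portal", same_vo=True) is permitted, while the same request from an ordinary "vo_member" is refused. Denying by default mirrors the rule that all persons not explicitly authorised have no access.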
7 The Period of Retention
The Sites are responsible for deleting the local accounting records according to local personal data retention policy. This needs to be long enough to ensure that all records have been successfully transferred to the ADC database.
The ADC is responsible for deleting the copies of the individual accounting records in the central database, or for removing or anonymising personal identifying information, e.g. the CommonName or e-mail components of subject DNs, from these records, at the latest one year after receipt of the data in the ADC. Personal identifying information, e.g. the CommonName component, contained in aggregated data must be treated in the same way.
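As an illustration of the anonymisation option, the sketch below removes the CommonName and e-mail components from a subject DN once a record is older than the retention limit. It assumes DNs written in the slash-separated form (e.g. /C=UK/O=eScience/CN=jane doe), a hypothetical record layout with "user_dn" and "received" keys, and a limit of exactly 365 days. A simple split on "/" does not handle values that themselves contain a slash, so this is not a complete DN parser.

```python
from datetime import datetime, timedelta, timezone

# Assumption: one year is taken as 365 days. A real deployment would likely
# use a shorter threshold so that the one-year deadline is never exceeded.
RETENTION_LIMIT = timedelta(days=365)

# Attribute names treated as personal identifying information in this sketch.
PERSONAL_ATTRIBUTES = {"CN", "EMAILADDRESS", "EMAIL", "E"}


def strip_personal_components(dn: str) -> str:
    """Drop CN and e-mail components from a slash-separated subject DN."""
    parts = [p for p in dn.split("/") if p]
    kept = [
        p for p in parts
        if p.split("=", 1)[0].strip().upper() not in PERSONAL_ATTRIBUTES
    ]
    return "/" + "/".join(kept)


def apply_retention(record: dict, now: datetime) -> dict:
    """Return the record unchanged, or with its DN anonymised if it is too old."""
    if now - record["received"] >= RETENTION_LIMIT:
        return {**record, "user_dn": strip_personal_components(record["user_dn"])}
    return record


now = datetime(2013, 10, 1, tzinfo=timezone.utc)
old = {
    "user_dn": "/C=UK/O=eScience/OU=SiteA/CN=jane doe",
    "received": datetime(2012, 9, 20, tzinfo=timezone.utc),
}
recent = {
    "user_dn": "/C=UK/O=eScience/OU=SiteA/CN=jane doe",
    "received": datetime(2013, 9, 1, tzinfo=timezone.utc),
}

old_after = apply_retention(old, now)
recent_after = apply_retention(recent, now)
```

The same function would need to be applied to any aggregated data that still carries a CommonName, since the policy treats both in the same way.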
8 Publication of the Information
The ADC publishes accounting data on its web portal. Appropriate access control for all published data must be agreed between the Grid and the VO.
The ADC provides aggregated data on CPU usage per Group and Role as defined in the Virtual Organization Management Service.
The ADC publishes user-level accounting data to the authorised VO Resource Managers, if requested by the VO.
9 Protection of the Information
The Site managers and resource administrators are responsible for the secure storage of the local accounting data. Appropriate access control mechanisms must be used to prevent unauthorised access.
The ADC must implement appropriate technical and organisational measures to protect the accounting database and the accounting web portal.
10 Transfer of the Information Across International Borders
The individual job accounting records are transferred between the sites and the central database at the ADC. Many of these transfers cross international borders. Personal identifying information must be encrypted before it is sent across the network, so that it is not possible to derive the identity of the user. Multiple records for jobs submitted by the same user contain different ciphertext for the DN.
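The requirement that records from the same user carry different ciphertext implies a randomised encryption scheme: each encryption must mix in a fresh random value. The sketch below illustrates only that property, using a toy stream cipher built from HMAC-SHA256 in the Python standard library. The function names, the shared key and the nonce length are all assumptions of this example. It provides no integrity protection and reveals the length of the DN, so a real deployment should rely on a vetted, authenticated encryption scheme rather than this code.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16  # bytes of fresh randomness per record (assumed for this sketch)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from HMAC-SHA256(key, nonce || counter)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt_dn(key: bytes, dn: str) -> bytes:
    """Encrypt a DN under a fresh random nonce; the result is nonce || ciphertext."""
    plaintext = dn.encode("utf-8")
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt_dn(key: bytes, blob: bytes) -> str:
    """Recover the DN from nonce || ciphertext using the same key."""
    nonce, ciphertext = blob[:NONCE_LEN], blob[NONCE_LEN:]
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream)).decode("utf-8")


key = secrets.token_bytes(32)  # shared between site and ADC in this sketch
dn = "/C=UK/O=eScience/OU=SiteA/CN=jane doe"
first = encrypt_dn(key, dn)
second = encrypt_dn(key, dn)
```

Because a new nonce is drawn for every call, first and second differ even though they encrypt the same DN, and an observer without the key cannot tell that two records belong to the same user. The holder of the key can still recover the DN from either value with decrypt_dn.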
Individual job accounting records and/or aggregated data may be transferred between ADCs, if this is required by the Grid(s). Many of these transfers will cross international borders. Personal identifying information must be encrypted before it is sent across the network, so that it is not possible to derive the identity of the user.
The ADC database must be located in a country that has adequate protection of personal data as defined by the EU Directive 95/46/EC (http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31995L0046:EN:HTML), e.g. the European Union, the European Economic Area and some other countries. There must be no transfer of this data to countries outside of this area.
11 Access by the User to their own Information
A user has the right to access her/his own accounting records, and a mechanism similar to the one that gives the VO Resource Manager access to the aggregated user information must be implemented for this purpose. It must also be possible to correct the data if the user can show that the stored data is wrong. In that case the user must contact the corresponding Site Manager, who is responsible for notifying any third party to whom the data has been sent, including the ADC. If it is agreed that the data is wrong, the data in the database must be corrected.
12 Signature of an Agreement Related to the Handling of the Information
Any person having access to the user-level details from more than one site must sign a copy of this document to confirm that she/he understands what can and cannot be done with the user-related information from the database.
Information that they gain from such access must not be disclosed to anyone who does not have legitimate access to such data.