The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed you can get in touch with EGI SDIS team using operations @ egi.eu.

Difference between revisions of "GOCDB/Transfer Mechanism"

From EGIWiki
(Replaced content with 'Page moved, see: GOCDB/Documentation_Index')
= Introduction =
 
The mechanism to transfer data from a regional GOCDB to a central GOCDB is separated into two parts: regional instance tasks and central instance tasks.
 
= Regional Instance Tasks =
 
Whenever a CRUD (create, read, update, delete) operation changes a fundamental GOCDB object (site, service endpoint, downtime, user, user role), the regional GOCDB will record the event. This record will include the object ID, the type of operation and the date and time at which the object was modified. The record will also show, via a synchronisation boolean, whether the object has been successfully synchronised with the central GOCDB. By default the synchronisation field will be false.
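A minimal sketch of such a change record, written in Python purely for illustration. The names <code>ChangeRecord</code> and <code>record_change</code> and all field names are assumptions, not part of GOCDB itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ChangeRecord:
    """One logged change to a fundamental GOCDB object (hypothetical layout)."""
    object_id: str              # e.g. "212G3"
    object_type: str            # e.g. "site", "downtime", "user"
    operation: str              # the type of operation, e.g. "update"
    modified_at: datetime       # date and time of the modification
    synchronised: bool = False  # false by default, as described above


def record_change(log, object_id, object_type, operation, modified_at=None):
    """Append a new, not yet synchronised record to the log and return it."""
    when = modified_at or datetime.now(timezone.utc)
    record = ChangeRecord(object_id, object_type, operation, when)
    log.append(record)
    return record
```

Here the log is a plain list; a real instance would presumably keep these records in a database table.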
 
These records form a synchronisation table, which will be used to create a PI query exposing the EGI data that has changed in the regional GOCDB, including the date each change was made. The data will be returned in chronological order (oldest first):
 
*gocdbpi/?method=synchronise_data&scope=EGI
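The selection behind this query could look roughly like the following sketch. It assumes each record is a dictionary with hypothetical <code>scope</code>, <code>synchronised</code> and <code>modified_at</code> keys; the real PI method returns XML and is not reproduced here.

```python
from datetime import datetime


def unsynchronised_changes(records, scope):
    """Return unsynchronised records in the given scope, oldest first."""
    pending = [
        r for r in records
        if r["scope"] == scope and not r["synchronised"]
    ]
    return sorted(pending, key=lambda r: r["modified_at"])


# Illustrative data only; the field names are assumptions.
example_records = [
    {"object_id": "2G3", "scope": "EGI", "synchronised": False,
     "modified_at": datetime(2012, 7, 10, 11, 0)},
    {"object_id": "1G3", "scope": "EGI", "synchronised": False,
     "modified_at": datetime(2012, 7, 10, 10, 0)},
    {"object_id": "3G3", "scope": "EGI", "synchronised": True,
     "modified_at": datetime(2012, 7, 10, 9, 0)},
    {"object_id": "4G3", "scope": "Local", "synchronised": False,
     "modified_at": datetime(2012, 7, 10, 8, 0)},
]
pending_ids = [r["object_id"] for r in unsynchronised_changes(example_records, "EGI")]
```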
 
The central GOCDB will periodically read from this query and acknowledge successfully synchronised data by sending the time of the last successfully synchronised object back to the regional portal.
 
For example, the central GOCDB will perform an HTTP GET of https://regional.gocdb.ngi.com/gocdbpi/confirm_sync.php?last_date=11.12.01%2010-07-2012&scope=EGI
 
When the regional GOCDB receives a request for this URL, it will set the synchronised boolean in the synchronisation table to true for every object whose modification date is before or at the time given in the last_date field and which is within the passed scope. This URL will be secured so that it only accepts requests that identify themselves with the central GOCDB certificate.
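The confirmation step described above might be sketched as follows. The function name, the dictionary keys and the idea of comparing certificate DNs as plain strings are all assumptions made for illustration; certificate handling in a real deployment would happen in the web server or TLS layer.

```python
from datetime import datetime


def confirm_sync(records, last_date, scope, client_dn, central_dn):
    """Mark records as synchronised up to and including last_date.

    Only a caller presenting the central GOCDB certificate DN is accepted.
    Returns the number of records newly marked as synchronised.
    """
    if client_dn != central_dn:
        raise PermissionError("only the central GOCDB may confirm synchronisation")
    confirmed = 0
    for record in records:
        if (not record["synchronised"]
                and record["scope"] == scope
                and record["modified_at"] <= last_date):
            record["synchronised"] = True
            confirmed += 1
    return confirmed


# Illustrative data only; the DN and field names are hypothetical.
CENTRAL_DN = "/CN=central-gocdb.example.org"
sample = [
    {"scope": "EGI", "synchronised": False, "modified_at": datetime(2012, 7, 10, 10, 0)},
    {"scope": "EGI", "synchronised": False, "modified_at": datetime(2012, 7, 10, 12, 0)},
    {"scope": "Local", "synchronised": False, "modified_at": datetime(2012, 7, 10, 9, 0)},
]
```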
 
= Central Instance Tasks  =
 
*Periodically poll all known regional instances
*Check that the remote certificate belongs to the expected regional instance
*Retrieve all unsynchronised data
**Read this in chunks to avoid loading all of the XML into memory at once
*Validate the data
**Check that each object's primary_key (e.g. 212G3) has a grid ID (G3) that matches the certificate presented (e.g. grid ID 3 = France = in2p3 GOCDB certificate)
**Check that each item (site, SE, downtime, user, user role) is formatted correctly as per the XML schema
**Validate fields against gocdb_schema.xml (potentially this can be done using the XML schema)
*Insert each item (site, SE, downtime, user, user role) atomically
**Note the modification date of the object being inserted
**Roll the operation back if it fails, then:
***HTTP GET https://regional.gocdb.ngi.com/gocdbpi/confirm_sync.php?last_date=10.00.12%2011.12.01&scope=EGI (with the modification date of the last successfully inserted item)
***Stop the automatic sync for this regional GOCDB
***Send an alert e-mail
***Stop the process
*HTTP GET https://regional.gocdb.ngi.com/gocdbpi/confirm_sync.php?last_date=10.00.12%2011.12.01&scope=EGI (with the modification date of the last successfully inserted item)
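The insert, roll back and acknowledge steps in the list above can be sketched as below. This is an illustration under stated assumptions, not the GOCDB implementation: it uses an SQLite table named <code>objects</code>, and the <code>acknowledge</code> and <code>alert</code> callbacks stand in for the confirm_sync HTTP GET and the alert e-mail.

```python
import sqlite3


def sync_region(conn, items, acknowledge, alert):
    """Insert items one by one (oldest first), each in its own transaction.

    Returns True if every item was inserted. On the first failure the
    insert is rolled back, the last good modification date is acknowledged,
    an alert is raised and False is returned, so that the caller can stop
    the automatic sync for this regional GOCDB.
    """
    last_ok = None
    for item in items:
        try:
            with conn:  # commits on success, rolls back on an exception
                conn.execute(
                    "INSERT INTO objects (primary_key, kind, modified_at) "
                    "VALUES (?, ?, ?)",
                    (item["primary_key"], item["kind"], item["modified_at"]),
                )
        except sqlite3.Error as exc:
            if last_ok is not None:
                acknowledge(last_ok)
            alert(f"sync stopped at {item['primary_key']}: {exc}")
            return False
        last_ok = item["modified_at"]  # note the date of the inserted object
    if last_ok is not None:
        acknowledge(last_ok)
    return True
```

Acknowledging only the last successfully inserted date means that a later run would pick up again at the item that failed.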
 
= Security  =
 
When the central GOCDB synchronises with a regional GOCDB:
 
*The central GOCDB will check that the regional GOCDB has a trusted certificate
*If the certificate is trusted, the DN is checked against a list of DNs allowed to publish data
*Each of these allowed DNs will be associated with a grid ID (e.g. G2, G6 etc)
*When reading each block of data (site, SE, downtime, user, role) the central GOCDB will check that the data's primary key corresponds to the allowed grid ID for the given DN (e.g. /C=UK/O=eScience/OU=CLRC/L=RAL/CN=/Regional GOCDB will only be allowed to publish data with a primary key ending in G2, e.g. 373G2).
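The primary key check in the last point could be sketched like this. The mapping <code>ALLOWED_PUBLISHERS</code> and both function names are hypothetical, and the sketch assumes primary keys always have the form "digits, then G, then digits", as in the examples above.

```python
import re

# Hypothetical mapping of trusted DNs to the grid ID they may publish for.
ALLOWED_PUBLISHERS = {
    "/C=UK/O=eScience/OU=CLRC/L=RAL/CN=/Regional GOCDB": "G2",
}


def grid_id_of(primary_key):
    """Extract the grid ID suffix (e.g. "G2") from a key such as "373G2"."""
    match = re.fullmatch(r"\d+(G\d+)", primary_key)
    return match.group(1) if match else None


def may_publish(dn, primary_key, allowed=ALLOWED_PUBLISHERS):
    """True only if the DN is allowed and the key carries its grid ID."""
    grid_id = allowed.get(dn)
    return grid_id is not None and grid_id_of(primary_key) == grid_id
```

Comparing the whole extracted suffix, rather than testing how the string ends, keeps a key such as 373G22 from being treated as belonging to grid G2.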

Revision as of 12:52, 23 June 2011

Page moved, see: GOCDB/Documentation_Index