GOCDB/Release4/Regionalisation/Transfer Mechanism

From EGIWiki
Revision as of 12:33, 18 December 2012 by Krakow (talk | contribs)



Introduction

The mechanism to transfer data from a regional GOCDB to a central GOCDB is separated into two parts: regional instance tasks and central instance tasks.

Regional Instance Tasks

Whenever a CRUD (create, read, update, delete) operation that modifies a fundamental GOCDB object (site, service endpoint, downtime, user, user role) is performed, the regional GOCDB will record the event. This record will include the object ID, the type of operation and the date and time at which the object was modified. The record will also show, via a synchronisation boolean, whether the object has been successfully synchronised with the central GOCDB. By default the synchronisation field will be false.
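A minimal sketch of such a change record is shown here. The class name, field names and the in-memory list are illustrative assumptions and are not taken from the GOCDB code base; a real instance would keep these records in a database table.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Operation(Enum):
    # Assumed set of modifying operations; the names are illustrative.
    CREATE = "create"
    UPDATE = "update"
    DELETE = "delete"


@dataclass
class ChangeRecord:
    object_id: str              # e.g. "212G3"
    object_type: str            # "site", "service_endpoint", "downtime", ...
    operation: Operation
    modified_at: datetime
    synchronised: bool = False  # false until the central GOCDB confirms


def record_change(log, object_id, object_type, operation, modified_at=None):
    """Append a change record to a (hypothetical) in-memory log."""
    record = ChangeRecord(
        object_id,
        object_type,
        operation,
        modified_at or datetime.now(timezone.utc),
    )
    log.append(record)
    return record
```

Every new record starts with synchronised set to false, matching the default described above.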

The table of these records will be used to create a PI query that exposes the data that has changed in the regional GOCDB for the requested tag (scope), including the date each change was made. This data will be in chronological order (oldest first).

  • gocdbpi/?method=synchronise&scope=EGI.EU
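The selection behind this query could be sketched as follows. The record layout (dictionaries with "scopes", "modified_at" and "synchronised" keys) and the function name are assumptions made for illustration; the sketch also assumes that only records not yet synchronised are returned.

```python
from datetime import datetime


def unsynchronised_changes(records, scope):
    """Return unsynchronised records in the given scope, oldest first."""
    pending = [
        r for r in records
        if not r["synchronised"] and scope in r["scopes"]
    ]
    return sorted(pending, key=lambda r: r["modified_at"])


# Illustrative data only; the IDs and dates are made up.
EXAMPLE_RECORDS = [
    {"object_id": "2G3", "scopes": {"EGI.EU"},
     "modified_at": datetime(2012, 7, 10, 11, 12, 1), "synchronised": False},
    {"object_id": "1G3", "scopes": {"EGI.EU"},
     "modified_at": datetime(2012, 7, 9, 9, 0, 0), "synchronised": False},
    {"object_id": "3G3", "scopes": {"Local"},
     "modified_at": datetime(2012, 7, 8, 9, 0, 0), "synchronised": False},
]
```

Calling unsynchronised_changes(EXAMPLE_RECORDS, "EGI.EU") selects the two EGI.EU records and orders them by modification date.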

The central GOCDB will periodically read from this query and acknowledge successfully synchronised data by sending the time of the last successfully synchronised object back to the regional portal.

e.g. the central GOCDB will perform an HTTP GET of https://regional.gocdb.ngi.com/gocdbpi/confirm_sync.php?last_date=11.12.01%2010-07-2012&scope=EGI.EU

When the regional GOCDB receives a request for this URL it will set the synchronised boolean in the synchronisation table to true for every record whose modification date is at or before the time given in the last_date field and whose object is within the passed scope. This URL will only accept requests that are authenticated with the central GOCDB certificate.
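The confirmation step might look like the sketch below. It assumes records are dictionaries with "scopes", "modified_at" and "synchronised" keys and that last_date has already been parsed into a datetime; parsing the last_date parameter and checking the client certificate are left out.

```python
from datetime import datetime


def confirm_sync(records, last_date, scope):
    """Mark in-scope records modified at or before last_date as synchronised.

    Returns the number of records that were newly marked.
    """
    updated = 0
    for record in records:
        if (
            not record["synchronised"]
            and scope in record["scopes"]
            and record["modified_at"] <= last_date
        ):
            record["synchronised"] = True
            updated += 1
    return updated
```

Records modified after last_date, or outside the passed scope, are left untouched and will be offered again on the next poll.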

Central Instance Tasks

  • Periodically poll all known regional instances
  • Check that the remote certificate belongs to the expected regional instance
  • Retrieve all unsynchronised data
    • Read this in chunks to avoid loading all XML into memory at once
  • Validate Data
    • Check that each object's primary_key (e.g. 212G3) has a grid ID (G3) that matches the certificate presented by the regional instance (e.g. grid ID 3 = france = in2p3 GOCDB cert)
    • Check that each item (site, SE, downtime, user, user role) is formatted correctly as per the XML schema (well-formed and valid XML)
    • Validate fields against gocdb_schema.xml (potentially we can do this using the XML schema)
  • Insert each item (site, SE, downtime, user, user role) atomically
  • HTTP GET https://regional.gocdb.ngi.com/gocdbpi/confirm_sync.php?last_date=10.00.12%2011.12.01&scope=EGI (with the modification date of the last successfully inserted item)
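The read, validate and insert steps above could be sketched with a streaming XML parser, as below. The element name CHANGE, the attributes PRIMARY_KEY and MODIFIED, and the insert callback are hypothetical; the actual XML layout of the synchronise method is not specified here. Stopping at the first invalid item is a design choice of this sketch, made so that the date sent back never covers an item that was not inserted.

```python
import io
import xml.etree.ElementTree as ET


def process_changes(xml_stream, expected_grid_id, insert):
    """Stream change items, insert valid ones in order, return the last date.

    xml_stream       -- file-like object holding the regional XML
    expected_grid_id -- grid ID tied to the regional certificate, e.g. "G3"
    insert           -- callable that stores one item (assumed atomic)
    """
    last_synced = None
    # iterparse yields elements as they are completed, so the whole
    # document never has to be held in memory at once.
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag != "CHANGE":
            continue
        key = elem.get("PRIMARY_KEY", "")
        if not key.endswith(expected_grid_id):
            break  # stop: later items must not be acknowledged
        insert(elem)
        last_synced = elem.get("MODIFIED")
        elem.clear()  # release the element's contents
    return last_synced


# Illustrative input only; the keys and dates are made up.
SAMPLE = io.BytesIO(
    b'<CHANGES>'
    b'<CHANGE PRIMARY_KEY="212G3" MODIFIED="2012-07-10T11:12:01"/>'
    b'<CHANGE PRIMARY_KEY="213G3" MODIFIED="2012-07-10T11:13:00"/>'
    b'</CHANGES>'
)
```

The returned value is the modification date of the last successfully inserted item, which is what the final confirm_sync request carries back to the regional instance.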

Security

When the central GOCDB synchronises with a regional GOCDB:

  • The central GOCDB will check that the regional GOCDB has a trusted certificate
  • If the certificate is trusted, its DN is checked against a list of DNs allowed to publish data
  • Each of these allowed DNs will be associated with a grid ID (e.g. G2, G6 etc)
  • When reading each block of data (site, SE, downtime, user, user role), the central GOCDB will check that the data's primary key corresponds to the grid ID allowed for the given DN. For example, the DN /C=UK/O=eScience/OU=CLRC/L=RAL/CN=/Regional GOCDB will only be allowed to publish data with a primary key ending in G2, such as 373G2.
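The DN to grid ID check could be sketched as follows. The mapping is an illustrative allow-list holding only the example DN from the text, and the key pattern (digits, then G, then digits) is an assumption inferred from the example keys 212G3 and 373G2.

```python
import re

# Hypothetical allow-list: certificate DN -> grid ID it may publish for.
ALLOWED_PUBLISHERS = {
    "/C=UK/O=eScience/OU=CLRC/L=RAL/CN=/Regional GOCDB": "G2",
}

# Assumed primary key format: numeric part followed by the grid ID.
KEY_PATTERN = re.compile(r"^\d+(G\d+)$")


def grid_id_of(primary_key):
    """Return the grid ID suffix of a primary key, or None if malformed."""
    match = KEY_PATTERN.match(primary_key)
    return match.group(1) if match else None


def may_publish(dn, primary_key):
    """True if the DN is allowed and the key carries that DN's grid ID."""
    allowed = ALLOWED_PUBLISHERS.get(dn)
    return allowed is not None and grid_id_of(primary_key) == allowed
```

Matching the whole suffix, and not only the trailing characters, keeps a key such as 373G22 from being accepted for grid ID G2.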