GOCDB/Release4/Development/MultipleEndpointsPerService
Multiple Endpoints
Introduction
Define multiple Endpoints per Service. Original requirements:
- RT: https://rt.egi.eu/rt/Ticket/Display.html?id=3347
- GGUS: https://ggus.eu/ws/ticket_info.php?ticket=93966
- JRA1 discussion and detailed proposal: https://indico.egi.eu/indico/conferenceDisplay.py?confId=1922
- https://indico.cern.ch/getFile.py/access?resId=1&materialId=minutes&confId=278660
GOCDB could be extended so that a single service could define multiple Endpoint objects to allow:
- Multiple GRIS endpoint URLs per service. This will allow the Top-BDII to directly retrieve information about the endpoint.
- Separating tape from disk endpoints in a single SRM service. This is required to put selected endpoints of the service into downtime.
To address these requirements, we propose adding support for multiple <ENDPOINT> objects/elements per Service. Each Endpoint can define a different URL that is related to the service. For example, one Endpoint could define the actual service URL, a second could define the GRIS LDAP URL, and a third could define an admin portal URL for the service (and so on). The Endpoint.InterfaceName (taken from GLUE2) would be used to distinguish between the different service endpoint types.
Implementation
- Add zero or more <ENDPOINT> objects per service (based on the GLUE2 Endpoint entity).
- When creating a downtime, an extra step would be required to select *which* Endpoints from the selected Services should be put into downtime (all, some or zero). All the services' endpoints would be selected by default allowing specific endpoints to be deselected as required.
- The GOCDB-PI output would be extended for nesting multiple child <ENDPOINT> elements (see below).
- Each <ENDPOINT> nests child <URL> and <InterfaceName> elements (and other GLUE2 attributes).
- Core InterfaceName values would be documented similarly to service types.
- It would be enforced that a Service always has at least one endpoint: deletion of a Service's last endpoint would be disallowed.
- We need to maintain backward compatibility for EGI/SAM by maintaining a single/root 'Service->URL' value that is rendered in the PI:
- To maintain the single root Service->URL, we need a one time DB update:
- Create a new one-to-many join between Service and Downtime and for each existing Service, populate this relationship for the existing data.
- For each Service entity create a new URL field (Service->URL) and copy up the existing (single/hidden) endpoint->url value into this field.
- When a new service is created, a single endpoint is always created (the endpoint->interfaceName would be the same as the serviceType):
- If the user provides a Service->URL value when creating the Service, this value is also used to seed the endpoint->url.
- If the user doesn’t provide a Service->URL value (which is perfectly valid), we add an empty endpoint with no url (this empty endpoint is still required so we can link downtimes to the service).
- At least one endpoint is required per Service (disallow deletion of Service's last endpoint)
- Endpoints can be deleted from a service, even the endpoints that are linked to (historic) downtimes. This removes the Endpoint-to-Downtime association. However, the Service will always remain linked to the Downtime because of the direct Service-to-Downtime association.
- When a Service is deleted, associations with all linked downtimes are also deleted. In addition, downtimes that are orphaned are also deleted (downtimes that are linked to other services are not deleted).
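The rules in the list above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not the GOCDB implementation: the class names, the `declare_downtime` and `delete_service` helpers, and the `registry` list standing in for the database are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare by identity; avoids recursing through links
class Endpoint:
    interface_name: str
    url: Optional[str] = None


@dataclass(eq=False)
class Downtime:
    description: str
    services: List["Service"] = field(default_factory=list)
    endpoints: List[Endpoint] = field(default_factory=list)


@dataclass(eq=False)
class Service:
    service_type: str
    url: Optional[str] = None
    endpoints: List[Endpoint] = field(default_factory=list)
    downtimes: List[Downtime] = field(default_factory=list)

    def __post_init__(self):
        # A new service always gets one endpoint: its interfaceName is the
        # service type and its URL is seeded from Service->URL (may be empty).
        if not self.endpoints:
            self.endpoints.append(Endpoint(self.service_type, self.url))

    def remove_endpoint(self, endpoint: Endpoint) -> None:
        if len(self.endpoints) <= 1:
            raise ValueError("a Service must keep at least one endpoint")
        self.endpoints.remove(endpoint)
        # Drop Endpoint-to-Downtime links; Service-to-Downtime links stay.
        for downtime in self.downtimes:
            if endpoint in downtime.endpoints:
                downtime.endpoints.remove(endpoint)


def declare_downtime(description, selections, registry):
    """selections is a list of (Service, [selected Endpoints]) pairs."""
    downtime = Downtime(description)
    for service, endpoints in selections:
        downtime.services.append(service)
        downtime.endpoints.extend(endpoints)
        service.downtimes.append(downtime)
    registry.append(downtime)
    return downtime


def delete_service(service, registry):
    """Unlink a service from its downtimes and delete orphaned downtimes."""
    for downtime in list(service.downtimes):
        downtime.services.remove(service)
        downtime.endpoints = [
            e for e in downtime.endpoints if e not in service.endpoints
        ]
        if not downtime.services:  # no other service is linked
            registry.remove(downtime)
    service.downtimes.clear()
```

Keeping a direct Service-to-Downtime link alongside the Endpoint-to-Downtime link is what lets an endpoint be deleted without losing the downtime history of its service.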
Testing
- Deployed on gocdb-test instance for testing (note, gocdb-test data is stale):
- PI: https://gocdb-test.esc.rl.ac.uk/gocdbpi/public/?method=get_service_group (substitute PI method)
- Portal: https://gocdb-test.esc.rl.ac.uk/portal
- Period of acceptance testing required first before deployment to production
PI Changes
- Changes to PI methods and output are detailed below.
get_service_endpoint, get_service
- get_service is an alias to get_service_endpoint
- Current XML output: https://wiki.egi.eu/wiki/GOCDB/PI/get_service_endpoint_method
- Rename the root <SERVICE_ENDPOINT> element to <SERVICE>.
- Add new <ENDPOINTS/> element
<?xml version="1.0" encoding="UTF-8"?>
<results>
<SERVICE PRIMARY_KEY="50257G0"> <!-- RENAME. <SERVICE_ENDPOINT> element to <SERVICE> -->
<PRIMARY_KEY>50257G0</PRIMARY_KEY>
<HOSTNAME>dgiref-globus.fzk.de</HOSTNAME>
<GOCDB_PORTAL_URL>https://goc.egi.eu/portal/....</GOCDB_PORTAL_URL>
<HOST_OS>SL5</HOST_OS>
<BETA>N</BETA>
<SERVICE_TYPE>SRM</SERVICE_TYPE>
<CORE></CORE>
<IN_PRODUCTION>Y</IN_PRODUCTION>
<NODE_MONITORED>Y</NODE_MONITORED>
<SITENAME>SomeSITE</SITENAME>
<COUNTRY_NAME>SomeLand</COUNTRY_NAME>
<COUNTRY_CODE>XX</COUNTRY_CODE>
<ROC_NAME>NGI_XX</ROC_NAME>
<URL>https://some.serviceurl.eu:8443/services/se</URL> <!-- Maintain existing SE URL element for SAM/OpsPortal notifications -->
<ENDPOINTS> <!-- NEW. <ENDPOINTS> element wraps zero or more new <ENDPOINT> elements per Service -->
<ENDPOINT>
<ID>1234</ID>
<NAME>endpointname</NAME>
<EXTENSIONS/>
<URL>ldap://sBDII.grid.openu.ac.il:2170/mds-vo-name=LCG-IL-OU,o=grid</URL>
<InterfaceName>RIS</InterfaceName>
<!-- To Add new GLUE2 type attributes here -->
</ENDPOINT>
<ENDPOINT>
<ID>1234</ID>
<NAME>endpointname</NAME>
<EXTENSIONS/>
<URL>some.srm.nearline.url</URL>
<InterfaceName>SRM.nearline</InterfaceName>
<!-- To Add new GLUE2 type attributes here -->
</ENDPOINT>
<ENDPOINT>
<ID>1234</ID>
<NAME>endpointname</NAME>
<EXTENSIONS/>
<URL>some.srm.online.url</URL>
<InterfaceName>SRM.online</InterfaceName>
<!-- To Add new GLUE2 type attributes here -->
</ENDPOINT>
</ENDPOINTS>
<EXTENSIONS>
<EXTENSION>
<LOCAL_ID>1</LOCAL_ID>
<KEY>TEST_CHARGE</KEY>
<VALUE>10</VALUE>
</EXTENSION>
</EXTENSIONS>
</SERVICE>
</results>
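As a rough illustration of how a client such as the Top-BDII might consume the proposed output, the sketch below looks up endpoint URLs by InterfaceName using Python's standard `xml.etree.ElementTree`. The `endpoint_urls` helper is hypothetical, and `SAMPLE` is a trimmed copy of the proposed XML above, not real PI output.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the proposed get_service output.
SAMPLE = """<results>
  <SERVICE PRIMARY_KEY="50257G0">
    <HOSTNAME>dgiref-globus.fzk.de</HOSTNAME>
    <SERVICE_TYPE>SRM</SERVICE_TYPE>
    <URL>https://some.serviceurl.eu:8443/services/se</URL>
    <ENDPOINTS>
      <ENDPOINT>
        <URL>ldap://sBDII.grid.openu.ac.il:2170/mds-vo-name=LCG-IL-OU,o=grid</URL>
        <InterfaceName>RIS</InterfaceName>
      </ENDPOINT>
      <ENDPOINT>
        <URL>some.srm.nearline.url</URL>
        <InterfaceName>SRM.nearline</InterfaceName>
      </ENDPOINT>
      <ENDPOINT>
        <URL>some.srm.online.url</URL>
        <InterfaceName>SRM.online</InterfaceName>
      </ENDPOINT>
    </ENDPOINTS>
  </SERVICE>
</results>"""


def endpoint_urls(xml_text, interface_name):
    """Map each service PRIMARY_KEY to the URLs of its endpoints of one type."""
    root = ET.fromstring(xml_text)
    urls = {}
    for service in root.findall("SERVICE"):
        urls[service.get("PRIMARY_KEY")] = [
            ep.findtext("URL")
            for ep in service.findall("ENDPOINTS/ENDPOINT")
            if ep.findtext("InterfaceName") == interface_name
        ]
    return urls


# Look up the GRIS URLs of every service in the result set.
print(endpoint_urls(SAMPLE, "RIS"))
```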
get_downtime
- Current XML output: https://wiki.egi.eu/wiki/GOCDB/PI/get_downtime_method
- Rename Downtime element's <ENDPOINT> element to <SE_ENDPOINT>
- Add New <AFFECTED_ENDPOINTS/> element
<?xml version="1.0" encoding="UTF-8"?>
<results>
<DOWNTIME ID="1" PRIMARY_KEY="10G0" CLASSIFICATION="UNSCHEDULED">
<PRIMARY_KEY>10G0</PRIMARY_KEY>
<HOSTNAME>somewhere.dl.ac.uk</HOSTNAME>
<SERVICE_TYPE>SE2</SERVICE_TYPE>
<SE_ENDPOINT>service.dl.ac.ukSE2</SE_ENDPOINT> <!-- RENAME. <ENDPOINT> to <SE_ENDPOINT> -->
<HOSTED_BY>TestSite</HOSTED_BY>
<GOCDB_PORTAL_URL>https://goc.egi.eu/portal/index.php?Page_Type=Downtime&amp;id=1</GOCDB_PORTAL_URL>
<AFFECTED_ENDPOINTS> <!-- NEW. All SEs affected by DT (*may not* inc. all endpoints of a service) -->
<ENDPOINT>
<ID>1234</ID>
<NAME>endpointname</NAME>
<EXTENSIONS/>
<URL>some.srm.nearline.url</URL>
<InterfaceName>SRM.nearline</InterfaceName>
<!-- To Add new GLUE2 type attributes here -->
</ENDPOINT>
<ENDPOINT>
<ID>1234</ID>
<NAME>endpointname</NAME>
<EXTENSIONS/>
<URL>some.srm.online.url</URL>
<InterfaceName>SRM.online</InterfaceName>
<!-- To Add new GLUE2 type attributes here -->
</ENDPOINT>
</AFFECTED_ENDPOINTS>
<SEVERITY>OUTAGE</SEVERITY>
<DESCRIPTION>sample</DESCRIPTION>
<INSERT_DATE>1384526939</INSERT_DATE>
<START_DATE>1384531200</START_DATE>
<END_DATE>1384621200</END_DATE>
<FORMATED_START_DATE>2013-11-15 16:00</FORMATED_START_DATE>
<FORMATED_END_DATE>2013-11-16 17:00</FORMATED_END_DATE>
</DOWNTIME>
</results>
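A consumer of the proposed get_downtime output could read the renamed <SE_ENDPOINT> value together with the new <AFFECTED_ENDPOINTS> list roughly as follows. This is a minimal sketch: `affected_interfaces` is a hypothetical helper and `SAMPLE` is a trimmed copy of the XML above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the proposed get_downtime output.
SAMPLE = """<results>
  <DOWNTIME ID="1" PRIMARY_KEY="10G0" CLASSIFICATION="UNSCHEDULED">
    <HOSTNAME>somewhere.dl.ac.uk</HOSTNAME>
    <SERVICE_TYPE>SE2</SERVICE_TYPE>
    <SE_ENDPOINT>service.dl.ac.ukSE2</SE_ENDPOINT>
    <AFFECTED_ENDPOINTS>
      <ENDPOINT>
        <URL>some.srm.nearline.url</URL>
        <InterfaceName>SRM.nearline</InterfaceName>
      </ENDPOINT>
      <ENDPOINT>
        <URL>some.srm.online.url</URL>
        <InterfaceName>SRM.online</InterfaceName>
      </ENDPOINT>
    </AFFECTED_ENDPOINTS>
    <SEVERITY>OUTAGE</SEVERITY>
  </DOWNTIME>
</results>"""


def affected_interfaces(xml_text):
    """Return {downtime PRIMARY_KEY: (SE_ENDPOINT, [InterfaceName, ...])}."""
    root = ET.fromstring(xml_text)
    result = {}
    for downtime in root.findall("DOWNTIME"):
        names = [
            ep.findtext("InterfaceName")
            for ep in downtime.findall("AFFECTED_ENDPOINTS/ENDPOINT")
        ]
        result[downtime.get("PRIMARY_KEY")] = (
            downtime.findtext("SE_ENDPOINT"),
            names,
        )
    return result


print(affected_interfaces(SAMPLE))
```

Because the list may not include every endpoint of the service, a client should not assume the whole service is down just because <SE_ENDPOINT> is present.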
get_downtime_to_broadcast
- Current XML output: https://wiki.egi.eu/wiki/GOCDB/PI/get_downtime_to_broadcast_method
- Add New <AFFECTED_ENDPOINTS> element
<?xml version="1.0"?>
<results>
<DOWNTIME ID="57205437" PRIMARY_KEY="14101G0" CLASSIFICATION="SCHEDULED">
<PRIMARY_KEY>14101G0</PRIMARY_KEY>
<SITENAME>wuppertalprod</SITENAME>
<HOSTNAME/>
<SERVICE_TYPE/>
<HOSTED_BY>TestSite</HOSTED_BY>
<SEVERITY>OUTAGE</SEVERITY>
<DESCRIPTION>dCache upgrade</DESCRIPTION>
<GOCDB_PORTAL_URL>https://goc.egi.eu/portal/index.php?Page_Type=Downtime&amp;id=1</GOCDB_PORTAL_URL>
<AFFECTED_ENDPOINTS> <!-- NEW. All SEs affected by DT (*may not* inc. all endpoints of a service) -->
<ENDPOINT>
<ID>1234</ID>
<NAME>endpointname</NAME>
<EXTENSIONS/>
<URL>some.srm.nearline.url</URL>
<InterfaceName>SRM.nearline</InterfaceName>
<!-- To Add new GLUE2 type attributes here -->
</ENDPOINT>
<ENDPOINT>
<ID>1234</ID>
<NAME>endpointname</NAME>
<EXTENSIONS/>
<URL>some.srm.online.url</URL>
<InterfaceName>SRM.online</InterfaceName>
<!-- To Add new GLUE2 type attributes here -->
</ENDPOINT>
</AFFECTED_ENDPOINTS>
<INSERT_DATE>1263908942</INSERT_DATE>
<START_DATE>1264154400</START_DATE>
<END_DATE>1264158000</END_DATE>
<REMINDER_START_DOWNTIME>3155760000</REMINDER_START_DOWNTIME>
<BROADCASTING_START_DOWNTIME/>
</DOWNTIME>
</results>
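To show how a notification tool might use the extra element, the following sketch turns the proposed get_downtime_to_broadcast output into plain-text lines that name each affected endpoint. The `notification_lines` helper and its message layout are assumptions made for illustration. They do not describe the Ops portal's actual module.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the proposed get_downtime_to_broadcast output.
SAMPLE = """<results>
  <DOWNTIME ID="57205437" PRIMARY_KEY="14101G0" CLASSIFICATION="SCHEDULED">
    <SITENAME>wuppertalprod</SITENAME>
    <SEVERITY>OUTAGE</SEVERITY>
    <DESCRIPTION>dCache upgrade</DESCRIPTION>
    <AFFECTED_ENDPOINTS>
      <ENDPOINT>
        <URL>some.srm.nearline.url</URL>
        <InterfaceName>SRM.nearline</InterfaceName>
      </ENDPOINT>
      <ENDPOINT>
        <URL>some.srm.online.url</URL>
        <InterfaceName>SRM.online</InterfaceName>
      </ENDPOINT>
    </AFFECTED_ENDPOINTS>
  </DOWNTIME>
</results>"""


def notification_lines(xml_text):
    """Build one summary line per downtime plus one line per endpoint."""
    root = ET.fromstring(xml_text)
    lines = []
    for downtime in root.findall("DOWNTIME"):
        lines.append(
            "{} {} at {}: {}".format(
                downtime.get("CLASSIFICATION"),
                downtime.findtext("SEVERITY"),
                downtime.findtext("SITENAME"),
                downtime.findtext("DESCRIPTION"),
            )
        )
        for ep in downtime.findall("AFFECTED_ENDPOINTS/ENDPOINT"):
            lines.append(
                "  - {} ({})".format(
                    ep.findtext("InterfaceName"), ep.findtext("URL")
                )
            )
    return lines


print("\n".join(notification_lines(SAMPLE)))
```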
get_downtime_nested_services
- Current XML output: https://wiki.egi.eu/wiki/GOCDB/PI/get_downtime_nested_services_method
- Rename Service's <ENDPOINT> element to <SE_ENDPOINT>
- Add New <AFFECTED_ENDPOINTS> to Service
<?xml version="1.0" encoding="UTF-8"?>
<results>
<DOWNTIME ID="1" PRIMARY_KEY="10G0" CLASSIFICATION="UNSCHEDULED">
<SEVERITY>OUTAGE</SEVERITY>
<DESCRIPTION>sample</DESCRIPTION>
<INSERT_DATE>1384526939</INSERT_DATE>
<START_DATE>1384531200</START_DATE>
<END_DATE>1384621200</END_DATE>
<FORMATED_START_DATE>2013-11-15 16:00</FORMATED_START_DATE>
<FORMATED_END_DATE>2013-11-16 17:00</FORMATED_END_DATE>
<GOCDB_PORTAL_URL>https://goc.egi.eu/portal/index.php?Page_Type=Downtime&amp;id=1</GOCDB_PORTAL_URL>
<SERVICES>
<SERVICE>
<PRIMARY_KEY>1</PRIMARY_KEY>
<HOSTNAME>somehost.dl.ac.uk</HOSTNAME>
<SERVICE_TYPE>SE2</SERVICE_TYPE>
<SE_ENDPOINT>service.dl.ac.ukSE2</SE_ENDPOINT> <!-- RENAME. <ENDPOINT> to <SE_ENDPOINT> -->
<HOSTED_BY>TestSite</HOSTED_BY>
<AFFECTED_ENDPOINTS> <!-- NEW. SEs affected by DT (*may not* inc. all endpoints of a service) -->
<ENDPOINT>
<ID>1234</ID>
<NAME>endpointname</NAME>
<EXTENSIONS/>
<URL>some.srm.nearline.url</URL>
<InterfaceName>SRM.nearline</InterfaceName>
<!-- To Add new GLUE2 type attributes here -->
</ENDPOINT>
<ENDPOINT>
<ID>1234</ID>
<NAME>endpointname</NAME>
<EXTENSIONS/>
<URL>some.srm.online.url</URL>
<InterfaceName>SRM.online</InterfaceName>
<!-- To Add new GLUE2 type attributes here -->
</ENDPOINT>
</AFFECTED_ENDPOINTS>
</SERVICE>
<SERVICE>
<PRIMARY_KEY>2</PRIMARY_KEY>
<HOSTNAME>somehost2.dl.ac.uk</HOSTNAME>
<SERVICE_TYPE>SE2</SERVICE_TYPE>
<SE_ENDPOINT>somehost2.dl.ac.ukSE2</SE_ENDPOINT> <!-- RENAME. <ENDPOINT> to <SE_ENDPOINT> -->
<HOSTED_BY>TestSite2</HOSTED_BY>
<AFFECTED_ENDPOINTS> <!-- NEW. All SEs affected by DT (*may not* inc. all endpoints of a service)-->
<ENDPOINT>
<ID>1234</ID>
<NAME>endpointname</NAME>
<EXTENSIONS/>
<URL>some.srm.online.url</URL>
<InterfaceName>SRM.online</InterfaceName>
<!-- To Add new GLUE2 type attributes here -->
</ENDPOINT>
</AFFECTED_ENDPOINTS>
</SERVICE>
</SERVICES>
</DOWNTIME>
</results>
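With endpoints nested per service, a monitoring client could check whether one specific endpoint of a host is covered by a downtime at a given time. The sketch below assumes the proposed structure above. `endpoint_in_downtime` is a hypothetical helper and `SAMPLE` is a trimmed copy of the XML.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the proposed get_downtime_nested_services output.
SAMPLE = """<results>
  <DOWNTIME ID="1" PRIMARY_KEY="10G0" CLASSIFICATION="UNSCHEDULED">
    <SEVERITY>OUTAGE</SEVERITY>
    <START_DATE>1384531200</START_DATE>
    <END_DATE>1384621200</END_DATE>
    <SERVICES>
      <SERVICE>
        <HOSTNAME>somehost.dl.ac.uk</HOSTNAME>
        <AFFECTED_ENDPOINTS>
          <ENDPOINT><InterfaceName>SRM.nearline</InterfaceName></ENDPOINT>
          <ENDPOINT><InterfaceName>SRM.online</InterfaceName></ENDPOINT>
        </AFFECTED_ENDPOINTS>
      </SERVICE>
      <SERVICE>
        <HOSTNAME>somehost2.dl.ac.uk</HOSTNAME>
        <AFFECTED_ENDPOINTS>
          <ENDPOINT><InterfaceName>SRM.online</InterfaceName></ENDPOINT>
        </AFFECTED_ENDPOINTS>
      </SERVICE>
    </SERVICES>
  </DOWNTIME>
</results>"""


def endpoint_in_downtime(xml_text, hostname, interface_name, when):
    """True if the host's endpoint is listed in a downtime active at `when`.

    `when` is a Unix timestamp, matching START_DATE and END_DATE.
    """
    root = ET.fromstring(xml_text)
    for downtime in root.findall("DOWNTIME"):
        start = int(downtime.findtext("START_DATE"))
        end = int(downtime.findtext("END_DATE"))
        if not start <= when <= end:
            continue
        for service in downtime.findall("SERVICES/SERVICE"):
            if service.findtext("HOSTNAME") != hostname:
                continue
            names = [
                ep.findtext("InterfaceName")
                for ep in service.findall("AFFECTED_ENDPOINTS/ENDPOINT")
            ]
            if interface_name in names:
                return True
    return False


# The second host has only its online endpoint in downtime.
print(endpoint_in_downtime(SAMPLE, "somehost2.dl.ac.uk", "SRM.nearline", 1384600000))
```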
get_service_group
- Current XML output: https://wiki.egi.eu/wiki/GOCDB/PI/get_service_group
<?xml version="1.0" encoding="UTF-8"?>
<results>
<SERVICE_GROUP PRIMARY_KEY="57654G0">
<NAME>OPSTOOLS</NAME>
<DESCRIPTION>All EGI Operational Tools</DESCRIPTION>
<MONITORED>Y</MONITORED>
<CONTACT_EMAIL>gocdb-admins@mailtalk.ac.uk</CONTACT_EMAIL>
<GOCDB_PORTAL_URL> https://elided </GOCDB_PORTAL_URL>
<SERVICE> <!-- RENAME. <SERVICE_ENDPOINT> to <SERVICE> -->
<HOSTNAME>goc.egi.eu</HOSTNAME>
<GOCDB_PORTAL_URL>https://elided</GOCDB_PORTAL_URL>
<SERVICE_TYPE>egi.GOCDB</SERVICE_TYPE>
<HOST_IP/>
<HOSTDN>/C=UK/O=eScience/OU=CLRC/L=RAL/CN=goc.egi.eu</HOSTDN>
<IN_PRODUCTION>Y</IN_PRODUCTION>
<NODE_MONITORED>Y</NODE_MONITORED>
<ENDPOINTS> <!-- NEW. <ENDPOINTS> wraps service's <ENDPOINT>s-->
<ENDPOINT>
<URL>some url </URL>
<InterfaceName>RIS</InterfaceName>
<EXTENSIONS/>
</ENDPOINT>
<ENDPOINT>
<URL>some endpoint url</URL>
<InterfaceName>eg.SRM.nearline</InterfaceName>
<EXTENSIONS/>
</ENDPOINT>
<ENDPOINT>
<URL>some endpoint url</URL>
<InterfaceName>eg.SRM.online</InterfaceName>
<EXTENSIONS/>
</ENDPOINT>
</ENDPOINTS>
<EXTENSIONS/>
</SERVICE>
<SERVICE>
...elided...
</SERVICE>
<EXTENSIONS/>
</SERVICE_GROUP>
</results>
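Client tools will probably need to cope with both the current and the proposed get_service_group output during a transition period. A minimal sketch of such a tolerant reader is given below, assuming the rename from <SERVICE_ENDPOINT> to <SERVICE> shown above. `group_services` is a hypothetical helper, and the `OLD` and `NEW` documents are shortened examples.

```python
import xml.etree.ElementTree as ET

# Current style: <SERVICE_ENDPOINT> children and no <ENDPOINTS> wrapper.
OLD = """<results>
  <SERVICE_GROUP PRIMARY_KEY="57654G0">
    <NAME>OPSTOOLS</NAME>
    <SERVICE_ENDPOINT>
      <HOSTNAME>goc.egi.eu</HOSTNAME>
    </SERVICE_ENDPOINT>
  </SERVICE_GROUP>
</results>"""

# Proposed style: <SERVICE> children with nested <ENDPOINTS>.
NEW = """<results>
  <SERVICE_GROUP PRIMARY_KEY="57654G0">
    <NAME>OPSTOOLS</NAME>
    <SERVICE>
      <HOSTNAME>goc.egi.eu</HOSTNAME>
      <ENDPOINTS>
        <ENDPOINT>
          <URL>some url </URL>
          <InterfaceName>RIS</InterfaceName>
        </ENDPOINT>
        <ENDPOINT>
          <URL>some endpoint url</URL>
          <InterfaceName>eg.SRM.nearline</InterfaceName>
        </ENDPOINT>
      </ENDPOINTS>
    </SERVICE>
  </SERVICE_GROUP>
</results>"""


def group_services(xml_text):
    """Return (group name, hostname, [(InterfaceName, URL), ...]) tuples."""
    root = ET.fromstring(xml_text)
    services = []
    for group in root.findall("SERVICE_GROUP"):
        # Accept both the new and the old element name.
        nodes = group.findall("SERVICE") + group.findall("SERVICE_ENDPOINT")
        for node in nodes:
            endpoints = [
                (
                    (ep.findtext("InterfaceName") or "").strip(),
                    (ep.findtext("URL") or "").strip(),
                )
                for ep in node.findall("ENDPOINTS/ENDPOINT")
            ]
            services.append(
                (group.findtext("NAME"), node.findtext("HOSTNAME"), endpoints)
            )
    return services


print(group_services(OLD))
print(group_services(NEW))
```

With the old output the endpoint list is simply empty, so callers can treat a missing <ENDPOINTS> element the same way as a service without listed endpoints.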
Impact on other tools
- Tools such as SAM and Ops portal would also need to support multiple endpoints, i.e. for monitoring and downtime notification (an implementation in GOCDB alone wouldn’t be much use).
- We have received a mail from Marian Babik (Wed 09/05/2012 17:09) explaining that SAM needs time to develop support for multiple endpoints before this can be introduced in GOCDB.
- For a detailed discussion, see the minutes of the JRA1 meeting: https://indico.egi.eu/indico/conferenceDisplay.py?confId=1922
Monitoring
SAM has to be updated to run tests against endpoints rather than services. For each service, SAM has to read all the endpoints registered in GOCDB and run the suitable tests according to the endpoint type (i.e. Endpoint.InterfaceName). It is not clear how service availability should be computed. For example, how should we treat a service with one endpoint running and another in downtime? Should each endpoint be considered a service for availability purposes? A discussion inside EGI.eu is therefore needed to decide how to handle multiple endpoints per service in the availability computation.
Ops portal
The new feature has an impact on the downtime notification module. The Ops portal product team stated that this is a legacy module that would have to be rewritten to support the change. Considering that adding these activities to the product roadmaps is not easy at this stage (mainly for SAM, which is involved in the migration of central services), we would like to know the EGI.eu position on this and the priority we should give to this requirement.