New Availability Reporting
Use case 1: NGI availability
We would like to extend the availability OPS reporting system to measure the performance of the services operated by an NGI. For example:
- the VOMS service
- the top-BDII service
- the WMS service
- the operational services, including:
  - the NGI SAM service
  - the accounting portal and repositories (where available)
  - the NGI operations dashboard (where available)
  - the NGI helpdesk (where available)
VOMS, top-BDII, WMS etc., when deployed in cluster mode, are each a logical service comprising N physical instances (for top-BDII: tB1, tB2, ..., tBN):
- each instance is potentially deployed at a different physical site.
- the node hosting an instance can be part of a different NGI. For example: the SAM service of NGI_CH is actually operated by NGI_DE.
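The relationship between a logical service and its physical instances can be sketched as a small data model. This is only an illustration: the class names, field names, host name and site name below are hypothetical and are not taken from GOCDB or SAM.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhysicalInstance:
    hostname: str     # hypothetical host name
    site: str         # physical site hosting the instance (hypothetical)
    operated_by: str  # NGI that actually runs the node


@dataclass
class LogicalService:
    service_type: str  # e.g. "SAM", "top-BDII", "VOMS"
    reported_for: str  # NGI whose availability report includes this service
    instances: List[PhysicalInstance] = field(default_factory=list)

    def external_operators(self) -> List[str]:
        """NGIs other than the reporting NGI that operate at least one instance."""
        return sorted(
            {i.operated_by for i in self.instances if i.operated_by != self.reported_for}
        )


# The SAM service of NGI_CH, whose single instance is operated by NGI_DE.
sam_ch = LogicalService(
    service_type="SAM",
    reported_for="NGI_CH",
    instances=[PhysicalInstance("sam.example.org", "SITE-A", "NGI_DE")],
)
```

Keeping the reporting NGI separate from the operating NGI lets the service count towards NGI_CH even though no instance runs inside NGI_CH.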
NGI middleware availability
The NGI middleware logical site includes all core middleware services operated by the NGI (WMS, top-BDII, VOMS etc.), regardless of their physical location.
For example, top-BDII is OK IF (tB1 is OK) OR (tB2 is OK) OR ... OR (tBN is OK).
The NGI middleware service is UP IF (VOMS is UP) AND (top-BDII is UP) AND ... AND (WMS is UP).
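The two rules above (OR across the physical instances of a service, AND across the services of a logical site) can be sketched as follows. The status snapshot is invented for illustration, and the function names are hypothetical rather than part of any existing reporting tool.

```python
from typing import Dict, List

# Hypothetical snapshot: logical service -> status of each physical instance
# (True means the instance is OK). Values are illustrative only.
instances: Dict[str, List[bool]] = {
    "VOMS": [True],
    "top-BDII": [False, True, False],  # tB1, tB2, tB3
    "WMS": [True, True],
}


def service_is_up(instance_states: List[bool]) -> bool:
    """A logical service is UP if at least one physical instance is OK (OR)."""
    return any(instance_states)


def logical_site_is_up(services: Dict[str, List[bool]]) -> bool:
    """A logical site is UP only if every logical service is UP (AND)."""
    return all(service_is_up(states) for states in services.values())


middleware_up = logical_site_is_up(instances)
```

In this snapshot top-BDII counts as UP because tB2 is OK, so the middleware logical site is UP; if every instance of any one service failed, the whole logical site would be reported as down.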
NGI operations tool availability
The NGI operations logical site includes all operations tools operated by the NGI: helpdesk, SAM, ops dashboard etc.
The NGI operations service is UP IF (Helpdesk is UP) AND ... AND (SAM is UP).
Use case 2: EGI.eu availability
We would like to measure the overall availability of EGI.eu services.
For example, the EGI.eu operations service is UP IF (GGUS is UP) AND (Operations Portal is UP) AND ... AND (GOCDB is UP).
An NGI is UP IF (its operations service is UP) AND (its middleware service is UP).
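As a rough sketch of how the combined NGI status could feed an availability figure, the example below assumes that status is sampled at regular intervals and that availability is the fraction of samples in which the NGI is UP. That definition, the sample data and the function names are assumptions made for illustration; the actual computation used by the reporting system may differ (for example in how unknown or scheduled-downtime periods are treated).

```python
from typing import List, Tuple

# Hypothetical samples taken at regular intervals:
# (operations service is UP, middleware service is UP)
samples: List[Tuple[bool, bool]] = [
    (True, True),
    (True, False),
    (True, True),
    (False, True),
]


def ngi_is_up(operations_up: bool, middleware_up: bool) -> bool:
    """An NGI is UP only if both logical sites are UP."""
    return operations_up and middleware_up


def availability(status_samples: List[Tuple[bool, bool]]) -> float:
    """Fraction of samples in which the NGI is UP (assumed definition)."""
    if not status_samples:
        raise ValueError("at least one sample is required")
    up_count = sum(1 for ops, mw in status_samples if ngi_is_up(ops, mw))
    return up_count / len(status_samples)


ngi_availability = availability(samples)  # 2 of the 4 samples are UP
```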
Use case 3
The "logical" site concept could also be used to represent a distributed Resource Centre (like the NDGF T1). At the moment it is a single site in GOCDB associated with country X. This use case is mentioned here for the record of the discussion. It is not a critical use case for the moment.