Cloud SAM tests



IMPORTANT: The description of metrics is no longer maintained on this page. Please use POEM directly to find the relevant information:

This article is Deprecated and should no longer be used, but is still available for reasons of reference.



This table lists the tests used for monitoring fedcloud resources. Tests are executed on the central ARGO instances:

Alarms for these tests are opened directly in the Operations Portal Dashboard. The POEM profile used on these instances is CLOUD-MON.

Nagios test Frequency Description
eu.egi.*-IGTF 24 hours The probe checks the CA distribution installed at a TLS-enabled endpoint by retrieving the list of available CA DNs. This list is compared to the valid list (http://repository.egi.eu/sw/production/cas/1/current/meta/ca-policy-egi-core.subjectdn) and the obsolete list (http://repository.egi.eu/sw/production/cas/1/current/meta/ca-policy-egi-core.obsoleted-subjectdn), for both the current and the previous official versions. If the previous version is detected, the probe returns CRITICAL if the official version is older than 8 days and WARNING if it is older than 3 days. If discrepancies are found with both the previous and the current official versions, the probe returns CRITICAL. If the endpoint is not TLS-enabled, the probe returns OK.

For OCCI endpoints, the probe checks whether the endpoint returns an HTTP 401 Unauthorized response with the WWW-Authenticate header set to the appropriate Keystone server. The CA check is then performed against that Keystone server. This modification was needed because some endpoints use HTTPS but do not require client authentication, and therefore do not return the list of DNs.
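The decision rules described for the IGTF probe can be sketched as follows. This is a minimal illustration rather than the probe's actual code: the function name `igtf_status`, its parameters, and the reading of "older than N days" as the age of the current official release are assumptions. Only the thresholds and the resulting states come from the description above.

```python
from datetime import timedelta

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def igtf_status(tls_enabled, matches_current, matches_previous, release_age):
    """Map the outcome of the CA DN comparison to a Nagios state.

    All parameter names are hypothetical:
      tls_enabled      -- whether the endpoint speaks TLS at all
      matches_current  -- CA DNs agree with the current official lists
      matches_previous -- CA DNs agree with the previous official lists
      release_age      -- timedelta since the current official release
    """
    if not tls_enabled:
        return OK
    if matches_current:
        return OK
    if matches_previous:
        if release_age > timedelta(days=8):
            return CRITICAL
        if release_age > timedelta(days=3):
            return WARNING
        # Assumption: a previous version inside the 3-day window is tolerated.
        return OK
    # Discrepancies with both the previous and the current version.
    return CRITICAL
```

For example, `igtf_status(True, False, True, timedelta(days=5))` evaluates to `WARNING` under these assumptions.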

eu.egi.cloud.APEL-Pub 12 hours The check looks at http://goc-accounting.grid-support.ac.uk/cloudtest/cloudsites2.html and verifies that the site is listed there. It also checks the lastupdate field and raises:
  • WARNING: if lastupdate is older than 7 days
  • CRITICAL: if lastupdate is older than 30 days

When searching the web page, the probe uses the name provided in the URL of the GOCDB entry.
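The lastupdate thresholds can be illustrated with a short sketch. The function name `apel_pub_status` and its arguments are hypothetical, and the sketch covers only the age check; how the probe parses the page, and what it reports when a site is not listed, is not described here and is left out.

```python
from datetime import datetime, timedelta

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def apel_pub_status(last_update, now):
    """Return a Nagios state from the age of the lastupdate field."""
    age = now - last_update
    if age > timedelta(days=30):
        return CRITICAL
    if age > timedelta(days=7):
        return WARNING
    return OK


# Usage with made-up timestamps:
now = datetime(2020, 3, 1)
status = apel_pub_status(datetime(2020, 2, 20), now)  # 10 days old
```

The CRITICAL threshold is tested first so that a record older than 30 days is not reported as a mere WARNING.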

eu.egi.cloud.CDMI-CRUD 1 h First, the endpoint is asked for a nearby Keystone server. It returns an HTTP 401 Unauthorized response with the WWW-Authenticate header set to the appropriate Keystone server, which is then asked for a scoped Keystone token for the “ops” tenant. The token is used in further HTTP queries on the endpoint that are part of the CDMI API.

At its core, it is a simple CRUD test that uses the CDMI API to create a container and a data object with a random value on the endpoint. Afterwards it fetches the object and updates it with a new random value, performing a data integrity check. If all prior steps succeed, it deletes the allocated resources.
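The two stages of this test can be sketched in isolation. Everything named here is an assumption made for illustration: the header layout `Keystone uri='...'`, the helper `keystone_url_from_header`, the object path, and the `InMemoryStore` class, which stands in for a real CDMI endpoint so that the sequence can be run without network access. The real probe issues HTTP requests instead.

```python
import re
import secrets


def keystone_url_from_header(www_authenticate):
    """Extract the Keystone URL from a WWW-Authenticate header value.

    Assumes a value of the form: Keystone uri='https://host:5000/'
    """
    match = re.search(r"Keystone\s+uri='([^']+)'", www_authenticate)
    return match.group(1) if match else None


class InMemoryStore:
    """Hypothetical stand-in for a CDMI endpoint."""

    def __init__(self):
        self.objects = {}

    def create(self, path, value):
        self.objects[path] = value

    def read(self, path):
        return self.objects[path]

    def update(self, path, value):
        self.objects[path] = value

    def delete(self, path):
        del self.objects[path]


def crud_check(store, path="probe-container/probe-object"):
    """Create, read, update, re-read, then delete; True if data stayed intact."""
    first = secrets.token_hex(8)
    store.create(path, first)
    if store.read(path) != first:
        return False
    second = secrets.token_hex(8)
    store.update(path, second)
    if store.read(path) != second:
        return False
    # Clean up only after all prior steps succeeded, as described above.
    store.delete(path)
    return True
```

Writing fresh random values on both create and update means a stale or cached response cannot pass the integrity check by accident.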

eu.egi.cloud.OCCI-VM 1 h The probe uses the OCCI interface to create a VM, waits for the VM to become active and then destroys it. For the probe to work properly, sites need to provide information in the GOCDB URL. URL format is defined here.
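The "wait for the VM to become active" step is a polling loop with a deadline. The sketch below shows one way to write it; the function `wait_until_active`, the default timeout and interval, and the `get_state` callback are hypothetical and not taken from the probe.

```python
import time


def wait_until_active(get_state, timeout=300.0, interval=5.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns "active" or the timeout expires.

    get_state is a caller-supplied function returning the VM state string.
    Returns True if the VM became active in time, False otherwise.
    """
    deadline = clock() + timeout
    while True:
        if get_state() == "active":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Usage with a fake state source instead of a real OCCI endpoint:
states = iter(["inactive", "inactive", "active"])
became_active = wait_until_active(lambda: next(states), timeout=1.0, interval=0.0)
```

A monotonic clock is used for the deadline so that system clock adjustments during the wait do not shorten or extend the timeout.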
eu.egi.cloud.OCCI-Context 1 h The probe uses the OCCI interface to check:
  • the existence of OCCI Infra kinds (compute, network, storage)
  • the existence of OCCI Infra mixins (os_tpl, resource_tpl)
  • the existence of our contextualization mixins (user_data, public_key).
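The existence checks listed above amount to a set comparison between required and advertised category terms. A minimal sketch follows, assuming the endpoint's categories have already been fetched and reduced to a list of term names; the function `missing_categories` is hypothetical.

```python
# Terms taken from the list above.
REQUIRED_KINDS = {"compute", "network", "storage"}
REQUIRED_MIXINS = {"os_tpl", "resource_tpl", "user_data", "public_key"}


def missing_categories(advertised_terms):
    """Return the required kinds and mixins absent from the endpoint, sorted."""
    required = REQUIRED_KINDS | REQUIRED_MIXINS
    return sorted(required - set(advertised_terms))


# Usage with a made-up endpoint that lacks two categories:
missing = missing_categories(
    ["compute", "network", "os_tpl", "resource_tpl", "user_data"]
)
```

An empty result means every required category is present; a non-empty result names what the site still has to publish.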
eu.egi.cloud.OCCI-AppDB-Sync 1 h Probe checks consistency between a published ops VO-wide image list and appliances available at the site (via AppDB).

Probe runs command:

nagios-promoo appdb sync --endpoint $OCCI_ENDPOINT --token $APPDB_TOKEN --vo ops

Further info: https://appdb.egi.eu/store/software/nagios.promoo/releases/1.3.x/

eu.egi.cloud.OCCI-Categories 1 h The probe uses the OCCI interface to:
  • check for mandatory OCCI category definitions
  • verify declared REST locations for INFRA resources

Probe runs command:

nagios-promoo occi categories --endpoint $OCCI_ENDPOINT --check-location

Further info: https://appdb.egi.eu/store/software/nagios.promoo/releases/1.3.x/

eu.egi.cloud.OpenStack-VM 1 h Probe uses OpenStack native APIs to:
  • Discover the image identifier of the EGI monitoring image
  • Discover the smallest flavour that fits the image
  • Discover available networks
  • Create a VM with the discovered image, flavour and network
  • Wait for the VM to become active
  • Destroy the VM

For the probe to work properly, sites need to provide the Keystone URL in the GOCDB URL. The probe source is available here: https://github.com/ARGOeu/nagios-plugins-fedcloud. The command executed is:

/usr/libexec/argo-monitoring/probes/fedcloud/novaprobe.py --endpoint $KEYSTONE_ENDPOINT --appdb-image 1017
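The "smallest flavour that fits the image" step of the OpenStack-VM probe can be sketched as a filter followed by a minimum. This is an illustration only: the `Flavour` class, the function name, and the ordering by RAM, then disk, then vCPUs are assumptions, and the actual selection in novaprobe.py may differ.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Flavour:
    name: str
    ram_mb: int
    disk_gb: int
    vcpus: int


def smallest_fitting_flavour(flavours: Sequence[Flavour],
                             min_ram_mb: int,
                             min_disk_gb: int) -> Optional[Flavour]:
    """Pick the smallest flavour meeting the image's RAM and disk minimums."""
    fitting = [f for f in flavours
               if f.ram_mb >= min_ram_mb and f.disk_gb >= min_disk_gb]
    # Assumption: "smallest" means least RAM, then least disk, then fewest vCPUs.
    return min(fitting, key=lambda f: (f.ram_mb, f.disk_gb, f.vcpus), default=None)


# Usage with made-up flavours:
flavours = [
    Flavour("large", 8192, 80, 4),
    Flavour("tiny", 512, 1, 1),
    Flavour("small", 2048, 20, 1),
]
chosen = smallest_fitting_flavour(flavours, min_ram_mb=1024, min_disk_gb=10)
```

Returning `None` when nothing fits lets the caller report the problem explicitly instead of attempting to boot a VM that cannot start.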
eu.egi.cloud.OpenStack-VM-OIDC 1 h Probe runs the same test as eu.egi.cloud.OpenStack-VM with an OIDC token.
eu.egi.cloud.OpenStack-VM-VOMS-OIDC 1 h Probe runs the same test as eu.egi.cloud.OpenStack-VM with an OIDC token and failover to an X509 credential.
org.nagios.Broker-TCP 15 min Checks if Broker port (defined in GOCDB URL) is open. Additional documentation: http://nagiosplugins.org/man/check_tcp.
org.nagios.CDMI-TCP 15 min Checks if CDMI port (defined in GOCDB URL) is open. Additional documentation: http://nagiosplugins.org/man/check_tcp.
org.nagios.Keystone-TCP 15 min Checks if Keystone port (defined in GOCDB URL) is open. Additional documentation: http://nagiosplugins.org/man/check_tcp.
org.nagios.OCCI-TCP 15 min Checks if OCCI port (defined in GOCDB URL) is open. Additional documentation: http://nagiosplugins.org/man/check_tcp.
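The four TCP tests above rely on the standard check_tcp Nagios plugin. As a rough illustration of what "port is open" means in practice, the Python sketch below attempts a TCP connection and reports whether it succeeded. It is a stand-in written for this page, not the check_tcp implementation, and the function name and default timeout are assumptions.

```python
import socket


def tcp_port_open(host, port, timeout=10.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A call such as `tcp_port_open("keystone.example.org", 5000)` (hypothetical host) would return `True` only if the service accepts connections on that port within the timeout.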