The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team using operations @ egi.eu.

NGI DE CH Operations Center:Monitoring



NGI-DE NGI-CH Monitoring

Mailing list

ngi-de-monitoring@lists.kit.edu

Participants

Dimitri Nilsen (KIT), Foued Jrad (KIT), Alessandro Usai (SWITCH), Andres Aeschlimann (SWITCH)

Plan for ARC Testing Setup in Nagios 15.9.2011

  1. Customize the file /etc/grid-monitoring/org.ndgf.conf with the NGI services.
  2. NorduGrid Logging Improvement:
    1. Edit the xRSL template files (in /usr/share/grid-monitoring/org.ndgf/<SERVICE>/xrsl) for all the services and add (gmlog = "gmlog") to them, e.g.
more /usr/share/grid-monitoring/org.ndgf/lfc/xrsl

(executable = "testjob.sh")
(jobname = "lfc")
(stdout = "testjob.out")
(gmlog = "gmlog")
(stderr = "testjob.err")
(inputfiles = ("testjob.sh" "/usr/share/grid-monitoring/org.ndgf/lfc/testjob.sh")
              ("file" "%LFC_TESTFILE%"))
(outputfiles = ("testjob.out" "")("testjob.err" "")
               ("outfile" "%LFC_STORAGE_W%/%HOST%-lfc-%TIME%"))
(walltime = "15 min")
(memory = "256")

This ensures that, in case of an error, the gmlog (useful for debugging) is sent back as part of the output sandbox. The template files in /usr/share/grid-monitoring/org.ndgf are: gridftp/xrsl, jobsubmit/xrsl, lfc/xrsl, rls/xrsl, srm/xrsl.
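
The edit described in step 2 can be scripted. The following is a minimal sketch and not part of the probe package: the helper name add_gmlog is hypothetical, and it assumes that each template contains a line starting with (stdout, after which the new attribute is placed, as in the lfc example. Templates without such a line are left unchanged.

```shell
# Sketch: add (gmlog = "gmlog") to an xRSL template unless it is already there.
# add_gmlog is a hypothetical helper name, not something shipped with the probes.
add_gmlog() {
    template=$1
    # Nothing to do if the template already requests the gmlog.
    if grep -q '^(gmlog' "$template"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Print every line; after the stdout line, print the new attribute.
    if awk '{ print } /^\(stdout/ { print "(gmlog = \"gmlog\")" }' "$template" > "$tmp"; then
        cat "$tmp" > "$template"
    fi
    rm -f "$tmp"
}

# Directory and service names as listed in the text; BASE can be overridden.
BASE=${BASE:-/usr/share/grid-monitoring/org.ndgf}
for service in gridftp jobsubmit lfc rls srm; do
    if [ -f "$BASE/$service/xrsl" ]; then
        add_gmlog "$BASE/$service/xrsl"
    fi
done
```

Running the script a second time changes nothing, because templates that already contain the attribute are skipped.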

  3. Data management requirements for NorduGrid: the testfile used for the LFC/SRM/GridFTP tests must be created/managed manually.

It is important that the file/LFC entry be created with the same credentials used by the Nagios monitoring node! A robot certificate will be used in the near future; checks are to be carried out with it against the dCache and LFC nodes used by NGI_DE. For the time being, the ARC tests in the test system (this only affects NGI_CH!) use feronia.switch.ch (DPM) and lodur.switch.ch (LFC) instead.
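
As an illustration of the manual step, the sketch below copies a local test file to one or more storage URLs while pointing the client at the monitoring proxy. Several things here are assumptions rather than facts from this page: that the copy client is the classic ARC command ngcp taking a source and a destination, that it honours X509_USER_PROXY, and the helper name seed_testfile and the example.org URL, which are purely illustrative. By default the sketch only prints the commands it would run.

```shell
# Sketch: seed the test file at each storage URL using the monitoring proxy.
# Assumptions (not from the source): the copy client is ngcp and is called as
# "ngcp SOURCE DESTINATION"; seed_testfile is a hypothetical helper name.

# Use the same proxy as the probes, so the entries get the same owner.
X509_USER_PROXY=${SAM_USER_PROXY:-/tmp/samproxy}
export X509_USER_PROXY

# Dry run by default: the command is printed, not executed.
# Set COPY=ngcp in the environment to perform the copies.
COPY=${COPY:-echo ngcp}

seed_testfile() {
    local_file=$1
    shift
    for url in "$@"; do
        $COPY "$local_file" "$url" || return 1
    done
}

# Illustrative call; the destination is a placeholder URL, not a real endpoint.
seed_testfile /tmp/testfile "gsiftp://se.example.org/storage/sam-test/testfile"
```

Keeping the dry run as the default makes it possible to review the target URLs before anything is written to the storage elements.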

Status 21.9.2011

Working on rocmon-fzk.gridka.de, in the file /etc/grid-monitoring/org.ndgf.conf. Dmitry will provide the LFC and SRM services to be used by the probes. ops should be working there. See Skype chat for details.

Current status:

# This file allows customization of the ARCCE probes

# The location of the Nagios plugin utilities (both
# /usr/lib/nagios/plugins/utils.sh and /usr/lib64/nagios/plugins/utils.sh
# will be checked if not set)
#NAGIOS_PLUGIN_UTILS=/usr/lib/nagios/plugins/utils.sh

# Where to report results of passive tests
#NAGIOS_CMD=/var/spool/nagios/cmd/nagios.cmd
NAGIOS_CMD=/var/nagios/rw/nagios.cmd

# Certificate and key used to generate proxy for the tests
SAM_USER_CERT=/etc/grid-monitoring/usercert.pem
SAM_USER_KEY=/etc/grid-monitoring/userkey.pem
SAM_USER_PROXY=/tmp/samproxy

# The tests will use a voms proxy from this VO
VO=ops.ndgf.org
HOST=`hostname`
SE_HOST=gridka-dCache.fzk.de
LFC_HOST=lfc-fzk.gridka.de
#RLS_HOST=grid.tsl.uu.se
SRM_HOST=gridka-dCache.fzk.de

# Add nordugrid client to PATH
export PATH=/opt/nordugrid/bin:$PATH
export LD_LIBRARY_PATH=/opt/nordugrid/lib64:/opt/nordugrid/lib:$LD_LIBRARY_PATH 

# For ARCCE-gridftp
GRIDFTP_TESTFILE=gsiftp://$SE_HOST/storage/sam-$HOST-$VO/testfile
GRIDFTP_STORAGE=gsiftp://$SE_HOST/storage/sam-$HOST-$VO 

# For ARCCE-lfc
LFC_TESTFILE=lfc://$LFC_HOST//grid/$VO/sam-$HOST-$VO/testfile
LFC_STORAGE_W=lfc://gsiftp://$SE_HOST/storage/sam-$HOST-$VO@$LFC_HOST//grid/$VO/sam-$HOST-$VO
LFC_STORAGE_R=lfc://$LFC_HOST//grid/$VO/sam-$HOST-$VO

# For ARCCE-rls
RLS_TESTFILE=rls://$RLS_HOST/sam-rls-$HOST-$VO-testfile
RLS_STORAGE_W=rls://gsiftp://$SE_HOST/storage/sam-$HOST-$VO@$RLS_HOST
RLS_STORAGE_R=rls://$RLS_HOST

# For ARCCE-srm
SRM_TESTFILE=srm://$SRM_HOST/ops/sam-$HOST-$VO/testfile
SRM_STORAGE=srm://$SRM_HOST/ops/sam-$HOST-$VO
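
To see which URLs the settings above produce, the variables can be expanded with a fixed host name. This is a minimal sketch: the values are copied from the configuration, except HOST, which is a hypothetical name used instead of the output of hostname. The render_line helper and the TIME value are also assumptions; the page does not show how the probes fill the %NAME% placeholders of the xRSL templates, so the sed substitution only illustrates the idea.

```shell
# Values copied from the configuration above.
VO=ops.ndgf.org
HOST=nagios.example.org   # hypothetical; the real file uses `hostname`
SE_HOST=gridka-dCache.fzk.de
LFC_HOST=lfc-fzk.gridka.de
SRM_HOST=gridka-dCache.fzk.de

GRIDFTP_TESTFILE=gsiftp://$SE_HOST/storage/sam-$HOST-$VO/testfile
LFC_TESTFILE=lfc://$LFC_HOST//grid/$VO/sam-$HOST-$VO/testfile
LFC_STORAGE_W=lfc://gsiftp://$SE_HOST/storage/sam-$HOST-$VO@$LFC_HOST//grid/$VO/sam-$HOST-$VO
SRM_TESTFILE=srm://$SRM_HOST/ops/sam-$HOST-$VO/testfile

# Print the expanded test file and storage locations.
for name in GRIDFTP_TESTFILE LFC_TESTFILE LFC_STORAGE_W SRM_TESTFILE; do
    eval "value=\$$name"
    printf '%s=%s\n' "$name" "$value"
done

# Assumed timestamp format; the probes may use a different one.
TIME=${TIME:-$(date +%s)}

# Hypothetical helper: fill the placeholders of one template line.
render_line() {
    printf '%s\n' "$1" | sed \
        -e "s|%LFC_STORAGE_W%|$LFC_STORAGE_W|g" \
        -e "s|%LFC_TESTFILE%|$LFC_TESTFILE|g" \
        -e "s|%HOST%|$HOST|g" \
        -e "s|%TIME%|$TIME|g"
}

render_line '("outfile" "%LFC_STORAGE_W%/%HOST%-lfc-%TIME%")'
```

The printed URLs are the locations where the manually managed test file and its LFC entry would need to exist for the corresponding tests to find them.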