SAMUpdate23

From EGIWiki
Revision as of 10:18, 17 December 2014 by Eimamagi (talk | contribs) (Staged Rollout)

Major changes

Major changes in SAM Update-23:

  • Probes are moved to the UMD-3 repository. This decision was approved by the OMB in order to enable probe developers to update probes more frequently and independently from SAM releases.
  • Removal of SAM GridMon (sam-gridmon) and its dependencies. SAM Update-23 supports only SAM Nagios (sam-nagios). In a future version, SAM GridMon will be replaced by the ARGO engine.


A detailed list of all new features and bug fixes is available in the ARGO/SAM GitHub milestone Update23.

Installation

This guide is based on the previous SAM Administration guide: [1].

Prerequisites

Install your host certificate to secure the Nagios portal:

$ ls -l /etc/grid-security/host*
-rw-r--r-- 1 root root 2286 Oct 28 19:26 /etc/grid-security/hostcert.pem
-r-------- 1 root root  887 Oct 28 19:25 /etc/grid-security/hostkey.pem
 
$ openssl x509 -in /etc/grid-security/hostcert.pem -noout -purpose | grep "SSL client"
SSL client : Yes
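
The permissions shown in the listing above can also be checked from a script. The following is a minimal sketch, assuming GNU `stat` is available; the function name `check_mode` is hypothetical and not part of SAM, and the demonstration uses a temporary file in place of the real key.

```shell
# Hypothetical helper: succeed only if FILE has exactly the expected octal mode.
check_mode() {
    file=$1
    expected=$2
    actual=$(stat -c '%a' "$file") || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file has mode $actual"
    else
        echo "WARNING: $file has mode $actual, expected $expected"
        return 1
    fi
}

# Demonstration on a temporary file standing in for hostkey.pem.
demo_key=$(mktemp)
chmod 400 "$demo_key"
check_mode "$demo_key" 400
rm -f "$demo_key"
```

On a real host the calls would be `check_mode /etc/grid-security/hostkey.pem 400` and `check_mode /etc/grid-security/hostcert.pem 644`, matching the modes in the listing.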

SELinux must be disabled before proceeding with the installation. If it is enabled, run the commands below and reboot the machine:

$ setenforce 0
$ sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
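
Before rebooting, it can be useful to confirm that the substitution took effect. The following is a minimal sketch that reads the configured mode back from a config file; the function name `selinux_configured_mode` is hypothetical, and the demonstration works on a temporary file rather than the real /etc/selinux/config.

```shell
# Hypothetical helper: print the value of the SELINUX= setting in a config file.
selinux_configured_mode() {
    sed -n 's/^SELINUX=//p' "$1"
}

# Demonstration on a temporary file, not the real /etc/selinux/config.
demo_conf=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo_conf"

# Same substitution as in the instructions above.
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$demo_conf"

selinux_configured_mode "$demo_conf"   # prints: disabled
rm -f "$demo_conf"
```

On a live system, `getenforce` reports the mode currently in effect, which may differ from the configured one until the reboot.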

Generate a MyProxy credential (only needed if you are not using robot certificates). Perform these steps on a UI box:

$ ls -l .globus/
total 16
-rw-r--r-- 1 root root 4908 Sep 18 14:44 usercert.pem
-rw------- 1 root root 4836 Sep 18 14:44 userkey.pem

$ myproxy-init -c 4320 -k NagiosRetrieve-<hostname>-<VO name> -s <MYPROXY-name> -l nagios -x -Z <host DN>
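
The credential name passed to `-k` follows the pattern `NagiosRetrieve-<hostname>-<VO name>`. As a sketch of how the placeholders fit together, the helper below only prints the command instead of running it. The function name `print_myproxy_init` and all example values (`sam.example.org`, `ops`, `myproxy.example.org`, the DN) are hypothetical.

```shell
# Hypothetical helper: print the myproxy-init command for one Nagios host and VO.
# Arguments: Nagios hostname, VO name, MyProxy server, host certificate DN.
print_myproxy_init() {
    nagios_host=$1
    vo=$2
    myproxy_server=$3
    host_dn=$4
    printf 'myproxy-init -c 4320 -k NagiosRetrieve-%s-%s -s %s -l nagios -x -Z "%s"\n' \
        "$nagios_host" "$vo" "$myproxy_server" "$host_dn"
}

# Example values are placeholders, not real hosts.
print_myproxy_init sam.example.org ops myproxy.example.org \
    "/DC=org/DC=example/CN=sam.example.org"
```

The value given to `-c` is a lifetime in hours for the stored credential, so 4320 corresponds to 180 days.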

YUM repositories

OS/EPEL repos

  • Add the following config to all CentOS/SL base repositories:
exclude=mysql51*
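
Adding the line to every repository section by hand is error-prone. The following is a minimal sketch, assuming GNU sed and repo files with one `[section]` header per repository. The function name `add_exclude` is hypothetical, and the demonstration runs on a temporary file, not on the files in /etc/yum.repos.d.

```shell
# Hypothetical helper: add "exclude=PATTERN" after each [section] header
# of a yum repo file, unless that exact line is already present in the file.
add_exclude() {
    repo_file=$1
    pattern=$2
    if grep -qxF "exclude=$pattern" "$repo_file"; then
        return 0
    fi
    sed -i "/^\[.*\]\$/a exclude=$pattern" "$repo_file"
}

# Demonstration on a temporary repo file with two sections.
demo_repo=$(mktemp)
cat > "$demo_repo" <<'EOF'
[base]
name=Base
[updates]
name=Updates
EOF

add_exclude "$demo_repo" 'mysql51*'
cat "$demo_repo"
rm -f "$demo_repo"
```

The sketch does not merge with an `exclude=` line that is already present for other packages, and it skips the whole file if any section already carries the line; such files are better edited by hand. The same helper could be used for the `exclude=perl-DateTime` setting in the upgrade procedure.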

Production

Follow the instructions for installing the UMD-3 and EPEL repositories: http://repository.egi.eu/category/umd_releases/distribution/umd-3/. This manual assumes that the priority of the UMD-3 repository is 1, as defined in the umd-release package.

Add the SAM repository from: http://repository.egi.eu/sw/production/sam/1/repofiles/sam.repo

Package installation

Perform the following installation steps:

$ yum -y install ca-policy-egi-core httpd mysql51
$ yum -y install nagios.x86_64
$ yum install sam-nagios

Configuration

SAM uses Yaim for configuration. A detailed specification of all SAM configuration parameters is available in the SAM documentation.

Check the Yaim variable changes below: SAMUpdate23#Yaim_variable_changes.

In addition, check the FAQs for common configurations and problems: [2]

Once the site-info.def is ready, run Yaim:

$ /opt/glite/yaim/bin/yaim -c -s /etc/yaim/site-info.def -n NAGIOS -n SAM_NAGIOS

Validation

Check that the Nagios web interface and the SAM portal are up:

  • https://<hostname>/nagios
  • http://<hostname>/myegi
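
The two URLs above can also be probed from the command line. The following is a minimal sketch, assuming `curl` is installed; the function names `sam_portal_urls` and `check_sam_portals` and the host `sam.example.org` are hypothetical.

```shell
# Hypothetical helper: print the two portal URLs for a given SAM host.
sam_portal_urls() {
    printf 'https://%s/nagios\n' "$1"
    printf 'http://%s/myegi\n' "$1"
}

# Hypothetical helper: report the HTTP status code returned for each URL.
# -k skips certificate verification, -s silences progress output.
check_sam_portals() {
    for url in $(sam_portal_urls "$1"); do
        code=$(curl -k -s -o /dev/null -w '%{http_code}' "$url")
        echo "$url -> HTTP $code"
    done
}

# Only the URL construction is demonstrated here; no request is sent.
sam_portal_urls sam.example.org
```

Running `check_sam_portals <hostname>` on the SAM box sends the actual requests. Any HTTP status code, including an authentication challenge, shows that the web server is answering; a code of 000 means curl could not connect.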

Check the MyProxy credentials:

$ nagios-run-check <hostname> hr.srce.GridProxy-Get-<VO-name>

Upgrade

Upgrading from Update-22 is fully supported and does not require reinstalling the SAM box. The procedure is as follows:

  • Remove the UMD-2 repository:
 yum remove umd-release
 rm -rf /etc/yum.repos.d/UMD-2-*
  • Add the UMD repositories:
    • Install the UMD-3 repository. Staged Rollout sites should use the repositories recommended above.
 wget http://repository.egi.eu/sw/production/umd/3/sl5/x86_64/updates/umd-release-3.0.1-1.el5.noarch.rpm
 yum --nogpgcheck localinstall umd-release-3.0.1-1.el5.noarch.rpm
  • Add the following config to all EPEL base repositories:
 exclude=perl-DateTime
  • Update all packages:
 yum update
  • Rerun the Yaim configuration:
 /opt/glite/yaim/bin/yaim -c -s /etc/yaim/site-info.def -n NAGIOS -n SAM_NAGIOS

Upgrading from a release older than Update-22 is not supported and requires a clean installation.
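
Whether a host qualifies for the in-place upgrade can be decided from the installed sam-release version. The following is a minimal sketch under one assumption that this page does not confirm: that Update-22 hosts carry sam-release 1.22.x, by analogy with sam-release-1.23.0 for Update-23. The function name `upgrade_path` is hypothetical.

```shell
# Hypothetical helper: map a sam-release version string to the applicable path.
# Assumption: Update-22 corresponds to sam-release 1.22.x (inferred by analogy
# with sam-release-1.23.0 for Update-23, not stated in this guide).
upgrade_path() {
    case $1 in
        1.22.*) echo "upgrade" ;;
        1.23.*) echo "already-updated" ;;
        *)      echo "clean-install" ;;
    esac
}

upgrade_path 1.22.0   # prints: upgrade
upgrade_path 1.21.3   # prints: clean-install
```

On a SAM box the installed version could be passed in with `upgrade_path "$(rpm -q --qf '%{VERSION}' sam-release)"`.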

Package changes

Updated packages from the SAM repo:

  • atp-1.27.19-1.el5.noarch.rpm
  • grid-monitoring-config-gen-0.95.0-1.el5
  • grid-monitoring-probes-eu.egi.sec-1.0.11-24.el5
  • glite-yaim-nagios-1.11.3-1.el5
  • msg-nagios-bridge-1.1.0-1.el5
  • mrs-1.8.0-1.el5
  • mywlcg-1.5.6-3.el5
  • nagios-gocdb-downtime-0.25.0-1.el5
  • ncg-metric-config-1.5.1-1.el5
  • poem-0.9.91-1.el5
  • poem-sync-0.9.91-1.el5
  • sam-nagios-1.23.0-2.el5
  • sam-release-1.23.0-1.el5

Packages moved/added to the UMD-3 repo:

  • emi-cream-nagios-1.0.1-6.el5.sam
  • emi.dcache.srm-probes-1.0.1-1
  • egi-mpi-nagios-0.0.7-4.1
  • emi-wms-nagios-3.5.0-3.sl5
  • glue-validator-2.0.25-0
  • grid-monitoring-org.activemq-probes-0.15-1.el5
  • grid-monitoring-org.nagiosexchange-probes-0.19-1.el5
  • grid-monitoring-probes-cadist-0.6.0-1.el5
  • grid-monitoring-probes-ch.cern.sam-1.6.15-1.el5
  • grid-monitoring-probes-hr.srce-0.38.1-1.el5
  • nagios-plugins-argus-1.1.0-2.el5
  • nagios-plugins-emi.glexec-0.3.0-1.sl5
  • nagios-plugins-dg-1.0.1-1.el5
  • nagios-plugins-emi.glexec-config-1.0.0-2.el5
  • nagios-plugins-fts-3.2.30-1.el5
  • nagios-plugins-lfc-0.9.5-2.el5.sam
  • nordugrid-arc-nagios-plugins-1.8.1-1
  • nordugrid-arc-nagios-plugins-egi-1.8.1-1
  • perl-GridMon-1.0.73-1.el5
  • qcg-broker-nagios-probe-3.4.0-3
  • qcg-comp-nagios-probe-3.4.0-9
  • qcg-ntf-nagios-probe-3.4.0-2
  • unicore-nagios-plugins-2.3.2-0.sl5

Obsoleted packages:

  • nagios-plugins-wn-rep
  • gstat-validation

NCG config changes

  • Because the org.sam.WN-Rep* tests have been removed, running Yaim deletes the config file /etc/ncg/ncg-localdb.d/jobsubmit. On existing SAM installations, remove any custom configuration of the following parameters of the emi.cream.*-JobState tests:
--wn-lfc
--wn-se-rep
--wn-se-rep-file
--wn-bdii

Such custom settings can be found in /etc/ncg/ncg-localdb.d/jobsubmit and /etc/ncg-metric-config.d/.
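
Leftover references to the removed parameters can be located with a recursive search. The following is a minimal sketch; the function name `find_wn_rep_params` is hypothetical, and the demonstration file contains a made-up line that is not real NCG syntax.

```shell
# Hypothetical helper: list lines that still reference the removed
# worker-node parameters under the given files or directories.
find_wn_rep_params() {
    grep -rnE -e '--wn-(lfc|se-rep-file|se-rep|bdii)' "$@"
}

# Demonstration on a temporary directory; the file content is invented.
demo_dir=$(mktemp -d)
printf 'example custom setting --wn-lfc lfc.example.org\n' > "$demo_dir/custom.conf"
printf 'unrelated line\n' > "$demo_dir/other.conf"

find_wn_rep_params "$demo_dir"
rm -rf "$demo_dir"
```

On an existing installation the call would be `find_wn_rep_params /etc/ncg/ncg-localdb.d/jobsubmit /etc/ncg-metric-config.d/`. The function returns a non-zero status when nothing is found, which is the desired state after cleanup.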

Yaim variable changes

Default values changed:

Variables obsoleted:

  • JOBSUBMIT_WN_LFC
  • JOBSUBMIT_WN_SE_REP
  • JOBSUBMIT_WN_SE_REP_FILE
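
An existing site-info.def can be scanned for the obsoleted variables before rerunning Yaim. The following is a minimal sketch; the function name `find_obsolete_yaim_vars` is hypothetical, and the values in the demonstration file are invented.

```shell
# Hypothetical helper: print the lines of a site-info.def that still set one
# of the obsoleted JOBSUBMIT_WN_* variables, prefixed with their line numbers.
find_obsolete_yaim_vars() {
    grep -nE '^[[:space:]]*(JOBSUBMIT_WN_LFC|JOBSUBMIT_WN_SE_REP|JOBSUBMIT_WN_SE_REP_FILE)=' "$1"
}

# Demonstration on a temporary file with made-up values.
demo_def=$(mktemp)
cat > "$demo_def" <<'EOF'
SITE_NAME=EXAMPLE-SITE
JOBSUBMIT_WN_LFC=lfc.example.org
JOBSUBMIT_WN_SE_REP_FILE=/tmp/replicas
EOF

find_obsolete_yaim_vars "$demo_def"
rm -f "$demo_def"
```

On a SAM box the call would be `find_obsolete_yaim_vars /etc/yaim/site-info.def`, using the path passed to Yaim in the configuration step.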

Test changes

Tests added:

  • ch.cern.FTS3-Service
  • ch.cern.FTS3-StalledTransfers
  • org.bdii.GLUE2-Validate

Tests removed:

  • org.nordugrid.ARC-CE-LFC-result
  • org.nordugrid.ARC-CE-lfc
  • org.nordugrid.ARC-CE-LFC-submit
  • org.sam.WN-RepDel
  • org.sam.WN-RepISenv
  • org.sam.WN-RepFree
  • org.sam.WN-RepCr
  • org.sam.WN-RepGet
  • org.sam.WN-RepRep
  • org.sam.WN-Rep