The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other supports; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations @ egi.eu.

SAMUpdate23

From EGIWiki

Latest revision as of 09:18, 17 December 2014

Major changes

Major changes in SAM Update-23:

  • Probes have been moved to the UMD-3 repository. This decision was approved by the OMB to enable probe developers to update probes more frequently and independently of SAM releases.
  • SAM GridMon (sam-gridmon) and its dependencies have been removed. SAM Update-23 supports only SAM Nagios (sam-nagios). In a future version, SAM GridMon will be replaced with the ARGO engine.


A detailed list of all new features and bug fixes can be found here: ARGO/SAM GitHub, Milestone Update-23 (https://github.com/ARGOeu/sam-probes/issues?q=is%3Aissue+milestone%3AUpdate-23).

Installation

This guide is based on the previous SAM Administration guide: http://argoeu.github.io/samdoc/confluence/display/SAMDOC/SAM-Nagios%20Administrator%20Guide.html

Prerequisites

Install your host certificate to secure the Nagios portal:

$ ls -l /etc/grid-security/host*
-rw-r--r-- 1 root root 2286 Oct 28 19:26 /etc/grid-security/hostcert.pem
-r-------- 1 root root  887 Oct 28 19:25 /etc/grid-security/hostkey.pem
 
$ openssl x509 -in /etc/grid-security/hostcert.pem -noout -purpose | grep "SSL client"
SSL client : Yes
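
The file permissions shown above can also be checked from a script. The following is a minimal sketch, not part of SAM: the function name check_perms is hypothetical, and it assumes GNU stat (the -c option) as found on SL/CentOS. The expected modes 644 and 400 are taken from the listing above.

```shell
# Hypothetical helper, not part of SAM: succeed only if FILE has the expected octal mode.
# Assumes GNU coreutils stat (-c '%a' prints the mode in octal).
check_perms() {
    local file=$1 expected=$2 actual
    actual=$(stat -c '%a' "$file") || return 2
    if [ "$actual" != "$expected" ]; then
        echo "$file: mode $actual, expected $expected" >&2
        return 1
    fi
}

# Example usage on the SAM box (commented out so the definition can be sourced safely):
# check_perms /etc/grid-security/hostcert.pem 644
# check_perms /etc/grid-security/hostkey.pem 400
```

The function returns 0 when the mode matches, 1 when it differs, and 2 when the file cannot be read, so it can be combined with && in a pre-installation check.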

SELinux needs to be disabled to proceed with the installation. If it is enabled, run the commands below and reboot the machine:

$ setenforce 0
$ sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
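
Note that setenforce 0 only switches the running system to permissive mode; the edit to /etc/selinux/config is what takes effect at the next boot. To confirm what the configuration file will apply after the reboot, a small helper such as the one below may be useful. It is a sketch only, and the function name selinux_config_mode is hypothetical.

```shell
# Hypothetical helper: print the value of the SELINUX= setting in an SELinux config file.
# Defaults to /etc/selinux/config, the file edited by the sed command above.
selinux_config_mode() {
    local config=${1:-/etc/selinux/config}
    # Match only the SELINUX= line (not SELINUXTYPE=) and keep the last occurrence.
    sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "$config" | tail -n 1
}

# Example usage (commented out so the definition can be sourced safely):
# selinux_config_mode     # configured mode, applied at next boot
# getenforce              # mode of the running system
```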

Generate a MyProxy credential (only needed if you are not using robot certificates). These steps should be performed on a UI box:

$ ls -l .globus/
total 16
-rw-r--r-- 1 root root 4908 Sep 18 14:44 usercert.pem
-rw------- 1 root root 4836 Sep 18 14:44 userkey.pem

$ myproxy-init -c 4320 -k NagiosRetrieve-<hostname>-<VO name> -s <MYPROXY-name> -l nagios -x -Z <host DN>
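
When several VOs are monitored, one credential is needed per VO, and the credential name has to follow the NagiosRetrieve-<hostname>-<VO name> pattern exactly. The sketch below builds that name; the function name myproxy_cred_name is hypothetical, and the host and VO values in the usage comment are placeholders.

```shell
# Hypothetical helper: build the MyProxy credential name used by the command above.
#   $1 = fully qualified hostname of the SAM Nagios box
#   $2 = VO name
myproxy_cred_name() {
    printf 'NagiosRetrieve-%s-%s\n' "$1" "$2"
}

# Example usage on the UI box (placeholders: sam.example.org, ops; the remaining
# arguments are the same as in the command above):
# myproxy-init -c 4320 -k "$(myproxy_cred_name sam.example.org ops)" \
#     -s <MYPROXY-name> -l nagios -x -Z <host DN>
```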

YUM repositories

OS/EPEL repos

  • Add the following config to all CentOS/SL base repositories:
exclude=mysql51*
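
To verify that no repository section was missed, the repo files can be scanned for sections that lack the exclude line. The following is a sketch under assumptions: the function name repo_sections_missing_exclude is hypothetical, it processes one file per call, and which files under /etc/yum.repos.d/ hold the CentOS/SL base repositories depends on the local installation.

```shell
# Hypothetical helper: list the [section] headers of a yum repo file that have
# no exclude= line containing the given literal text.
#   $1 = repo file, $2 = text expected in the exclude line (e.g. 'mysql51*')
repo_sections_missing_exclude() {
    awk -v pat="$2" '
        /^\[.*\]/ {
            # A new section starts: report the previous one if it had no match.
            if (section != "" && !found) print section
            section = $0; found = 0; next
        }
        /^exclude[ \t]*=/ && index($0, pat) > 0 { found = 1 }
        END { if (section != "" && !found) print section }
    ' "$1"
}

# Example usage (commented out; the file name is an assumption):
# repo_sections_missing_exclude /etc/yum.repos.d/sl.repo 'mysql51*'
```

An empty output means every section in the file already carries the exclude line.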

Production

Follow the instructions for installing the UMD-3 and EPEL repositories: http://repository.egi.eu/category/umd_releases/distribution/umd-3/. This manual assumes that the priority of the UMD-3 repository is 1, as defined in the umd-release package.

Add SAM repository from here: http://repository.egi.eu/sw/production/sam/1/repofiles/sam.repo
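
Since this manual assumes that UMD-3 has priority 1, it can help to list the priorities that are actually set once the repositories are in place. The sketch below prints each repo section together with its priority= value; the function name repo_priorities is hypothetical. Sections without an explicit priority line are not listed.

```shell
# Hypothetical helper: print "[section] priority" for every repo section that
# sets priority= in the given yum repo files.
repo_priorities() {
    awk '
        /^\[.*\]/ { section = $0 }
        /^priority[ \t]*=/ {
            # Strip the key and any surrounding blanks, leaving only the value.
            sub(/^priority[ \t]*=[ \t]*/, "")
            print section, $0
        }
    ' "$@"
}

# Example usage (commented out so the definition can be sourced safely):
# repo_priorities /etc/yum.repos.d/*.repo
```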

Package installation

Perform the following installation steps:

$ yum -y install ca-policy-egi-core httpd mysql51
$ yum -y install nagios.x86_64
$ yum install sam-nagios

Configuration

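SAM uses Yaim for configuration. A detailed specification of all SAM configuration parameters is available in the SAM documentation:

  • Common configuration options: SAM Configuration via YAIM-Common (http://argoeu.github.io/samdoc/confluence/display/SAMDOC/SAM%20Configuration%20via%20YAIM.html#SAMConfigurationviaYAIM-Common)
  • SAM-Nagios specific options: SAM Configuration via YAIM-SAMNagios (http://argoeu.github.io/samdoc/confluence/display/SAMDOC/SAM%20Configuration%20via%20YAIM.html#SAMConfigurationviaYAIM-SAMNagios)

Check the Yaim variable changes below: SAMUpdate23#Yaim_variable_changes.

In addition, check the FAQs for common configurations and problems: http://argoeu.github.io/samdoc/confluence/display/SAMDOC/FAQs.html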

Once the site-info.def is ready, run Yaim:

$ /opt/glite/yaim/bin/yaim -c -s /etc/yaim/site-info.def -n NAGIOS -n SAM_NAGIOS

Validation

Check that the Nagios web interface and the SAM portal are up:

  • https://<hostname>/nagios
  • http://<hostname>/myegi
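
The two URLs can also be probed from the command line. The sketch below is an assumption-laden example rather than an official check: the function names are hypothetical, curl must be installed, and -k disables certificate verification for the probe. Depending on how access to the Nagios interface is configured, the server may answer with an authentication error code, which still shows that the web server is responding.

```shell
# Hypothetical helper: print the two portal URLs listed above for a given hostname.
sam_portal_urls() {
    printf 'https://%s/nagios\nhttp://%s/myegi\n' "$1" "$1"
}

# Hypothetical helper: request each URL and print the HTTP status code returned.
check_sam_portals() {
    local url code
    for url in $(sam_portal_urls "$1"); do
        code=$(curl -k -s -o /dev/null -w '%{http_code}' "$url")
        echo "$url -> HTTP $code"
    done
}

# Example usage (commented out; sam.example.org is a placeholder):
# check_sam_portals sam.example.org
```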

Check MyProxy credentials

$ nagios-run-check <hostname> hr.srce.GridProxy-Get-<VO-name>

Upgrade

Upgrading from Update-22 is fully supported and does not require reinstalling the SAM box. The procedure is as follows:

  • remove UMD-2 repo
 yum remove umd-release
 rm -rf /etc/yum.repos.d/UMD-2-*
  • add UMD repositories:
    • install UMD-3 repo. For the StagedRollout sites please use the repos recommended above
 wget http://repository.egi.eu/sw/production/umd/3/sl5/x86_64/updates/umd-release-3.0.1-1.el5.noarch.rpm
 yum --nogpgcheck localinstall umd-release-3.0.1-1.el5.noarch.rpm
  • add the following config to all EPEL base repositories:
 exclude=perl-DateTime
  • update everything
 yum update
  • configuration
 /opt/glite/yaim/bin/yaim -c -s /etc/yaim/site-info.def -n NAGIOS -n SAM_NAGIOS
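
The procedure above can be collected into a script so that it can be reviewed before anything is changed. The following is a sketch only: the function names and the DRY_RUN variable are hypothetical, and the commands are exactly those listed above. The exclude=perl-DateTime edit stays a manual step, which is why the procedure is split into two functions.

```shell
# Run a command, or only print it when DRY_RUN=1 (hypothetical convention).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: replace the UMD-2 repository with UMD-3 (production repositories).
sam_upgrade_repos() {
    local base_url=http://repository.egi.eu/sw/production/umd/3/sl5/x86_64/updates
    local rpm=umd-release-3.0.1-1.el5.noarch.rpm
    run yum remove umd-release &&
    run rm -rf /etc/yum.repos.d/UMD-2-* &&
    run wget "$base_url/$rpm" &&
    run yum --nogpgcheck localinstall "$rpm"
}

# Step 2: run only after exclude=perl-DateTime has been added to the EPEL repositories.
sam_upgrade_apply() {
    run yum update &&
    run /opt/glite/yaim/bin/yaim -c -s /etc/yaim/site-info.def -n NAGIOS -n SAM_NAGIOS
}

# Example usage (commented out so the definitions can be sourced safely):
# DRY_RUN=1; sam_upgrade_repos; sam_upgrade_apply   # print the commands only
```

Each function stops at the first failing command, so a failed repository switch does not lead to a yum update against the wrong repositories.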

Upgrading from a release older than Update-22 is not supported and requires a clean installation.

Package changes

Updated packages from the SAM repo:

  • atp-1.27.19-1.el5.noarch.rpm
  • grid-monitoring-config-gen-0.95.0-1.el5
  • grid-monitoring-probes-eu.egi.sec-1.0.11-24.el5
  • glite-yaim-nagios-1.11.3-1.el5
  • msg-nagios-bridge-1.1.0-1.el5
  • mrs-1.8.0-1.el5
  • mywlcg-1.5.6-3.el5
  • nagios-gocdb-downtime-0.25.0-1.el5
  • ncg-metric-config-1.5.1-1.el5
  • poem-0.9.91-1.el5
  • poem-sync-0.9.91-1.el5
  • sam-nagios-1.23.0-2.el5
  • sam-release-1.23.0-1.el5

Packages moved/added to the UMD-3 repo:

  • emi-cream-nagios-1.0.1-6.el5.sam
  • emi.dcache.srm-probes-1.0.1-1
  • egi-mpi-nagios-0.0.7-4.1
  • emi-wms-nagios-3.5.0-3.sl5
  • glue-validator-2.0.25-0
  • grid-monitoring-org.activemq-probes-0.15-1.el5
  • grid-monitoring-org.nagiosexchange-probes-0.19-1.el5
  • grid-monitoring-probes-cadist-0.6.0-1.el5
  • grid-monitoring-probes-ch.cern.sam-1.6.15-1.el5
  • grid-monitoring-probes-hr.srce-0.38.1-1.el5
  • nagios-plugins-argus-1.1.0-2.el5
  • nagios-plugins-emi.glexec-0.3.0-1.sl5
  • nagios-plugins-dg-1.0.1-1.el5
  • nagios-plugins-emi.glexec-config-1.0.0-2.el5
  • nagios-plugins-fts-3.2.30-1.el5
  • nagios-plugins-lfc-0.9.5-2.el5.sam
  • nordugrid-arc-nagios-plugins-1.8.1-1
  • nordugrid-arc-nagios-plugins-egi-1.8.1-1
  • perl-GridMon-1.0.73-1.el5
  • qcg-broker-nagios-probe-3.4.0-3
  • qcg-comp-nagios-probe-3.4.0-9
  • qcg-ntf-nagios-probe-3.4.0-2
  • unicore-nagios-plugins-2.3.2-0.sl5

Obsoleted packages:

  • nagios-plugins-wn-rep
  • gstat-validation

NCG config changes

  • Because the org.sam.WN-Rep* tests have been removed, running Yaim will delete the config file /etc/ncg/ncg-localdb.d/jobsubmit. On existing SAM installations, remove all custom configuration of the following parameters of the emi.cream.*-JobState tests:
--wn-lfc
--wn-se-rep
--wn-se-rep-file
--wn-bdii

Such custom modifications can be found in /etc/ncg/ncg-localdb.d/jobsubmit and /etc/ncg-metric-config.d/.

Yaim variable changes

Default values changed:

Variables obsoleted:

  • JOBSUBMIT_WN_LFC
  • JOBSUBMIT_WN_SE_REP
  • JOBSUBMIT_WN_SE_REP_FILE
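
Before running Yaim, the site configuration can be checked for these obsoleted variables. The following is a minimal sketch with a hypothetical function name; it defaults to the /etc/yaim/site-info.def path used in this guide and ignores commented-out lines.

```shell
# Hypothetical helper: print (with line numbers) any active assignments of the
# obsoleted JOBSUBMIT_WN_* variables in a Yaim site-info.def file.
find_obsolete_yaim_vars() {
    grep -nE '^[[:space:]]*(export[[:space:]]+)?JOBSUBMIT_WN_(LFC|SE_REP|SE_REP_FILE)=' \
        "${1:-/etc/yaim/site-info.def}"
}

# Example usage (commented out so the definition can be sourced safely):
# find_obsolete_yaim_vars && echo "remove the variables listed above"
```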

Test changes

Tests added:

  • ch.cern.FTS3-Service
  • ch.cern.FTS3-StalledTransfers
  • org.bdii.GLUE2-Validate

Tests removed:

  • org.nordugrid.ARC-CE-LFC-result
  • org.nordugrid.ARC-CE-lfc
  • org.nordugrid.ARC-CE-LFC-submit
  • org.sam.WN-RepDel
  • org.sam.WN-RepISenv
  • org.sam.WN-RepFree
  • org.sam.WN-RepCr
  • org.sam.WN-RepGet
  • org.sam.WN-RepRep
  • org.sam.WN-Rep