The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations @ egi.eu.

NGI DE:Join as resource centre


NGI-DE/NGI-CH Operations Center

NGI-DE wiki


Join as Resource Centre


NGI-DE Site Registration and Certification Procedure

The National Grid Initiative Germany (NGI-DE) supports three types of middleware: gLite, UNICORE and Globus. To be an NGI-DE site, a site has to support at least one of the three middlewares and register in GOCDB. A UNICORE site must support UNICOREX and unicore-gateway of UNICORE6. A Globus site must support globus-GRIDFTP and GRAM5 of Globus version 5. A gLite site should have a site-BDII and a Storage Element (SE) and/or Computing Element (CE) with at least eight Worker Node cores.

The procedure below describes how to contribute resources to NGI-DE. If in doubt, please contact us by e-mail: ngi-de-admin@lists.kit.edu


STEP 1: Registration

Requirements:

  • willingness to set up the site
  • Primary Site Manager - the person entitled to represent the Resource Centre in NGI-DE, who must own a personal certificate

Actions by Primary Site Managers:

  • send a digitally signed e-mail to ngi-de-admin@lists.kit.edu that includes the following data and statements:
   Personal Data of Primary Site Manager:
   Name:

   Email:

   Telephone:

   Hours:

   Certificate DN:

   I'm the Primary Site Manager of the site described below.

   Site (GIIS) Name:

   Official Name of Hosting Institution:

   Domain: 
   Site Email Address:

   Site Telephone Number:

   Site Emergency Number:

   Country:

   All administrators and other necessary personnel at the site will be informed of and agree to abide by all Grid operating policies described at:

   the Grid Site Operations Security Policy

   The Site Security Contact and the team members will be informed of and agree to

    the Security Incident Response Policy 
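
Before sending the e-mail, it can help to check that no field of the template above was left blank. The following is only a minimal sketch: the field labels are copied from the template, while the helper names `missing_fields` and `render_form` are hypothetical and not part of any NGI-DE tooling. Signing the e-mail is still done in your mail client.

```python
# Hypothetical helper: the labels mirror the registration template above;
# the functions are illustrative and not part of any NGI-DE tool.
REQUIRED_FIELDS = [
    "Name", "Email", "Telephone", "Hours", "Certificate DN",
    "Site (GIIS) Name", "Official Name of Hosting Institution", "Domain",
    "Site Email Address", "Site Telephone Number", "Site Emergency Number",
    "Country",
]


def missing_fields(form: dict) -> list:
    """Return the template fields that are absent or blank in `form`."""
    return [f for f in REQUIRED_FIELDS if not str(form.get(f, "")).strip()]


def render_form(form: dict) -> str:
    """Render the fields as 'Label: value' lines, in template order."""
    return "\n".join(f"{f}: {form.get(f, '')}" for f in REQUIRED_FIELDS)
```

Calling `missing_fields(form)` on a partially filled dictionary returns the labels that still need a value, in template order.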

ROC Staff Actions:

  • open a GGUS ticket to follow up the procedure;
  • register the site in GOCDB as a candidate site;
  • confirm registration to the Primary Site Manager;
  • create a dteam group (optional): /dteam/NGI_DE/..your_resource_center.


STEP 2: Preparation

Actions by Primary Site Manager:

  • register as a user in the GOCDB
  • apply for the 'Site Manager' role of your newly created site
  • after approval of your role by a regional manager:
  • fill in all missing information in GOCDB about the site, including names of machines. The most critical items are GIISURL and the "Nodes" section.
  • add other site administrators and security officers to GOCDB and assign the appropriate roles to them. There should be at least one 'Security Officer'.
  • create a site admin contact list and a security incident response (CSIRT) list and add them to your GOCDB entry. Each mailing list should reach at least two people, who should be willing to react quickly to requests, in particular to security incidents.

Actions by All Admins:

ROC staff Actions:

  • test site and especially security contacts
  • switch site to non-certified status
  • enable monitoring of the site
  • add the new site to the configuration file of the NGI_DE Nagios monitoring test instance (rocmon-fzk.gridka.de). After reconfiguration, the UNCERTIFIED sites will appear at https://rocmon-fzk.gridka.de/nagios/. To access the page you need to register as a member of the dteam VO.

STEP 3: Installation

Site Admins Actions:

  • check which version of middleware is obligatory for production installations:
      • gLite middleware page: http://glite.web.cern.ch/glite/packages/latestRelease.asp
      • UNICORE middleware page: http://www.unicore.eu/index.php
  • install the grid middleware according to the documentation on the release pages.
  • a minimum set of services is:
      • for gLite: one computing element (CE) and/or one storage element (SE), eight worker nodes (WNs), as well as a SiteBDII and a monitoring box (MON)
      • for UNICORE: UNICOREX and gateway
      • for Globus: globus-GRIDFTP and GRAM
  • use this topBDII during the certification process: BDII_HOST=bdii-fzk.gridka.de
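
The minimum set of services listed above can be written down as a small self-check. This is a sketch under stated assumptions: the function name `meets_minimum` and the service labels used as set members are hypothetical shorthand for the services named in the list, and the check says nothing about whether the services are configured correctly.

```python
def meets_minimum(middleware: str, services: set, worker_nodes: int = 0) -> bool:
    """Check a planned installation against the minimum service set.

    `services` holds hypothetical short labels for installed services;
    `worker_nodes` is only considered for gLite.
    """
    if middleware == "gLite":
        # CE and/or SE, plus SiteBDII and MON, plus eight worker nodes
        has_ce_or_se = bool({"CE", "SE"} & services)
        has_info_and_mon = {"SiteBDII", "MON"} <= services
        return has_ce_or_se and has_info_and_mon and worker_nodes >= 8
    if middleware == "UNICORE":
        return {"UNICOREX", "gateway"} <= services
    if middleware == "Globus":
        return {"globus-GRIDFTP", "GRAM"} <= services
    raise ValueError(f"unknown middleware: {middleware!r}")
```

For example, `meets_minimum("UNICORE", {"UNICOREX"})` is false because the gateway is missing.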

Important notes:

STEP 4: Certification

Actions by Site Admins:

  • inform the ROC that the site is fully installed and configured properly;
  • fix issues raised by ROC staff;
  • register your site in the NGI-DE helpdesk as support staff at https://helpdesk.ngi-de.eu/index.php?mode=register
  • subscribe all site admins to your site admins mailing list, and register your site mailing list with the NGI-DE operations mailing list (NGI-DE-OPERATIONS@LISTSERV.DFN.DE). There you can ask questions, share your expertise and learn about recent news relevant to NGI-DE.

ROC staff actions:

  • check whether the site is fully functional and inform the site managers about detected issues;
  • if everything is OK, switch the site to certified status and schedule an "Initial maintenance" downtime of five working days, which is needed to check that the site works properly in the production environment (some features can be verified only in production mode). At most three hours after the site is switched to Certified, it should be visible in the NGI-DE production instance, https://ngi-de-nagios.gridka.de;
  • if everything is OK and the initial scheduled downtime is over, the site is fully certified!
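
To estimate when the five working days of initial downtime end, a rough calculation such as the following may be useful. It is a sketch only: it assumes that working days are Monday to Friday, ignores public holidays, and counts from the day after the start date; the function name `add_working_days` is hypothetical, and the ROC may count differently.

```python
from datetime import date, timedelta


def add_working_days(start: date, days: int) -> date:
    """Return the date reached `days` working days (Mon-Fri) after `start`.

    Assumption: public holidays are not taken into account.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current
```

A call such as `add_working_days(date(2022, 3, 7), 5)` steps over the weekend and lands on the following Monday.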

FINAL REMARKS

Some important information for certified sites concerning operation in NGI-DE:

  1. If your site fails Nagios tests, you may receive a ticket 24 hours after the alarm notification. Please be proactive: monitor your site and fix problems before tickets are raised.
  2. Site availability and reliability are checked statistically once per month. Sites with an availability of less than 70% and a reliability of less than 75% are asked through a GGUS ticket to explain the poor performance. Sites with an availability of less than 70% for three consecutive months will be suspended.
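
The monthly thresholds in item 2 can be expressed as a simple rule. The sketch below reads the text literally (an explanation is requested only when both figures are below their thresholds) and uses a hypothetical function name and hypothetical labels "ok", "explain" and "suspend"; it is not the tool used for the official monthly check.

```python
def monthly_actions(history):
    """Map monthly (availability, reliability) percentages to an action.

    `history` is a list of tuples, oldest month first. The returned labels
    are illustrative: "ok", "explain" (GGUS ticket), or "suspend".
    """
    actions = []
    consecutive_low = 0  # months in a row with availability below 70%
    for availability, reliability in history:
        consecutive_low = consecutive_low + 1 if availability < 70 else 0
        if consecutive_low >= 3:
            actions.append("suspend")
        elif availability < 70 and reliability < 75:
            actions.append("explain")
        else:
            actions.append("ok")
    return actions
```

A single good month resets the count of consecutive low-availability months, so only three low months in a row lead to "suspend".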