Difference between revisions of "Talk:SiteCertMan"

 
Latest revision as of 09:58, 20 December 2010


Questions

Site registration procedure

https://cic.gridops.org/index.php?section=rc&page=configuration has been checked, reviewed and integrated wherever appropriate.

Is the "site working hours" attribute mandatory? No. GOCDB carries the site's Operatinghours Start and Operatinghours End attributes only for historical reasons. Editing of these fields is prohibited.

Alessandro proposes modifying point 3 as follows. -- done.

3) site-admins side:
    a) register in GOC-DB, requesting the admin role for their own site
    b) at least one person (not necessarily a site admin) has to request the Security Officer role
    c) once the roles are approved, fill in any missing information, including the services to be monitored
    d) notify the NGI managers when done

The site will automatically appear in GGUS. Check whether this also works in the case of NGI'zed GGUS instances.

Gonçalo wouldn't include the registration of the site in the helpdesk system (GGUS) in the registration procedure, since a site may well give up and never reach certification. There is no point in populating tools other than GOCDB at this stage.

Eventually expand a bit on: How to apply for ops VO? How to apply for dteam VO? How to apply for any VO? -- less urgent

Tiziana: some security aspects could be checked. For example, that the local security mailing list is open to third-party posting, and that its archives are not accessible to the general public. -- done, see https://wiki.egi.eu/wiki/SiteCertMan/Required_information

Gonçalo: The security contacts entry shouldn't be empty. The address should work. Check security blacklists. -- done, see https://wiki.egi.eu/wiki/SiteCertMan/Required_information

Andres: recommends having only one address in the security contacts field. Usually more than one person should receive this mail in the end, so it can make sense to put the address of a mailing list there. -- done, see https://wiki.egi.eu/wiki/SiteCertMan/Required_information

Tiziana: a site needs to sign the site-NGI OLA (the former EGEE SLD document).

Site certification procedure

Expand as in https://twiki.cnaf.infn.it/twiki/bin/view/Sandbox/SiteCertification ?

Gonçalo would like to include some extra suggestions to what Alessandro proposes:

### lcg-CE checks ###

    1./ Normally, we also check whether the lcg-CE gridftp server is working (with a globus-url-copy). We also test uberftp.
            globus-url-copy -dbg -v -vb file:/home/csys/goncalo/teste.txt gsiftp://ce02.lip.pt/tmp/txt
            uberftp ce02.lip.pt

    2./ We first test the fork Job Manager before sending a job via the LRMS Job Manager. This gives a faster answer regarding the correct mapping of the user.
            globus-job-run ce02.lip.pt:2119/jobmanager-fork /bin/pwd

### CREAM-CE ###

    1./ Same as 1) for the lcg-CE

### SE ###

    1./ We normally check whether the srm commands work (e.g. srmls, srmcp and srmrm)

    2./ We test the lcg_utils tools for storing data in the SE by adding the ldap string of the site to a dummy top-bdii, and pointing the UI env var LCG_GFAL_INFOSYS to that dummy BDII.
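
The lcg-CE checks and the dummy-BDII trick above could be wrapped in a small script along these lines. This is only a sketch: the function names (run, check_lcg_ce, with_dummy_bdii), the DRY_RUN switch and the example hosts are made up for illustration, and the host:port form of the BDII endpoint is an assumption. The globus commands are the ones quoted above with the host and file made into parameters; uberftp is left out because it opens an interactive session.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 on a UI with a valid proxy to really run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# lcg-CE checks: gridftp copy first, then a job via the fork Job Manager.
# $1 = CE host, $2 = absolute path of a local test file (both placeholders).
check_lcg_ce() {
    ce="$1"
    src="$2"
    run globus-url-copy -dbg -v -vb "file:$src" "gsiftp://$ce/tmp/txt" &&
    run globus-job-run "$ce:2119/jobmanager-fork" /bin/pwd
}

# Run one command with LCG_GFAL_INFOSYS pointing at a dummy top-BDII.
# $1 = BDII endpoint (host:port is an assumption), rest = the command.
# The subshell keeps the variable from leaking into the calling shell.
with_dummy_bdii() {
    bdii="$1"
    shift
    ( LCG_GFAL_INFOSYS="$bdii"; export LCG_GFAL_INFOSYS; run "$@" )
}
```

For example, check_lcg_ce ce02.lip.pt /home/csys/goncalo/teste.txt prints the two commands in the order they would be run; the second is only attempted if the first succeeds, which matches the idea of testing the cheap checks before anything goes through the LRMS.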


How long should an initial maintenance downtime last? 2 weeks? Maybe shorter? For experienced sites/NGIs a couple of days for the whole procedure could be sufficient, and an additional maintenance downtime can always be defined. -- the length of the initial downtime should not be specified further.

Before the end of the maintenance downtime, check e.g. gridview for the availability/reliability forecast. -- included.

Should the GSTAT requirement be removed? Rather not (Gonçalo in response to Alessandro): it should be regarded as an aid and not as a requirement.

Expand on how to properly publish accounting data? -- done

Other references

https://wiki.egi.eu/wiki/Operations:Site_Certification

https://www.italiangrid.org/grid_operations/site_manager/register_new_site/detailed_information

https://www.italiangrid.org/grid_operations/site_manager/getting_started/form

Meeting URL: https://www.egi.eu/indico/conferenceDisplay.py?confId=213