
Difference between revisions of "Agenda-14-03-2011"

(Created page with '==Detailed Agenda== ===1 - Information (Mario)=== * Staged rollout wiki and procedures updated: ** https://wiki.egi.eu/wiki/Staged-Rollout ** https://wiki.egi.eu/wiki/Staged-ro…')
 
 
(19 intermediate revisions by 4 users not shown)
Initial revision, at page creation (wiki source):

==Detailed Agenda==

===1 - Information (Mario)===

* Staged rollout wiki and procedures updated:
** https://wiki.egi.eu/wiki/Staged-Rollout
** https://wiki.egi.eu/wiki/Staged-rollout-procedures
* New Early Adopters ''portal'' (updated): https://www.egi.eu/earlyAdopters/
** https://www.egi.eu/earlyAdopters/table
** https://www.egi.eu/earlyAdopters/teams
* Work in progress to implement all needed features in the '''staged-rollout''' RT queue, for the staged rollout tests
* '''All components of gLite 3.2 are presently covered by EAs, except the tarball WN and UI, so teams are needed to cover them'''

===2 - Staged Rollout (Mario)===

====2.1 - gLite 3.1====
* LFC and DPM 1.8.0-1: all patches will be rejected since they include voms-api-c(pp) with a memory leak; new ones will be produced
* FTS/FTM/FTA: in certification.

====2.2 - gLite 3.2====
* Ready for Certification, in Certification - Upcoming in the next few days to staged rollout:
** L&B 2.1.10
** Torque (utils, server, client)
** UI 3.2.9
** VOBOX 3.2.11
** FTA/FTM/FTS 2.2.5
* In staged rollout
** WN 3.2.10: No major problems found in staged rollout
** glite-CLUSTER 3.2.2: No major problems found in staged rollout
* Urgent fix in CREAM
** Problem reported by LHCb with CREAM 1.6.4 - there was an urgent fix that went through staged rollout in a very short time (less than 1 day). New version: CREAM 1.6.5

====2.3 - Operational Tools====
* Staged rollout of SAM update 9 https://rt.egi.eu/rt/Ticket/Display.html?id=1281
** Problems found in the test, in contact with developers.

===3 - Operational Issues (all)===

====3.1 - IGTF CA 1.38 and VOMS-ADMIN (Mario)====
Problem found in production VOMS-ADMIN server after update to the latest IGTF-CA 1.38 with the new EGI policy format.
A post-mortem analysis is in progress and will be documented soon.

====3.2 - Where to report problems/issues for SW in staged rollout (Mario)====

====3.3 - TopBDII has http://goc.gridops.org/gocdbpi/... hardwired, but this will be decommissioned (Mario)====

* Followed in GGUS ticket: https://gus.fzk.de/dmsu/dmsu_ticket.php?ticket=67544
* The new endpoint is: https://goc.egi.eu/gocdbpi/...
* Discussions and progress with the developers on one side, Ops tools and Operations on the other side

====3.4 - EGI BROADCAST spam must be culled (Ulf Tigerstedt, NGI_NDGF and NGI_FI)====

As '''everything''' gets spammed with EGI BROADCASTs nowadays, the people getting them in 10-folds are
beginning to treat them as spam. Could the system be improved to not spam every mailing list
available for every single thing? As a ROD I'm currently getting 4 copies, some others are getting 14...

From Cyril Lorphelin

Hi Mario, Ulf

This problem is quite difficult to solve.
We offer a tool that allows emails to be sent widely.
If people do not choose the target properly, everybody will be informed.

We have added the option to create models/templates with specific subjects and specific targets; this is probably a way to reduce spam.
It means that people should select the right targets when they are creating a model.

I have no obvious solution.

===4 - AOB===

Next Grid Operations Meeting:

14 March 2011, 14h00 Amsterdam time

Latest revision (wiki source):

[[Category:Grid Operations Meetings]]

=Detailed Agenda=

==1 - Information (Mario)==

===1.1 - Update on status of the EMI 1.0 release===
C. Aiftimiei (EMI release manager) will update the operations community about the current status of the EMI 1.0 release ([https://www.egi.eu/indico/conferenceDisplay.py?confId=428 slides])

===1.2 - Transition from lcg-CE to CREAM-CE===

Mail from Gergely Sipos March 1 (snippet):

''As LCG-CE hosts are gradually being replaced by CREAM-CEs within EGI, the gLite developer consortium has decided to focus its effort on the CREAM-CE and terminate support for the LCG-CE software. The exact schedule is still under discussion, but the LCG-CE software will likely reach its end of support during the summer of 2011.''

''In order to help EGI user communities prepare to work with CREAM-CEs, the Operations and User Support teams of EGI.eu have prepared an information page about the CREAM-CE software and about the transfer of applications from LCG-CE to CREAM-CE:''

[[FAQ:_lcg-ce_to_cream-ce]],
Support team: ucst@egi.eu

===1.3 - gLite 3.1 CREAM reached end of life and DPM===

Mail from Tiziana March 1 for the CREAM in gLite 3.1

For DPM in gLite 3.1, the following NGIs showed interest: Italy, UK, Greece, Georgia, Poland (1 site).
Before we can ask the developers to produce new RPMs that depend on a voms-api version with the memory leak fixed,
we need early adopter sites for the test. Currently there is only one EA, from the UK.

===1.4 - Phasing out of the EGEE Grid name===

Mail from Tiziana March 3

Instructions are available at:
[[MAN1_How_to_publish_Site_Information]]

As of March 10: 168 sites still publish EGEE (http://gstat.egi.eu/gstat/summary/GRID/EGEE/)

Every NGI is requested to follow up on this issue internally with the respective sites.

===1.5 - Decommissioning of the old SAM infrastructure===

EGI Broadcast March 7

''Decommission the old SAM infrastructure by the end of June 2011.''

''Document which describes the new programmatic interface:''

https://tomtools.cern.ch/confluence/display/SAM/Migration+to+the+new+SAM+portal+and+programmatic+interface

List of machines which are contacting old SAM (Nagios instances excluded):

https://tomtools.cern.ch/confluence/display/SAM/List+of+hosts+using+the+old+system

Every NGI is requested to follow up on this issue internally with the respective sites.

==2 - Staged Rollout (Mario)==

===2.1 - gLite 3.1===
* FTS/FTM/FTA 2.2.5: in certification.
* Torque_Utils: in staged rollout

===2.2 - gLite 3.2===
* Ready for Certification, in Certification - Upcoming in the next few days to staged rollout:
** L&B 2.1.11
** Torque_utils: recertification because of configuration issues
** FTS/A/M 2.2.5
* In staged rollout
** TopBDII 5.1.22: under test - '''Contains the updated GOCDB URL'''.
** SiteBDII 5.1.22: under test
** VOBOX 3.2.11: in staged rollout
** UI 3.2.9: in staged rollout
** Torque clients/server: Torque utils has configuration issues, which are being resolved by the developer.

===2.3 - Operational Tools===
* Staged rollout of SAM update 9 https://rt.egi.eu/rt/Ticket/Display.html?id=1281
** SAM-U9 successfully released on 08/03/2011

==3 - Operational Issues (all)==

===3.1 [[Jobs_work_directory_and_temportary_directory| Jobs work directory and temporary directory]]===
We (EGI.eu) would like to discuss and finalize a long-standing issue which affects the LRMS administrators, concerning the management of the job work directory and the usage of temporary directories. A proposal is presented for discussion.

===3.2 NGI wide WMS and top-level BDII used by Nagios===
I would like to discuss the setup of NGI Nagios services in terms of the WMS/BDII used. Is there a recommendation on which WMS should be used for good performance and reliability of Nagios tests?
Is it good practice to set up a dedicated WMS for Nagios tests with a BDII limited to NGI resources?
My concern comes from the fact that our CEs are sometimes marked as having problems because of a non-responsive BDII/WMS.
See ticket https://gus.fzk.de/dmsu/dmsu_ticket.php?ticket=67631 (-- Tomas Kouba (NGI_CZ))

===3.3 EGI broadcast: issue of too many notifications===

Actively followed in RT ticket:

https://rt.egi.eu/rt/Ticket/History.html?id=1409

==4 - AOB==

Next Grid Operations Meeting:

28 March 2011, 14h00 Amsterdam time

----
Back [[GridOpsMeeting]]

Latest revision as of 18:07, 29 November 2012


Detailed Agenda

1 - Information (Mario)

1.1 - Update on status of the EMI 1.0 release

C. Aiftimiei (EMI release manager) will update the operations community about the current status of the EMI 1.0 release (slides: https://www.egi.eu/indico/conferenceDisplay.py?confId=428)

1.2 - Transition from lcg-CE to CREAM-CE

Mail from Gergely Sipos March 1 (snippet):

As LCG-CE hosts are gradually being replaced by CREAM-CEs within EGI, the gLite developer consortium has decided to focus its effort on the CREAM-CE and terminate support for the LCG-CE software. The exact schedule is still under discussion, but the LCG-CE software will likely reach its end of support during the summer of 2011.

In order to help EGI user communities prepare to work with CREAM-CEs, the Operations and User Support teams of EGI.eu have prepared an information page about the CREAM-CE software and about the transfer of applications from LCG-CE to CREAM-CE:

FAQ:_lcg-ce_to_cream-ce, Support team: ucst@egi.eu

1.3 - gLite 3.1 CREAM reached end of life and DPM

Mail from Tiziana March 1 for the CREAM in gLite 3.1

For DPM in gLite 3.1, the following NGIs showed interest: Italy, UK, Greece, Georgia, Poland (1 site). Before we can ask the developers to produce new RPMs that depend on a voms-api version with the memory leak fixed, we need early adopter sites for the test. Currently there is only one EA, from the UK.

1.4 - Phasing out of the EGEE Grid name

Mail from Tiziana March 3

Instructions are available at: MAN1_How_to_publish_Site_Information

As of March 10: 168 sites still publish EGEE (http://gstat.egi.eu/gstat/summary/GRID/EGEE/)

Every NGI is requested to follow up on this issue internally with the respective sites.

1.5 - Decommissioning of the old SAM infrastructure

EGI Broadcast March 7

Decommission the old SAM infrastructure by the end of June 2011.

Document which describes the new programmatic interface:

https://tomtools.cern.ch/confluence/display/SAM/Migration+to+the+new+SAM+portal+and+programmatic+interface

List of machines which are contacting old SAM (Nagios instances excluded):

https://tomtools.cern.ch/confluence/display/SAM/List+of+hosts+using+the+old+system

Every NGI is requested to follow up on this issue internally with the respective sites.


2 - Staged Rollout (Mario)

2.1 - gLite 3.1

  • FTS/FTM/FTA 2.2.5: in certification.
  • Torque_Utils: in staged rollout

2.2 - gLite 3.2

  • Ready for Certification, in Certification - Upcoming in the next few days to staged rollout:
    • L&B 2.1.11
    • Torque_utils: recertification because of configuration issues
    • FTS/A/M 2.2.5
  • In staged rollout
    • TopBDII 5.1.22: under test - Contains the updated GOCDB URL.
    • SiteBDII 5.1.22: under test
    • VOBOX 3.2.11: in staged rollout
    • UI 3.2.9: in staged rollout
    • Torque clients/server: Torque utils has configuration issues, which are being resolved by the developer.

2.3 - Operational Tools

  • Staged rollout of SAM update 9 https://rt.egi.eu/rt/Ticket/Display.html?id=1281
    • SAM-U9 successfully released on 08/03/2011

3 - Operational Issues (all)

3.1 Jobs work directory and temporary directory

We (EGI.eu) would like to discuss and finalize a long-standing issue which affects the LRMS administrators, concerning the management of the job work directory and the usage of temporary directories. A proposal is presented for discussion.

3.2 NGI wide WMS and top-level BDII used by Nagios

I would like to discuss the setup of NGI Nagios services in terms of the WMS/BDII used. Is there a recommendation on which WMS should be used for good performance and reliability of Nagios tests? Is it good practice to set up a dedicated WMS for Nagios tests with a BDII limited to NGI resources? My concern comes from the fact that our CEs are sometimes marked as having problems because of a non-responsive BDII/WMS. See ticket https://gus.fzk.de/dmsu/dmsu_ticket.php?ticket=67631 (-- Tomas Kouba (NGI_CZ))

3.3 EGI broadcast: issue of too many notifications

Actively followed in RT ticket:

https://rt.egi.eu/rt/Ticket/History.html?id=1409

4 - AOB

Next Grid Operations Meeting:

28 March 2011, 14h00 Amsterdam time


Back GridOpsMeeting