Difference between revisions of "Agenda-03-12-2012"
Revision as of 11:54, 3 December 2012
Detailed agenda: Grid Operations Meeting 22 October 2012
EVO direct link | Pwd: gridops
EVO details | Indico page
1. Middleware releases and staged rollout
1.1. Update on the status of EMI updates
Cristina Aiftimiei (EMI) reports on the EMI updates
1.2. Staged Rollout
2. Operational Issues
2.1 Unsupported middleware update
2.2 Updates from DMSU
FTS jobs abort with "No site found for host xxx.yyy" error
Details: GGUS #87929 (https://ggus.eu/tech/ticket_show.php?ticket=87929)
From time to time, some FTS transfers fail with the message above. The problem was reported at CNAF, IN2P3, and GRIDKA, and was noticed by the ATLAS, CMS, and LHCb VOs. It appears and disappears at short, unpredictable intervals.
The exact cause is not yet understood; we are still investigating. Reports from sites affected by a similar problem would be appreciated.
Update Nov 20: The user reports that both problems have disappeared, probably fixed together.
LCMAPS-plugins-c-pep in glexec fails on RH6-based WNs
Details: GGUS #88520 (https://ggus.eu/tech/ticket_show.php?ticket=88520)
Because OpenSSL was replaced by NSS in RH6-based distributions, LCMAPS-plugins-c-pep invoked from glexec fails when talking to the Argus PEP via curl.
This is a known issue, mentioned in the EMI glexec release notes (http://www.eu-emi.eu/products/-/asset_publisher/1gkD/content/glexec-wn); however, the workaround is not described there in a usable way.
Once we are sure that we understand it properly and that the fix works, it will be documented on the UMD pages and passed to the developers so that they can:
- fix the documentation
- try to deploy the workaround automatically when an NSS-poisoned system is detected
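As an illustration of what "detecting an NSS-poisoned system" could look like, here is a minimal sketch in Python. It is not the developers' method: it assumes that the first line of `curl --version` lists the libraries libcurl is linked against as `name/version` tokens, and that the command-line curl on the WN uses the same libcurl build as the plugin. The function names `tls_backend` and `local_curl_backend` are hypothetical.

```python
import subprocess


def tls_backend(version_line: str) -> str:
    """Classify the TLS library named in the first line of `curl --version`.

    Assumption: linked libraries appear as name/version tokens,
    e.g. "libcurl/7.19.7 NSS/3.13.1.0 zlib/1.2.3".
    """
    tokens = {
        tok.split("/", 1)[0].lower()
        for tok in version_line.split()
        if "/" in tok
    }
    if "nss" in tokens:
        return "NSS"
    if "openssl" in tokens:
        return "OpenSSL"
    return "unknown"


def local_curl_backend() -> str:
    """Run the local curl binary and classify its TLS backend."""
    out = subprocess.run(
        ["curl", "--version"], capture_output=True, text=True, check=True
    ).stdout
    return tls_backend(out.splitlines()[0])
```

A deployment script could call `local_curl_backend()` and apply the documented workaround only when the result is `"NSS"`; the workaround itself is described in the known issues section and is not reproduced here.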
UPDATE Nov 19th: The fix is now well explained in the known issues section (http://www.eu-emi.eu/products/-/asset_publisher/1gkD/content/glexec-wn#Known_issues), and it will be included in a future yaim update.
WMS does not work with ARC CE 2.0
Details: GGUS #88630 (https://ggus.eu/tech/ticket_show.php?ticket=88630); further info: Condor ticket #3062 (https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=3062)
The jobid format changed in ARC CE release 12. The new format is not recognised by Condor versions prior to 7.8.3, but the current EMI-1 WMS uses Condor 7.8.0, so submission from WMS to ARC CE breaks.
The problem affects CMS SAM tests as well as CMS production jobs.
Updates to ARC CE 12 should therefore be done with care until the Condor update is available from EMI.
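The version constraint above can be checked mechanically. The following is a minimal sketch, assuming Condor versions are plain dotted integers such as `7.8.0`; the names `FIRST_FIXED`, `parse_version`, and `is_affected` are illustrative and not part of any Condor or WMS tool.

```python
# First Condor version that recognises the new ARC CE jobid format,
# as stated in the text above.
FIRST_FIXED = (7, 8, 3)


def parse_version(text: str) -> tuple:
    """Turn a dotted version string like "7.8.0" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_affected(condor_version: str) -> bool:
    """Return True if this Condor version predates the fix."""
    # Tuples compare element by element, so (7, 6, 10) < (7, 8, 3).
    return parse_version(condor_version) < FIRST_FIXED
```

For example, `is_affected("7.8.0")` is true for the Condor shipped with the current EMI-1 WMS, while `is_affected("7.8.6")` is false.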
UPDATE Nov 26th: Condor 7.8.6 was installed on a test WMS, and submission to ARC seemed to work fine. Since this WMS is no longer available, more thorough tests should be performed, perhaps using the EMI-TESTBED infrastructure.