Difference between revisions of "Agenda-11-02-2013"
(10 intermediate revisions by 3 users not shown) | |||
Line 1: | Line 1: | ||
{{Template:Op menubar}} | {{Template:Op menubar}} | ||
= Detailed agenda: Grid Operations Meeting | = Detailed agenda: Grid Operations Meeting 11 Feb 2013 = | ||
{| | {| | ||
Line 22: | Line 22: | ||
** Over 15 products will be ready to be released [https://rt.egi.eu/rt/Dashboards/4248/Software%20Provisioning%20UMD-2 Full list] | ** Over 15 products will be ready to be released [https://rt.egi.eu/rt/Dashboards/4248/Software%20Provisioning%20UMD-2 Full list] | ||
*** Some important security vulnerabilities | *** Some important security vulnerabilities | ||
**** VOMS: | **** VOMS: important update | ||
**** DPM & LFC, | **** DPM & LFC, important update | ||
**** UNICORE/X6 -> still in verification | **** UNICORE/X6 -> still in verification | ||
*** GFAL 1.14.0 will solve the ATLAS data transfer problems: timeout issues in lcg_util and/or GFAL commands. In particular: lcg-cp fails with files over 5 GB [https://ggus.eu/tech/ticket_show.php?ticket=89998 GGUS ticket] | *** GFAL 1.14.0 will solve the ATLAS data transfer problems: timeout issues in lcg_util and/or GFAL commands. In particular: lcg-cp fails with files over 5 GB [https://ggus.eu/tech/ticket_show.php?ticket=89998 GGUS ticket] | ||
Line 31: | Line 31: | ||
** Security updates for UMD-1 | ** Security updates for UMD-1 | ||
*** DPM & LFC 1.8.6 and VOMS 2.0.10-1 | *** DPM & LFC 1.8.6 and VOMS 2.0.10-1 | ||
* Middleware products verified for the support of SHA-2 proxies and certificates | |||
** [https://wiki.egi.eu/wiki/Middleware_products_verified_for_the_support_of_SHA-2_proxies_and_certificates List of Products] | |||
* Several NGIs expressed their interest in the EMI WN tarball. | |||
** Work being done by Matt Doigge from Lancaster UK | |||
** For reference, the GGUS tickets regarding the WN tarball status are: | |||
*** Tickets: [https://ggus.eu/ws/ticket_info.php?ticket=91032 91032] ; [https://ggus.eu/ws/ticket_info.php?ticket=91145 91145] ; [https://ggus.eu/ws/ticket_info.php?ticket=91007 91007] | |||
=== 2. Operational Issues === | === 2. Operational Issues === | ||
==== 2.1 Status of unsupported middleware update ==== | ==== 2.1 Status of unsupported middleware update ==== | ||
===== Status of the tickets opened by COD ===== | |||
* There are still 11 sites raising alarms | |||
** Two sites (NGI_RU) with missing scheduled downtime | |||
===== Status of WN/DPM/LFC/dCache tickets ===== | |||
* On 08 Feb 2013: 26 sites with unsupported deployed middleware (8 NGIs) | |||
* 5 Sites waiting for the WN tarballs | |||
* 17 sites in downtime | |||
==== 2.2 Updates from DMSU ==== | ==== 2.2 Updates from DMSU ==== | ||
===== list-match problem with EMI2 WMS ===== | |||
Details [https://ggus.eu/tech/ticket_show.php?ticket=90240 GGUS #90240] | |||
Some CEs have enabled only a group or a role in their queues, not the entire VO: | |||
GlueCEAccessControlBaseRule: VOMS:/gridit/ansys | |||
GlueCEAccessControlBaseRule: VOMS:/gridit/ansys/Role=SoftwareManager | |||
so, when your primary attribute is: | |||
attribute : /gridit/ansys/Role=NULL/Capability=NULL | |||
if you use an EMI-2 WMS, you cannot match those resources (whereas you can if you use an EMI-1 WMS) | |||
It seems that the problem is in the value of ''WmsRequirements'' contained in the file ''/etc/glite-wms/glite_wms.conf'': the filter set in that variable is different from the one used in the EMI-1 WMS. The developers are investigating it | |||
'''UPDATE Jan 31st''': the fix will be released in EMI-3. However, the developers provided us with an rpm, ''glite-wms-classad_plugin-3.4.99-0.sl5.x86_64.rpm'', which we have installed on our EMI-2 WMS servers, and the issue has been fixed | |||
===== LFC-Oracle problem ===== | |||
Details [https://ggus.eu/tech/ticket_show.php?ticket=90701 GGUS #90701] | |||
The error is occurring with EMI2 emi-lfc_oracle-1.8.5-1.el5 and Oracle 11: | |||
#lfc-ls lfc-1-kit:/grid | |||
send2nsd: NS002 - send error : client_establish_context: The server had a problem while authenticating our connection | |||
lfc-1-kit:/grid: Could not secure the connection | |||
Experts suspect it is due to the use of Oracle 11 client when the LFC code has been | |||
compiled against the Oracle 10 API. The LFC developers expect | |||
to provide rpms built against Oracle 11 shortly. | |||
'''UPDATE Feb 5th''': solved by applying the following workaround: | |||
add in /etc/sysconfig/lfcdaemon: | |||
export LD_PRELOAD=/usr/lib64/libssl.so:/usr/lib64/libglobus_gssapi_gsi.so.4 | |||
The Oracle 11 build was not tested because the service is in production | |||
=== 3. AOB === | === 3. AOB === | ||
Line 43: | Line 102: | ||
=== 4. Minutes === | === 4. Minutes === | ||
[[Category:Grid_Operations_Meetings]] | [[Category:Grid_Operations_Meetings]] | ||
[https://indico.egi.eu/indico/materialDisplay.py?materialId=minutes&confId=1328 Minutes on indico page] |
Latest revision as of 16:00, 18 February 2013
Detailed agenda: Grid Operations Meeting 11 Feb 2013
Audio conference link | Pass: gridops |
Audio conference details | Indico page |
1. Middleware releases and staged rollout
1.1. Update on the status of EMI updates
1.2. Staged Rollout
- Final preparation of the UMD 2.4.0
- Over 15 products will be ready to be released Full list
- Some important security vulnerabilities
- VOMS: important update
- DPM & LFC, important update
- UNICORE/X6 -> still in verification
- GFAL 1.14.0 will solve the ATLAS data transfer problems: timeout issues in lcg_util and/or GFAL commands. In particular: lcg-cp fails with files over 5 GB GGUS ticket
- updates for UMD-1
- L&B 3.2.9
- Security updates for UMD-1
- DPM & LFC 1.8.6 and VOMS 2.0.10-1
- Middleware products verified for the support of SHA-2 proxies and certificates
- Several NGIs expressed their interest in the EMI WN tarball.
2. Operational Issues
2.1 Status of unsupported middleware update
Status of the tickets opened by COD
- There are still 11 sites raising alarms
- Two sites (NGI_RU) with missing scheduled downtime
Status of WN/DPM/LFC/dCache tickets
- On 08 Feb 2013: 26 sites with unsupported deployed middleware (8 NGIs)
- 5 Sites waiting for the WN tarballs
- 17 sites in downtime
2.2 Updates from DMSU
list-match problem with EMI2 WMS
Details GGUS #90240
Some CEs have enabled only a group or a role in their queues, not the entire VO:
GlueCEAccessControlBaseRule: VOMS:/gridit/ansys
GlueCEAccessControlBaseRule: VOMS:/gridit/ansys/Role=SoftwareManager
so, when your primary attribute is:
attribute : /gridit/ansys/Role=NULL/Capability=NULL
if you use an EMI-2 WMS, you cannot match those resources (whereas you can if you use an EMI-1 WMS).
It seems that the problem is in the value of WmsRequirements contained in the file /etc/glite-wms/glite_wms.conf: the filter set in that variable is different from the one used in the EMI-1 WMS. The developers are investigating it.
UPDATE Jan 31st: the fix will be released in EMI-3. However, the developers provided us with an rpm, glite-wms-classad_plugin-3.4.99-0.sl5.x86_64.rpm, which we have installed on our EMI-2 WMS servers, and the issue has been fixed.
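To illustrate the matching that users expected here, the following is a minimal Python sketch, not the actual WMS filter. It assumes that the Role=NULL and Capability=NULL placeholders are dropped from the primary attribute before it is compared, by exact match, with each VOMS: access rule. The helper names normalize_fqan and rule_matches are hypothetical and do not come from the WMS code.

```python
def normalize_fqan(fqan: str) -> str:
    """Drop the Role=NULL and Capability=NULL placeholders from a VOMS FQAN."""
    placeholders = ("Role=NULL", "Capability=NULL")
    parts = [p for p in fqan.split("/") if p and p not in placeholders]
    return "/" + "/".join(parts)


def rule_matches(rule: str, primary_fqan: str) -> bool:
    """Assumed semantics: a 'VOMS:<fqan>' rule matches when the normalized FQANs are equal."""
    prefix = "VOMS:"
    if not rule.startswith(prefix):
        return False
    return normalize_fqan(rule[len(prefix):]) == normalize_fqan(primary_fqan)


# Access rules and primary attribute taken from the ticket description.
rules = [
    "VOMS:/gridit/ansys",
    "VOMS:/gridit/ansys/Role=SoftwareManager",
]
primary = "/gridit/ansys/Role=NULL/Capability=NULL"

matched = [r for r in rules if rule_matches(r, primary)]
print(matched)  # ['VOMS:/gridit/ansys']
```

Under these assumptions the group-only rule matches the primary attribute, while the Role=SoftwareManager rule does not.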
LFC-Oracle problem
Details GGUS #90701
The error is occurring with EMI2 emi-lfc_oracle-1.8.5-1.el5 and Oracle 11:
#lfc-ls lfc-1-kit:/grid
send2nsd: NS002 - send error : client_establish_context: The server had a problem while authenticating our connection
lfc-1-kit:/grid: Could not secure the connection
Experts suspect it is due to the use of Oracle 11 client when the LFC code has been compiled against the Oracle 10 API. The LFC developers expect to provide rpms built against Oracle 11 shortly.
UPDATE Feb 5th: solved by applying the following workaround:
add in /etc/sysconfig/lfcdaemon:
export LD_PRELOAD=/usr/lib64/libssl.so:/usr/lib64/libglobus_gssapi_gsi.so.4
The Oracle 11 build was not tested because the service is in production.
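A site applying the workaround on several LFC hosts may want to add the line idempotently. The sketch below is one possible way to do that in Python. The helper ensure_line is hypothetical, and the demonstration runs against a temporary file rather than the real /etc/sysconfig/lfcdaemon. It is assumed that the daemon must be restarted afterwards to pick up the setting, which the ticket update does not state.

```python
import tempfile
from pathlib import Path

# The workaround line quoted in the ticket update.
LD_PRELOAD_LINE = (
    "export LD_PRELOAD=/usr/lib64/libssl.so:/usr/lib64/libglobus_gssapi_gsi.so.4"
)


def ensure_line(path: Path, line: str) -> bool:
    """Append `line` to `path` unless an identical line is already present.

    Returns True if the file was modified, False otherwise.
    """
    text = path.read_text() if path.exists() else ""
    if line in (existing.strip() for existing in text.splitlines()):
        return False
    # Make sure the new line starts on its own line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    path.write_text(text + separator + line + "\n")
    return True


# Demonstration on a stand-in file; on a real host the path would be
# /etc/sysconfig/lfcdaemon and root privileges would be needed.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "lfcdaemon"
    demo.write_text("# existing settings")
    first = ensure_line(demo, LD_PRELOAD_LINE)
    second = ensure_line(demo, LD_PRELOAD_LINE)
    content = demo.read_text()

print(first, second)  # True False
```

Running the helper twice leaves a single copy of the export line, so it can be safely re-applied.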