
Difference between revisions of "Agenda-11-02-2013"

(15 intermediate revisions by 3 users not shown)
Line 1: Line 1:
{{Template:Op menubar}}  
{{Template:Op menubar}}  


= Detailed agenda: Grid Operations Meeting 07 January 2013 =
= Detailed agenda: Grid Operations Meeting 11 Feb 2013 =


{|
{|
Line 21: Line 21:
* Final preparation of the [https://wiki.egi.eu/wiki/UMD-2:UMD-2.4.0 UMD 2.4.0]
* Final preparation of the [https://wiki.egi.eu/wiki/UMD-2:UMD-2.4.0 UMD 2.4.0]
** Over 15 products will be ready to be released [https://rt.egi.eu/rt/Dashboards/4248/Software%20Provisioning%20UMD-2 Full list]
** Over 15 products will be ready to be released [https://rt.egi.eu/rt/Dashboards/4248/Software%20Provisioning%20UMD-2 Full list]
*** DPM & LFC, VOMS solves some important security vulnerabilities
*** Some important security vulnerabilities
**** A vulnerability has been found in VOMS Java APIs
**** VOMS: important update
**** DPM & LFC, important update
**** UNICORE/X6 -> still in verification
*** GFAL 1.14.0 will solve the ATLAS data transfer problems: timeout issues in lcg_util and/or GFAL commands. In particular: lcg-cp fails with files over 5 GB  [https://ggus.eu/tech/ticket_show.php?ticket=89998 GGUS ticket]
*** GFAL 1.14.0 will solve the ATLAS data transfer problems: timeout issues in lcg_util and/or GFAL commands. In particular: lcg-cp fails with files over 5 GB  [https://ggus.eu/tech/ticket_show.php?ticket=89998 GGUS ticket]


Line 29: Line 31:
** Security updates for UMD-1
** Security updates for UMD-1
*** DPM & LFC 1.8.6 and VOMS 2.0.10-1
*** DPM & LFC 1.8.6 and VOMS 2.0.10-1
*** UNICORE/X6 -> still in verification
 
* Middleware products verified for the support of SHA-2 proxies and certificates
** [https://wiki.egi.eu/wiki/Middleware_products_verified_for_the_support_of_SHA-2_proxies_and_certificates List of Products]
 
*  Several NGIs expressed their interest in the EMI WN tarball.
** Work being done by Matt Doigge from Lancaster UK
** For reference, the GGUS tickets regarding the WN tarball status:
*** Tickets: [https://ggus.eu/ws/ticket_info.php?ticket=91032 91032] ; [https://ggus.eu/ws/ticket_info.php?ticket=91145 91145] ; [https://ggus.eu/ws/ticket_info.php?ticket=91007 91007]


=== 2. Operational Issues  ===
=== 2. Operational Issues  ===


==== 2.1 Status of unsupported middleware update ====
==== 2.1 Status of unsupported middleware update ====
===== Status of the tickets opened by COD =====
* 11 sites are still raising alarms
** Two sites (NGI_RU) with missing scheduled downtime
===== Status of WN/DPM/LFC/dCache tickets =====
* On 08 Feb 2013: 26 sites with unsupported deployed middleware (8 NGIs)
* 5 Sites waiting for the WN tarballs
* 17 sites in downtime
==== 2.2 Updates from DMSU ====
==== 2.2 Updates from DMSU ====
===== list-match problem with EMI2 WMS =====
Details [https://ggus.eu/tech/ticket_show.php?ticket=90240 GGUS #90240]
Some CEs have enabled only a group or a role in their queues, not the entire VO:
GlueCEAccessControlBaseRule: VOMS:/gridit/ansys
GlueCEAccessControlBaseRule: VOMS:/gridit/ansys/Role=SoftwareManager
so, when your primary attribute is:
attribute : /gridit/ansys/Role=NULL/Capability=NULL
if you use an EMI-2 WMS, you cannot match those resources (whereas you can if you use an EMI-1 WMS)
It seems that the problem is in the value of ''WmsRequirements'' in the file ''/etc/glite-wms/glite_wms.conf'': the filter set in that variable differs from the one used in the EMI-1 WMS. The developers are investigating.
'''UPDATE Jan 31st''': the fix will be released in EMI-3. However, the developers provided us with an RPM, ''glite-wms-classad_plugin-3.4.99-0.sl5.x86_64.rpm'', which we have installed on our EMI-2 WMS servers, and the issue has been fixed.
===== LFC-Oracle problem =====
Details [https://ggus.eu/tech/ticket_show.php?ticket=90701 GGUS #90701]
The error is occurring with EMI2 emi-lfc_oracle-1.8.5-1.el5 and Oracle 11:
#lfc-ls lfc-1-kit:/grid
send2nsd: NS002 - send error : client_establish_context: The server had a problem while authenticating our connection
lfc-1-kit:/grid: Could not secure the connection
Experts suspect it is due to the use of Oracle 11 client when the LFC code has been
compiled against the Oracle 10 API. The LFC developers expect
to provide rpms built against Oracle 11 shortly.
'''UPDATE Feb 5th''': solved by applying a workaround: add the following line to /etc/sysconfig/lfcdaemon:
export LD_PRELOAD=/usr/lib64/libssl.so:/usr/lib64/libglobus_gssapi_gsi.so.4
The Oracle 11 build was not tested because the service is in production.


=== 3. AOB  ===
=== 3. AOB  ===
Line 42: Line 102:
=== 4. Minutes  ===
=== 4. Minutes  ===
[[Category:Grid_Operations_Meetings]]
[[Category:Grid_Operations_Meetings]]
[https://indico.egi.eu/indico/materialDisplay.py?materialId=minutes&confId=1328 Minutes on indico page]

Latest revision as of 16:00, 18 February 2013




Detailed agenda: Grid Operations Meeting 11 Feb 2013

Audio conference link Pass: gridops
Audio conference details Indico page


1. Middleware releases and staged rollout

1.1. Update on the status of EMI updates

1.2. Staged Rollout

  • Final preparation of the UMD 2.4.0
    • Over 15 products will be ready to be released Full list
      • Some important security vulnerabilities
        • VOMS: important update
        • DPM & LFC, important update
        • UNICORE/X6 -> still in verification
      • GFAL 1.14.0 will solve the ATLAS data transfer problems: timeout issues in lcg_util and/or GFAL commands. In particular: lcg-cp fails with files over 5 GB GGUS ticket
  • updates for UMD-1
    • L&B 3.2.9
    • Security updates for UMD-1
      • DPM & LFC 1.8.6 and VOMS 2.0.10-1
  • Middleware products verified for the support of SHA-2 proxies and certificates
  • Several NGIs expressed their interest in the EMI WN tarball.
    • Work being done by Matt Doigge from Lancaster UK
    • For reference, the GGUS tickets regarding the WN tarball status: 91032; 91145; 91007

2. Operational Issues

2.1 Status of unsupported middleware update

Status of the tickets opened by COD
  • 11 sites are still raising alarms
    • Two sites (NGI_RU) with missing scheduled downtime
Status of WN/DPM/LFC/dCache tickets
  • On 08 Feb 2013: 26 sites with unsupported deployed middleware (8 NGIs)
  • 5 Sites waiting for the WN tarballs
  • 17 sites in downtime

2.2 Updates from DMSU

list-match problem with EMI2 WMS

Details GGUS #90240

Some CEs have enabled only a group or a role in their queues, not the entire VO:

GlueCEAccessControlBaseRule: VOMS:/gridit/ansys
GlueCEAccessControlBaseRule: VOMS:/gridit/ansys/Role=SoftwareManager

so, when your primary attribute is:

attribute : /gridit/ansys/Role=NULL/Capability=NULL

if you use an EMI-2 WMS, you cannot match those resources (whereas you can if you use an EMI-1 WMS)

It seems that the problem is in the value of WmsRequirements in the file /etc/glite-wms/glite_wms.conf: the filter set in that variable differs from the one used in the EMI-1 WMS. The developers are investigating.

UPDATE Jan 31st: the fix will be released in EMI-3. However, the developers provided us with an RPM, glite-wms-classad_plugin-3.4.99-0.sl5.x86_64.rpm, which we have installed on our EMI-2 WMS servers, and the issue has been fixed.
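
To make the expected behaviour concrete, the following is a minimal Python sketch of how the primary attribute above would be expected to match the published access rules. It is only an illustration: the helper names (normalize_fqan, rule_matches) are hypothetical, the exact-equality semantics for VOMS: rules is an assumption, and this is not the actual WmsRequirements expression used by either WMS version.

```python
def normalize_fqan(fqan):
    """Drop the 'Role=NULL' and 'Capability=NULL' placeholders from a VOMS FQAN."""
    placeholders = ("Role=NULL", "Capability=NULL")
    parts = [p for p in fqan.split("/") if p and p not in placeholders]
    return "/" + "/".join(parts)


def rule_matches(rule, primary_fqan):
    """Return True if a GlueCEAccessControlBaseRule value admits the primary FQAN.

    Assumption for this sketch: a 'VOMS:' rule matches on exact equality
    after normalization, and a 'VO:' rule matches on the VO name only.
    """
    user = normalize_fqan(primary_fqan)
    if rule.startswith("VOMS:"):
        return normalize_fqan(rule[len("VOMS:"):]) == user
    if rule.startswith("VO:"):
        return user.split("/")[1] == rule[len("VO:"):]
    return False


# Values taken from the ticket description above.
rules = [
    "VOMS:/gridit/ansys",
    "VOMS:/gridit/ansys/Role=SoftwareManager",
]
primary = "/gridit/ansys/Role=NULL/Capability=NULL"

matched = [r for r in rules if rule_matches(r, primary)]
print(matched)  # ['VOMS:/gridit/ansys']
```

Under these assumptions the group-only rule admits the user, which is the outcome reported for the EMI-1 WMS and for the patched EMI-2 WMS.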

LFC-Oracle problem

Details GGUS #90701

The error is occurring with EMI2 emi-lfc_oracle-1.8.5-1.el5 and Oracle 11:

#lfc-ls lfc-1-kit:/grid

send2nsd: NS002 - send error : client_establish_context: The server had a problem while authenticating our connection
lfc-1-kit:/grid: Could not secure the connection

Experts suspect it is due to the use of Oracle 11 client when the LFC code has been compiled against the Oracle 10 API. The LFC developers expect to provide rpms built against Oracle 11 shortly.

UPDATE Feb 5th: solved by applying a workaround: add the following line to /etc/sysconfig/lfcdaemon:
export LD_PRELOAD=/usr/lib64/libssl.so:/usr/lib64/libglobus_gssapi_gsi.so.4

The Oracle 11 build was not tested because the service is in production.
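
A site admin who wants to confirm that the workaround is in place could check the sysconfig file with something like the sketch below. The function names are hypothetical, and the sample text stands in for the contents of /etc/sysconfig/lfcdaemon so that the example runs anywhere; on a real server the text would be read from that file instead.

```python
# Libraries named in the workaround, in the order given in the ticket.
EXPECTED_LIBS = [
    "/usr/lib64/libssl.so",
    "/usr/lib64/libglobus_gssapi_gsi.so.4",
]


def preload_entries(sysconfig_text):
    """Return the libraries from the last uncommented 'export LD_PRELOAD=' line."""
    entries = []
    for raw in sysconfig_text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue  # skip commented-out settings
        if line.startswith("export LD_PRELOAD="):
            value = line.split("=", 1)[1].strip().strip("\"'")
            entries = [e for e in value.split(":") if e]
    return entries


def workaround_applied(sysconfig_text):
    """True if LD_PRELOAD lists exactly the two libraries from the workaround."""
    return preload_entries(sysconfig_text) == EXPECTED_LIBS


# Sample content (an assumption for illustration, not a real site's file).
sample = """\
# LFC daemon settings
export LD_PRELOAD=/usr/lib64/libssl.so:/usr/lib64/libglobus_gssapi_gsi.so.4
"""

print(workaround_applied(sample))
```

The check only inspects the configuration text; it does not verify that the libraries exist on disk or that the daemon was restarted after the change.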

3. AOB

3.2 Next meeting

4. Minutes

Minutes on indico page