Agenda-28-01-2013

Detailed agenda: Grid Operations Meeting 28 January 2013

Audio conference link No password
Audio conference details Indico page


1. Middleware releases and staged rollout

1.1. Update on the status of EMI updates

1.2. Staged Rollout

  • For UMD-1 (UMD-1.9.0) we have 2 products:
    • Gridsite 1.7.25, which is under verification
    • L&B 3.2.9 in Staged Rollout (same version as in UMD-2 production)
  • For UMD-2 (UMD-2.4.0)
    • Several products ready for production:
      • Cream 1.14.2
      • Cream Torque 2.0.0-2
      • Blah 1.18.2
      • All ARC 2.0.1 components
      • Glite Mpi - 1.4.0
      • Globus RLS 5.2.2
      • Globus MyProxy 5.2.2
      • Gridway - 5.12.0
    • In Staged Rollout:
      • WMS 3.4.0 (some issues found)
      • Globus default security 5.2.2
      • Security Integration - 2.2.1
      • Gridsite 1.7.25
  • Other products
    • CA update, version 1.52-1 (under SR), to be released on 30-01-2013
    • SAM-Update 20 (SR done)


  • New Products
    • IGE gridSAM (in verification)

2. Operational Issues

2.1 Status of unsupported middleware update

2.2 Updates from DMSU

lcg-gt problems with dCache SEs

Details GGUS #90807

The current versions of lcg-util and GFAL (1.13.9-0) return the following error, apparently only with dCache SEs:

$ lcg-gt -D srmv2 -T srmv2 srm://srm.triumf.ca/dteam/generated/2013-01-25/filed56b1d3e-76f8-4f5a-9b32-94e6d038ab4b gsiftp
gsiftp://dpool13.triumf.ca:2811/generated/2013-01-25/filed56b1d3e-76f8-4f5a-9b32-94e6d038ab4b
[ERROR] No request token returned with SRMv2

With an older version of lcg-util, such as the one deployed in gLite, the lcg-gt command works fine. Nagios does not detect this problem because it is still gLite-based (lcg_util-1.11.16-2 and GFAL-client-1.11.16-2).

The developers are investigating this issue.

LFC-Oracle problem

Details GGUS #90701

The error occurs with EMI2 emi-lfc_oracle-1.8.5-1.el5 and Oracle 11:

# lfc-ls lfc-1-kit:/grid

send2nsd: NS002 - send error : client_establish_context: The server had a problem while authenticating our connection
lfc-1-kit:/grid: Could not secure the connection

Experts suspect the cause is the use of the Oracle 11 client while the LFC code was compiled against the Oracle 10 API. The LFC developers expect to provide RPMs built against Oracle 11 shortly.

list-match problem with EMI2 WMS

Details GGUS #90240

Some CEs have enabled only a group or a role in their queues, not the entire VO:

GlueCEAccessControlBaseRule: VOMS:/gridit/ansys
GlueCEAccessControlBaseRule: VOMS:/gridit/ansys/Role=SoftwareManager

so, when your primary attribute is:

attribute : /gridit/ansys/Role=NULL/Capability=NULL

an EMI-2 WMS cannot match those resources, whereas an EMI-1 WMS can.

The problem seems to lie in the value of WmsRequirements in the file /etc/glite-wms/glite_wms.conf: the filter set in that variable differs from the one used in the EMI-1 WMS. The developers are investigating.
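
The match that users expect can be illustrated with a small shell sketch. This is not the WMS matchmaking code and not the WmsRequirements filter. It assumes, as a simplification, that a rule matches when it equals the FQAN with the empty Role=NULL and Capability=NULL parts removed. The function names normalize_fqan and fqan_matches_rule are hypothetical.

```shell
# Illustration only: a simplified view of FQAN-to-rule matching,
# not the filter used by any WMS release.

# normalize_fqan FQAN
# Drop the empty /Capability=NULL and /Role=NULL parts of a VOMS FQAN.
normalize_fqan() {
    printf '%s\n' "$1" | sed -e 's#/Capability=NULL$##' -e 's#/Role=NULL$##'
}

# fqan_matches_rule FQAN RULE
# Succeed when the GlueCEAccessControlBaseRule value names exactly this FQAN.
fqan_matches_rule() {
    [ "VOMS:$(normalize_fqan "$1")" = "$2" ]
}
```

Under this assumption, the primary attribute /gridit/ansys/Role=NULL/Capability=NULL matches the rule VOMS:/gridit/ansys but not VOMS:/gridit/ansys/Role=SoftwareManager, which is the behaviour reported for the EMI-1 WMS.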

proxy renewal problems on EMI1 WMS

Details GGUS #89801

Under some circumstances, ICE cannot renew the user credentials because glite-wms-ice-proxy-renew processes hang. The cause is believed to be a Savannah bug that is already fixed in EMI2.

Problems with aliased DNS names of myproxy

Details GGUS #89105

DNS aliases of a MyProxy server (e.g. used to implement round-robin load balancing and/or high availability) may cause proxy renewal problems unless all DNS aliases, including the canonical name, are listed in the SubjectAltNames extension of the MyProxy server's host certificate.

The failure does not always appear (it depends on several conditions, such as the Globus version); however, sites are encouraged to use certificates that cover all the DNS aliases.
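
A site could check the coverage with a sketch like the following. It relies only on the text output of openssl x509. The function names are hypothetical, wildcard entries are not interpreted, and the host names in the usage line are placeholders.

```shell
# Sketch: report which DNS names are missing from a host certificate.

# extract_dns_names
# Read "openssl x509 -text" output on stdin and print the DNS
# SubjectAltNames, one per line (nothing if the extension is absent).
extract_dns_names() {
    grep -A1 'Subject Alternative Name' | tail -n 1 | tr ',' '\n' | sed -n 's/^ *DNS://p'
}

# list_cert_dns_names CERT
# Print the DNS SubjectAltNames of the PEM certificate CERT.
list_cert_dns_names() {
    openssl x509 -in "$1" -noout -text | extract_dns_names
}

# missing_aliases NAME...
# Read the covered names on stdin and print every NAME that is not covered.
missing_aliases() {
    covered=$(cat)
    for name in "$@"; do
        printf '%s\n' "$covered" | grep -Fxq "$name" || printf '%s\n' "$name"
    done
}
```

For example, assuming the host certificate is in /etc/grid-security/hostcert.pem, running list_cert_dns_names /etc/grid-security/hostcert.pem | missing_aliases myproxy.example.org px1.example.org prints the names that the certificate does not cover; empty output means every listed name is present.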

EMI-2 WN: yaim bug for cleanup-grid-accounts

Details GGUS #90486

Because of a bug, the cleanup-grid-accounts procedure does not work properly, so the occupied space on WNs may increase.

The yaim function config_lcgenv unsets $INSTALL_ROOT, so the path used by the cleanup-grid-accounts cron job is not valid:

# cat /etc/cron.d/cleanup-grid-accounts
PATH=/sbin:/bin:/usr/sbin:/usr/bin
36 3 * * * root /sbin/cleanup-grid-accounts.sh -v >> /var/log/cleanup-grid-accounts.log 2>&1
# tail /var/log/cleanup-grid-accounts.log
/bin/sh: /sbin/cleanup-grid-accounts.sh: No such file or directory
# ls -l /sbin/cleanup-grid-accounts.sh
ls: /sbin/cleanup-grid-accounts.sh: No such file or directory
# ls -l /usr/sbin/cleanup-grid-accounts.sh
-rwxr-xr-x 1 root root 6747 May 16  2012 /usr/sbin/cleanup-grid-accounts.sh

Until the fix is released to production, a workaround is to change the cleanup-grid-accounts cron job to use the correct path, for example:

# cat /etc/cron.d/cleanup-grid-accounts
PATH=/sbin:/bin:/usr/sbin:/usr/bin
16 3 * * * root /usr/sbin/cleanup-grid-accounts.sh -v >> /var/log/cleanup-grid-accounts.log 2>&1
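
The path change above could be scripted roughly as follows. This is a sketch that assumes GNU sed (for the -i option); the helper name fix_cleanup_cron is hypothetical. It takes the cron file as an argument so that it can be tried on a copy first.

```shell
# Sketch: point the cron entry at the real location of the script.
# fix_cleanup_cron CRON_FILE
fix_cleanup_cron() {
    cron_file="$1"
    # Rewrite only the wrong script path. The leading space in the pattern
    # keeps an already-correct /usr/sbin/... entry and the PATH= line
    # untouched, so running the function twice is harmless.
    sed -i 's# /sbin/cleanup-grid-accounts\.sh# /usr/sbin/cleanup-grid-accounts.sh#' "$cron_file"
}
```

Run as root, fix_cleanup_cron /etc/cron.d/cleanup-grid-accounts would apply the change in place; the schedule and the log redirection are left as they are.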

Another workaround is also possible:

/opt/glite/yaim/bin/yaim -r -s -n WN -n TORQUE_client -n GLEXEC_wn -f config_users

This does not execute config_lcgenv, therefore $INSTALL_ROOT is set correctly (to /usr).

Currently active surveys

3. AOB

3.2 Next meeting

4. Minutes