
Difference between revisions of "Middleware issues and solutions"


Revision as of 08:45, 21 December 2011



The purpose of this page is to document recurring middleware issues with broad impact, together with their solutions and/or workarounds.

This page is maintained by the Distributed Middleware Support Unit of EGI.

CREAM

CREAM refuses TERENA-signed VOMS proxies

The TERENA eScience SSL CA sets the pathlen attribute among its basic constraints (see http://www.openssl.org/docs/apps/x509v3_config.html).

This triggers a bug in the VOMS Java API, which is used by CREAM: the API incorrectly applies the policy when checking the chain of the attribute certificate. The result is an error on job submission:

   FATAL - User  <user's DN> not authorized for operation {http://www.gridsite.org/namespaces/delegation-2}getProxyReq

A workaround is to avoid using certificates signed by the TERENA eScience SSL CA as host certificates of VOMS servers.
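To see whether a given CA certificate sets pathlen, its basic constraints can be inspected with openssl. The following is a minimal sketch only: has_pathlen is a hypothetical helper name, and the certificate path in the usage comment is a placeholder.

```shell
# Return success if the given PEM certificate carries a pathlen constraint.
# Hypothetical helper; the file path is supplied by the caller.
has_pathlen() {
    openssl x509 -in "$1" -noout -text | grep -q 'pathlen:'
}

# Example (placeholder path):
# has_pathlen /path/to/ca-cert.pem && echo "pathlen is set"
```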

The problem is fixed in voms-api-java-2.0.5, released in EMI. An unofficial backport to gLite 3.2 is available as unreleased patch [#4997]; no gLite 3.2 update is expected.

GGUS ticket [#76129]

/tmp fills up with x509up*glexec files

Nov 27, 2011

CREAM CE uses glexec to gain the right local user identity for the job. As a side effect, a matching x509 proxy file is created in /tmp. Over time these files fill up /tmp.

Starting from the EMI-1 CREAM release, which uses glexec-0.8, the problem is avoided by setting

  create_target_proxy = no

in the glexec.conf file.

In gLite CREAM releases it is worked around by putting

 <parameter name="glexec_probe_cmd" value="/opt/glite/bin/glexecprobe" />
 <parameter name="methods" value="JobRegister, putProxy" />

into the section

 <authzchain name="chain-1">
   <plugin name="localuserpip"

of /opt/glite/etc/glite-ce-cream/cream-config.xml. This is done automatically by YAIM but not by Quattor.
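On a node that is already affected, the leftover files can be listed before deciding what to clean up. This is a rough sketch: list_stale_glexec_proxies is a hypothetical helper, the name pattern is an assumption derived from the file names mentioned above, and the age threshold is up to the administrator.

```shell
# List proxy files left behind by glexec that are older than DAYS days.
# Usage: list_stale_glexec_proxies DIR DAYS
# The 'x509up*glexec*' pattern is an assumption; adjust it to the
# file names actually seen on the node.
list_stale_glexec_proxies() {
    dir=$1
    days=$2
    find "$dir" -maxdepth 1 -type f -name 'x509up*glexec*' -mtime +"$days"
}

# Example: show files in /tmp older than 7 days.
# list_stale_glexec_proxies /tmp 7
```

The sketch only lists files; whether proxies of that age are safe to remove depends on the longest job lifetime at the site.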

GGUS ticket [[1]]

VOMS

VOMS server fails with high number of VOs

The VOMS server of gLite 3.2 is more memory-hungry and starts failing when configured to serve more than approximately 10 VOs.

Set the -XX:MaxPermSize parameter of CATALINA_OPTS to at least 512m in /etc/tomcat5/tomcat5.conf

 CATALINA_OPTS="-Xmx1508M -server -Dsun.net.client.defaultReadTimeout=240000 -XX:MaxPermSize=512m"

and add

 * soft nofile 2048
 * hard nofile 2048

into /etc/security/limits.conf.
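The CATALINA_OPTS setting can be verified with a short check. This is a sketch under assumptions: max_perm_size_mb and perm_size_ok are hypothetical helper names, and the value is expected to carry an "m" suffix as in the CATALINA_OPTS line above.

```shell
# Print the MaxPermSize value, in megabytes, from a Tomcat configuration file.
# Comment lines are ignored; the last occurrence wins.
# Assumes the value is written with an "m" suffix.
max_perm_size_mb() {
    grep -v '^[[:space:]]*#' "$1" \
        | grep -o 'XX:MaxPermSize=[0-9][0-9]*[mM]' \
        | tail -n 1 \
        | tr -dc '0-9'
}

# Succeed only if MaxPermSize is present and at least 512 MB.
perm_size_ok() {
    size=$(max_perm_size_mb "$1")
    [ -n "$size" ] && [ "$size" -ge 512 ]
}

# Example:
# perm_size_ok /etc/tomcat5/tomcat5.conf || echo "MaxPermSize too low or not set"
```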

GGUS ticket #72136

WMS

Job submission breaks when VOMS server certificate is renewed

When a VOMS server certificate is renewed with the same DN, and both the old and the new certificate of this VOMS server are left in /etc/grid-security/vomsdir on the WMS machine, a Gridsite bug is exposed and jobs are refused with an ambiguous error:

   Unable to delegate the credential to the endpoint: https://wms.your.domain.eu:7443/glite_wms_wmproxy_server

The workaround is to manually remove the old VOMS server certificate from /etc/grid-security/vomsdir (it is no longer used anyway).
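Certificates that share a subject DN can be spotted with a small script. This is a sketch only: list_duplicate_subjects is a hypothetical helper, and it assumes the certificates are PEM files with a .pem extension stored directly in the given directory.

```shell
# Print every subject DN that appears in more than one certificate in DIR,
# followed by the files carrying it.
# Usage: list_duplicate_subjects DIR
# Assumes PEM certificates named *.pem directly under DIR.
list_duplicate_subjects() {
    dir=$1
    for cert in "$dir"/*.pem; do
        [ -f "$cert" ] || continue
        # Skip files that openssl cannot parse as a certificate
        subject=$(openssl x509 -in "$cert" -noout -subject 2>/dev/null) || continue
        printf '%s\t%s\n' "$subject" "$cert"
    done | sort | awk -F '\t' '
        { count[$1]++; files[$1] = files[$1] "\n    " $2 }
        END { for (s in count) if (count[s] > 1) print s files[s] }'
}

# Example:
# list_duplicate_subjects /etc/grid-security/vomsdir
```

For each reported pair, openssl x509 -in FILE -noout -enddate shows the expiry date, which helps to tell the old certificate from the new one.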

GGUS ticket [#77256]

Storage Element

BDII does not start at SE node

The BDII daemon does not start correctly on the SE node. As a result, the service is not published to GOC, etc. The symptoms are messages such as:

  # service ldap restart
  Stopping slapd: [ OK ]
  Checking configuration files for slapd: bdb_db_open: Warning - No
  DB_CONFIG file found in directory /var/lib/ldap: (2)
  Expect poor performance for suffix dc=my-domain,dc=com.
  config file testing succeeded
  [ OK ]
  Starting slapd: [ OK ]

The problem is caused by setting the BDII_USER variable in site-info.def, which leads to incorrect permissions on some files that slapd uses. This variable should not be set on SE nodes; it is intended for the BDII node only.
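A quick way to check whether the variable is set is to look for an assignment in site-info.def. This is a minimal sketch: bdii_user_is_set is a hypothetical helper, and the file location in the usage comment is a placeholder.

```shell
# Succeed if BDII_USER is assigned on a non-comment line of the given file.
bdii_user_is_set() {
    grep -Eq '^[[:space:]]*(export[[:space:]]+)?BDII_USER=' "$1"
}

# Example (placeholder path):
# bdii_user_is_set /path/to/site-info.def && echo "BDII_USER is set; remove it on SE nodes"
```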

GGUS ticket [#73086]

Globus

Lack of support for PKCS#8

Globus does not support the PKCS#8 format of private keys. However, this is the default format in OpenSSL 1.x. Therefore, key-certificate pairs generated by OpenSSL 1.x in the default way are not directly usable with services based on Globus (including many former gLite ones), yielding errors like

   globus_gsi_gssapi: Unable to read credential for import
   globus_gsi_gssapi: Error with GSI credential
   globus_credential: Error reading proxy credential: Unhandled PEM sequence: PRIVATE KEY


The problem can be worked around by converting the PKCS#8 key to the traditional RSA format:

   openssl pkcs8 -in key.pk8 -out key-temp.pem
   openssl rsa -in key-temp.pem -out key.pem
   rm key-temp.pem
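Whether a key needs converting can be judged from its PEM header: PKCS#8 keys start with "BEGIN PRIVATE KEY" (or "BEGIN ENCRYPTED PRIVATE KEY"), while traditional RSA keys start with "BEGIN RSA PRIVATE KEY". The following is a minimal sketch; key_format is a hypothetical helper name.

```shell
# Report the format of a PEM private key, judging by its header line.
key_format() {
    if grep -q -e '-----BEGIN RSA PRIVATE KEY-----' "$1"; then
        echo "traditional RSA"
    elif grep -q -e '-----BEGIN ENCRYPTED PRIVATE KEY-----' "$1"; then
        echo "PKCS#8 (encrypted)"
    elif grep -q -e '-----BEGIN PRIVATE KEY-----' "$1"; then
        echo "PKCS#8 (unencrypted)"
    else
        echo "unknown"
    fi
}

# Example:
# key_format key.pem
```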

GGUS ticket [#77148]