Tools/Manuals/TS63



submit-helper script ... gave error: cache_export_dir ...

Full message

A command like

globus-job-run my-CE/jobmanager-lcgpbs -q ops /bin/hostname

returns an error like:

submit-helper script running on host lxb1761 gave error: cache_export_dir
(/home/dteam002/.lcgjm/globus-cache-export.Of5sOd) on gatekeeper did not
contain a cache_export_dir.tar archive

Diagnosis

The WN cannot run globus-url-copy back to its CE, which the "lcg" jobmanagers require. There are several possible causes; see the Solution section for details. To test globus-url-copy from the WN, an administrator can follow this example:

  • On the UI, run voms-proxy-init
  • Copy the proxy to the WN:
scp /tmp/x509up_u`id -u` root@my-WN:/tmp/test_proxy
  • On the WN as root:
chown dteam050 /tmp/test_proxy
su - dteam050
  • On the WN as "dteam050":
export X509_USER_PROXY=/tmp/test_proxy
globus-url-copy file:/etc/group gsiftp://my-CE/tmp/test.$$

If the copy fails, rerun globus-url-copy with the -dbg option to get more details.
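
The test and the -dbg rerun can be combined in one small helper. The following is only a sketch: the function name copy_test and the GUC variable are illustrative and not part of any Globus tool, and it assumes X509_USER_PROXY already points at a valid proxy, as in the steps above.

```shell
# Sketch: try the copy once; on failure, repeat it with -dbg for details.
# GUC and copy_test are illustrative names; GUC lets the command be overridden.
GUC="${GUC:-globus-url-copy}"

copy_test() {
    # $1 is the CE host name, for example the placeholder my-CE used above
    src="file:/etc/group"
    dst="gsiftp://$1/tmp/test.$$"
    if "$GUC" "$src" "$dst"; then
        echo "copy to $1 succeeded"
    else
        echo "copy to $1 failed; retrying with -dbg" >&2
        "$GUC" -dbg "$src" "$dst"
    fi
}
```

On the WN as the test user, it would be called as: copy_test my-CE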

Solution

The error message from globus-url-copy will usually explain the problem.

Possible causes include:

  • Some CRLs on the WN or CE are out of date. Run the CRL update cron job manually and check its output for errors.
  • Other CA files in $X509_CERT_DIR (by default /etc/grid-security/certificates) on the WN or CE are absent or have expired. Check if all of the latest CA rpms have been installed.
  • The gridftp daemon is not running on the CE.
  • CE and WN are not time-synchronized. Even a difference of less than 1 minute can cause a problem.
  • The gatekeeper and the gridftpd on the CE do not map the DN to the same local account (this should never happen on an LCG-CE). Check this as follows:
globus-job-run my-CE /usr/bin/id
globus-url-copy file:/etc/group gsiftp://my-CE/tmp/test.$$
globus-job-run my-CE /bin/ls -l /tmp/test.$$
  • On the WN, a script in /etc/profile.d (or a similar location) sets X509_USER_PROXY unconditionally. It must be set (to /tmp/x509up_u`id -u`) only if it has not been defined already, because a job keeps its proxy in a temporary file somewhere else.
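
For the last cause, a conditional assignment in the profile script could look like the sketch below. The function name set_default_proxy is hypothetical, and the exact file in which such a snippet lives depends on the site's setup; the point is only that an existing value of X509_USER_PROXY is left untouched.

```shell
# Sketch for a script under /etc/profile.d (file and function names are hypothetical).
# Assign the default proxy path only when X509_USER_PROXY is not defined at all,
# so that a job's own proxy location is preserved.
set_default_proxy() {
    if [ -z "${X509_USER_PROXY+set}" ]; then
        X509_USER_PROXY="/tmp/x509up_u$(id -u)"
        export X509_USER_PROXY
    fi
}

set_default_proxy
```

With this guard, an interactive login still gets the default /tmp/x509up_u`id -u` path, while a job that already has X509_USER_PROXY pointing at its temporary proxy file keeps that value.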