Tools/Manuals/SiteProblemsFollowUp
Troubleshooting Guide about Operational Errors on EGI Sites
Authentication
Problems with the host certificate (for example, an expired certificate) or with authentication in general.
- TS01: 7 authentication failed
- TS02: 530 530 LCMAPS credential mapping NOT successful
- TS03: 530 530 No local mapping for Globus ID
- TS04: 530-Login incorrect
- TS05: Proxy expired
- TS06: 501 501-FTPD GSSAPI error: GSS Major Status: General failure
- TS07: 535 535-FTPD GSSAPI error: GSS Major Status: General failure
- TS08: Invalid CRL: The available CRL has expired
- TS09: Certificate proxy not yet valid
- TS10: sslv3 alert bad certificate
- TS11: GRAM Authentication test failure:
- TS12: No valid credential found ... Bad magic number
- TS13: Generic verification error for VOMS (failure)!
- TS14: Host certificate update
- TS15: failed unwrapping ENC message
- TS16: failed unwrapping MIC message
- TS17: gss_unwrap: internal problem with SSL BIO
- TS18: no passphrase authentication failed
Workload Management
Explanations and recipes for dealing with problems observed for jobs submitted to CEs of various types via the gLite/EMI WMS or Condor-G, or directly to CREAM CEs.
Generic documentation:
- gLite job submission diagram
- gLite CREAM Troubleshooting
- EMI CREAM Troubleshooting
- LCG-CE configuration options and diagram
- Dialog between WMS and LCG-CE
Specific errors:
- TS50: 10 data transfer to the server failed
- TS51: Cannot read JobWrapper output...
- TS52: Cannot download .BrokerInfo
- TS53: BrokerHelper: no compatible resources
- TS54: request expired
- TS55: Jobs sent to my WMS stay in Waiting state forever
- TS56: Jobs sent to some CE stay in Ready state forever
- TS57: Jobs sent to some CE stay in Scheduled state forever
- TS58: Jobs sent to some CE stay in Running state forever
- TS59: 444444 waiting jobs
- TS60: ssh problem from WN to CE
- TS05: Proxy expired
- TS62: Globus error 3
- TS63: submit-helper script ... gave error: cache export dir ...
- TS64: 8 the user cancelled the job
- TS65: 43 the job manager failed to stage the executable
- TS66: Globus error 17: the job failed when the job manager attempted to run it
- TS67: Globus error 21: the job manager failed to locate an internal script argument file
- TS68: Globus error 22: the job manager failed to create an internal script argument file
- TS69: Globus error 24: the job manager detected an invalid script response
- TS70: Globus error 25: the job manager detected an invalid script status
- TS71: Globus error 79: connecting to the job manager failed.
- TS72: Globus error 94: the jobmanager does not accept any new requests (shutting down)
- TS73: Globus error 155: the job manager could not stage out a file
- TS74: Globus error 158: the job manager could not lock the state lock file
- TS75: Unspecified gridmanager error
- TS76: Job got an error while in the CondorG queue
- TS77: GRAM Job submission failed because the job manager failed to open stderr (error code 74)
- TS78: MPI. ssh: connect to host <hname> port 22: No route to host
- TS13: Generic verification error for VOMS (failure)!
- TS80: globus-job-run returns nothing
- TS81: Cannot take token!
- TS82: JS always fails with 'user proxy expired' message
- TS83: lcgpbs job manager cancels all jobs
- TS84: Lots of <defunct> processes from globus-gma
- TS85: Tracing a WMS job ID to the batch system job ID
- TS86: Other Globus job submission error messages
Data Management
Errors reported when a Data Management command fails.
- TS21: lcg cr: Invalid argument
- TS22: 425 425 Can't open data connection. timed out() failed.
- TS23: gridftp works only once within a minute or so
- TS24: LFC and DPM troubleshooting page
- TS12: No valid credential found ... Bad magic number
- TS26: No valid credential found ... System error
- TS27: Could not establish context
- TS28: Transport endpoint is not connected
- TS29: Unknown error ... Communication error on send