Tools/Manuals/SiteProblemsFollowUp
Troubleshooting Guide about Operational Errors on EGI Sites
Authentication
Problems with the host certificate (for example, an expired certificate) or with authentication.
- TS01: 7 authentication failed
- TS02: 530 530 LCMAPS credential mapping NOT successful
- TS03: 530 530 No local mapping for Globus ID
- TS04: 530-Login incorrect
- TS05: Proxy expired
- TS06: 501 501-FTPD GSSAPI error: GSS Major Status: General failure
- TS07: 535 535-FTPD GSSAPI error: GSS Major Status: General failure
- TS08: Invalid CRL: The available CRL has expired
- TS09: Certificate proxy not yet valid
- TS10: sslv3 alert bad certificate
- TS11: GRAM Authentication test failure:
- TS12: No valid credential found ... Bad magic number
- TS13: Generic verification error for VOMS (failure)!
- TS14: Host certificate update
- TS15: failed unwrapping ENC message
- TS16: failed unwrapping MIC message
- TS17: gss_unwrap: internal problem with SSL BIO
- TS18: no passphrase authentication failed
Information System
Problems with the Information System.
Generic documentation:
- Information System home page
- Information System Troubleshooting Guide
- Information System FAQ
- BDII reference card
- Old Information System home page
Specific items:
- TS101: Service absent in site BDII
- TS102: Site absent in top-level BDII
- TS103: Some objects missing in site or top-level BDII
- TS104: Missing SubCluster entries in a top BDII
- TS105: Unreliable gathering of CE Information
- TS106: Value of an attribute looks like MjAwNTAzMjIxNzAwMzRaIA==
- TS107: How to close the site so that it won't receive any more jobs from the RBs
- TS108: Software installation tags not published
- GStat2 TS109: Cores format is wrong
- GStat2 TS110: Benchmark format is wrong
- GStat2 TS111: GlueHostProcessorOtherDescription does not exist
- GStat2 TS112: GlueCEPolicyAssignedJobSlots has negative or null value
- GStat2 TS113: AccessControlBaseRule has an invalid format, ops ACBR has an invalid format
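
As an illustration for TS106: a value such as MjAwNTAzMjIxNzAwMzRaIA== is base64-encoded text. LDIF output encodes a value in base64 (and marks the attribute with a double colon) when, among other cases, the value ends with a space. The following is a minimal sketch, assuming the value was copied from LDIF output such as that of ldapsearch; the function names are hypothetical and are not part of any EGI tool.

```python
import base64
from datetime import datetime, timezone


def decode_ldif_value(encoded: str) -> str:
    """Decode a base64-encoded LDIF attribute value to text (hypothetical helper)."""
    return base64.b64decode(encoded).decode("utf-8")


def parse_generalized_time(value: str) -> datetime:
    """Parse a YYYYMMDDhhmmssZ timestamp, ignoring surrounding whitespace.

    Assumption: the decoded value is a UTC timestamp in this format.
    """
    stripped = value.strip()
    return datetime.strptime(stripped, "%Y%m%d%H%M%SZ").replace(tzinfo=timezone.utc)


raw = decode_ldif_value("MjAwNTAzMjIxNzAwMzRaIA==")
print(repr(raw))                                # '20050322170034Z '
print(parse_generalized_time(raw).isoformat())  # 2005-03-22T17:00:34+00:00
```

The decoded text carries a trailing space, which is consistent with the value having been base64-encoded in the first place; the publisher emitting the attribute would be the place to remove it.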
Workload Management
Explanations and recipes for dealing with problems observed for jobs submitted to CEs of various types via the gLite/EMI WMS or Condor-G, or directly to CREAM CEs.
Generic documentation:
- gLite job submission diagram
- gLite CREAM Troubleshooting
- EMI CREAM Troubleshooting
- LCG-CE configuration options and diagram
- Dialog between WMS and LCG-CE
Specific errors:
- TS50: 10 data transfer to the server failed
- TS51: Cannot read JobWrapper output...
- TS52: Cannot download .BrokerInfo
- TS53: BrokerHelper: no compatible resources
- TS54: request expired
- TS55: Jobs sent to my WMS stay in Waiting state forever
- TS56: Jobs sent to some CE stay in Ready state forever
- TS57: Jobs sent to some CE stay in Scheduled state forever
- TS58: Jobs sent to some CE stay in Running state forever
- TS59: 444444 waiting jobs
- TS60: ssh problem from WN to CE
- TS05: Proxy expired
- TS62: Globus error 3
- TS63: submit-helper script ... gave error: cache export dir ...
- TS64: 8 the user cancelled the job
- TS65: 43 the job manager failed to stage the executable
- TS66: Globus error 17: the job failed when the job manager attempted to run it
- TS67: Globus error 21: the job manager failed to locate an internal script argument file
- TS68: Globus error 22: the job manager failed to create an internal script argument file
- TS69: Globus error 24: the job manager detected an invalid script response
- TS70: Globus error 25: the job manager detected an invalid script status
- TS71: Globus error 79: connecting to the job manager failed.
- TS72: Globus error 94: the jobmanager does not accept any new requests (shutting down)
- TS73: Globus error 155: the job manager could not stage out a file
- TS74: Globus error 158: the job manager could not lock the state lock file
- TS75: Unspecified gridmanager error
- TS76: Job got an error while in the CondorG queue
- TS77: GRAM Job submission failed because the job manager failed to open stderr (error code 74)
- TS78: MPI. ssh: connect to host <hname> port 22: No route to host
- TS13: Generic verification error for VOMS (failure)!
- TS80: globus-job-run returns nothing
- TS81: Cannot take token!
- TS82: JS always fails with 'user proxy expired' message
- TS83: lcgpbs job manager cancels all jobs
- TS84: Lots of <defunct> processes from globus-gma
- TS85: Tracing a WMS job ID to the batch system job ID
- TS86: Other Globus job submission error messages
Data Management
Problems with failed Data Management commands.
- TS21: lcg cr: Invalid argument
- TS22: 425 425 Can't open data connection. timed out() failed.
- TS23: gridftp works only once within a minute or so
- TS24: LFC and DPM troubleshooting page
- TS12: No valid credential found ... Bad magic number
- TS26: No valid credential found ... System error
- TS27: Could not establish context
- TS28: Transport endpoint is not connected
- TS29: Unknown error ... Communication error on send