Tools/Manuals/TS53

From EGIWiki

BrokerHelper: no compatible resources

Full message

*************************************************************
BOOKKEEPING INFORMATION: 

Status info for the Job : https://gswms01.cern.ch:9000/ZUYZjkKFx8AfK6qq34E2Qw
Current Status:     Waiting
Status Reason:      BrokerHelper: no compatible resources
Submitted:          Sun Mar 27 23:16:52 2011 CEST
*************************************************************

Eventually, once the request has expired:

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://gswms01.cern.ch:9000/ZUYZjkKFx8AfK6qq34E2Qw
Current Status:     Aborted 
Status Reason:      request expired
Submitted:          Sun Mar 27 23:16:52 2011 CEST
*************************************************************

Diagnosis

The job's JDL file contained requirements that could not be satisfied by any CE listed in the BDII used by the WMS. There can be various causes:

  • mistake in the JDL
  • the target CE/site is unavailable
    • it does not appear in the info system
    • it does not have status Production
      • note: CREAM will automatically set the state to Draining when it considers the CE "load" too high (check the CREAM docs for details)
    • its AccessControlBaseRule does not match your VO/FQAN
      • (in the past the necessary rule could be removed by the "FCR" filter of the BDII used by the WMS, but this mechanism should no longer be in use as of late 2013)
    • its current value of TotalJobs equals or exceeds its value for MaxTotalJobs
  • the BDII used by the WMS has some problem (e.g. too slow)
  • the WMS parameters for querying the BDII are suboptimal (e.g. timeout too short)
  • the WMS holds a stale image of the information system, a problem that occurs occasionally on some WMS instances
    • cure: restart the workload manager with /etc/init.d/glite-wms-wm restart
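
The CE-side causes listed above can be sketched as a small check. This is only an illustration: the dictionary layout and the function name ce_rejection_reasons are hypothetical, the keys simply reuse the attribute names mentioned in the list, and the real WMS matchmaking evaluates the full JDL Requirements expression against the information system rather than these four conditions alone.

```python
def ce_rejection_reasons(ce, user_rules):
    """Return the reasons why a CE would not match, or [] if none apply.

    ce         -- hypothetical dict describing one CE, or None if the CE
                  is not published in the information system
    user_rules -- access rules the user's proxy can satisfy,
                  e.g. ["VO:dteam"] (illustrative values)
    """
    if ce is None:
        return ["CE does not appear in the information system"]

    reasons = []
    if ce.get("Status") != "Production":
        reasons.append("CE status is %r, not 'Production'" % ce.get("Status"))

    # At least one published rule must match the user's VO/FQAN.
    published = set(ce.get("AccessControlBaseRule", []))
    if not published & set(user_rules):
        reasons.append("AccessControlBaseRule does not match the VO/FQAN")

    # A CE that is already full is not considered compatible.
    if ce.get("TotalJobs", 0) >= ce.get("MaxTotalJobs", 0):
        reasons.append("TotalJobs equals or exceeds MaxTotalJobs")

    return reasons


if __name__ == "__main__":
    full_ce = {
        "Status": "Draining",
        "AccessControlBaseRule": ["VO:atlas"],
        "TotalJobs": 500,
        "MaxTotalJobs": 500,
    }
    for reason in ce_rejection_reasons(full_ce, ["VO:dteam"]):
        print(reason)
```

Going through the conditions one at a time like this for the target CE, using the values actually published in the BDII, usually narrows down which cause applies.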

The WMS retries the matchmaking after a configurable delay (600 s by default) until a configurable grace period (2 h by default) has elapsed. If the job still does not match any CE at that point, the request is considered expired and the job is aborted.
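
As a rough illustration of the timing, the sketch below lists when matchmaking attempts would happen with the default values. It assumes the first attempt happens at submission time and that no attempt is made once the grace period has ended; the function name and this exact boundary behaviour are assumptions, not taken from the WMS implementation.

```python
def matchmaking_attempts(retry_delay_s=600, grace_period_s=2 * 3600):
    """Return the attempt times in seconds after submission.

    Assumes one attempt at t=0 and one every retry_delay_s seconds,
    stopping before the grace period ends.
    """
    return list(range(0, grace_period_s, retry_delay_s))


if __name__ == "__main__":
    attempts = matchmaking_attempts()
    print(len(attempts), "attempts, the last one at", attempts[-1], "s")
```

Under these assumptions a job that never matches is retried about a dozen times before it is aborted with "request expired".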

It is usually advisable to avoid large values for the grace period, so that users learn within a reasonable time which jobs are affected by the problem. The user can shorten the grace period through the "--to" and "--valid" options of glite-wms-job-submit.
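
A minimal sketch of building such a submission command follows. The helper name submit_command is hypothetical. The sketch also assumes that "--valid" takes a relative validity in hh:mm form and that "-a" requests automatic proxy delegation; both assumptions should be checked against the help output of the installed glite-wms-job-submit before use.

```python
from datetime import timedelta


def submit_command(jdl_path, valid_for):
    """Build an argument list for glite-wms-job-submit with a short grace period.

    valid_for -- timedelta after which an unmatched request should expire
                 (assumed to be passed to --valid as hh:mm)
    """
    total_minutes = int(valid_for.total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return [
        "glite-wms-job-submit",
        "-a",                                   # assumed: automatic delegation
        "--valid", "%02d:%02d" % (hours, minutes),
        jdl_path,
    ]


if __name__ == "__main__":
    # Give up after 30 minutes instead of waiting for the default grace period.
    print(" ".join(submit_command("job.jdl", timedelta(minutes=30))))
```

The resulting list can be passed to subprocess.run on a user interface node with a valid proxy.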