Tools/Manuals/TS83
lcgpbs job manager cancels all jobs
Full message
Example entries in /var/log/globus-gatekeeper.log:
Apr 20 21:07:42 ce05 gridinfo: [11413-12965] Submitted job 1208718397:lcgpbs:internal_2177507569:11193.1208718392 to batch system lcgpbs with ID 4151819.pbs01.pic.es
Apr 20 21:10:39 ce05 gridinfo: [11413-11413] Job 1208718397:lcgpbs:internal_2177507569:11193.1208718392 added to DEQUEUE list
Apr 20 21:10:39 ce05 gridinfo: [11413-19604] Job 1208718397:lcgpbs:internal_2177507569:11193.1208718392 (batch ID 4151819.pbs01.pic.es) REMOVED from batch system ok
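To find out how many jobs are affected, the gatekeeper log can be scanned for jobs that were submitted to lcgpbs and later removed from the batch system. The following is a minimal sketch in Python, not part of the job manager. The regular expressions are assumptions derived only from the sample entries above, and the function name find_removed_jobs is hypothetical. A match is only a candidate: the "REMOVED" entry alone does not say why the job was removed.

```python
import re

# Assumed patterns, modelled solely on the sample log entries shown above.
SUBMITTED = re.compile(r"Submitted job (\S+) to batch system lcgpbs with ID (\S+)")
REMOVED = re.compile(r"Job (\S+) \(batch ID (\S+)\) REMOVED from batch system")


def find_removed_jobs(lines):
    """Return (gram_job_id, batch_id) pairs for jobs that were submitted
    to lcgpbs and later reported as removed from the batch system."""
    submitted = {}  # GRAM job ID -> batch ID seen at submission time
    removed = []
    for line in lines:
        match = SUBMITTED.search(line)
        if match:
            submitted[match.group(1)] = match.group(2)
            continue
        match = REMOVED.search(line)
        if match and match.group(1) in submitted:
            removed.append((match.group(1), match.group(2)))
    return removed


# The three sample entries from this page, one per line.
SAMPLE = [
    "Apr 20 21:07:42 ce05 gridinfo: [11413-12965] Submitted job "
    "1208718397:lcgpbs:internal_2177507569:11193.1208718392 to batch system "
    "lcgpbs with ID 4151819.pbs01.pic.es",
    "Apr 20 21:10:39 ce05 gridinfo: [11413-11413] Job "
    "1208718397:lcgpbs:internal_2177507569:11193.1208718392 added to DEQUEUE list",
    "Apr 20 21:10:39 ce05 gridinfo: [11413-19604] Job "
    "1208718397:lcgpbs:internal_2177507569:11193.1208718392 "
    "(batch ID 4151819.pbs01.pic.es) REMOVED from batch system ok",
]

for gram_id, batch_id in find_removed_jobs(SAMPLE):
    print(batch_id, gram_id)
```

To run it against a real log, pass an open file object instead of SAMPLE, for example find_removed_jobs(open("/var/log/globus-gatekeeper.log")).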
Diagnosis
The /opt/globus/lib/perl/Globus/GRAM/JobManager/lcgpbs.pm code will cancel any job that is reported with 'W' status:
if(/Q|W|T/)
{
    if ($status_line eq "W")
    {
        $self->cancel();
        $state = Globus::GRAM::JobState::FAILED;
    }
    else
    {
        $state = Globus::GRAM::JobState::PENDING;
    }
}
The reason for this behavior is that jobs submitted by "lcgpbs" should
never end up in the 'W' state. A job in that state signals a configuration
problem: a WN failed to stage in files from the CE via "scp".
See ssh problem from WN to CE.