
Tools/Manuals/TS83

From EGIWiki
Revision as of 12:48, 23 November 2012 by Krakow (talk | contribs)


lcgpbs job manager cancels all jobs

Full message

Example entries in /var/log/globus-gatekeeper.log:

Apr 20 21:07:42 ce05 gridinfo: [11413-12965] Submitted job
 1208718397:lcgpbs:internal_2177507569:11193.1208718392
 to batch system lcgpbs with ID 4151819.pbs01.pic.es
Apr 20 21:10:39 ce05 gridinfo: [11413-11413] Job
 1208718397:lcgpbs:internal_2177507569:11193.1208718392
 added to DEQUEUE list
Apr 20 21:10:39 ce05 gridinfo: [11413-19604] Job
 1208718397:lcgpbs:internal_2177507569:11193.1208718392
 (batch ID 4151819.pbs01.pic.es) REMOVED from batch system ok
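The pattern above (a job submitted and then removed from the batch system a few minutes later) can be searched for in a larger log. The following is a minimal sketch, assuming the log entries follow the wording of the sample above; the function names and regular expressions are illustrative only and are not part of any Globus or LCG tool. Wrapped continuation lines (leading whitespace) are joined to the entry they belong to.

```python
import re
from datetime import datetime

# Patterns derived only from the sample entries above; real logs may differ.
SUBMITTED = re.compile(r"Submitted job\s+(\S+)\s+to batch system (\S+) with ID (\S+)")
REMOVED = re.compile(r"Job\s+(\S+)\s+\(batch ID (\S+)\) REMOVED from batch system")
TIMESTAMP = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})")


def join_wrapped(lines):
    """Join continuation lines (leading whitespace) onto the preceding entry."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if line[:1].isspace() and entries:
            entries[-1] += " " + line.strip()
        elif line.strip():
            entries.append(line.strip())
    return entries


def find_cancelled_jobs(lines):
    """Return (gram_job_id, batch_id, seconds) for jobs submitted and later removed."""
    submitted = {}
    cancelled = []
    for entry in join_wrapped(lines):
        ts_match = TIMESTAMP.match(entry)
        if not ts_match:
            continue
        # Syslog timestamps carry no year; a fixed leap year is prepended
        # because only the difference between two timestamps is used.
        ts = datetime.strptime("2000 " + ts_match.group(1), "%Y %b %d %H:%M:%S")
        m = SUBMITTED.search(entry)
        if m:
            submitted[m.group(1)] = ts
            continue
        m = REMOVED.search(entry)
        if m and m.group(1) in submitted:
            delta = (ts - submitted[m.group(1)]).total_seconds()
            cancelled.append((m.group(1), m.group(2), int(delta)))
    return cancelled


SAMPLE_LOG = """\
Apr 20 21:07:42 ce05 gridinfo: [11413-12965] Submitted job
 1208718397:lcgpbs:internal_2177507569:11193.1208718392
 to batch system lcgpbs with ID 4151819.pbs01.pic.es
Apr 20 21:10:39 ce05 gridinfo: [11413-11413] Job
 1208718397:lcgpbs:internal_2177507569:11193.1208718392
 added to DEQUEUE list
Apr 20 21:10:39 ce05 gridinfo: [11413-19604] Job
 1208718397:lcgpbs:internal_2177507569:11193.1208718392
 (batch ID 4151819.pbs01.pic.es) REMOVED from batch system ok
"""

if __name__ == "__main__":
    for job_id, batch_id, seconds in find_cancelled_jobs(SAMPLE_LOG.splitlines()):
        print(f"{batch_id} removed {seconds}s after submission ({job_id})")
```

To scan a real log, pass the lines of /var/log/globus-gatekeeper.log instead of SAMPLE_LOG. Many jobs removed shortly after submission would be consistent with the problem described on this page, but this sketch does not by itself show why a job was removed.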

Diagnosis

The code in /opt/globus/lib/perl/Globus/GRAM/JobManager/lcgpbs.pm cancels any job that the batch system reports with 'W' status and marks it as FAILED:

           if(/Q|W|T/)
           {
               if ($status_line eq "W")
               {
                   $self->cancel();
                   $state = Globus::GRAM::JobState::FAILED;
               }
               else
               {
                   $state = Globus::GRAM::JobState::PENDING;
               }
           }


Jobs submitted through "lcgpbs" should never end up in the 'W' (waiting) state, so the job manager treats that state as a failure. A job in 'W' signals a configuration problem: a worker node (WN) failed to stage in files from the computing element (CE) via "scp". See "ssh problem from WN to CE".
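To confirm the diagnosis, it can help to check whether the batch system currently lists any job in the 'W' state before the job manager cancels it. The sketch below assumes the default six-column Torque/PBS "qstat" listing (Job id, Name, User, Time Use, S, Queue); the function name and the sample listing are made up for illustration and should be adapted to the local qstat output.

```python
def jobs_in_state(qstat_text, wanted="W"):
    """Return the job IDs whose state column equals `wanted`.

    Assumes the state is the second-to-last column of each job row,
    as in the default qstat listing.
    """
    job_ids = []
    for line in qstat_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row and the dashed separator row.
        if len(fields) < 6 or fields[0].startswith("-") or fields[0] == "Job":
            continue
        if fields[-2] == wanted:
            job_ids.append(fields[0])
    return job_ids


# Illustrative listing only; user names and the second job are invented.
SAMPLE_QSTAT = """\
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
4151819.pbs01.pic.es      STDIN            user001                0 W short
4151820.pbs01.pic.es      STDIN            user002         00:01:12 R short
"""

if __name__ == "__main__":
    for job_id in jobs_in_state(SAMPLE_QSTAT):
        print(f"job {job_id} is in state W: check stage-in (scp) from the WN to the CE")
```

In practice the text would come from running qstat on the batch server and capturing its output. Any job reported here is a candidate for the stage-in failure described above.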