
Difference between revisions of "Tools/Manuals/TS62"


Revision as of 14:43, 25 May 2011



Globus error 3

Full message

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://my-WMS:9000/SI6d3IRgrYQ65uNVhW3TDQ
Current Status:     Done (Failed)
Exit code:          0
Status Reason:      Got a job held event, reason: Globus error 3: an I/O operation failed
Destination:        some-CE:2119/jobmanager-lcgpbs-long
reached on:         Mon Jun  8 08:23:28 2009
*************************************************************

also GRAM errors

  • NORESOURCES
  • error 3

Diagnosis

This error is usually caused by a lack of memory on the CE to which the job was sent. For example, in /opt/globus/lib/perl/Globus/GRAM/Helper.pm (part of the "lcg" job managers), the queue_submit function calls l_check_memory, which checks the available memory using the values reported in /proc/meminfo:

   my $freefrac = ($memfree+$swapfree)/($memtot+$swaptot);

   return 1 if $freefrac < $MIN_MEM_FREE;

The job submission will fail if that ratio falls below $MIN_MEM_FREE, which is 0.2 (i.e. 20%) by default.
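
To see where a CE currently stands relative to that threshold, the same ratio can be recomputed by hand. The following is a minimal Python sketch, not the job manager's own code. It assumes that $memfree, $swapfree, $memtot and $swaptot correspond to the MemFree, SwapFree, MemTotal and SwapTotal fields of /proc/meminfo; the function names are hypothetical.

```python
MIN_MEM_FREE = 0.2  # default threshold quoted above


def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip lines that are not "Name: value [unit]"
        fields[name.strip()] = int(parts[0])
    return fields


def free_fraction(fields):
    """(MemFree + SwapFree) / (MemTotal + SwapTotal), mirroring the Perl excerpt.

    Assumption: these four fields are the ones the job manager reads.
    """
    free = fields["MemFree"] + fields["SwapFree"]
    total = fields["MemTotal"] + fields["SwapTotal"]
    return free / total


def submission_would_be_refused(fields, threshold=MIN_MEM_FREE):
    """True when the ratio is below the threshold, as in the Perl check."""
    return free_fraction(fields) < threshold
```

On the CE itself, passing the contents of /proc/meminfo to parse_meminfo and then to free_fraction gives the current value of the ratio.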

As pointed out by Rod Walker, Linux usually keeps $memfree small, so the numerator is typically dominated by $swapfree. To keep the ratio from accidentally falling below the threshold, the swap space should be at least as large as the physical memory. In that case the ratio will exceed 0.5 until the swap space starts being used, which is unlikely when the physical memory is very large.
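
The effect of the swap size can be illustrated with a small worked example. The numbers below are hypothetical (a node with 16 GB of physical memory, 0.5 GB of it free, and unused swap); only the formula is taken from the excerpt above.

```python
def free_fraction(memfree, swapfree, memtot, swaptot):
    """Same ratio as in the Perl excerpt: free memory plus free swap over the totals."""
    return (memfree + swapfree) / (memtot + swaptot)


GB = 1024 * 1024  # one gigabyte expressed in kB, the unit used by /proc/meminfo

# Hypothetical node with only 2 GB of swap: 2.5 / 18, roughly 0.14.
small_swap = free_fraction(0.5 * GB, 2 * GB, 16 * GB, 2 * GB)

# Same node with swap equal to physical memory: 16.5 / 32, roughly 0.52.
equal_swap = free_fraction(0.5 * GB, 16 * GB, 16 * GB, 16 * GB)

print(f"2 GB swap:  {small_swap:.2f}")
print(f"16 GB swap: {equal_swap:.2f}")
```

With the small swap the ratio is already below the 0.2 default even though the node is otherwise healthy, whereas with swap equal to the physical memory it stays above 0.5 as long as the swap is untouched.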

Error 3 can also occur due to the following causes:

  • lack of disk space or quota
  • a permission problem with the grid account home directory
  • a hardware I/O error
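
The first two causes in this list can be checked quickly on the CE. The sketch below is illustrative only: the function name and the 100 MB default are hypothetical, it should be run as the grid account in question against that account's home directory, and it does not detect per-user quota limits or hardware I/O errors.

```python
import os
import shutil


def check_directory(path, min_free_bytes=100 * 1024 * 1024):
    """Return a list of human-readable problems found for `path`.

    Checks only what the standard library can see: that the directory
    exists, that the current user can read, write and enter it, and that
    the filesystem holding it has at least `min_free_bytes` free.
    """
    if not os.path.isdir(path):
        return [f"{path} is not a directory"]
    problems = []
    if not os.access(path, os.R_OK | os.W_OK | os.X_OK):
        problems.append(f"{path} is not readable, writable and searchable by this user")
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(f"only {free} bytes free on the filesystem holding {path}")
    return problems
```

An empty list means none of these simple checks failed; quota usage and hardware errors still need to be checked with the site's own tools and the system logs.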