Tools/Manuals/TS62
Globus error 3
Full message
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://my-WMS:9000/SI6d3IRgrYQ65uNVhW3TDQ
Current Status: Done (Failed)
Exit code: 0
Status Reason: Got a job held event, reason: Globus error 3: an I/O operation failed
Destination: some-CE:2119/jobmanager-lcgpbs-long
reached on: Mon Jun 8 08:23:28 2009
*************************************************************
also GRAM errors
NORESOURCES
error 3
Diagnosis
Usually caused by a lack of memory on the CE to which the job was sent. For example, in /opt/globus/lib/perl/Globus/GRAM/Helper.pm (part of the "lcg" job managers), the queue_submit function calls l_check_memory to check the available memory using the values reported in /proc/meminfo:
my $freefrac = ($memfree+$swapfree)/($memtot+$swaptot);
return 1 if $freefrac < $MIN_MEM_FREE;
The job submission will fail if that ratio falls below $MIN_MEM_FREE, which is 0.2 (i.e. 20%) by default.
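To see where a CE currently stands relative to that threshold, the same ratio can be recomputed by hand. The following is a minimal Python sketch, not the code from Helper.pm: the function names (parse_meminfo, free_fraction, would_refuse) are made up for illustration, and it assumes the ratio is built from the MemFree, SwapFree, MemTotal and SwapTotal fields of /proc/meminfo, as the Perl fragment above suggests.

```python
MIN_MEM_FREE = 0.2  # default threshold quoted above (assumed unchanged on the CE)


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def free_fraction(fields):
    """(MemFree + SwapFree) / (MemTotal + SwapTotal), as in the Perl fragment."""
    free = fields["MemFree"] + fields["SwapFree"]
    total = fields["MemTotal"] + fields["SwapTotal"]
    return free / total


def would_refuse(fields, threshold=MIN_MEM_FREE):
    """True if the ratio is below the threshold, i.e. submission would fail."""
    return free_fraction(fields) < threshold


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live values on a Linux host."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

On the CE itself, would_refuse(read_meminfo()) gives a quick indication of whether the check is likely to be the cause; the job manager on a given site may use a different threshold.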
As pointed out by Rod Walker, Linux usually keeps $memfree small, so the numerator is typically determined by $swapfree. To avoid the ratio accidentally falling below the threshold, the swap space should be at least as large as the physical memory: in that case the ratio stays at or above 0.5 until the swap space starts being used (which is unlikely when the physical memory is very large).
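The sizing argument can be checked with a little arithmetic. The sketch below is illustrative only: the helper name idle_fraction and the example sizes (in GB) are assumptions, and it takes the pessimistic case where free physical memory is close to zero.

```python
def idle_fraction(mem_total, swap_total, mem_free=0.0, swap_used=0.0):
    """Free fraction as computed by the memory check, for hypothetical sizes.

    All arguments share one unit (for example GB). mem_free defaults to 0
    to model Linux keeping free physical memory small.
    """
    swap_free = swap_total - swap_used
    return (mem_free + swap_free) / (mem_total + swap_total)


# Swap equal to physical memory, swap untouched: ratio is 16 / 32.
equal_swap = idle_fraction(16, 16)

# Small swap relative to memory: ratio is 2 / 18, below the 0.2 default.
small_swap = idle_fraction(16, 2)

# With equal sizes, the 0.2 threshold is only reached once 9.6 of the
# 16 units of swap are in use: (16 - 9.6) / 32.
at_threshold = idle_fraction(16, 16, swap_used=9.6)
```

Under these assumptions a node with 16 GB of memory and only 2 GB of swap would fail the check even when idle, while the same node with 16 GB of swap has a wide margin.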
Error 3 can also occur due to the following causes:
- lack of disk space or quota
- a permission problem with the grid account home directory
- a hardware I/O error
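The first two of these causes can be checked quickly on the CE. The following Python sketch is one possible way to do so; the function name check_home and the 100 MB default are arbitrary choices for illustration. It does not inspect quotas or hardware errors, which need site-specific tools and the system logs, and it should be run as the grid account concerned, since the permission test applies to the user running it.

```python
import os
import shutil


def check_home(path, min_free_bytes=100 * 1024 * 1024):
    """Report basic disk-space and permission facts for a directory.

    min_free_bytes is an arbitrary illustrative limit, not a value used
    by the job manager.
    """
    report = {"path": path, "is_dir": os.path.isdir(path)}
    if not report["is_dir"]:
        return report  # nothing more can be checked

    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    report["free_bytes"] = usage.free
    report["low_space"] = usage.free < min_free_bytes

    # Creating files in a directory needs both write and search permission.
    report["writable"] = os.access(path, os.W_OK | os.X_OK)
    return report
```

For example, check_home(os.path.expanduser("~")) run under the mapped grid account shows whether its home directory exists, is writable, and has space left on its file system.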