
Jobs work directory and temporary directory

From EGIWiki

Revision as of 09:20, 8 March 2011



Problem description

Workdir

The workdir is the directory associated with a user's job. It is the directory where files are created when no path is specified, and against which relative paths are resolved into absolute paths. It is essentially the current directory (the path held in the $PWD environment variable) from which a process, here a job, is run. In many configurations a batch job runs from a workdir inside the Unix user's home directory. Worker nodes often import home directories from a shared file system. This can cause serious performance problems for the file servers, since many jobs then access a distributed file system without actually needing distributed data.
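The relationship between the workdir, $PWD and relative paths can be illustrated with a minimal shell sketch. The directory created with mktemp and the file name output.dat are purely illustrative and are not part of any batch system or JobWrapper.

```shell
# Illustrative only: a throwaway directory plays the role of the job workdir.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# A relative path is resolved against the current directory ($PWD)...
echo "result" > output.dat

# ...so the same file is reachable through the absolute path built from $PWD.
abs_path="$PWD/output.dat"
ls -l "$abs_path"
```

If the workdir sits on a shared file system, every such relative write ends up on the file server, which is the load problem described above.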

Temp dir

The temporary directory is used by jobs to create large temporary files. The Unix convention is to use the /tmp directory. However, in some configurations it would be better to point jobs to a file system better suited to large temporary files, either for performance reasons or to avoid clashes between jobs running on the same worker node.

Proposed solutions

Workdir

Workdir: Batch system configuration

The clean and correct way to address the problem is probably to configure the batch system properly, so that it sets the directory under which job working directories are created. This solution is up to the site managers who install and configure the batch system, provided the LRMS allows this setting to be tuned. It may also be worthwhile to specify a different workdir for parallel jobs, since parallel jobs may need to run from a distributed file system.

Workdir: JobWrapper customization points

If for some reason the previous solution is not possible, and the directory assigned by the batch system (by default usually the home directory of the local account) is not suitable, a possible solution is to use the customization points in the JobWrapper.

A user job (the one specified as 'executable' in the job JDL) is "included" in a JobWrapper which, besides running the user payload, is also responsible for other operations (input/output sandbox management, LB logging, etc.). This JobWrapper can be created by the WMS, for jobs submitted to LCG-CEs through the WMS, or by CREAM, for all job submission paths (direct submissions, submissions through the WMS, submissions through Condor).

In the current implementations of the JobWrapper (for both LCG-CE and CREAM-CE) there is no explicit "cd" operation before the creation of the job working directory. This means that the job working directory is created under the directory "assigned" by the batch system (which is, as said above, the home directory of the local account mapped to that Grid user).

As described in [R2] and [R3], customization points are scripts (to be provided by the local administrator) which are run by the JobWrapper (both the WMS JobWrapper and the CREAM JobWrapper). There are customization point scripts run before and after the job execution. In particular, the first customization point [R1]:

 ${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1.sh , is executed at the beginning of the JobWrapper execution, 
 before the creation of the job working directory.

Therefore this customization point, which must be created on each WN, could be used to "cd" to the desired site-specific directory.
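A possible content for cp_1.sh is sketched below. This is only a sketch under stated assumptions: it assumes the JobWrapper sources the script (a "cd" in a separately executed script would not affect the JobWrapper's own current directory), and the variable SITE_JOB_BASEDIR and the function enter_site_workdir are hypothetical names chosen for illustration, not part of gLite.

```shell
# Hypothetical cp_1.sh sketch. Assumes the JobWrapper sources this file,
# so that the "cd" below changes the JobWrapper's current directory.

# enter_site_workdir is a hypothetical helper: it changes to the given
# directory only if that directory exists and is writable.
enter_site_workdir() {
    _site_base="$1"
    if [ -d "$_site_base" ] && [ -w "$_site_base" ]; then
        cd "$_site_base" || return 1
    else
        echo "cp_1.sh: $_site_base is not usable, staying in $PWD" >&2
        return 1
    fi
}

# SITE_JOB_BASEDIR is a hypothetical, site-defined variable; /tmp is only
# a placeholder default. On failure the job keeps the directory assigned
# by the batch system.
enter_site_workdir "${SITE_JOB_BASEDIR:-/tmp}" || true
```

Falling back to the batch-system directory, rather than aborting, keeps a misconfigured worker node from failing every job that lands on it; a site may prefer the opposite policy.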

Tempdir

Standard environment variable

Even if a temporary directory on the same file system as the working directory is a good option for most applications, it would be better to keep the two definitions separate. While a home directory on a shared file system might be a good option for parallel jobs, there is no reason to use that file system for large temporary files, which belong to a single job and are used only by that job.

A variable specifying the proper path for (largish) temporary files could be set in the job environment before execution and used in the job code to create those files. The name of the variable should be standard across all batch systems, so that it is part of the standard environment every job can expect on the worker node.

Currently, defining a temporary directory distinct from the workdir is not a problem that has been extensively addressed. The name of this variable and how to set it (from the batch system, in the JobWrapper…) are still open issues.
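How job code might consume such a variable can be sketched as follows. Since the variable name is still an open issue, the sketch assumes TMPDIR, the conventional Unix name, purely for illustration; the file name stage1.dat is likewise hypothetical.

```shell
# Assumption: the site exports the scratch path in TMPDIR; the actual
# variable name is not defined by this page. Fall back to /tmp if unset.
scratch_base="${TMPDIR:-/tmp}"

# Create a private, uniquely named directory for this job, which avoids
# clashes with other jobs running on the same worker node.
job_tmp=$(mktemp -d "${scratch_base%/}/job.XXXXXX") || exit 1

# Remove the temporary files when the job script exits.
trap 'rm -rf "$job_tmp"' EXIT

# Large intermediate files go under $job_tmp instead of the workdir.
echo "intermediate data" > "$job_tmp/stage1.dat"
```

Because the job reads only the variable, the site can move the scratch area to a local disk without any change to the job code.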