EGI QC Specific

Revision as of 09:00, 18 September 2013 by Enolfc

Job Execution Appliances

This category covers Computing Element products (CREAM, ARC-CE, QCG-COMP, ...).

Interaction with the batch system

Job Execution Appliances must be able to perform basic job management operations in a batch system:

  • create new jobs,
  • retrieve the status of the jobs submitted by the appliance,
  • cancel jobs, and
  • (optionally) hold and resume jobs

The Appliance may perform these operations on individual jobs or on sets of jobs in order to improve its performance (e.g. when retrieving status, issue a single query for all jobs submitted by the Appliance instead of querying each job individually).
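
The difference between per-job and bulk status retrieval can be illustrated with a small in-memory model. This is only a sketch: `InMemoryBatchSystem`, its methods and the state names are hypothetical and do not correspond to any real CE or batch system API. The counter simply records how many round-trips to the batch system each approach needs.

```python
class InMemoryBatchSystem:
    """Hypothetical stand-in for a batch system; it only tracks job states."""

    def __init__(self):
        self._jobs = {}      # job id -> {"owner": ..., "state": ...}
        self._next_id = 1
        self.queries = 0     # number of status round-trips made so far

    def submit(self, owner):
        """Create a new job and return its identifier."""
        job_id = self._next_id
        self._next_id += 1
        self._jobs[job_id] = {"owner": owner, "state": "QUEUED"}
        return job_id

    def cancel(self, job_id):
        self._jobs[job_id]["state"] = "CANCELLED"

    def hold(self, job_id):
        self._jobs[job_id]["state"] = "HELD"

    def resume(self, job_id):
        self._jobs[job_id]["state"] = "QUEUED"

    def status(self, job_id):
        """One round-trip per job."""
        self.queries += 1
        return self._jobs[job_id]["state"]

    def status_all(self, owner):
        """One round-trip for every job submitted by `owner`."""
        self.queries += 1
        return {jid: job["state"]
                for jid, job in self._jobs.items()
                if job["owner"] == owner}


batch = InMemoryBatchSystem()
job_ids = [batch.submit("appliance") for _ in range(50)]
batch.submit("someone-else")          # must not show up in the appliance's view
batch.cancel(job_ids[0])

# Per-job polling: one query for each job.
per_job = {jid: batch.status(jid) for jid in job_ids}
per_job_queries = batch.queries

# Bulk polling: a single query for all jobs of the appliance.
batch.queries = 0
bulk = batch.status_all("appliance")
bulk_queries = batch.queries

print(per_job_queries, bulk_queries)
```

Both approaches return the same information; the bulk variant just reduces the load on the batch system as the number of jobs grows.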

Verification must be performed for at least one of the following batch systems:

  • Torque/PBS
  • SGE/OGE
  • SLURM
  • LSF
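
When verifying by hand, it helps to know which native client command corresponds to each operation. The table below is an assumption added for convenience and is not part of the original criteria: it lists the usual command names of each batch system, without options, and the exact behaviour and output format vary between versions and site configurations. The helper `checklist` is a hypothetical name.

```python
# Usual client commands per batch system (names only, no options).
NATIVE_COMMANDS = {
    "Torque/PBS": {"create": "qsub", "status": "qstat", "cancel": "qdel",
                   "hold": "qhold", "resume": "qrls"},
    "SGE/OGE":    {"create": "qsub", "status": "qstat", "cancel": "qdel",
                   "hold": "qhold", "resume": "qrls"},
    "SLURM":      {"create": "sbatch", "status": "squeue", "cancel": "scancel",
                   "hold": "scontrol hold", "resume": "scontrol release"},
    "LSF":        {"create": "bsub", "status": "bjobs", "cancel": "bkill",
                   "hold": "bstop", "resume": "bresume"},
}

REQUIRED = ("create", "status", "cancel")
OPTIONAL = ("hold", "resume")


def checklist(system):
    """Return one line per operation to verify on the given batch system."""
    commands = NATIVE_COMMANDS[system]
    lines = []
    for operation in REQUIRED + OPTIONAL:
        kind = "required" if operation in REQUIRED else "optional"
        lines.append(f"{operation} ({kind}): compare with `{commands[operation]}`")
    return lines


for line in checklist("SLURM"):
    print(line)
```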

How to test

  • Submit simple jobs (e.g. sleep for a couple of minutes) to the Job Execution Appliance and check that:
      • the jobs are correctly executed in the batch system,
      • the status of each job is retrieved correctly and in a timely manner (i.e. the status need not be updated in real time, but it should be available within a short period of time),
      • cancelling a job in the Appliance removes it from the batch system.
  • Submit jobs with some input/output files and ensure that the files are correctly transferred.

Sample jobs for some CEs are available at https://github.com/enolfc/egi-qc/tree/master/tests/jobexecution
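
The "timely manner" check can be automated with a polling loop that gives up after a deadline. The following is a minimal sketch, assuming the tester supplies a `get_status` callable that asks the Appliance for the job state; the function name, the state names and the timeout values are hypothetical. The clock and sleep functions are injectable so that the loop can be exercised without waiting.

```python
import time


def wait_for_status(get_status, expected, timeout=300.0, interval=10.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until it returns a state in `expected` or until
    `timeout` seconds have passed.

    Returns (reached, history), where history lists the distinct
    consecutive states that were observed.
    """
    deadline = clock() + timeout
    history = []
    while True:
        state = get_status()
        if not history or history[-1] != state:
            history.append(state)
        if state in expected:
            return True, history
        if clock() >= deadline:
            return False, history
        sleep(interval)


class FakeClock:
    """Simulated clock so the demonstration below does not really sleep."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


# Simulated job that finishes on the fifth poll.
states = iter(["QUEUED", "QUEUED", "RUNNING", "RUNNING", "DONE"])
clock = FakeClock()
ok, seen = wait_for_status(lambda: next(states), {"DONE"},
                           timeout=60, interval=10,
                           clock=clock.time, sleep=clock.sleep)

# Simulated job that never leaves the queue: the check fails at the deadline.
stuck_clock = FakeClock()
stuck_ok, stuck_seen = wait_for_status(lambda: "QUEUED", {"DONE"},
                                       timeout=30, interval=10,
                                       clock=stuck_clock.time,
                                       sleep=stuck_clock.sleep)
print(ok, seen, stuck_ok, stuck_seen)
```

The same loop can be used after cancelling a job, with the expected set containing whatever state the Appliance reports for cancelled jobs.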

Multi-node/multi-core jobs

Job Execution Appliances should support multi-node/multi-core jobs. Different support modes are considered:

  • multi-slot request: the job specifies the number of slots, which will be allocated following a default policy defined by the site (e.g. filling up machines, using free slots of any machine, etc.)
  • single-machine multi-core request: the job specifies the number of required slots, which get allocated within a single machine.
  • multi-node multi-core request: the job can specify the number of cores and the number of hosts to use (e.g. 4 cores on 2 different hosts)
  • exclusive request: the job request specifies the hosts to be used exclusively.
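
As an illustration of what these modes may look like once they reach the batch system, the sketch below builds `sbatch` options for SLURM. It is an assumption, not the behaviour of any particular CE: each Appliance has its own job description language and its own translation to the batch system, and the function name and mode labels are hypothetical. The example also assumes that the core count in the multi-node mode is per host, and that an exclusive request means that no other job may share the allocated hosts.

```python
def sbatch_options(mode, slots=1, nodes=1, cores_per_node=1):
    """Return one possible list of sbatch options for a request mode."""
    if mode == "multi-slot":
        # Only the total is fixed; placement follows the site policy.
        return [f"--ntasks={slots}"]
    if mode == "single-machine":
        # All slots must be on the same machine.
        return ["--nodes=1", f"--ntasks={slots}"]
    if mode == "multi-node":
        # Fixed number of hosts and of cores on each host.
        return [f"--nodes={nodes}", f"--ntasks-per-node={cores_per_node}"]
    if mode == "exclusive":
        # The allocated hosts are not shared with other jobs.
        return [f"--nodes={nodes}", "--exclusive"]
    raise ValueError(f"unknown mode: {mode}")


# The "4 cores on 2 different hosts" example, read as 4 cores per host.
print(sbatch_options("multi-node", nodes=2, cores_per_node=4))
```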

How to test

Submit jobs that exercise each of the modes listed above and check in the batch system that the allocated slots match what was requested.

Sample jobs for some CEs are available at https://github.com/enolfc/egi-qc/tree/master/tests/jobexecution
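
Comparing the allocation with the request can be reduced to counting slots per host. The sketch below assumes that the tester has obtained a list with one hostname per allocated slot (for example, Torque writes such a list to the file named by `$PBS_NODEFILE` inside the job). The function name, its parameters and the hostnames are hypothetical.

```python
from collections import Counter


def check_allocation(slot_hosts, slots=None, hosts=None, cores_per_host=None):
    """Compare an allocation with the request.

    slot_hosts: list with one hostname per allocated slot.
    Returns a list of problems; an empty list means the allocation matches.
    """
    per_host = Counter(slot_hosts)
    problems = []
    if slots is not None and len(slot_hosts) != slots:
        problems.append(f"expected {slots} slots, got {len(slot_hosts)}")
    if hosts is not None and len(per_host) != hosts:
        problems.append(f"expected {hosts} hosts, got {len(per_host)}")
    if cores_per_host is not None:
        for host, count in sorted(per_host.items()):
            if count != cores_per_host:
                problems.append(
                    f"expected {cores_per_host} slots on {host}, got {count}")
    return problems


# Multi-node multi-core request: 4 cores on each of 2 hosts.
good = check_allocation(["wn1"] * 4 + ["wn2"] * 4,
                        slots=8, hosts=2, cores_per_host=4)

# Single-machine request for 4 slots that was spread and under-allocated.
bad = check_allocation(["wn1", "wn1", "wn2"], slots=4, hosts=1)
print(good, bad)
```

For a plain multi-slot request only `slots` needs to be given, since the placement is left to the site policy.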

Parallel jobs

Storage Management Appliances

This category covers Storage Element products (DPM, dCache, StoRM, ARC-CE, ...).

VOMS

Job Scheduling

This category covers WMS and qcg-broker.

WMS

Interactive Job

Client Tools