Workload Manager

From EGIWiki

Overview

EGI Workload Manager (also known as DIRAC4EGI) is a workload management service provided to the EGI community. It distributes users' computing tasks among the available resources, both HTC and cloud.

The service is a DIRAC instance running on EGI federated resources. It is coordinated by the EGI Foundation and operated by IN2P3 on resources provided by CYFRONET.

Tool name Workload Manager
Tool Category and description A workload management service used to distribute users' computing tasks among the available resources, both HTC and cloud.
Tool url dirac.egi.eu
Email dirac-support@mailman.egi.eu, dirac@mailman.egi.eu, dirac-admins@mailman.egi.eu (operation team)
GGUS Support unit DIRAC
GOC DB entry GRIDOPS-DIRAC4EGI
Requirements tracking - EGI tracker technical-support-cases, dirac4egi-eiscat3d-requirements, dirac4egi-mobrain-requirements
Issue tracking - Developers tracker https://github.com/DIRACGrid/DIRAC/issues
Release schedule
Release notes https://github.com/DIRACGrid/DIRAC/wiki#release-information
Roadmap
Related OLA https://documents.egi.eu/document/3254
Test instance url https://dirac.egi.eu/
Documentation http://dirac.readthedocs.io/en/latest/
License GNU General Public License v3.0
Provider IN2P3, CYFRONET

Source code https://github.com/DIRACGrid/DIRAC

Main features

Workload Manager provides a DIRAC-based Workload Management Service (WMS) for High Throughput Computing resources, which improves overall job throughput compared with native management of grid computing resources. Cloud computing resources are managed as well, in a way that is uniform and transparent for users.

The DIRAC data and job management systems have proven production scalability, with peaks of more than 100,000 concurrently running jobs for the LHCb experiment. This is far more than enough for the computing requirements of environmental science in the foreseeable future.

Target User Groups

The service suits established Virtual Organization communities, the long tail of users, SMEs and industry.

This service platform eases scientific computing by overlaying distributed computing resources in a manner that is transparent to the end user. For example, WeNMR, a structural biology community, uses DIRAC for a number of community services, and reported an improvement from the previous 70% to 99% with DIRAC job submission. The benefits of using this service include, but are not limited to:

Technical Service Architecture

DIRAC was originally developed to support the production activities of the LHCb experiment at CERN (~10 years ago). Today it is general-purpose software for Grid, Cloud and HPC resources, serving large scientific communities including LHCb, Belle II, EGI, CTA, GridPP, WeNMR, VIP, FranceGrilles, SKA, VIRGO, etc. DIRAC provides complete solutions for production management, handling large-scale distributed scientific data and optimising job execution.

The DIRAC framework offers standard rules for creating DIRAC extensions. A large part of the functionality is implemented as plugins, which makes it possible to customise DIRAC for a particular application with minimal effort. It provides multiple commands, usable in a Unix shell, that give access to all DIRAC functionality. It also provides a RESTful API suitable for use with application portals (e.g. the WS-PGRADE portal is interfaced with DIRAC this way), and a Python programming interface, which is the basic way to access all DIRAC facilities and to create new extensions. For details of all DIRAC services, see the DIRAC documentation.
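As an illustration of how a computing task can be described before it is handed to the command-line tools, the following minimal sketch renders a job description in the JDL (Job Description Language) text format that DIRAC accepts. The helper names `to_jdl_value` and `render_jdl` and the job name `hello_egi` are hypothetical and are not part of DIRAC. The sketch uses only the Python standard library, so it does not need a DIRAC client installation.

```python
def to_jdl_value(value):
    """Render a Python value as a JDL literal: number, list, or quoted string."""
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, (list, tuple)):
        # JDL lists are written in braces: {"a","b"}
        return "{" + ",".join(to_jdl_value(item) for item in value) + "}"
    return '"' + str(value) + '"'


def render_jdl(attributes):
    """Render a dict of job attributes as JDL text, one 'Key = value;' per line."""
    return "\n".join(
        f"{key} = {to_jdl_value(val)};" for key, val in attributes.items()
    )


# Hypothetical example job: run /bin/echo and retrieve its output files.
job = {
    "JobName": "hello_egi",
    "Executable": "/bin/echo",
    "Arguments": "Hello from the EGI Workload Manager",
    "StdOutput": "StdOut",
    "StdError": "StdErr",
    "OutputSandbox": ["StdOut", "StdErr"],
}

jdl_text = render_jdl(job)
print(jdl_text)
```

On a machine with a configured DIRAC client and a valid proxy, text like this could be saved to a file and passed to the `dirac-wms-job-submit` command. That step is not shown here, and the exact attributes a given Virtual Organization requires should be checked against the DIRAC documentation.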

The Workload Manager service (DIRAC4EGI) is a cluster of DIRAC services running on EGI resources (HTC, cloud, HPC) and supporting multiple VOs. The main service components include:

DIRAC Web Portal

All DIRAC services are at or above TRL8.

The modular organisation of the DIRAC components makes it possible to select a subset of the functionality suitable for a particular application, or to add missing functionality easily. This is very useful for communities that need a customised environment for handling their own data.

Getting Started

1. Submit a service request via the EGI website/marketplace request form.
2. The UCST team contacts CNRS to request support for the service integration (on-boarding of the new customer).

Use Cases

WeNMR

The EGI Workload Manager is already used in production by early adopters such as WeNMR, which was able to switch its Science Gateways from gLite WMS to DIRAC easily (slides: http://indico3.twgrid.org/indico/getFile.py/access?contribId=61&sessionId=20&resId=0&materialId=slides&confId=593 ). Ongoing development (as of June 2018) is supported by the WeNMR Thematic Service under the EOSC-hub umbrella.


EISCAT-3D

EISCAT and EGI set up a Competence Centre (CC) in the context of the EGI-Engage project to provide researchers with data analysis tools to improve their scientific discovery opportunities.

The team developed a web portal for researchers to discover, access and analyse the data generated by EISCAT_3D. The CC opted to use the EGI Workload Manager service.

The service provides a web-based graphical interface and a command-line interface for data search and job management. The system also facilitates the development of data models and modelling tools within the EISCAT_3D community, and shows the applicability of operating a central portal service for scientists to interact and compute with EISCAT data. https://wiki.egi.eu/wiki/Competence_centre_EISCAT_3D#First_portal_-_proof_of_concept

New development (as of June 2018) is supported by the EISCAT-3D CC under the EOSC-hub umbrella.

VIRGO

Virgo is a giant laser interferometer designed to detect gravitational waves, located at the European Gravitational Observatory (EGO) site in Cascina, a small town near Pisa. Virgo was designed and built by a collaboration between the French National Center for Scientific Research (CNRS) and the Italian National Institute for Nuclear Physics (INFN). It is now operated and improved by an international collaboration of scientists from France, Italy, the Netherlands, Poland, and Hungary. In 2017, the Nobel Prize in Physics was awarded for decisive contributions to the LIGO detector and the observation of gravitational waves, a discovery published jointly by the LIGO Scientific Collaboration and the Virgo Collaboration.

Virgo is now performing tests using the EGI Workload Manager service. The fact that DIRAC is a mature tool already used by many communities was an important factor in this decision. In addition to the EGI Workload Manager, the Virgo collaboration also decided to test a distributed data management solution to better understand its potential. Considering Virgo's data management needs, it was agreed to also set up a dedicated DIRAC File Catalog component, hosted at the INFN data centre in Bologna, Italy.

The tests conducted so far (as of June 2018) showed good performance. For example, the catalog was populated with millions of records, and performance remained good even with a number of records similar to that expected in production. The tests also made it possible to find and fix some misconfigurations at the resource centres currently available in France, Italy, and the Netherlands. In the following months, more sites will be involved, and there are plans to move and register the production data between the sites using the DIRAC data transfer feature.

