Difference between revisions of "VO Services and Tools Portfolio"
(→Diane) |
|||
(83 intermediate revisions by 5 users not shown) | |||
Line 1: | Line 1: | ||
{{Template:Op menubar}} | |||
= | {{Template:Doc_menubar}} | ||
[[Category:Operations Documentation]] | |||
= Ganga = | {{TOC_right}} | ||
= VO Job oriented tools = | |||
== Job management== | |||
=== Ganga === | |||
GANGA aims to be an easy-to-use tool for job submission and management. It is built on Python and provides client command tools, a Graphical User Interface (GUI), and a WebGUI. A job in GANGA is constructed from a set of building blocks. All jobs must specify the software to be run (application) and the processing system (backend) to be used. In practice, this means that GANGA can be used to submit jobs to the local host where it is installed, to a local farm, or to a computing grid such as LCG/EGI, as long as the appropriate client command tools are available to GANGA. GANGA is presently used by ATLAS and LHCb users, among other collaborations. | GANGA aims to be an easy-to-use tool for job submission and management. It is built on Python and provides client command tools, a Graphical User Interface (GUI), and a WebGUI. A job in GANGA is constructed from a set of building blocks. All jobs must specify the software to be run (application) and the processing system (backend) to be used. In practice, this means that GANGA can be used to submit jobs to the local host where it is installed, to a local farm, or to a computing grid such as LCG/EGI, as long as the appropriate client command tools are available to GANGA. GANGA is presently used by ATLAS and LHCb users, among other collaborations. | ||
Line 37: | Line 40: | ||
|- | |- | ||
| '''Offered as a service?''' | | '''Offered as a service?''' | ||
| | | '''The VO could offer a central installation service for their VO users''' | ||
|- | |- | ||
|'''Evaluation Document:''' | |'''Evaluation Document:''' | ||
|[[Media:Diane_Ganga_Evaluation_v2.pdf|Ganga Evaluation Document]] | |[[Media:Diane_Ganga_Evaluation_v2.pdf|Ganga Diane Minidashboard Evaluation Document]] | ||
|} | |} | ||
<br /> | <br /> | ||
= Diane = | === Diane === | ||
DIANE is a lightweight job execution control framework for parallel scientific applications. It aims to improve the reliability and efficiency of job execution by providing automatic load balancing, fine-grained scheduling and failure recovery. The backbone of the DIANE communication model is a master-worker architecture. This approach is also known as agent-based computing or pilot jobs, in which a set of worker agents controls the resources. Resource allocation is independent of application execution control and may therefore be easily adapted to various use cases. DIANE uses GANGA to allocate resources by sending worker agent jobs, so the system supports a large number of computing backends: LSF, PBS, SGE, Condor, LCG/EGI Grid. | DIANE is a lightweight job execution control framework for parallel scientific applications. It aims to improve the reliability and efficiency of job execution by providing automatic load balancing, fine-grained scheduling and failure recovery. The backbone of the DIANE communication model is a master-worker architecture. This approach is also known as agent-based computing or pilot jobs, in which a set of worker agents controls the resources. Resource allocation is independent of application execution control and may therefore be easily adapted to various use cases. DIANE uses GANGA to allocate resources by sending worker agent jobs, so the system supports a large number of computing backends: LSF, PBS, SGE, Condor, LCG/EGI Grid. | ||
Line 79: | Line 82: | ||
|- | |- | ||
| '''Offered as a service?''' | | '''Offered as a service?''' | ||
| | | '''The VO could offer a central installation service for their VO users''' | ||
|- | |||
|'''Evaluation Document:''' | |||
|[[Media:Diane_Ganga_Evaluation_v2.pdf|Ganga Diane Minidashboard Evaluation Document]] | |||
|} | |||
<br /> | |||
== Job monitoring == | |||
=== MiniDashboards === | |||
The mini-Dashboard monitoring service provides a web-based interface where users can easily keep track of GANGA and DIANE jobs. The mini-Dashboard runs a simple MySQL database as its backend (as opposed to the Oracle database used by HEP VOs) but uses the same web interface technology as HEP VOs. | |||
<br /> | |||
<big>'''Key Features'''</big> | |||
* Aggregated, customizable graphical view of individual user usage at a given time | |||
* VOs can install their own instance and customize it for VO-specific needs | |||
<br /> | |||
{|border="1" cellspacing="0" cellpadding="1" | |||
|'''Description:''' | |||
|Web-based interface where users may easily keep track of GANGA and DIANE jobs | |||
|- | |||
|'''Home web page:''' | |||
|https://twiki.cern.ch/twiki/bin/view/ArdaGrid/EGIIntroductoryPackage | |||
|- | |||
|'''Installation guide:''' | |||
|https://twiki.cern.ch/twiki/bin/view/ArdaGrid/GangaDIANEOperationsProcedures | |||
|- | |||
|'''User Manual:''' | |||
|https://twiki.cern.ch/twiki/bin/view/ArdaGrid/GangaDIANEOperationsProcedures#Instructions_for_users | |||
|- | |||
| '''FAQ''' | |||
| | |||
|- | |||
|'''Video link:''' | |||
| | |||
|- | |||
| '''Offered as a service?''' | |||
| '''Yes (@ CERN)''' <br /> '''gangamon instance:''' http://gangamon.cern.ch/ganga <br />'''dianemon instance:''' http://dianemon.cern.ch/diane | |||
|- | |- | ||
|'''Evaluation Document:''' | |'''Evaluation Document:''' | ||
|[[Media:Diane_Ganga_Evaluation_v2.pdf|Ganga Evaluation Document]] | |[[Media:Diane_Ganga_Evaluation_v2.pdf|Ganga Diane Minidashboard Evaluation Document]] | ||
|} | |} | ||
<br /> | <br /> | ||
== Data Management == | |||
=== SE intervention: LFCBrowseSE === | |||
When an SE needs intervention (e.g. it becomes full or is decommissioned), the files stored on it could be compromised. All replicas of files registered in the VO LFC that are stored on such a server need to be backed up. VO managers therefore need to identify them, get the details of their owners, and either warn the owners or back up all the files to a different SE. | |||
A tool has been produced for this purpose. It enables the VO manager (or any user with permissions on the LFC) to: | |||
* Obtain all the SURLs of the compromised files (LFCBrowseSE <se> <vo> --sfn). It relies on an LFC API that directly queries the Cns_file_replica database and quickly retrieves all the SURLs of the files stored in the specified SE and listed in the LFC catalogue pointed to by LFC_HOST. | |||
* Obtain more information (such as DNs, LFNs, GUIDs, size) about the affected files (LFCBrowseSE <se> [--vo <vo>] [--sfn] [--lfn] [--dn] [--size] [--guid]). A filter by VO is provided by querying the metadata table (slower, but independent of the structure of the SURL). | |||
* Obtain a summary of all the users with files in the SE and the space they occupy (LFCBrowseSE <se> [--vo <vo>] --summary). | |||
* Replicate the affected files to another SE (LFCBrowseSE [--vo <vo>] --rep <DstSE>). | |||
<br /> | |||
<big>'''Key Features'''</big> | |||
* Standalone command-line tool implemented in C (URL). | |||
* Requires lfc-1.7.4.7 or later. Works well on UIs. | |||
* Requires only a valid certificate and VOMS membership. | |||
<br /> | |||
{|border="1" cellspacing="0" cellpadding="1" | |||
|'''Description:''' | |||
| SE intervention tool. | |||
|- | |||
|'''Home web page:''' | |||
| http://lsgc.org/en/Biomed-Shifts:Biomed-support-tools | |||
|- | |||
|'''Installation Guide:''' | |||
| Simply "make" the code | |||
|- | |||
|'''User Manual:''' | |||
| [[Media:SE_Decommission.pdf|SE Browse User Manual Document]] | |||
|- | |||
|'''FAQ''' | |||
| | |||
|- | |||
|'''Video link:''' | |||
| | |||
|- | |||
| '''Offered as a service?''' | |||
| '''Command-line-based.''' | |||
|- | |||
|'''Evaluation Document:''' | |||
| [[Media:SE_Decommission.pdf|SE Browse Evaluation Document]] | |||
|} | |||
<br /> | |||
= VO infrastructure oriented tools = | |||
== Infrastructure monitoring == | |||
=== Service Availability Monitoring (SAM) for VOs === | |||
The VO SAM service is an adaptation of the operations SAM service used by NGIs. It is useful for monitoring VO/VRC infrastructures within a given NGI or group of NGIs. One of the most obvious advantages of this service is that a VO can develop and integrate its own probes. While EGI operations teams test and monitor the status of resources through generic tests, these tests can be insufficient for certain communities. This approach allows those communities to define custom test suites and insert them into their SAM system. | |||
<br /> | |||
<big>'''Key Features'''</big> | |||
* Possibility to monitor the infrastructure of several VOs using the same SAM box | |||
* VOs could install their own instance | |||
* VOs could develop and integrate their own probes | |||
<br /> | |||
{|border="1" cellspacing="0" cellpadding="1" | |||
|'''Description:''' | |||
|SAM instance to monitor VO/VRC infrastructures within a given NGI or groups of NGIs | |||
|- | |||
|'''Home web page:''' | |||
|[[VO_Service_Availability_Monitoring]] | |||
|- | |||
|'''Installation guide:''' | |||
|[[VO_Service_Availability_Monitoring#Install_and_configure_SAM_with_YAIM]] | |||
|- | |||
|'''User Manual:''' | |||
| | |||
|- | |||
| '''FAQ''' | |||
|[[VO_Service_Availability_Monitoring#Frequently_Asked_Questions_.26_Troubleshooting]] | |||
|- | |||
|'''Video link:''' | |||
| | |||
|- | |||
| '''Offered as a service?''' | |||
| '''Yes (@ LIP and UPV)''' <br/> '''VO SAM instance:''' https://nagios01.ncg.ingrid.pt/nagios | |||
|- | |||
|'''Evaluation Document:''' | |||
|Not Applicable | |||
|} | |||
<br /> | |||
= VO Administration oriented tools = | |||
== Administrative tools == | |||
=== VO Admin Dashboard === | |||
The VO Admin Dashboard is a tool designed for VO administrators. It provides, on a single web page, different views from different applications (EGI and non-EGI) specific to VO administration. Integrating all these tools, which are scattered across several places, into a single view gives VO administrators easy access to and navigation through all the administrative tools, allowing faster identification of problems and better correlation of events. Combined with quick and direct access to the administrative applications, this makes VO administration tasks much easier and faster, and improves the analysis of statistics and the detection of performance issues and misconfigurations. | |||
The tool also provides a highly configurable environment so that VO administrators can customize the offered views according to their own needs, or even add additional VO-specific sources on demand. | |||
<big>'''Key Features'''</big> | |||
*No installation required (web browser) | |||
*It simply requires a valid certificate and VOMS membership. | |||
* Administration tools centralized in one view | |||
* VO configurable view | |||
* New tools can be integrated manually | |||
<br /> | |||
{|border="1" cellspacing="0" cellpadding="1" | |||
|'''Description:''' | |||
| A functional browser specifically designed for VO administrative tasks. | |||
|- | |||
|'''Home web page:''' | |||
|https://vodashboard.lip.pt/ | |||
|- | |||
|'''Installation Guide:''' | |||
| No installation required | |||
|- | |||
|'''User Manual:''' | |||
|[[VO_Admin_Dashboard]] | |||
|- | |||
|'''FAQ''' | |||
|[[VO_Admin_Dashboard#Know_issues]] | |||
|- | |||
|'''Video link:''' | |||
| | |||
|- | |||
| '''Offered as a service?''' | |||
| Yes | |||
|- | |||
|'''Evaluation Document:''' | |||
| | |||
|} | |||
= Service provision summary table = | |||
The following table summarizes the different deployment scenarios for the tools proposed above. | |||
<br /> | |||
{|border="1" cellspacing="0" cellpadding="1" | |||
| colspan="2" align="center" | '''Deployment scenarios''' | |||
|align="center"| '''Supported by the VO''' | |||
| '''Temporarily hosted elsewhere''' | |||
| '''Hosted elsewhere''' | |||
|- | |||
|rowspan="2"|'''Job Management oriented tools''' | |||
| [[VO Services and Tools Portfolio#Ganga |'''Ganga''']] | |||
|align="center"|<big><span title="&otimes;">⊗</span></big> | |||
| | |||
| | |||
|- | |||
|[[VO Services and Tools Portfolio#Diane| '''Diane''']] | |||
|align="center"|<big><span title="&otimes;">⊗</span></big> | |||
| | |||
| | |||
|- | |||
|'''Job Monitoring oriented tools''' | |||
|[[VO Services and Tools Portfolio#Minidashboards |'''Minidashboards''']] | |||
| | |||
| | |||
|align="center"|<big><span title="&otimes;">⊗</span> (@CERN)</big> | |||
|- | |||
|'''VO infrastructure Monitoring oriented tools''' | |||
|[[VO Services and Tools Portfolio#Service_Availability_Monitoring_.28SAM.29_for_VOs |'''VO SAM''']] | |||
|align="center"|<big><span title="&otimes;">⊗</span></big> | |||
|align="center"|<big><span title="&otimes;">⊗</span> (@LIP,UPV)</big> | |||
| | |||
|- | |||
|'''VO Administration oriented tools''' | |||
|[[VO Services and Tools Portfolio#VO_Admin_Dashboard |'''VO Admin Dashboard''']] | |||
| | |||
|align="center"|<big><span title="&otimes;">⊗</span> (@LIP)</big> | |||
| | |||
|} |
Latest revision as of 11:33, 23 November 2012
VO Job oriented tools
Job management
Ganga
GANGA aims to be an easy-to-use tool for job submission and management. It is built on Python and provides client command tools, a Graphical User Interface (GUI), and a WebGUI. A job in GANGA is constructed from a set of building blocks. All jobs must specify the software to be run (application) and the processing system (backend) to be used. In practice, this means that GANGA can be used to submit jobs to the local host where it is installed, to a local farm, or to a computing grid such as LCG/EGI, as long as the appropriate client command tools are available to GANGA. GANGA is presently used by ATLAS and LHCb users, among other collaborations.
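The application/backend split described above can be illustrated with a small stand-alone Python sketch. It deliberately does not import GANGA: the classes `Application`, `LocalBackend` and `Job` below are hypothetical stand-ins written for this illustration, and only the local-host case is modelled.

```python
import subprocess
import sys
from dataclasses import dataclass, field


@dataclass
class Application:
    """What to run: an executable plus its arguments (illustrative class)."""
    exe: str
    args: list = field(default_factory=list)


class LocalBackend:
    """Where to run: here, simply the local host (illustrative class)."""

    def run(self, app: Application) -> str:
        result = subprocess.run(
            [app.exe, *app.args], capture_output=True, text=True, check=True
        )
        return result.stdout


@dataclass
class Job:
    """A job is assembled from two building blocks: application and backend."""
    application: Application
    backend: LocalBackend

    def submit(self) -> str:
        # The job delegates execution to whichever backend it was given.
        return self.backend.run(self.application)


job = Job(
    application=Application(sys.executable, ["-c", "print('hello from a job')"]),
    backend=LocalBackend(),
)
output = job.submit()
print(output.strip())
```

Because the job only talks to its backend through `run`, swapping the local backend for a batch-farm or grid backend would not require changing the application description, which is the point of the building-block design.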
Key Features
- Easy installation
- Extensible for integration of other backends
- Easy command tools (if you are used to Python)
- Easy GUI with the capacity to re-use jobs and job templates.
- The VO may want to offer a central installation to be used by all users
- The GANGA GUI may shorten the learning curve for users of the VO infrastructure
Description: | A tool for computational-task management and easy access to Grid resources |
Home web page: | http://ganga.web.cern.ch/ganga/ |
Installation guide: | http://ganga.web.cern.ch/ganga/user/installation/index.php |
User Manual: | http://ganga.web.cern.ch/ganga/user/html/GUI_User_Manual/ |
FAQ | |
Video link: | http://www.lip.pt/grid/ganga.avi |
Offered as a service? | The VO could offer a central installation service for their VO users |
Evaluation Document: | Ganga Diane Minidashboard Evaluation Document |
Diane
DIANE is a lightweight job execution control framework for parallel scientific applications. It aims to improve the reliability and efficiency of job execution by providing automatic load balancing, fine-grained scheduling and failure recovery. The backbone of the DIANE communication model is a master-worker architecture. This approach is also known as agent-based computing or pilot jobs, in which a set of worker agents controls the resources. Resource allocation is independent of application execution control and may therefore be easily adapted to various use cases. DIANE uses GANGA to allocate resources by sending worker agent jobs, so the system supports a large number of computing backends: LSF, PBS, SGE, Condor, LCG/EGI Grid.
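The master-worker idea can be sketched in a few lines of plain Python. This is not the DIANE API: threads stand in for worker agents, and the function names (`run_master_worker`, `flaky_square`) and the retry limit are assumptions made for the example. Workers pull tasks from a shared queue, which balances load automatically, and a failed task is put back on the queue, which gives a simple form of failure recovery.

```python
import queue
import threading


def run_master_worker(tasks, work, n_workers=3, max_attempts=3):
    """Distribute tasks to pulling workers and re-queue tasks that fail."""
    pending = queue.Queue()
    for task in tasks:
        pending.put((task, 1))  # (task, attempt number)
    results, failed = {}, []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task, attempt = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to pull
            try:
                value = work(task, attempt)
            except Exception:
                if attempt < max_attempts:
                    pending.put((task, attempt + 1))  # failure recovery
                else:
                    with lock:
                        failed.append(task)
                continue
            with lock:
                results[task] = value

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, failed


def flaky_square(task, attempt):
    # Hypothetical workload: task 3 fails on its first attempt only.
    if task == 3 and attempt == 1:
        raise RuntimeError("simulated worker failure")
    return task * task


results, failed = run_master_worker(range(6), flaky_square)
print(sorted(results.items()), failed)
```

A worker that re-queues a task always loops again before exiting, so a retried task is never left behind on the queue.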
Key Features
- Easy installation
- Increases the reliability and success rate of job management
- The VO may want to offer a central installation to be used by all users
- Suitable for VO production needs
Description: | Lightweight distributed framework for parallel scientific applications in master-worker model. |
Home web page: | http://it-proj-diane.web.cern.ch/it-proj-diane/ |
Installation guide: | http://it-proj-diane.web.cern.ch/it-proj-diane/install.php |
User Manual: | https://twiki.cern.ch/twiki/bin/view/ArdaGrid/DIANETutorial#DIANE_Tutorial |
FAQ | https://twiki.cern.ch/twiki/bin/view/ArdaGrid/DIANEQuestionsAndAnswers |
Video link: | |
Offered as a service? | The VO could offer a central installation service for their VO users |
Evaluation Document: | Ganga Diane Minidashboard Evaluation Document |
Job monitoring
MiniDashboards
The mini-Dashboard monitoring service provides a web-based interface where users can easily keep track of GANGA and DIANE jobs. The mini-Dashboard runs a simple MySQL database as its backend (as opposed to the Oracle database used by HEP VOs) but uses the same web interface technology as HEP VOs.
Key Features
- Aggregated, customizable graphical view of individual user usage at a given time
- VOs can install their own instance and customize it for VO-specific needs
Description: | Web-based interface where users may easily keep track of GANGA and DIANE jobs |
Home web page: | https://twiki.cern.ch/twiki/bin/view/ArdaGrid/EGIIntroductoryPackage |
Installation guide: | https://twiki.cern.ch/twiki/bin/view/ArdaGrid/GangaDIANEOperationsProcedures |
User Manual: | https://twiki.cern.ch/twiki/bin/view/ArdaGrid/GangaDIANEOperationsProcedures#Instructions_for_users |
FAQ | |
Video link: | |
Offered as a service? | Yes (@ CERN) gangamon instance: http://gangamon.cern.ch/ganga dianemon instance: http://dianemon.cern.ch/diane |
Evaluation Document: | Ganga Diane Minidashboard Evaluation Document |
Data Management
SE intervention: LFCBrowseSE
When an SE needs intervention (e.g. it becomes full or is decommissioned), the files stored on it could be compromised. All replicas of files registered in the VO LFC that are stored on such a server need to be backed up. VO managers therefore need to identify them, get the details of their owners, and either warn the owners or back up all the files to a different SE. A tool has been produced for this purpose. It enables the VO manager (or any user with permissions on the LFC) to:
- Obtain all the SURLs of the compromised files (LFCBrowseSE <se> <vo> --sfn). It relies on an LFC API that directly queries the Cns_file_replica database and quickly retrieves all the SURLs of the files stored in the specified SE and listed in the LFC catalogue pointed to by LFC_HOST.
- Obtain more information (such as DNs, LFNs, GUIDs, size) about the affected files (LFCBrowseSE <se> [--vo <vo>] [--sfn] [--lfn] [--dn] [--size] [--guid]). A filter by VO is provided by querying the metadata table (slower, but independent of the structure of the SURL).
- Obtain a summary of all the users with files in the SE and the space they occupy (LFCBrowseSE <se> [--vo <vo>] --summary).
- Replicate the affected files to another SE (LFCBrowseSE [--vo <vo>] --rep <DstSE>).
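For scripted use, the listing and summary invocations above could be assembled programmatically. The following Python sketch only builds and prints the command lines, using the `<se> [--vo <vo>] [flags]` form quoted in the list; the helper name `lfcbrowsese_args`, the SE host name and the VO name are illustrative assumptions, and replication is left out.

```python
import shlex

# Field flags as listed in the text above.
FIELD_FLAGS = ("sfn", "lfn", "dn", "size", "guid")


def lfcbrowsese_args(se, vo=None, fields=(), summary=False):
    """Build an argument list of the form `LFCBrowseSE <se> [--vo <vo>] [flags]`."""
    args = ["LFCBrowseSE", se]
    if vo:
        args += ["--vo", vo]
    if summary:
        args.append("--summary")
    for name in fields:
        if name not in FIELD_FLAGS:
            raise ValueError(f"unknown field flag: {name}")
        args.append(f"--{name}")
    return args


# Hypothetical SE host name and VO, for illustration only.
listing = lfcbrowsese_args("se.example.org", vo="biomed", fields=["sfn", "dn", "size"])
summary = lfcbrowsese_args("se.example.org", vo="biomed", summary=True)
print(shlex.join(listing))
print(shlex.join(summary))
```

The resulting lists can be handed to `subprocess.run` on a machine where the tool is installed, with `LFC_HOST` set in the environment to select the catalogue.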
Key Features
- Standalone command-line tool implemented in C (URL).
- Requires lfc-1.7.4.7 or later. Works well on UIs.
- Requires only a valid certificate and VOMS membership.
Description: | SE intervention tool. |
Home web page: | http://lsgc.org/en/Biomed-Shifts:Biomed-support-tools |
Installation Guide: | Simply "make" the code |
User Manual: | SE Browse User Manual Document |
FAQ | |
Video link: | |
Offered as a service? | Command-line-based. |
Evaluation Document: | SE Browse Evaluation Document |
VO infrastructure oriented tools
Infrastructure monitoring
Service Availability Monitoring (SAM) for VOs
The VO SAM service is an adaptation of the operations SAM service used by NGIs. It is useful for monitoring VO/VRC infrastructures within a given NGI or group of NGIs. One of the most obvious advantages of this service is that a VO can develop and integrate its own probes. While EGI operations teams test and monitor the status of resources through generic tests, these tests can be insufficient for certain communities. This approach allows those communities to define custom test suites and insert them into their SAM system.
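As an illustration of what a VO-specific probe might look like, the sketch below follows the standard Nagios plugin convention (exit status 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN, plus a one-line message). It assumes the SAM box runs its probes as Nagios-style plugins; the check itself (free disk space), the function name and the thresholds are hypothetical and chosen only to keep the example self-contained.

```python
import shutil

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_free_space(path, warn_pct=20.0, crit_pct=10.0):
    """Return (status, message) describing the free space under `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"UNKNOWN - cannot stat {path}: {exc}"
    if usage.total == 0:
        return UNKNOWN, f"UNKNOWN - {path} reports zero total size"
    free_pct = 100.0 * usage.free / usage.total
    if free_pct < crit_pct:
        return CRITICAL, f"CRITICAL - {free_pct:.1f}% free on {path}"
    if free_pct < warn_pct:
        return WARNING, f"WARNING - {free_pct:.1f}% free on {path}"
    return OK, f"OK - {free_pct:.1f}% free on {path}"


status, message = check_free_space("/")
print(message)
# A real plugin would finish with sys.exit(status) so the scheduler sees the code.
```

Keeping the check in a function that returns the status, instead of exiting directly, makes the probe easy to test before it is deployed.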
Key Features
- Possibility to monitor the infrastructure of several VOs using the same SAM box
- VOs could install their own instance
- VOs could develop and integrate their own probes
Description: | SAM instance to monitor VO/VRC infrastructures within a given NGI or groups of NGIs |
Home web page: | VO_Service_Availability_Monitoring |
Installation guide: | VO_Service_Availability_Monitoring#Install_and_configure_SAM_with_YAIM |
User Manual: | |
FAQ | VO_Service_Availability_Monitoring#Frequently_Asked_Questions_.26_Troubleshooting |
Video link: | |
Offered as a service? | Yes (@ LIP and UPV) VO SAM instance: https://nagios01.ncg.ingrid.pt/nagios |
Evaluation Document: | Not Applicable |
VO Administration oriented tools
Administrative tools
VO Admin Dashboard
The VO Admin Dashboard is a tool designed for VO administrators. It provides, on a single web page, different views from different applications (EGI and non-EGI) specific to VO administration. Integrating all these tools, which are scattered across several places, into a single view gives VO administrators easy access to and navigation through all the administrative tools, allowing faster identification of problems and better correlation of events. Combined with quick and direct access to the administrative applications, this makes VO administration tasks much easier and faster, and improves the analysis of statistics and the detection of performance issues and misconfigurations.
The tool also provides a highly configurable environment so that VO administrators can customize the offered views according to their own needs, or even add additional VO-specific sources on demand.
Key Features
- No installation required (web browser)
- It simply requires a valid certificate and VOMS membership.
- Administration tools centralized in one view
- VO configurable view
- New tools can be integrated manually
Description: | A functional browser specifically designed for VO administrative tasks. |
Home web page: | https://vodashboard.lip.pt/ |
Installation Guide: | No installation required |
User Manual: | VO_Admin_Dashboard |
FAQ | VO_Admin_Dashboard#Know_issues |
Video link: | |
Offered as a service? | Yes |
Evaluation Document: |
Service provision summary table
The following table summarizes the different deployment scenarios for the tools proposed above.
Deployment scenarios | | Supported by the VO | Temporarily hosted elsewhere | Hosted elsewhere |
Job Management oriented tools | Ganga | ⊗ | | |
Job Management oriented tools | Diane | ⊗ | | |
Job Monitoring oriented tools | Minidashboards | | | ⊗ (@CERN) |
VO infrastructure Monitoring oriented tools | VO SAM | ⊗ | ⊗ (@LIP,UPV) | |
VO Administration oriented tools | VO Admin Dashboard | | ⊗ (@LIP) | |