The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can contact the EGI SDIS team at operations @ egi.eu.

Difference between revisions of "Operations Surveys"

From EGIWiki
(79 intermediate revisions by 3 users not shown)
This page collects information about surveys relevant to the EGI Operations Community and the EGI Technology Providers (such as EMI, IGE, etc.). Surveys are conducted to collect information about deployed software, operational tools and generally speaking to collect feedback from Resource Infrastructure Providers and Resource Centres.
{{Template:Op menubar}} {{TOC_right}}


==Survey on Logging and Bookkeeping capabilities, service management and auditing and gLite-CLUSTER==
This page collects information about surveys relevant to the EGI Operations Community and the EGI Technology Providers (such as EMI, IGE, etc.). Surveys are conducted to collect information about deployed software, operational tools and generally speaking to collect feedback from Resource Infrastructure Providers and Resource Centres.
* Release date: 18 Jan 2011
* Deadline for submission: 10 Feb 2011


===Overview===
* '''PART 1, [http://egee.cesnet.cz/cs/JRA1/LB/ Logging and Bookkeeping Service]''': Logging and Bookkeeping (LB) is a monitoring service that gathers, aggregates and archives information on infrastructure behaviour from the perspective of users' tasks. The EMI project aims to extend the scope of LB and integrate it further with other grid services. The first page of the survey contains a set of questions to help the LB Team design the new features of the LB Service and target real users' needs.


* '''PART 2, Remote Grid Service Management (RGSM)''': Management is performed through a set of notifications issued to the relevant Grid service instance. Examples of management actions are: start, stop, drain etc. The RGSM framework can be used for remote management of a service. EMI has a dedicated [https://twiki.cern.ch/twiki/bin/view/EMI/EmiJra1T5TaskForceServiceManagement task force] to investigate the requirements for common service monitoring and management interfaces. This survey collects information and requirements from the EGI operations community and sites to understand which technologies are of interest for service management.
= NGI services provisioning and usage=


* '''PART 3, Grid service auditing (GSA)''': a feature that allows system administrators and users to check the status of a service in terms of load and length of internal queues, and to monitor service workload from a grid point of view over time. Service auditing differs from Nagios-based monitoring in that it is based not on probes but on the periodic gathering of service status information.
*Release date: '''19-06-2013'''  
*Deadline for submission: '''12-07-2013'''


* '''PART 4, gLite CLUSTER''': [https://documents.egi.eu/document/308 glite-CLUSTER] allows the configuration of information related to the batch system environment to be separated from the configuration of the job submission interface. With this service, sites will be able to publish their resource information consistently and without workarounds, even in the case of multiple CEs or clusters with heterogeneous hardware configurations. A few questions aim to understand how much of a problem the publishing of cluster information is for site managers.
== Overview  ==


===Instructions===
The NGI operations sustainability survey indicates little progress in securing new funding for national operations services to compensate for the end of EC funding, expected after April 2014 with the end of EGI-InSPIRE. Through EGI-InSPIRE, the EC currently contributes 33% of the operational costs of national grid infrastructures.
Survey '''closed'''.


===Resources===
Costs of running operations can be partly reduced by sharing operational services with other NGIs and by increasing their efficiency through central coordination of the deployment of core services. With this survey we aim to collect NGI expressions of interest in exploring some of these deployment scenarios in preparation for the end of EGI-InSPIRE. The results of this survey will be used to plan actions for the transition to the period after EGI-InSPIRE.
* [https://wiki.egi.eu/w/images/8/88/SA1_Survey_2011_Jan_18-v3.pdf pdf] version of the survey
 
===Results===
== Instructions  ==
'''Q1''':Do you need more platforms to be supported by EMI software, in addition to those already in use (SL5)?
 
* 12 responses
Please participate in the survey by submitting your contribution. <big>'''[https://www.surveymonkey.com/s/NGI_services_provisioning_and_usage Click here to take the survey]'''</big>.
'''Q2''': Part One, Logging and Bookkeeping (questions from 3 to 10). LB: Which services should be watched in the grid in addition to gLite WMS and CREAM, which are already supported by LB?
 
* ARC CE 7 - 58%
<big>'''[https://wiki.egi.eu/wiki/File:Survey_Federation_of_NGI_services_and_central_coordination_v2_.pdf PDF version of the survey]'''</big>. It is a fill-in PDF form (''Acrobat Reader X'' allows filled-in forms to be saved) provided for your convenience, but please submit your official answer using the online survey.
* UNICORE CE 5 - 42%
 
* Data Transfer 12 - 100%
== Results  ==
* SRM Operations 8 - 67%
 
* Other 1
= Use of configuration management tools in the EGI resource centres  =
'''Q3''':LB: What aggregated information would be useful (e.g. average queue traversal time, task failure rate etc.)?
 
* 12 responses
*Release date: '''2013-01-28'''
'''Q4''':LB: And at what level of aggregation (referring to the previous question)?
*Deadline for submission: '''2013-03-01, close of business'''
* Per User 6 - 55%
 
* Per VO 10 - 91%
== Overview  ==
* Per service instance 8 - 73%
 
* Other, please specify: 5 - 45%
The support of the common components of yaim (glite-yaim-core, in the EMI products list) after the end of the EMI project is being discussed, and the product teams responsible for EMI/gLite components that currently rely on YAIM may decide to move to a different configuration mechanism in the future.<br> <br> Several sites are already using software configuration management tools (e.g. puppet) to deploy their middleware services. If in the future it is no longer possible to set up grid services with a homogeneous configuration file (e.g. the site-info.def used by yaim), the deployment and configuration of grid resource centres may rely more heavily on these tools.
'''Q5''': Would you leverage capturing dependencies among the tracked entities (e.g. to know that computational jobs are blocked by failing transfers of their inputs, and to be able to discover details immediately)?
 
* Yes 5 - 50%
'''Survey purposes:''' The results of this survey will be used by the EGI Operations Management Board to identify suitable replacements for yaim for those EGI/gLite products for which yaim support is abandoned (should that be the case), and to encourage NGIs and sites to share their current experiences and best practices with these tools.
* No  5 - 50%
 
'''Q6''': What is the desired level of complexity of the queries on the service?
== Instructions  ==
*Simple, like: "all tasks on this CE", "this user's tasks within a given time interval": 4 - 36%
 
* More sophisticated, but through current LB querying language: 2 - 18%
Please participate in the survey by submitting your contribution. <big>'''[https://www.surveymonkey.com/s/9TH5T9L Click here to take the survey]'''</big>. '''(please submit one response per site)'''
* Full SQL/XQuery power on the task data: 7 - 64%
 
* Intermediate, describe here: 1 - 9%
<big>'''[https://documents.egi.eu/document/1557 PDF version of the survey]'''</big>. It is a fill-in PDF form (''Acrobat Reader X'' allows filled-in forms to be saved) provided for your convenience, but please submit your official answer using the online survey.
'''Q7''': LB: What are the output formats to be supported?
 
* Glue-conforming WS interface: 6 - 55%
== Results  ==
*Simple key=value text format: 7 - 64%
* [https://indico.egi.eu/indico/getFile.py/access?contribId=6&resId=0&materialId=slides&confId=1234 slides] OMB, March 2013
*JSON: 5 - 45%
 
*Human readable HTML:2 - 18%
 
*Other: 4 - 36%
 
'''Q8''': What modes of retrieving information are foreseen?
 
*Synchronous (query-response):7 - 64%
= [[EGI Operations Surveys (CLOSED)|CLOSED Surveys]]  =
*Asynchronous (subscribe for notification, possibly via message bus): 4 - 36%
 
'''Q9''': For how long data about the task should be kept?
[[Category:Operations_Management_Board]]
*One day 0 0%
*One week 2 18%
*One month 4 36%
*One year or more 5 45%
'''Q10''': Part two, Remote Grid Service Management (questions from 11 to 20). RGSM: Do any of your Grid services come with capabilities to react to certain conditions by adapting their behaviour?
*Yes 4 40%
*No 6 60%
'''Q11''': According to your day-to-day experience, please describe typical service management scenarios. How is management performed?
*8 Responses
'''Q12''': Are there any management commands that can be performed on your Grid services that go beyond specific business logic (e.g. purge persistent data)?
*Yes 2 25%
*No 6 75%
14. RGSM: What are the limits of your Grid service management capabilities? Is there a gap between the capabilities offered and your Grid service management needs?
* 9 responses
15. RGSM: Please list the 5 management commands you would need the most in your setup (e.g. start/stop services, deploy/un-deploy service, purge service data, dynamically change access rights).
* 10 responses
16. RGSM: Which of those 5 management commands apply to all of your Grid services?
* 9 responses
17. RGSM: From your day-to-day experience, how many services do you really need to manage remotely?
* 11 responses
18. RGSM: Are you able to (un)deploy Grid services at runtime?
* Yes: 1 - 9%
* No, but I would need it: 5 - 45%
* No, and I don't need it: 5 - 45%
* Total: 11 - 100%
19. RGSM: If you're deploying stateful Grid services in a site: does the Grid service interface support Grid service state deletion?
* Yes: 2 - 25%
* No: 6 - 75%
* Total: 8 - 100%
20. RGSM: What kind of setup would you prefer for remotely managing your Grid services?
* Dedicated: service management interfaces on each Grid service: 5 - 45%
* Decoupled: services get their commands from a messaging solution they register to: 1 - 9%
* Both: 5 - 45%
* Total: 11 - 100%
21. Part three, Grid Service Auditing (questions from 21 to 26). GSA: What kind of data are you already collecting about your Grid services and how are you doing it?
* 13 responses
22. GSA: For which services is auditing of service status important (service status: workload, queue status, etc.)?
* 10 responses
23. GSA: For each service above, which data is mainly useful?
* 8 responses
24. GSA: For each service above, which data is mainly useful?
* 3 responses
25. GSA: Are the current service auditing capabilities sufficient, or should this be improved?
* Yes: 4 - 44%
* No: 5 - 56%
* Total: 9 - 100%
26. GSA: Should status data be automatically archived?
* Yes: 9 - 90%
* No: 1 - 10%
* Total: 10 - 100%
27. Part four, glite-CLUSTER (questions from 27 to 29). gC: How many sites in your NGI/EIRO have heterogeneous clusters, or multiple subclusters (disjoint sets of worker nodes, each set having sufficiently homogeneous properties), or multiple CEs?
* 0: 4 - 33%
* Up to 5: 6 - 50%
* More than 5: 2 - 17%
* Total: 12 - 100%
28. gC: How many of those sites reported difficulties in configuring their CEs, in order to properly publish their site capacity?
* 0: 5 - 38%
* Up to 5: 5 - 38%
* More than 5: 3 - 23%
* Total: 13 - 100%
29. gC: Given that gLite-CLUSTER is released only for lcg-CE, how many sites in your NGI/EIRO are interested in using the gLite-CLUSTER capability?
* 0: 8 - 62%
* Up to 5: 2 - 15%
* More than 5: 3 - 23%
* Total: 13 - 100%

Latest revision as of 12:29, 20 June 2013




This page collects information about surveys relevant to the EGI Operations Community and the EGI Technology Providers (such as EMI, IGE, etc.). Surveys are conducted to collect information about deployed software, operational tools and generally speaking to collect feedback from Resource Infrastructure Providers and Resource Centres.


NGI services provisioning and usage

  • Release date: 19-06-2013
  • Deadline for submission: 12-07-2013

Overview

The NGI operations sustainability survey indicates little progress in securing new funding for national operations services to compensate for the end of EC funding, expected after April 2014 with the end of EGI-InSPIRE. Through EGI-InSPIRE, the EC currently contributes 33% of the operational costs of national grid infrastructures.

Costs of running operations can be partly reduced by sharing operational services with other NGIs and by increasing their efficiency through central coordination of the deployment of core services. With this survey we aim to collect NGI expressions of interest in exploring some of these deployment scenarios in preparation for the end of EGI-InSPIRE. The results of this survey will be used to plan actions for the transition to the period after EGI-InSPIRE.

Instructions

Please participate in the survey by submitting your contribution. Click here to take the survey.

PDF version of the survey. It is a fill-in PDF form (Acrobat Reader X allows filled-in forms to be saved) provided for your convenience, but please submit your official answer using the online survey.

Results

Use of configuration management tools in the EGI resource centres

  • Release date: 2013-01-28
  • Deadline for submission: 2013-03-01, close of business

Overview

The support of the common components of yaim (glite-yaim-core, in the EMI products list) after the end of the EMI project is being discussed, and the product teams responsible for EMI/gLite components that currently rely on YAIM may decide to move to a different configuration mechanism in the future.

Several sites are already using software configuration management tools (e.g. puppet) to deploy their middleware services. If in the future it is no longer possible to set up grid services with a homogeneous configuration file (e.g. the site-info.def used by yaim), the deployment and configuration of grid resource centres may rely more heavily on these tools.

Survey purposes: The results of this survey will be used by the EGI Operations Management Board to identify suitable replacements for yaim for those EGI/gLite products for which yaim support is abandoned (should that be the case), and to encourage NGIs and sites to share their current experiences and best practices with these tools.

Instructions

Please participate in the survey by submitting your contribution. Click here to take the survey. (Please submit one response per site.)

PDF version of the survey. It is a fill-in PDF form (Acrobat Reader X allows filled-in forms to be saved) provided for your convenience, but please submit your official answer using the online survey.

Results



CLOSED Surveys