Resource Allocation Task Force brainstorming page
Revision as of 22:11, 25 January 2013
This page is a brainstorming space for the Resource Allocation Task Force. It contains ideas and issues for discussion.
Terminology
Service Level Agreement (SLA) - An agreement between a Service Provider and a customer/client. The SLA describes the IT Service, documents service level targets and specifies the responsibilities of the Service Provider and the customer/client. A single SLA may cover multiple IT Services or multiple customers/clients.
Operational Level Agreement (OLA) - An agreement between an IT Service Provider and another part of the same organisation. An OLA supports the IT Service Provider’s delivery of IT Services to Users. The OLA defines the goods or Services to be provided and the responsibilities of both parties.
Resource Allocation -
Please refer to the EGI Glossary for the definitions of the terms used.
Goal
The main goals are:
- Attract new users and allocate grid resources to them
- Provide users with a simplified procedure for finding grid resources that meet their needs, through a single central reference point
- Foster a virtuous cycle among new scientific communities, EGI, NGIs, Resource Centers and funding agencies in order to attract new funding to strengthen and expand the European grid infrastructure according to the needs of those scientific communities
- Demonstrate to the national funding agencies, the EC (and everybody else) that the EGI infrastructure and services are valuable, useful and used by a wide variety of users
The high level concept
EGI is "a place" where Customer and Resource Provider meet and negotiate resource allocation.
Resource allocation high level process
Roles and activities
Role | Customer | Resource Provider | Supporter | Supervisor | Broker |
---|---|---|---|---|---|
Activities | | | | | |
Who | | | | | |
Prerequisites | VOs entitled to use resources should be registered and in production status in the Operations Portal | Resource Centres offering resources must be registered in GOCDB as production entities; this ensures that all security policies will be enforced. | | | |
Resource Allocation Models
Three models were introduced to examine the best option for the resource allocation process in the EGI project.
Different models may be applied depending on
- type of Customer
  - Active or New VO
  - International or national VO
- type of request
  - "Big" or "Small"
- type of role the NGI plays in the allocation process.
Model 1 - Open market
RPs’ resources and Customers’ requests are published on an open market (e.g. a shopping website such as eBay) and are visible to everyone.
Supervisor regulates and supervises the market. It determines the rules (e.g. tools, procedures, policies) of the market and acts as a referee in case of agreement violation/negotiation.
Role | Supervisor | Resource Provider | Customer |
---|---|---|---|
Pros | | | |
Cons | | | |
Suitable for:
- "Small" requests
- Active Customers
- National Customers
Options
International Customers
- EGI
  - for international Customers is a Supervisor
- NGI
  - for international Customers can be a Broker (Model 2) or Supporter (Model 3) for its RC
- RC
  - provides resources
National Customers
- EGI
  - for national Customers may be a Supporter
- NGI
  - for national Customers can be Supervisor for its RC
- RC
  - provides resources
Model 2 - Broker
Providers’ resources and Customers’ requests are visible to Broker.
The Broker matches Providers’ resources to Customers’ requests.
Interaction between Customer and Provider is limited.
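The matching step described above can be sketched as follows. This is only an illustration under stated assumptions: the page does not define a matching algorithm, so the greedy first-fit strategy, the `Offer` and `Request` types, the `broker_match` function and the use of "slots" as the unit of capacity are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    provider: str    # hypothetical Resource Provider name
    free_slots: int  # capacity the provider advertises to the Broker


@dataclass
class Request:
    customer: str    # hypothetical Customer (VO) name
    slots: int       # capacity the Customer asks for


def broker_match(requests, offers):
    """Greedy first-fit: serve requests in order, drawing on offers in order.

    Returns {customer: [(provider, slots), ...]}. A request that cannot be
    met in full is left out, so nothing is partially allocated.
    """
    remaining = {o.provider: o.free_slots for o in offers}
    allocations = {}
    for req in requests:
        if sum(remaining.values()) < req.slots:
            continue  # not enough capacity left; the Broker would renegotiate
        needed, plan = req.slots, []
        for provider, free in remaining.items():
            if needed == 0:
                break
            take = min(free, needed)
            if take > 0:
                plan.append((provider, take))
                needed -= take
        # Commit the plan only after it is complete.
        for provider, take in plan:
            remaining[provider] -= take
        allocations[req.customer] = plan
    return allocations


offers = [Offer("RC-A", 100), Offer("RC-B", 50)]
requests = [Request("vo1", 120), Request("vo2", 40)]
result = broker_match(requests, offers)
```

In this sketch the Customer sees only the combined result, which reflects the limited Customer/Provider interaction of this model; a real Broker would also have to weigh technical requirements and local policies.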
Role | Broker | Resource Provider | Customer |
---|---|---|---|
Pros | | | |
Cons | | | |
Suitable for:
- "Big" requests
- Generic and specialised requests
- Active and New Customers
Options
International Customers
- EGI
  - for international VOs is a Broker
- NGI
  - for international VOs can be a Supporter (Model 3) for its RC
- RC
  - provides resources
National Customers
- EGI
  - for national VOs is a Supporter/Supervisor
- NGI
  - for national VOs can be Broker for its RC
- RC
  - provides resources
Model 3 - Freedom of choice
Providers’ resources and Customers’ requests are transparent, i.e. visible to everyone.
The parties are free to decide how they want to negotiate resources and under which conditions (no regulation on the Supporter side).
The Supporter's role is to support Providers and/or Customers on demand, e.g. by providing a tool for resource allocation or helping with request fulfilment, specialised requests etc.
Role | Supporter | Resource Provider | Customer |
---|---|---|---|
Pros | | | |
Cons | | | |
Suitable for:
- National Customers
Options
International Customers
- EGI
  - for international Customers is a Supporter
- NGI
  - can be a Broker/Supervisor
- RC
  - provides resources
National Customers
- EGI
  - for national Customers is a Supporter
- NGI
  - for national VOs can be Supporter
- RC
  - provides resources
Issues for discussion
Issue | Comment | Requirement | Open questions |
---|---|---|---|
The roles of NGI, EGI, RC in the process | | | |
Scientific request review | Scientific Review Committee (SRC) - Terms Of Reference: https://documents.egi.eu/secure/ShowDocument?docid=1472&version=2 | | |
Service Level Management | | What should be included in the SLA? | |
Technical request review | Needed to be sure that what we promise can be delivered to the Customer | | |
The acknowledgement for resources use | VT_Scientific_Publications_Repository: the recommendation about citing EGI in publications is defined here: https://documents.egi.eu/document/1369. The implementation requires a change in the Grid AUP; the issue has been raised with the Security Coordination Group, and David Kelsey is checking with the big customer (WLCG) whether and how they will accept it before the change is made. Work in progress. | | |
Resources allocations | | | |
Monitoring of usage, reporting and evaluation | | | |
Customer support | | | |
Quality of Service | | | |
Meeting specific needs of the VO in terms of OS, configuration, software etc. | | | |
Regional VOs | | | |
International VOs | EGI fully supports international VOs | | |
Requirements
EGI
- EGI is responsible for negotiating with the RPs the requirements collected from VOs. For example, it is the body that investigates which RPs would be willing to implement the needed server or software setup, the required storage system, etc. It is also a single point of contact for new users (babysitting).
- EGI should be responsible for all service level management activities for the federated pool, to relieve VOs and RPs of that work.
- All EGI security policies are enforced by the RPs contributing resources to the federated pool (e.g. individual RCs must be registered as production sites in GOCDB).
- Processes for collecting demand and offers and for negotiating SLAs and OLAs should be automated as much as possible so that the service scales.
Resource Provider
- Elastic resource provisioning: the federated pool does not need to be defined statically. The pool should include a statically defined minimum set of resources, and additional resources could be contributed on demand by partners, so that resources can be contributed or withdrawn on a short time scale depending on which VOs submitted a request and on the short-term resource occupancy of a cluster. By doing so EGI has a chance of attracting more resources.
- Scientific review: a scientific review may not always be needed, for example when (1) the resource requirements are small and do not justify the overhead of a review (a threshold can be defined), (2) resources are requested to kick off the activities of a new VO, or (3) the user community/project is already supported by one or more NGIs.
- Technical review: a technical review of demand is needed to complement the initial scientific peer review (where applicable) and to decide which RPs can meet the technical requirements. RPs must be involved in this and need to interact directly with a VO representative, in order to inspect the VO requirements technically and understand their impact on the resource configuration (cluster, storage set-up, network etc.), to check whether the demand complies with local policies, and to confirm that resource allocation is possible for them.
- Match demand and offer. The resource allocation process should not only allow a resource request to be answered with an offer but also, vice versa, allow a site with free resources to advertise them.
- Quality of Service. Different levels of Quality of Service could be offered by the federated pool (the type and number of levels should be discussed). Two opposite examples of levels are: (1) best-effort allocation without a minimum number of slots allocated (opportunistic usage); (2) a minimum number of slots allocated at high priority, so that jobs arriving at that site start executing as soon as cluster occupancy permits.
- Acknowledgement of usage. VOs must periodically acknowledge usage of resources through scientific publications and press releases. The entities to be acknowledged are EGI, the NGIs and the individual RPs contributing resources. See the WeNMR best practice and the recommendations for a scientific publication repository.
- Efficient usage of resources. Resources allocated must be used by VOs. If usage is insufficient or not acknowledged properly, the agreement can be terminated.
- A single entity (e.g. EGI) must be responsible for ensuring the enforcement of OLAs and SLAs.
- Allocation processes involving multiple RPs, and the liaison with other RPs, should be transparent to each individual RP. EGI is responsible for ensuring that the federated RPs collectively deliver the service requested. An RP is neither responsible for nor involved with the services provided by other RPs. Providing resources to a federated pool should carry minimum overhead for the RP.
- Lightweight OLA negotiation: RPs (typically sites) should be relieved from the burden of negotiating OLAs for each VO request. The NGI could function as a proxy and sign the OLA on behalf of the individual sites.
- It is desirable that the same interfaces and tools offered to support international users can also be used to support national user communities. In this case the supervisory role of EGI.eu (Model 2) and all related service level management functions should be delegated to the NGI.
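The three exemptions listed under "Scientific review" above can be read as a simple decision rule. The sketch below is one possible reading, not an agreed procedure: the function name, its parameters, the use of slots as the unit and the default threshold of 100 are all assumptions made for illustration, since the page only says that a threshold "can be defined".

```python
def needs_scientific_review(requested_slots, is_new_vo, supporting_ngis,
                            small_request_threshold=100):
    """Return True when a scientific review would be required.

    small_request_threshold is a placeholder value, not an agreed figure.
    """
    # (1) Small requests do not justify the overhead of a review.
    if requested_slots <= small_request_threshold:
        return False
    # (2) Resources requested to kick off the activities of a new VO.
    if is_new_vo:
        return False
    # (3) The community is already supported by one or more NGIs.
    if len(supporting_ngis) >= 1:
        return False
    return True


small = needs_scientific_review(50, is_new_vo=False, supporting_ngis=[])
big_unsupported = needs_scientific_review(5000, is_new_vo=False, supporting_ngis=[])
big_supported = needs_scientific_review(5000, is_new_vo=False, supporting_ngis=["NGI-X"])
```

Whatever the outcome of this check, the technical review described above would still apply, since it answers a different question (whether an RP can actually deliver).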
Customer
- Provide a single broker contact point to the user to hide the complexity of a network of heterogeneous RPs and to establish a single Service Level Agreement, instead of multiple ones with multiple RPs.
- The federated pool of resources should formally support opportunistic usage of resources (opportunistic is intended here as the minimum level of service that can be offered by an RP). This is needed to provide a formal level of engagement between VOs and RPs (which does not exist currently).
- Support multiple levels of services in one resource demand: users should be provided with the flexibility of requiring a minimum guaranteed amount of resources complemented by an additional amount of resources to be allocated elastically (for example opportunistically).
- Simple and standardized SLA. Additional requirements should be allowed where needed by the VO. Many service levels should be made optional, as we do now in the VO ID card. The broker (EGI.eu for international VOs, NGIs for national VOs) should help the customer express technical requirements, where needed.
- An external entity (e.g. EGI.eu or the NGI) is responsible for reporting on usage of the federated resource pool (rather than the VO itself). Enforcement of SLAs and OLAs does not involve VOs (it is devolved to EGI.eu).
- Usage of a federated distributed pool comes with minimum overhead for the user community.
- Efficient usage and acknowledgement of usage are two requirements for extension of the grant.
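The idea of combining a guaranteed minimum with an elastic, opportunistic part in a single demand can be sketched as below. This is a minimal illustration under assumptions not stated on this page: the `Demand` type, the `slots_available_now` function and the convention that guaranteed slots are reserved in advance (and therefore not counted among the idle slots) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Demand:
    guaranteed_slots: int  # minimum amount the Customer wants guaranteed
    elastic_slots: int     # extra amount wanted on an opportunistic basis


def slots_available_now(demand, idle_slots):
    """Slots a VO could use right now under a two-level demand.

    Assumes the guaranteed part is reserved ahead of time, so idle_slots
    counts only unreserved capacity that is currently unused.
    """
    # The opportunistic part is capped both by what was asked for and by
    # what happens to be idle at this moment.
    opportunistic = min(demand.elastic_slots, max(idle_slots, 0))
    return demand.guaranteed_slots + opportunistic


demand = Demand(guaranteed_slots=50, elastic_slots=200)
busy_cluster = slots_available_now(demand, idle_slots=0)
quiet_cluster = slots_available_now(demand, idle_slots=500)
```

The point of the split is that the Customer always gets the guaranteed part, while the elastic part grows and shrinks with the short-term occupancy of the contributing clusters.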
Note. The same processes that are being defined for resource allocation could be generalized to service requests, i.e. to allow new user communities to request services provided by NGIs, such as application porting.
References
NGI proposals