
Resource Allocation Task Force brainstorming page

Revision as of 09:54, 25 January 2013 by Krakow




Introduction

This page is a brainstorming space for the Resource Allocation Task Force. It contains ideas and issues for discussion.

Goal

The main goals are:

  • Attract new users and allocate them some grid resources
  • Provide users with a simplified procedure for finding grid resources that meet their needs, through a single central point of reference
  • Foster a virtuous cycle among ‘new’ scientific communities, EGI, NGIs, Resource Centers and funding agencies in order to attract new funding to strengthen and expand the European grid infrastructure according to the needs of those scientific communities
  • Demonstrate to the national funding agencies, the EC (and everybody else) that the EGI infrastructure and services are valuable, useful and used by a wide variety of users


The high level concept

RA overview.png

EGI is "a place" where Customer and Provider meet and negotiate resource allocation.

NGI proposals

Resource allocation components

RA processes.png


Actors

EGI 

  • EGI.eu
  • supervisor of RA process

Provider

  • NGIs that can directly decide on resource allocation on a pool of resources on specific sites
  • NGIs that can coordinate resource allocation with their sites
  • Resource Centres that are identified to allocate resources autonomously (bypassing the NGI). Resource Centres offering resources must be registered in GOCDB as production entities; this ensures that all security policies are enforced.
  • User communities can be RPs for other user communities (see the WeNMR case)

Customer

  • Regional or international VO
  • New or existing VO
  • The VO should be registered and in production status in the Operations Portal
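
The registration preconditions for Providers and Customers listed above can be written down as simple checks. The following is a minimal sketch only: the class and field names are hypothetical, and the flags would have to be filled in by hand or from GOCDB and the Operations Portal, neither of which is queried here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceCentre:
    """Hypothetical record for a Resource Centre offering resources."""
    name: str
    in_gocdb: bool      # assumption: True if the RC is registered in GOCDB
    production: bool    # assumption: True if registered as a production entity


@dataclass(frozen=True)
class VirtualOrganisation:
    """Hypothetical record for a VO acting as Customer."""
    name: str
    in_operations_portal: bool  # assumption: True if registered in the Operations Portal
    production: bool            # assumption: True if the VO is in production status


def rc_can_offer(rc: ResourceCentre) -> bool:
    """An RC may offer resources only if it is a production entity in GOCDB."""
    return rc.in_gocdb and rc.production


def vo_can_request(vo: VirtualOrganisation) -> bool:
    """A VO may request resources only if it is registered and in production."""
    return vo.in_operations_portal and vo.production
```

Keeping the two checks separate mirrors the split between Provider and Customer on this page; a real tool would probably report why a check failed rather than return a bare boolean.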




Regional VO:

  • It should be up to the NGIs to help the VO and allocate resources within the NGI
  • EGI provides single point of contact for resource allocation for Customers

International VO:

  • EGI provides
    • Single point of contact for resource allocation for their Customers
    • Negotiation and signing procedures
    • Monitoring
    • Reporting

Terminology

Service Level Agreement (SLA) - An agreement between a Service Provider and a customer/client. The SLA describes the IT Service, documents service level targets and specifies the responsibilities of the Service Provider and the customer/client. A single SLA may cover multiple IT Services or multiple customers/clients.

Operational Level Agreement (OLA) - An agreement between an IT Service Provider and another part of the same organisation. An OLA supports the IT Service Provider’s delivery of IT Services to Users. The OLA defines the goods or Services to be provided and the responsibilities of both parties.


Please refer to the EGI Glossary for the definitions of the terms used.

Models

Different models can be applied for different requests:

  • Mature VO vs New VO
  • International or regional VO
  • "Big" request vs "Small" request

or NGIs.

Model 1- Open market

RA Model1.png

Providers’ resources and Customers’ requests are published on an open market (similar to a shopping website such as eBay) and are visible to everyone.

EGI regulates and supervises the market. It determines the rules (e.g. procedures) of the market and acts as a referee in case of agreement violation/negotiations.

EGI

  • Pros: EGI knows what is going on in the infrastructure

Provider

  • Pros: There are clear rules for resource allocation
  • Pros: In case of agreement violation/negotiations EGI is a referee
  • Pros: Knows what the Customers' current needs are
  • Cons: Negotiation process cannot be adapted to specific needs

Customer

  • Pros: There are clear rules for resource allocation
  • Pros: In case of agreement violation/negotiations EGI is a referee
  • Pros: Knows which resources are currently available
  • Cons: Lack of an external body to help match needs to resources

Use cases

  • Standard resource requests

Scientific request review

  • EGI decides whether a review is needed. If so, it should be done before the request is published to the open market.

Activities

  • EGI
    • regulates and supervises the process
    • determines the rules (e.g. procedures)
    • acts as a referee in case of agreement violation/negotiations
  • NGI
    • can be a broker for its RCs
  • RC
    • dd


Model 2 - Broker

RA Model2.png

Providers’ resources and Customers’ requests are visible to EGI.

EGI matches Providers’ resources to Customers’ requests.

Interaction between Customer and Provider is limited.

Provider selection can be done in two ways:

  • EGI has up-to-date information about free resources
  • EGI asks Providers to express their interest in supporting the VO each time a request arrives
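
The first selection mode (EGI already knows the free resources) could be sketched as below. This is an illustration under stated assumptions, not a proposed algorithm: the `Offer` and `Request` types, the use of job slots as the only resource dimension, and the greedy largest-offer-first strategy are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Offer:
    """Hypothetical record: free slots a Provider currently advertises to EGI."""
    provider: str
    free_slots: int


@dataclass(frozen=True)
class Request:
    """Hypothetical record: slots a Customer VO asks for."""
    vo: str
    slots_needed: int


def match_request(request: Request, offers: List[Offer]) -> Optional[Dict[str, int]]:
    """Greedily cover the request, taking the largest offers first.

    Returns a mapping provider -> slots taken, or None when the combined
    offers cannot cover the request (the broker cannot match the need).
    """
    allocation: Dict[str, int] = {}
    remaining = request.slots_needed
    for offer in sorted(offers, key=lambda o: o.free_slots, reverse=True):
        if remaining <= 0:
            break
        take = min(offer.free_slots, remaining)
        if take > 0:
            allocation[offer.provider] = take
            remaining -= take
    return allocation if remaining <= 0 else None
```

Taking the largest offers first keeps the number of Providers involved in one agreement small. In the second selection mode, the `offers` list would instead be built from the expressions of interest received for that specific request.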

EGI

  • Cons: Time consuming for EGI

Provider


Customer

  • Pros: gets single point of contact and support from one entity


Cons

  • EGI has to collect and maintain knowledge of what is available in the infrastructure (at each site)
  • Some Providers want to talk with Customers directly
  • EGI may not be able to match needs to resources.

Use cases

  • big international Customers
  • specialised Customer requests
  • new international Customers

Scientific request review

  • EGI decides whether a review is needed. If so, it should be done before the request is published to the open market.

Activities

  • EGI
    • technical and scientific review
    • collect resource demand from customers and related requirements
    • call for resource offer to create a resource pool (if needed)
    • establish a resource pool in collaboration with the RPs (if needed)
    • monitoring
    • accounting: collect and provide information about resource usage
    • periodic assessment of usage
    • reporting
    • periodic assessment of acknowledgement of resource usage by VOs
    • help new VOs
  • NGI
    • communication channel between EGI and RC
  • RC
    • provides resources

Model 3 - Freedom of choice

RA Model3.png

Providers’ resources and Customers’ requests are transparent, i.e. visible to everyone.

The parties have freedom to decide how they want to negotiate resources and under which conditions (no regulation on EGI side).

EGI's role is to support Providers and/or Customers on demand, e.g. by providing a tool for resource allocation, helping with request fulfilment, handling specialised requests, etc.


EGI

  • Cons: we can end up with many different SLAs and negotiation procedures
  • Cons: no influence on the RA process
  • Pros: does not require a significant commitment of EGI in the process

Provider

  • Pros: freedom to apply own RA procedures
  • Cons: has to define own RA processes

Customer

  • Cons: may face many different SLAs and negotiation procedures
  • Cons: has to negotiate with each of the Providers separately
  • Cons: lack of an external body to help match needs to resources

Use cases

  • Regional Customer

Scientific request review

  • It is up to the Provider to decide whether a review is needed

Activities

  • EGI
    • support Providers and/or Customers on demand
    • provide a tool for resource allocation
  • NGI
    • responsible for the whole RA process related to its resources
  • RC
    • provide resources

Issues for discussion

SLA

  • who should be the parties to the contract
  • what should be included in the SLA

Requirement:

  • Simple and standardized SLA. Additional requirements should be allowed where needed by the VO. Many service levels should be made optional, as we do now in the VO ID card.

The roles of NGI, EGI, RC in the process

Scientific request review

Technical request review

  • needed to be sure that what we promise can be delivered to the Customer

The acknowledgement for resources use

  • VT_Scientific_Publications_Repository
  • The recommendation about citing EGI in publications is defined here: https://documents.egi.eu/document/1369. The implementation requires a change in the Grid AUP. The issue has been raised with the Security Coordination Group, and David Kelsey is checking with the big customer (WLCG) whether and how they will accept it before the change is made (work in progress).

Monitoring of usage

Provider selection

Reporting and evaluation

Customer support

Reservations/resource pool assurance/guaranteed usage

Meeting specific needs of the VO in terms of OS, configuration, software etc.

Requirements

EGI

  1. EGI is responsible for negotiating with the RPs the requirements collected from VOs. For example, it is the body that investigates which RPs could be willing to implement the needed server or software setup, the required storage system, etc. It is also a single point of contact for new users (babysitting).
  2. EGI should be responsible for all activities around service level management for the federated pool, to offload VOs and RPs.
  3. All EGI security policies are enforced by the RPs contributing resources to the federated pool (e.g. individual RCs must be registered as production sites in GOCDB).
  4. Processes for collection of demand, offer and negotiation of SLAs and OLAs should be automated as much as possible to ensure the service scales.

Resource Provider

  1. Scientific review: a scientific review may not always be needed, for example when (1) the resource requirements are small, or (2) the user community/project is officially supported by one or more NGIs.
  2. Technical review: A technical review of demand is needed to complement the initial scientific peer review (where applicable). RPs must be involved in this, to technically inspect VO requirements, to understand the impact on the resource configurations (cluster, storage set-up, network etc.), to understand whether the demand complies with the local policies, and to acknowledge that resource allocation is possible for them.
  3. Match demand and offer. The resource allocation process should allow not only a resource request to be answered with an offer but also, conversely, a site with free resources to advertise them.
  4. Quality of Service. Different levels of Quality of Service could be offered by the federated pool. For example: Level I. Best effort allocation without a minimum number of slots allocated (opportunistic usage). Level II. A minimum number of slots allocated at high priority; jobs arriving at that site start executing as soon as cluster occupancy permits.
  5. Acknowledgement of usage. VOs must periodically acknowledge usage of resources through scientific publications and press releases. The entities to be acknowledged are EGI, the NGIs and the individual RPs contributing resources. See the WeNMR best practice and the recommendations for a scientific publication repository.
  6. Efficient usage of resources. Resources allocated must be used by VOs. If usage is insufficient or not acknowledged properly, the agreement can be terminated.
  7. A single entity (e.g. EGI) must be responsible for ensuring the enforcement of OLAs and SLAs.
  8. The allocation processes involving multiple RPs and the liaison with other RPs should be transparent to each individual RP. EGI is responsible for ensuring that the federated RPs collectively deliver the service requested. An RP is neither responsible for nor involved with the services provided by other RPs. Providing resources to a federated pool should add zero overhead for the RP.
  9. It is desirable that the same interfaces and tools offered to support international users can also be used to support national user communities. In this case the supervisory role of EGI.eu (Model 2) and all related service level management functions should be delegated to the NGI.
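
The two example Quality of Service levels in item 4 can be modelled roughly as follows. This is a sketch under assumptions: the type names, the `min_slots` field and the two priority labels are hypothetical, and the rule "high priority only while the VO runs fewer slots than its minimum" is one possible reading of Level II, not an agreed definition.

```python
from dataclasses import dataclass
from enum import Enum


class QoSLevel(Enum):
    BEST_EFFORT = 1          # Level I: opportunistic usage, no minimum
    GUARANTEED_MINIMUM = 2   # Level II: minimum number of slots at high priority


@dataclass(frozen=True)
class Allocation:
    """Hypothetical record of what one VO was granted on one site."""
    vo: str
    level: QoSLevel
    min_slots: int = 0

    def __post_init__(self) -> None:
        # Level I carries no minimum; Level II must state a positive one.
        if self.level is QoSLevel.BEST_EFFORT and self.min_slots != 0:
            raise ValueError("Level I (best effort) has no minimum number of slots")
        if self.level is QoSLevel.GUARANTEED_MINIMUM and self.min_slots <= 0:
            raise ValueError("Level II requires a positive minimum number of slots")


def job_priority(allocation: Allocation, running_slots: int) -> str:
    """Priority label for the VO's next job, given the slots it already occupies."""
    below_minimum = running_slots < allocation.min_slots
    if allocation.level is QoSLevel.GUARANTEED_MINIMUM and below_minimum:
        return "high"
    return "opportunistic"
```

With this reading, a Level II VO competes opportunistically once its minimum is reached, which keeps the guarantee bounded for the site.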

Customer

  1. Provide a single broker contact point to the user to hide the complexity of a network of heterogeneous RPs and to establish a single Service Level Agreement, instead of multiple ones with multiple RPs.
  2. Simple and standardized SLA. Additional requirements should be allowed where needed by the VO. Many service levels should be made optional, as we do now in the VO ID card.
  3. The broker (EGI.eu) should help the customer express technical requirements, where needed.
  4. An external entity (e.g. EGI.eu) is responsible for reporting on usage of the federated resource pool (rather than the VO itself). Enforcement of SLAs and OLAs does not involve VOs (it is devolved to EGI.eu).
  5. Usage of a federated distributed pool comes with zero overhead for the user community.
  6. Efficient usage is a requirement for extension of the grant.
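
The usage-related rules (efficient usage as a condition for extension, and possible termination when usage is insufficient or not acknowledged) could be checked periodically along these lines. This is only a sketch: the function name, the CPU-hour unit and especially the 50% threshold are assumptions, since this page does not define what counts as sufficient usage.

```python
def review_agreement(
    allocated_hours: float,
    used_hours: float,
    acknowledged: bool,
    min_usage_ratio: float = 0.5,  # assumption: threshold is not defined on this page
) -> str:
    """Periodic assessment of one agreement based on accounting data.

    Returns "eligible for extension" when usage is sufficient and properly
    acknowledged, otherwise "may be terminated".
    """
    if allocated_hours <= 0:
        raise ValueError("allocated_hours must be positive")
    usage_ratio = used_hours / allocated_hours
    if usage_ratio >= min_usage_ratio and acknowledged:
        return "eligible for extension"
    return "may be terminated"
```

For example, `review_agreement(1000, 800, acknowledged=True)` yields "eligible for extension", while the same usage without acknowledgement yields "may be terminated". The outcome is advisory; the actual decision stays with the parties to the agreement.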

Note. The same processes that are being defined for resource allocation could be generalized to service requests, i.e. to allow new user communities to request services provided by NGIs, such as application porting.