Difference between revisions of "2019-bidding/platforms"


Revision as of 17:05, 15 November 2019

Compute Platforms

Introduction

EGI Storage and Computing services provide the baseline computing and storage resources to deliver research-oriented platforms. Examples of these platforms are the Scientific Applications and tools listed on the EGI website, the EGI Notebooks, and the EGI Applications on Demand services. This bid welcomes any platform compatible with the EGI Computing and Storage services and capable of authenticating users with Check-in.

Technical description

Platforms must comply with the following requirements:

  • The platform integrates with Check-in for authenticating users, and delegates those identities as needed for interacting with other EGI services on behalf of the users.
  • The platform can exploit computing and storage resources from EGI services. Ideally the platform should dynamically allocate those resources by using the services' APIs.
  • The platform provides probes for monitoring its Availability and Reliability with the EGI Monitoring service.
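
As an illustration of the probe requirement, the following is a minimal sketch of a health-check probe. It assumes the probe follows the Nagios-style plugin convention (exit codes 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN, plus a one-line status message); the function names, the 5-second warning threshold and the idea of checking a single HTTP endpoint are hypothetical choices for this example, not requirements of the bid.

```python
import time
import urllib.error
import urllib.request

# Nagios-style plugin exit codes (assumed convention for this sketch).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(status, elapsed_s, warn_after_s=5.0):
    """Map an HTTP status (None if there was no response) and latency to a result."""
    if status is None:
        return CRITICAL, "CRITICAL - no response from platform endpoint"
    if status >= 500:
        return CRITICAL, f"CRITICAL - HTTP {status}"
    if status >= 400:
        return WARNING, f"WARNING - HTTP {status}"
    if elapsed_s > warn_after_s:
        return WARNING, f"WARNING - slow response ({elapsed_s:.1f}s)"
    return OK, f"OK - HTTP {status} in {elapsed_s:.1f}s"


def probe(url, timeout_s=10.0):
    """Request the given URL once and return (exit_code, message)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status.
        status = err.code
    except OSError:
        # Covers URLError, connection failures and timeouts.
        status = None
    return evaluate(status, time.monotonic() - start)


def main(argv):
    """Entry point: print the status line and return the exit code."""
    if len(argv) != 2:
        print("UNKNOWN - usage: probe.py <platform-url>")
        return UNKNOWN
    code, message = probe(argv[1])
    print(message)
    return code
```

A wrapper script would call `sys.exit(main(sys.argv))` so that the monitoring framework can read the exit code. A real probe would normally exercise platform functionality (for example an authenticated request) rather than a single unauthenticated endpoint.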

The platform should be offered as a centrally managed instance that will be run by the project and will provide access to ‘long-tail of science’ type use cases on ‘generic clouds’ for any user, and to thematic groups on the ‘thematic clouds’. Additionally, the platform provider should be capable of setting up dedicated instances for specific communities upon request.

Optionally, platforms should also be available as ready-to-use artifacts (VM images, containers, Helm charts) in the EGI AppDB, Docker or Helm repositories, along with complete documentation for the deployment and operation of the platform, so that dedicated local instances can be set up on cloud providers as needed.

Below follows a list of platforms that are expected to be covered by applications to this bid (applications may cover one or several of the platforms below):

  • Workload manager (DIRAC): A job manager system compatible with EGI HTC and Cloud resources.
  • EC3: A VM orchestrator system compatible with EGI Cloud resources.
  • Notebooks (JupyterHub): A web-based interface to define and run computational notebooks.
  • Binder: A web-based environment to reproduce and/or recreate computational environments as notebooks.
  • FutureGateway Science Gateway (FGSG): A web-based gateway that provides general purpose scientific applications on demand.
  • DODAS: Can turn IaaS cloud sites into ‘batch job execution’ facilities.
  • PROMINENCE: Can turn IaaS cloud sites into ‘batch job execution’ facilities and can support clouds to burst out jobs to other clouds.

Coordination

This activity is responsible for the coordination of the service maintenance activities with the EGI operations team and other technology providers for the EGI Core Infrastructure.

Operations

  • Daily running of the service instances.
  • Provisioning of a high availability configuration.
  • Creating an Availability and Continuity Plan and implementing countermeasures to mitigate the risks defined in the related risk assessment.

Maintenance

This activity includes:

  • Bug fixing, proactive maintenance, improvement of the system.
  • Maintenance of probes to test the functionality of the service.
  • Documentation.


Software Compliance

  • Unless explicitly agreed, software being used and developed to provide the service should:
    • Be licensed under a permissive open source license (such as MIT, BSD or Apache 2.0).
      • The license should provide unlimited access rights to the EGI Foundation and EGI federation member organisations.
    • Have source code publicly available via a public source code repository (if needed, a mirror can be put in place under the EGI organisation on GitHub). All releases should be appropriately tagged.
    • Adopt best practices:
      • Defining and enforcing code style guidelines.
      • Using Semantic Versioning.
      • Using a configuration management framework such as Ansible.
      • Taking security aspects into consideration at every point in time.
      • Having automated testing in place.
      • Using code reviewing.
      • Treating documentation as code.
        • Documentation should be available for developers, administrators and end users.
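
The combination of tagged releases and Semantic Versioning can be checked automatically. The sketch below validates release tags against the core MAJOR.MINOR.PATCH form and orders them numerically. The optional leading `v` is an assumption about tag naming (it is a common convention, not part of Semantic Versioning itself), the function name is hypothetical, and pre-release and build metadata suffixes are deliberately rejected to keep the example short.

```python
import re

# Core SemVer form: three dot-separated integers without leading zeros.
# The optional "v" prefix is an assumed tag-naming convention.
_TAG_RE = re.compile(r"^v?(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")


def parse_tag(tag):
    """Return (major, minor, patch) for a valid tag, or None otherwise."""
    match = _TAG_RE.match(tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


# Tags compare as integer tuples, so v1.10.0 is newer than v1.9.3.
tags = ["v1.9.3", "v1.10.0", "v1.2.0"]
latest = max(tags, key=parse_tag)
```

A check like this could run in continuous integration to reject malformed release tags before they are published.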

IT Service Management compliance

  • Key staff who deliver services should have foundation or basic level ITSM training and certification.
    • ITSM training and certification could include FitSM, ITIL, ISO 20000 etc.
  • Key staff and service owners should have advanced/professional training and certification covering the key processes for their services.
  • Providers should have clear interfaces with the EGI SMS processes and provide the required information.
  • Providers should commit to improving their management system used to support the services they provide.

Support

  • Support is provided through the EGI Helpdesk.
  • Support hours: eight hours a day (for example 9-17 CE(S)T), Monday to Friday – excluding public holidays of the hosting organisation.

Service level targets

Platforms must achieve at least 95% Availability and Reliability (A/R) on a monthly basis.
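
To make the target concrete, here is a minimal sketch of a monthly A/R calculation. It assumes the commonly used definitions in which availability is uptime divided by the time with known status, and reliability additionally excludes scheduled downtime from the denominator; the exact formulas applied by the EGI Monitoring service should be taken from its own documentation. Function and parameter names are hypothetical.

```python
def monthly_ar(total_h, up_h, scheduled_down_h=0.0, unknown_h=0.0):
    """Return (availability, reliability) as fractions for one month.

    Assumed definitions:
      availability = up / (total - unknown)
      reliability  = up / (total - unknown - scheduled downtime)
    """
    known_h = total_h - unknown_h
    expected_h = known_h - scheduled_down_h
    if known_h <= 0 or expected_h <= 0:
        raise ValueError("no measurable time in the period")
    return up_h / known_h, up_h / expected_h


def meets_target(total_h, up_h, scheduled_down_h=0.0, unknown_h=0.0, target=0.95):
    """True if both availability and reliability reach the target."""
    availability, reliability = monthly_ar(total_h, up_h, scheduled_down_h, unknown_h)
    return availability >= target and reliability >= target


# Example: a 30-day month (720 h) with 690 h up and 10 h of scheduled downtime.
availability, reliability = monthly_ar(720, 690, scheduled_down_h=10)
```

In this example availability is 690/720 (about 95.8%) and reliability is 690/710 (about 97.2%), so both exceed the 95% target. With no scheduled downtime, a 720-hour month leaves at most 36 hours of outage before the target is missed.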

Support should be provided according to the medium Quality of Support level described in the GGUS QoS levels.

Expressions of interest

?

Selection criteria

?