The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations @ egi.eu.

Long-tail of science




This page provides information about the 'EGI platform for the Long-tail of science', as well as about the project that develops the platform. If you want to join the platform as a science gateway provider or cloud resource provider, jump to Information for providers.


Project coordinators: Peter Solagna, Gergely Sipos/EGI.eu

Developer meetings: http://indico.egi.eu/indico/categoryDisplay.py?categId=36

Developers' email list: long-tail-pilot@mailman.egi.eu


Motivation

Processes for allocating resources to user communities have been well established in EGI for several years. However, individual researchers and small research teams sometimes struggle to access grid and cloud compute and storage resources from the network of NGIs to implement ‘big data applications’. Recognising the need for simpler and more harmonised access for individual researchers and small research groups, also known as members of the ‘long-tail of science’, the EGI community started to design and prototype a new platform in October 2014.

This project designs and prototypes a new e-infrastructure platform in EGI to simplify access to grid and cloud computing services for the long-tail of science, i.e. those researchers and small research teams who work with large data but have limited or no expertise in distributed systems. The project will establish a set of integrated services suited to the most frequent grid and cloud computing use cases of individual researchers and small research teams.

Members of the project evaluate technologies that can lower the barrier of access to grid and cloud resources, select the best candidates, and assemble these into a coherent platform. The platform will serve users via a centrally operated 'user registration portal' and a set of science gateways that will be connected to resources in a new Virtual Organisation. Through the registration portal users will be able to request resources from the platform, while through the science gateways they will consume the resources by implementing big data and compute-intensive use cases. The platform will be relevant for any scientist who works with large data. Moreover, it will include communication channels and mechanisms to connect these people to their respective National Grid Initiatives, where they can receive more customised services. The platform will be open to grid and cloud resource providers, science gateway providers, and user support teams. The project welcomes such contributors.

The expected benefits of the project:

  • A single grid and cloud computing resource allocation at the European scale for individual researchers and small research teams.
  • Harmonised and simplified access to grid and cloud resources of EGI.
  • Custom Virtual Research Environments available for individual researchers and small research teams at the European scale.
  • New and improved connection within the NGIs to individual researchers and small research teams.
  • An innovative new security subsystem (per-user sub-proxies) that - in the future - may be used in other EGI community platforms.

Presentations

Slideset about the concept of the EGI long-tail of science platform: [1]

Slideset about the authentication & authorization model adopted (incl. per-user subproxies): [2]

Project information

Mandate

The project started in the 5th year of the EGI-InSPIRE FP7 project and is currently supported by the H2020 EGI-Engage project. The target of this activity is to design and prototype the services of the EGI long-tail platform. The services will be deployed in a new Virtual Organisation of EGI.

Objectives

The capabilities that the long tail of science platform has to support are:

  • Zero-barrier access: any user who carries out relevant research can get a start-up resource allocation
  • 100% coverage: anyone with internet access can become a user
  • User-centric: User support for platform users is available through the NGIs
  • Realistic: Reuse existing technology building blocks as much as possible, require minimal new development
  • Secure: Provide an acceptable level of tracking of users and user activities (not necessarily face-to-face vetting)
  • Scalable: Can scale up to support a large number of resource providers, technology providers, use cases and users
  • Valuable: Deliver tangible outcomes


Information for providers

How to connect a science gateway to the platform

Connecting the SG with the User Registration Portal


Client service Registration

1. Open a GGUS ticket to operations that includes the return URIs

2. The Unity team sends the Client its clientID and secretKey


Authorization procedure between Unity and the Client:

1] The Client sends a request to the OpenID Provider


parameters:
response_type:code
redirect_uri: Redirect url
client_id:unity-oauth-egrant
scope:profile openid 

example:
    response_type=code
    &client_id=123123123
    &redirect_uri=https%3A%2F%2Fclient.pl%2Fauth
    &scope=openid%20profile
    &state=a123a123a123
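
The request in step 1 can be assembled programmatically. The following is a minimal sketch, not the official client: the authorization endpoint URL is not given on this page, so `AUTHORIZE_ENDPOINT` is a hypothetical placeholder, and the function name is chosen for illustration only. The parameter names and example values are the ones listed above.

```python
from urllib.parse import urlencode, quote, urlparse, parse_qs

# Hypothetical placeholder: the real authorization endpoint is not stated on this page.
AUTHORIZE_ENDPOINT = "https://unity.example.org/oauth2/authorize"


def build_authorization_url(client_id, redirect_uri, state, scope="openid profile"):
    """Return the URL the End-User's browser is sent to in step 1."""
    params = {
        "response_type": "code",       # authorization code flow
        "client_id": client_id,        # issued by the Unity team at registration
        "redirect_uri": redirect_uri,  # must match a registered return URI
        "scope": scope,
        "state": state,                # opaque value echoed back in step 4
    }
    # quote_via=quote encodes spaces as %20, matching the example above.
    return AUTHORIZE_ENDPOINT + "?" + urlencode(params, quote_via=quote)


url = build_authorization_url("123123123", "https://client.pl/auth", "a123a123a123")
print(url)
```

In a real gateway the `state` value would normally be random and tied to the user's session, so that the response in step 4 can be matched to the request that triggered it.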


2] Authorization Server authenticates the End-User.
3] Authorization Server obtains End-User Consent/Authorization.
4] Authorization Server sends the End-User back to the redirect uri from the first request (Redirect url) with code.

example of the response

    code=uniquecode123
    &state=a123a123a123



5] Client sends the code to the Token Endpoint to receive an Access Token and ID Token in the response.

POST /token HTTP/1.1
  Host: client.pl
  Authorization: Basic czZCaGRSa3F0MzpnWDFmQmF0M2JW
  Content-Type: application/x-www-form-urlencoded

  grant_type=authorization_code&code=uniquecode123
    &redirect_uri=https%3A%2F%2Fclient.pl%2Fauth
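
The headers and body of this request can be built as in the sketch below. It assumes HTTP Basic client authentication, as in the example above, whose `Authorization` value is the base64 encoding of the sample pair `s6BhdRkqt3:gX1fBat3bV`. The function name is hypothetical, and the token endpoint URL is deliberately left out because this page does not state it.

```python
import base64
from urllib.parse import urlencode


def build_token_request(client_id, client_secret, code, redirect_uri):
    """Return (headers, body) for the POST to the Token Endpoint in step 5."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        # HTTP Basic authentication: base64("clientId:clientSecret")
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,                  # the code received in step 4
        "redirect_uri": redirect_uri,  # same value as in step 1
    })
    return headers, body


headers, body = build_token_request(
    "s6BhdRkqt3", "gX1fBat3bV", "uniquecode123", "https://client.pl/auth"
)
print(headers["Authorization"])
print(body)
```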




6] Client validates the tokens and retrieves the End-User's Subject Identifier.

example of the Token Endpoint response:

  HTTP/1.1 200 OK
  Content-Type: application/json
  Cache-Control: no-store
  Pragma: no-cache

  {
   "access_token":"accessToken123",
   "token_type":"Bearer",
   "expires_in":3600,
   "refresh_token":"refreshToken123",
   "id_token":"idToken123123"
  }

You should decode the id_token and validate its claims, for example the issuer, the audience and the expiry time (more information: http://openid.net/specs/openid-connect-basic-1_0.html)
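
A minimal sketch of decoding the id_token and checking a few of its claims is shown below. It is deliberately incomplete: it does not verify the token signature, which a production client must also do according to the specification linked above. The function names are hypothetical, and the issuer value used in the usage example is an assumption, not a value confirmed by this page.

```python
import base64
import json
import time


def decode_jwt_payload(id_token):
    """Decode the claims (middle segment) of a JWT WITHOUT verifying its signature."""
    payload_b64 = id_token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def check_claims(claims, expected_issuer, client_id, now=None):
    """Check issuer, audience and expiry; return the Subject Identifier."""
    now = time.time() if now is None else now
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if claims.get("iss") != expected_issuer:
        raise ValueError("unexpected issuer")
    if client_id not in audiences:
        raise ValueError("token was not issued for this client")
    if claims.get("exp", 0) <= now:
        raise ValueError("token has expired")
    return claims["sub"]
```

A call might look like `check_claims(decode_jwt_payload(id_token), "https://unity.egi.eu", client_id)`, where the issuer string is assumed to be the Unity base URL.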


7] Client gets information about the user from the userinfo endpoint (https://unity.egi.eu/oauth2/userinfo)

example


8] Client receives information about the user, such as email or name, in JSON format
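
Steps 7 and 8 can be sketched as follows, assuming the access token is presented as a Bearer token in the `Authorization` header (the token response above reports `"token_type":"Bearer"`). The function names are hypothetical, and the exact attribute names in the returned JSON are not specified on this page.

```python
import json
import urllib.request

# Endpoint as given in step 7.
USERINFO_ENDPOINT = "https://unity.egi.eu/oauth2/userinfo"


def build_userinfo_request(access_token):
    """Prepare the GET request carrying the access token as a Bearer token."""
    return urllib.request.Request(
        USERINFO_ENDPOINT,
        headers={"Authorization": "Bearer " + access_token},
    )


def fetch_userinfo(access_token):
    """Send the request and return the user attributes parsed from JSON."""
    with urllib.request.urlopen(build_userinfo_request(access_token)) as response:
        return json.load(response)
```

Building the request separately from sending it keeps the header logic easy to check without contacting the server.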



important data:
unity.server.clientId=  [YOUR CLIENT ID]
unity.server.clientSecret= [YOUR SECRET KEY]
unity.server.base=https://unity.egi.eu

full configuration:

Connecting the SG with the per-user subproxy

Diego to add



How to join as a resource provider

Any EGI resource provider can join the platform to offer capacity for members of the long-tail of science. The site needs to run one of the supported grid or cloud middleware stacks, and enable per-user sub-proxies (for user authentication and authorisation). The next subsections provide instructions on how to enable per-user sub-proxies on EGI sites. Please email support@egi.eu if you wish to join as a resource provider.

Instructions for OpenStack providers

Keystone-VOMS supports PUSP in a special branch called subproxy_support, available in the GitHub repository https://github.com/enolfc/keystone-voms (the code is in the process of being integrated into the main branch of Keystone-VOMS). You can install the code from the repository following these instructions:

 git clone -b subproxy_support https://github.com/enolfc/keystone-voms.git
 cd keystone-voms
 pip install .

Configuration and deployment of the plugin do not change from the normal Keystone-VOMS plugin; follow the Keystone-VOMS documentation to deploy it.

There are new parameters to configure in your keystone config file, under the [voms] section:

  • allow_subproxy: should be set to True to enable PUSP support.
  • subproxy_robots: should be set to * (recommended) or to a list of the DNs that are allowed to create PUSPs in the system.
  • subproxy_user_prefix: determines the expected prefix for the PUSP user specification. It is safe to leave it undefined so that the default value (CN=eToken) is used.
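
As an illustration of the parameters above, the sketch below embeds a sample [voms] section and reads it back with Python's standard configparser. This is only a sanity-check helper under stated assumptions: Keystone itself does not load its configuration this way, the function name is hypothetical, and treating a missing allow_subproxy as disabled is an assumption rather than documented plugin behaviour.

```python
import configparser

# Sample fragment of a keystone config file using the options described above.
SAMPLE_CONFIG = """
[voms]
allow_subproxy = True
subproxy_robots = *
"""


def read_pusp_settings(text):
    """Return the PUSP-related options found in the [voms] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    voms = parser["voms"]
    return {
        # Assumption: a missing value is treated as "PUSP disabled".
        "allow_subproxy": voms.getboolean("allow_subproxy", fallback=False),
        "subproxy_robots": voms.get("subproxy_robots", fallback=""),
        # None means the option is undefined and the plugin default applies.
        "subproxy_user_prefix": voms.get("subproxy_user_prefix", fallback=None),
    }


settings = read_pusp_settings(SAMPLE_CONFIG)
print(settings)
```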

Instructions for gLite providers

There is an EGI manual that shows how to set up a per-user sub-proxy (PUSP) to allow identification of the individual users under a common robot certificate. You can find the guide here: https://wiki.egi.eu/wiki/MAN12

Instructions for OpenNebula providers

Development is ongoing. Release is not expected before the EGI Community Forum

How to join the user support team

If you wish to support users in your country, region or science disciplinary area with the EGI platform, then please email support@egi.eu. We can train you and register you as a supporter.


Architecture details and technical discussions

Virtual Organisation

Name: vo.access.egi.eu

Scope: Global

Homepage URL: https://wiki.egi.eu/wiki/Long-tail_of_science (This wiki will evolve further to become a page about the platform.)

GGUS dedicated support: No (support will be via email)

Acceptable use policy for users: https://documents.egi.eu/document/2635

Discipline: Support Activities (to be changed to Multi-disciplinary support as soon as this is possible)

VOMS: We will use VOMS+PERUN.


Resources - Requirements:

Contacts: <long-tail-support@mailman.egi.eu> for all. This is a new email list that will have technology and user supporters

Resources:

  • INFN Catania, gLite grid site, WLCG site with opportunistic access: <capacity>
  • INFN Catania, OpenStack site: <capacity>

User Management Portal (UMP)

Introductory text

EGI enables researchers to get access to distributed resources, and has recognised the need for simpler and more harmonised access to the distributed EGI Infrastructure. This portal allows individual researchers and small research teams to be productive using EGI without barriers and without unnecessary overhead.

What can you get through this portal?

The type of services and resources are driven by the capabilities of the science gateways integrated with the platform. Ideally EGI can offer:

  • HTC resources
  • Cloud resources
  • Storage resources

And through the science gateways:

  • Run a variety of software and applications already available and used in EGI, from statistics tools to bioinformatics.

As the platform grows, more services and capabilities will be added. Feel free to provide feedback!

Who can get access to this portal?

Users need to:

  • Be able to demonstrate affiliation with a research institution within Europe, or to have contacts with a research institution in Europe (e.g. a referee in an institution)
  • Be able to describe the purpose of their research
    • Possibly with medium term goals
  • Be willing to acknowledge the EGI/NGI support in their publications

How does the registration work?

  1. Register with the portal using your EGI SSO credentials. eduGAIN support is planned and will be available soon. You will be redirected to EGI SSO or, in the future, to another supported IdP. Creating an EGI SSO account takes a few minutes and is completely automatic.
  2. Provide information about your affiliation: your institution or the research team you are a member of.
  3. Request access to the resources. To submit a request you will have to describe the research subject and the goals of the activities in EGI.
  4. Once approved, log in with your credentials to the science gateways supported by the platform and start using EGI!

Steps 2 and 3 can be performed by the user immediately, but they require approval by the EGI team.

For the resource providers

Are you a site manager or an NGI manager, and do you want to support the long tail of science platform with your resources? Contact operations@egi.eu!

Analysis of the functionalities and architecture

  • Registration of the user, including the form in which to provide information about the user's institution, field of research and the purposes of their activity on EGI resources.
    • The request must be approved by authorized users.
  • User registry. The UMP will be a registry of the users who are accessing, or accessed, EGI through the long tail of science platform.
  • User authentication
    • The UMP must support a catch-all IdP for the homeless users (use of EGI SSO?)
    • Consider the possibility of integrating external IdPs in the UMP.
    • The other services of the long tail of science platform should get the user information from the UMP. This will ensure that users are associated with uniform identifiers assigned only by the UMP, to facilitate accounting and authorization.

As shown in the following figure, the UMP must act as a service proxy between the science gateways and the identity providers, whether EGI SSO or other IdPs (e.g. eduGAIN federations). In this way the UMP can control access to the infrastructure for the long tail of science users. The UMP acts as the unique IdP for the science gateways.

This architecture also allows the UMP to be the service provider that needs to be authorized by the IdPs.

User Management Portal Architecture

Once the users' request is authorized on the User Management Portal, they are redirected to one or several science gateways where they can run their computational tasks or manage their data on the grid. A possible workflow to access resources could be the following:

  1. The user accesses the Science Gateway (SG).
  2. The SG redirects the request to the UMP.
  3. The UMP redirects the request to the IdP that holds the credentials of the user (e.g. EGI SSO).
  4. The user authenticates with their IdP.
  5. The IdP provides the UMP with an assertion containing some attributes about the user (e.g. the user's email address).
  6. The UMP answers the SG, adding more attributes, including the Unique Identifier (UID) that identifies the user in the UMP registry and that is unique for every user of the LToS platform.
  7. The SG uses the UID to request a credential that can be unambiguously associated with the individual user.
  8. The credential is used to access EGI resources.


User Management Portal workflow
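
Steps 5 and 6 of the workflow, in which the UMP enriches the IdP assertion with its own unique identifier, can be illustrated with a toy in-memory model. Everything in this sketch is hypothetical (class, function and attribute names, the use of random UUIDs); it only shows the idea that the same IdP identity always maps to the same UMP-assigned UID.

```python
import uuid


class UserRegistry:
    """Toy in-memory stand-in for the UMP user registry (hypothetical)."""

    def __init__(self):
        self._uids = {}

    def uid_for(self, idp, subject):
        """Return the stable UID for an (IdP, subject) pair, creating it once."""
        key = (idp, subject)
        if key not in self._uids:
            self._uids[key] = uuid.uuid4().hex
        return self._uids[key]


def ump_response(registry, idp, assertion):
    """Steps 5-6: add the UMP-assigned UID to the attributes from the IdP."""
    attributes = dict(assertion)  # keep the IdP attributes, e.g. the email
    attributes["uid"] = registry.uid_for(idp, assertion["subject"])
    return attributes


registry = UserRegistry()
first = ump_response(registry, "egi-sso", {"subject": "alice", "email": "alice@example.org"})
second = ump_response(registry, "egi-sso", {"subject": "alice", "email": "alice@example.org"})
other = ump_response(registry, "egi-sso", {"subject": "bob", "email": "bob@example.org"})
```

Because the science gateways only ever see the UID issued by the registry, accounting and authorization can rely on one identifier per user regardless of which IdP was used.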

Credential services

The user credential service will be based on the per-user sub-proxies (PUSP).

The purpose of a per-user sub-proxy (PUSP) is to allow identification of the individual users that operate using a common robot certificate. A common example is where a web portal (e.g., a scientific gateway) somehow identifies its user and wishes to authenticate as that user when interacting with EGI resources. This is achieved by creating a proxy credential from the robot credential with the proxy certificate containing user-identifying information in its additional proxy CN field. The user-identifying information may be pseudo-anonymised where only the portal knows the actual mapping.
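
The naming idea behind a PUSP can be sketched as below: the portal derives a pseudonymous label for each user and appends it as an additional CN to the robot certificate's subject. This sketch only builds the subject string and does not create or sign a proxy certificate. The exact layout of the additional CN, the `prefix:label` separator, the robot DN and the use of an HMAC for pseudonymisation are all assumptions made for illustration.

```python
import hashlib
import hmac


def pseudonymise(user_id, portal_secret):
    """Derive a stable label that only the portal can map back to the user."""
    digest = hmac.new(portal_secret, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def pusp_subject(robot_dn, user_label, prefix="CN=eToken"):
    """Append the user-identifying CN to the robot DN (assumed layout)."""
    return f"{robot_dn}/{prefix}:{user_label}"


# Hypothetical robot DN and portal secret, for illustration only.
ROBOT_DN = "/C=XX/O=Example/CN=Robot: Example Science Gateway"
label = pseudonymise("alice@example.org", b"portal-secret")
subject = pusp_subject(ROBOT_DN, label)
print(subject)
```

Using a keyed hash means the same user always receives the same label, while a resource provider that sees the subject cannot recover the user's identity without the portal's help.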

This solution will allow LToS users to access EGI resources through their LToS portal credentials (e.g. EGI SSO, Identity Federations, etc.) without owning a personal grid certificate. This will simplify the access to the infrastructure for the final users.

Policy changes

The long-tail platform requires two policies:

  1. An 'Acceptable Use Policy' (AUP) for the platform.
  2. A new security policy that describes the conditions of generating and using user-specific proxies from robot certificates

AUP

Acceptable Use Policy and Conditions of Use of the EGI Platform for the Long-tail of Science: https://documents.egi.eu/document/2635

Security Policy for the Long-tail platform

SPG:Drafts:LToS Service Scoped Security Policy - DRAFT

Accounting of usage

Detailed accounting data about the VO users can be obtained by the VO managers at https://accounting.egi.eu/user/voadm.php