The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations@egi.eu.

Difference between revisions of "Long-tail of science"


Revision as of 11:36, 1 April 2016




This page provides information about the 'EGI platform for the Long-tail of science'. The long tail of science refers to the individual researchers and small laboratories who, unlike large, expensive collaborations, do not have access to computational resources and online services to manage and analyse large amounts of data. This EGI platform allows individual researchers and small research teams to perform compute- and data-intensive simulations on large, distributed networks of computers in a user-friendly way. If you are interested in the project that developed and now maintains the platform, please jump to the Long-tail of science project page.


Information for users

What can you access in the platform?

The platform is accessible through this portal and offers grid, cloud and application services from across the EGI community for individual researchers and small research teams. The platform offers the following types of resources:

  • High-throughput computing sites for running compute/data-intensive jobs
  • Cloud sites suited for both compute/data intensive jobs and hosting of scientific services
  • Storage resources for storing job input and output data, and for setting up data catalogues
  • Science gateways that provide graphical web environments for building and executing applications in the platform.
  • Applications that are made available ‘as services’ through the science gateways.

Currently available resources in the platform:

Type | Name | Description
Cloud and storage site | INFN Catania Openstack site | INFN-CATANIA-STACK site capacity:
  • 20 vCPUs
  • 50 GB RAM
  • 10 floating IPs
  • 10 TB storage
Cloud and storage site | RECAS Bari Openstack site | RECAS-BARI site capacity:
  • 15 vCPUs
  • 30 GB RAM
  • 1 TB storage
High-Throughput Compute and Storage site | INFN Catania gLite site | GILDA-INFN-CATANIA site capacity:
  • 1M HEPSPEC-hours
  • 30 GB of /opt/exp_soft
  • 10 GB RAM
  • 100 GB of opportunistic disk storage
High-Throughput Compute and Storage site | INFN BARI gLite site | INFN-BARI site capacity:
  • 0.5M HEPSPEC-hours
  • 2 GB RAM per core
  • 100 GB of opportunistic disk storage
High-Throughput Compute and Storage site | ACC CYFRONET AGH gLite site | CYFRONET-LCG2 site capacity:
  • 50 CPU cores
  • 50 GB of /opt/exp_soft
  • 3 GB RAM per core
  • 500 GB of opportunistic disk storage
Science gateway | Catania Science Gateway | The Catania Science Gateway is a new-generation science gateway based on standards that changes the way e-Infrastructures are used. The gateway incorporates several scientific applications and offers these ‘as services’ for the user.
Science gateway | WS-PGRADE | The WS-PGRADE Portal (Web Services Parallel Grid Runtime and Developer Environment Portal) is the Liferay-based web portal (WS-PGRADE web application) of gUSE, which also includes a graphical portal service. WS-PGRADE is a Web portal hosted in a standard portal framework, using the client APIs of gUSE services to turn user requests into sequences of gUSE-specific Web service calls.
Application | Hello World | Hello World is a simple grid-based application that demonstrates the use of remote resources by printing the hostname where the job is executed. It is accessible through the Catania Science Gateway.
Application | The Statistical R | The Statistical R is a language and environment for statistical computing and graphics. It is accessible through the Catania Science Gateway.
Application | Chipster | Chipster is user-friendly analysis software for high-throughput data. It contains over 300 analysis tools for next generation sequencing (NGS), microarray, proteomics and sequence data. Users can save and share automatic analysis workflows, and visualize data interactively using a built-in genome browser and many other visualizations.
Application | ClustalW2 | ClustalW2 is a multiple sequence alignment tool for the alignment of DNA or protein sequences.
Application | The Semantic Search Engine (SSE) | SSE is a framework conceived to demonstrate the potential of information coupled with semantic web technologies to address the issues of data discovery and correlation. It is accessible through the Catania Science Gateway.

Would you like to access an application, gateway or resource in the platform that is not yet integrated? Request it by email at long-tail-support@mailman.egi.eu!

Who can access the platform?

The platform is open to any researcher who needs simple, user-friendly access to compute, storage and application services in order to carry out data/compute-intensive science and innovation. You need to be affiliated with, or at least have a partner (for example a referee) at, a European research institution to qualify for access. The platform is designed to meet the needs of individual researchers and small research groups who have limited or no experience with distributed and cloud computing.

How can you access the platform?

  1. Log in to the entry portal with an EGI SSO, Google or Facebook account.
  2. Provide information on your profile page about your affiliation to a research institute or team.
  3. Request resources from the platform: indicate what you would like to achieve with the resources so we can help you find the most suitable ones.
  4. After your request is approved, log in to any of the science gateways and build or execute compute/data-intensive applications.

Presentations about the platform

  • Overview of the EGI Platform for the long tail of science (EGI Community Forum, November 2015): [1]
  • Poster and animated slides from Demo at EGI Community Forum, November 2015 (Winner of best demo prize): [2]
  • Slideset about the concept of the EGI long-tail of science platform (from Nov. 2014): [3]
  • Slideset about the authentication and authorization model adopted (from Nov. 2015): [4]

Guide for providers

  • Science gateway and resource providers must accept and follow the platform security policy: https://wiki.egi.eu/wiki/SPG:Drafts:LToS_Service_Scoped_Security_Policy
  • Science gateway providers must integrate with the User Registration Portal to enable the single sign-on capability for users.
  • Science gateway providers must integrate with the per-user subproxy solution to offer traceable user authentication towards the e-infrastructure VO.
  • Science gateways must implement per-user resource usage quotas (to prevent a single user from consuming all the resources of the platform).
  • Resource providers must support the per-user subproxy solution and join the e-infrastructure VO.

The subsections below provide guidance to complete these steps.

How to connect a science gateway to the platform

Connecting the science gateway with the User Registration Portal


Client service registration

1. Open a GGUS ticket to operations that includes the return URIs.

2. The Unity team sends the client its clientID and secretKey.


Authorization procedure between Unity and the Client:

1] The Client sends a request to the OpenID Provider


parameters:
response_type: code
redirect_uri: the redirect URL
client_id: unity-oauth-egrant
scope: profile openid

example:
    response_type=code
    &client_id=123123123
    &redirect_uri=https%3A%2F%2Fclient.pl%2Fauth
    &scope=openid%20profile
    &state=a123a123a123
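
The request in step 1 can be assembled programmatically. The following is a minimal Python sketch, not the portal's own code: the function name `build_authorization_url` and the endpoint `https://unity.example/authorize` are placeholders chosen for illustration, because this page does not state the authorization endpoint path. The parameter names and values are the ones from the example above.

```python
import secrets
from urllib.parse import quote, urlencode


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, state=None):
    """Return (url, state) for step 1 of the authorization code flow."""
    # A fresh random state lets the client match the later redirect to this request.
    if state is None:
        state = secrets.token_urlsafe(16)
    query = urlencode(
        {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": "openid profile",
            "state": state,
        },
        quote_via=quote,  # encode the space in the scope as %20, as in the example
    )
    return f"{authorize_endpoint}?{query}", state


url, state = build_authorization_url(
    "https://unity.example/authorize",  # placeholder endpoint (assumption)
    client_id="123123123",
    redirect_uri="https://client.pl/auth",
    state="a123a123a123",
)
print(url)
```

In real use the `state` argument would be omitted so that a random value is generated and stored in the user's session until the redirect arrives.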


2] The Authorization Server authenticates the End-User.
3] The Authorization Server obtains End-User consent/authorization.
4] The Authorization Server sends the End-User back to the redirect URI given in the first request, together with the code.

example of the response

    code=uniquecode123
    &state=a123a123a123
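
When the redirect arrives, the client is expected to read the code and to confirm that the returned state matches the one it sent. A possible sketch in Python follows; `extract_code` is a hypothetical helper, and the redirect URL is built from the example values above.

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def extract_code(redirect_url, expected_state):
    """Return the authorization code, or raise ValueError if the response is unusable."""
    params = parse_qs(urlsplit(redirect_url).query)
    returned_state = params.get("state", [""])[0]
    # Constant-time comparison; a mismatch suggests the response does not
    # belong to a request this client started.
    if not hmac.compare_digest(returned_state, expected_state):
        raise ValueError("state mismatch")
    if "code" not in params:
        raise ValueError("no authorization code in response")
    return params["code"][0]


code = extract_code(
    "https://client.pl/auth?code=uniquecode123&state=a123a123a123",
    expected_state="a123a123a123",
)
print(code)
```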



5] Client sends the code to the Token Endpoint to receive an Access Token and ID Token in the response.

POST /token HTTP/1.1
  Host: client.pl
  Authorization: Basic czZCaGRSa3F0MzpnWDFmQmF0M2JW
  Content-Type: application/x-www-form-urlencoded

  grant_type=authorization_code&code=uniquecode123
    &redirect_uri=https%3A%2F%2Fclient.pl%2Fauth
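
The token request in step 5 could be prepared as in the Python sketch below. It only builds the request object and does not contact any server. The helper name `build_token_request`, the endpoint `https://unity.example/token` and the credentials are placeholders (assumptions); the real token endpoint and the clientID/secretKey pair come from the Unity team.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(token_endpoint, client_id, client_secret, code, redirect_uri):
    """Build the POST request that exchanges an authorization code for tokens."""
    # HTTP Basic authentication: base64("client_id:client_secret").
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = urlencode(
        {
            "grant_type": "authorization_code",
            "code": code,
            "redirect_uri": redirect_uri,
        }
    ).encode("ascii")
    return Request(
        token_endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


request = build_token_request(
    "https://unity.example/token",  # placeholder endpoint (assumption)
    client_id="my-client-id",       # placeholder credentials (assumption)
    client_secret="my-secret-key",
    code="uniquecode123",
    redirect_uri="https://client.pl/auth",
)
# Sending it would be: urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```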




6] Client validates the tokens and retrieves the End-User's Subject Identifier.

example:

  HTTP/1.1 200 OK
  Content-Type: application/json
  Cache-Control: no-store
  Pragma: no-cache
  {
   "access_token":"accessToken123",
   "token_type":"Bearer",
   "expires_in":3600,
   "refresh_token":"refreshToken123",
   "id_token":"idToken123123"
  }

You should decode the id_token and validate it (more information: http://openid.net/specs/openid-connect-basic-1_0.html).
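
As an illustration of the decoding step, the Python sketch below reads the claims of an id_token, assuming it is a JWT in compact form, and applies basic issuer, audience and expiry checks. It deliberately does not verify the token signature, which a production client should also do with a JOSE/JWT library. The function names and the sample issuer, client and subject values are hypothetical.

```python
import base64
import json
import time


def decode_id_token_claims(id_token):
    """Return the claims of a compact JWT WITHOUT verifying its signature."""
    parts = id_token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWT")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def validate_claims(claims, expected_issuer, client_id, now=None):
    """Apply basic issuer, audience and expiry checks; return the subject."""
    now = time.time() if now is None else now
    audience = claims.get("aud")
    audiences = audience if isinstance(audience, list) else [audience]
    if claims.get("iss") != expected_issuer:
        raise ValueError("unexpected issuer")
    if client_id not in audiences:
        raise ValueError("token was not issued for this client")
    if claims.get("exp", 0) <= now:
        raise ValueError("token has expired")
    return claims["sub"]
```

The value returned by `validate_claims` is the End-User's Subject Identifier mentioned in step 6.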


7] The Client retrieves user information from the UserInfo endpoint (https://unity.egi.eu/oauth2/userinfo).

example


8] The Client receives information about the user, such as email or name, in JSON format.
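
Steps 7 and 8 might look as follows in Python. The sketch builds the UserInfo request with the access token as a Bearer credential and extracts a few fields from a JSON reply. The endpoint URL is the one given above; the helper names and the claim names `sub`, `name` and `email` are assumptions based on standard OpenID Connect claims, and the attributes actually released by Unity may differ.

```python
import json
from urllib.request import Request

USERINFO_ENDPOINT = "https://unity.egi.eu/oauth2/userinfo"  # from the text above


def build_userinfo_request(access_token, endpoint=USERINFO_ENDPOINT):
    """Build the GET request that presents the access token as a Bearer credential."""
    return Request(
        endpoint,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )


def parse_userinfo(body):
    """Pick commonly used fields from the JSON reply; missing fields become None."""
    info = json.loads(body)
    return {key: info.get(key) for key in ("sub", "name", "email")}


request = build_userinfo_request("accessToken123")
# Sending it would be: urllib.request.urlopen(request).read()
```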



important data:
unity.server.clientId=  [YOUR CLIENT ID]
unity.server.clientSecret= [YOUR SECRET KEY]
unity.server.base=https://unity.egi.eu

full configuration:

OpenId Connect for Liferay

OpenId Connect for Liferay is a very rough but effective implementation of the OpenId connect protocol for Liferay. Use this module to authenticate with any OpenId Connect provider.

Connecting the science gateway with per-user subproxies

The platform uses per-user subproxies (PUSPs) for user authentication. Any connected science gateway must generate per-user subproxies for its users and must use these for any interaction with VO resources on behalf of the users. A gateway can generate PUSPs in two ways:

  1. From a robot certificate that is physically hosted on the gateway server itself, or
  2. From a remote robot certificate that is hosted in the e-Token Server by INFN Catania.

We recommend choosing the first option. If there is an IGTF CA in your country that issues robot certificates, obtain a robot certificate from this CA. If such robots are not available in your country or region, EGI can issue one for you from the SEEGRID catch-all CA. The next subsections provide detailed information to complete these steps.

Generic requirements

The Per-User Sub-Proxy (PUSP) and End-Entity Certificate (EEC) must satisfy the following requirements:

  1. The EEC is a valid robot certificate.
  2. The PUSP is RFC 3820 compliant, i.e. no legacy GT2 or GT3 proxies.
  3. The PUSP is the first proxy delegation.
  4. If the same user enters via the same portal, they must get the same PUSP DN.
  5. No two distinct identified users will have the same PUSP DN.
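
Requirements 4 and 5 imply that the user label placed in the PUSP must be a stable, per-user value. One possible way to derive such a label, shown here only as a sketch, is a keyed hash of the portal's internal user identifier. The function `pusp_user_label`, the secret and the 32-character truncation are illustrative assumptions, not something prescribed by the platform; a portal that needs a hard uniqueness guarantee could instead keep an explicit lookup table.

```python
import hashlib
import hmac


def pusp_user_label(portal_user_id, portal_secret):
    """Derive a stable pseudonymous label for one portal user.

    The same (user, secret) pair always yields the same label, and different
    users yield different labels except for a negligible collision probability.
    """
    digest = hmac.new(
        portal_secret, portal_user_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Only the portal, which holds the secret and the user list, can map
    # the label back to a real user.
    return "user:" + digest[:32]


secret = b"portal-private-secret"  # hypothetical value; keep the real one private
label = pusp_user_label("alice@example.org", secret)
print(label)
```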

A robot EEC that generates PUSP credentials SHOULD NOT be used for any other purpose; for example, it should not be used to generate non-PUSP proxy credentials and should not be used for direct authentication.

The machine/service that will take care of PUSP generation and management should respect the following rules:

  1. Documented response procedures in case of incidents (that are periodically tested).
  2. A listed/accredited CSIRT team.
  3. Internal risk assessment and an actuarial team to calculate the effective risk

Using a robot certificate from your national IGTF CA

  1. Obtain a robot certificate from your national IGTF Certification Authority following the instructions here.
  2. Register the robot in the vo.access.egi.eu VO: https://perun.metacentrum.cz/cert/registrar/?vo=vo.access.egi.eu
  3. Generate proxies from the robot using this script: https://ndpfsvn.nikhef.nl/viewvc/mwsec/trunk/lcmaps-plugins-robot/tools/

Obtaining a robot certificate from EGI catch-all CA

  1. Contact long-tail-support@mailman.egi.eu and send a short description of your gateway service and the way it would be integrated with platform resources. The team will arrange a robot certificate for your gateway from the SEEGRID CA (which operates as a 'catch-all' CA in EGI) and will register this in the VO and in the e-Token Server in Italy.
  2. Register the robot in the vo.access.egi.eu VO: https://perun.metacentrum.cz/cert/registrar/?vo=vo.access.egi.eu
  3. Generate proxies from the robot using this script: https://ndpfsvn.nikhef.nl/viewvc/mwsec/trunk/lcmaps-plugins-robot/tools/

Instructions to use the e-Token Server

  1. Contact long-tail-support@mailman.egi.eu and send a short justification why you would like to use the eToken server (instead of hosting the robot certificate locally). Describe your gateway service and the way it would be integrated with platform resources. The team will arrange a robot certificate for your gateway from the SEEGRID CA and will register it in vo.access.egi.eu.
  2. Provide long-tail-support@mailman.egi.eu with the static IP address of your gateway server, so proxy requests can be authorized from this address on the e-Token Server.
  3. Generate proxies from the e-Token server following this guideline:

There are two available e-Token Server instances for availability and reliability reasons:

  • etokenserver.ct.infn.it
  • etokenserver2.ct.infn.it

The following REST API is available to get a PUSP given a unique identifier:

https://[eToken Server instance]:8443/eTokenServer/eToken/[Robot Certificate ID]?voms=[VO]:/[VO]&proxy-renewal=[true|false]&disable-voms-proxy=[true|false]&rfc-proxy=[true|false]&cn-label=user:[user unique identifier]
  • Robot certificate ID: the ID of your robot certificate in the e-Token server. It will be generated after your robot has been set up in the e-Token Server.
  • VO: the VO you want to use to perform any action on the EGI infrastructure. The robot certificate must be a member of this VO.
  • proxy-renewal: this option is used to enable (true) or disable (false) the automatic registration of a long-term proxy into a MyProxy Server.
  • disable-voms-proxy: this option is used to generate plain (true) or VOMS proxy certificate (false).
  • rfc-proxy: this option is used to generate standard RFC proxies (true) or legacy proxies (false).
  • cn-label: this option is used to generate a PUSP for the given unique identifier.

An example is given below:

https://[eToken Server instance]:8443/eTokenServer/eToken/27br90771bba31acb942efe4c8209e69?voms=training.egi.eu:/training.egi.eu&proxy-renewal=false&disable-voms-proxy=false&rfc-proxy=true&cn-label=user:test1
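
A gateway could assemble this URL with a small helper such as the Python sketch below. The function name `etoken_pusp_url` and its default flag values are illustrative choices that mirror the example; the URL layout and parameter names are taken from the description above.

```python
from urllib.parse import urlencode


def etoken_pusp_url(server, robot_id, vo, user_id,
                    proxy_renewal=False, disable_voms_proxy=False, rfc_proxy=True):
    """Build the e-Token Server URL that requests a PUSP for one user."""
    def flag(value):
        return "true" if value else "false"

    query = urlencode(
        {
            "voms": f"{vo}:/{vo}",
            "proxy-renewal": flag(proxy_renewal),
            "disable-voms-proxy": flag(disable_voms_proxy),
            "rfc-proxy": flag(rfc_proxy),
            "cn-label": f"user:{user_id}",
        },
        safe=":/",  # keep ':' and '/' readable, as in the example URL
    )
    return f"https://{server}:8443/eTokenServer/eToken/{robot_id}?{query}"


print(etoken_pusp_url(
    "etokenserver.ct.infn.it",
    "27br90771bba31acb942efe4c8209e69",
    "training.egi.eu",
    "test1",
))
```

Because there are two server instances, a caller may want to retry the same request against the second host if the first one is unreachable.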

Connecting the gateway with the EGI monitoring system

...

How to join as a resource provider

Any EGI resource provider can join the platform to offer capacity for members of the long tail of science. The site needs to run one of the supported grid or cloud middleware stacks, enable per-user sub-proxies (for user authentication and authorisation), and join the vo.access.egi.eu Virtual Organisation. The next subsections provide instructions on how to enable per-user sub-proxies on EGI sites. Please email long-tail-support@egi.eu if you wish to join as a resource provider.

In order to authorize the users of the LToS VO, a couple of DNs (Distinguished Names) must be configured on the services to be enabled. For instance, for the CREAM CE the usual grid-mapfile is the place to add them; for OpenStack it is /etc/keystone/voms.json. You can find below the instructions for each service.

The following Robot Certificate DNs must be configured:

/DC=EU/DC=EGI/C=HU/O=Robots/O=MTA SZTAKI/CN=Robot:zfarkas@sztaki.hu
/C=IT/O=INFN/OU=Robot/L=Catania/CN=Robot: Catania Science Gateway  - Roberto Barbera

Instructions for OpenStack providers

Keystone-VOMS supports per-user subproxies in a special branch called subproxy_support, available in the GitHub repository https://github.com/enolfc/keystone-voms (the code is in the process of being integrated into the main branch of Keystone-VOMS). You can install the code from the repository following these instructions:

 git clone -b subproxy_support https://github.com/enolfc/keystone-voms.git
 cd keystone-voms
 pip install .

Configuration and deployment of the plugin do not change from the normal Keystone-VOMS plugin; follow the Keystone-VOMS documentation to deploy it.

There are new parameters to configure in your keystone config file, under the [voms] section:

  • allow_subproxy: should be set to True to enable PUSP support.
  • subproxy_robots: should be set to * (recommended) or to a list of the DNs that are allowed to create PUSPs in the system.
  • subproxy_user_prefix: determines the expected prefix for the PUSP user specification. It is safe to leave it undefined so the default value (CN=eToken) is used.
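
To check that a keystone config file carries these settings, a provider could use a short script along the lines of the Python sketch below. The sample text shows what a [voms] section with the recommended values might look like; the helper `pusp_settings` is hypothetical and only reads the three options named above.

```python
import configparser

# Sample [voms] section with the recommended values from the list above.
# subproxy_user_prefix is left undefined on purpose so the default applies.
SAMPLE_CONFIG = """
[voms]
allow_subproxy = True
subproxy_robots = *
"""


def pusp_settings(config_text):
    """Return the PUSP-related options found in the [voms] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section("voms"):
        raise ValueError("no [voms] section found")
    voms = parser["voms"]
    return {
        "allow_subproxy": voms.getboolean("allow_subproxy", fallback=False),
        "subproxy_robots": voms.get("subproxy_robots", fallback=None),
        "subproxy_user_prefix": voms.get("subproxy_user_prefix", fallback=None),
    }


settings = pusp_settings(SAMPLE_CONFIG)
print(settings)
```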

Instructions for gLite providers

There is an EGI manual that shows how to set up a per-user sub-proxy to allow identification of the individual users under a common robot certificate. You can find the guide here: https://wiki.egi.eu/wiki/MAN12

Instructions for OpenNebula providers

OpenNebula sites are not yet supported in the platform.

How to join the user support team

If you wish to support platform users from your country, region or scientific disciplinary area, then please email long-tail-support@egi.eu. We can train you and then register you as a supporter in our team.


Technical and architecture details

User Registration Portal

The User Registration Portal of the platform is hosted by CYFRONET in Poland and serves as the entry point for users. The portal offers login with social or EGI SSO accounts, allows users to manage their profiles and resource requests, and acts as a central hub for accessing the connected science gateways. The portal is used by the user support team to review user profiles and to evaluate the users' resource requests. The portal is accessible at http://access.egi.eu.

Virtual Organisation

The HTC, cloud and storage resources of the platform are federated through the 'vo.access.egi.eu' Virtual Organisation of EGI (VO). Technical details of this VO are the following:

Per-user sub-proxies

The purpose of a per-user sub-proxy (PUSP) is to allow identification of the individual users that operate using a common robot certificate. A common example is where a web portal (e.g., a scientific gateway) identifies its user by its own means and wishes to authenticate as that user when interacting with EGI resources. This is achieved by creating a proxy credential from the robot credential, with the proxy certificate containing user-identifying information in its additional proxy CN field. The user-identifying information may be pseudonymised so that only the portal knows the actual mapping.

Example of a Per-User Sub-Proxy (PUSP):

subject   : /C=IT/O=INFN/OU=Robot/L=Catania/CN=Robot: EGI Training Service - XXXXX/CN=user:test1/CN=1286259828
issuer    : /C=IT/O=INFN/OU=Robot/L=Catania/CN=Robot: EGI Training Service - XXXXX/CN=user:test1
identity  : /C=IT/O=INFN/OU=Robot/L=Catania/CN=Robot: EGI Training Service - XXXXX
type      : RFC3820 compliant impersonation proxy
strength  : 1024
path      : /home/XXXXX/proxy.txt
timeleft  : 23:59:15
key usage : Digital Signature, Key Encipherment, Data Encipherment
=== VO training.egi.eu extension information ===
VO        : training.egi.eu
subject   : /C=IT/O=INFN/OU=Robot/L=Catania/CN=Robot: EGI Training Service - XXXXX
issuer    : /DC=org/DC=terena/DC=tcs/OU=Domain Control Validated/CN=voms1.grid.cesnet.cz
attribute : /training.egi.eu/Role=NULL/Capability=NULL
timeleft  : 23:59:17
uri       : voms1.grid.cesnet.cz:15014
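
The subject line above shows how the user label is appended to the robot DN. The Python sketch below splits such a subject into the robot identity and the user label. The helper `parse_pusp_subject` is hypothetical, and the prefix `CN=user:` is taken from this example; a site configured with a different prefix would pass its own value.

```python
def parse_pusp_subject(subject, user_prefix="CN=user:"):
    """Split a PUSP subject into (robot DN, user label).

    Returns None if the subject carries no user component with the given prefix.
    """
    marker = "/" + user_prefix
    index = subject.find(marker)
    if index == -1:
        return None
    robot_dn = subject[:index]
    # The label ends at the next DN component (the trailing proxy serial CN).
    user_label = subject[index + len(marker):].split("/", 1)[0]
    return robot_dn, user_label


subject = (
    "/C=IT/O=INFN/OU=Robot/L=Catania/CN=Robot: EGI Training Service - XXXXX"
    "/CN=user:test1/CN=1286259828"
)
print(parse_pusp_subject(subject))
```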

E-Token Server

The platform adopted the e-Token server [1] as a central service to generate PUSPs for science gateways. In a nutshell, the e-Token server is a standards-based solution developed by and hosted at INFN Catania for the central management of robot certificates and the provisioning of short-term digital proxies from them, allowing seamless and secure access to e-Infrastructures with an X.509-based authorisation layer.

The e-Token server uses the standard JAX-RS framework [2] to implement RESTful Web services in Java and provides end users, portals and new-generation Science Gateways with a set of REST APIs to generate PUSPs given a unique identifier. PUSPs are usually generated starting from standard X.509 certificates. These digital certificates have to be uploaded onto one of the secure USB smart cards (e.g. SafeNet Aladdin eToken PRO 32/64 KB) plugged into the server.

The e-Token server was conceived for providing a credential translator system to Science Gateways and Web Portals that need to interact with the EGI platform for the long-tail (and in general with any e-Infrastructure).

[1] Valeria Ardizzone, Roberto Barbera, Antonio Calanducci, Marco Fargetta, E. Ingrà, Ivan Porro, Giuseppe La Rocca, Salvatore Monforte, R. Ricceri, Riccardo Rotondo, Diego Scardaci, Andrea Schenone: The DECIDE Science Gateway. Journal of Grid Computing 10(4): 689-707 (2012)

[2] Java API for RESTful Web Services (JAX-RS): https://en.wikipedia.org/wiki/Java_API_for_RESTful_Web_Services

Policies

Links for administrators

User approval:

  1. Approve affiliation: https://access.egi.eu:8888/modules#/list/Affiliations
  2. Approve resource request: https://e-grant.egi.eu/ltos/auth/login

Gateway and support approval:

Monitoring:

Accounting:

  • Accounting data of platform users: ...
  • ...

Roadmap

No Task Priority Responsible Start date Deadline Comment Status

Definition of the LTOS portal Terms and Conditions Medium
Solagna
1 April

Setup of the structures (team, processes, procedures) needed to support the LTOS platform
Medium
Solagna
1 May

Registration of LTOS components in GOC DB
High
Krakowian started 1 April

GRIDOPS-CSGF

GRIDOPS-LTOS

missing registration of administrators and some additional info for GRIDOPS-LTOS

In progress

Agree on OLAs supporting LTOS resources High
Krakowian

1 April
In progress

Finalization of the LTOS business model
Medium
Solagna

1 May


Integrate WS-PGRADE gUSE to LTOS
High
La Rocca
started
1 April

https://ggus.eu/index.php?mode=ticket_info&ticket_id=116323

In progress

Accounting system integration
Medium
La Rocca started TBD

In progress

Implementing Roles in the URP
Low
Szepieniec

TBD
better understand requirement

Instructions for Liferay providers

La Rocca started
finished
https://github.com/csgf/OpenIdConnectLiferay DONE

Space for the resource providers logos
Low
Szepieniec



Logos of NGIs/institutions providing resources for the LToS platform should be added on page [1] (in the bottom). [1] https://access.egi.eu/start


Integration with QCG
Medium
La Rocca
started
TBD
https://ggus.eu/?mode=ticket_info&ticket_id=117764 In progress

Login modes
Medium
Szepieniec

1 April explanation

page not refreshed
Medium Szepieniec
1 April explanation

Rephrase point 3 of "How can you access the platform?"
Low
Szepieniec
TBD

accepting and rejecting the affiliations
Medium
Szepieniec
1 April explanation

information menu
Medium
Szepieniec
1 May

General usage policy
Medium Szepieniec
1 April


notifications
High
Szepieniec

1 April


Link to www.egi.eu
Low
Szepieniec
1 May access.egi.eu does already contain an EGI logo but the link is wrong. It should point to www.egi.eu instead of https://access.egi.eu/
In progress

Pre-defined templates for the requests
High Szepieniec
1 April HTC [Computing] = 10k hours
HTC [Storage] = 100 GB of total storage capacity
Cloud [Computing] = 10 vCPU cores per hours
Cloud [Storage] = 100 GB of storage volume
In progress

Add contacts for support/requests
Low
Szepieniec
TBD

Access to general usage policy
Medium Szepieniec
1 April MK+GLR where to put link In progress

Add an institutional email for the communications
High
Peter

TBD

Users should always be able to go back to the home page
Medium Szepieniec
1 June

Monitoring of URP
High Krakowian

http://argo.egi.eu/lavoisier/status_report-site?report=OPS-MONITOR-Critical&accept=html
DONE

Monitoring of SGs. Update SG integration doc in Wiki accordingly
High Krakowian
http://argo.egi.eu/lavoisier/status_report-site?report=OPS-MONITOR-Critical&accept=html
DONE

Setup GGUS units for trouble tickets
High Peter

TBD
In progress

Define identity vetting manual for user request approvers
High
La Rocca

TBD

Sign OLA with URP provider
High Krakowian
21.03
1 April

In progress

Sign OLA with SG
High Krakowian 21.03 1 April
In progress

Document process on how to monitor user-level accounting & how to respond to quota overuse
Low
La Rocca

TBD


Manage user-level quota inside the SG
Low La Rocca
TBD

Define and implement process for downtime notification
Medium
Krakowian

TBD


Move the security policy into final document format
High
Krakowian
14.03.2016
1 April
https://documents.egi.eu/document/2769 DONE

Discuss details of joining with interested sites and SGs
High
La Rocca

TBD

In progress

Involve NGI representatives in request approver team
Medium
Solagna

1 April


Adoption of URP to Hungarian Academic Cloud
Low
Sipos