The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations@egi.eu.

EGI Notebooks

From EGIWiki

EGI Jupyter is an 'as a Service' environment that provides a browser-based, scalable tool for interactive data analysis. The environment provides users with notebooks where they can combine text, mathematics, computations and rich media output using Jupyter technology. EGI Jupyter is a multi-user service and can scale to multiple servers based on the EGI cloud service.

Unique Features

EGI Jupyter provides the well-known Jupyter interface for notebooks with the following added features:

  • Integration with EGI Check-in for authentication: log in with any EduGAIN or social account (e.g. Google, Facebook).
  • Persistent storage associated with each user, available in the notebooks environment.
  • Customisable with new notebook environments: expose any existing notebook to your users.
  • Runs on the EGI e-Infrastructure, so you can easily use EGI compute and storage from your notebooks.

Service Modes

We offer different service modes depending on your needs:

  • Individual users can use the centrally operated service from EGI. After a lightweight approval, users can log in and write, run and re-run notebooks. Notebooks can use storage and compute capacity from the access.egi.eu Virtual Organisation. Request access via the EGI Marketplace.
  • User communities can have their own customised EGI Jupyter service instance. EGI offers consultancy and support, and can also operate the setup. Contact support@egi.eu to make an arrangement. A community-specific setup allows the community to:
    • use the community's own Virtual Organisation (i.e. federated compute and storage sites) for Jupyter
    • add custom libraries into Jupyter (e.g. discipline-specific analysis libraries)
    • have fine-grained control over who can access the instance (based on the information available to the EGI Check-in AAI service).
  • (under development) BinderHub mode, which recreates notebooks from existing repositories, making the code immediately reproducible by anyone, anywhere. While under development, this option does not have persistent storage and does not require authentication; there is ongoing work to integrate it with the modes described above. An alpha instance is available at https://binderhub.fedcloud-tf.fedcloud.eu
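
As a rough illustration of the BinderHub mode, the sketch below builds a launch link for a GitHub repository. It assumes the instance exposes the `/v2/gh/<org>/<repo>/<ref>` launch path commonly used by BinderHub deployments. The function name `binder_url` and the repository names in the usage note are hypothetical, and the alpha instance may no longer be reachable.

```python
from urllib.parse import quote


def binder_url(hub, org, repo, ref):
    """Return a launch URL for a GitHub repository on a BinderHub instance.

    Assumes the common BinderHub path layout /v2/gh/<org>/<repo>/<ref>.
    """
    # Percent-encode each component so that a ref such as "feature/x"
    # stays a single path segment.
    parts = [quote(part, safe="") for part in (org, repo, ref)]
    return f"{hub.rstrip('/')}/v2/gh/{'/'.join(parts)}"
```

For example, `binder_url("https://binderhub.fedcloud-tf.fedcloud.eu", "example-org", "example-repo", "main")` gives a link that could be shared so that others can open the same notebooks.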

Data Management

Persistent storage for the notebooks is available at /persistent, linked from the notebook's home directory. It is backed by an NFS server managed as a persistent volume in Kubernetes and automatically mounted in every notebook that users create. Access to other kinds of persistent storage can be arranged, especially for community-specific instances, which can be tailored to your specific needs and available storage systems.
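
A minimal sketch of using this storage from a notebook follows. It assumes the directory is reachable either as `/persistent` or as a `persistent` link inside the home directory, since the exact link name is not stated here. The helper names `find_persistent_dir` and `save_result` are illustrative only.

```python
from pathlib import Path


def find_persistent_dir(candidates=None):
    """Return the first candidate directory that exists."""
    if candidates is None:
        # Assumed locations: the mount point itself and a link in $HOME.
        candidates = [Path("/persistent"), Path.home() / "persistent"]
    for candidate in candidates:
        path = Path(candidate)
        if path.is_dir():
            return path
    raise FileNotFoundError("no persistent storage directory found")


def save_result(text, name, base):
    """Write text to base/name, creating parent folders, and return the path."""
    target = Path(base) / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target
```

In a notebook this could be used as `save_result("a,b\n1,2\n", "results/run1.csv", find_persistent_dir())`, so that the file survives after the notebook server is stopped.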

Getting your data in

Your notebooks have full outgoing internet connectivity, so you can connect to any external service to bring data in for analysis. We are evaluating integration with EOSC-hub services to facilitate access to input data, with EGI DataHub as the first target. Please contact support@egi.eu if you are interested in other I/O integrations.
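
Since outgoing connections are allowed, input data can be pulled in with nothing more than the Python standard library. The sketch below streams a URL to a local file. The function name `fetch` is illustrative, and the URL in the usage note is a placeholder rather than a real EGI endpoint.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch(url, dest, timeout=60):
    """Stream the content at url into dest and return the bytes written."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Copy in chunks so that large files do not need to fit in memory.
    with urllib.request.urlopen(url, timeout=timeout) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest.stat().st_size
```

A call such as `fetch("https://data.example.org/input.csv", "data/input.csv")` would then place the file next to the notebook.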

Deposit output data

As with input data, you can connect to any external service to deposit the notebooks' output.
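
How output is deposited depends entirely on the target service. As one possible sketch, the code below prepares an HTTP PUT upload with an optional bearer token. It assumes the target service accepts PUT uploads and bearer tokens. The function names and the URL in the usage note are hypothetical.

```python
import urllib.request
from pathlib import Path


def build_upload_request(path, url, token=None):
    """Build (but do not send) an HTTP PUT request carrying the file at path."""
    data = Path(path).read_bytes()
    headers = {"Content-Type": "application/octet-stream"}
    if token:
        # Assumption: the service authenticates with a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")


def upload(path, url, token=None, timeout=60):
    """Send the file with HTTP PUT and return the response status code."""
    request = build_upload_request(path, url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For instance, `upload("results/run1.csv", "https://storage.example.org/run1.csv", token="...")` would send the file, provided the service supports this style of upload.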

Access to other services

Notebooks running on EGI Jupyter can access other existing computing and storage services. The centrally operated EGI Jupyter instance uses resources from the access.egi.eu Virtual Organisation. We welcome suggestions on which services you would like to access, so that we can create guidelines and extend the service with tools to ease these tasks.

Bring your custom notebooks

Adding new notebooks to EGI Jupyter only requires a working Docker image, accessible from a public repository, that follows these rules:

  1. It must install JupyterHub v0.8
  2. It must not run as user root; a user with uid 1000 is recommended
  3. It must use $HOME as notebook directory

If you have such an image, let us know so we can add it to the configuration.
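
Before submitting an image, the three rules above can be checked with a short script run inside the container. The sketch below is not an official validation tool. The function name `check_image_rules` and its arguments are illustrative, and the caller supplies the values to be checked.

```python
import os


def check_image_rules(uid, jupyterhub_version, notebook_dir, home):
    """Return a list of rule violations; an empty list means all checks pass."""
    problems = []
    # Rule 1: JupyterHub v0.8 (any 0.8.x patch release is accepted here).
    if jupyterhub_version.split(".")[:2] != ["0", "8"]:
        problems.append(f"JupyterHub {jupyterhub_version} found, v0.8 expected")
    # Rule 2: the image must not run as root (uid 0).
    if uid == 0:
        problems.append("image runs as root")
    # Rule 3: the notebook directory must be $HOME.
    if os.path.realpath(notebook_dir) != os.path.realpath(home):
        problems.append(f"notebook directory {notebook_dir} is not $HOME ({home})")
    return problems
```

Inside a running container it could be called as `check_image_rules(os.getuid(), importlib.metadata.version("jupyterhub"), os.getcwd(), os.environ["HOME"])`. This assumes Python 3.8 or later and that the process starts in the notebook directory.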

Once Binder integration is complete, you will be able to import any notebook just by providing the URL of the repository that contains it.

Early adopter communities

  • OpenDreamKit: Open Digital Research Environment Toolkit for the Advancement of Mathematics
  • AGINFRA+: Accelerating user-driven e-infrastructure innovation in Food and Agriculture
  • IFREMER: Marine sciences

Technology Stack

The EGI setup is based on the following components:

The enolfc/egi-jupyterhub GitHub repository contains detailed configuration information on the existing setup on EGI resources.

Next steps

We are looking into:

  • Prometheus-based monitoring for the Kubernetes cluster and the JupyterHub
  • Integration with EGI accounting to report usage of resources
  • Complete Binder integration with the regular JupyterHub so users can have persistent notebooks created from external repositories
  • Integration with storage services, EGI DataHub as first target