Difference between revisions of "EGI Notebooks"
Revision as of 14:52, 1 February 2018
EGI Jupyter is an 'as a Service' environment that provides a browser-based, scalable tool for interactive analysis of data. The environment provides users with notebooks where they can combine text, mathematics, computations and rich media output using Jupyter technology. 'Jupyter as a service' is a multi-user environment and can scale to multiple servers by using the EGI cloud service.
Unique Features
EGI Jupyter provides the well-known Jupyter interface for notebooks with the following added features:
- Integration with EGI Check-in for authentication: log in with any EduGAIN or social account (e.g. Google, Facebook).
- Persistent storage associated with each user, available in the notebooks environment.
- Customisable with new notebook environments: expose any existing notebook to your users.
- Runs on the EGI e-Infrastructure, so you can easily use EGI compute and storage from your notebooks.
Service Modes
We offer different service modes depending on your needs:
- For individual users, EGI hosts and offers the service within the Applications On Demand Service. Users (after lightweight approval) can log in, write and run notebooks using storage and compute capacity from the access.egi.eu VO. Request access via the EGI marketplace.
- For user communities, EGI offers consultancy and technology to set up a community-specific JupyterHub instance on top of community VO resources and with community-specific storage/data. Communities can have fine-grained control over who can access the instance, based on the information available to EGI Check-in. Contact support@egi.eu for more information.
- (under development) BinderHub mode, which recreates notebooks from existing repositories, making the code immediately reproducible by anyone, anywhere. While under development, this option does not have persistent storage and does not require authentication. There is ongoing work to integrate it with the modes described above. Alpha instance available at https://binderhub.fedcloud-tf.fedcloud.eu
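As an illustration of the BinderHub mode, the sketch below builds a launch URL for a GitHub repository. It assumes the alpha instance follows the usual BinderHub URL layout (`/v2/gh/<org>/<repo>/<ref>`); the function name and the organisation and repository names are hypothetical placeholders, not part of the EGI service.

```python
from urllib.parse import quote


def binder_launch_url(hub_url, org, repo, ref="master"):
    """Build a launch URL for a GitHub repository on a BinderHub instance.

    Assumption: the instance uses the standard BinderHub layout
    <hub>/v2/gh/<org>/<repo>/<ref>.
    """
    # Percent-encode each component so it stays a single path segment.
    parts = [quote(part, safe="") for part in (org, repo, ref)]
    return f"{hub_url.rstrip('/')}/v2/gh/{'/'.join(parts)}"


# "example-org" and "example-repo" are placeholders, not real repositories.
print(binder_launch_url("https://binderhub.fedcloud-tf.fedcloud.eu",
                        "example-org", "example-repo"))
```

Opening the resulting URL in a browser would ask BinderHub to build the repository and start a notebook server from it.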
Data Management
Persistent storage for the notebooks is available at /persistent, linked from the notebooks' home directory. It is backed by an NFS server managed as a persistent volume in Kubernetes and automatically mounted in every notebook that users create. Access to other kinds of persistent storage can be arranged, especially for community-specific instances, which can be tailored to your specific needs and available storage systems.
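A minimal sketch of how a notebook might keep results across sessions by writing them under /persistent. The helper name `save_results` and the JSON format are assumptions made for this example; any file written below the mount point should persist in the same way.

```python
import json
from pathlib import Path


def save_results(results, name, base_dir="/persistent"):
    """Write a JSON-serialisable object to a file in the persistent area."""
    base = Path(base_dir)
    if not base.is_dir():
        # Outside the service (or if the volume is missing) fail loudly
        # instead of silently writing to ephemeral storage.
        raise FileNotFoundError(f"{base} is not available")
    target = base / name
    target.write_text(json.dumps(results, indent=2))
    return target
```

In a notebook, `save_results({"mean": 0.5}, "run1.json")` would then leave `run1.json` in /persistent, where it should still be available the next time the notebook server starts.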
Getting your data in
Your notebooks have full outgoing internet connectivity, so you can connect to any external service to bring data in for analysis. We are evaluating integration with other existing EGI and EOSC-hub services to facilitate access to input data, with EGI DataHub as the first target.
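Since outgoing connectivity is unrestricted, input data can be fetched with nothing more than the Python standard library. The sketch below streams a URL to a local file; the function name `fetch` and the example URL in the usage note are hypothetical.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch(url, dest):
    """Download url to dest, streaming so large files do not fill memory."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=60) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

For example, `fetch("https://data.example.org/input.csv", "/persistent/input.csv")` would store the file in the persistent area so it does not need to be downloaded again in later sessions.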
Deposit output data
As with input data, you can connect to any external service to deposit your notebooks' output.
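How output is deposited depends entirely on the target service. As one possible pattern, the sketch below prepares an HTTP PUT upload with an optional bearer token, which is how some WebDAV-style and object-storage endpoints accept files. The function name, URL and token handling are assumptions for illustration; check the documentation of the service you are depositing to.

```python
import urllib.request
from pathlib import Path


def build_upload_request(url, path, token=None):
    """Prepare an HTTP PUT request that uploads the file at path to url."""
    # Reads the whole file into memory; fine for small outputs.
    data = Path(path).read_bytes()
    headers = {"Content-Type": "application/octet-stream"}
    if token:
        # Assumption: the target service accepts bearer tokens.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")


# Sending the request (hypothetical endpoint, so left commented out):
# request = build_upload_request("https://storage.example.org/results.csv",
#                                "/persistent/results.csv", token="...")
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```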
Access to other services
Notebooks running on EGI Jupyter can access other existing computing and storage services. We welcome suggestions on which services you would like to access, so that we can create guidelines and extend the service with tools to ease these tasks.
Bring your custom notebooks
Adding new notebooks to EGI Jupyter just requires a working Docker image, accessible from a public repository, that follows these rules:
- It must install JupyterHub v0.8
- It must not run as user root; a user with uid 1000 is recommended
- It must use $HOME as notebook directory
If you have such an image, let us know so we can add it to the configuration.
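Before submitting an image, the three rules can be checked from inside a running container. The script below is a rough self-check under stated assumptions, not an official validation tool: the function names are hypothetical, and it treats the current working directory as the notebook directory, which may not match how your image starts the notebook server.

```python
import os
from importlib import metadata


def check_image_rules(uid, notebook_dir, home, hub_version):
    """Return a list of problems found against the three image rules."""
    problems = []
    if hub_version is None or hub_version.split(".")[:2] != ["0", "8"]:
        problems.append(f"JupyterHub 0.8.x expected, found {hub_version}")
    if uid == 0:
        problems.append("image runs as root; a user with uid 1000 is recommended")
    if os.path.realpath(notebook_dir) != os.path.realpath(home):
        problems.append(f"notebook directory {notebook_dir} is not $HOME ({home})")
    return problems


def installed_jupyterhub_version():
    """Return the installed jupyterhub version, or None if it is missing."""
    try:
        return metadata.version("jupyterhub")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the notebook server is started from the working directory.
    found = check_image_rules(os.getuid(), os.getcwd(),
                              os.environ.get("HOME", ""),
                              installed_jupyterhub_version())
    for problem in found:
        print("WARNING:", problem)
```

Running the script with the image's own Python interpreter and default user prints one warning per rule that appears to be violated, and nothing if all three checks pass.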
Once Binder integration is complete, you will be able to import any notebook just by providing the URL of a repository that contains your notebook.
Technology Stack
The EGI setup is based on the following components:
- Kubernetes as container orchestration platform, running on top of EGI Federated Cloud resources
- JupyterHub with a custom EGI Check-in OAuth authenticator and the Kubernetes Spawner
- BinderHub to build and reproduce notebooks
- Traefik as HTTP proxy
The enolfc/egi-jupyterhub GitHub repository contains detailed configuration information on the existing setup at EGI resources.
Next steps
We are looking into:
- Prometheus-based monitoring for the Kubernetes cluster and the JupyterHub
- Integration with EGI accounting to report usage of resources
- Complete Binder integration with the regular JupyterHub so users can have persistent notebooks created from external repositories
- Integration with storage services, EGI DataHub as first target