The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations @ egi.eu.

Jupyter Notebook with EC3


Introduction

This guide is intended for researchers who want to use Jupyter to create and share documents that contain live code, equations, visualizations and explanatory text on the cloud-based resources provided by EGI. This how-to guide has been tested on the EGI Science Applications on-demand Infrastructure.


The only prerequisite for using the Jupyter Notebook on EGI is that the user must be a member of, or have access to, one of these infrastructures.

Objectives

In this guide we will show how to:

  • Deploy and configure an elastic cluster with Jupyter Notebook (v4.1.1) on top of the EGI FedCloud Infrastructure
  • Check the list of available Kernels supported by the notebook
  • Start the notebook
  • Run some statistical analysis with R

The EC3 architecture

The Elastic Cloud Computing Cluster (EC3) is a framework for creating elastic virtual clusters on top of Infrastructure as a Service (IaaS) providers. It is composed of the following components:

  • EC3 as a Service (EC3aaS)

For further details about the architecture of the EC3 service, please refer to the official documentation.

Create Jupyter Notebook with EC3

The Jupyter Notebook is an open-source web application that allows users to create and share documents that contain live code, equations, visualizations and explanatory text. This web application is now available in the EGI Science Applications on-demand Infrastructure. It is accessible through this portal, which offers grid, cloud and application services from across the EGI community for individual researchers and small research teams.

To create your Jupyter Notebook, access the front page and click on the "Deploy your cluster!" button. A simple wizard will guide you through the configuration of the cluster, letting you set details such as the operating system, the characteristics of the nodes, the maximum number of nodes in the cluster and the pre-installed software packages.


Jupyter 1.png


When the front-end node of the cluster has been successfully deployed, the user will be notified by the portal with the credentials to access via SSH as shown in the figure below:


Jupyter 2.png


The configuration of the cluster may take some time. Please wait for it to complete before you start using the cluster.


Accessing the cluster

With the credentials provided by the EC3 portal it is possible to access the front-end of the elastic cluster.

[larocca@aktarus EC3]$ ssh -i key.pem cloudadm@<YOUR_CLUSTER_IP>
The authenticity of host '193.144.35.156 (193.144.35.156)' can't be established.
RSA key fingerprint is 78:c9:af:31:70:09:c1:c6:26:cf:9d:ae:14:d1:34:a7.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '193.144.35.156' (RSA) to the list of known hosts.
Last login: Thu Mar 30 16:12:38 2017 from 158.42.105.214

     | |_   _ _ __  _   _| |_ ___ _ __       
  _  | | | | | '_ \| | | | __/ _ \ '__|      
 | |_| | |_| | |_) | |_| | ||  __/ |         
  \___/ \__,_| .__/ \__, |\__\___|_|         
  _   _      |_|    |___/               _    
 | \ | | ___ | |_ ___| |__   ___   ___ | | __
 |  \| |/ _ \| __/ _ \ '_ \ / _ \ / _ \| |/ /
 | |\  | (_) | ||  __/ |_) | (_) | (_) |   < 
 |_| \_|\___/ \__\___|_.__/ \___/ \___/|_|\_\
                                            
   Welcome to the Jupyter Notebook v4.1.1
[..]
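OpenSSH typically refuses a private key file that other users can read, so it can help to tighten the key's permissions before connecting. The helper below is a minimal sketch; the name prepare_key is hypothetical and is not part of EC3, and key.pem is simply the file name used in the session above.

```bash
# Restrict a private key so that only its owner can read and write it.
# prepare_key is a hypothetical helper name, not an EC3 command.
#   $1  path to the key file (key.pem in the session above)
prepare_key() {
    local key_file=$1
    if [ ! -f "$key_file" ]; then
        echo "Key file not found: $key_file" >&2
        return 1
    fi
    chmod 600 "$key_file"
}

# Usage (replace the placeholder with the address shown by the portal):
#   prepare_key key.pem && ssh -i key.pem cloudadm@<YOUR_CLUSTER_IP>
```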

The front-end is configured with Ansible, which usually takes some time to finish. The user can monitor the configuration of the front-end node by checking for running Ansible processes:

[root@ip-193-144-35-156 cloudadm]# ps auxwww | grep -i ansible
cloudadm  3827  1.5  0.5 160044 20576 ?        S    16:26   0:00 python_ansible /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//ctxt_agent.py /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//general_info.cfg /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003/<YOUR_CLUSTER_IP>/config.cfg
cloudadm  2743  4.3  0.6 174640 27256 ?        S    16:26   0:01 python_ansible /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//ctxt_agent.py /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//general_info.cfg /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003/<YOUR_CLUSTER_IP>/config.cfg
cloudadm  1234 18.5  0.6 174488 26032 ?        Sl   16:26   0:04 python_ansible /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//ctxt_agent.py /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//general_info.cfg /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003/<YOUR_CLUSTER_IP>/config.cfg
cloudadm  4556  1.3  0.7 184412 28636 ?        S    16:26   0:00 python_ansible /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//ctxt_agent.py /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//general_info.cfg /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003/<YOUR_CLUSTER_IP>/config.cfg
root      4578  6.2  2.0 140904 82060 ?        S    16:26   0:01 /usr/bin/python /tmp/ansible_fRE4qh/ansible_module_apt.py
root      5467  0.0  0.0  11744   952 pts/0    S+   16:26   0:00 grep --color=auto -i ansible

The status can also be checked with the is_cluster_ready command-line tool:

root@torqueserver:~# is_cluster_ready
Cluster is still configuring.

Once the configuration has finished, the command returns the following message:

root@torqueserver:~# is_cluster_ready
Cluster configured!

The front-end node is successfully configured and ready to be used!
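Rather than re-running is_cluster_ready by hand, the check can be repeated in a loop until the "Cluster configured!" message appears. The following is a sketch under stated assumptions: the function name wait_for_cluster, the 30-second interval and the limit of 60 attempts are illustrative choices, and only the two messages shown above are taken from the tool's output.

```bash
# Poll a readiness command until it prints "Cluster configured!".
# wait_for_cluster is a hypothetical helper; all arguments are optional:
#   $1  command to run              (default: is_cluster_ready)
#   $2  seconds between attempts    (default: 30, an assumed value)
#   $3  maximum number of attempts  (default: 60, an assumed value)
wait_for_cluster() {
    local check_cmd=${1:-is_cluster_ready}
    local interval=${2:-30}
    local max_attempts=${3:-60}
    local attempt=1
    local output

    while [ "$attempt" -le "$max_attempts" ]; do
        # Capture the message; ignore the exit status, which is not documented here.
        output=$("$check_cmd" 2>&1) || true
        case $output in
            *"Cluster configured!"*)
                echo "Ready after $attempt attempt(s)."
                return 0
                ;;
        esac
        echo "Attempt $attempt: $output"
        sleep "$interval"
        attempt=$((attempt + 1))
    done

    echo "Gave up after $max_attempts attempts." >&2
    return 1
}

# Usage on the front-end node:
#   wait_for_cluster && echo "Front-end is ready"
```

The command to run is passed as a parameter so that the loop can be tried with a stand-in command before it is used on a real cluster.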

Listing the Jupyter Kernels

[cloudadm@ip-193-144-35-156 ~]$ jupyter-kernelspec list
Available kernels:
  python2    /home/cloudadm/anaconda2/lib/python2.7/site-packages/ipykernel/resources
  bash       /home/cloudadm/.local/share/jupyter/kernels/bash
  octave     /home/cloudadm/.local/share/jupyter/kernels/octave
  scilab     /home/cloudadm/.local/share/jupyter/kernels/scilab
  ir         /home/cloudadm/anaconda2/share/jupyter/kernels/ir
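The listing above can also be checked from a script, for example to confirm that the R kernel (registered as ir) is present before starting an analysis. This is a minimal sketch that assumes the two-column layout shown above, with a kernel name followed by an absolute path; kernel_names and has_kernel are hypothetical helper names.

```bash
# Print the kernel names found in `jupyter-kernelspec list` output on stdin.
# A line counts as a kernel entry when its second field is an absolute path,
# which skips the "Available kernels:" header.
kernel_names() {
    awk 'NF >= 2 && $2 ~ /^\// { print $1 }'
}

# Succeed if the kernel named in $1 is listed in the output on stdin.
has_kernel() {
    awk -v name="$1" '$1 == name && $2 ~ /^\// { found = 1 } END { exit !found }'
}

# Usage:
#   jupyter-kernelspec list | kernel_names
#   jupyter-kernelspec list | has_kernel ir && echo "R kernel available"
```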


Starting the Jupyter Notebook

[cloudadm@ip-193-144-35-156 ~]$ jupyter notebook 
Serving notebooks from local directory: /home/cloudadm
0 active kernels 
The Jupyter Notebook is running at: 
https://[all ip addresses on your system]:8888/?token=d3dd9acc2d7667e56a972fe5544f913f2ff3d4f9ba3a8d77

Use Control-C to stop this server and shut down all kernels (twice to skip confirmation). 

Copy/paste this URL into your browser when you connect for the first time, to login with a token:
https://localhost:8888/?token=d3dd9acc2d7667e56a972fe5544f913f2ff3d4f9ba3a8d77

To access the Jupyter Notebook, open the browser at this URL:

https://<YOUR_CLUSTER_IP>:8888/?token=d3dd9acc2d7667e56a972fe5544f913f2ff3d4f9ba3a8d77

as shown in the figure below:


Jupyter 3.png
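The URL printed by the notebook points at localhost, so the host part has to be replaced with the address of the front-end before it is opened in a browser. The substitution can be sketched as follows; notebook_url is a hypothetical helper name, and the function assumes the printed URL has the `//localhost:` form shown above.

```bash
# Rewrite the login URL printed by `jupyter notebook` so that it points at
# the front-end's public address instead of localhost.
#   $1  URL copied from the notebook output
#   $2  public IP address or hostname of the front-end
notebook_url() {
    printf '%s\n' "$1" | sed "s#//localhost:#//$2:#"
}

# Usage (replace the placeholders with your own values):
#   notebook_url 'https://localhost:8888/?token=<YOUR_TOKEN>' <YOUR_CLUSTER_IP>
```

The token part of the URL is left untouched, so it must still be the one printed by your own notebook session.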


Destroy a cluster

To destroy a running cluster, click on the "Manage your deployed clusters" button and select the clusterID you want to destroy.