Jupyter Notebook with EC3
Introduction
This guide is intended for researchers who want to use Jupyter to create and share documents that contain live code, equations, visualizations and explanatory text, on the cloud-based resources provided by EGI. This how-to guide has been tested on the EGI Science Applications on-demand Infrastructure.
The only prerequisite for using the Jupyter Notebook in EGI is that the user must be a member of, or have been granted access to, this infrastructure.
Objectives
In this guide we will show how to:
- Deploy and configure an elastic cluster with Jupyter Notebook (v4.1.1) on top of the EGI FedCloud Infrastructure
- Check the list of available Kernels supported by the notebook
- Start the notebook
- Run some statistical analysis with R
The EC3 architecture
The Elastic Cloud Computing Cluster (EC3) is a framework to create elastic virtual clusters on top of Infrastructure as a Service (IaaS) providers. It is composed of the following components:
- EC3 as a Service (EC3aaS)
For further details about the architecture of the EC3 service, please refer to the official documentation.
Create Jupyter Notebook with EC3
The Jupyter Notebook is an open-source web application that allows users to create and share documents that contain live code, equations, visualizations and explanatory text. It is now available in the EGI Science Applications on-demand Infrastructure, which is accessible through this portal and offers grid, cloud and application services from across the EGI community to individual researchers and small research teams.
To create your Jupyter Notebook, access the front page and click the "Deploy your cluster!" button. A simple wizard guides you through the configuration of the cluster, where you can set details such as the operating system, the characteristics of the nodes, the maximum number of nodes in the cluster and the pre-installed software packages.
When the front-end node of the cluster has been successfully deployed, the user will be notified by the portal with the credentials to access via SSH as shown in the figure below:
The configuration of the cluster may take some time. Please wait for it to complete before you start using the cluster.
Accessing the cluster
With the credentials provided by the EC3 portal, you can access the front-end of the elastic cluster:
[larocca@aktarus EC3]$ ssh -i key.pem cloudadm@<YOUR_CLUSTER_IP>
The authenticity of host '193.144.35.156 (193.144.35.156)' can't be established.
RSA key fingerprint is 78:c9:af:31:70:09:c1:c6:26:cf:9d:ae:14:d1:34:a7.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '193.144.35.156' (RSA) to the list of known hosts.
Last login: Thu Mar 30 16:12:38 2017 from 158.42.105.214
[ASCII-art banner: Jupyter Notebook]
Welcome to the Jupyter Notebook v4.1.1
[..]
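The login step above can be wrapped in a small helper. This is a minimal sketch, not part of the EC3 tooling: `connect_cluster` and the `DRY_RUN` switch are hypothetical names introduced here, and the key path and IP address are whatever the portal gave you. It first restricts the key file permissions, because OpenSSH ignores private keys that other users can read.

```shell
# Connect to the front-end node as the cloudadm user.
# connect_cluster and DRY_RUN are hypothetical helpers of this sketch.
#   $1 path to the private key delivered by the portal (e.g. key.pem)
#   $2 public IP address of the front-end node
connect_cluster() {
    cc_key="$1"
    cc_ip="$2"
    # OpenSSH ignores private keys that other users can read.
    chmod 600 "$cc_key" || return 1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show the command that would be executed.
        echo "ssh -i $cc_key cloudadm@$cc_ip"
    else
        ssh -i "$cc_key" "cloudadm@$cc_ip"
    fi
}
```

A typical call would be `connect_cluster key.pem <YOUR_CLUSTER_IP>`; setting `DRY_RUN=1` beforehand only prints the command.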
The front-end is configured with Ansible, and this process usually takes some time to finish. You can monitor the configuration status of the front-end node by checking for running Ansible processes:
[root@ip-193-144-35-156 cloudadm]# ps auxwww | grep -i ansible
cloudadm 3827 1.5 0.5 160044 20576 ? S 16:26 0:00 python_ansible /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//ctxt_agent.py /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//general_info.cfg /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003/<YOUR_CLUSTER_IP>/config.cfg
cloudadm 2743 4.3 0.6 174640 27256 ? S 16:26 0:01 python_ansible /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//ctxt_agent.py /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//general_info.cfg /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003/<YOUR_CLUSTER_IP>/config.cfg
cloudadm 1234 18.5 0.6 174488 26032 ? Sl 16:26 0:04 python_ansible /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//ctxt_agent.py /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//general_info.cfg /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003/<YOUR_CLUSTER_IP>/config.cfg
cloudadm 4556 1.3 0.7 184412 28636 ? S 16:26 0:00 python_ansible /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//ctxt_agent.py /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003//general_info.cfg /tmp/.im/47845a5c-7hs1-78e6-b568-580000000003/<YOUR_CLUSTER_IP>/config.cfg
root 4578 6.2 2.0 140904 82060 ? S 16:26 0:01 /usr/bin/python /tmp/ansible_fRE4qh/ansible_module_apt.py
root 5467 0.0 0.0 11744 952 pts/0 S+ 16:26 0:00 grep --color=auto -i ansible
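Note that the last line of such a listing is the grep command itself, so the listing is never empty even when Ansible has finished. A minimal sketch of a check that ignores it is shown below; `count_ansible` is a hypothetical helper name, and dropping every line that contains the word "grep" is a simple heuristic rather than an exact filter.

```shell
# Count Ansible-related processes in `ps auxwww`-style text read from stdin.
# count_ansible is a hypothetical helper name used only in this sketch.
count_ansible() {
    # Keep lines mentioning ansible, then drop the grep process itself.
    # Any other line containing the word "grep" is dropped too (a heuristic).
    grep -i 'ansible' | grep -vc 'grep' || true
}

# Usage on the front-end node:
#   ps auxwww | count_ansible
```

A result of 0 suggests that no Ansible process is running any more, although it does not by itself prove that the configuration succeeded.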
Alternatively, use the is_cluster_ready command-line tool:
root@torqueserver:~# is_cluster_ready
Cluster is still configuring.
The front-end node is successfully configured and ready to be used when the command returns the following message:
root@torqueserver:~# is_cluster_ready
Cluster configured!
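Rather than re-running is_cluster_ready by hand, the check can be repeated in a loop until the "Cluster configured!" message appears. The following is a minimal sketch under stated assumptions: `wait_for_cluster`, its arguments and its default values are hypothetical, and the only behaviour taken from this guide is the message text printed by is_cluster_ready.

```shell
# Poll a readiness command until it prints "Cluster configured!".
# Arguments (all hypothetical conveniences of this sketch):
#   $1 command to run (default: is_cluster_ready)
#   $2 seconds to wait between checks (default: 30)
#   $3 maximum number of checks (default: 60)
wait_for_cluster() {
    wfc_cmd="${1:-is_cluster_ready}"
    wfc_interval="${2:-30}"
    wfc_max="${3:-60}"
    wfc_i=1
    while [ "$wfc_i" -le "$wfc_max" ]; do
        # Capture the message; tolerate a non-zero exit status.
        wfc_status="$("$wfc_cmd" 2>&1)" || true
        case "$wfc_status" in
            *"Cluster configured!"*)
                echo "Ready after $wfc_i check(s)."
                return 0
                ;;
        esac
        echo "Check $wfc_i: $wfc_status"
        sleep "$wfc_interval"
        wfc_i=$((wfc_i + 1))
    done
    echo "Gave up after $wfc_max checks." >&2
    return 1
}
```

On the front-end node, `wait_for_cluster` with no arguments would check every 30 seconds for up to 30 minutes; the function returns a non-zero status if the cluster is still not ready by then.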
Listing the Jupyter Kernels
[cloudadm@ip-193-144-35-156 ~]$ jupyter-kernelspec list
Available kernels:
  python2 /home/cloudadm/anaconda2/lib/python2.7/site-packages/ipykernel/resources
  bash /home/cloudadm/.local/share/jupyter/kernels/bash
  octave /home/cloudadm/.local/share/jupyter/kernels/octave
  scilab /home/cloudadm/.local/share/jupyter/kernels/scilab
  ir /home/cloudadm/anaconda2/share/jupyter/kernels/ir
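Before running an analysis with R, it can be useful to confirm that the R kernel (listed as ir above) is present. A minimal sketch follows; `has_kernel` is a hypothetical helper name, and it assumes the listing keeps the "name path" layout shown above.

```shell
# Succeed if the kernel named $1 appears in `jupyter-kernelspec list` output
# supplied on stdin. has_kernel is a hypothetical helper name.
has_kernel() {
    # Kernel lines have the form "<name> <path>"; compare the first field only.
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Usage on the front-end node:
#   jupyter-kernelspec list | has_kernel ir && echo "R kernel available"
```

Comparing the first field exactly avoids false matches on kernel paths that merely contain the name as a substring.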
Starting the Jupyter Notebook
[cloudadm@ip-193-144-35-156 ~]$ jupyter notebook
Serving notebooks from local directory: /home/cloudadm
0 active kernels
The Jupyter Notebook is running at: https://[all ip addresses on your system]:8888/?token=d3dd9acc2d7667e56a972fe5544f913f2ff3d4f9ba3a8d77
Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
Copy/paste this URL into your browser when you connect for the first time, to login with a token:
https://localhost:8888/?token=d3dd9acc2d7667e56a972fe5544f913f2ff3d4f9ba3a8d77
To access the Jupyter Notebook, open the following URL in a browser, as shown in the figure:
https://<YOUR_CLUSTER_IP>:8888/?token=d3dd9acc2d7667e56a972fe5544f913f2ff3d4f9ba3a8d77
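The URL printed by the server refers to localhost, which only works from the front-end node itself, so the host part has to be replaced with the public address of the cluster while the token is kept unchanged. A minimal sketch of that substitution is given here; `notebook_url` is a hypothetical helper name and the address used in the usage line is a documentation placeholder.

```shell
# Rewrite the localhost URL printed by `jupyter notebook` so that it points
# at the cluster's public address. notebook_url is a hypothetical helper.
#   $1 URL as printed by the server (https://localhost:8888/?token=...)
#   $2 public IP address of the front-end node
notebook_url() {
    printf '%s\n' "$1" | sed "s#//localhost:#//$2:#"
}

# Usage (203.0.113.10 stands in for <YOUR_CLUSTER_IP>):
#   notebook_url 'https://localhost:8888/?token=<YOUR_TOKEN>' 203.0.113.10
```

Only the host is rewritten; the port and the token query string pass through as printed by the server.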