Install & Setup notebook (Jupyter/Zeppelin)

With Metatron Discovery, you can analyze various data using ‘Workbook’ and ‘Workbench’.
In addition, for more advanced analysis, it supports integration with third-party notebook applications.

In this post, we will learn how to install the Jupyter and Zeppelin notebook servers.

Jupyter

Install Jupyter through Anaconda. Anaconda is recommended because data analysis requires many Python libraries, and Anaconda bundles most of them.

Anaconda3

  • https://www.anaconda.com/distribution/ (shows the latest version of Anaconda)
  • We need the Python 3.x version. Download the installer and run it, for example:
$ bash ./Anaconda3-2018.12-MacOSX-x86_64.sh

After the installation, install the R kernel (only the Python 3 kernel comes with the package).

$ conda install -c r r --yes
$ conda install -c r r-essentials --yes
$ conda install -c r r-httr
$ conda install -c r r-jsonlite

// if you want to install more packages…

$ conda install -c r r-rserve --yes
$ conda install -c r r-devtools --yes
$ conda install -c r r-rcurl --yes
$ conda install -c r r-RJSONIO --yes
$ conda install -c r r-jpeg --yes
$ conda install -c r r-png --yes

// if you want to update to the latest R packages

$ conda update -c r --all
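
One way to confirm that the R kernel was registered is to list the kernels Jupyter knows about. The sketch below assumes the `jupyter` command is on the PATH and that the R kernel registers under the name `ir` (the IRkernel default); the helper names are illustrative only.

```python
import json
import subprocess


def kernel_names(kernelspec_json):
    """Return the sorted kernel names found in `jupyter kernelspec list --json` output."""
    data = json.loads(kernelspec_json)
    return sorted(data.get("kernelspecs", {}))


def installed_kernels():
    """Ask Jupyter for its registered kernels (requires `jupyter` on the PATH)."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return kernel_names(result.stdout)


# Example use (assumption: the R kernel is registered as "ir"):
#   names = installed_kernels()
#   print("R kernel found:", "ir" in names)
```

If `ir` is missing from the list, the R kernel was most likely not installed into the same conda environment as Jupyter.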

To use the R kernel on Jupyter, install the native libraries, set the links as below, and verify the versions (for CentOS). Each line reads link -> target.

/usr/lib64/libpng12.so.0 -> /usr/lib64/libpng12.so.0.50.0
/usr/lib64/libXrender.so.1 -> /usr/lib64/libXrender.so.1.3.0
/usr/lib64/libXext.so.6 -> /usr/lib64/libXext.so.6.4.0
/usr/lib64/libc.so.6 -> libc-2.17.so
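
The links listed above can also be checked from a script. This is a minimal sketch that reports where each library path resolves, or flags it as missing; the function and list names are illustrative, and the paths are simply the CentOS locations from the list.

```python
import os


def check_links(paths):
    """Map each path to its resolved target, or None if it is missing or broken."""
    report = {}
    for path in paths:
        # os.path.exists follows symlinks, so a dangling link counts as missing.
        report[path] = os.path.realpath(path) if os.path.exists(path) else None
    return report


# Library paths taken from the list above (CentOS locations).
R_KERNEL_LIBS = [
    "/usr/lib64/libpng12.so.0",
    "/usr/lib64/libXrender.so.1",
    "/usr/lib64/libXext.so.6",
    "/usr/lib64/libc.so.6",
]

# Example use:
#   for path, target in check_links(R_KERNEL_LIBS).items():
#       print(path, "->", target or "MISSING")
```

The same helper works for any other list of library paths, such as the ones needed for matplotlib.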

In addition, if you’d like to install deep learning libraries or sparklyr, run the following commands (for CentOS):

$ conda install -c conda-forge tensorflow
$ conda install -c conda-forge keras
$ conda install -c r r-sparklyr

To use matplotlib on Jupyter, install the native libraries, set the links as below, and verify the versions (for CentOS). Each line reads link -> target.

/usr/lib64/libGL.so.1 -> /usr/lib64/libGL.so.1.2.0
/usr/lib64/libxshmfence.so.1 -> /usr/lib64/libxshmfence.so.1.0.0
/usr/lib64/libglapi.so.0 -> /usr/lib64/libglapi.so.0.0.0
/usr/lib64/libXdamage.so.1 -> /usr/lib64/libXdamage.so.1.1.0
/usr/lib64/libXfixes.so.3 -> /usr/lib64/libXfixes.so.3.1.0
/usr/lib64/libXxf86vm.so.1 -> /usr/lib64/libXxf86vm.so.1.0.0

Generate-config

Generate a jupyter-config file for configuring pgcontents.

$ jupyter notebook --generate-config
$ vi /home/metatron/.jupyter/jupyter_notebook_config.py

Open the config file and add the settings below.

c.NotebookApp.notebook_dir = '/user/Metatron/jupyter'  # common config

# The notebook server connected to Discovery is assumed to run without authentication.
c.NotebookApp.allow_origin = '*'
c.NotebookApp.disable_check_xsrf = True
c.NotebookApp.token = ''

# listen on all interfaces, not only localhost
c.NotebookApp.ip = '0.0.0.0'

Custom Packages

pymetis

A utility package for Python-kernel used by metatron on Jupyter.

$ git clone https://github.com/metatron-app/discovery-jupyter-py-utils.git
$ cd discovery-jupyter-py-utils/

$ python setup.py sdist
$ pip uninstall pymetis

$ cp dist/pymetis-0.0.3.tar.gz {ANACONDA_HOME}/pkgs/

// replace x.x.x with the built version (current ver. 0.0.3)
$ pip install {ANACONDA_HOME}/pkgs/pymetis-x.x.x.tar.gz
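
After installing, a quick import check can confirm that the package landed in the environment Jupyter uses. This is a minimal sketch; it assumes the module is importable under the name `pymetis`, matching the package name, and the helper name is illustrative.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Example use (assumption: the import name matches the package name):
#   print("pymetis available:", is_importable("pymetis"))
```

Run the check with the same Python interpreter that backs the Jupyter kernel; a different interpreter may see a different set of packages.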

RMetis

A utility package for R-kernel used by metatron on Jupyter.

$ git clone https://github.com/metatron-app/discovery-jupyter-r-utils
$ cd discovery-jupyter-r-utils

// {SOURCE_DIR} is this source directory; a relative or absolute path is fine, e.g. /home/metatron/discovery-jupyter-r-utils
$ R CMD build {SOURCE_DIR}

$ cp RMetis_0.0.3.tar.gz ${ANACONDA_HOME}/pkgs/

// replace x.x.x with the built version (current ver. 0.0.3)
$ R CMD INSTALL --no-multiarch ${ANACONDA_HOME}/pkgs/RMetis_x.x.x.tar.gz

Run

When all the above configurations are done, start the Jupyter process with the commands below. After that, connect to http://localhost:8888 and check if everything works fine.

If needed, you can change the port in ~/.jupyter/jupyter_notebook_config.py
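
As an illustration, a port override in that config file might look like the fragment below. The value 8889 is only an example; `c.NotebookApp.port` is the standard Notebook setting and defaults to 8888.

```python
# ~/.jupyter/jupyter_notebook_config.py
# `c` is the config object Jupyter provides when it loads this file.
c.NotebookApp.port = 8889  # example value only; the default is 8888
```

If you change the port, use the new port in the URL when you connect to the server.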

$ mkdir {ANACONDA_HOME}/logs
$ nohup jupyter notebook >> {ANACONDA_HOME}/logs/jupyter.log 2>&1 &
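
To check from a script that the server came up, a small HTTP probe is enough. This is a minimal sketch using only the standard library; the function name is illustrative, and the URL assumes the default port 8888 mentioned above.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=5.0):
    """Return True if an HTTP request to `url` gets any response from a server."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


# Example use:
#   print("Jupyter up:", is_reachable("http://localhost:8888"))
```

If the probe fails, the log file written by the nohup command is the first place to look.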

Set a Spark directory

To execute scripts created with Jupyter as an API, you need to install Spark on the same server as Metatron (it runs as the Spark driver node).

After installation, set a directory in the METATRON_SPARK_HOME environment variable.

$ vi conf/metatron-env.sh
export METATRON_JAVA_OPTS="-Dspark.home.dir={SPARK_HOME}"
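
Before pointing Metatron at a Spark installation, it can help to confirm that the directory really is a Spark home. This is a minimal sketch, assuming the standard Spark binary distribution layout with `bin/spark-submit`; the function name is illustrative.

```python
import os


def looks_like_spark_home(path):
    """Return True if `path` contains an executable bin/spark-submit."""
    submit = os.path.join(path, "bin", "spark-submit")
    return os.path.isfile(submit) and os.access(submit, os.X_OK)


# Example use (replace the path with your own {SPARK_HOME}):
#   print(looks_like_spark_home("/home/metatron/servers/spark-2.2.0-bin-hadoop2.7"))
```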

 

 


Zeppelin

Download and extract the installer from the link below.

Install

Download the binary package from the Zeppelin home page, http://zeppelin.apache.org/download.html, and extract it. (You can follow the install guide on the Zeppelin home page.)

Custom Packages

Discovery-interpreter

A utility package for Spark-interpreter used by metatron on Zeppelin.

$ git clone https://github.com/metatron-app/discovery-zeppelin-interpreter.git
$ cd discovery-zeppelin-interpreter
$ mvn clean package -P prod -P spark-2.2 -DskipTests
// to build for another Spark version, use "-Dspark.version={spark version}" instead of "-P spark-2.2"
$ cp target/discovery-zeppelin-interpreter-{spark.version}-1.0.0.jar {ZEPPELIN_HOME}/lib/interpreter

Run

When all the above configurations are done, start the Zeppelin process with the command below. After that, connect to http://localhost:8080 and check if everything works fine.

If needed, you can change the port in conf/zeppelin-site.xml

$ {ZEPPELIN_HOME}/bin/zeppelin-daemon.sh start

(optional) run in yarn-client mode

If you want to run the Zeppelin Spark interpreter’s master in yarn-client mode, you need to install and set up the Zeppelin, Spark, and Hadoop configuration.

See https://zeppelin.apache.org/docs/0.7.3/install/yarn_install.html

$ vi {ZEPPELIN_HOME}/conf/zeppelin-env.sh
  
 export MASTER=yarn-client
 export SPARK_HOME=/home/metatron/servers/spark-2.2.0-bin-hadoop2.7
 export HADOOP_CONF_DIR=/home/metatron/servers/hadoop-2.7.2/etc/hadoop
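
A typo in one of these paths usually only shows up when the interpreter fails to start, so a quick check of the file can save time. This is a minimal sketch that reads simple `export KEY=VALUE` lines and reports which directories do not exist; the function names are illustrative, and shell variables inside values (such as `$HOME`) are not expanded.

```python
import os
import shlex


def read_exports(text):
    """Collect simple `export KEY=VALUE` assignments from shell script text."""
    exports = {}
    for line in text.splitlines():
        # comments=True drops anything after a '#' on the line.
        tokens = shlex.split(line, comments=True)
        if len(tokens) == 2 and tokens[0] == "export" and "=" in tokens[1]:
            key, value = tokens[1].split("=", 1)
            exports[key] = value
    return exports


def missing_dirs(exports, keys=("SPARK_HOME", "HADOOP_CONF_DIR")):
    """Return the keys whose value is unset or not an existing directory."""
    return [k for k in keys if not os.path.isdir(exports.get(k, ""))]


# Example use (replace the path with your own {ZEPPELIN_HOME}):
#   with open("/path/to/zeppelin/conf/zeppelin-env.sh") as f:
#       print("missing:", missing_dirs(read_exports(f.read())))
```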

(optional) run with R interpreter

from https://zeppelin.apache.org/docs/0.7.3/interpreter/r.html

To run Zeppelin with the R Interpreter, the SPARK_HOME environment variable must be set. The best way to do this is by editing conf/zeppelin-env.sh. If it is not set, the R Interpreter will not be able to interface with Spark. You should also copy conf/zeppelin-site.xml.template to conf/zeppelin-site.xml. That will ensure that Zeppelin sees the R Interpreter the first time it starts up.
