Install & Setup notebook (Jupyter/Zeppelin)


With Metatron Discovery, you can analyze various data using ‘Workbook’ and ‘Workbench’.
In addition, for more advanced analysis, it supports interconnection with third-party notebook applications.

In this post, we will learn how to install the Jupyter and Zeppelin Notebook server.


Install Jupyter through Anaconda. Anaconda is recommended because data analysis requires many Python libraries.


  • Download the latest version of Anaconda.
  • Python 3.x is required.
$ ~/

After the installation, install R-kernel. (Only Python3-kernel comes with the package)

$ conda install -c r r --yes
$ conda install -c r r-essentials --yes
$ conda install -c r r-httr
$ conda install -c r r-jsonlite

# if you want to install more packages…

$ conda install -c r r-rserve --yes
$ conda install -c r r-devtools --yes
$ conda install -c r r-rcurl --yes
$ conda install -c r r-RJSONIO --yes
$ conda install -c r r-jpeg --yes
$ conda install -c r r-png --yes

# if you want to update to the latest R packages

$ conda update -c r --all

To use the R-kernel on Jupyter, install the native libraries, set the links as below, and verify the versions (for CentOS).

$ /usr/lib64/ -> /usr/lib64/
$ /usr/lib64/ -> /usr/lib64/
$ /usr/lib64/ -> /usr/lib64/
$ /usr/lib64/ ->
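The library names in the link listing above were lost in formatting, but each entry follows the same pattern: symlink an installed library to the soname the consumer expects. A generic sketch with throwaway placeholder files (`libexample` is not a real library, and the temp directory stands in for `/usr/lib64`):

```shell
# Create and verify a symlink, using placeholder files in a temp dir.
d=$(mktemp -d)
touch "$d/libexample.so.6"                         # stand-in for an installed library
ln -s "$d/libexample.so.6" "$d/libexample.so.5"    # soname the consumer expects
readlink "$d/libexample.so.5"                      # verify where the link points
```

In a real setup the link would be created under `/usr/lib64` and verified the same way with `readlink` or `ls -l`.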

In addition, if you’d like to install deep learning libraries or sparklyr, run the following commands (for CentOS):

$ conda install -c conda-forge tensorflow
$ conda install -c conda-forge keras
$ conda install -c r r-sparklyr

To use matplotlib on Jupyter, install the native libraries, set the links as below, and verify the versions (for CentOS).

$ /usr/lib64/ -> /usr/lib64/
$ /usr/lib64/ -> /usr/lib64/
$ /usr/lib64/ -> /usr/lib64/
$ /usr/lib64/ -> /usr/lib64/
$ /usr/lib64/ -> /usr/lib64/
$ /usr/lib64/ -> /usr/lib64/


Generate a jupyter-config file for configuring pgcontents.

$ jupyter notebook --generate-config
$ vi /home/metatron/.jupyter/

Open the config file and add the codes below.

c.NotebookApp.notebook_dir = '/user/Metatron/jupyter'  # common config

# It is assumed that the notebook server connected with Discovery does not use authentication.
c.NotebookApp.allow_origin = '*'
c.NotebookApp.disable_check_xsrf = True
c.NotebookApp.token = ''
# no localhost
c.NotebookApp.ip = ''
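These settings can also be applied non-interactively by appending them to the generated config file. The path below is a placeholder; use the file that `jupyter notebook --generate-config` actually reports:

```shell
# Append the Discovery-related settings to the notebook config file.
CONFIG="${CONFIG:-$(mktemp -d)/jupyter_notebook_config.py}"  # placeholder path
cat >> "$CONFIG" <<'EOF'
c.NotebookApp.notebook_dir = '/user/Metatron/jupyter'
c.NotebookApp.allow_origin = '*'
c.NotebookApp.disable_check_xsrf = True
c.NotebookApp.token = ''
c.NotebookApp.ip = ''
EOF
grep -c '^c\.NotebookApp' "$CONFIG"   # quick sanity check: 5 settings written
```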

Custom Packages


A utility package for the Python kernel used by Metatron on Jupyter.

$ git clone
$ cd discovery-jupyter-py-utils/

$ python setup.py sdist
$ pip uninstall pymetis

$ cp dist/pymetis-0.0.3.tar.gz {ANACONDA_HOME}/pkgs/

$ pip install {ANACONDA_HOME}/pkgs/pymetis-x.x.x.tar.gz  # current ver. 0.0.3
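Since the tarball name embeds the version, a small glob avoids hardcoding it. A sketch, assuming `ANACONDA_HOME` and the `pkgs/` layout from the steps above:

```shell
# Locate whichever pymetis sdist is under the pkgs dir, regardless of version.
ANACONDA_HOME="${ANACONDA_HOME:-$HOME/anaconda3}"   # assumption: your Anaconda root
TARBALL=$(ls "$ANACONDA_HOME"/pkgs/pymetis-*.tar.gz 2>/dev/null | head -n 1)
if [ -n "$TARBALL" ]; then
  echo "would install: $TARBALL"    # e.g. pip install "$TARBALL"
else
  echo "no pymetis tarball found under $ANACONDA_HOME/pkgs"
fi
```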


A utility package for the R kernel used by Metatron on Jupyter.

$ git clone

$ cd discovery-jupyter-r-utils
$ R CMD build ${SOURCE_DIR}  # relative or absolute path, e.g. /home/metatron/discovery-jupyter-r-utils

$ cp RMetis_0.0.3.tar.gz ${ANACONDA_HOME}/pkgs/
$ R CMD INSTALL --no-multiarch ${ANACONDA_HOME}/pkgs/RMetis_x.x.x.tar.gz  # current ver. 0.0.3


When all the above configurations are done, start the Jupyter process with the commands below. After that, connect to http://localhost:8888 and check if everything works fine.

If needed, you can change the port in ~/.jupyter/.

$ mkdir {ANACONDA_HOME}/logs
$ nohup jupyter notebook >> {ANACONDA_HOME}/logs/jupyter.log 2>&1 &
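The two commands above can be wrapped into a small start script that also records the process ID for a later shutdown. In this sketch, `sleep 0` stands in for `jupyter notebook` so the snippet is safe to run as-is, and the `mktemp` fallback for `ANACONDA_HOME` is only a placeholder:

```shell
# Sketch: start the notebook in the background with logging and a PID file.
ANACONDA_HOME="${ANACONDA_HOME:-$(mktemp -d)}"   # assumption: your Anaconda root
NOTEBOOK_CMD="sleep 0"                           # replace with: jupyter notebook
mkdir -p "$ANACONDA_HOME/logs"
nohup $NOTEBOOK_CMD >> "$ANACONDA_HOME/logs/jupyter.log" 2>&1 &
echo $! > "$ANACONDA_HOME/logs/jupyter.pid"      # keep the PID for later shutdown
wait                                             # not needed in real use; lets the sketch finish cleanly
```

Later, the server can be stopped with `kill "$(cat "$ANACONDA_HOME/logs/jupyter.pid")"`.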

Set a Spark directory

To execute the scripts created with Jupyter as an API, you need to install Spark on the same server as Metatron. (It runs as a Spark driver node.)

After installation, set a directory in the METATRON_SPARK_HOME environment variable.

$ conf/
export METATRON_JAVA_OPTS="-Dspark.home.dir={SPARK_HOME}"
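For example, the variable can be built from an existing `SPARK_HOME` and checked before restarting Metatron (the Spark path below is illustrative; use your own install location):

```shell
# Example values only; point SPARK_HOME at your actual Spark install.
SPARK_HOME="${SPARK_HOME:-/home/metatron/servers/spark-2.2.0-bin-hadoop2.7}"
export METATRON_JAVA_OPTS="-Dspark.home.dir=${SPARK_HOME}"
echo "$METATRON_JAVA_OPTS"   # verify the option before restarting
```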




Download the binary package from the Zeppelin homepage and extract it. (You can follow the install guide on the Zeppelin homepage.)

Custom Packages


A utility package for the Spark interpreter used by Metatron on Zeppelin.

$ git clone
$ mvn clean package -P prod -P spark-2.2 -DskipTests  # use "-Dspark.version=${spark version}" instead of -P spark-2.2
$ cp target/discovery-zeppelin-interpreter-{spark.version}-1.0.0.jar {ZEPPELIN_HOME}/lib/interpreter


When all the above configurations are done, start the Zeppelin process with the command below. After that, connect to http://localhost:8080 and check if everything works fine.

If you need, you can change the port in conf/zeppelin-site.xml

$ {ZEPPELIN_HOME}/bin/ start

(optional) run in yarn-client mode

If you want to run the Zeppelin Spark interpreter’s master in yarn-client mode, you need to install and set up the Zeppelin-Spark-Hadoop configuration.


$ vi {ZEPPELIN_HOME}/conf/
export MASTER=yarn-client
export SPARK_HOME=/home/metatron/servers/spark-2.2.0-bin-hadoop2.7
export HADOOP_CONF_DIR=/home/metatron/servers/hadoop-2.7.2/etc/hadoop
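Before restarting the interpreter, it can help to verify that the exported paths actually exist. A small convenience helper (`check_dirs` is a name invented here, not part of Zeppelin):

```shell
# Report whether each given directory exists.
check_dirs() {
  for d in "$@"; do
    if [ -d "$d" ]; then echo "ok: $d"; else echo "missing: $d"; fi
  done
}

# The /tmp defaults are only so the sketch runs standalone.
check_dirs "${SPARK_HOME:-/tmp}" "${HADOOP_CONF_DIR:-/tmp}"
```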

(optional) run with R interpreter


To run Zeppelin with the R interpreter, the SPARK_HOME environment variable must be set. The best way to do this is by editing conf/. If it is not set, the R interpreter will not be able to interface with Spark. You should also copy conf/zeppelin-site.xml.template to conf/zeppelin-site.xml, which ensures that Zeppelin sees the R interpreter the first time it starts up.
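The copy step above can be scripted like this. The `mktemp` fallback and the `touch` are only there so the sketch is safe to run outside a real Zeppelin install; in practice, point ZEPPELIN_HOME at your extracted package:

```shell
# Stage zeppelin-site.xml from its template (without overwriting an existing one).
ZEPPELIN_HOME="${ZEPPELIN_HOME:-$(mktemp -d)}"   # placeholder: your Zeppelin root
mkdir -p "$ZEPPELIN_HOME/conf"
[ -f "$ZEPPELIN_HOME/conf/zeppelin-site.xml.template" ] || \
  touch "$ZEPPELIN_HOME/conf/zeppelin-site.xml.template"   # stand-in for the shipped template
cp -n "$ZEPPELIN_HOME/conf/zeppelin-site.xml.template" "$ZEPPELIN_HOME/conf/zeppelin-site.xml"
ls "$ZEPPELIN_HOME/conf"
```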
