In our last tutorial, we had some brief introduction to Apache Spark. Now, in this tutorial we will have a look into how to setup an environment to work with Apache Spark. To get started, we first need to install Docker. If you don’t have it yet, find out how to install it from this link: https://docs.docker.com/install/
Once your docker is installed successfully, download the container for Spark via Kitematic. Select “all-spark-notebook” for our samples. Note that the download will take a while.
Once your download has finished, it is about time to start your Docker container. When you download the container via Kitematic, it will be started by default. Within the container logs, you can see the URL and port to which Jupyter is mapped. Open the URL and enter the Token. When everything works as expected, you can now create new Notebooks in Jupyter.