# GPU-Jupyter
#### Leverage Jupyter Notebooks with the power of your NVIDIA GPU and perform GPU calculations using TensorFlow and PyTorch in collaborative notebooks.

![Jupyterlab Overview](/extra/jupyterlab-overview.png)

First of all, thanks to [hub.docker.com/u/jupyter](https://hub.docker.com/u/jupyter)
for creating and maintaining a robust Python, R and Julia toolstack for Data Analytics/Science
applications. This project uses the NVIDIA CUDA image as its base image and installs their
toolstack on top of it to enable GPU calculations in the Jupyter notebooks.

## Contents

1. [Requirements](#requirements)
2. [Quickstart](#quickstart)
3. [Tracing](#tracing)
4. [Deployment](#deployment-in-the-docker-swarm)
5. [Configuration](#configuration)
6. [Issues and Contributing](#issues-and-contributing)


## Requirements

1.  Install [Docker](https://www.docker.com/community-edition#/download) version **1.10.0+**
 and [Docker Compose](https://docs.docker.com/compose/install/) version **1.6.0+**.
2.  An NVIDIA GPU.
3.  Get access to your GPU via the CUDA drivers; this
    [medium article](https://medium.com/@christoph.schranz/set-up-your-own-gpu-based-jupyterlab-e0d45fcacf43)
    walks through the setup.
    The CUDA toolkit is not required on the host system, as it will be deployed
    in [NVIDIA-docker](https://github.com/NVIDIA/nvidia-docker).
4. Clone the repository or pull the image from
    [Dockerhub](https://hub.docker.com/repository/docker/cschranz/gpu-jupyter):
    ```bash
    git clone https://github.com/iot-salzburg/gpu-jupyter.git
    cd gpu-jupyter
    ```

## Quickstart

First, generate the `Dockerfile` based on the latest toolstack of
[hub.docker.com/u/jupyter](https://hub.docker.com/u/jupyter).
As soon as you have access to your GPU locally (this can be tested with TensorFlow or PyTorch
directly on the host node), you can run these commands to build the image and start the
Jupyter notebook with plain Docker:

  ```bash
  ./generate_Dockerfile.sh
  docker build -t gpu-jupyter .build/
  docker run -d -p [port]:8888 gpu-jupyter
  ``` 

Alternatively, configure the environment in `docker-compose.yml` and run the following
to deploy *GPU-Jupyter*; the script uses docker-compose under the hood:

  ```bash
  ./generate_Dockerfile.sh
  ./start-local.sh -p 8888  # where -p stands for the port of the service
  ```
  
Both options will run *GPU-Jupyter* by default on [localhost:8888](http://localhost:8888) with the default 
password `asdf`.


## Tracing
  
With these commands we can see if everything worked well:
```bash
docker ps
docker logs [service-name]
```
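
If you do not know the container name, the two commands above can be combined. The following is a minimal sketch, assuming the image was tagged `gpu-jupyter` as in the Quickstart; `show_container_logs` is a hypothetical helper name, and containers started via `./start-local.sh` may use a different image name, in which case pass that name as the first argument.

```bash
# Sketch: find the first running container created from an image and print
# the tail of its logs. show_container_logs is a hypothetical helper.
show_container_logs() {
  local image="${1:-gpu-jupyter}"   # image name; assumption taken from the Quickstart tag
  local tail_lines="${2:-20}"       # number of log lines to show
  local name
  # "ancestor" filters containers by the image they were created from.
  name="$(docker ps --filter "ancestor=${image}" --format '{{.Names}}' | head -n 1)"
  if [ -z "$name" ]; then
    echo "no running container found for image '${image}'" >&2
    return 1
  fi
  docker logs --tail "$tail_lines" "$name"
}
```

Call it as `show_container_logs gpu-jupyter 50` to see the last 50 log lines.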

In order to stop the local deployment, run:

  ```bash
  ./stop-local.sh
  ```
 
 
## Deployment in the Docker Swarm
 
A Jupyter instance often requires data from other services. 
If that data source runs in a Docker container and its ports should not be published, e.g., for security reasons,
then connecting it to *GPU-Jupyter* within a Docker Swarm is a good option.

### Set up Docker Swarm and Registry

This step requires a running [Docker Swarm](https://www.youtube.com/watch?v=x843GyFRIIY) on a cluster or at least on this node.
In order to register custom images in a local Docker Swarm cluster, 
a registry instance must be deployed in advance.
Note that we are using port 5001, as many services use the default port 5000.

```bash
sudo docker service create --name registry --publish published=5001,target=5000 registry:2
curl 127.0.0.1:5001/v2/
```
This should output `{}`.

Afterwards, check if the registry service is available using `docker service ls`.
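
The registry may need a few seconds before it answers. The following sketch polls the endpoint from above until it responds; `wait_for_registry` is a hypothetical helper, and the retry count and delay are arbitrary defaults.

```bash
# Sketch: poll the registry endpoint until it answers or the retries run out.
# wait_for_registry is a hypothetical helper; port 5001 is the one published above.
wait_for_registry() {
  local url="${1:-127.0.0.1:5001/v2/}"
  local retries="${2:-10}"
  local delay="${3:-2}"   # seconds between attempts
  local i=0
  while [ "$i" -lt "$retries" ]; do
    # -f: treat HTTP errors as failure, -s: no progress output
    if curl -fs "$url" > /dev/null; then
      echo "registry is reachable at ${url}"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "registry did not respond at ${url}" >&2
  return 1
}
```

Running `wait_for_registry` without arguments checks `127.0.0.1:5001/v2/` up to ten times.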


### Configure the shared Docker network

Additionally, *GPU-Jupyter* is connected to the data source via the same *docker-network*. Therefore, this network must be set to **attachable** in the source's `docker-compose.yml`:

```yml
services:
  data-source-service:
    # ...
    networks:
      - default
      - datastack
  # ...
networks:
  datastack:
    driver: overlay
    attachable: true
```
 In this example, 
 * the docker stack was deployed in Docker swarm with the name **elk** (`docker stack deploy ... elk`),
 * the docker network has the name **datastack** within the `docker-compose.yml` file,
 * this network is configured to be attachable in the `docker-compose.yml` file,
 * and the deployed docker network therefore has the full name **elk_datastack** (stack name, underscore, network name), see the following output:
    ```bash
    sudo docker network ls
    # ...
    # [UID]        elk_datastack                   overlay             swarm
    # ...
    ```
  The docker network name **elk_datastack** is used in the next step as a parameter.
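
The naming rule and the attachable setting from this example can be checked from the shell. This is a minimal sketch; `swarm_network_name` and `network_is_attachable` are hypothetical helpers, and depending on your setup the `docker` call may need `sudo`.

```bash
# Sketch: derive the full network name of a stack and check that it is attachable.
# Both function names are hypothetical helpers, not part of this repository.
swarm_network_name() {
  # Docker prefixes networks of a stack with "<stack>_".
  printf '%s_%s\n' "$1" "$2"
}

network_is_attachable() {
  # Succeeds only if the network's "Attachable" field is true.
  [ "$(docker network inspect --format '{{.Attachable}}' "$1")" = "true" ]
}
```

For the example above, `network_is_attachable "$(swarm_network_name elk datastack)"` returns successfully only if **elk_datastack** exists and is attachable.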
   
### Start GPU-Jupyter in Docker Swarm

Finally, *GPU-Jupyter* can be deployed in the Docker Swarm with the shared network, using:

```bash
./generate_Dockerfile.sh
./add-to-swarm.sh -p [port] -n [docker-network] -r [registry-port]
# e.g. ./add-to-swarm.sh -p 8848 -n elk_datastack -r 5001
```
where:
* **-p:** port specifies the port on which the service will be available.
* **-n:** docker-network is the name of the attachable network from the previous step, e.g., here it is **elk_datastack**.
* **-r:** registry port is the port that is published by the registry service, see [Set up Docker Swarm and Registry](#set-up-docker-swarm-and-registry).

Now, *GPU-Jupyter* will be accessible on [localhost:8848](http://localhost:8848) with the default password `asdf` and shares the network with the data source, i.e., all ports of the data source will be accessible within *GPU-Jupyter*, even if they aren't published in the source's `docker-compose` file.

Check if everything works well using:
```bash
sudo docker service ps gpu_gpu-jupyter
```

In order to remove the service from the swarm, use:
```bash
./remove-from-swarm.sh
```

## Configuration

Please set a new password in `src/jupyter_notebook_config.json`.
To do so, hash your password in the form (password)(salt) using a sha1 hash generator, e.g., the sha1 generator of [sha1-online.com](http://www.sha1-online.com/).
For the default password `asdf` and the arbitrary salt `e49e73b0eb0e`, the input is `asdfe49e73b0eb0e`, which should yield the hash string shown in the config below.
**Never give away your own unhashed password!**

Then update the config file as shown below and restart the service.

```json
{
  "NotebookApp": {
    "password": "sha1:e49e73b0eb0e:32edae7a5fd119045e699a0bd04f90819ca90cd6"
  }
}
```
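
The hash can also be computed locally, so the password never leaves your machine. The following is a minimal sketch of the (password)(salt) scheme described above, assuming GNU coreutils (`sha1sum`, `od`) are installed; `hash_notebook_password` is a hypothetical helper name.

```bash
# Sketch: build the "sha1:<salt>:<hash>" string for jupyter_notebook_config.json.
# hash_notebook_password is a hypothetical helper; assumes sha1sum and od exist.
hash_notebook_password() {
  local password="$1"
  local salt="$2"
  local digest
  # Generate a random 12-character hex salt if none was passed.
  if [ -z "$salt" ]; then
    salt="$(head -c 6 /dev/urandom | od -An -tx1 | tr -d ' \n')"
  fi
  # sha1 over the concatenation (password)(salt), without a trailing newline.
  digest="$(printf '%s%s' "$password" "$salt" | sha1sum | cut -d ' ' -f 1)"
  printf 'sha1:%s:%s\n' "$salt" "$digest"
}
```

According to the description above, `hash_notebook_password asdf e49e73b0eb0e` should reproduce the default entry. Keep in mind that a password typed on the command line may end up in your shell history.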

### Update CUDA to another version

To update CUDA to another version, change the following line in `Dockerfile.header`:

    FROM nvidia/cuda:10.1-base-ubuntu18.04
    
and the following line in `Dockerfile.pytorch`:

    cudatoolkit=10.1

Then re-generate and re-run the image, as described above:

```bash
./generate_Dockerfile.sh
./start-local.sh -p [port]
```
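
Both edits can be scripted. This is a sketch under assumptions: `set_cuda_version` is a hypothetical helper, the directory holding `Dockerfile.header` and `Dockerfile.pytorch` is passed as an argument because its location depends on your checkout, and the chosen version must exist both as an `nvidia/cuda` image tag and as a `cudatoolkit` package.

```bash
# Sketch: replace the CUDA version in both Dockerfile fragments.
# set_cuda_version is a hypothetical helper; sed keeps a .bak copy of each file.
set_cuda_version() {
  local dir="$1"       # directory containing the two Dockerfile fragments
  local version="$2"   # e.g. the version used in the base image tag
  # Rewrite the base image tag, keeping the "-base-ubuntu..." suffix.
  sed -i.bak "s|^FROM nvidia/cuda:[0-9.]*-|FROM nvidia/cuda:${version}-|" \
    "${dir}/Dockerfile.header"
  # Rewrite the matching cudatoolkit version for PyTorch.
  sed -i.bak "s|cudatoolkit=[0-9.]*|cudatoolkit=${version}|" \
    "${dir}/Dockerfile.pytorch"
}
```

Call it as `set_cuda_version [directory] [version]` before re-generating the `Dockerfile`.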

## Issues and Contributing

* Please report problems by [filing a new issue](https://github.com/iot-salzburg/gpu-jupyter/issues/new).
* You can contribute by opening a [pull request](https://help.github.com/articles/using-pull-requests/).