Run a DNN with Keras on Google Cloud
This post is intended primarily for the students of my DLAI course, although I think it may be of interest to other students too. In this blog I will share the teaching material that I am preparing for the part of the DLAI course that covers the basic principles of Deep Learning from a computational perspective.
VGG-19 is a deep convolutional network for object recognition developed and trained by Oxford’s renowned Visual Geometry Group (VGG), which achieved very good performance on the ImageNet dataset. For details, see the paper by Karen Simonyan and Andrew Zisserman: Very Deep Convolutional Networks for Large-Scale Image Recognition.
The CIFAR-10 dataset consists of 60000 32×32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.
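As a rough illustration of the layout described above, the following sketch reads one batch file and counts the images per class. It assumes the Python version of the dataset, in which each batch file is a pickled dictionary holding the pixel data and a list of labels. The file path and the helper names load_batch and class_counts are hypothetical, and unpickling the real files requires NumPy to be installed.

```python
import pickle
from collections import Counter


def load_batch(path):
    """Load one CIFAR-10 batch file (Python version of the dataset)."""
    with open(path, "rb") as f:
        # The files were pickled under Python 2, so the dictionary keys
        # come back as bytes when loaded with encoding="bytes".
        batch = pickle.load(f, encoding="bytes")
    return batch[b"data"], batch[b"labels"]


def class_counts(labels):
    """Return a dict mapping each class index to its number of images."""
    return dict(Counter(labels))


# Example usage (hypothetical path; adjust it to where the dataset lives):
# data, labels = load_batch("cifar-10-batches-py/data_batch_1")
# print(len(labels), class_counts(labels))
```

Running this on each of the five training batches should show slightly different counts per class in each batch, while the totals across the five batches add up to 5000 per class.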
In this lab, we will train the VGG-19 network using the CIFAR-10 dataset with Keras.
Task 1: Create a Google Cloud account with $300 promotional credit and create a new project
Task 2: Ask for a quota increase
Increase your Compute Engine Nvidia K80 quota here. Remember the region for which you requested the increase; you will need it later.
Task 3: Create a firewall rule
Go to the firewall rules page and add a new firewall rule following this example:
This rule allows our virtual machines to accept incoming connections to Jupyter Notebook (port 8888) and TensorBoard (port 6006).
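Once the VM is running and a service is listening, a quick way to confirm that the firewall rule works is to try a TCP connection from your own computer. The sketch below is only an illustration: the helper name port_is_open is hypothetical, and EXTERNAL_IP stands for the external IP of your VM.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example usage (replace EXTERNAL_IP with your VM's external IP):
# for port in (8888, 6006):
#     print(port, port_is_open("EXTERNAL_IP", port))
```

Note that a port only appears open while Jupyter or TensorBoard is actually running on the VM; a False result before that point does not mean the firewall rule is wrong.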
Task 4: Create a virtual machine
Go to the compute engine instances page and create a new virtual machine with the following specs:
You have to select the same region as the quota increase.
Link your virtual machine to the previous firewall rule by adding the same network tag:
Task 5: Access your VM
Open the SSH web console by clicking the SSH button. On your VM, run the following commands to install all the necessary dependencies for this lab:
wget https://raw.githubusercontent.com/jorditorresBCN/dlaimet/master/cloud_install.sh
sh cloud_install.sh
This script will take a while. At the end, it will ask you to create a password; you will need this password to access Jupyter. Once the script has finished, activate the virtual environment and start Jupyter:
source ~/venv-tf/bin/activate
jupyter notebook --ip=0.0.0.0
Then open your browser and go to the external IP of your server on port 8888, for example: http://188.8.131.52:8888
Task 6: Run Keras
Run the Keras script following these instructions:
First, use Jupyter in your browser to open the dlaimet/keras/vgg-book file. Run all the blocks in order to check your Keras installation.
The output should be something like:
ERROR: THERE AREN'T LOCAL FILES, IF YOU WANT TO DOWNLOAD THE DATASET, SET dirname TO None. [Errno 2] No such file or directory: '/app/data/keras/cifar-10/data_batch_1' ...
As the message indicates, set the variable “dirname” to None so that the dataset is downloaded, and re-run the code.
Task 7: Compare with your computer
Using the Docker environment on your computer from past labs:
- Try to run the same file on your computer.
- Compare the remaining times reported on Google Cloud and on your computer.
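If you prefer a measured number to the remaining-time estimate, one option is to time the same training call on both machines and compute the ratio. This is a minimal sketch under that assumption; the helpers timed and speedup are hypothetical names, not part of the lab notebook.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def speedup(baseline_seconds, other_seconds):
    """Return how many times faster 'other' is than 'baseline'."""
    return baseline_seconds / other_seconds


# Example usage: wrap the training call from the notebook on each machine.
# results = {}
# with timed("one_epoch", results):
#     model.fit(...)  # the training call from the notebook
# print(results["one_epoch"])
#
# Then compare the two measurements, e.g. speedup(laptop_secs, cloud_secs).
```

For a fair comparison, time the same number of epochs with the same batch size on both machines.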
If you want, you can use TensorBoard: you only have to add the Keras TensorBoard callback to the code. Then open another SSH web window and run TensorBoard. It will be available on port 6006.
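Adding the callback could look roughly like the sketch below. It assumes Keras is used through TensorFlow (with standalone Keras the import would be from keras.callbacks instead); the helper names, the logs directory and the variables x_train and y_train are hypothetical and should be adapted to the notebook.

```python
import os
import time


def run_log_dir(base_dir="logs", run_name=None):
    """Return a per-run log directory so each run shows up separately."""
    if run_name is None:
        run_name = time.strftime("run-%Y%m%d-%H%M%S")
    return os.path.join(base_dir, run_name)


def make_tensorboard_callback(log_dir):
    """Create the Keras TensorBoard callback writing to log_dir."""
    # Assumption: Keras is imported through TensorFlow. The import is kept
    # inside the function so the helper above works without TensorFlow.
    from tensorflow.keras.callbacks import TensorBoard
    return TensorBoard(log_dir=log_dir)


# Example usage inside the notebook (model and data come from the lab code):
# tb = make_tensorboard_callback(run_log_dir())
# model.fit(x_train, y_train, callbacks=[tb])
#
# Then, in the second SSH window:
#   tensorboard --logdir logs --port 6006
```

Depending on the TensorBoard version, the server may listen only on localhost by default; in that case check tensorboard --help for the option that makes it listen on all interfaces, as --ip=0.0.0.0 does for Jupyter.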
Task 8: Lab Report
Write a lab report with a brief explanation of the previous tasks. Follow your teacher's instructions on how to create and submit your lab report.
My thanks to Francesc Sastre for helping me with the preparation of this lab.