NVIDIA DIGITS is a great way to get started with deep learning and image classification. It’s an open-source platform that you can run on your own computer to do things like image classification, object detection, and image processing. It also includes a REST API, so you can do all of this through HTTP requests if you prefer. In this tutorial, let’s take a look at image classification using NVIDIA DIGITS.
Before you begin, follow the build instructions to get NVIDIA DIGITS running on your computer. I recommend Ubuntu; on Windows you will have to work through a lot of annoying extra installation steps. You will also need an NVIDIA GPU. Training on a CPU is possible, but it will take significantly longer to complete.
You will first need a dataset to work with. You can collect your own images or use an existing dataset like this one. Just make sure you have a top-level folder with each of your categories in its own sub-folder. For example, I have a top-level folder named things that contains two sub-folders: planes (with all my images of planes) and helicopters (with all my images of helicopters).
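Before importing the folder, it can help to confirm that the layout really is one sub-folder per category. The following is a minimal sketch using only the Python standard library; the function name `count_images_per_class` and the list of image extensions are my own assumptions, not part of DIGITS.

```python
from pathlib import Path

# Extensions counted as images here. This list is an assumption;
# adjust it to match the files in your own dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(root):
    """Return {category_name: image_count} for each sub-folder of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # stray files in the top-level folder are ignored
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images_per_class("things")` on the example layout above should return one entry for planes and one for helicopters. An empty or badly unbalanced category is worth fixing before you build the dataset.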
Begin by starting DIGITS and opening your browser to localhost:5000. We will start by building a dataset that your models can use. Go to the tab that says Datasets and choose a new classification dataset.
A neat thing about NVIDIA DIGITS is that it takes care of resizing your images and converting them to the color mode that the network requires. Set the resize transformation to fill if your images are smaller than 256×256; otherwise, select crop or squash. Then select the top-level folder that contains the sub-folders of images you want to classify. Finally, name your dataset and select create.
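To get a feel for how the three resize options differ, here is a small sketch that computes the size an image is scaled to under each one. It reflects the usual meaning of these terms (squash ignores the aspect ratio, crop scales the shorter side to the target and trims the rest, fill scales the longer side to the target and pads the rest). It is not taken from the DIGITS source, and `resize_plan` is a hypothetical helper name.

```python
def resize_plan(width, height, target=256, mode="squash"):
    """Return the (width, height) an image is scaled to before any
    cropping or padding brings it to target x target.

    Assumed semantics, not DIGITS internals:
      squash: stretch straight to target x target
      crop:   shorter side becomes target, the excess is cropped away
      fill:   longer side becomes target, the remainder is padded
    """
    if mode == "squash":
        return (target, target)
    if mode == "crop":
        scale = target / min(width, height)
    elif mode == "fill":
        scale = target / max(width, height)
    else:
        raise ValueError(f"unknown resize mode: {mode!r}")
    return (max(1, round(width * scale)), max(1, round(height * scale)))


if __name__ == "__main__":
    for mode in ("squash", "crop", "fill"):
        print(mode, resize_plan(512, 256, mode=mode))
```

For a 512×256 image, squash distorts the picture, crop keeps the proportions but loses the left and right edges, and fill keeps everything but adds padding above and below.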
Training the Model
After you create your dataset, select the tab that says _Models_ and create a new classification model.
First, select the dataset that you created in the previous step. You might also want to lower the learning rate to 0.001 if you find that your network’s loss is extremely high. Also, set the batch size to a value between 1 and 10 if you plan to use GoogLeNet. If you keep the default batch size with GoogLeNet, you will most likely run out of GPU memory unless you are using multiple GPUs. Finally, give your model a name and select create.
Training will now begin. Depending on your computer’s hardware, this could take a while. When it’s done, test the network with an image that was not part of the training set and check that the output is what you expect.
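The same test can be run through the REST API mentioned at the start. The sketch below posts one image to a trained model using only the Python standard library. The endpoint path and the form field names `job_id` and `image_file` are my recollection of the DIGITS API documentation, so treat them as assumptions and check them against your installed version. The helper names are my own, and the job ID is a placeholder for your model's ID.

```python
import json
import urllib.request
import uuid
from pathlib import Path

# Assumed endpoint path; verify it against your DIGITS version's API docs.
CLASSIFY_PATH = "/models/images/classification/classify_one.json"


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def classify_image(image_path, job_id, host="http://localhost:5000"):
    """POST one image to a running DIGITS server and return the parsed JSON."""
    image_path = Path(image_path)
    body, content_type = build_multipart(
        {"job_id": job_id},          # assumed field name for the model's job ID
        "image_file",                # assumed field name for the upload
        image_path.name,
        image_path.read_bytes(),
    )
    request = urllib.request.Request(
        host + CLASSIFY_PATH,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With DIGITS running, a call such as `classify_image("test_plane.jpg", "your-model-job-id")` should return the server's JSON response for that image; both arguments here are placeholders.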