There are many deep learning frameworks out there, and it can be hard to know which one is better suited to your task. In this article, we will evaluate the different frameworks with the help of this open-source GitHub repository.
Frameworks are like programming languages: each has its own way of communicating with the system. This article shows how the frameworks built for deep learning differ across various factors. There can be situations where existing code is written in Java while you are more familiar with Python; with equivalent frameworks in both ecosystems, you can implement the same model in whichever language suits you and still work on the model.
The goal of the repo is to compare the different frameworks on common benchmarks, helping data scientists implement their work more easily, and to compare GPUs as the hardware advances. Open-source communities have also collaborated on the project, making the process easier.
Benchmarking Outcomes
Three datasets were used on two different GPUs for the framework comparison.
The first was the CIFAR-10 dataset, with 50,000 training samples and 10,000 test samples uniformly distributed over 10 classes. Each image is 32×32 with 3 colour channels, and pixel values have been rescaled from 0–255 to 0–1.
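The rescaling step described above is a one-line normalisation. A minimal NumPy sketch, using a random batch in place of the real CIFAR-10 data:

```python
import numpy as np

# stand-in for a CIFAR-10 mini-batch: 4 images, 3 channels, 32x32, uint8 pixels
batch = np.random.randint(0, 256, size=(4, 3, 32, 32), dtype=np.uint8)

# rescale pixel values from 0-255 to 0-1 before training
scaled = batch.astype(np.float32) / 255.0

print(scaled.shape)  # (4, 3, 32, 32)
```

The shape is unchanged; only the value range moves into [0, 1], which keeps the early layers of the network numerically well behaved.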
A CNN was trained across the different platforms with GPU support (Nvidia K80 and P100, with CUDA and cuDNN). CUDA (Compute Unified Device Architecture) is a parallel computing platform developed by Nvidia; these frameworks rely on CUDA to run on the GPU while training and testing the models. Similarly, cuDNN is Nvidia's Deep Neural Network library, providing highly tuned implementations of computations such as forward propagation and backpropagation.
Training Time (s): CNN on CIFAR-10 – Image Recognition

| DL Library | K80/CUDA 8/cuDNN 6 | P100/CUDA 8/cuDNN 6 |
| --- | --- | --- |
| Caffe2 | 148 | 54 |
| Chainer | 162 | 69 |
| CNTK | 163 | 53 |
| Gluon | 152 | 62 |
| Keras(CNTK) | 194 | 76 |
| Keras(TF) | 241 | 76 |
| Keras(Theano) | 269 | 93 |
| TensorFlow | 173 | 57 |
| Lasagne(Theano) | 253 | 65 |
| MXNet | 145 | 51 |
| PyTorch | 169 | 51 |
| Julia – Knet | 159 | * |
* – Not submitted at the time of benchmarking.
Average Time for 1,000 Images: ResNet-50 – Feature Extraction
The next model was the pre-trained ResNet-50, truncated after the final (7,7) average-pooling layer so that it outputs a 2048-dimensional feature vector. Passing such a vector to a softmax squashes the values to between 0 and 1, so they can be read as probabilities. This benchmark was run on the same Nvidia GPUs and CUDA platforms.
| DL Library | K80/CUDA 8/cuDNN 6 (s) | P100/CUDA 8/cuDNN 6 (s) |
| --- | --- | --- |
| Caffe2 | 14.1 | 7.9 |
| Chainer | 9.3 | 2.7 |
| CNTK | 8.5 | 1.6 |
| Keras(CNTK) | 21.7 | 5.9 |
| Keras(TF) | 10.2 | 2.9 |
| TensorFlow | 6.5 | 1.8 |
| MXNet | 7.7 | 2.0 |
| PyTorch | 7.7 | 1.9 |
| Julia – Knet | 6.3 | * |
* – Not submitted at the time of benchmarking.
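The softmax squashing mentioned above is easy to verify in isolation. A minimal NumPy sketch, with a small made-up score vector standing in for real model outputs:

```python
import numpy as np

def softmax(x):
    # subtract the max before exponentiating for numerical stability
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # toy scores standing in for class outputs
probs = softmax(logits)

print(round(float(probs.sum()), 6))  # 1.0 -- the values now behave like probabilities
```

Every output lies strictly between 0 and 1, and the largest logit keeps the largest probability, which is why softmax outputs are read as class probabilities.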
A sentiment analysis task was run on the IMDB dataset available on the website. The training and test sets each contained 25,000 reviews, sampled so that positives and negatives were equally represented. A comparison of the time taken during training is shown below.
| DL Library | K80/CUDA 8/cuDNN 6 (s) | P100/CUDA 8/cuDNN 6 (s) | Using cuDNN? |
| --- | --- | --- | --- |
| CNTK | 32 | 15 | Yes |
| Keras(CNTK) | 86 | 53 | No |
| Keras(TF) | 35 | 26 | Yes |
| MXNet | 29 | 24 | Yes |
| PyTorch | 31 | 16 | Yes |
| TensorFlow | 30 | 22 | Yes |
| Julia – Knet | 29 | * | Yes |
* – Not submitted at the time of benchmarking.
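The equal positive/negative split used for the IMDB samples can be sketched in plain Python. The reviews below are placeholders invented for illustration, not the real dataset:

```python
import random

random.seed(0)

# placeholder pool of labelled reviews: True = positive, False = negative
pool = [("review %d" % i, random.random() < 0.6) for i in range(1000)]

positives = [r for r in pool if r[1]]
negatives = [r for r in pool if not r[1]]

# take the same number from each class, then shuffle the combined set
n = min(len(positives), len(negatives))
balanced = random.sample(positives, n) + random.sample(negatives, n)
random.shuffle(balanced)

pos_count = sum(1 for _, label in balanced if label)
print(pos_count == len(balanced) // 2)  # True: equal positives and negatives
```

Balancing the classes this way means a classifier cannot score well simply by always predicting the majority label.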
Study Analysis
- Most frameworks use cuDNN's autotuner to run an exhaustive search and optimise the algorithm used for the forward pass of convolutions on fixed-size images. For example, this can be enabled in PyTorch with `torch.backends.cudnn.benchmark = True`.
- cuDNN improves the speed of computations while training RNNs. The downside is that running inference on a CPU later on may be more challenging.
- From the analysis we can see that the K80 GPU is considerably less powerful than the P100 GPU, even though both have CUDA and cuDNN support.
- This particular benchmarking of training and feature-extraction time indicates that PyTorch, CNTK and TensorFlow are among the fastest frameworks.
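The autotuning flag from the first point above is set once before training starts. A minimal PyTorch sketch (assumes `torch` is installed; no GPU is needed merely to set the flag):

```python
import torch

# ask cuDNN to benchmark the available convolution algorithms for the
# current (fixed) input size and cache the fastest one for reuse
torch.backends.cudnn.benchmark = True

print(torch.backends.cudnn.benchmark)  # True
```

The flag pays off when input shapes are constant (as with 32×32 CIFAR-10 batches); with varying shapes the search is re-run and can slow training down instead.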
It has been determined that most frameworks use cuDNN to optimise the convolution algorithms used during forward propagation on the images. Comparing these frameworks showed that the architecture and the data used by each of them were similar, so the timings should not be read as definitive performance rankings. They are simply meant to show how to create the same networks across different frameworks and how those frameworks perform on these specific examples.
ONNX (Open Neural Network Exchange) is useful not only while developing in a framework but also for transferring a trained model and its weights between frameworks. Similarly, the MMdnn tools convert models between different frameworks and visualise their architectures at the same time.
The study was completed with the help of contributions from the various teams working on the different frameworks. For example, Keras with TensorFlow originally used a channels-last configuration, which had to be specified for every batch, but channels-first is now supported as a native configuration. The repo is at version 1.0, and the team is considering other benchmarks to work on and build the comparison further.
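The channels-last vs channels-first distinction mentioned above is simply a difference in axis ordering. A small NumPy illustration:

```python
import numpy as np

# channels-last (NHWC), the TensorFlow/Keras default: batch, height, width, channels
nhwc = np.zeros((8, 32, 32, 3))

# channels-first (NCHW), the layout cuDNN and frameworks such as PyTorch favour
nchw = np.transpose(nhwc, (0, 3, 1, 2))

print(nhwc.shape, nchw.shape)  # (8, 32, 32, 3) (8, 3, 32, 32)
```

Mixing up the two layouts is a common source of silent errors when porting models between frameworks, since a 3-channel 32×32 image and a 32-channel 32×3 tensor can both pass shape checks in badly written code.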
Frameworks – TensorFlow, Julia (Knet), MXNet, Keras, Theano, R, CNTK, PyTorch, Caffe2, Chainer and Gluon.
The post Evaluation Of Major Deep Learning Frameworks appeared first on Analytics India Magazine.