Hot questions for using neural networks in NVIDIA DIGITS

Question:

I created a GoogleNet Model via Nvidia DIGITS with two classes (called positive and negative).

If I classify an image with DIGITS, it shows me a nice result like positive: 85.56% and negative: 14.44%.

If I pass that model into pycaffe's classify.py with the same image, I get a result like array([[ 0.38978559, -0.06033826]], dtype=float32)

So, how do I read/interpret this result? How do I calculate the confidence levels (not sure if this is the right term) shown by DIGITS from the results shown by classify.py?


Answer:

This issue led me to the solution.

As the log shows, the network produces three outputs. Classifier#classify only returns the first output. So e.g. by changing predictions = out[self.outputs[0]] to predictions = out[self.outputs[2]], I get the desired values.
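
As a rough illustration of why the output index matters, the sketch below uses a plain dictionary as a stand-in for what the network's forward pass returns. The blob names and the middle row of values are hypothetical; only the first row (from classify.py) and the last row (from DIGITS) come from the question. It simply looks for the output whose values behave like probabilities.

```python
import math


def is_probability_row(row, tol=1e-3):
    """Return True if all values are non-negative and sum to roughly 1."""
    return all(v >= 0.0 for v in row) and math.isclose(sum(row), 1.0, abs_tol=tol)


def pick_probability_output(out, output_names):
    """Return (name, row) for the first output that looks like probabilities."""
    for name in output_names:
        row = out[name][0]  # first (and only) image in the batch
        if is_probability_row(row):
            return name, row
    raise ValueError("no output looks like a probability distribution")


# Stand-in for the forward-pass result; blob names are assumptions.
out = {
    "loss1/classifier": [[0.38978559, -0.06033826]],  # raw scores seen in classify.py
    "loss2/classifier": [[0.51, -0.12]],              # made-up raw scores
    "softmax": [[0.8556, 0.1444]],                    # percentages shown by DIGITS
}
output_names = list(out)

name, probs = pick_probability_output(out, output_names)
```

In a real network the order of the outputs depends on the model definition, so checking the log (as the answer did) is still the reliable way to find the right index.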

Question:

I'm very new to deep learning and I'm trying to obtain a classification with Lua.

I've installed DIGITS with Torch and Lua 5.1, and I've trained the following model:

After that, I ran a classification with the DIGITS server to test the example, and here is the result:

I've exported the model and now I'm trying to do a classification with the following Lua code:

local image_url = '/home/delpech/mnist/test/5/04131.png'
local network_url = '/home/delpech/models/snapshot_30_Model.t7'
local network_name = paths.basename(network_url)

print '==> Loading network'
local net = torch.load(network_name)

--local net = torch.load(network_name):unpack():float()
net:evaluate()
print(net)

print '==> Loading synsets'
print 'Loads mapping from net outputs to human readable labels'
local synset_words = {}
--for line in io.lines'/home/delpech/models/labels.txt' do table.insert(synset_words, line:sub(11)) end
for line in io.lines'/home/delpech/models/labels.txt' do table.insert(synset_words, line) end

print 'synset words'
for line in io.lines'/home/delpech/models/labels.txt' do print(line) end

print '==> Loading image and imagenet mean'
local im = image.load(image_url)

print '==> Preprocessing'
local I = image.scale(im,28,28,'bilinear'):float()

print 'Propagate through the network, sort outputs in decreasing order and show 10 best classes'
local _,classes = net:forward(I):view(-1):sort(true)

for i=1,10 do
    print('predicted class '..tostring(i)..': ', synset_words[classes[i]])
end

But here is the output:

delpech@delpech-K55VD:~/models$ lua classify.lua 

==> Downloading image and network
==> Loading network
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> output]
  (1): nn.MulConstant
  (2): nn.SpatialConvolution(1 -> 20, 5x5)
  (3): nn.SpatialMaxPooling(2x2, 2,2)
  (4): nn.SpatialConvolution(20 -> 50, 5x5)
  (5): nn.SpatialMaxPooling(2x2, 2,2)
  (6): nn.View(-1)
  (7): nn.Linear(800 -> 500)
  (8): nn.ReLU
  (9): nn.Linear(500 -> 10)
  (10): nn.LogSoftMax
}
==> Loading synsets
Loads mapping from net outputs to human readable labels
synset words
0
1
2
3
4
5
6
7
8
9
==> Loading image and imagenet mean
==> Preprocessing
Propagate through the network, sort outputs in decreasing order and show 5 best classes
predicted class 1:  4
predicted class 2:  8
predicted class 3:  0
predicted class 4:  1
predicted class 5:  9
predicted class 6:  6
predicted class 7:  7
predicted class 8:  2
predicted class 9:  5
predicted class 10:     3

This is not the classification that DIGITS produced for the same image.


Answer:

OK, after searching through the DIGITS source code, it turned out that I had missed two things:

  • You have to get the mean image from the job folder and subtract it from the input image during preprocessing:

    print '==> Preprocessing'
    for i=1,im_mean:size(1) do
        im[i]:csub(im_mean[i])
    end

  • You have to load the image as a float tensor and multiply every pixel by 255:

    local im = image.load(image_url):type('torch.FloatTensor'):contiguous()
    im:mul(255)
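
The arithmetic behind these two steps can be sketched in plain Python, assuming (as in the Lua above) that both the image and the mean image are loaded with values in [0, 1]. The function name and the sample values are made up for illustration.

```python
def preprocess(pixels, mean_pixels, scale=255.0):
    """Scale image and mean from [0, 1] to [0, 255], then subtract the mean.

    Both arguments are flat lists of pixel values of the same length.
    """
    if len(pixels) != len(mean_pixels):
        raise ValueError("image and mean image must have the same size")
    return [p * scale - m * scale for p, m in zip(pixels, mean_pixels)]


# Made-up three-pixel "image" and "mean image", both in [0, 1].
pixels = [0.0, 0.5, 1.0]
mean_pixels = [0.0, 0.5, 0.25]
centered = preprocess(pixels, mean_pixels)
```

Skipping either step leaves the input on a different scale from the one the network was trained on, which is consistent with the wrong ranking seen in the question.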

Here is the complete answer:

require 'image'
require 'nn'
require 'torch'
require 'paths'

local function main()

print '==> Downloading image and network'
local image_url = '/home/delpech/mnist/test/7/03079.png'
local network_url = '/home/delpech/models/snapshot_30_Model.t7'
local mean_url = '/home/delpech/models/mean.jpg'

print '==> Loading network'
local net = torch.load(network_url)
net:evaluate();

print '==> Loading synsets'
print 'Loads mapping from net outputs to human readable labels'
local synset_words = {}
for line in io.lines'/home/delpech/models/labels.txt' do table.insert(synset_words, line) end

print '==> Loading image and imagenet mean'
-- image.load returns values in [0,1]; rescale them to [0,255]
local im = image.load(image_url):type('torch.FloatTensor'):contiguous()
im:mul(255)
im = image.scale(im,28,28,'bilinear'):float()

-- load the mean image the same way (the original scaled im here by mistake)
local im_mean = image.load(mean_url):type('torch.FloatTensor'):contiguous()
im_mean:mul(255)
im_mean = image.scale(im_mean,28,28,'bilinear'):float()

print '==> Preprocessing'
-- subtract the mean image channel by channel
for i=1,im_mean:size(1) do
    im[i]:csub(im_mean[i])
end

-- sort the network outputs in decreasing order
local _,classes = net:forward(im):sort(true)
for i=1,10 do
  print('predicted class '..tostring(i)..': ', synset_words[classes[i]])
end

end 


main()
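
The network printed in the question ends with nn.LogSoftMax, so the values being sorted are log-probabilities rather than percentages. To show DIGITS-style confidence values, they can be exponentiated. A minimal sketch of that conversion in plain Python, with made-up scores and a hypothetical function name:

```python
import math


def rank_classes(log_probs, labels):
    """Convert log-probabilities to percentages and sort them in decreasing order."""
    scored = [(label, 100.0 * math.exp(lp)) for label, lp in zip(labels, log_probs)]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up log-probabilities for three classes (they exponentiate to 0.1, 0.7, 0.2).
labels = ["0", "1", "2"]
log_probs = [math.log(p) for p in (0.1, 0.7, 0.2)]

ranked = rank_classes(log_probs, labels)
for label, percent in ranked:
    print("class %s: %.2f%%" % (label, percent))
```

The ranking is the same as sorting the raw log-probabilities, since the exponential is monotonic; only the displayed numbers change.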

Question:

I'm using Torch 7 and Lua 5.1. I need to do object recognition on an input video stream. I've installed DIGITS from NVIDIA. I've heard there are pre-trained models provided by Google (or some other source). Where can I find them?


Answer:

I would recommend checking https://github.com/BVLC/caffe/wiki/Model-Zoo, a wiki page on GitHub that lists many trained models. Although the models are made for the Caffe framework, there is an easy-to-use Torch library that lets you load them: https://github.com/szagoruyko/loadcaffe