Hot questions for using neural networks in MatConvNet

Question:

I am trying to understand how the MNIST example in MatConvNet is designed. It looks like they are using a LeNet variation, but since I have not used MatConvNet before, I am having difficulty understanding how the connection between the last convolutional layer and the first fully connected layer is established:

net.layers = {} ;
net.layers{end+1} = struct('type', 'conv', ...
                       'weights', {{f*randn(5,5,1,20, 'single'), zeros(1, 20, 'single')}}, ...
                       'stride', 1, ...
                       'pad', 0) ;
net.layers{end+1} = struct('type', 'pool', ...
                       'method', 'max', ...
                       'pool', [2 2], ...
                       'stride', 2, ...
                       'pad', 0) ;
net.layers{end+1} = struct('type', 'conv', ...
                       'weights', {{f*randn(5,5,20,50, 'single'),zeros(1,50,'single')}}, ...
                       'stride', 1, ...
                       'pad', 0) ;
net.layers{end+1} = struct('type', 'pool', ...
                       'method', 'max', ...
                       'pool', [2 2], ...
                       'stride', 2, ...
                       'pad', 0) ;
net.layers{end+1} = struct('type', 'conv', ...
                       'weights', {{f*randn(4,4,50,500, 'single'),  zeros(1,500,'single')}}, ...
                       'stride', 1, ...
                       'pad', 0) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'conv', ...
                       'weights', {{f*randn(1,1,500,10, 'single'), zeros(1,10,'single')}}, ...
                       'stride', 1, ...
                       'pad', 0) ;
net.layers{end+1} = struct('type', 'softmaxloss') ;

Usually, in libraries like TensorFlow and MXNet, the last convolutional layer is flattened and then connected to the fully connected one. Here, as far as I understand, the layer with the weights {{f*randn(4,4,50,500, 'single'), zeros(1,500,'single')}} is interpreted as the first fully connected layer, yet it still produces a three-dimensional activation map as its output. I don't see where the "flattening" happens. I need help understanding how the connection between the convolutional layer and the fully connected layer is established here.


Answer:

As far as I know, you simply substitute the fully connected layer with a convolutional layer whose filters have the same width and height as the layer's input. In MatConvNet you don't need to flatten the data before a fully connected layer (flattened data would have the shape 1x1xDxN anyway). In your case, a kernel with the same spatial size as its input, i.e. 4x4, operates exactly like an FC layer, and its output is 1 x 1 x 500 x B, where B is the fourth (batch) dimension of the input.
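
To see this concretely, here is a minimal sketch (the variable names and random inputs are just for illustration; it assumes vl_setupnn has been run so that vl_nnconv is on the path). The 4x4x50 input shape is what the second pooling layer in the question's network produces for MNIST (28x28 -> conv 5x5 -> 24x24 -> pool /2 -> 12x12 -> conv 5x5 -> 8x8 -> pool /2 -> 4x4, with 50 channels):

% Output of the second pooling layer for a batch of 8 images:
x = randn(4, 4, 50, 8, 'single') ;

% "Fully connected" filter bank: spatial size equal to the input's.
f = randn(4, 4, 50, 500, 'single') ;
b = zeros(1, 500, 'single') ;

y = vl_nnconv(x, f, b) ;   % MatConvNet's convolution operator
size(y)                    % ans = 1 1 500 8, i.e. 1 x 1 x 500 x B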

Update: the architecture of the network and its outputs are visualized here to make the flow of operations easier to follow.

Question:

The output type for trainNetwork() must be categorical(). How can I create a CNN with real-valued (float) outputs?

For example, the following command gives this error:

>> convnet = trainNetwork(input_datas, [0.0, 0.1, 0.2, 0.3], networkLayers, opts);
Error using trainNetwork>iAssertCategoricalResponseVector (line 269)
Y must be a vector of categorical responses.

(The error message refers to the [0.0, 0.1, 0.2, 0.3] vector.) But I need real-valued outputs, not categories.

networkLayers is defined as follows:

>> networkLayers= 

5x1 Layer array with layers:
  1   ''   Image Input       1x6000x1 images with 'zerocenter' normalization
  2   ''   Convolution       10 1x100 convolutions with stride [1  1] and padding [0  0]
  3   ''   Max Pooling       1x20 max pooling with stride [10  10] and padding [0  0]
  4   ''   Fully Connected   200 fully connected layer
  5   ''   Fully Connected   1 fully connected layer

Answer:

To do this you have to change your last layer: the loss should be a mean squared error function rather than the cross-entropy used for classification. This issue explains how you can do it.

What you are describing is called regression, and you must add the loss function manually.
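
As a hedged sketch: in releases of the Neural Network Toolbox that ship regressionLayer (R2017a and later), you can append it as the output layer and pass a numeric response vector. The layer sizes below simply mirror the networkLayers shown in the question; on earlier releases you would have to implement the regression output layer yourself.

% Sketch assuming R2017a+, where regressionLayer() is available.
layers = [
    imageInputLayer([1 6000 1])
    convolution2dLayer([1 100], 10)
    maxPooling2dLayer([1 20], 'Stride', [10 10])
    fullyConnectedLayer(200)
    fullyConnectedLayer(1)
    regressionLayer() ] ;      % mean-squared-error loss

% Responses are now a numeric column vector instead of categorical:
% convnet = trainNetwork(input_datas, [0.0; 0.1; 0.2; 0.3], layers, opts);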

Question:

I have made my own IMDB using a set of 51000 images categorized into 43 different categories of road traffic signs. However, when I use my own IMDB to train the AlexNet network, I get an error that says: Index exceeds matrix dimensions.

      Error in vl_nnloss (line 230)
      t = - log(x(ci)) ;

Do you have an idea what I am doing wrong? I have checked through my IMDB, and the images, labels and sets have been appropriately created as specified in my code. Also, the image array is declared as type single and not uint8.

Here is my training code:

function [net, info] = alexnet_train(imdb, expDir)
    run(fullfile(fileparts(mfilename('fullpath')), '../../', 'matlab', 'vl_setupnn.m')) ;

    % some common options
    opts.train.batchSize = 100;
    opts.train.numEpochs = 20 ;
    opts.train.continue = true ;
    opts.train.gpus = [1] ;
    opts.train.learningRate = [1e-1*ones(1, 10),  1e-2*ones(1, 5)];
    opts.train.weightDecay = 3e-4;
    opts.train.momentum = 0.;
    opts.train.expDir = expDir;
    opts.train.numSubBatches = 1;
    % getBatch options
    bopts.useGpu = numel(opts.train.gpus) >  0 ;


    % network definition!
    % MATLAB handle, passed by reference
    net = dagnn.DagNN() ;


    net.addLayer('conv1', dagnn.Conv('size', [11 11 3 96], 'hasBias', true, 'stride', [4, 4], 'pad', [0 0 0 0]), {'input'}, {'conv1'},  {'conv1f'  'conv1b'});
    net.addLayer('relu1', dagnn.ReLU(), {'conv1'}, {'relu1'}, {});
    net.addLayer('lrn1', dagnn.LRN('param', [5 1 2.0000e-05 0.7500]), {'relu1'}, {'lrn1'}, {});
    net.addLayer('pool1', dagnn.Pooling('method', 'max', 'poolSize', [3, 3], 'stride', [2 2], 'pad', [0 0 0 0]), {'lrn1'}, {'pool1'}, {});

    net.addLayer('conv2', dagnn.Conv('size', [5 5 48 256], 'hasBias', true, 'stride', [1, 1], 'pad', [2 2 2 2]), {'pool1'}, {'conv2'},  {'conv2f'  'conv2b'});
    net.addLayer('relu2', dagnn.ReLU(), {'conv2'}, {'relu2'}, {});
    net.addLayer('lrn2', dagnn.LRN('param', [5 1 2.0000e-05 0.7500]), {'relu2'}, {'lrn2'}, {});
    net.addLayer('pool2', dagnn.Pooling('method', 'max', 'poolSize', [3, 3], 'stride', [2 2], 'pad', [0 0 0 0]), {'lrn2'}, {'pool2'}, {});

    net.addLayer('conv3', dagnn.Conv('size', [3 3 256 384], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'pool2'}, {'conv3'},  {'conv3f'  'conv3b'});
    net.addLayer('relu3', dagnn.ReLU(), {'conv3'}, {'relu3'}, {});

    net.addLayer('conv4', dagnn.Conv('size', [3 3 192 384], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'relu3'}, {'conv4'},  {'conv4f'  'conv4b'});
    net.addLayer('relu4', dagnn.ReLU(), {'conv4'}, {'relu4'}, {});

    net.addLayer('conv5', dagnn.Conv('size', [3 3 192 256], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'relu4'}, {'conv5'},  {'conv5f'  'conv5b'});
    net.addLayer('relu5', dagnn.ReLU(), {'conv5'}, {'relu5'}, {});
    net.addLayer('pool5', dagnn.Pooling('method', 'max', 'poolSize', [3 3], 'stride', [2 2], 'pad', [0 0 0 0]), {'relu5'}, {'pool5'}, {});

    net.addLayer('fc6', dagnn.Conv('size', [6 6 256 4096], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'pool5'}, {'fc6'},  {'conv6f'  'conv6b'});
    net.addLayer('relu6', dagnn.ReLU(), {'fc6'}, {'relu6'}, {});

    net.addLayer('fc7', dagnn.Conv('size', [1 1 4096 4096], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'relu6'}, {'fc7'},  {'conv7f'  'conv7b'});
    net.addLayer('relu7', dagnn.ReLU(), {'fc7'}, {'relu7'}, {});

    net.addLayer('classifier', dagnn.Conv('size', [1 1 4096 10], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'relu7'}, {'classifier'},  {'conv8f'  'conv8b'});
    net.addLayer('prob', dagnn.SoftMax(), {'classifier'}, {'prob'}, {});
    net.addLayer('objective', dagnn.Loss('loss', 'log'), {'prob', 'label'}, {'objective'}, {});
    net.addLayer('error', dagnn.Loss('loss', 'classerror'), {'prob','label'}, 'error') ;
    % -- end of the network

    % initialization of the weights (CRITICAL!!!!)
    initNet(net, 1/100);

    % do the training!
    info = cnn_train_dag(net, imdb, @(i,b) getBatch(bopts,i,b), opts.train, 'val', find(imdb.images.set == 3)) ;
end

function initNet(net, f)
    net.initParams();

    f_ind = net.layers(1).paramIndexes(1);
    b_ind = net.layers(1).paramIndexes(2);
    net.params(f_ind).value = 10*f*randn(size(net.params(f_ind).value), 'single');
    net.params(f_ind).learningRate = 1;
    net.params(f_ind).weightDecay = 1;

    for l=2:length(net.layers)
        % is this a convolution layer?
        if(strcmp(class(net.layers(l).block), 'dagnn.Conv'))
            f_ind = net.layers(l).paramIndexes(1);
            b_ind = net.layers(l).paramIndexes(2);

            [h,w,in,out] = size(net.params(f_ind).value);
            net.params(f_ind).value = f*randn(size(net.params(f_ind).value), 'single');
            net.params(f_ind).learningRate = 1;
            net.params(f_ind).weightDecay = 1;

            net.params(b_ind).value = f*randn(size(net.params(b_ind).value), 'single');
            net.params(b_ind).learningRate = 0.5;
            net.params(b_ind).weightDecay = 1;
        end
    end
end

% function on charge of creating a batch of images + labels
function inputs = getBatch(opts, imdb, batch)
    %[227 by 227 by 3] image
    images = imdb.images.data(:,:,:,batch) ;
    labels = imdb.images.labels(1,batch) ;
    if opts.useGpu > 0
        images = gpuArray(images) ;
    end

    inputs = {'input', images, 'label', labels} ;
end

Answer:

Your network definition is inconsistent. The conv1 layer must be [11 11 3 48], because conv2 is declared with 48 input channels ([5 5 48 256]), while your conv1 outputs 96 feature maps (conv4 and conv5, with filter depth 192, carry the same halved channel counts from the original two-GPU AlexNet, so check those as well). If it still doesn't work, go through the network again: this error is caused by errors in the network definition. In particular, your IMDB has 43 categories while the classifier layer outputs only 10 scores, so the label index ci in vl_nnloss (t = - log(x(ci))) can exceed the size of x, which produces exactly the reported "Index exceeds matrix dimensions".
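
A hedged sketch of the fix, showing only the two lines that change relative to the training code above (the class count 43 comes from the question's description of the IMDB):

% conv1 reduced to 48 feature maps so that it matches the 48 input
% channels declared by conv2 ([5 5 48 256]):
net.addLayer('conv1', dagnn.Conv('size', [11 11 3 48], 'hasBias', true, ...
    'stride', [4, 4], 'pad', [0 0 0 0]), {'input'}, {'conv1'}, {'conv1f' 'conv1b'});

% The classifier must produce one score per class -- 43 here, not 10 --
% otherwise vl_nnloss indexes past the end of the prediction array:
net.addLayer('classifier', dagnn.Conv('size', [1 1 4096 43], 'hasBias', true, ...
    'stride', [1, 1], 'pad', [0 0 0 0]), {'relu7'}, {'classifier'}, {'conv8f' 'conv8b'});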