Hot questions for Using Neural networks in loops

Question:

I am fitting a recurrent neural net in Python using the Keras library. I fit the model with different epoch counts by changing the nb_epoch parameter of Sequential.fit(). Currently I'm using a for loop that starts the fitting over from scratch each time I change nb_epoch, which is a lot of repeated work. Here is my code (the loop is at the bottom, if you want to skip the other details):

from __future__ import division
import numpy as np
import pandas
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.learning_curve import learning_curve


####################################
###
### Here I do the data processing to create trainX, testX
###
####################################

#model create:
model = Sequential()

#this is the epoch array for different nb_epoch


####################################
###
### Here I define model architecture
###
####################################

model.compile(loss="mse", optimizer="rmsprop")


#################################################
####  Defining arrays for different epoch number 
#################################################
epoch_array = range(100, 2100,100)


# I create the following arrays/matrices to store the result of NN fit 
# different epoch number.

train_DBN_fitted_Y = np.zeros(shape=(len(epoch_array),trainX.shape[0]))
test_DBN_fitted_Y = np.zeros(shape=(len(epoch_array),testX.shape[0]))

###############################################
###
### Following loop is the heart of the question
###
##############################################

i = 0  
for epoch in epoch_array:
      model.fit( trainX, trainY,
            batch_size = 16, nb_epoch = epoch, validation_split = 0.05, verbose = 2)
      trainPredict = model.predict(trainX)
      testPredict = model.predict(testX)
      trainPredict = trainPredict.reshape(trainPredict.shape[0])
      testPredict = testPredict.reshape(testPredict.shape[0])
      train_DBN_fitted_Y[i] = trainPredict
      test_DBN_fitted_Y[i]  = testPredict
      i = i + 1

Now this loop is very inefficient, because when it sets, say, nb_epoch = 100, it starts training from epoch = 1 and finishes at epoch = 100, like the following:

Epoch 1/100
0s - loss: 1.9508 - val_loss: 296.7801
.
.
.
Epoch 100/100
0s - loss: 7.6575 - val_loss: 366.2218

In the next iteration of the loop, where nb_epoch = 200, it starts training from epoch = 1 again and finishes at epoch = 200. But what I want is for this iteration to start training where the last iteration left off, i.e. at epoch = 100, then epoch = 101, and so on...

How can I modify this loop to achieve this?


Answer:

Calling fit repeatedly does train your model further, starting from the state left by the previous call. For training not to continue, fit would have to reset your model's weights, which it does not do. You simply aren't seeing this, because fit always counts epochs starting from 1.

So in the end the problem is only that the printed epoch numbers are misleading (which you cannot change).

If this bothers you, you can implement your own fit loop by calling model.train_on_batch repeatedly.
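Given that, the loop only needs to run the *additional* epochs at each step rather than restarting. A minimal sketch of the bookkeeping, with the fit and predict calls shown as comments since they depend on the model and data from the question:

```python
# Cumulative epoch targets from the question: 100, 200, ..., 2000
epoch_array = list(range(100, 2100, 100))

trained = 0
for i, target in enumerate(epoch_array):
    extra = target - trained          # epochs still needed to reach this target
    # model.fit(trainX, trainY, batch_size=16, nb_epoch=extra,
    #           validation_split=0.05, verbose=2)
    trained = target
    # record predictions for this cumulative epoch count, as in the original loop:
    # train_DBN_fitted_Y[i] = model.predict(trainX).reshape(-1)
    # test_DBN_fitted_Y[i]  = model.predict(testX).reshape(-1)

print(trained)  # → 2000
```

With steps of 100, each iteration trains only 100 further epochs, but the recorded predictions still correspond to the cumulative totals of 100, 200, and so on.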

Question:

I implemented a neural network in MATLAB to better understand the topic.

I wanted to run the code on my GPU, so I initialized every matrix with gpuArray(), but got no performance boost; moreover, sometimes the GPU is slower than the CPU. I have already learned to use functions like arrayfun, pagefun and so on. In backprop I have a for loop that computes the delta error for every layer, backwards. However, each computation needs the result of the previous one, and I have no idea how to do that with the *fun() functions.

My CPU is an i5-3570, my GPU a GTX 660 Ti. I already ran GPUBench in MATLAB, and the GPU is x times faster than the CPU there, so I think the mistake is in my code.

TL;DR

How do I improve this MATLAB code for GPU computing?

    delta_output = (predicted - NN.Y) .* NN.activationGradient(predicted);
    delta_hidden(:, :, m) = (delta_output * NN.Theta_output) .* ...
                            NN.activationGradient(NN.a_hidden(:, :, m));
    for i = m-1:-1:1
        delta_hidden(:, :, i) = (delta_hidden(:, 2:end, i+1) * ...
                                 NN.Theta_hidden(:, :, i)) .* ...
                                 NN.activationGradient(NN.a_hidden(:, :, i));
    end

predicted, NN.Y and NN.Theta_* are all gpuArrays. I initialized delta_* as gpuArrays too, but it doesn't make any difference.


Answer:

The advantage of using the GPU for neural networks comes not from computing the updates for every layer at once - that's inherently serial, as you point out. It comes from being able to compute the update for the weights on thousands of neurons in each layer at once.

So I suspect that you simply do not have a large enough network to make using the GPU advantageous. What is the size of your weight matrix at each layer? If it doesn't contain at least 1000 elements, you're probably not going to see much advantage over the highly-optimised multi-core and intrinsically-vectorised computation that your CPU is doing.
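The scale argument can be made concrete: each layer's backprop step is one large matrix product, and that product is the unit of work the GPU parallelises. A numpy sketch (sizes are illustrative, not from the question) mirroring the delta_hidden update:

```python
import numpy as np

rng = np.random.default_rng(0)
batch, width = 64, 2000                                # a layer this wide can keep a GPU busy
delta_next = rng.standard_normal((batch, width + 1))   # delta of layer i+1, bias column included
theta = rng.standard_normal((width, width + 1))        # plays the role of Theta_hidden(:, :, i)
activ_grad = rng.random((batch, width + 1))            # activationGradient(a_hidden(:, :, i))

# The whole layer's delta is a single (batch x width) @ (width x width+1) product,
# followed by an elementwise multiply -- both embarrassingly parallel:
delta = (delta_next[:, 1:] @ theta) * activ_grad
print(delta.shape)
```

Only the loop over layers is serial; within a layer there is nothing left to vectorise, so the payoff depends entirely on how large these matrices are.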

Question:

I was trying to create a simple neural network in MATLAB (reference: https://becominghuman.ai/making-a-simple-neural-network-2ea1de81ec20 ; the author coded it in JavaScript, and I wanted to do the same in MATLAB). I created my own MATLAB Live Script, but I am really confused about why the weights vector I created does not update. I am trying to add a learning rate of 0.20 to the weights(3) element so as to make it reach 1 (I am using 6 trials to train the network). I am new to MATLAB and generally code in Python, so I would be grateful if someone could explain the mistake I am making or point out which line of code is wrong. Thanks a lot!

Here is my piece of code:-

inputs = [0 1 0 0]'
weights = [0 0 0 0]'
desiredresult = 1
disp('Neural Net Result')
res_net = evaluateNeuralNetwork(inputs, weights)
disp('Error')
evaluateNeuralNetError(1, res_net);
learn(inputs, weights)
train(6, inputs, weights)



function result = evaluateNeuralNetwork(inputVector, weightVector)
    result = 0;

    for i = 1:numel(inputVector)
        result = result + (inputVector(i) * weightVector(i));
    end
end

function res = evaluateNeuralNetError(desired, actual)
    res = desired - actual
end

function learn(inputs, weights)
    learningRate = 0.20

    weights(3) = weights(3) + learningRate
end

function neuralNetResult = train(trials, inputs, weights)
    for i = 1:trials
        neuralNetResult = evaluateNeuralNetwork(inputs,weights)
        learn(inputs, weights)
    end
end

EDIT

Here is the updated (working code) as per accepted answer by Marouen:-

inputs = [0 1 0 0]'
weights = [0 0 0 0]'
desiredresult = 1
disp('Neural Net Result')
res_net = evaluateNeuralNetwork(inputs, weights)
disp('Error')
evaluateNeuralNetError(1, res_net);
learn(inputs, weights)
train(6, inputs, weights)



function result = evaluateNeuralNetwork(inputVector, weightVector)
    result = 0;

    for i = 1:numel(inputVector)
        result = result + (inputVector(i) * weightVector(i));
    end
end

function res = evaluateNeuralNetError(desired, actual)
    res = desired - actual
end

function weights = learn(inputs, weights)
    learningRate = 0.20

    weights(3) = weights(3) + learningRate
end

function neuralNetResult = train(trials, inputs, weights)
    for i = 1:trials
        disp('Neural Network Result')
        neuralNetResult = evaluateNeuralNetwork(inputs,weights)
        weights = learn(inputs, weights)
        disp('Error')
        evaluateNeuralNetError(1, neuralNetResult)
    end
end

Answer:

Sounds like you missed a loop in the learn function; double-check against the original article.

function learn(inputs, weights)
    learningRate = 0.20

    for i =1:length(weights)
        if(inputs(i)> 0)
            weights(i) = weights(i) + learningRate
        end
    end
end

EDIT

You also have to update weights inside the loop of the train function:

weights=learn(inputs, weights)

and add weights as output in learn function declaration

function weights=learn(inputs, weights)

Otherwise weights do not get updated. You can also declare weights as a global variable.
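MATLAB passes arrays by value, so assignments inside learn modify a local copy. The return-and-reassign pattern the fix relies on, sketched here in Python with plain lists (names mirror the question, values hypothetical):

```python
LEARNING_RATE = 0.20

def learn(inputs, weights):
    # Return a new weight list instead of mutating the argument,
    # mirroring MATLAB's pass-by-value semantics.
    return [w + LEARNING_RATE if x > 0 else w
            for x, w in zip(inputs, weights)]

def train(trials, inputs, weights):
    for _ in range(trials):
        weights = learn(inputs, weights)  # reassign, or the update is lost
    return weights

final = train(6, [0, 1, 0, 0], [0.0, 0.0, 0.0, 0.0])
print(final)
```

If the caller never reassigns the result of learn, the updated weights are silently discarded, which is exactly the bug in the original MATLAB script.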

Question:

I'm coding a multilayer perceptron where I have calculated the sigmoids individually, but I would like to use a loop instead. How can I implement this with a loop? This is my working code:

public static void main (String args[]) {

    //Initial weights    w1    w2   w3   w4   w5  w6     w7   w8
    double weights[] = {-0.1, 0.4,-0.2,-0.3, -0.2, 0.4, 0.3, -0.2};

    //number of inputs
    int x1 = 1;
    int x2 = 0;

    //out
    double target = 0;

    double sum = 0;

    double Sigmoid1;
    double Sigmoid2;
    double Sigmoid3;

    int i = 0;

    while (i < weights.length) {

        Sigmoid1 = (x1 * weights[i]);
        Sigmoid1 = 1 / (1 + Math.exp(-Sigmoid1));

        Sigmoid2 = (x2 * weights[i]);
        Sigmoid2 = 1 / (1 + Math.exp(-Sigmoid2));

        Sigmoid3 = (x1 * weights[2]) + (x2 * weights[4]);
        Sigmoid3 = 1 / (1 + Math.exp(-Sigmoid3));

        System.out.println("Sigmoid1 is: " + Sigmoid1);
        System.out.println("Sigmoid2 is: " + Sigmoid2);
        System.out.println("Sigmoid3 is: " + Sigmoid3);

        break;
    }

}

}


Answer:

You can create an array of doubles to hold the sigmoid values and weights for each layer. For instance:

double x[] = {1, 0};

int layer1_input_size = 2; // number of inputs to the network
int layer2_input_size = 2; // number of inputs to the final layer

int layer1_output_size = 2; // number of outputs of the first layer (must match inputs to next layer)
int layer2_output_size = 1; // number of outputs of the network

// Initialize arrays to hold the outputs and weights for each layer
// (one weight per input per neuron, so inputs * outputs weights in each layer)
double sigmoid_layer1[] = new double[layer1_output_size];
double sigmoid_layer2[] = new double[layer2_output_size];
double weights_layer1[] = new double[layer1_input_size * layer1_output_size];
double weights_layer2[] = new double[layer2_input_size * layer2_output_size];

// iterate over each neuron in layer 1
for (int j = 0; j < sigmoid_layer1.length; j++) {
    double sum = 0; // sum of weights * inputs (also known as the dot product)
    for (int i = 0; i < layer1_input_size; i++) {
        // for every input, multiply by the corresponding weight of this neuron
        sum += x[i] * weights_layer1[j * layer1_input_size + i];
    }
    sigmoid_layer1[j] = 1 / (1 + Math.exp(-sum)); // sigmoid activation
}
for (int j = 0; j < sigmoid_layer2.length; j++) {
    double sum = 0;
    for (int i = 0; i < layer2_input_size; i++) {
        // same as before, only now the inputs to this layer are the outputs of the previous layer
        sum += sigmoid_layer1[i] * weights_layer2[j * layer2_input_size + i];
    }
    sigmoid_layer2[j] = 1 / (1 + Math.exp(-sum));
}

The same type of abstraction could be used to allow for a dynamic number of layers as well.

Maybe a little background to explain my answer further: In a neural network or a MultiLayer Perceptron, there are multiple sets (or layers) of computational units (neurons). Each of the neurons in one layer are connected to every neuron in the next layer (at least in the simplest case).The inputs to a layer, are the outputs of the layer before it, and the inputs to the first layer are the inputs to your network.

In your case (as I understand it): Your inputs are in the x array. So x[0] = 1 is the first input, and x[1] = 0 is the second. Your first layer consists of sigmoid1 and sigmoid2. I combined these and held the outputs of the activation functions in the array sigmoid_layer1. Your second layer consists of sigmoid3. The inputs to sigmoid3 are the outputs of sigmoid1 and sigmoid2. The output of sigmoid3 (held in sigmoid_layer2) is the output of your network. The number of weights in the network are determined by the number of inputs to each neuron. For instance: in layer 1 there are two inputs (x[0] and x[1]) and there are two neurons (sigmoid1 and sigmoid2). This means you will need 4 weights where weights_layer1[0] and weights_layer1[1] are the weights for the first neuron, and weights_layer1[2] and weights_layer1[3] are the weights for the second neuron.

This means that your overall network uses 6 weights. 4 in the first layer, and 2 in the second. To initialize these weights manually (as you are doing) it could be done as so:

double weights_layer1[] = {-0.1, 0.4, -0.2, -0.3};
double weights_layer2[] = {-0.2, 0.4};

Please note that there is no flexibility to the number of weights you initialize. If you go with this architecture (2 neurons in the first layer and 1 neuron in the second) then you can only have exactly 4 weights in the first array, and 2 weights in the second.
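The dynamic-number-of-layers generalisation mentioned above can be sketched compactly; here is one way in Python (rather than Java), reusing the question's weight values, where each layer is a list of per-neuron weight vectors:

```python
import math

def sigmoid(v):
    return 1 / (1 + math.exp(-v))

def forward(x, layers):
    # layers[k] is a list of neurons; each neuron is its list of input weights
    out = x
    for weights in layers:
        out = [sigmoid(sum(w * o for w, o in zip(neuron, out)))
               for neuron in weights]
    return out

# 2 inputs -> 2 hidden neurons -> 1 output, using the weights from above
layers = [
    [[-0.1, 0.4], [-0.2, -0.3]],  # layer 1: two neurons, two weights each
    [[-0.2, 0.4]],                # layer 2: one neuron, two weights
]
print(forward([1, 0], layers))
```

Adding a layer is just appending another weight matrix to the list; the forward pass does not change.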

Question:

I often need to end a loop one iteration early; how do I do that elegantly? Breaking out of nested loops looks messy to me. For now I work around it with return, but later, when I want to turn this into a class, having a return in the middle of a constructor stops making sense.

def forward(neurons):
    for layerId, layer in enumerate(neurons):
        if layerId == neurons.__len__() - 1:
            return
        for idx, i in enumerate(neurons[layerId]):
            for idx2, n in enumerate(neurons[layerId+1]):
                neurons[layerId+1][idx2] += i * sigmoid(weights[layerId][idx][idx2])

Answer:

One generic solution would be to create a generator from the iterator (that will yield all but the last element the iterator yields):

def but_last(p):
    first = True
    for x in p:
        if not first:
            yield last
        first = False
        last = x

for layerId, layer in but_last(enumerate(neurons)):
    do_your_stuff(layerId, layer)
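A quick demonstration of the helper on a toy list (data hypothetical):

```python
def but_last(p):
    # yield every element of p except the last one
    first = True
    for x in p:
        if not first:
            yield last
        first = False
        last = x

layers = ["input", "hidden", "output"]
print(list(but_last(layers)))  # → ['input', 'hidden']
```

Because the generator holds back one element at a time, it works on any iterable, including ones whose length is unknown in advance.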