Hot questions on using neural networks with serialization

Question:

I'm new to machine learning and am going through the courses on fast.ai. We're learning about vgg16, and I'm having trouble saving my model. I wonder what I'm doing wrong. When I start my model from scratch, training to learn the difference between cats and dogs, I get:

from __future__ import division,print_function
from vgg16 import Vgg16
import os, json
from glob import glob
import numpy as np
from matplotlib import pyplot as plt
import utils; reload(utils)
from utils import plots


np.set_printoptions(precision=4, linewidth=100)
batch_size=64

path = "dogscats/sample"
vgg = Vgg16()
# Grab a few images at a time for training and validation.
# NB: They must be in subdirectories named based on their category
batches = vgg.get_batches(path+'/train', batch_size=batch_size)
val_batches = vgg.get_batches(path+'/valid', batch_size=batch_size*2)
vgg.finetune(batches)
no_of_epochs = 4
latest_weights_filename = None
for epoch in range(no_of_epochs):
    print ("Running epoch: %d" % epoch)
    vgg.fit(batches, val_batches, nb_epoch=1)
    latest_weights_filename = ('ft%d.h5' % epoch)
    vgg.model.save_weights(path+latest_weights_filename)
print ("Completed %s fit operations" % no_of_epochs)

Found 160 images belonging to 2 classes.
Found 40 images belonging to 2 classes.
Running epoch: 0
Epoch 1/1
160/160 [==============================] - 4s - loss: 1.8980 - acc: 0.6125 - val_loss: 0.5442 - val_acc: 0.8500
Running epoch: 1
Epoch 1/1
160/160 [==============================] - 4s - loss: 0.7194 - acc: 0.8563 - val_loss: 0.2167 - val_acc: 0.9500
Running epoch: 2
Epoch 1/1
160/160 [==============================] - 4s - loss: 0.1809 - acc: 0.9313 - val_loss: 0.1604 - val_acc: 0.9750
Running epoch: 3
Epoch 1/1
160/160 [==============================] - 4s - loss: 0.2733 - acc: 0.9375 - val_loss: 0.1684 - val_acc: 0.9750
Completed 4 fit operations

But now when I go to load one of the weight files, the model starts from scratch! For example, I would have expected the model below to have a val_acc of 0.9750! Am I misunderstanding something or doing something wrong? Why is the val_acc so low with this loaded model?

vgg = Vgg16()
vgg.model.load_weights(path+'ft3.h5')
batches = vgg.get_batches(path+'/train', batch_size=batch_size)
val_batches = vgg.get_batches(path+'/valid', batch_size=batch_size*2)
vgg.finetune(batches)
vgg.fit(batches, val_batches, nb_epoch=1)

Found 160 images belonging to 2 classes.
Found 40 images belonging to 2 classes.
Epoch 1/1
160/160 [==============================] - 6s - loss: 1.3110 - acc: 0.6562 - val_loss: 0.5961 - val_acc: 0.8250

Answer:

The problem lies in the finetune function. Looking deeper into its definition:

def finetune(self, batches):
    model = self.model
    model.pop()
    for layer in model.layers: layer.trainable=False
    model.add(Dense(batches.nb_class, activation='softmax'))
    self.compile()

... one can see that calling the pop function deletes the last layer of your model, discarding part of what the trained model learned. A new last layer is then added with random weights, and that layer has to be trained again from scratch. This is the reason for the accuracy drop.
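Concretely, in the loading snippet from the question, calling vgg.finetune(batches) after load_weights pops the freshly loaded last layer and replaces it with a random one; loading the weights after finetune (or skipping finetune on reload) preserves them. Here is a toy, self-contained sketch of why the order matters, using plain Python stand-ins with hypothetical names, not the real Keras API:

```python
import random

def make_model():
    # A "model" is just a list of layer weights (floats).
    return [0.1, 0.2, random.random()]

def finetune(model):
    # Mimics the real finetune(): pop the last layer, add a fresh random one.
    model.pop()
    model.append(random.random())

def save_weights(model):
    return list(model)

def load_weights(model, weights):
    model[:] = weights

random.seed(0)
trained = make_model()
finetune(trained)
trained[-1] = 0.9          # pretend training learned this value
saved = save_weights(trained)

# Wrong order: finetune() after load_weights() discards the loaded layer.
m1 = make_model()
load_weights(m1, saved)
finetune(m1)               # last layer re-randomized, "accuracy" resets

# Right order: finetune() first, then load_weights().
m2 = make_model()
finetune(m2)
load_weights(m2, saved)

print(m1[-1] == 0.9, m2[-1] == 0.9)  # → False True
```

The same reasoning applies to the real model: any call that pops and re-adds the final layer must happen before the trained weights are loaded.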

Question:

I created a model with the following code:

model = Sequential()
model.add(Dense(64, input_dim=14, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(2, init='uniform'))
model.add(Activation('softmax'))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=20, batch_size=16)

I am trying to create a serialized version of the best version of this model.

What I know is:

Serialization is the process of converting an object into a stream of bytes in order to store the object.

What I don't know is:

How do I convert the model I have created into the above defined stream of bytes?


Answer:

You can use the pickle module:

import pickle
with open('./output.bin', 'wb') as f:  # 'wb': pickle requires a binary-mode file
    pickle.dump(model, f)
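One caveat: the file must be opened in binary mode, and Keras models are not always safely picklable, so the Keras-native route (model.save('model.h5') plus keras.models.load_model) is usually the more robust choice for models. The byte-stream round trip itself can be sketched with a stand-in dict in place of the model:

```python
import pickle

# Stand-in object; any picklable Python object round-trips the same way.
weights = {'dense_1': [0.1, 0.2], 'dense_2': [0.3]}

with open('output.bin', 'wb') as f:  # binary mode on write...
    pickle.dump(weights, f)

with open('output.bin', 'rb') as f:  # ...and on read
    restored = pickle.load(f)

print(restored == weights)  # → True
```

The restored object is equal in value to the original but is a new, independent instance.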

Question:

Context I am making a simulation of behavioral evolution, using neural networks to simulate behavior. There are literally thousands of these neural networks interacting with each other. At the end of each generation, the strongest behavior is copied onto its neighbors. When this happens I need to create an identical yet independent copy of the stronger neural network to replace the weaker one.

The Problem I have looked into deep cloning; it works, but it copies node references rather than creating new, identical instances. The difficulty comes from the system structure, and I can't see how to improve it. Speed is a factor here as well: I need this to run millions of iterations a week.

Any help would be greatly appreciated.

System structure

The cell

public class Cell_NN extends Cell
{
    private Network network;

    //Methods
}

The network

public class Network implements Cloneable, Serializable
{
    private ArrayList<ArrayList<Node>> net;
    private ArrayList<Node> layer;

    //Methods
}

The nodes

public class Node implements Cloneable, Serializable
{
    private ArrayList<Node> nextNodes;
    private ArrayList<Float> weights;

    //Methods
}

The deep clone (which I ripped off from someone on Stack Overflow)

public Network deepClone()
{
    try {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(baos);
        oos.writeObject(this);

        ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
        ObjectInputStream ois = new ObjectInputStream(bais);
        Network network = (Network) ois.readObject();
        return (Network) ois.readObject();

    } catch (IOException e) {
        return null;
    } catch (ClassNotFoundException e) {
        return null;
    }
}

I am unsure whether this is allowed, but here is my GitHub in case you would like more information: Napier40124399. The project is called HonorsMain_v2 and is public.


Answer:

I think your code contains just a small bug:

    Network network = (Network) ois.readObject();
    return (Network) ois.readObject();

Here you read the network twice from the ObjectInputStream and return the second read. Since the stream contains only one object, the second readObject() hits the end of the stream and throws an EOFException, which your catch block turns into a null return. Remove the first line (or return the first read) and it should work.

The object is deep-cloned, and all references in the new Network instance are correctly wired to each other, not the original objects.

However, this serialization/deserialization round trip is fairly expensive. If performance is of the essence, I recommend finding a structure that is faster to copy.

For instance, two flat arrays per network, one holding the weights (as primitive floats or even ints) and one holding int indices to the next nodes, would be at least an order of magnitude faster to copy using System.arraycopy().
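That flat-array layout could look something like the following minimal sketch (hypothetical class and field names, assuming a fixed network topology so that connections can be enumerated up front):

```java
// Sketch of a flat-array network: weights and next-node indices live in
// primitive arrays, so a deep copy is just two System.arraycopy calls,
// with no serialization involved.
public class FlatNetwork {
    float[] weights;    // one entry per connection
    int[] nextNodes;    // index of the target node for each connection

    FlatNetwork(float[] weights, int[] nextNodes) {
        this.weights = weights;
        this.nextNodes = nextNodes;
    }

    // Deep copy: primitive array contents are copied by value, so the
    // new instance shares no mutable state with the original.
    FlatNetwork copy() {
        float[] w = new float[weights.length];
        int[] n = new int[nextNodes.length];
        System.arraycopy(weights, 0, w, 0, weights.length);
        System.arraycopy(nextNodes, 0, n, 0, nextNodes.length);
        return new FlatNetwork(w, n);
    }
}
```

Mutating the copy's arrays leaves the original untouched, which is exactly the independence the question asks for, without the cost of object streams.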