Hot questions for Using Neural networks in chainer

Question:

I have an autoencoder model of four linear layers written using chainer.Chain. Running the optimizer.setup line in the Trainer section gives me the following error:

TypeError                                 Traceback (most recent call 
last)
<ipython-input-9-a2aabc58d467> in <module>()
      8 
      9 optimizer = optimizers.AdaDelta()
---> 10 optimizer.setup(sda)
     11 
     12 train_iter = iterators.SerialIterator(train_data,batchsize)

/usr/local/lib/python3.6/dist-packages/chainer/optimizer.py in setup(self, 
link)
    415         """
    416         if not isinstance(link, link_module.Link):
--> 417             raise TypeError('optimization target must be a link')
    418         self.target = link
    419         self.t = 0

TypeError: optimization target must be a link

The StackedAutoEncoder class is defined here: StackedAutoEncoder link

The NNBase class, which the AutoEncoder class is built on, is defined here: NNBase link

model = chainer.Chain(
    enc1=L.Linear(1764, 200),
    enc2=L.Linear(200, 30),
    dec2=L.Linear(30, 200),
    dec1=L.Linear(200, 1764)
)


sda = StackedAutoEncoder(model, gpu=0)
sda.set_order(('enc1', 'enc2'), ('dec2', 'dec1'))
sda.set_optimizer(Opt.AdaDelta)
sda.set_encode(encode)
sda.set_decode(decode)

from chainer import iterators, training, optimizers
from chainer import Link, Chain, ChainList

optimizer = optimizers.AdaDelta()
optimizer.setup(sda)

train_iter = iterators.SerialIterator(train_data,batchsize)
valid_iter = iterators.SerialIterator(test_data,batchsize)

updater = training.StandardUpdater(train_iter,optimizer)
trainer = training.Trainer(updater,(epoch,"epoch"),out="result")

from chainer.training import extensions
trainer.extend(extensions.Evaluator(valid_iter, sda, device=gpu))

A Chain is made of Links. I want to understand why the optimizer does not recognize sda, which is StackedAutoEncoder(model).


Answer:

StackedAutoEncoder inherits from the NNBase class, which inherits from object, so it is not a chainer.Chain (and therefore not a chainer.Link, which is what optimizer.setup requires).

You can refer to the official examples for how to define your own network. For example, the MNIST example defines an MLP as follows:

class MLP(chainer.Chain):

    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            # the size of the inputs to each layer will be inferred
            self.l1 = L.Linear(None, n_units)  # n_in -> n_units
            self.l2 = L.Linear(None, n_units)  # n_units -> n_units
            self.l3 = L.Linear(None, n_out)  # n_units -> n_out

    def forward(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        return self.l3(h2)

Question:

The python module chainer has an introduction where it uses its neural network to recognize handwritten digits from the MNIST database.

Suppose a particular handwritten digit D.png is labeled as a 3. I'm used to the label appearing as an array, as follows:

label = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]

However, chainer labels with an integer instead:

label = 3

The array label is more intuitive to me because the output prediction is an array as well. In neural networks that don't deal with images, I want the flexibility to make the label a specific array.

I have included code below, taken directly from the chainer introduction. If you look through the train or test dataset, you will notice that all of the labels are integers rather than arrays of floats.

How would I run training/test data with arrays as labels instead of integers?

import numpy as np
import chainer
from chainer import cuda, Function, gradient_check, report, training, utils, Variable
from chainer import datasets, iterators, optimizers, serializers
from chainer import Link, Chain, ChainList
import chainer.functions as F
import chainer.links as L
from chainer.training import extensions

class MLP(Chain):
    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            # the size of the inputs to each layer will be inferred
            self.l1 = L.Linear(None, n_units)  # n_in -> n_units
            self.l2 = L.Linear(None, n_units)  # n_units -> n_units
            self.l3 = L.Linear(None, n_out)    # n_units -> n_out

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        return y

train, test = datasets.get_mnist()

train_iter = iterators.SerialIterator(train, batch_size=100, shuffle=True)
test_iter = iterators.SerialIterator(test, batch_size=100, repeat=False, shuffle=False)

model = L.Classifier(MLP(100, 10))  # the input size, 784, is inferred
optimizer = optimizers.SGD()
optimizer.setup(model)

updater = training.StandardUpdater(train_iter, optimizer)
trainer = training.Trainer(updater, (20, 'epoch'), out='result')

trainer.extend(extensions.Evaluator(test_iter, model))
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(['epoch', 'main/accuracy', 'validation/main/accuracy']))
trainer.extend(extensions.ProgressBar())
trainer.run()

Answer:

L.Classifier accepts tuples of (data, label), where the data is a float32 array and the label is an int. This is Chainer's convention. If you print a label you will see that the labels are stored in an array of dtype int: both the data and the labels live in NumPy arrays, with dtype float32 and int respectively.

So, to answer your question: your labels are already in array form, just with dtype int (as is conventional for class labels).

If you want each label to be a vector of 0s and 1s instead of an integer from 0 to 9, use one-hot encoding (https://blog.cambridgespark.com/robust-one-hot-encoding-in-python-3e29bfcec77e).
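If you do want one-hot array labels, a minimal sketch in plain NumPy (to_one_hot is a hypothetical helper, not part of Chainer):

```python
import numpy as np

def to_one_hot(labels, n_classes=10):
    """Convert integer class labels to one-hot float32 vectors."""
    one_hot = np.zeros((len(labels), n_classes), dtype=np.float32)
    one_hot[np.arange(len(labels)), labels] = 1.0
    return one_hot

labels = np.array([3, 0, 9])
encoded = to_one_hot(labels)
# encoded[0] is [0, 0, 0, 1, 0, 0, 0, 0, 0, 0], matching the question's example
```

Note, however, that L.Classifier's default loss function, softmax_cross_entropy, expects integer labels; with one-hot (or otherwise array-valued) targets you would pair inputs and targets with datasets.TupleDataset and use a loss such as F.mean_squared_error or F.sigmoid_cross_entropy instead.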

Question:

I am fairly new to Chainer and have written code that trains a simple feed-forward neural network. I have a validation set and a training set, and I want to evaluate on the validation set every 500 iterations or so; if the results are better, I want to save my network weights. Can anyone tell me how to do that?

Here is my code:

optimizer = optimizers.Adam()
optimizer.setup(model)

updater = training.StandardUpdater(train_iter, optimizer, device=0)
trainer = training.Trainer(updater, (10000, 'epoch'), out='result')

trainer.extend(extensions.Evaluator(validation_iter, model, device=0))
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(['epoch', 'main/loss',  'validation/main/loss', 'elapsed_time']))
trainer.run()

Answer:

  1. Error on validation set

The validation error is reported by the Evaluator extension and printed by PrintReport, so it should already be shown by the code above. To control how often these extensions run, you can pass the trigger keyword argument to trainer.extend. For example, the code below prints every 500 iterations:

trainer.extend(extensions.PrintReport(['epoch', 'main/loss', 'validation/main/loss', 'elapsed_time']), trigger=(500, 'iteration'))

You can also pass a trigger to the Evaluator extension in the same way.

  2. Save network weights

You can use the snapshot_object extension:

http://docs.chainer.org/en/stable/reference/generated/chainer.training.extensions.snapshot_object.html

By default it is invoked every epoch.

If you want to invoke it only when the loss improves, you can set its trigger using MinValueTrigger:

http://docs.chainer.org/en/stable/reference/generated/chainer.training.triggers.MinValueTrigger.html