Hot questions for Using Neural networks in deconvolution

Question:

I am trying to implement a convolutional neural network and I don't understand why the im2col operation is more efficient. It basically stores the input patches to be multiplied by the filter in separate columns. But why shouldn't loops be used directly to compute the convolution instead of first performing im2col?


Answer:

  1. Well, you are thinking along the right lines. In AlexNet, roughly 95% of GPU time and 89% of CPU time is spent in the convolutional and fully connected layers.

  2. The convolutional and fully connected layers are implemented using GEMM, which stands for General Matrix-Matrix Multiplication.

  3. So basically, we convert the convolution operation into a matrix multiplication by using a function called im2col(), which arranges the data so that the convolution output can be computed with a single GEMM call.

  4. Now you may ask: instead of directly doing an element-wise convolution, why add an intermediate step that rearranges the data and then calls GEMM?

  5. The answer is that scientific programmers have spent decades optimizing code that performs large matrix-matrix multiplications, and the benefits of its very regular memory-access patterns outweigh every other cost. There is an optimized CUDA GEMM API in the cuBLAS library, Intel MKL provides an optimized CPU GEMM, and clBLAS's GEMM API can be used on devices supporting OpenCL.

  6. Element-wise convolution performs badly because of the irregular memory accesses it involves.

  7. im2col(), in contrast, arranges the data so that the memory accesses of the matrix multiplication are regular (see the sketch after this list).

  8. im2col() does introduce a lot of data redundancy, but the performance benefit of using GEMM outweighs it.

  9. This is the reason for using the im2col() operation in neural nets.

  10. This link explains how im2col() arranges the data for GEMM: https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/
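
To make item 7 concrete, here is a minimal single-channel NumPy sketch (my own illustration, not from the linked post) of how im2col() turns a "valid" convolution into one matrix multiplication:

import numpy as np

def im2col(x, kh, kw):
    # Each kh x kw patch of x becomes one column, so the convolution
    # (cross-correlation, as in CNNs) reduces to a single GEMM.
    H, W = x.shape
    out_h, out_w = H - kh + 1, W - kw + 1
    cols = np.empty((kh * kw, out_h * out_w))
    for i in range(out_h):
        for j in range(out_w):
            cols[:, i * out_w + j] = x[i:i + kh, j:j + kw].ravel()
    return cols

x = np.random.rand(5, 5)
k = np.random.rand(3, 3)
# One matrix product replaces the nested convolution loops:
out = (k.ravel() @ im2col(x, 3, 3)).reshape(3, 3)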

Question:

I'm trying to use the Deconvolution2D layer of Keras with the TensorFlow backend.

But I ran into some issues. First, if I pass None for the batch_size in output_shape, I get this error:

TypeError: Expected binary or unicode string, got None

And if I replace None with the batch size I use, here is the error:

InvalidArgumentError (see above for traceback): Conv2DCustomBackpropInput: input and out_backprop must have the same batch size
 [[Node: conv2d_transpose = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/cpu:0"](conv2d_transpose/output_shape, transpose, Reshape_4)]]

Here is the model I use:

# Keras 1 API; nch (the base number of channels) is assumed to be defined
# earlier in the script.
from keras.models import Sequential
from keras.layers import (Dense, Reshape, Activation, Convolution2D,
                          Deconvolution2D, BatchNormalization)
from keras.layers.advanced_activations import LeakyReLU
from keras.regularizers import l1l2

model = Sequential()

reg = lambda: l1l2(l1=1e-7, l2=1e-7)
h = 5
model.add(Dense(input_dim=100, output_dim=nch * 4 * 4, W_regularizer=reg()))
model.add(BatchNormalization(mode=0))
model.add(Reshape((4, 4, nch)))
model.add(Deconvolution2D(256, h, h, output_shape=(128, 8, 8, 256), subsample=(2, 2), border_mode='same'))
model.add(BatchNormalization(mode=0, axis=1))
model.add(LeakyReLU(0.2))
model.add(Deconvolution2D(256, h, h, output_shape=(128, 16, 16, 256), subsample=(2, 2), border_mode='same'))
model.add(BatchNormalization(mode=0, axis=1))
model.add(LeakyReLU(0.2))
model.add(Deconvolution2D(64, h, h, output_shape=(128, 32, 32, 64), subsample=(2, 2), border_mode='same'))
model.add(BatchNormalization(mode=0, axis=1))
model.add(LeakyReLU(0.2))
model.add(Convolution2D(3, h, h, border_mode='same', W_regularizer=reg()))
model.add(Activation('sigmoid'))
model.summary()

Answer:

This was an annoyance with deconvolution in previous versions of Keras: you always had to give a fixed batch size and manually compute output_shape. It also meant that your data set size had to be divisible by batch_size, or an error would be raised on the last (smaller) batch.

Fortunately this was fixed in Keras 2.0. Deconvolution2D has been replaced by Conv2DTranspose, and you no longer have to give output_shape as an argument at all:

    model.add(Conv2DTranspose(filters=256, kernel_size=(h,h), strides=(2,2), padding='same'))
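
For comparison, here is a sketch of the question's generator ported to the Keras 2 API (assuming nch is defined as before; Keras 1 argument names are mapped to their Keras 2 equivalents, e.g. W_regularizer -> kernel_regularizer, subsample -> strides):

from keras.models import Sequential
from keras.layers import (Dense, Reshape, Activation, BatchNormalization,
                          Conv2D, Conv2DTranspose, LeakyReLU)
from keras.regularizers import l1_l2

reg = lambda: l1_l2(l1=1e-7, l2=1e-7)
h = 5

model = Sequential()
model.add(Dense(nch * 4 * 4, input_dim=100, kernel_regularizer=reg()))
model.add(BatchNormalization())
model.add(Reshape((4, 4, nch)))
for filters in (256, 256, 64):
    # No output_shape and no fixed batch size needed any more.
    model.add(Conv2DTranspose(filters, (h, h), strides=(2, 2), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(LeakyReLU(0.2))
model.add(Conv2D(3, (h, h), padding='same', kernel_regularizer=reg()))
model.add(Activation('sigmoid'))
model.summary()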

Question:

I was experimenting with a VAE implementation in Tensorflow for the MNIST dataset. To start things off, I trained a VAE based on an MLP encoder and decoder. It trains just fine, the loss decreases, and it generates plausible-looking digits. Here's the code of the decoder of this MLP-based VAE:

x = sampled_z
x = tf.layers.dense(x, 200, tf.nn.relu)
x = tf.layers.dense(x, 200, tf.nn.relu)
x = tf.layers.dense(x, np.prod(data_shape))
img = tf.reshape(x, [-1] + data_shape)

As a next step, I decided to add convolutional layers. Changing just the encoder worked fine, but when I use deconvolutions in the decoder (instead of fc layers) I don't get any training at all. The loss function never decreases, and the output is always black. Here's the code of the deconvolutional decoder:

x = tf.layers.dense(sampled_z, 24, tf.nn.relu)
x = tf.layers.dense(x, 7 * 7 * 64, tf.nn.relu)
x = tf.reshape(x, [-1, 7, 7, 64])
x = tf.layers.conv2d_transpose(x, 64, 3, 2, 'SAME', activation=tf.nn.relu)
x = tf.layers.conv2d_transpose(x, 32, 3, 2, 'SAME', activation=tf.nn.relu)
x = tf.layers.conv2d_transpose(x, 1, 3, 1, 'SAME', activation=tf.nn.sigmoid)
img = tf.reshape(x, [-1, 28, 28])

This seems bizarre; the code looks just fine to me. I narrowed it down to the deconvolutional layers in the decoder; something in there breaks it. E.g. if I add a fully-connected layer (even without the nonlinearity!) after the last deconvolution, it works again! Here's the code:

x = tf.layers.dense(sampled_z, 24, tf.nn.relu)
x = tf.layers.dense(x, 7 * 7 * 64, tf.nn.relu)
x = tf.reshape(x, [-1, 7, 7, 64])
x = tf.layers.conv2d_transpose(x, 64, 3, 2, 'SAME', activation=tf.nn.relu)
x = tf.layers.conv2d_transpose(x, 32, 3, 2, 'SAME', activation=tf.nn.relu)
x = tf.layers.conv2d_transpose(x, 1, 3, 1, 'SAME', activation=tf.nn.sigmoid)
x = tf.contrib.layers.flatten(x)
x = tf.layers.dense(x, 28 * 28)
img = tf.reshape(x, [-1, 28, 28])

I'm really a little stuck at this point. Does anyone have any idea what might be happening here? I use tf 1.8.0, the Adam optimizer, and a 1e-4 learning rate.

EDIT:

As @Agost pointed out, I should perhaps clarify my loss function and training process. I model the output as a Bernoulli distribution and use the negative ELBO as my loss, inspired by this post. Here's the full code of the encoder, decoder, and loss:

import os

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# N_LATENT, EXPERIMENT, experiment_dir(), conv() and dense() are
# constants/helpers defined elsewhere in my script.

def make_prior():
    mu = tf.zeros(N_LATENT)
    sigma = tf.ones(N_LATENT)
    return tf.contrib.distributions.MultivariateNormalDiag(mu, sigma)


def make_encoder(x_input):
    x_input = tf.reshape(x_input, shape=[-1, 28, 28, 1])
    x = conv(x_input, 32, 3, 2)
    x = conv(x, 64, 3, 2)
    x = conv(x, 128, 3, 2)
    x = tf.contrib.layers.flatten(x)
    mu = dense(x, N_LATENT)
    sigma = dense(x, N_LATENT, activation=tf.nn.softplus)  # softplus is log(exp(x) + 1)
    return tf.contrib.distributions.MultivariateNormalDiag(mu, sigma)    


def make_decoder(sampled_z):
    x = tf.layers.dense(sampled_z, 24, tf.nn.relu)
    x = tf.layers.dense(x, 7 * 7 * 64, tf.nn.relu)
    x = tf.reshape(x, [-1, 7, 7, 64])

    x = tf.layers.conv2d_transpose(x, 64, 3, 2, 'SAME', activation=tf.nn.relu)
    x = tf.layers.conv2d_transpose(x, 32, 3, 2, 'SAME', activation=tf.nn.relu)
    x = tf.layers.conv2d_transpose(x, 1, 3, 1, 'SAME')

    img = tf.reshape(x, [-1, 28, 28])

    # Bernoulli takes logits as its first positional argument.
    img_distribution = tf.contrib.distributions.Bernoulli(img)
    img = img_distribution.probs
    # Reinterpret the last two batch dimensions (H, W) as one event.
    img_distribution = tf.contrib.distributions.Independent(img_distribution, 2)
    return img, img_distribution


def main():
    mnist = input_data.read_data_sets(os.path.join(experiment_dir(EXPERIMENT), 'MNIST_data'))

    tf.reset_default_graph()

    batch_size = 128

    x_input = tf.placeholder(dtype=tf.float32, shape=[None, 28, 28], name='X')

    prior = make_prior()
    posterior = make_encoder(x_input)

    mu, sigma = posterior.mean(), posterior.stddev()

    z = posterior.sample()
    generated_img, output_distribution = make_decoder(z)

    likelihood = output_distribution.log_prob(x_input)
    divergence = tf.distributions.kl_divergence(posterior, prior)
    elbo = tf.reduce_mean(likelihood - divergence)
    loss = -elbo

    global_step = tf.train.get_or_create_global_step()
    optimizer = tf.train.AdamOptimizer(1e-3).minimize(loss, global_step=global_step)

Answer:

Could it be your use of a sigmoid in the final deconv layer, restricting the output to (0, 1)? You don't do this in the MLP-based decoder, or when you add a fully-connected layer after the deconvs, so it could be a data-range issue.
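
If that is the cause, a minimal sketch of a fix (my suggestion, assuming the tf.contrib.distributions API from the question) is to keep the last deconv linear and hand raw logits to the Bernoulli likelihood, which applies the sigmoid internally:

x = tf.layers.conv2d_transpose(x, 1, 3, 1, 'SAME')  # no sigmoid activation here
logits = tf.reshape(x, [-1, 28, 28])
img_distribution = tf.contrib.distributions.Bernoulli(logits=logits)
img = img_distribution.probs  # sigmoid(logits); for visualization only
img_distribution = tf.contrib.distributions.Independent(img_distribution, 2)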

Question:

I am trying to implement a segmentation net for images using Caffe. For each image of dimension 3x256x256 I have a 256x256 ground truth image. When I launch training I get this error:

I1019 08:50:55.831014  5847 layer_factory.hpp:74] Creating layer data
I1019 08:50:55.831068  5847 net.cpp:90] Creating Layer data
I1019 08:50:55.831099  5847 net.cpp:368] data -> data
I1019 08:50:55.831149  5847 net.cpp:368] data -> label
I1019 08:50:55.831178  5847 net.cpp:120] Setting up data
I1019 08:50:55.831207  5847 dense_image_data_layer.cpp:41] Opening file /home/ubuntu/full_conv_net/train.txt
I1019 08:50:55.844880  5847 dense_image_data_layer.cpp:51] Shuffling data
I1019 08:50:55.847098  5847 dense_image_data_layer.cpp:56] A total of 15000 examples.
I1019 08:50:55.856138  5847 dense_image_data_layer.cpp:109] output data size: 1,3,256,256
I1019 08:50:55.856648  5847 net.cpp:127] Top shape: 1 3 256 256 (196608)
I1019 08:50:55.856678  5847 net.cpp:127] Top shape: 1 1 256 256 (65536)
I1019 08:50:55.856696  5847 layer_factory.hpp:74] Creating layer label_data_1_split
I1019 08:50:55.856739  5847 net.cpp:90] Creating Layer label_data_1_split
I1019 08:50:55.856768  5847 net.cpp:410] label_data_1_split <- label
I1019 08:50:55.856792  5847 net.cpp:368] label_data_1_split -> label_data_1_split_0
I1019 08:50:55.856828  5847 net.cpp:368] label_data_1_split -> label_data_1_split_1
I1019 08:50:55.856843  5847 net.cpp:120] Setting up label_data_1_split
I1019 08:50:55.856863  5847 net.cpp:127] Top shape: 1 1 256 256 (65536)
I1019 08:50:55.856884  5847 net.cpp:127] Top shape: 1 1 256 256 (65536)
I1019 08:50:55.856896  5847 layer_factory.hpp:74] Creating layer conv1_1
I1019 08:50:55.856921  5847 net.cpp:90] Creating Layer conv1_1
I1019 08:50:55.856940  5847 net.cpp:410] conv1_1 <- data
I1019 08:50:55.856956  5847 net.cpp:368] conv1_1 -> conv1_1
I1019 08:50:55.856976  5847 net.cpp:120] Setting up conv1_1
I1019 08:50:55.857264  5847 net.cpp:127] Top shape: 1 64 256 256 (4194304)
I1019 08:50:55.857321  5847 layer_factory.hpp:74] Creating layer conv1_1_bn
I1019 08:50:55.857347  5847 net.cpp:90] Creating Layer conv1_1_bn
I1019 08:50:55.857367  5847 net.cpp:410] conv1_1_bn <- conv1_1
I1019 08:50:55.857383  5847 net.cpp:357] conv1_1_bn -> conv1_1 (in-place)
I1019 08:50:55.857398  5847 net.cpp:120] Setting up conv1_1_bn
I1019 08:50:55.857949  5847 net.cpp:127] Top shape: 1 64 256 256 (4194304)
I1019 08:50:55.857978  5847 layer_factory.hpp:74] Creating layer relu1_1
I1019 08:50:55.857997  5847 net.cpp:90] Creating Layer relu1_1
I1019 08:50:55.858011  5847 net.cpp:410] relu1_1 <- conv1_1
I1019 08:50:55.858023  5847 net.cpp:357] relu1_1 -> conv1_1 (in-place)
I1019 08:50:55.858043  5847 net.cpp:120] Setting up relu1_1
I1019 08:50:55.858063  5847 net.cpp:127] Top shape: 1 64 256 256 (4194304)
I1019 08:50:55.858077  5847 layer_factory.hpp:74] Creating layer conv1_2
I1019 08:50:55.858094  5847 net.cpp:90] Creating Layer conv1_2
I1019 08:50:55.858113  5847 net.cpp:410] conv1_2 <- conv1_1
I1019 08:50:55.858132  5847 net.cpp:368] conv1_2 -> conv1_2
I1019 08:50:55.858155  5847 net.cpp:120] Setting up conv1_2
I1019 08:50:55.859597  5847 net.cpp:127] Top shape: 1 64 256 256 (4194304)
I1019 08:50:55.859625  5847 layer_factory.hpp:74] Creating layer conv1_2_bn
I1019 08:50:55.859642  5847 net.cpp:90] Creating Layer conv1_2_bn
I1019 08:50:55.859653  5847 net.cpp:410] conv1_2_bn <- conv1_2
I1019 08:50:55.859670  5847 net.cpp:357] conv1_2_bn -> conv1_2 (in-place)
I1019 08:50:55.859691  5847 net.cpp:120] Setting up conv1_2_bn
I1019 08:50:55.861166  5847 net.cpp:127] Top shape: 1 64 256 256 (4194304)
I1019 08:50:55.861192  5847 layer_factory.hpp:74] Creating layer relu1_2
I1019 08:50:55.861207  5847 net.cpp:90] Creating Layer relu1_2
I1019 08:50:55.861217  5847 net.cpp:410] relu1_2 <- conv1_2
I1019 08:50:55.861240  5847 net.cpp:357] relu1_2 -> conv1_2 (in-place)
I1019 08:50:55.861261  5847 net.cpp:120] Setting up relu1_2
I1019 08:50:55.861274  5847 net.cpp:127] Top shape: 1 64 256 256 (4194304)
I1019 08:50:55.861285  5847 layer_factory.hpp:74] Creating layer pool1
I1019 08:50:55.861300  5847 net.cpp:90] Creating Layer pool1
I1019 08:50:55.861318  5847 net.cpp:410] pool1 <- conv1_2
I1019 08:50:55.861335  5847 net.cpp:368] pool1 -> pool1
I1019 08:50:55.861351  5847 net.cpp:368] pool1 -> pool1_mask
I1019 08:50:55.861371  5847 net.cpp:120] Setting up pool1
I1019 08:50:55.861418  5847 net.cpp:127] Top shape: 1 64 128 128 (1048576)
I1019 08:50:55.861438  5847 net.cpp:127] Top shape: 1 64 128 128 (1048576)
I1019 08:50:55.861449  5847 layer_factory.hpp:74] Creating layer conv2_1
I1019 08:50:55.861465  5847 net.cpp:90] Creating Layer conv2_1
I1019 08:50:55.861476  5847 net.cpp:410] conv2_1 <- pool1
I1019 08:50:55.861495  5847 net.cpp:368] conv2_1 -> conv2_1
I1019 08:50:55.861517  5847 net.cpp:120] Setting up conv2_1
I1019 08:50:55.863991  5847 net.cpp:127] Top shape: 1 128 128 128 (2097152)
I1019 08:50:55.864022  5847 layer_factory.hpp:74] Creating layer conv2_1_bn
I1019 08:50:55.864038  5847 net.cpp:90] Creating Layer conv2_1_bn
I1019 08:50:55.864049  5847 net.cpp:410] conv2_1_bn <- conv2_1
I1019 08:50:55.864068  5847 net.cpp:357] conv2_1_bn -> conv2_1 (in-place)
I1019 08:50:55.864099  5847 net.cpp:120] Setting up conv2_1_bn
I1019 08:50:55.864298  5847 net.cpp:127] Top shape: 1 128 128 128 (2097152)
I1019 08:50:55.864323  5847 layer_factory.hpp:74] Creating layer relu2_1
I1019 08:50:55.864341  5847 net.cpp:90] Creating Layer relu2_1
I1019 08:50:55.864351  5847 net.cpp:410] relu2_1 <- conv2_1
I1019 08:50:55.864365  5847 net.cpp:357] relu2_1 -> conv2_1 (in-place)
I1019 08:50:55.864378  5847 net.cpp:120] Setting up relu2_1
I1019 08:50:55.864392  5847 net.cpp:127] Top shape: 1 128 128 128 (2097152)
I1019 08:50:55.864401  5847 layer_factory.hpp:74] Creating layer conv2_2
I1019 08:50:55.864418  5847 net.cpp:90] Creating Layer conv2_2
I1019 08:50:55.864437  5847 net.cpp:410] conv2_2 <- conv2_1
I1019 08:50:55.864451  5847 net.cpp:368] conv2_2 -> conv2_2
I1019 08:50:55.864466  5847 net.cpp:120] Setting up conv2_2
I1019 08:50:55.869410  5847 net.cpp:127] Top shape: 1 128 128 128 (2097152)
I1019 08:50:55.869451  5847 layer_factory.hpp:74] Creating layer conv2_2_bn
I1019 08:50:55.869469  5847 net.cpp:90] Creating Layer conv2_2_bn
I1019 08:50:55.869482  5847 net.cpp:410] conv2_2_bn <- conv2_2
I1019 08:50:55.869494  5847 net.cpp:357] conv2_2_bn -> conv2_2 (in-place)
I1019 08:50:55.869508  5847 net.cpp:120] Setting up conv2_2_bn
I1019 08:50:55.869607  5847 net.cpp:127] Top shape: 1 128 128 128 (2097152)
I1019 08:50:55.869632  5847 layer_factory.hpp:74] Creating layer relu2_2
I1019 08:50:55.869645  5847 net.cpp:90] Creating Layer relu2_2
I1019 08:50:55.869655  5847 net.cpp:410] relu2_2 <- conv2_2
I1019 08:50:55.869669  5847 net.cpp:357] relu2_2 -> conv2_2 (in-place)
I1019 08:50:55.869681  5847 net.cpp:120] Setting up relu2_2
I1019 08:50:55.869695  5847 net.cpp:127] Top shape: 1 128 128 128 (2097152)
I1019 08:50:55.869722  5847 layer_factory.hpp:74] Creating layer pool2
I1019 08:50:55.869740  5847 net.cpp:90] Creating Layer pool2
I1019 08:50:55.869762  5847 net.cpp:410] pool2 <- conv2_2
I1019 08:50:55.869776  5847 net.cpp:368] pool2 -> pool2
I1019 08:50:55.869791  5847 net.cpp:368] pool2 -> pool2_mask
I1019 08:50:55.869818  5847 net.cpp:120] Setting up pool2
I1019 08:50:55.869838  5847 net.cpp:127] Top shape: 1 128 64 64 (524288)
I1019 08:50:55.869849  5847 net.cpp:127] Top shape: 1 128 64 64 (524288)
I1019 08:50:55.869859  5847 layer_factory.hpp:74] Creating layer conv3_1
I1019 08:50:55.869879  5847 net.cpp:90] Creating Layer conv3_1
I1019 08:50:55.869900  5847 net.cpp:410] conv3_1 <- pool2
I1019 08:50:55.869917  5847 net.cpp:368] conv3_1 -> conv3_1
I1019 08:50:55.869937  5847 net.cpp:120] Setting up conv3_1
I1019 08:50:55.879293  5847 net.cpp:127] Top shape: 1 256 64 64 (1048576)
I1019 08:50:55.879324  5847 layer_factory.hpp:74] Creating layer conv3_1_bn
I1019 08:50:55.879345  5847 net.cpp:90] Creating Layer conv3_1_bn
I1019 08:50:55.879370  5847 net.cpp:410] conv3_1_bn <- conv3_1
I1019 08:50:55.879385  5847 net.cpp:357] conv3_1_bn -> conv3_1 (in-place)
I1019 08:50:55.879410  5847 net.cpp:120] Setting up conv3_1_bn
I1019 08:50:55.879452  5847 net.cpp:127] Top shape: 1 256 64 64 (1048576)
I1019 08:50:55.879473  5847 layer_factory.hpp:74] Creating layer relu3_1
I1019 08:50:55.879487  5847 net.cpp:90] Creating Layer relu3_1
I1019 08:50:55.879508  5847 net.cpp:410] relu3_1 <- conv3_1
I1019 08:50:55.879523  5847 net.cpp:357] relu3_1 -> conv3_1 (in-place)
I1019 08:50:55.879545  5847 net.cpp:120] Setting up relu3_1
I1019 08:50:55.879559  5847 net.cpp:127] Top shape: 1 256 64 64 (1048576)
I1019 08:50:55.879580  5847 layer_factory.hpp:74] Creating layer conv3_2
I1019 08:50:55.879595  5847 net.cpp:90] Creating Layer conv3_2
I1019 08:50:55.879611  5847 net.cpp:410] conv3_2 <- conv3_1
I1019 08:50:55.879645  5847 net.cpp:368] conv3_2 -> conv3_2
I1019 08:50:55.879667  5847 net.cpp:120] Setting up conv3_2
I1019 08:50:55.898186  5847 net.cpp:127] Top shape: 1 256 64 64 (1048576)
I1019 08:50:55.898214  5847 layer_factory.hpp:74] Creating layer conv3_2_bn
I1019 08:50:55.898231  5847 net.cpp:90] Creating Layer conv3_2_bn
I1019 08:50:55.898252  5847 net.cpp:410] conv3_2_bn <- conv3_2
I1019 08:50:55.898272  5847 net.cpp:357] conv3_2_bn -> conv3_2 (in-place)
I1019 08:50:55.898293  5847 net.cpp:120] Setting up conv3_2_bn
I1019 08:50:55.898339  5847 net.cpp:127] Top shape: 1 256 64 64 (1048576)
I1019 08:50:55.898360  5847 layer_factory.hpp:74] Creating layer relu3_2
I1019 08:50:55.898375  5847 net.cpp:90] Creating Layer relu3_2
I1019 08:50:55.898394  5847 net.cpp:410] relu3_2 <- conv3_2
I1019 08:50:55.898411  5847 net.cpp:357] relu3_2 -> conv3_2 (in-place)
I1019 08:50:55.898435  5847 net.cpp:120] Setting up relu3_2
I1019 08:50:55.898448  5847 net.cpp:127] Top shape: 1 256 64 64 (1048576)
I1019 08:50:55.898470  5847 layer_factory.hpp:74] Creating layer conv3_3
I1019 08:50:55.898485  5847 net.cpp:90] Creating Layer conv3_3
I1019 08:50:55.898501  5847 net.cpp:410] conv3_3 <- conv3_2
I1019 08:50:55.898520  5847 net.cpp:368] conv3_3 -> conv3_3
I1019 08:50:55.898541  5847 net.cpp:120] Setting up conv3_3
I1019 08:50:55.917057  5847 net.cpp:127] Top shape: 1 256 64 64 (1048576)
I1019 08:50:55.917083  5847 layer_factory.hpp:74] Creating layer conv3_3_bn
I1019 08:50:55.917099  5847 net.cpp:90] Creating Layer conv3_3_bn
I1019 08:50:55.917117  5847 net.cpp:410] conv3_3_bn <- conv3_3
I1019 08:50:55.917135  5847 net.cpp:357] conv3_3_bn -> conv3_3 (in-place)
I1019 08:50:55.917157  5847 net.cpp:120] Setting up conv3_3_bn
I1019 08:50:55.917193  5847 net.cpp:127] Top shape: 1 256 64 64 (1048576)
I1019 08:50:55.917214  5847 layer_factory.hpp:74] Creating layer relu3_3
I1019 08:50:55.917229  5847 net.cpp:90] Creating Layer relu3_3
I1019 08:50:55.917248  5847 net.cpp:410] relu3_3 <- conv3_3
I1019 08:50:55.917265  5847 net.cpp:357] relu3_3 -> conv3_3 (in-place)
I1019 08:50:55.917285  5847 net.cpp:120] Setting up relu3_3
I1019 08:50:55.917301  5847 net.cpp:127] Top shape: 1 256 64 64 (1048576)
I1019 08:50:55.917323  5847 layer_factory.hpp:74] Creating layer pool3
I1019 08:50:55.917340  5847 net.cpp:90] Creating Layer pool3
I1019 08:50:55.917356  5847 net.cpp:410] pool3 <- conv3_3
I1019 08:50:55.917371  5847 net.cpp:368] pool3 -> pool3
I1019 08:50:55.917390  5847 net.cpp:368] pool3 -> pool3_mask
I1019 08:50:55.917405  5847 net.cpp:120] Setting up pool3
I1019 08:50:55.917425  5847 net.cpp:127] Top shape: 1 256 32 32 (262144)
I1019 08:50:55.917441  5847 net.cpp:127] Top shape: 1 256 32 32 (262144)
I1019 08:50:55.917459  5847 layer_factory.hpp:74] Creating layer conv4_1
I1019 08:50:55.917477  5847 net.cpp:90] Creating Layer conv4_1
I1019 08:50:55.917493  5847 net.cpp:410] conv4_1 <- pool3
I1019 08:50:55.917510  5847 net.cpp:368] conv4_1 -> conv4_1
I1019 08:50:55.917529  5847 net.cpp:120] Setting up conv4_1
I1019 08:50:55.954141  5847 net.cpp:127] Top shape: 1 512 32 32 (524288)
I1019 08:50:55.954170  5847 layer_factory.hpp:74] Creating layer conv4_1_bn
I1019 08:50:55.954190  5847 net.cpp:90] Creating Layer conv4_1_bn
I1019 08:50:55.954208  5847 net.cpp:410] conv4_1_bn <- conv4_1
I1019 08:50:55.954226  5847 net.cpp:357] conv4_1_bn -> conv4_1 (in-place)
I1019 08:50:55.954251  5847 net.cpp:120] Setting up conv4_1_bn
I1019 08:50:55.954305  5847 net.cpp:127] Top shape: 1 512 32 32 (524288)
I1019 08:50:55.954330  5847 layer_factory.hpp:74] Creating layer relu4_1
I1019 08:50:55.954344  5847 net.cpp:90] Creating Layer relu4_1
I1019 08:50:55.954365  5847 net.cpp:410] relu4_1 <- conv4_1
I1019 08:50:55.954377  5847 net.cpp:357] relu4_1 -> conv4_1 (in-place)
I1019 08:50:55.954401  5847 net.cpp:120] Setting up relu4_1
I1019 08:50:55.954416  5847 net.cpp:127] Top shape: 1 512 32 32 (524288)
I1019 08:50:55.954426  5847 layer_factory.hpp:74] Creating layer conv4_2
I1019 08:50:55.954449  5847 net.cpp:90] Creating Layer conv4_2
I1019 08:50:55.954463  5847 net.cpp:410] conv4_2 <- conv4_1
I1019 08:50:55.954475  5847 net.cpp:368] conv4_2 -> conv4_2
I1019 08:50:55.954495  5847 net.cpp:120] Setting up conv4_2
I1019 08:50:56.026969  5847 net.cpp:127] Top shape: 1 512 32 32 (524288)
I1019 08:50:56.027006  5847 layer_factory.hpp:74] Creating layer conv4_2_bn
I1019 08:50:56.027035  5847 net.cpp:90] Creating Layer conv4_2_bn
I1019 08:50:56.027046  5847 net.cpp:410] conv4_2_bn <- conv4_2
I1019 08:50:56.027078  5847 net.cpp:357] conv4_2_bn -> conv4_2 (in-place)
I1019 08:50:56.027102  5847 net.cpp:120] Setting up conv4_2_bn
I1019 08:50:56.027133  5847 net.cpp:127] Top shape: 1 512 32 32 (524288)
I1019 08:50:56.027153  5847 layer_factory.hpp:74] Creating layer relu4_2
I1019 08:50:56.027168  5847 net.cpp:90] Creating Layer relu4_2
I1019 08:50:56.027179  5847 net.cpp:410] relu4_2 <- conv4_2
I1019 08:50:56.027204  5847 net.cpp:357] relu4_2 -> conv4_2 (in-place)
I1019 08:50:56.027220  5847 net.cpp:120] Setting up relu4_2
I1019 08:50:56.027242  5847 net.cpp:127] Top shape: 1 512 32 32 (524288)
I1019 08:50:56.027256  5847 layer_factory.hpp:74] Creating layer conv4_3
I1019 08:50:56.027288  5847 net.cpp:90] Creating Layer conv4_3
I1019 08:50:56.027307  5847 net.cpp:410] conv4_3 <- conv4_2
I1019 08:50:56.027320  5847 net.cpp:368] conv4_3 -> conv4_3
I1019 08:50:56.027341  5847 net.cpp:120] Setting up conv4_3
I1019 08:50:56.100551  5847 net.cpp:127] Top shape: 1 512 32 32 (524288)
I1019 08:50:56.100585  5847 layer_factory.hpp:74] Creating layer conv4_3_bn
I1019 08:50:56.100603  5847 net.cpp:90] Creating Layer conv4_3_bn
I1019 08:50:56.100626  5847 net.cpp:410] conv4_3_bn <- conv4_3
I1019 08:50:56.100639  5847 net.cpp:357] conv4_3_bn -> conv4_3 (in-place)
I1019 08:50:56.100666  5847 net.cpp:120] Setting up conv4_3_bn
I1019 08:50:56.100702  5847 net.cpp:127] Top shape: 1 512 32 32 (524288)
I1019 08:50:56.100735  5847 layer_factory.hpp:74] Creating layer relu4_3
I1019 08:50:56.100750  5847 net.cpp:90] Creating Layer relu4_3
I1019 08:50:56.100766  5847 net.cpp:410] relu4_3 <- conv4_3
I1019 08:50:56.100787  5847 net.cpp:357] relu4_3 -> conv4_3 (in-place)
I1019 08:50:56.100801  5847 net.cpp:120] Setting up relu4_3
I1019 08:50:56.100824  5847 net.cpp:127] Top shape: 1 512 32 32 (524288)
I1019 08:50:56.100836  5847 layer_factory.hpp:74] Creating layer pool4
I1019 08:50:56.100862  5847 net.cpp:90] Creating Layer pool4
I1019 08:50:56.100872  5847 net.cpp:410] pool4 <- conv4_3
I1019 08:50:56.100888  5847 net.cpp:368] pool4 -> pool4
I1019 08:50:56.100911  5847 net.cpp:368] pool4 -> pool4_mask
I1019 08:50:56.100926  5847 net.cpp:120] Setting up pool4
I1019 08:50:56.100949  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.100960  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.100977  5847 layer_factory.hpp:74] Creating layer conv5_1
I1019 08:50:56.100993  5847 net.cpp:90] Creating Layer conv5_1
I1019 08:50:56.101009  5847 net.cpp:410] conv5_1 <- pool4
I1019 08:50:56.101022  5847 net.cpp:368] conv5_1 -> conv5_1
I1019 08:50:56.101043  5847 net.cpp:120] Setting up conv5_1
I1019 08:50:56.173538  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.173573  5847 layer_factory.hpp:74] Creating layer conv5_1_bn
I1019 08:50:56.173593  5847 net.cpp:90] Creating Layer conv5_1_bn
I1019 08:50:56.173619  5847 net.cpp:410] conv5_1_bn <- conv5_1
I1019 08:50:56.173632  5847 net.cpp:357] conv5_1_bn -> conv5_1 (in-place)
I1019 08:50:56.173660  5847 net.cpp:120] Setting up conv5_1_bn
I1019 08:50:56.173699  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.173724  5847 layer_factory.hpp:74] Creating layer relu5_1
I1019 08:50:56.173750  5847 net.cpp:90] Creating Layer relu5_1
I1019 08:50:56.173773  5847 net.cpp:410] relu5_1 <- conv5_1
I1019 08:50:56.173785  5847 net.cpp:357] relu5_1 -> conv5_1 (in-place)
I1019 08:50:56.173806  5847 net.cpp:120] Setting up relu5_1
I1019 08:50:56.173820  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.173835  5847 layer_factory.hpp:74] Creating layer conv5_2
I1019 08:50:56.173851  5847 net.cpp:90] Creating Layer conv5_2
I1019 08:50:56.173868  5847 net.cpp:410] conv5_2 <- conv5_1
I1019 08:50:56.173887  5847 net.cpp:368] conv5_2 -> conv5_2
I1019 08:50:56.173907  5847 net.cpp:120] Setting up conv5_2
I1019 08:50:56.247103  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.247139  5847 layer_factory.hpp:74] Creating layer conv5_2_bn
I1019 08:50:56.247158  5847 net.cpp:90] Creating Layer conv5_2_bn
I1019 08:50:56.247177  5847 net.cpp:410] conv5_2_bn <- conv5_2
I1019 08:50:56.247192  5847 net.cpp:357] conv5_2_bn -> conv5_2 (in-place)
I1019 08:50:56.247215  5847 net.cpp:120] Setting up conv5_2_bn
I1019 08:50:56.247246  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.247267  5847 layer_factory.hpp:74] Creating layer relu5_2
I1019 08:50:56.247282  5847 net.cpp:90] Creating Layer relu5_2
I1019 08:50:56.247301  5847 net.cpp:410] relu5_2 <- conv5_2
I1019 08:50:56.247318  5847 net.cpp:357] relu5_2 -> conv5_2 (in-place)
I1019 08:50:56.247341  5847 net.cpp:120] Setting up relu5_2
I1019 08:50:56.247365  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.247376  5847 layer_factory.hpp:74] Creating layer conv5_3
I1019 08:50:56.247402  5847 net.cpp:90] Creating Layer conv5_3
I1019 08:50:56.247419  5847 net.cpp:410] conv5_3 <- conv5_2
I1019 08:50:56.247434  5847 net.cpp:368] conv5_3 -> conv5_3
I1019 08:50:56.247503  5847 net.cpp:120] Setting up conv5_3
I1019 08:50:56.320230  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.320261  5847 layer_factory.hpp:74] Creating layer conv5_3_bn
I1019 08:50:56.320279  5847 net.cpp:90] Creating Layer conv5_3_bn
I1019 08:50:56.320297  5847 net.cpp:410] conv5_3_bn <- conv5_3
I1019 08:50:56.320317  5847 net.cpp:357] conv5_3_bn -> conv5_3 (in-place)
I1019 08:50:56.320339  5847 net.cpp:120] Setting up conv5_3_bn
I1019 08:50:56.320369  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.320389  5847 layer_factory.hpp:74] Creating layer relu5_3
I1019 08:50:56.320405  5847 net.cpp:90] Creating Layer relu5_3
I1019 08:50:56.320426  5847 net.cpp:410] relu5_3 <- conv5_3
I1019 08:50:56.320441  5847 net.cpp:357] relu5_3 -> conv5_3 (in-place)
I1019 08:50:56.320461  5847 net.cpp:120] Setting up relu5_3
I1019 08:50:56.320473  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.320493  5847 layer_factory.hpp:74] Creating layer pool5
I1019 08:50:56.320508  5847 net.cpp:90] Creating Layer pool5
I1019 08:50:56.320529  5847 net.cpp:410] pool5 <- conv5_3
I1019 08:50:56.320546  5847 net.cpp:368] pool5 -> pool5
I1019 08:50:56.320569  5847 net.cpp:368] pool5 -> pool5_mask
I1019 08:50:56.320585  5847 net.cpp:120] Setting up pool5
I1019 08:50:56.320605  5847 net.cpp:127] Top shape: 1 512 8 8 (32768)
I1019 08:50:56.320617  5847 net.cpp:127] Top shape: 1 512 8 8 (32768)
I1019 08:50:56.320634  5847 layer_factory.hpp:74] Creating layer upsample5
I1019 08:50:56.320659  5847 net.cpp:90] Creating Layer upsample5
I1019 08:50:56.320677  5847 net.cpp:410] upsample5 <- pool5
I1019 08:50:56.320690  5847 net.cpp:410] upsample5 <- pool5_mask
I1019 08:50:56.320710  5847 net.cpp:368] upsample5 -> pool5_D
I1019 08:50:56.320731  5847 net.cpp:120] Setting up upsample5
I1019 08:50:56.320755  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.320771  5847 layer_factory.hpp:74] Creating layer conv5_3_D
I1019 08:50:56.320791  5847 net.cpp:90] Creating Layer conv5_3_D
I1019 08:50:56.320806  5847 net.cpp:410] conv5_3_D <- pool5_D
I1019 08:50:56.320821  5847 net.cpp:368] conv5_3_D -> conv5_3_D
I1019 08:50:56.320842  5847 net.cpp:120] Setting up conv5_3_D
I1019 08:50:56.393477  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.393512  5847 layer_factory.hpp:74] Creating layer conv5_3_D_bn
I1019 08:50:56.393533  5847 net.cpp:90] Creating Layer conv5_3_D_bn
I1019 08:50:56.393554  5847 net.cpp:410] conv5_3_D_bn <- conv5_3_D
I1019 08:50:56.393569  5847 net.cpp:357] conv5_3_D_bn -> conv5_3_D (in-place)
I1019 08:50:56.393584  5847 net.cpp:120] Setting up conv5_3_D_bn
I1019 08:50:56.393625  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.393646  5847 layer_factory.hpp:74] Creating layer relu5_3_D
I1019 08:50:56.393661  5847 net.cpp:90] Creating Layer relu5_3_D
I1019 08:50:56.393681  5847 net.cpp:410] relu5_3_D <- conv5_3_D
I1019 08:50:56.393697  5847 net.cpp:357] relu5_3_D -> conv5_3_D (in-place)
I1019 08:50:56.393721  5847 net.cpp:120] Setting up relu5_3_D
I1019 08:50:56.393735  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.393754  5847 layer_factory.hpp:74] Creating layer conv5_2_D
I1019 08:50:56.393776  5847 net.cpp:90] Creating Layer conv5_2_D
I1019 08:50:56.393795  5847 net.cpp:410] conv5_2_D <- conv5_3_D
I1019 08:50:56.393811  5847 net.cpp:368] conv5_2_D -> conv5_2_D
I1019 08:50:56.393832  5847 net.cpp:120] Setting up conv5_2_D
I1019 08:50:56.467021  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.467056  5847 layer_factory.hpp:74] Creating layer conv5_2_D_bn
I1019 08:50:56.467075  5847 net.cpp:90] Creating Layer conv5_2_D_bn
I1019 08:50:56.467093  5847 net.cpp:410] conv5_2_D_bn <- conv5_2_D
I1019 08:50:56.467109  5847 net.cpp:357] conv5_2_D_bn -> conv5_2_D (in-place)
I1019 08:50:56.467124  5847 net.cpp:120] Setting up conv5_2_D_bn
I1019 08:50:56.467166  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.467188  5847 layer_factory.hpp:74] Creating layer relu5_2_D
I1019 08:50:56.467206  5847 net.cpp:90] Creating Layer relu5_2_D
I1019 08:50:56.467269  5847 net.cpp:410] relu5_2_D <- conv5_2_D
I1019 08:50:56.467284  5847 net.cpp:357] relu5_2_D -> conv5_2_D (in-place)
I1019 08:50:56.467298  5847 net.cpp:120] Setting up relu5_2_D
I1019 08:50:56.467319  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.467330  5847 layer_factory.hpp:74] Creating layer conv5_1_D
I1019 08:50:56.467357  5847 net.cpp:90] Creating Layer conv5_1_D
I1019 08:50:56.467380  5847 net.cpp:410] conv5_1_D <- conv5_2_D
I1019 08:50:56.467397  5847 net.cpp:368] conv5_1_D -> conv5_1_D
I1019 08:50:56.467418  5847 net.cpp:120] Setting up conv5_1_D
I1019 08:50:56.540335  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.540366  5847 layer_factory.hpp:74] Creating layer conv5_1_D_bn
I1019 08:50:56.540387  5847 net.cpp:90] Creating Layer conv5_1_D_bn
I1019 08:50:56.540405  5847 net.cpp:410] conv5_1_D_bn <- conv5_1_D
I1019 08:50:56.540418  5847 net.cpp:357] conv5_1_D_bn -> conv5_1_D (in-place)
I1019 08:50:56.540434  5847 net.cpp:120] Setting up conv5_1_D_bn
I1019 08:50:56.540469  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.540491  5847 layer_factory.hpp:74] Creating layer relu5_1_D
I1019 08:50:56.540506  5847 net.cpp:90] Creating Layer relu5_1_D
I1019 08:50:56.540524  5847 net.cpp:410] relu5_1_D <- conv5_1_D
I1019 08:50:56.540541  5847 net.cpp:357] relu5_1_D -> conv5_1_D (in-place)
I1019 08:50:56.540566  5847 net.cpp:120] Setting up relu5_1_D
I1019 08:50:56.540580  5847 net.cpp:127] Top shape: 1 512 16 16 (131072)
I1019 08:50:56.540598  5847 layer_factory.hpp:74] Creating layer deconv_1
I1019 08:50:56.540619  5847 net.cpp:90] Creating Layer deconv_1
I1019 08:50:56.540637  5847 net.cpp:410] deconv_1 <- conv5_1_D
I1019 08:50:56.540654  5847 net.cpp:368] deconv_1 -> deconv_1
I1019 08:50:56.540674  5847 net.cpp:120] Setting up deconv_1
I1019 08:50:56.542743  5847 net.cpp:127] Top shape: 1 2 47 47 (4418)
I1019 08:50:56.542790  5847 layer_factory.hpp:74] Creating layer deconv_1_deconv_1_0_split
I1019 08:50:56.542821  5847 net.cpp:90] Creating Layer deconv_1_deconv_1_0_split
I1019 08:50:56.542842  5847 net.cpp:410] deconv_1_deconv_1_0_split <- deconv_1
I1019 08:50:56.542860  5847 net.cpp:368] deconv_1_deconv_1_0_split -> deconv_1_deconv_1_0_split_0
I1019 08:50:56.542886  5847 net.cpp:368] deconv_1_deconv_1_0_split -> deconv_1_deconv_1_0_split_1
I1019 08:50:56.542907  5847 net.cpp:120] Setting up deconv_1_deconv_1_0_split
I1019 08:50:56.542924  5847 net.cpp:127] Top shape: 1 2 47 47 (4418)
I1019 08:50:56.542943  5847 net.cpp:127] Top shape: 1 2 47 47 (4418)
I1019 08:50:56.542953  5847 layer_factory.hpp:74] Creating layer loss
I1019 08:50:56.542980  5847 net.cpp:90] Creating Layer loss
I1019 08:50:56.543006  5847 net.cpp:410] loss <- deconv_1_deconv_1_0_split_0
I1019 08:50:56.543018  5847 net.cpp:410] loss <- label_data_1_split_0
I1019 08:50:56.543031  5847 net.cpp:368] loss -> loss
I1019 08:50:56.543071  5847 net.cpp:120] Setting up loss
I1019 08:50:56.543094  5847 layer_factory.hpp:74] Creating layer loss
F1019 08:50:56.543150  5847 softmax_loss_layer.cpp:56] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (2209 vs. 65536) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.
*** Check failure stack trace: ***
    @     0x7fa27a2cbdaa  (unknown)
    @     0x7fa27a2cbce4  (unknown)
    @     0x7fa27a2cb6e6  (unknown)
    @     0x7fa27a2ce687  (unknown)
    @     0x7fa27a6ffb58  caffe::SoftmaxWithLossLayer<>::Reshape()
    @     0x7fa27a642bf2  caffe::Net<>::Init()
    @     0x7fa27a644952  caffe::Net<>::Net()
    @     0x7fa27a62abf0  caffe::Solver<>::InitTrainNet()
    @     0x7fa27a62bbc3  caffe::Solver<>::Init()
    @     0x7fa27a62bd96  caffe::Solver<>::Solver()
    @           0x40c5d0  caffe::GetSolver<>()
    @           0x406611  train()
    @           0x404bb1  main
    @     0x7fa2797ddf45  (unknown)
    @           0x40515d  (unknown)
    @              (nil)  (unknown)
Aborted (core dumped)

What am I doing wrong? How exactly should I tackle this problem? Is this a problem with my labeling method? Thanks.


Answer:

The problem is that your prediction shape does not match the ground truth shape. You predict foreground/background probabilities per pixel at a resolution of 47x47: the top shape of 'deconv_1' is 1x2x47x47 (batch_size=1, two class probabilities per pixel, over 47x47 = 2209 spatial locations). Your ground truth labeling, on the other hand, is at a resolution of 256x256 (65536 locations), which is exactly the 2209 vs. 65536 mismatch reported in the check failure.

You need to either decrease the ground truth resolution or increase the prediction resolution (or both) so that their spatial dimensions match.

Question:

I would like to use tf.nn.conv2d_transpose to build a deconvolution layer for a GAN network.

I would like to create a function deconv_layer. It should generate a new layer that outputs filter_num feature maps at expand_size times the resolution of its input.

My code is:

def deconv_layer(x, filter_num, kernel_size=5, expand_size=2):

    x_shape = x.get_shape().as_list()

    with tf.name_scope('deconv_'+str(filter_num)):

        size_in = x_shape[-1]
        size_out = filter_num

        w = tf.Variable(tf.random_normal([kernel_size, kernel_size, size_in, size_out], mean=0.0, stddev=0.125), name="W")
        b = tf.Variable(tf.random_normal([size_out], mean=0.0, stddev=0.125), name="B")

        conv = tf.nn.conv2d_transpose(x, w, output_shape=[-1, x_shape[-3]*expand_size, x_shape[-2]*expand_size, filter_num], strides=[1,expand_size,expand_size,1], padding="SAME")
        act = tf.nn.relu(tf.nn.bias_add(conv, b))

        tf.summary.histogram('weights', w)
        tf.summary.histogram('biases', b)
        tf.summary.histogram('activations', act)

    return act

The error message:

ValueError: input channels does not match filter's input channels
At conv = tf.nn.conv2d_transpose(...)

I am not sure whether I am using tf.nn.conv2d_transpose properly; I tried to model it on my convolutional layer.


Answer:

The filter dimension is wrong. According to the docs:

filter: A 4-D Tensor with the same type as value and shape [height, width, output_channels, in_channels]. filter's in_channels dimension must match that of value (input).

You need to change your w size to:

w = tf.Variable(tf.random_normal([kernel_size, kernel_size, size_out, size_in], mean=0.0, stddev=0.125), name="W")
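
Putting it together, here is a sketch of the corrected function. One extra change beyond the quoted fix (my addition, not part of the original answer): output_shape does not interpret -1 the way tf.reshape does, so the batch size is read dynamically from tf.shape(x) instead:

def deconv_layer(x, filter_num, kernel_size=5, expand_size=2):
    x_shape = x.get_shape().as_list()

    with tf.name_scope('deconv_' + str(filter_num)):
        size_in = x_shape[-1]
        size_out = filter_num

        # Filter layout for conv2d_transpose: [height, width, out_channels, in_channels]
        w = tf.Variable(tf.random_normal([kernel_size, kernel_size, size_out, size_in],
                                         mean=0.0, stddev=0.125), name="W")
        b = tf.Variable(tf.random_normal([size_out], mean=0.0, stddev=0.125), name="B")

        # Dynamic batch size; spatial dimensions grow by expand_size.
        batch_size = tf.shape(x)[0]
        out_shape = tf.stack([batch_size,
                              x_shape[-3] * expand_size,
                              x_shape[-2] * expand_size,
                              size_out])

        conv = tf.nn.conv2d_transpose(x, w, output_shape=out_shape,
                                      strides=[1, expand_size, expand_size, 1],
                                      padding="SAME")
        act = tf.nn.relu(tf.nn.bias_add(conv, b))

        tf.summary.histogram('weights', w)
        tf.summary.histogram('biases', b)
        tf.summary.histogram('activations', act)

    return act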