Tensorflow conv2d_transpose: Size of out_backprop doesn't match computed

When I build an FCN for segmentation, I want the output to keep the original size of the input images, so I use fully convolutional layers. With a fixed input size such as (224, 224), the transposed convolutions work fine. However, when I changed the code from (224, 224) to (h, w), I got the following error. I googled around but couldn't figure it out. Can anyone help? Thanks.

Error information:

InvalidArgumentError (see above for traceback): Conv2DSlowBackpropInput: Size of out_backprop doesn't match computed: actual = 62, computed = 63 spatial_dim: 2 input: 500 filter: 16 output: 62 stride: 8 dilation: 1
     [[Node: deconv_layer/conv2d_transpose_2 = Conv2DBackpropInput[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 8, 8], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](deconv_layer/conv2d_transpose_2-0-VecPermuteNHWCToNCHW-LayoutOptimizer/_1961, deconv_layer/deconv3/kernel/read, deconv_layer/Add_1)]]
     [[Node: losses/_2091 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4480_losses", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Code:

with tf.variable_scope("deconv_layer"):
    deconv_shape1 = block2.get_shape()
    W_t1 = deconv_utils.weight_variable([4, 4, deconv_shape1[3].value, 2048], 
                                        name="deconv1/kernel")
    b_t1 = deconv_utils.bias_variable([deconv_shape1[3].value], 
                                      name="deconv1/biases")
    deconv_t1 = deconv_utils.conv2d_transpose_strided(block4, W_t1, b_t1, 
                                       output_shape=tf.shape(block2))
    fuse1 = tf.add(deconv_t1, block2)
    print("deconv_t1: ", deconv_t1.shape)
    print("fuse_1: ", fuse1.shape)
    tf.identity(fuse1, name="fuse1")

    deconv_shape2 = block1.get_shape()
    W_t2 = deconv_utils.weight_variable([4, 4, deconv_shape2[3].value, 
                        deconv_shape1[3].value], name="deconv2/kernel")
    b_t2 = deconv_utils.bias_variable([deconv_shape2[3].value], 
                                      name="deconv2/biases")
    deconv_t2 = deconv_utils.conv2d_transpose_strided(fuse1, W_t2, b_t2, 
                        output_shape=tf.shape(block1))
    fuse2 = tf.add(deconv_t2, block1)
    print("deconv_t2: ", deconv_t2.shape)
    print("fuse2: ", fuse2.shape)
    tf.identity(fuse2, name="fuse2")

    shape = tf.shape(features)
    deconv_shape3 = tf.stack([shape[0], shape[1], shape[2], num_classes])
    W_t3 = deconv_utils.weight_variable([16, 16, num_classes, 
                       deconv_shape2[3].value], name="deconv3/kernel")
    b_t3 = deconv_utils.bias_variable([num_classes], name="deconv3/biases")
    deconv_t3 = deconv_utils.conv2d_transpose_strided(fuse2, W_t3, b_t3, 
                       output_shape=deconv_shape3, stride=8)
    print("deconv_t3: ", deconv_t3.shape)

The version without the custom functions is here:

with tf.variable_scope("deconv_layer"):
    deconv1_shape = block2.get_shape()
    shape1 = [4, 4, deconv1_shape[3].value, 2048]
    deconv1_kernel = tf.Variable(initial_value=tf.truncated_normal(shape1, 
                                 stddev=0.02),
                                 trainable=True,
                                 name="deconv1/kernel")
    deconv1 = tf.nn.conv2d_transpose(value=block4,
                                     filter=deconv1_kernel,
                                     # output_shape=[BATCH_SIZE,
                                     #     tf.shape(block2)[1], tf.shape(block2)[2], 512],
                                     output_shape=tf.shape(block2),
                                     strides=[1, 2, 2, 1],
                                     padding='SAME',
                                     data_format='NHWC'
                                     )
    print('deconv1', deconv1.shape)
    fuse1 = tf.add(deconv1, block2)  # fuse1 = pool4 + deconv2(pool5)
    tf.identity(fuse1, name="fuse1")

    deconv2_shape = block1.get_shape()
    shape2 = [4, 4, deconv2_shape[3].value, deconv1_shape[3].value]
    deconv2_kernel = tf.Variable(initial_value=tf.truncated_normal(shape2, 
                                 stddev=0.02),
                                 trainable=True,
                                 name="deconv2/kernel")
    deconv2 = tf.nn.conv2d_transpose(value=fuse1,
                                     filter=deconv2_kernel,
                                     output_shape=tf.shape(block1),
                                     strides=[1, 2, 2, 1],
                                     padding='SAME',
                                     data_format='NHWC'
                                     )
    print('deconv2', deconv2.shape)
    fuse2 = tf.add(deconv2, block1)
    tf.identity(fuse2, name="fuse2")

    deconv3_shape = tf.stack([tf.shape(features)[0], tf.shape(features)[1], 
                              tf.shape(features)[2], num_classes])
    shape3 = [16, 16, num_classes, deconv2_shape[3].value]
    deconv_final_kernel = tf.Variable(initial_value=tf.truncated_normal(shape3, stddev=0.02),
                                      trainable=True,
                                      name="deconv3/kernel")

    seg_logits = tf.nn.conv2d_transpose(value=fuse2,
                                        filter=deconv_final_kernel,
                                        output_shape=deconv3_shape,
                                        strides=[1, 8, 8, 1],
                                        padding='SAME',
                                        data_format='NHWC') 

The conv net and the deconv net in the FCN, which are built with different structures, may not be consistent with each other. In this case, the conv net uses convolutions with padding='VALID', while the deconv net uses conv2d_transpose with padding='SAME' everywhere. The resulting shapes therefore differ, which is what causes the error above.
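For reference, TensorFlow computes a strided convolution's output size as ceil(in / stride) for padding='SAME' and ceil((in - kernel + 1) / stride) for padding='VALID'. A small sketch in plain Python (values taken from the error message above) shows how the two modes diverge:

    import math

    def conv_out_size(in_size, kernel, stride, padding):
        # Spatial output size of a strided tf.nn.conv2d.
        if padding == 'SAME':
            return math.ceil(in_size / stride)
        if padding == 'VALID':
            return math.ceil((in_size - kernel + 1) / stride)
        raise ValueError(padding)

    # Input spatial size 500, kernel 16, stride 8, as in the error message:
    print(conv_out_size(500, 16, 8, 'SAME'))   # 63
    print(conv_out_size(500, 16, 8, 'VALID'))  # 61

A conv2d_transpose with padding='SAME' and stride 8 expects its input to have the 'SAME' size (63 here), so a feature map that went through 'VALID' convolutions ends up a row or two short, which is exactly the kind of actual = 62 vs. computed = 63 mismatch reported above.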

This happens because your stride is > 1. With stride > 1 the output-size calculation cannot be inverted exactly in every case. This GitHub post explains it.
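Concretely, for padding='SAME' the transposed convolution requires that the spatial size of its input equal ceil(output_shape / stride). Plugging in the numbers from the error message:

    import math

    requested_output = 500  # spatial dim 2 of output_shape (from tf.shape(features))
    stride = 8
    computed = math.ceil(requested_output / stride)  # 63
    actual = 62  # spatial size of the tensor actually fed in (fuse2)
    print(computed, actual)  # -> "actual = 62, computed = 63"

Since ceil(x / 8) is 63 for every x from 497 to 504, many requested output sizes map to the same expected input size, and an input of 62 rows is simply incompatible with a requested output of 500.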


I had a similar issue while trying to replicate PyTorch's ConvTranspose2d in TensorFlow. I was padding the input manually before passing it to conv2d_transpose() and padding the deconvolved output again afterwards. That was why the graph initialized properly but failed when computing the gradients. I fixed the error by removing all manual padding and using padding="SAME" inside the function; I guess the padding is handled internally there. Correct me if I am wrong. I don't know how much this affects the actual output.
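To illustrate that fix, here is a minimal sketch (TF 1.x API; the helper name and the fixed 2x factor are my own assumptions, not from the original post). It relies on padding='SAME' alone, with no manual tf.pad before or after:

    import tensorflow as tf

    def upsample2x(x, kernel):
        # x: NHWC tensor; kernel: [kh, kw, out_channels, in_channels].
        out_shape = tf.stack([tf.shape(x)[0],
                              2 * tf.shape(x)[1],
                              2 * tf.shape(x)[2],
                              tf.shape(kernel)[2]])
        # No manual padding anywhere: padding='SAME' does the bookkeeping
        # internally, so the forward and gradient shapes stay consistent.
        return tf.nn.conv2d_transpose(value=x,
                                      filter=kernel,
                                      output_shape=out_shape,
                                      strides=[1, 2, 2, 1],
                                      padding='SAME')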


Comments
  • To my knowledge, in an FCN for segmentation the upsampling is done by conv2d_transpose with stride > 1, so I don't think what you mentioned is the key to the problem I met.
  • The error message you posted shows stride: 8, so you are indeed using a stride > 1.
  • I did use stride > 1. My point is that when you build an FCN for segmentation, you have to use conv2d_transpose with stride > 1 for the upsampling, and this code does use it that way. So I believe the error I met is not caused by stride > 1 itself.
  • You don't have to use stride > 1 for upsampling, but a larger stride upsamples more quickly. Maybe they handle special cases where you can compute the output shape, but it is common knowledge that with stride > 1 you cannot do this unambiguously in general.