2-D convolution as a matrix-matrix multiplication

convolution matrix method
1d convolution matrix
convolution filter
convolution in image processing example
difference between matrix multiplication and convolution
1d convolution in matrix form
multi channel convolution
2d convolutional layer

I know that, in the 1D case, the convolution between two vectors, a and b, can be computed as conv(a, b), but also as the product between the T_a and b, where T_a is the corresponding Toeplitz matrix for a.

Is it possible to extend this idea to 2D?

Given a = [5 1 3; 1 1 2; 2 1 3] and b=[4 3; 1 2], is it possible to convert a in a Toeplitz matrix and compute the matrix-matrix product between T_a and b as in the 1-D case?

Yes, it is possible and you should also use a doubly block circulant matrix (which is a special case of Toeplitz matrix). I will give you an example with a small size of kernel and the input, but it is possible to construct Toeplitz matrix for any kernel. So you have a 2d input x and 2d kernel k and you want to calculate the convolution x * k. Also let's assume that k is already flipped. Let's also assume that x is of size n×n and k is m×m.

So you unroll k into a sparse matrix of size (n-m+1)^2 × n^2, and unroll x into a long vector n^2 × 1. You compute a multiplication of this sparse matrix with a vector and convert the resulting vector (which will have a size (n-m+1)^2 × 1) into a n-m+1 square matrix.

I am pretty sure this is hard to understand just from reading. So here is an example for 2×2 kernel and 3×3 input.


Here is a constructed matrix with a vector:

which is equal to .

And this is the same result you would have got by doing a sliding window of k over x.

2-D convolution as a matrix-matrix multiplication, In this article, I will explain how 2D Convolutions are implemented as matrix multiplications. This explanation is based on the notes of the  As for our convolution, we will set it to have the same properties as the previous section except that its output filters is 2. This means that the initial weights matrix, W , must have shape (2, 2, 2, 3) i.e. (output filters, kernel height, kernel width, input image’s channels).

1- Define Input and Filter

Let I be the input signal and F be the filter or kernel.

2- Calculate the final output size

If the I is m1 x n1 and F is m2 x n2 the size of the output will be:

3- Zero-pad the filter matrix

Zero pad the filter to make it the same size as the output.

4- Create Toeplitz matrix for each row of the zero-padded filter

5- Create a doubly blocked Toeplitz matrix

Now all these small Toeplitz matrices should be arranged in a big doubly blocked Toeplitz matrix.

6- Convert the input matrix to a column vector

7- Multiply doubly blocked toeplitz matrix with vectorized input signal

This multiplication gives the convolution result.

8- Last step: reshape the result to a matrix form

For more details and python code take a look at my github repository:

Step by step explanation of 2D convolution implemented as matrix multiplication using toeplitz matrices in python

An Illustrated Explanation of Performing 2D Convolutions Using , T = convmtx2( H , [m n] ) returns the convolution matrix, where the dimensions m and n are a two-element vector. Examples. collapse all. Create a Convolution  Now all that’s left is to perform the matrix multiplication K P and reshape it to the correct shape. The correct shape is a 3 x 3 x 2 matrix (channel dimension last). The correct shape is a 3 x

2-D convolution matrix - MATLAB convmtx2, Convolution as Matrix Multiplication. Step by step explanation of 2D convolution implemented as matrix multiplication using toeplitz matrices  The 2-D Convolution block computes the two-dimensional convolution of two input matrices. Assume that matrix A has dimensions (Ma, Na) and matrix B has dimensions (Mb, Nb). When the block calculates the full output size, the equation for the 2-D discrete convolution is: where and .

The code shown above doesn't produce the unrolled matrix of the right dimensions. The dimension should be (n-k+1)*(m-k+1), (k)(k). k: filter dimension, n: num rows in input matrix, m: num columns.

def unfold_matrix(X, k):
    n, m = X.shape[0:2]
    xx = zeros(((n - k + 1) * (m - k + 1), k**2))
    row_num = 0
    def make_row(x):
        return x.flatten()

    for i in range(n- k+ 1):
        for j in range(m - k + 1):
            #collect block of m*m elements and convert to row
            xx[row_num,:] = make_row(X[i:i+k, j:j+k])
            row_num = row_num + 1

    return xx

For more details, see my blog post:


Convolution as Matrix Multiplication, Then, this vector is multiplied on its left by the convolution matrix in order to obtain Actually the function can support any convolution shape you'd like - full , same CONVOLUTION_SHAPE_SAME = 2; CONVOLUTION_SHAPE_VALID = 3;  T = convmtx2(H,m,n) returns the convolution matrix T for the matrix H. If X is an m-by-n matrix, then reshape(T*X(:),size(H)+[m n]-1) is the same as conv2(X,H).

How Does a Convolution Can Be Expressed as a Matrix , The accumulation (adding these 9 multiplications) is the last thing to do to find out the output value. Note that the matrices are referenced here as [column, row], not​  Instead of using for-loops to perform 2D convolution on images (or any other 2D matrices) we can convert the filter to a Toeplitz matrix and image to a vector and do the convolution just by one matrix multiplication (and of course some post-processing on the result of this multiplication to get the final result)

Example of 2D Convolution, 8- Last step: reshape the result to a matrix form For more details and python code take a look at my github repository: Step by step explanation of 2D convolution  plication (General Matrix Multiplication (GEMM)) routine is commonly used to implement DNN convolution. It is well-known that 2D convolution can be implemented using matrix multiplication by converting one of the input matrices to a Toeplitz matrix. This involves replicating image pixels multi-ple times across different matrix columns. Once the

2-D convolution as a matrix-matrix multiplication, In image processing, a kernel, convolution matrix, or mask is a small matrix. It is used for blurring, sharpening, embossing, edge detection, and more. This is accomplished by doing a convolution between a kernel and an image. Contents. 1 Details. 1.1 Origin. 2 Convolution performed—convolution—is not traditional matrix multiplication, despite being  convolution_as_multiplication / Convolution_as_multiplication.ipynb Find file Copy path Ali Salehi - put all codes together as one function 3dee4d5 Sep 4, 2018

  • There would have to be some sort of reshaping at the end correct? That last vector is 4 x 1 but the result of the convolution would be 2 x 2
  • @jvans yes, in the end you should reshape your vector. It is written here: convert the resulting vector (which will have a size (n-m+1)^2 X 1) into a n-m+1 square matrix
  • In your example this is not a Toeplitz matrix. So you answer is only partially correct, is it ?
  • What you mean by Also let's assume that k is already flipped? Is it because we want to perform correlation instead of convolution? What is flipped in terms of numpy operations?
  • @mrgloom Yes, the operation above is a correlation which is why he flips the filter vertically (upside down) first so it then becomes equivalent to a convolution. The numpy is flip(m, 0), which is is equivalent to flipud(m).
  • I think there is an error. The first element of the result should be 10*0 + 20*0 + 30*0 +40*1 = 40. The element in position 2,2 should be 1*10 + 2*20 + 4*30 + 5*40 = 370. I think your result is correct for a matrix F equal to [40 30; 20 10] that is exactly F flipping both rows and columns. There is therefore an error in the procedure
  • It is doing convolution (mathematical convolution, not cross-correlation), so if you are doing it by hand, you need to flip the filter both vertically and horizontally. You can find more information on my GitHub repo.