How can I multiply a vector and a matrix in tensorflow without reshaping?


This:

import numpy as np
a = np.array([1, 2, 1])
w = np.array([[.5, .6], [.7, .8], [.7, .8]])

print(np.dot(a, w))
# [ 2.6  3. ] # plain nice old matrix multiplication n x (n, m) -> m

import tensorflow as tf

a = tf.constant(a, dtype=tf.float64)
w = tf.constant(w)

with tf.Session() as sess:
    print(tf.matmul(a, w).eval())

results in:

C:\_\Python35\python.exe C:/Users/MrD/.PyCharm2017.1/config/scratches/scratch_31.py
[ 2.6  3. ]
# bunch of errors in windows...
Traceback (most recent call last):
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 671, in _call_cpp_shape_fn_impl
    input_tensors_as_shapes, status)
  File "C:\_\Python35\lib\contextlib.py", line 66, in __exit__
    next(self.gen)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape must be rank 2 but is rank 1 for 'MatMul' (op: 'MatMul') with input shapes: [3], [3,2].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/MrD/.PyCharm2017.1/config/scratches/scratch_31.py", line 14, in <module>
    print(tf.matmul(a, w).eval())
  File "C:\_\Python35\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1765, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 1454, in _mat_mul
    transpose_b=transpose_b, name=name)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 2329, in create_op
    set_shapes_for_outputs(ret)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1717, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1667, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 610, in call_cpp_shape_fn
    debug_python_shape_fn, require_shape_fn)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 676, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Shape must be rank 2 but is rank 1 for 'MatMul' (op: 'MatMul') with input shapes: [3], [3,2].

Process finished with exit code 1

(not sure why the same exception is raised inside its handling)

The solution suggested in Tensorflow exception with matmul is reshaping the vector to a matrix, but this leads to needlessly complicated code. Is there still no other way to multiply a vector with a matrix?

Incidentally, using expand_dims (as suggested in the link above) with default arguments raises a ValueError - that's not mentioned in the docs and defeats the purpose of having a default argument.

Matmul was coded for rank-two or greater tensors. Not sure why, to be honest, since NumPy allows matrix-vector multiplication as well.

import numpy as np
a = np.array([1, 2, 1])
w = np.array([[.5, .6], [.7, .8], [.7, .8]])

print(np.dot(a, w))
# [ 2.6  3. ] # plain nice old matrix multiplication n x (n, m) -> m
print(np.sum(np.expand_dims(a, -1) * w , axis=0))
# equivalent result [2.6, 3]

import tensorflow as tf

a = tf.constant(a, dtype=tf.float64)
w = tf.constant(w)

with tf.Session() as sess:
  # they all produce the same result as numpy above
  print(tf.matmul(tf.expand_dims(a,0), w).eval())
  print((tf.reduce_sum(tf.multiply(tf.expand_dims(a,-1), w), axis=0)).eval())
  print((tf.reduce_sum(tf.multiply(a, tf.transpose(w)), axis=1)).eval())

  # Note tf.multiply is equivalent to "*"
  print((tf.reduce_sum(tf.expand_dims(a,-1) * w, axis=0)).eval())
  print((tf.reduce_sum(a * tf.transpose(w), axis=1)).eval())

tf.linalg.matvec, available in newer TensorFlow releases, performs matrix-vector multiplication directly without any reshaping.
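A minimal sketch of the matvec route, assuming TensorFlow 2.x with eager execution. matvec treats its second argument as a column vector, so transpose_a gives the row-vector product from the question:

```python
import tensorflow as tf  # assumes TF 2.x (eager execution)

a = tf.constant([1, 2, 1], dtype=tf.float64)
w = tf.constant([[.5, .6], [.7, .8], [.7, .8]], dtype=tf.float64)

# matvec(w, a, transpose_a=True) computes w^T a, i.e. the
# n x (n, m) -> m product, with no reshaping of a
print(tf.linalg.matvec(w, a, transpose_a=True).numpy())
# [2.6 3. ]  (same values as np.dot(a, w) in the question)
```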

tf.einsum gives you the ability to do exactly what you need in concise and intuitive form:

with tf.Session() as sess:
    print(tf.einsum('n,nm->m', a, w).eval())
    # [ 2.6  3. ] 

You even get to write your comment explicitly n x (n, m) -> m. It is more readable and intuitive in my opinion.
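NumPy's einsum accepts the same subscript notation, so the contraction can be sanity-checked outside a TensorFlow session (the arrays below mirror the ones from the question):

```python
import numpy as np

a = np.array([1, 2, 1], dtype=np.float64)
w = np.array([[.5, .6], [.7, .8], [.7, .8]])

# 'n,nm->m': multiply along the shared index n and sum it out,
# leaving a result of size m
print(np.einsum('n,nm->m', a, w))
# [2.6 3. ]  (same as np.dot(a, w))
```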

My favorite use case is when you want to multiply a batch of matrices with a weight vector:

n_in = 10
n_step = 6
input = tf.placeholder(dtype=tf.float32, shape=(None, n_step, n_in))
weights = tf.Variable(tf.truncated_normal((n_in, 1), stddev=1.0/np.sqrt(n_in)))
Y_predict = tf.einsum('ijk,kl->ijl', input, weights)
print(Y_predict.get_shape())
# (?, 6, 1)

So you can easily multiply weights over all batches with no transformations or duplication. You cannot do this by expanding dimensions as in the other answer, and you avoid the tf.matmul requirement of matching batch and other outer dimensions:

The inputs must, following any transpositions, be tensors of rank >= 2 where the inner 2 dimensions specify valid matrix multiplication arguments, and any further outer dimensions match.
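The batch contraction above can be checked in plain NumPy, whose einsum and matmul broadcasting follow the same rules (the shapes here are made up for illustration):

```python
import numpy as np

batch = np.random.rand(4, 6, 10)   # (batch, n_step, n_in)
weights = np.random.rand(10, 1)    # (n_in, 1)

# 'ijk,kl->ijl' contracts the shared feature axis k for every
# matrix in the batch, with no tiling or reshaping
out = np.einsum('ijk,kl->ijl', batch, weights)
print(out.shape)  # (4, 6, 1)

# matmul broadcasts over the leading batch axis, so it agrees element-wise
assert np.allclose(out, batch @ weights)
```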


You can use tf.tensordot and set axes=1. For the simple operation of a vector times a matrix, this is a bit cleaner than tf.einsum.

tf.tensordot(a, w, 1)
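The same call exists in NumPy, so the axes=1 behavior can be verified with the arrays from the question:

```python
import numpy as np

a = np.array([1, 2, 1], dtype=np.float64)
w = np.array([[.5, .6], [.7, .8], [.7, .8]])

# axes=1 contracts the last axis of a with the first axis of w,
# which is exactly the vector-matrix product n x (n, m) -> m
print(np.tensordot(a, w, axes=1))
# [2.6 3. ]  (same as np.dot(a, w))
```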


Comments
  • Accepted answer works but that's really an API bug - reported: github.com/tensorflow/tensorflow/issues/9055
  • Thanks for making an issue, this behavior bothered me as well. For much nicer solution to this and more use cases see my answer.
  • Oh thanks - well, it's not matrix multiplication then ;) Are those 2 equivalent? Could you explain a bit what reduce_sum does? Sorry, too much fighting with tf today, I'm dizzy
  • So the "*" multiplication operation supports regular numpy broadcasting semantics (it might be missing some fancy indexing stuff). In the above it will multiply the vector a across each row of w. Then reduce_sum will collapse a dimension by summing along it. So we go from a * w -> reduce_sum(product) -> ans; ([n * nxm]) -> [nxm] -> [m]. Axis determines which axis to sum over; in this case we want 0 to get our final result of dimension m.
  • Nope, I'm sorry -> print(tf.reduce_sum(a * w, axis=0).eval()) results in ValueError: Dimensions must be equal, but are 3 and 2 for 'mul' (op: 'Mul') with input shapes: [3], [3,2]. in the code in the question
  • Sorry about the mixup in broadcasting. I've fixed the code and provided both examples that produce the same results in numpy and tf.
  • Thanks - I reported it here: github.com/tensorflow/tensorflow/issues/9055
  • Thanks didn't know about einsum