## Hot questions: using neural networks with sparse matrices

Question:

Is there a way to convert a dense tensor into a sparse tensor? Apparently, TensorFlow's `Estimator.fit` doesn't accept `SparseTensor`s as labels. One reason I would like to pass `SparseTensor`s into `Estimator.fit` is to be able to use TensorFlow's `ctc_loss`. Here's the code:

```python
import dataset_utils
import tensorflow as tf
import numpy as np
from tensorflow.contrib import grid_rnn, learn, layers, framework


def grid_rnn_fn(features, labels, mode):
    input_layer = tf.reshape(features["x"], [-1, 48, 1596])
    indices = tf.where(tf.not_equal(labels, tf.constant(0, dtype=tf.int32)))
    values = tf.gather_nd(labels, indices)
    sparse_labels = tf.SparseTensor(indices, values,
                                    dense_shape=tf.shape(labels, out_type=tf.int64))

    cell_fw = grid_rnn.Grid2LSTMCell(num_units=128)
    cell_bw = grid_rnn.Grid2LSTMCell(num_units=128)
    bidirectional_grid_rnn = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw,
                                                             input_layer, dtype=tf.float32)
    outputs = tf.reshape(bidirectional_grid_rnn[0], [-1, 256])

    W = tf.Variable(tf.truncated_normal([256, 80], stddev=0.1, dtype=tf.float32), name='W')
    b = tf.Variable(tf.constant(0., dtype=tf.float32, shape=[80]), name='b')

    logits = tf.matmul(outputs, W) + b
    logits = tf.reshape(logits, [tf.shape(input_layer)[0], -1, 80])
    logits = tf.transpose(logits, (1, 0, 2))

    loss = None
    train_op = None
    if mode != learn.ModeKeys.INFER:
        # Error occurs here
        loss = tf.nn.ctc_loss(inputs=logits, labels=sparse_labels, sequence_length=320)

    ...  # returning ModelFnOps


def main(_):
    image_paths, labels = dataset_utils.read_dataset_list('../test/dummy_labels_file.txt')
    data_dir = "../test/dummy_data/"
    images = dataset_utils.read_images(data_dir=data_dir, image_paths=image_paths,
                                       image_extension='png')
    print('Done reading images')
    images = dataset_utils.resize(images, (1596, 48))
    images = dataset_utils.transpose(images)
    labels = dataset_utils.encode(labels)
    x_train, x_test, y_train, y_test = dataset_utils.split(features=images,
                                                           test_size=0.5, labels=labels)

    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": np.array(x_train)},
        y=np.array(y_train),
        num_epochs=1,
        shuffle=True,
        batch_size=1
    )

    classifier = learn.Estimator(model_fn=grid_rnn_fn, model_dir="/tmp/grid_rnn_ocr_model")
    classifier.fit(input_fn=train_input_fn)
```

**UPDATE**:

It turns out that this solution from here converts the dense tensor into a sparse one:

```python
indices = tf.where(tf.not_equal(labels, tf.constant(0, dtype=tf.int32)))
values = tf.gather_nd(labels, indices)
sparse_labels = tf.SparseTensor(indices, values,
                                dense_shape=tf.shape(labels, out_type=tf.int64))
```

However, I now encounter this error, raised by `ctc_loss`:

```
ValueError: Shape must be rank 1 but is rank 0 for 'CTCLoss' (op: 'CTCLoss')
with input shapes: [?,?,80], [?,2], [?], [].
```
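One likely reading of this error: the four input shapes correspond to `inputs`, `labels.indices`, `labels.values`, and `sequence_length`, and the trailing `[]` shows that `sequence_length` arrived as a rank-0 scalar (`320`), whereas `ctc_loss` expects a rank-1 tensor with one length per batch element. A minimal NumPy sketch of the rank distinction (the batch size of 1 is taken from the question's `numpy_input_fn`):

```python
import numpy as np

# ctc_loss wants sequence_length with shape [batch_size] (rank 1).
# Passing the bare scalar 320 yields the rank-0 shape [] in the error.
scalar_length = np.array(320)                      # rank 0: what the code passes
batch_size = 1
per_example_lengths = np.full([batch_size], 320)   # rank 1: one length per example

print(scalar_length.ndim)         # 0
print(per_example_lengths.ndim)   # 1
print(per_example_lengths.shape)  # (1,)
```

In TensorFlow terms, something like `sequence_length=tf.fill([tf.shape(input_layer)[0], ], 320)` would produce a per-example length tensor; this is a suggestion, not code from the original question.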

I have this code that converts dense labels to sparse:

```python
def convert_to_sparse(labels, dtype=np.int32):
    indices = []
    values = []
    for n, seq in enumerate(labels):
        indices.extend(zip([n] * len(seq), range(len(seq))))
        values.extend(seq)

    indices = np.asarray(indices, dtype=dtype)
    values = np.asarray(values, dtype=dtype)
    shape = np.asarray([len(labels), np.asarray(indices).max(0)[1] + 1], dtype=dtype)

    return indices, values, shape
```
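As a sanity check, here is what `convert_to_sparse` produces for a tiny hypothetical label set (the function is repeated so the snippet runs standalone; the label values are arbitrary class ids, not from the question's dataset):

```python
import numpy as np

def convert_to_sparse(labels, dtype=np.int32):
    # Same function as above, copied so this snippet is self-contained.
    indices = []
    values = []
    for n, seq in enumerate(labels):
        indices.extend(zip([n] * len(seq), range(len(seq))))
        values.extend(seq)
    indices = np.asarray(indices, dtype=dtype)
    values = np.asarray(values, dtype=dtype)
    shape = np.asarray([len(labels), np.asarray(indices).max(0)[1] + 1], dtype=dtype)
    return indices, values, shape

# Two toy label sequences of different lengths.
labels = [[7, 3], [5, 1, 9]]
indices, values, shape = convert_to_sparse(labels)
print(indices.tolist())  # [[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]]
print(values.tolist())   # [7, 3, 5, 1, 9]
print(shape.tolist())    # [2, 3]  (2 sequences, longest has 3 labels)
```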

I converted `y_train` to sparse labels and placed the values inside a `SparseTensor`:

```python
sparse_y_train = convert_to_sparse(y_train)
print(tf.SparseTensor(
    indices=sparse_y_train[0],
    values=sparse_y_train[1],
    dense_shape=sparse_y_train[2]
))
```

And compared it to the `SparseTensor` created inside `grid_rnn_fn`:

```python
indices = tf.where(tf.not_equal(labels, tf.constant(0, dtype=tf.int32)))
values = tf.gather_nd(labels, indices)
sparse_labels = tf.SparseTensor(indices, values,
                                dense_shape=tf.shape(labels, out_type=tf.int64))
```

Here's what I got:

For `sparse_y_train`:

```
SparseTensor(indices=Tensor("SparseTensor/indices:0", shape=(33, 2), dtype=int64),
             values=Tensor("SparseTensor/values:0", shape=(33,), dtype=int32),
             dense_shape=Tensor("SparseTensor/dense_shape:0", shape=(2,), dtype=int64))
```

For `sparse_labels`:

```
SparseTensor(indices=Tensor("Where:0", shape=(?, 2), dtype=int64),
             values=Tensor("GatherNd:0", shape=(?,), dtype=int32),
             dense_shape=Tensor("Shape:0", shape=(2,), dtype=int64))
```

This leads me to think that `ctc_loss` can't handle `SparseTensor`s with dynamic shapes as labels.

Answer:

Yes. It is possible to convert a tensor to a sparse tensor and back:

Let `sparse` be a sparse tensor and `dense` be a dense tensor.

**From sparse to dense:**

```python
dense = tf.sparse_to_dense(sparse.indices, sparse.shape, sparse.values)
```

**From dense to sparse:**

```python
zero = tf.constant(0, dtype=tf.float32)
where = tf.not_equal(dense, zero)
indices = tf.where(where)
values = tf.gather_nd(dense, indices)
sparse = tf.SparseTensor(indices, values, dense.shape)
```
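The same round trip can be checked without a TensorFlow session using plain NumPy. This is a sketch of the idea behind the two conversions above (coordinates plus values plus shape), not the TF API itself:

```python
import numpy as np

dense = np.array([[0., 2., 0.],
                  [3., 0., 4.]], dtype=np.float32)

# Dense -> sparse: keep coordinates and values of the non-zero entries.
indices = np.argwhere(dense != 0)   # [[0, 1], [1, 0], [1, 2]]
values = dense[dense != 0]          # [2., 3., 4.]
shape = dense.shape                 # (2, 3)

# Sparse -> dense: scatter the values back into a zero array.
rebuilt = np.zeros(shape, dtype=np.float32)
rebuilt[indices[:, 0], indices[:, 1]] = values

print(np.array_equal(rebuilt, dense))  # True
```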

Question:

I'm trying to train a neural network using Keras with the TensorFlow backend. My `X` consists of text descriptions that I have processed and transformed into sequences. My `y` is a sparse matrix, since this is multi-label classification and I have many output classes.

```
>>> y
<30405x3387 sparse matrix of type '<type 'numpy.int64'>'
    with 54971 stored elements in Compressed Sparse Row format>
```

To train the model, I tried defining a batch generator:

```python
def batch_generator(x, y, batch_size=32):
    n_batches_per_epoch = x.shape[0] // batch_size
    for i in range(n_batches_per_epoch):
        index_batch = range(x.shape[0])[batch_size * i:batch_size * (i + 1)]
        x_batch = x[index_batch, :]
        y_batch = y[index_batch, :].todense()
        yield x_batch, np.array(y_batch)
```
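The slicing logic can be illustrated with plain NumPy, using a dense stand-in for `y` so no `.todense()` call is needed (the array contents here are made up for the demo). Note that samples beyond the last full batch are silently dropped, and that the generator exhausts after one pass, whereas `fit_generator` expects a generator that loops forever (commonly handled by wrapping the loop in `while True`):

```python
import numpy as np

def batch_generator_dense(x, y, batch_size=32):
    # Same slicing as the generator above, but with dense NumPy arrays.
    n_batches_per_epoch = x.shape[0] // batch_size
    for i in range(n_batches_per_epoch):
        index_batch = range(x.shape[0])[batch_size * i:batch_size * (i + 1)]
        yield x[index_batch, :], y[index_batch, :]

x = np.arange(20).reshape(10, 2)   # 10 samples, 2 features
y = np.zeros((10, 4))              # 10 samples, 4 labels
batches = list(batch_generator_dense(x, y, batch_size=4))

print(len(batches))         # 2  (10 // 4 -> the last 2 samples are dropped)
print(batches[0][0].shape)  # (4, 2)
print(batches[0][1].shape)  # (4, 4)
```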

I've divided my data as:

```python
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
```

I define my model as:

```python
model = Sequential()
# Create architecture, add some layers.
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
```

And I'm training my model as:

```python
model.fit_generator(
    generator=batch_generator(x_train, y_train),
    steps_per_epoch=x_train.shape[0] // 32,
    epochs=200,
    callbacks=the_callbacks
)
```

But my model starts at around 55% accuracy and quickly (within 2 or 3 steps) jumps to 99.95%, which makes no sense at all. Am I doing something wrong?

Answer:

You'll need to switch your loss to `categorical_crossentropy` or change your metric to `crossentropy` for multiclass classification.

The `accuracy` metric is actually ambiguous behind the scenes in Keras: it picks binary or categorical accuracy based on the loss function used.

https://github.com/keras-team/keras/blob/master/keras/engine/training.py#L375
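The jump to ~99.95% is consistent with binary (element-wise) accuracy on a very sparse multi-label target: with 54971 positive entries in a 30405x3387 matrix, even predicting all zeros is already "correct" on about 99.95% of the individual label slots. A quick NumPy check using the shapes from the question:

```python
import numpy as np

rows, cols, positives = 30405, 3387, 54971  # shape and nnz from the question
total = rows * cols

# Element-wise ("binary") accuracy of the trivial all-zeros prediction.
all_zero_accuracy = (total - positives) / total
print(round(all_zero_accuracy, 4))  # 0.9995
```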