Hot questions on using neural networks in chatbots

Question:

I am working on a generative chatbot based on seq2seq in Keras. I used code from this site: https://machinelearningmastery.com/develop-encoder-decoder-model-sequence-sequence-prediction-keras/

My model looks like this:

# define training encoder
encoder_inputs = Input(shape=(None, n_input))
encoder = LSTM(n_units, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]

# define training decoder
decoder_inputs = Input(shape=(None, n_output))
decoder_lstm = LSTM(n_units, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = Dense(n_output, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

# define inference encoder
encoder_model = Model(encoder_inputs, encoder_states)

# define inference decoder
decoder_state_input_h = Input(shape=(n_units,))
decoder_state_input_c = Input(shape=(n_units,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)

This network is designed to work with one-hot encoded vectors; its input looks, for example, like this:

[[[0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0.]
  [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0.]]
  [[0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0.]]]
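For reference, vectors like these can be produced from integer word ids with a short NumPy sketch (the vocabulary size of 51 is inferred from the vectors above; the function name is illustrative):

```python
import numpy as np

def one_hot_sequence(ids, vocab_size):
    # one row per time step, one column per vocabulary word
    encoded = np.zeros((len(ids), vocab_size))
    encoded[np.arange(len(ids)), ids] = 1.0
    return encoded

seq = one_hot_sequence([4, 7, 2], vocab_size=51)
# seq[0] has a 1 at index 4, matching the first row of the sample above
```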

How can I rebuild these models to work with words? I would like to use a word embedding layer, but I have no idea how to connect it to these models.

My input should be [[1,5,6,7,4], [4,5,7,5,4], [7,5,4,2,1]], where the integers are word indices.

I tried everything but I'm still getting errors. Can you help me, please?


Answer:

I finally got it working. Here is the code:

Shared_Embedding = Embedding(output_dim=embedding, input_dim=vocab_size, name="Embedding")

encoder_inputs = Input(shape=(sentenceLength,), name="Encoder_input")
encoder = LSTM(n_units, return_state=True, name='Encoder_lstm') 
word_embedding_context = Shared_Embedding(encoder_inputs) 
encoder_outputs, state_h, state_c = encoder(word_embedding_context) 
encoder_states = [state_h, state_c] 
decoder_lstm = LSTM(n_units, return_sequences=True, return_state=True, name="Decoder_lstm")

decoder_inputs = Input(shape=(sentenceLength,), name="Decoder_input")
word_embedding_answer = Shared_Embedding(decoder_inputs) 
decoder_outputs, _, _ = decoder_lstm(word_embedding_answer, initial_state=encoder_states) 
decoder_dense = Dense(vocab_size, activation='softmax', name="Dense_layer") 
decoder_outputs = decoder_dense(decoder_outputs) 

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

encoder_model = Model(encoder_inputs, encoder_states) 

decoder_state_input_h = Input(shape=(n_units,), name="H_state_input") 
decoder_state_input_c = Input(shape=(n_units,), name="C_state_input") 
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c] 
decoder_outputs, state_h, state_c = decoder_lstm(word_embedding_answer, initial_state=decoder_states_inputs) 
decoder_states = [state_h, state_c] 
decoder_outputs = decoder_dense(decoder_outputs)

decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)

"model" is the training model; encoder_model and decoder_model are the inference models.
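To feed this model, each sentence must first be mapped to integer word ids and padded to sentenceLength, as the Embedding layer expects. A minimal sketch of one way to do that (the tokenization and zero-padding scheme here are illustrative assumptions, not part of the original code):

```python
# Illustrative: build a word-to-id vocabulary and encode padded
# integer sequences for the shared Embedding layer.
sentences = ["hello how are you", "i am fine thanks"]

# reserve id 0 for padding, so real words start at 1
vocab = {}
for sentence in sentences:
    for word in sentence.split():
        if word not in vocab:
            vocab[word] = len(vocab) + 1

sentence_length = 5  # corresponds to sentenceLength in the model

def encode(sentence):
    ids = [vocab[w] for w in sentence.split()]
    return ids + [0] * (sentence_length - len(ids))  # pad with zeros

encoded = [encode(s) for s in sentences]
```

With this scheme, vocab_size for the Embedding layer would be len(vocab) + 1 to account for the padding id.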

Question:

After training my model for almost 2 days, 3 files were generated:

best_model.ckpt.data-00000-of-00001
best_model.ckpt.index
best_model.ckpt.meta

where best_model is my model name. When I try to import my model using the following command

with tf.Session() as sess:
  saver = tf.train.import_meta_graph('best_model.ckpt.meta')
  saver.restore(sess, "best_model.ckpt")

I get the following error

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/shreyash/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1577, in import_meta_graph
    **kwargs)
  File "/home/shreyash/.local/lib/python2.7/site-packages/tensorflow/python/framework/meta_graph.py", line 498, in import_scoped_meta_graph
    producer_op_list=producer_op_list)
  File "/home/shreyash/.local/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 259, in import_graph_def
    raise ValueError('No op named %s in defined operations.' % node.op)
ValueError: No op named attn_add_fun_f32f32f32 in defined operations.

How to fix this?

I have referred to this post: TensorFlow, why there are 3 files after saving the model?

  • TensorFlow 1.0.0, installed using pip
  • Linux version 16.04
  • Python 2.7

Answer:

The importer can't find a very specific function in your graph, namely attn_add_fun_f32f32f32, which is likely one of the attention functions.

You have probably run into this issue. However, it is said to be bundled in TensorFlow 1.0. Double-check that your installed TensorFlow version contains attention_decoder_fn.py (or, if you are using another library, that the corresponding file is there).

If it's there, here are your options:

  • Rename this operation, if possible. You might want to read this discussion for workarounds.
  • Rebuild your graph definition in code, so that you won't have to call import_meta_graph and can restore the checkpoint into the current graph instead.

Question:

I'm trying to create a validation_set for this chatbot tutorial: Contextual Chatbots with Tensorflow

But I'm having issues with the shape of my data. This is the method I'm using to create both my training and validation sets:

words = []
classes = []
documents = []
ignore_words = ['?']
# loop through each sentence in our intents patterns
for intent in intents['intents']:
    for pattern in intent['patterns']:
        # tokenize each word in the sentence
        w = nltk.word_tokenize(pattern)
        # add to our words list
        words.extend(w)
        # add to documents in our corpus
        documents.append((w, intent['tag']))
        # add to our classes list
        if intent['tag'] not in classes:
            classes.append(intent['tag'])

# stem and lower each word and remove duplicates
words = [stemmer.stem(w.lower()) for w in words if w not in ignore_words]
words = sorted(list(set(words)))

# remove duplicates
classes = sorted(list(set(classes)))


# create our training data
training = []
output = []
# create an empty array for our output
output_empty = [0] * len(classes)

# training set, bag of words for each sentence
for doc in documents:
    # initialize our bag of words
    bag = []
    # list of tokenized words for the pattern
    pattern_words = doc[0]
    # stem each word
    pattern_words = [stemmer.stem(word.lower()) for word in pattern_words]
    # create our bag of words array
    for w in words:
        bag.append(1 if w in pattern_words else 0)

    # output is a '0' for each tag and '1' for current tag
    output_row = list(output_empty)
    output_row[classes.index(doc[1])] = 1

    training.append([bag, output_row])

# shuffle our features and turn into np.array
random.shuffle(training)
training = np.array(training)

# create train and test lists
x = list(training[:,0])
y = list(training[:,1])

I run this twice with different data to get my training and validation sets. The problem is that I initialize my TensorFlow input layer with the shape of my training set:

 net = tflearn.input_data(shape=[None, len(train_x[0])])

So when I fit the model:

model.fit(train_x, train_y, n_epoch=1000,snapshot_step=100, snapshot_epoch=False, validation_set=(val_x,val_y), show_metric=True)

I get this error:

ValueError: Cannot feed value of shape (23, 55) for Tensor 'InputData/X:0', which has shape '(?, 84)'

Where 23 is the number of questions and 55 the number of unique words of my validation set. And 84 is the number of unique words in the training set.

Because my validation set has a different number of questions/unique words from my training set, I can't validate my training.

Can someone help me create a validation set that is independent of the number of questions? I'm new to TensorFlow and TFLearn, so any help would be great.


Answer:

To the best of my understanding, this is what you did: you created a dictionary called words, which contains all possible words in the dataset. Then, while creating the training dataset, you searched for each word of a question in that dictionary and appended 1 to your bag of words if it was there and 0 otherwise. The issue is that each question has a different number of words, and hence a different number of 1's and 0's.

You can get around this by doing the reverse: search for each word of the dictionary words in the given question of the training set, and append 1 to your bag of words if it's there and 0 otherwise. This way all questions become the same length (the length of the dictionary words). Your training set will then have dimensions (num_of_questions_in_training, len(words)).

The same thing can be done for the validation set: search for each word of the dictionary words in the given question of the validation set, and append 1 to your bag of words if it's there and 0 otherwise. This way your validation set will have dimensions (num_of_questions_in_validation, len(words)), which solves your dimension mismatch.

So, assuming there are 90 words in words: training_set_shape: (?, 90), validation_set_shape: (23, 90).
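The fixed-length encoding described above can be sketched as follows (the word lists here are made up for illustration; the point is that both sets are encoded against the same training-time dictionary):

```python
# The SAME dictionary, built from training data only, is used for
# every question, whether it comes from the training or validation set.
words = ["hi", "how", "are", "you", "bye", "thanks"]

def bag_of_words(sentence_words, dictionary):
    # one entry per dictionary word, so the output length is fixed
    return [1 if w in sentence_words else 0 for w in dictionary]

train_vec = bag_of_words(["hi", "you"], words)
val_vec = bag_of_words(["thanks", "bye", "unknown"], words)
# both vectors have len(words) entries, so the shapes always match;
# words unseen in training (like "unknown") are simply dropped
```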