## Hot questions for using neural networks in Scala

Question:

I am using TensorBoard within Keras. In TensorFlow one can use two different summary writers for the training and validation scalars, so that TensorBoard can plot them in the same figure, something like the **figure** in

TensorBoard - Plot training and validation losses on the same graph?

Is there a way to do this in Keras?

Thanks.

Answer:

To handle the validation logs with a separate writer, you can write a custom callback that wraps around the original `TensorBoard` methods.

```python
import os
import tensorflow as tf
from keras.callbacks import TensorBoard

class TrainValTensorBoard(TensorBoard):
    def __init__(self, log_dir='./logs', **kwargs):
        # Make the original `TensorBoard` log to a subdirectory 'training'
        training_log_dir = os.path.join(log_dir, 'training')
        super(TrainValTensorBoard, self).__init__(training_log_dir, **kwargs)

        # Log the validation metrics to a separate subdirectory
        self.val_log_dir = os.path.join(log_dir, 'validation')

    def set_model(self, model):
        # Setup writer for validation metrics
        self.val_writer = tf.summary.FileWriter(self.val_log_dir)
        super(TrainValTensorBoard, self).set_model(model)

    def on_epoch_end(self, epoch, logs=None):
        # Pop the validation logs and handle them separately with
        # `self.val_writer`. Also rename the keys so that they can
        # be plotted on the same figure with the training metrics
        logs = logs or {}
        val_logs = {k.replace('val_', ''): v for k, v in logs.items() if k.startswith('val_')}
        for name, value in val_logs.items():
            summary = tf.Summary()
            summary_value = summary.value.add()
            summary_value.simple_value = value.item()
            summary_value.tag = name
            self.val_writer.add_summary(summary, epoch)
        self.val_writer.flush()

        # Pass the remaining logs to `TensorBoard.on_epoch_end`
        logs = {k: v for k, v in logs.items() if not k.startswith('val_')}
        super(TrainValTensorBoard, self).on_epoch_end(epoch, logs)

    def on_train_end(self, logs=None):
        super(TrainValTensorBoard, self).on_train_end(logs)
        self.val_writer.close()
```

- In `__init__`, two subdirectories are set up for the training and validation logs
- In `set_model`, a writer `self.val_writer` is created for the validation logs
- In `on_epoch_end`, the validation logs are separated from the training logs and written to file with `self.val_writer`

Using the MNIST dataset as an example:

```python
from keras.models import Sequential
from keras.layers import Dense
from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10,
          validation_data=(x_test, y_test),
          callbacks=[TrainValTensorBoard(write_graph=False)])
```

You can then visualize the two curves on the same figure in TensorBoard.

**EDIT:** I've modified the class a bit so that it can be used with eager execution. The biggest change is that I use `tf.keras` in the following code. It seems that the `TensorBoard` callback in standalone Keras does not support eager mode yet.

```python
import os
import tensorflow as tf
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.python.eager import context

class TrainValTensorBoard(TensorBoard):
    def __init__(self, log_dir='./logs', **kwargs):
        self.val_log_dir = os.path.join(log_dir, 'validation')
        training_log_dir = os.path.join(log_dir, 'training')
        super(TrainValTensorBoard, self).__init__(training_log_dir, **kwargs)

    def set_model(self, model):
        if context.executing_eagerly():
            self.val_writer = tf.contrib.summary.create_file_writer(self.val_log_dir)
        else:
            self.val_writer = tf.summary.FileWriter(self.val_log_dir)
        super(TrainValTensorBoard, self).set_model(model)

    def _write_custom_summaries(self, step, logs=None):
        logs = logs or {}
        val_logs = {k.replace('val_', ''): v for k, v in logs.items() if 'val_' in k}
        if context.executing_eagerly():
            with self.val_writer.as_default(), tf.contrib.summary.always_record_summaries():
                for name, value in val_logs.items():
                    tf.contrib.summary.scalar(name, value.item(), step=step)
        else:
            for name, value in val_logs.items():
                summary = tf.Summary()
                summary_value = summary.value.add()
                summary_value.simple_value = value.item()
                summary_value.tag = name
                self.val_writer.add_summary(summary, step)
        self.val_writer.flush()

        logs = {k: v for k, v in logs.items() if 'val_' not in k}
        super(TrainValTensorBoard, self)._write_custom_summaries(step, logs)

    def on_train_end(self, logs=None):
        super(TrainValTensorBoard, self).on_train_end(logs)
        self.val_writer.close()
```

The idea is the same:

- Check the source code of the `TensorBoard` callback
- See what it does to set up the writer
- Do the same thing in this custom callback

Again, you can use the MNIST data to test it:

```python
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.train import AdamOptimizer

tf.enable_eager_execution()

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
y_train = y_train.astype(int)
y_test = y_test.astype(int)

model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=AdamOptimizer(), metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10,
          validation_data=(x_test, y_test),
          callbacks=[TrainValTensorBoard(write_graph=False)])
```

Question:

I'm trying to train a classifier via PyTorch. However, I am experiencing problems during training when I feed the model the training data. I get this error on `y_pred = model(X_trainTensor)`:

```
RuntimeError: Expected object of scalar type Float but got scalar type Double for argument #4 'mat1'
```

Here are key parts of my code:

```python
# Hyper-parameters
D_in = 47  # there are 47 parameters I investigate
H = 33
D_out = 2  # output should be either 1 or 0

# Format and load the data
y = np.array(df['target'])
X = np.array(df.drop(columns=['target'], axis=1))
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8)  # split training/test data
X_trainTensor = torch.from_numpy(X_train)  # convert to tensors
y_trainTensor = torch.from_numpy(y_train)
X_testTensor = torch.from_numpy(X_test)
y_testTensor = torch.from_numpy(y_test)

# Define the model
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
    nn.LogSoftmax(dim=1)
)

# Define the loss function
loss_fn = torch.nn.NLLLoss()

# Training loop
for i in range(50):
    y_pred = model(X_trainTensor)
    loss = loss_fn(y_pred, y_trainTensor)
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad
```

Answer:

The reference is from this GitHub issue.

When the error is `RuntimeError: Expected object of scalar type Float but got scalar type Double for argument #4 'mat1'`, you need to use the `.float()` function, since it says `Expected object of scalar type Float`.

Therefore, the solution is changing `y_pred = model(X_trainTensor)` to `y_pred = model(X_trainTensor.float())`.

Likewise, when you get another error for `loss = loss_fn(y_pred, y_trainTensor)`, you need `y_trainTensor.long()`, since that error message says `Expected object of scalar type Long`.

You could also do `model.double()`, as suggested by @Paddy.
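Putting the two casts together, here is a minimal runnable sketch. The shapes follow the question's hyper-parameters, but the random stand-in data is hypothetical; the real code reads from a DataFrame:

```python
import numpy as np
import torch

# Stand-in data: numpy float64 arrays become Double tensors by default,
# while nn.Linear weights are Float32 -- hence the dtype mismatch.
X_train = np.random.rand(8, 47)            # float64
y_train = np.random.randint(0, 2, size=8)  # int64

model = torch.nn.Sequential(
    torch.nn.Linear(47, 33),
    torch.nn.ReLU(),
    torch.nn.Linear(33, 2),
    torch.nn.LogSoftmax(dim=1),
)
loss_fn = torch.nn.NLLLoss()

X_trainTensor = torch.from_numpy(X_train)  # dtype: torch.float64 (Double)
y_trainTensor = torch.from_numpy(y_train)  # dtype: torch.int64 (Long)

y_pred = model(X_trainTensor.float())         # cast inputs to Float to match the weights
loss = loss_fn(y_pred, y_trainTensor.long())  # NLLLoss expects Long targets
print(loss.item())
```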

Question:

```python
import torch.nn as nn
import torch
import torch.optim as optim
import itertools

class net1(nn.Module):
    def __init__(self):
        super(net1, self).__init__()
        self.pipe = nn.Sequential(
            nn.Linear(10, 10),
            nn.ReLU()
        )

    def forward(self, x):
        return self.pipe(x.long())

class net2(nn.Module):
    def __init__(self):
        super(net2, self).__init__()
        self.pipe = nn.Sequential(
            nn.Linear(10, 20),
            nn.ReLU(),
            nn.Linear(20, 10)
        )

    def forward(self, x):
        return self.pipe(x.long())

netFIRST = net1()
netSECOND = net2()

learning_rate = 0.001
opt = optim.Adam(itertools.chain(netFIRST.parameters(), netSECOND.parameters()),
                 lr=learning_rate)

epochs = 1000

x = torch.tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=torch.long)
y = torch.tensor([10, 9, 8, 7, 6, 5, 4, 3, 2, 1], dtype=torch.long)

for epoch in range(epochs):
    opt.zero_grad()
    prediction = netSECOND(netFIRST(x))
    loss = (y.long() - prediction)**2
    loss.backward()
    print(loss)
    print(prediction)
    opt.step()
```

error:

```
line 49, in
    prediction = netSECOND(netFIRST(x))
line 1371, in linear
    output = input.matmul(weight.t())
RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'mat2'
```

I don't really see what I'm doing wrong. I have tried to turn everything into a `Long` in every possible way, and I don't really get how typing works in PyTorch. Last time I tried something with just one layer, it forced me to use type `int`.
Could someone explain how typing is established in PyTorch, and how to prevent and fix errors like this?
Thanks a lot in advance; this problem really bothers me and I can't seem to fix it no matter what I try.

Answer:

The weights are Floats and the inputs are Longs; this is not allowed. In fact, I don't think torch supports anything other than Floats in neural networks.

If you remove *all* calls to `long` and define your input as floats, it will work (it does; I tried).

(You will then get another, unrelated error: you need to sum your loss to a scalar before calling `backward`.)
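A sketch of the corrected setup, keeping the question's two-network structure and layer sizes, but feeding Float tensors and reducing the loss to a scalar (the epoch count is shortened for illustration):

```python
import itertools
import torch
import torch.nn as nn
import torch.optim as optim

# Same architecture as the question, but without the .long() casts in forward()
netFIRST = nn.Sequential(nn.Linear(10, 10), nn.ReLU())
netSECOND = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 10))

opt = optim.Adam(itertools.chain(netFIRST.parameters(), netSECOND.parameters()),
                 lr=0.001)

# Inputs and targets as Floats, matching the Float layer weights
x = torch.tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=torch.float32)
y = torch.tensor([10, 9, 8, 7, 6, 5, 4, 3, 2, 1], dtype=torch.float32)

for epoch in range(200):
    opt.zero_grad()
    prediction = netSECOND(netFIRST(x))
    loss = ((y - prediction) ** 2).sum()  # reduce to a scalar before backward()
    loss.backward()
    opt.step()

print(loss.item())
```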

Question:

Below is my implementation of a neural network with one input layer, two hidden layers, and one output layer:

```scala
import breeze.linalg._
import breeze.math._
import breeze.numerics._

object NN extends App {
  // Forward propagation
  val x1 = DenseVector(1.0, 0.0, 1.0)
  val y1 = DenseVector(1.0, 1.0, 1.0)

  val theta1 = DenseMatrix((1.0, 1.0, 1.0), (1.0, 1.0, 0.0), (1.0, 0.0, 0.0))
  val theta2 = DenseMatrix((1.0, 1.0, 1.0), (1.0, 1.0, 0.0), (1.0, 0.0, 0.0))
  val theta3 = DenseMatrix((1.0, 1.0, 1.0), (1.0, 1.0, 0.0), (1.0, 0.0, 0.0))

  val a1 = x1
  val z2 = theta1 * a1
  val a2 = z2.map { x => 1 + sigmoid(x) }
  val z3 = theta2 * a2
  val a3 = z3.map { x => 1 + sigmoid(x) }
  val z4 = theta3 * a3
  val a4 = z4.map { x => 1 + sigmoid(x) }

  // Back propagation
  val errorLayer4 = a4 - DenseVector(1.0, 1.0, 1.0)
  val errorLayer3 = (theta3.t * errorLayer4) :* (a3 :* (DenseVector(1.0, 1.0, 1.0) - a3))
  val errorLayer2 = (theta2.t * errorLayer3) :* (a2 :* (DenseVector(1.0, 1.0, 1.0) - a2))

  // Compute delta values
  val delta1 = errorLayer2 * a2.t
  val delta2 = errorLayer3 * a3.t
  val delta3 = errorLayer4 * a4.t

  // Gradient descent
  val m = 1
  val alpha = .0001

  val x = DenseVector(1.0, 0.0, 1.0)
  val y = DenseVector(1.0, 1.0, 1.0)

  val pz1 = delta1 - (alpha / m) * (x.t * (delta1 * x - y))
  val p1z1 = sigmoid(delta1 * x) + 1.0
  println(p1z1)

  val pz2 = delta2 - (alpha / m) * (x.t * (delta2 * x - y))
  val p1z2 = sigmoid(delta2 * p1z1) + 1.0
  println(p1z2)

  val pz3 = delta3 - (alpha / m) * (x.t * (delta3 * x - y))
  val p1z3 = sigmoid(delta3 * p1z2) + 1.0
  println(p1z3)
}
```

The output of this network is:

```
Jun 03, 2016 7:47:50 PM com.github.fommil.netlib.BLAS <clinit>
WARNING: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
Jun 03, 2016 7:47:50 PM com.github.fommil.jni.JniLoader liberalLoad
INFO: successfully loaded C:\Users\Local\Temp\jniloader3606930058943197684netlib-native_ref-win-x86_64.dll
DenseVector(2.0, 2.0, 1.9999999999946196)
DenseVector(1.0, 1.0, 1.0000000064265646)
DenseVector(1.9971047766732295, 1.9968279599465841, 1.9942769808711798)
```

I'm using a single training example `101` with output value `111`. The predicted value given `1,0,1` is `1.9,1.9,1.9`, when it should be `1,1,1`.

I think the way I'm computing the sigmoid with bias is incorrect. Should the bias +1 value be added inside the sigmoid calculation for the layer, in other words use `{ x => sigmoid(x + 1) }` instead of `{ x => 1 + sigmoid(x) }`?

Answer:

A perceptron-style neuron's output is `sigmoid(sum(xi * wi))`, where the bias input `x0` is 1 but its weight is not necessarily 1. You definitely don't add the 1 outside the sigmoid, but you also don't add it inside; you need a weight. So it should be equivalent to

```
sigmoid(w0 + w1*x1 + w2*x2 + ...)
```
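A quick numeric check of this equivalence, with illustrative weight values (`w0` is the bias weight on the constant input `x0 = 1`):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w0 = 0.5                         # bias weight (illustrative value)
w = np.array([1.0, -2.0, 0.25])  # weights for the real inputs (illustrative)
x = np.array([1.0, 0.0, 1.0])    # the question's input 1,0,1

# The bias enters inside the sigmoid as w0 * x0 with x0 = 1 ...
out = sigmoid(w0 + np.dot(w, x))

# ... which is the same as prepending x0 = 1 to the input vector
x_aug = np.concatenate(([1.0], x))
w_aug = np.concatenate(([w0], w))
assert np.isclose(out, sigmoid(np.dot(w_aug, x_aug)))
print(out)
```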

Question:

Please consider the following code:

```
# update
W1 = W1 - learningRate * dJdW1
W2 = W2 - learningRate * dJdW2
```

where `learningRate` is a double and `dJdW1`/`dJdW2` are 2-d matrices.

I'm getting this error:

```
ERROR: Runtime error in program block generated from statement block between lines 58 and 61 --
Error evaluating instruction: CP°-*°W2·MATRIX·DOUBLE°1.0E-5·SCALAR·DOUBLE·true°dJdW2·MATRIX·DOUBLE°_mVar117·MATRIX·DOUBLE
```

**EDIT 12.7.17:** plus this one...

```
ordinal not in range(128)'))
```

The whole DML can be found here.

The complete error can be found here.

The whole Jupyter notebook can be found here.

Answer:

The cellwise scalar-matrix operation is fine. Looking at your error, it says that your matrix/vector dimensions are not compatible:

```
Block sizes are not matched for binary cell operations: 3x1 vs 2x3
org.apache.sysml.runtime.matrix.data.MatrixBlock.binaryOperations(MatrixBlock.java:2872)
org.apache.sysml.runtime.instructions.cp.PlusMultCPInstruction.processInstruction(PlusMultCPInstruction.java:66)
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
```

Looking at your Notebook, this comes from:

```
W2 = W2 - learningRate * dJdW2
```

`W2` is initialized with `W2 = rand(rows=hiddenLayerSize, cols=outputLayerSize)` as a 3x1 matrix, while `dJdW2` is a 2x3 matrix.
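The same dimension clash can be reproduced in numpy terms (as an analogy; SystemML reports it as the "Block sizes are not matched" error):

```python
import numpy as np

learningRate = 1.0e-5
W2 = np.random.rand(3, 1)     # rand(rows=hiddenLayerSize, cols=outputLayerSize): 3x1
dJdW2 = np.random.rand(2, 3)  # but the gradient comes out 2x3

try:
    W2 = W2 - learningRate * dJdW2  # shapes (3, 1) and (2, 3) cannot be combined
    raised = False
except ValueError:
    raised = True                   # numpy's analogue of the block-size mismatch

# The fix: make W2's dimensions match its gradient
W2 = np.random.rand(2, 3)
W2 = W2 - learningRate * dJdW2      # now fine
```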

Question:

The flow should be:

Input -> Word2Vectors -> Output -> NeuralNetwork

I have tried the `Word2Vec` function of Spark, but I am confused about the input format `MultilayerPerceptronClassifier` needs.

Answer:

When you define your `MultilayerPerceptronClassifier` you have to give it an `Array[Int]` parameter called `layers`, which describes the number of neurons per layer, in sequence. The first layer's input dimension must match the length of the `Word2Vec` output vectors. So you should set the parameter to

```scala
val layers = Array[Int](featureDim, 5, 4, 5, ...)
```

and replace the numbers with the parameters you want your model to have. You should set `featureDim` to the length of the vectors your `Word2VecModel` produces. Unfortunately, that attribute is hidden behind a `private` accessor, and no getter method is implemented as of now.

Question:

When computing the delta values for a neural network after running back propagation, the value of delta(1) comes out as a scalar. Should it be a vector?

Update:

Taken from http://www.holehouse.org/mlclass/09_Neural_Networks_Learning.html

Answer:

First, you probably understand that in each layer we have `n x m` parameters (or weights) that need to be learned, so they form a 2-d matrix:

- `n` is the number of nodes in the current layer plus 1 (for the bias)
- `m` is the number of nodes in the previous layer

We have `n x m` parameters because there is one connection between any two nodes of the previous and the current layer.

I am pretty sure that Delta (big delta) at layer L is used to accumulate the partial derivative terms for every parameter at layer L, so you have a 2-d matrix of Delta at each layer as well. To update the entry in the i-th row (the i-th node in the current layer) and j-th column (the j-th node in the previous layer) of the matrix:

```
D_(i,j) = D_(i,j) + a_j * delta_i
```

where `a_j` is the activation of the j-th node in the previous layer and `delta_i` is the error of the i-th node in the current layer, so we accumulate the error in proportion to the activation.

Thus, to answer your question, Delta should be a matrix.
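A minimal numpy sketch of the accumulation, with illustrative layer sizes and values, showing that the per-entry update rule is exactly an outer product:

```python
import numpy as np

n_current, n_prev = 3, 4               # illustrative layer sizes
D = np.zeros((n_current, n_prev))      # big Delta: one entry per weight

a = np.array([1.0, 0.5, -0.25, 2.0])   # activations a_j of the previous layer
delta = np.array([0.1, -0.2, 0.3])     # errors delta_i of the current layer

# D_(i,j) += a_j * delta_i for all i, j is the outer product delta x a
D += np.outer(delta, a)
print(D)
```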