## Hot questions on using neural networks for nonlinear regression

Question:

Is there any difference in the architecture of a neural net for regression (time series prediction) and for classification?

I ran some regression tests, but I got quite bad results.

I'm currently using a basic feed-forward net with one hidden layer of 2 to 4 neurons, a `tanh` activation function, and momentum.

Answer:

It depends on a lot of factors:

In the case of classification, you may have a binary problem (where you want to discriminate between two classes) or a multinomial one. In both cases you could use different architectures to model the data well.

In the case of sequence regression, you can also use many different architectures, from a plain feedforward network that takes one series as input and returns another as output, to a variety of recurrent architectures.

So your question is similar to asking whether the tools for building cars differ from the tools for building bridges: it's too ambiguous, and you need to specify more details.
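To make the contrast concrete, here is a minimal NumPy sketch (not from the original answer; all array sizes are arbitrary illustrative choices): for a shared hidden representation, the two tasks typically differ mainly in the output head and the loss.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = rng.standard_normal((5, 4))       # 5 samples, 4 hidden units

# Regression head: linear output, mean-squared-error loss.
w_reg = rng.standard_normal((4, 1))
y_pred = hidden @ w_reg                    # unbounded real values
y_true = rng.standard_normal((5, 1))
mse = np.mean((y_pred - y_true) ** 2)

# Classification head: softmax output, cross-entropy loss.
w_clf = rng.standard_normal((4, 3))
logits = hidden @ w_clf
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)  # each row sums to 1
labels = np.array([0, 2, 1, 0, 2])
cross_entropy = -np.mean(np.log(probs[np.arange(5), labels]))
```

The hidden layers can be identical; what changes is the output activation (linear vs. softmax) and the objective being minimized.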

Question:

I don't understand why my code won't run. I started with the TensorFlow tutorial to classify the images in the MNIST data set using a single-layer feedforward neural net, then modified the code to create a multilayer perceptron that maps 37 inputs to 1 output. The input and output training data are loaded from MATLAB data files (`.mat`).

Here is my code:

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from scipy.io import loadmat
%matplotlib inline
import tensorflow as tf
from tensorflow.contrib import learn
import sklearn
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from warnings import filterwarnings
filterwarnings('ignore')
sns.set_style('white')
from sklearn import datasets
from sklearn.preprocessing import scale
from sklearn.cross_validation import train_test_split
from sklearn.datasets import make_moons

X = np.array(loadmat("Data/DataIn.mat")['TrainingDataIn'])
Y = np.array(loadmat("Data/DataOut.mat")['TrainingDataOut'])
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=.5)
total_len = X_train.shape[0]

# Parameters
learning_rate = 0.001
training_epochs = 500
batch_size = 10
display_step = 1
dropout_rate = 0.9

# Network Parameters
n_hidden_1 = 19  # 1st layer number of features
n_hidden_2 = 26  # 2nd layer number of features
n_input = X_train.shape[1]
n_classes = 1

# tf Graph input
X = tf.placeholder("float32", [None, 37])
Y = tf.placeholder("float32", [None])

def multilayer_perceptron(X, weights, biases):
    # Hidden layer with RELU activation
    layer_1 = tf.add(tf.matmul(X, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    # Output layer with linear activation
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1], 0, 0.1)),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2], 0, 0.1)),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes], 0, 0.1))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1], 0, 0.1)),
    'b2': tf.Variable(tf.random_normal([n_hidden_2], 0, 0.1)),
    'out': tf.Variable(tf.random_normal([n_classes], 0, 0.1))
}

# Construct model
pred = multilayer_perceptron(X, weights, biases)
tf.shape(pred)
tf.shape(Y)
print("Prediction matrix:", pred)
print("Output matrix:", Y)

# Define loss and optimizer
cost = tf.reduce_mean(tf.square(pred - Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(total_len / batch_size)
        print(total_batch)
        # Loop over all batches
        for i in range(total_batch - 1):
            batch_x = X_train[i * batch_size:(i + 1) * batch_size]
            batch_y = Y_train[i * batch_size:(i + 1) * batch_size]
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c, p = sess.run([optimizer, cost, pred],
                               feed_dict={X: batch_x, Y: batch_y})
            # Compute average loss
            avg_cost += c / total_batch
            # sample prediction
            label_value = batch_y
            estimate = p
            err = label_value - estimate
            print("num batch:", total_batch)
        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=",
                  "{:.9f}".format(avg_cost))
            print("[*]----------------------------")
            for i in range(5):
                print("label value:", label_value[i],
                      "estimated value:", estimate[i])
            print("[*]============================")
    print("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(tf.argmax(pred), tf.argmax(Y))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print("Accuracy:", accuracy.eval({X: X_test, Y: Y_test}))
```

When I run the code, I get these error messages:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-6b8af9192775> in <module>()
     93             # Run optimization op (backprop) and cost op (to get loss value)
     94             _, c, p = sess.run([optimizer, cost, pred], feed_dict={X: batch_x,
---> 95                                                                    Y: batch_y})
     96             # Compute average loss
     97             avg_cost += c / total_batch

~\AppData\Local\Continuum\Anaconda3\envs\ann\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
    787     try:
    788       result = self._run(None, fetches, feed_dict, options_ptr,
--> 789                          run_metadata_ptr)
    790       if run_metadata:
    791         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~\AppData\Local\Continuum\Anaconda3\envs\ann\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
    973                 'Cannot feed value of shape %r for Tensor %r, '
    974                 'which has shape %r'
--> 975                 % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
    976             if not self.graph.is_feedable(subfeed_t):
    977               raise ValueError('Tensor %s may not be fed.' % subfeed_t)

ValueError: Cannot feed value of shape (10, 1) for Tensor 'Placeholder_7:0', which has shape '(?,)'
```

Answer:

I've encountered this problem before. The difference is that a tensor of shape `(10, 1)` looks like `[[1], [2], [3]]`, while a tensor of shape `(10,)` looks like `[1, 2, 3]`.
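The same distinction can be demonstrated with plain NumPy arrays (an illustrative sketch, independent of TensorFlow):

```python
import numpy as np

col = np.array([[1], [2], [3]])   # shape (3, 1): a column vector
flat = np.array([1, 2, 3])        # shape (3,): a 1-D array

# Broadcasting a (3, 1) array against a (3,) array silently yields a
# (3, 3) result -- one reason such shape mismatches are worth fixing
# explicitly rather than letting them slip through.
diff = col - flat
```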

You should be able to fix it by changing the line

```python
Y = tf.placeholder("float32", [None])
```

to:

```python
Y = tf.placeholder("float32", [None, 1])
```

Question:

My goal is to create a neural network with a single hidden layer (with ReLU activation) that can approximate a simple univariate square-root function. I have implemented the network with NumPy and also did a gradient check; everything seems fine, except for the result: for some reason I can only obtain linear approximations, like this: (plot: noisy sqrt approximation)

I tried changing the hyperparameters, without any success. Any ideas?

```python
import numpy as np

step_size = 1e-6
input_size, output_size = 1, 1
h_size = 10
train_size = 500
x_train = np.abs(np.random.randn(train_size, 1) * 1000)
y_train = np.sqrt(x_train) + np.random.randn(train_size, 1) * 0.5

# initialize weights and biases
Wxh = np.random.randn(input_size, h_size) * 0.01
bh = np.zeros((1, h_size))
Why = np.random.randn(h_size, output_size) * 0.01
by = np.zeros((1, output_size))

for i in range(300000):
    # forward pass
    h = np.maximum(0, np.dot(x_train, Wxh) + bh)
    y_est = np.dot(h, Why) + by

    loss = np.sum((y_est - y_train)**2) / train_size
    dy = 2 * (y_est - y_train) / train_size
    print("loss: ", loss)

    # backprop at output
    dWhy = np.dot(h.T, dy)
    dby = np.sum(dy, axis=0, keepdims=True)
    dh = np.dot(dy, Why.T)

    # backprop ReLU non-linearity
    dh[h <= 0] = 0

    # backprop Wxh and bh
    dWxh = np.dot(x_train.T, dh)
    dbh = np.sum(dh, axis=0, keepdims=True)

    Wxh += -step_size * dWxh
    bh += -step_size * dbh
    Why += -step_size * dWhy
    by += -step_size * dby
```

(Note: the original listing referred to `bh1` and `dh1`, which are never defined; they should be `bh` and `dh` as above.)

Edit: It seems the problem was the lack of normalization and the data not being zero-centered. After applying these transformations to the training data, I managed to obtain the following result: (plot: noisy sqrt 2)

Answer:

I can get your code to produce a sort of piecewise-linear approximation if I zero-centre and normalise your input and output ranges:

```python
# normalise range and domain
x_train -= x_train.mean()
x_train /= x_train.std()
y_train -= y_train.mean()
y_train /= y_train.std()
```

The plot is produced like so:

```python
x = np.linspace(x_train.min(), x_train.max(), 3000)
y = np.dot(np.maximum(0, np.dot(x[:, None], Wxh) + bh), Why) + by

import matplotlib.pyplot as plt
plt.plot(x, y)
plt.show()
```

Question:

I am trying to run an MLP regressor with one hidden layer on my dataset. I am standardizing my data, but I want to be clear on whether it matters if I standardize before or after splitting the dataset into training and test sets. Will there be any difference in my predicted values if I standardize before the split?

Answer:

Yes and no. If the mean and variance of the training and test sets differ, standardizing before versus after the split can lead to different outcomes.

That being said, a good training/test split should produce sets whose data points are distributed similarly, in which case post-split standardization should give essentially the same results.
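As a rough sanity check, here is an illustrative NumPy sketch (synthetic data, not from the original answer) comparing the two orderings directly; with similar train/test distributions the difference is nonzero but small:

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=5.0, scale=2.0, size=1000)
train, test = data[:500], data[500:]

# Standardize after the split: statistics come from the training set only.
mu, sigma = train.mean(), train.std()
train_post = (train - mu) / sigma
test_post = (test - mu) / sigma

# Standardize before the split: statistics "leak" in from the test set.
mu_all, sigma_all = data.mean(), data.std()
train_pre = (data[:500] - mu_all) / sigma_all

# The two versions of the training data differ slightly, because the
# train and test halves have slightly different sample means/variances.
max_diff = np.abs(train_post - train_pre).max()
```

Standardizing after the split is also the safer habit in general, since fitting the scaler on the full dataset lets test-set statistics influence the training pipeline.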

Question:

I am new to **ML**, and I have a dataset:

Where:

X = {X_1, X_2, X_3, X_4, X_5, X_6, X_7}; Y = Y;

I'm trying to find a possible relationship between **X** and **Y**, like **Y = M(X)**, using **Deep Learning**. To my knowledge, this is a **regression** task, since my target **Y** is real-valued.

I have tried some regression algorithms like LMS and stepwise regression, but none of them gives me a promising result. So I'm turning to a deep neural network solution:

- Can an **ANN** do this regression task?
- How should I design the network, especially the types of layers, activation functions, etc.?
- Is there an existing NN architecture I can refer to?

**Any help is appreciated.**

Answer:

I don't have a solution for the machine learning part, but I do have one that might work (since you said any help is appreciated).

I will say it might be difficult to use machine learning here: not only do you need to find a relationship (assuming there is one), you also need to choose the right type of model (linear, quadratic, exponential, sinusoidal, etc.) and then fit that model's parameters.

In the R programming language, it is easy to set up a multiple linear regression, for example. Here is a sketch of the R code you would use:

```r
data = load("data.Rdata")  # or load a table or something
regression = lm(Y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7, data = data)
print(summary(regression))
```

(Note that R is case-sensitive, so the predictor must be written `X7`, not `x7`.)

Edit: you might get better answers here: https://datascience.stackexchange.com/