Hot questions for using neural networks in simulation

Question:

My question is about spiking neural networks. The input of a typical spiking neuron is usually a floating-point value representing its inflow current, typically expressed in mA or similar units, as in the following simple example:

static const float 
      dt = 1.0/1000,  // sampling period
      gL = 0.999,     // leak conductance
      vT = 30.0;      // spiking voltage threshold
float mV = 0;         // membrane voltage

// Leaky integrate-and-fire neuron model step
bool step_lif_neuron(float I) { // given input current "I", returns "true" if neuron had spiked
    mV += (I - mV*gL)*dt;
    if( mV > vT ) { // reset? heaviside function is non-differentiable and discontinuous
        mV = 0;
        return true;
    }
    return false;
}

That is fine if its purpose is to determine the relation of an input image to some class, or to turn a motor or lamp on or off. But here comes the principal problem: this model does not describe neuron interconnection. We cannot connect one neuron to the next, as happens inside the brain.

How does one convert the bool isSpiked value of the preceding neuron into the float I input value of the next neuron?


Answer:

This isn't a typical SO question, but here's the answer.

Of course your model doesn't answer your question, as it is a model of a single neuron. For connections (synapses in the brain, or elsewhere), you need a model of the synapse. In biology, a presynaptic spike (i.e. an "input spike" to a synapse) causes a time-dependent change of the postsynaptic membrane conductance. This conductance change approximately follows a so-called double-exponential shape, roughly g(t) ∝ exp(-t/τ_decay) - exp(-t/τ_rise), where the presynaptic spike occurred at time t = 0.

This conductance change leads to a (time-dependent) current into the postsynaptic neuron (i.e. the neuron receiving the input). For simplicity, many models describe the input current directly. The common shapes are

  • double exponential (realistic)
  • alpha (similar to double exponential)
  • exponential (simpler and still captures the most important property)
  • rectangular (simpler, and convenient for theoretical models)
  • delta shaped (simplest, just a single pulse for one time step).

[Plots comparing these shapes are omitted here: one scaled to the same height at the maximum, and one scaled to the same overall current, i.e. the same integral over the time course.]

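If you want to reproduce such a comparison, here is a minimal NumPy sketch of the three smooth kernels (the time constants are illustrative, not taken from any particular model):

import numpy as np

dt = 0.1e-3                     # time step: 0.1 ms
t = np.arange(0.0, 50e-3, dt)   # 50 ms time axis; the presynaptic spike is at t = 0
tau_r, tau_d = 1e-3, 5e-3       # illustrative rise and decay time constants

double_exp = np.exp(-t / tau_d) - np.exp(-t / tau_r)   # double exponential (realistic)
alpha      = (t / tau_d) * np.exp(1.0 - t / tau_d)     # alpha function, peaks at t = tau_d
single_exp = np.exp(-t / tau_d)                        # simple exponential decay

double_exp /= double_exp.max()   # scale to the same peak height for comparison
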
So how does a spike lead to an input current in another neuron in spiking NN models?

Assuming you model currents directly, you need to choose the time course of the current you want to use in your model. Then, every time a neuron spikes, you inject a current of the chosen shape into the connected neurons.

As an example, using exponential currents: the postsynaptic neuron has a variable I_syn which holds its synaptic input. Each time a presynaptic neuron spikes, I_syn is incremented by the weight of that connection; in every time step it also decays exponentially with the time constant of the synapse (the decay constant of the exponential).

Pseudocode:

// processing at time step t
I_syn *= exp(-delta_t / tau_synapse)   // exponential decay; delta_t is your simulation time step

foreach presynaptic neuron j that spiked in this time step:
   I_syn += weight_of_connection(j)    // each spike injects the weight of that connection

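To make this concrete, here is a rough, runnable Python sketch (parameter values are illustrative) that connects the boolean spike output of one leaky integrate-and-fire neuron, as in the question, to the float current input of the next via such an exponential synapse:

import math

dt, gL, vT = 1.0 / 1000, 0.999, 30.0   # same constants as in the question
tau_syn = 5.0 / 1000                   # synaptic time constant (illustrative)
weight = 50.0                          # connection weight (illustrative)

v_pre, v_post = 0.0, 0.0               # membrane voltages of the two neurons
I_syn = 0.0                            # synaptic current into the postsynaptic neuron

def lif_step(v, I):
    """One leaky integrate-and-fire step; returns (new voltage, spiked?)."""
    v += (I - v * gL) * dt
    if v > vT:
        return 0.0, True
    return v, False

for _ in range(10000):
    v_pre, spiked = lif_step(v_pre, 35.0)   # drive the first neuron with a constant current
    I_syn *= math.exp(-dt / tau_syn)        # exponential decay of the synaptic current
    if spiked:
        I_syn += weight                     # presynaptic spike -> increment the current
    v_post, _ = lif_step(v_post, I_syn)     # the spike train now arrives as a float current
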
The topic isn't answered with a plot or two, or with a single equation; I just wanted to point out the main concepts. You can find more details in the computational neuroscience textbook of your choice, e.g. Gerstner's Neuronal Dynamics (which is available online via its website).

Question:

I am attempting to train a neural network to control a simple entity in a simulated 2D environment, currently by using a genetic algorithm.

Perhaps due to lack of familiarity with the correct terms, my searches have not yielded much information on how to treat fitness and training in cases where all the following conditions hold:

  • There is no data available on correct outputs for given inputs.
  • A performance evaluation can only be made after an extended period of interaction with the environment (with continuous controller input/output invocation).
  • There is randomness inherent in the system.

Currently my approach is as follows:

  • The NN inputs are instantaneous sensor readings of the entity and environment state.
  • The outputs are instantaneous activation levels of its effectors, for example, a level of thrust for an actuator.
  • I generate a performance value by running the simulation for a given NN controller, either for a preset period of simulation time, or until some system state is reached. The performance value is then assigned as appropriate based on observations of behaviour/final state.
  • To prevent over-fitting, I repeat the above a number of times with different random generator seeds for the system, and assign a fitness using some metric such as average/lowest performance value.
  • This is done for every individual at every generation. Within a given generation, for fairness, each individual uses the same set of random seeds (a sketch of this evaluation loop follows below).

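As a sketch, that evaluation loop might look roughly like this (simulate is a hypothetical black-box function that runs one rollout for a given controller and seed and returns a performance value):

import random

def evaluate_fitness(controller, seeds, simulate):
    # run one simulation per seed and aggregate the performance values
    scores = [simulate(controller, seed) for seed in seeds]
    return sum(scores) / len(scores)   # or min(scores) for a worst-case fitness

def score_generation(population, simulate, n_seeds=5):
    # within one generation, every individual is scored on the same seeds
    seeds = [random.randrange(2**32) for _ in range(n_seeds)]
    return [evaluate_fitness(ind, seeds, simulate) for ind in population]
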
I have a couple of questions.

  1. Is this a reasonable, standard approach to take for such a problem? Unsurprisingly it all adds up to a very computationally expensive process. I'm wondering if there are any methods to avoid having to rerun a simulation from scratch every time I produce a fitness value.

  2. As stated, the same set of random seeds is used for the simulations of each individual in a generation. From one generation to the next, should this set remain static, or should it change? My instinct was to use different seeds each generation to further avoid over-fitting, assuming that doing so would not adversely affect the selective force. However, from my results, I'm unsure about this.


Answer:

It is a reasonable approach, but genetic algorithms are not known for being very fast or efficient. Try hill climbing and see if that is any faster. There are numerous other optimization methods, but nothing is great if you assume the function is a black box that you can only sample from. Reinforcement learning might also work.

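For example, a minimal hill-climbing loop over the controller's weight vector might look like the following sketch (evaluate is a hypothetical black-box fitness function of the kind described in the question):

import numpy as np

def hill_climb(evaluate, n_weights, iterations=1000, step=0.1):
    # keep a single candidate and accept random perturbations that don't hurt fitness
    best = np.random.randn(n_weights)
    best_score = evaluate(best)
    for _ in range(iterations):
        candidate = best + step * np.random.randn(n_weights)
        score = evaluate(candidate)
        if score >= best_score:   # accept equal-or-better candidates
            best, best_score = candidate, score
    return best, best_score
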
Using different random seeds should prevent overfitting, but it may not be necessary, depending on how representative a static test is of the average case and how easy it is to overfit.

Question:

I train the neural network described below on a training dataset "two4". The dataset has 150370 rows.

from keras.models import Sequential
from keras.layers import Dense
from sklearn.cross_validation import train_test_split
import numpy
from sklearn.preprocessing import StandardScaler
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)

dataset = numpy.loadtxt("two4.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:22]
scaler = StandardScaler()
X = scaler.fit_transform(X)
Y = dataset[:,22]
# split into 67% for train and 33% for test
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33,random_state=seed)
# create model
model = Sequential()
model.add(Dense(12, input_dim=22, init='uniform', activation='relu'))
model.add(Dense(12, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test,y_test), nb_epoch=30, batch_size=10)

After I execute the simulation, it breaks down every time, and the error I get looks like:

 30810/100747 [========>.....................]Traceback (most recent call last):.9989    

  File "<ipython-input-1-adb3fdf3bae0>", line 1, in <module>
    runfile('C:/Users/Dimitris/Desktop/seventh experiment configuration/feedforward_net.py', wdir='C:/Users/Dimitris/Desktop/seventh experiment configuration')

  File "C:\Users\Dimitris\Anaconda2\envs\keras_env\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 714, in runfile
    execfile(filename, namespace)

  File "C:\Users\Dimitris\Anaconda2\envs\keras_env\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 74, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)

  File "C:/Users/Dimitris/Desktop/seventh experiment configuration/feedforward_net.py", line 26, in <module>
    model.fit(X_train, y_train, validation_data=(X_test,y_test), nb_epoch=30, batch_size=10)

  File "C:\Users\Dimitris\Anaconda2\envs\keras_env\lib\site-packages\keras\models.py", line 432, in fit
    sample_weight=sample_weight)

  File "C:\Users\Dimitris\Anaconda2\envs\keras_env\lib\site-packages\keras\engine\training.py", line 1106, in fit
    callback_metrics=callback_metrics)

  File "C:\Users\Dimitris\Anaconda2\envs\keras_env\lib\site-packages\keras\engine\training.py", line 830, in _fit_loop
    callbacks.on_batch_end(batch_index, batch_logs)

  File "C:\Users\Dimitris\Anaconda2\envs\keras_env\lib\site-packages\keras\callbacks.py", line 60, in on_batch_end
    callback.on_batch_end(batch, logs)

  File "C:\Users\Dimitris\Anaconda2\envs\keras_env\lib\site-packages\keras\callbacks.py", line 188, in on_batch_end
    self.progbar.update(self.seen, self.log_values)

  File "C:\Users\Dimitris\Anaconda2\envs\keras_env\lib\site-packages\keras\utils\generic_utils.py", line 119, in update
    sys.stdout.write(info)

  File "C:\Users\Dimitris\Anaconda2\envs\keras_env\lib\site-packages\ipykernel\iostream.py", line 317, in write
    self._buffer.write(string)

ValueError: I/O operation on closed file

Do you have any idea what might cause the error?


Answer:

Your problem comes from sending too much data to the standard output stream in Spyder, which closes it. Try setting:

history = model.fit(X_train, y_train, validation_data=(X_test,y_test), nb_epoch=30, batch_size=10, verbose=0)

You can then recover the per-epoch metric values from, e.g.:

epoch_loss = history.history["loss"]

The history.history dict stores all the training statistics recorded for each epoch.

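For example, to print the last recorded value of every metric after training (using the history object returned by the fit call above; the exact metric names depend on your Keras version):

for metric, values in history.history.items():
    print(metric, values[-1])   # e.g. 'loss', 'acc', 'val_loss', 'val_acc'
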
Question:

I am writing a C# Windows Forms application which simulates a simple environment (a grid) with two types of objects: plants and herbivores. The herbivores have neural networks which take the contents of the few surrounding cells as input and decide which direction to move in. The idea is to train the herbivores to eat the plants using a fitness function and a genetic algorithm.

My problem is that if there is nothing surrounding a herbivore, it will decide to move in a particular direction; then, if there is still nothing around it, it will move in the same direction again. What I end up with is a few herbivores that just move in straight lines and never actually encounter any plants at all.

Would adding a clock signal as an input (with each bit as an individual input to the neural network) change this behavior, or is this not recommended? I have also thought about adding an input which is just random data (from a Gaussian distribution) to add some unpredictability, but I don't know whether this would help or make the problem worse. Another idea I am not sure about is having inputs for the past few moves (as a sort of memory), which might solve this issue.


Answer:

I think you need a recurrent network. You can keep track of the last N decisions the network has made and then use them as extra inputs, so the network has some knowledge about where it was going and for how long. It could at some point evolve in such a way that it starts doing some sort of path finding.

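A rough sketch of that idea (in Python rather than C#, with network standing in for your existing feed-forward evaluation): keep a fixed-length memory of the past decisions and append it to the sensor inputs.

from collections import deque

N_MEMORY = 4                                      # how many past decisions to remember
memory = deque([0.0] * N_MEMORY, maxlen=N_MEMORY)

def decide(sensor_inputs, network):
    # feed the current sensor readings plus the last N decisions into the network
    inputs = list(sensor_inputs) + list(memory)
    decision = network(inputs)                    # e.g. an index encoding the chosen direction
    memory.append(float(decision))                # remember it for the next step
    return decision
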
Question:

I'm a beginner with neural networks. I have an NN that is trained to fit the input data to the target data, and I then simulate the NN on new sample data to get a prediction output.

The problem is that the outputs are normalized values between zero and one, and I need to transform (denormalize) them back to real values.

Could you explain how to do this?

I've read that I have to use an activation function, but I didn't understand how to do this.


Answer:

When the training set was created and the output values were normalized, you probably used min-max normalization (or mean-std normalization):

z = (x - min) / (max - min)

Where z is the normalized output. To get the unnormalized value, you just have to store the min and max values used for normalization, then invert the equation:

x = (max - min) * z + min

For other kinds of normalization, the procedure is the same. Just remember that the normalization factors have to be obtained from the original training set.

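A minimal NumPy sketch of that bookkeeping (the target values here are purely illustrative):

import numpy as np

y = np.array([12.5, 3.0, 47.1, 8.8])         # original (unnormalized) training targets

y_min, y_max = y.min(), y.max()              # store these together with the trained network
z = (y - y_min) / (y_max - y_min)            # normalized targets used for training

# later, after simulating the network on new samples:
z_pred = np.array([0.1, 0.9])                # example network outputs in [0, 1]
y_pred = (y_max - y_min) * z_pred + y_min    # denormalized predictions
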
Question:

I am trying to create an AI based on the idea of neurons and synaptic connections. It is similar to a neural network, but different in the sense that it does not use a trial-and-error system such as backpropagation. I want the AI to form connections between neurons, and then, each time a connection is traveled down, it grows in size, making it easier to fire the next time.

As I understand from researching this, this is how the human mind works. A neuron fires a connection to another neuron and that connection grows. This is how habits and addictions form. The paths eventually grow so large that firing that connection becomes almost automatic.

The problem is that the only way I can think of representing this is with a number. I have a double that represents the size of the synaptic connection. Each time the connection is fired, the number increases. I would then just sort the list of connections by the size of the connection.

However, this only solves the problem of priority. It allows the AI to choose which neuron to go to, but I'm struggling with how to make the connection faster or slower based on that number.

I want the AI to be able to have trouble recalling information. Yes, I know this sounds like a strange thing to want, but the goal of this project is behavior as close to a human's as possible... and we definitely have trouble recalling information at times.

This means that, at times, the AI should have to think about something for a minute before figuring it out. It shouldn't be able to fire a connection in the same amount of time as every single other connection.

My current implementation of a Synaptic Connection is in this class: https://github.com/ianbro/Adamation/blob/master/src/com/ianmann/mind/NeuralPathway.java.

Please let me know if you have any ideas on how to implement differences in the speed of synaptic connection firing.

---- Update ----

Let's say I ask it what its favorite ice cream flavor is. I don't really have one myself, so I might have to think about it for a while. But if I just had a simple input => output system, then the AI would just grab the strongest connection between an ice cream flavor and the idea of liking something. But if that number is low, that might mean it doesn't necessarily have a favorite, like in my case. So it would have to think about it, because the connection to be fired that contains the answer is not found easily.


Answer:

I'm not sure whether I grasp what you are trying to do here, but no matter! The obvious (to me) way to speed up or slow down a connection is to place intermediate steps along it. Each step is associated with a delay (in a simple model, simply the extra time it takes to compute bridging the connection). To speed the connection up, remove intermediate steps, and so on...
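
One way to read that suggestion, as a loose Python sketch (the class and numbers are hypothetical, not taken from the linked NeuralPathway code): a connection carries a list of intermediate steps, each adding a delay when the connection fires, and strengthening the connection removes steps so that firing gets faster.

import time

class SynapticConnection:
    def __init__(self, target, n_steps=5, step_delay=0.2):
        self.target = target
        self.steps = [step_delay] * n_steps   # each intermediate step adds a delay (seconds)

    def fire(self):
        for delay in self.steps:
            time.sleep(delay)                 # "thinking time" while bridging the connection
        return self.target

    def strengthen(self):
        if len(self.steps) > 1:
            self.steps.pop()                  # fewer intermediate steps -> faster recall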