## Hot questions on using neural networks in Ruby

Question:

I want to train a neural network to approximate the sine function.

Currently I use this code (with the `cerebrum` gem):

```ruby
require 'cerebrum'

input = Array.new
300.times do |i|
  inputH = Hash.new
  inputH[:input] = [i]
  sinus = Math::sin(i)
  inputH[:output] = [sinus]
  input.push(inputH)
end

network = Cerebrum.new
network.train(input, {
  error_threshold: 0.00005,
  iterations: 40000,
  log: true,
  log_period: 1000,
  learning_rate: 0.3
})

res = Array.new
300.times do |i|
  result = network.run([i])
  res.push(result[0])
end
puts "#{res}"
```

But it does not work: if I run the trained network I get weird output values instead of a piece of the sine curve.

So, what am I doing wrong?

Answer:

Cerebrum is a very basic and slow NN implementation. There are better options in Ruby, such as the `ruby-fann` gem.

Most likely your problem is that the network is too simple. You have not specified any hidden layers; it looks like the code assigns a default hidden layer with 3 neurons in your case.

Try something like:

```ruby
network = Cerebrum.new({
  learning_rate: 0.01,
  momentum: 0.9,
  hidden_layers: [100]
})
```

and expect it to take forever to train, plus still not be very good.

Also, your input range is too broad: sampling at 300 consecutive integers steps a full radian between neighbouring points, so to the network it will look mostly like noise, and it won't interpolate well between points. A neural network does not somehow figure out "oh, that must be a sine wave" and match to it. Instead it interpolates between the points; the clever bit happens when it does so in multiple dimensions at once, perhaps finding structure that you could not spot so easily by manual inspection. To give it a reasonable chance of learning something, give it much denser points, e.g. where you currently have `sinus = Math::sin(i)`

instead use:

```ruby
sinus = Math::sin(i.to_f / 10)
```

That's still almost 5 full passes through the sine wave, which should hopefully be enough to show that the network can learn an arbitrary function.
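To see how much denser that sampling is, here is a quick stdlib-only sketch; the numbers come straight from `Math.sin`, no network involved:

```ruby
# Jump in sin(x) between neighbouring training samples, coarse vs. dense.
coarse_jump = (Math.sin(1) - Math.sin(0)).abs   # input step of 1 radian
dense_jump  = (Math.sin(0.1) - Math.sin(0)).abs # input step of 0.1 radians

# Full sine periods covered by the 300 samples in each scheme.
coarse_periods = 300 / (2 * Math::PI)  # ~48 periods: looks like noise
dense_periods  = 30 / (2 * Math::PI)   # ~5 periods: smooth between samples

puts format('coarse jump %.3f over %.1f periods', coarse_jump, coarse_periods)
puts format('dense  jump %.3f over %.1f periods', dense_jump, dense_periods)
```

With the coarse sampling, consecutive points jump by up to the full amplitude of the curve; with the dense sampling they move only a tenth as far, so interpolation between them is meaningful.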

Question:

After my previous attempt, I managed to train a neural network to approximate the sine function, using the `ai4r` Ruby gem:

```ruby
require 'ai4r'

srand 1

net = Ai4r::NeuralNetwork::Backpropagation.new([1, 60, 1])
net.learning_rate = 0.01
# net.propagation_function = lambda { |x| 1.0 / (1.0 + Math::exp(-x)) }

def normalise(x, xmin, xmax, ymin, ymax)
  xrange = xmax - xmin
  yrange = ymax - ymin
  return ymin + (x - xmin) * (yrange.to_f / xrange)
end

training_data = Array.new
test = Array.new
i2 = 0.0
320.times do |i|
  i2 += 0.1
  hash = Hash.new
  output = Math.sin(i2.to_f)
  input = i2.to_f
  hash.store(:input, [normalise(input, 0.0, 32.0, 0.0, 1.0)])
  hash.store(:expected_result, [normalise(output, -1.0, 1.0, 0.0, 1.0)])
  training_data.push(hash)
  test.push([normalise(output, -1.0, 1.0, 0.0, 1.0)])
end
puts "#{test}"
puts "#{training_data}"

time = Time.now
999999.times do |i|
  error = 0.0
  training_data.each do |d|
    error += net.train(d[:input], d[:expected_result])
  end
  break if error < 0.26
  print "Times: #{i}, error: #{error} \r"
end
time2 = Time.now
puts "#{time2} - #{time} = #{time2 - time} seconds elapsed."

serialized = Marshal.dump(net)
File.open("net.saved", "w+") { |file| file.write(serialized) }
```

Everything worked out fine. The network was trained in 4703.664857 seconds.

The network trains much faster when I normalise the input/output to numbers between 0 and 1. `ai4r` uses a sigmoid function, so it's clear that it cannot output negative values. But why do I have to normalise the input values? Does this kind of neural network only accept input values < 1?

In the sine example, is it possible to input any number as in:

```
Input: -10.0     -> Output:  0.5440211108893699
Input: 87654.322 -> Output: -0.6782453567239783
Input: -9878.923 -> Output: -0.9829544956991526
```

or do I have to define the range?

Answer:

In your structure you have 60 hidden nodes after a single input. This means that each hidden node has only 1 learned weight, for a total of 60 values learned. The connection from the hidden layer to the single output node likewise has 60 weights, or learned values. This gives a total of 120 learnable parameters.
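As a quick sanity check of that count (biases ignored, as above):

```ruby
# Fully connected 1-60-1 network: one weight per connection.
input_to_hidden  = 1 * 60   # each of the 60 hidden nodes sees the single input
hidden_to_output = 60 * 1   # the output node sees all 60 hidden nodes
puts input_to_hidden + hidden_to_output  # => 120
```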

Imagine what each node in the hidden layer is capable of learning: there is a single scaling factor, then a non-linearity. Let's assume that your weights end up looking like:

`[1e-10, 1e-9, 1e-8, ..., .1]`

with each entry being the weight of a node in the hidden layer. Now if you pass the number 1 into your network, your hidden layer will output something to this effect:

`[0, 0, 0, 0, ..., .1, .25, .5, .75, 1]`

(roughly speaking, not actually calculated)

Likewise, if you give it something large, like 1e10, then the first layer would give:

`[0, .25, .5, .75, 1, 1, 1, ..., 1]`

The weights of your hidden layer will learn to spread out in this fashion so that they can handle a large range of inputs by scaling them down to a smaller one. The more hidden nodes you have (in that first layer), the less far each node has to spread. In my example they are spaced out by factors of ten; if you had thousands, they might be spaced by factors of two.

By normalizing the input range to [0, 1], you restrict how far those hidden nodes need to spread before they can start giving meaningful information to the final layer. This allows for faster training (assuming your stopping condition is based on the change in loss).
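You can see the saturation effect directly from the standard sigmoid itself; a minimal sketch, independent of any gem:

```ruby
def sigmoid(x)
  1.0 / (1.0 + Math.exp(-x))
end

# Raw inputs from the 0..32 range quickly pin the sigmoid near 1,
# where its gradient vanishes and backpropagation barely moves the weights.
[0.5, 5.0, 32.0].each do |x|
  puts format('sigmoid(%5.1f) = %.10f', x, sigmoid(x))
end
```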

So to directly answer your questions: No, you do not *need* to normalize, but it certainly helps speed up training by reducing the variability and size of the input space.
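If you do normalise, remember to invert the mapping when reading the network's output back out. Using your linear `normalise`, the inverse is straightforward (`denormalise` is my own helper name here, not part of `ai4r`):

```ruby
# Linear map [xmin, xmax] -> [ymin, ymax], as in the question's code.
def normalise(x, xmin, xmax, ymin, ymax)
  ymin + (x - xmin) * ((ymax - ymin).to_f / (xmax - xmin))
end

# Inverse map [ymin, ymax] -> [xmin, xmax], for reading results back out.
def denormalise(y, xmin, xmax, ymin, ymax)
  xmin + (y - ymin) * ((xmax - xmin).to_f / (ymax - ymin))
end

x = Math.sin(1.5)
n = normalise(x, -1.0, 1.0, 0.0, 1.0)       # into [0, 1] for training
back = denormalise(n, -1.0, 1.0, 0.0, 1.0)  # back into [-1, 1] afterwards
puts format('%.6f -> %.6f -> %.6f', x, n, back)
```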

Question:

This is a slightly modified sample program I took from the FANN website.

The equation I created is `c = pow(a, 2) + b`.

**Train.c**

```c
#include "fann.h"

int main()
{
    const unsigned int num_input = 2;
    const unsigned int num_output = 1;
    const unsigned int num_layers = 4;
    const unsigned int num_neurons_hidden = 3;
    const float desired_error = (const float) 0.001;
    const unsigned int max_epochs = 500000;
    const unsigned int epochs_between_reports = 1000;

    struct fann *ann = fann_create_standard(num_layers, num_input,
                                            num_neurons_hidden, num_output);

    fann_set_activation_function_hidden(ann, FANN_SIGMOID_SYMMETRIC);
    fann_set_activation_function_output(ann, FANN_SIGMOID_SYMMETRIC);

    fann_train_on_file(ann, "sample.data", max_epochs,
                       epochs_between_reports, desired_error);

    fann_save(ann, "sample.net");
    fann_destroy(ann);

    return 0;
}
```

**Result.c**

```c
#include <stdio.h>
#include "floatfann.h"

int main()
{
    fann_type *calc_out;
    fann_type input[2];

    struct fann *ann = fann_create_from_file("sample.net");

    input[0] = 1;
    input[1] = 1;
    calc_out = fann_run(ann, input);

    printf("sample test (%f,%f) -> %f\n", input[0], input[1], calc_out[0]);

    fann_destroy(ann);
    return 0;
}
```

I created my own dataset

**dataset.rb**

```ruby
f = File.open("sample.data", "w")
f.write("100 2 1\n")
i = 0
while i < 100 do
  first = rand(0..100)
  second = rand(0..100)
  third = first ** 2 + second
  string1 = "#{first} #{second}\n"
  string2 = "#{third}\n"
  f.write(string1)
  f.write(string2)
  i = i + 1
end
f.close
```

**sample.data**

```
100 2 1
95 27
9052
63 9
3978
38 53
1497
31 84
1045
28 56
840
95 80
9105
10 19
...
...
```

The first line of `sample.data` gives the number of samples, the number of inputs, and the number of outputs.

But I am getting an error:

```
FANN Error 20: The number of output neurons in the ann (4196752) and data (1) don't match
```

What's the issue here? Where does the figure of `4196752` neurons come from?

Answer:

Here the signature is `fann_create_standard(num_layers, layer1_size, layer2_size, layer3_size, ...)`: after `num_layers`, it expects exactly one size per layer. You are calling it differently:

```c
struct fann *ann = fann_create_standard(num_layers, num_input,
                                        num_neurons_hidden, num_output);
```

With `num_layers` set to 4, you construct a network with four layers but supply sizes for only three, so the size of the fourth (output) layer is read from an undefined value; that is where the nonsense figure of 4196752 output neurons comes from. Either pass 3 as `num_layers`, or supply a fourth size for a second hidden layer.