Hot questions for Using Neural networks in hdf5

Question:

I want to use caffe with a vector label, not an integer. I have checked some answers, and it seems HDF5 is a better way. But then I get stuck with an error like:

accuracy_layer.cpp:34] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (50 vs. 200) Number of labels must match number of predictions; e.g., if label axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.

with HDF5 created as:

f = h5py.File('train.h5', 'w')
f.create_dataset('data', (1200, 128), dtype='f8')
f.create_dataset('label', (1200, 4), dtype='f4')

My network is generated by:

def net(hdf5, batch_size):
    n = caffe.NetSpec()
    n.data, n.label = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=2)
    n.ip1 = L.InnerProduct(n.data, num_output=50, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    n.ip2 = L.InnerProduct(n.relu1, num_output=50, weight_filler=dict(type='xavier'))
    n.relu2 = L.ReLU(n.ip2, in_place=True)
    n.ip3 = L.InnerProduct(n.relu1, num_output=4, weight_filler=dict(type='xavier'))
    n.accuracy = L.Accuracy(n.ip3, n.label)
    n.loss = L.SoftmaxWithLoss(n.ip3, n.label)
    return n.to_proto()

with open(PROJECT_HOME + 'auto_train.prototxt', 'w') as f:
    f.write(str(net('/home/romulus/code/project/train.h5list', 50)))

with open(PROJECT_HOME + 'auto_test.prototxt', 'w') as f:
    f.write(str(net('/home/romulus/code/project/test.h5list', 20)))

It seems I should increase the label number and put things in integers rather than arrays, but if I do this, caffe complains that the number of data and labels is not equal, then exits.

So, what is the correct format to feed multi label data?

Also, I really wonder why no one has simply written down how the HDF5 data format maps to caffe blobs.


Answer:

Answer to this question's title:

The HDF5 file should have two datasets in the root, named "data" and "label", respectively. The shape is (number of samples, dimension). I'm using only one-dimensional data, so I'm not sure about the order of channel, width, and height. Maybe it does not matter. dtype should be float or double.

Sample code creating the training set with h5py:

import h5py, os
import numpy as np

f = h5py.File('train.h5', 'w')
# 1200 data, each is a 128-dim vector
f.create_dataset('data', (1200, 128), dtype='f8')
# Data's labels, each is a 4-dim vector
f.create_dataset('label', (1200, 4), dtype='f4')

# Fill in something with a fixed pattern
# Normalize values to between 0 and 1, or SigmoidCrossEntropyLoss will not work
for i in range(1200):
    a = np.empty(128)
    if i % 4 == 0:
        for j in range(128):
            a[j] = j / 128.0
        l = [1, 0, 0, 0]
    elif i % 4 == 1:
        for j in range(128):
            a[j] = (128 - j) / 128.0
        l = [1, 0, 1, 0]
    elif i % 4 == 2:
        for j in range(128):
            a[j] = (j % 6) / 128.0
        l = [0, 1, 1, 0]
    elif i % 4 == 3:
        for j in range(128):
            a[j] = (j % 4) * 4 / 128.0
        l = [1, 0, 1, 1]
    f['data'][i] = a
    f['label'][i] = l

f.close()

Also, the accuracy layer is not needed; simply removing it is fine. The next problem is the loss layer. Since SoftmaxWithLoss expects a single integer class index per sample, it can't be used for a multi-label problem. Thanks to Adian and Shai, I found that SigmoidCrossEntropyLoss works well in this case.

Below is the full code, from data creation through training the network to getting the test results:

main.py (modified from the caffe LeNet example)

import os, sys

PROJECT_HOME = '.../project/'
CAFFE_HOME = '.../caffe/'
os.chdir(PROJECT_HOME)

sys.path.insert(0, CAFFE_HOME + 'caffe/python')
import caffe, h5py

from pylab import *
from caffe import layers as L

def net(hdf5, batch_size):
    n = caffe.NetSpec()
    n.data, n.label = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=2)
    n.ip1 = L.InnerProduct(n.data, num_output=50, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    n.ip2 = L.InnerProduct(n.relu1, num_output=50, weight_filler=dict(type='xavier'))
    n.relu2 = L.ReLU(n.ip2, in_place=True)
    n.ip3 = L.InnerProduct(n.relu2, num_output=4, weight_filler=dict(type='xavier'))
    n.loss = L.SigmoidCrossEntropyLoss(n.ip3, n.label)
    return n.to_proto()

with open(PROJECT_HOME + 'auto_train.prototxt', 'w') as f:
    f.write(str(net(PROJECT_HOME + 'train.h5list', 50)))
with open(PROJECT_HOME + 'auto_test.prototxt', 'w') as f:
    f.write(str(net(PROJECT_HOME + 'test.h5list', 20)))

caffe.set_device(0)
caffe.set_mode_gpu()
solver = caffe.SGDSolver(PROJECT_HOME + 'auto_solver.prototxt')

solver.net.forward()
solver.test_nets[0].forward()
solver.step(1)

niter = 200
test_interval = 10
train_loss = zeros(niter)
test_acc = zeros(int(np.ceil(niter * 1.0 / test_interval)))
print len(test_acc)
output = zeros((niter, 8, 4))

# The main solver loop
for it in range(niter):
    solver.step(1)  # SGD by Caffe
    train_loss[it] = solver.net.blobs['loss'].data
    solver.test_nets[0].forward(start='data')
    output[it] = solver.test_nets[0].blobs['ip3'].data[:8]

    if it % test_interval == 0:
        print 'Iteration', it, 'testing...'
        correct = 0
        data = solver.test_nets[0].blobs['ip3'].data
        label = solver.test_nets[0].blobs['label'].data
        for test_it in range(100):
            solver.test_nets[0].forward()
            # Positive values map to label 1, while negative values map to label 0
            for i in range(len(data)):
                for j in range(len(data[i])):
                    if data[i][j] > 0 and label[i][j] == 1:
                        correct += 1
                    elif data[i][j] <= 0 and label[i][j] == 0:
                        correct += 1
        test_acc[int(it / test_interval)] = correct * 1.0 / (len(data) * len(data[0]) * 100)

# Training and testing done, outputting convergence graph
_, ax1 = subplots()
ax2 = ax1.twinx()
ax1.plot(arange(niter), train_loss)
ax2.plot(test_interval * arange(len(test_acc)), test_acc, 'r')
ax1.set_xlabel('iteration')
ax1.set_ylabel('train loss')
ax2.set_ylabel('test accuracy')
_.savefig('converge.png')

# Check the result of last batch
print solver.test_nets[0].blobs['ip3'].data
print solver.test_nets[0].blobs['label'].data

The h5list files simply contain the paths of the h5 files, one per line:

train.h5list

/home/foo/bar/project/train.h5

test.h5list

/home/foo/bar/project/test.h5

and the solver:

auto_solver.prototxt

train_net: "auto_train.prototxt"
test_net: "auto_test.prototxt"
test_iter: 10
test_interval: 20
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 10000
snapshot: 5000
snapshot_prefix: "sed"
solver_mode: GPU

Convergence graph:

Last batch result:

[[ 35.91593933 -37.46276474 -6.2579031 -6.30313492]
[ 42.69248581 -43.00864792 13.19664764 -3.35134125]
[ -1.36403108 1.38531208 2.77786589 -0.34310576]
[ 2.91686511 -2.88944006 4.34043217 0.32656598]
...
[ 35.91593933 -37.46276474 -6.2579031 -6.30313492]
[ 42.69248581 -43.00864792 13.19664764 -3.35134125]
[ -1.36403108 1.38531208 2.77786589 -0.34310576]
[ 2.91686511 -2.88944006 4.34043217 0.32656598]]

[[ 1. 0. 0. 0.]
[ 1. 0. 1. 0.]
[ 0. 1. 1. 0.]
[ 1. 0. 1. 1.]
...
[ 1. 0. 0. 0.]
[ 1. 0. 1. 0.]
[ 0. 1. 1. 0.]
[ 1. 0. 1. 1.]]

I think this code still has many things to improve. Any suggestion is appreciated.

Question:

I am finetuning a network. In a specific case I want to use it for regression, which works. In another case, I want to use it for classification.

For both cases I have an HDF5 file, with a label. With regression, this is just a 1-by-1 numpy array that contains a float. I thought I could use the same label for classification, after changing my EuclideanLoss layer to SoftmaxLoss. However, then I get a negative loss, like so:

    Iteration 19200, loss = -118232
    Train net output #0: loss = 39.3188 (* 1 = 39.3188 loss)

Can you explain whether, and if so what, goes wrong? I do see that the training loss is about 40 (which is still terrible), but does the network still train? The negative loss just keeps getting more negative.

UPDATE: After reading Shai's comment and answer, I have made the following changes:

- I made the num_output of my last fully connected layer 6, as I have 6 labels (used to be 1).
- I now create a one-hot vector and pass that as a label into my HDF5 dataset as follows:

    f['label'] = numpy.array([1, 0, 0, 0, 0, 0])        

Trying to run my network now returns

   Check failed: hdf_blobs_[i]->shape(0) == num (6 vs. 1)       

After some research online, I reshaped the vector to a 1x6 vector. This led to the following error:

  Check failed: outer_num_ * inner_num_ == bottom[1]->count() (40 vs. 240) 
   Number of labels must match number of predictions; e.g., if softmax axis == 1 
   and prediction shape is (N, C, H, W), label count (number of labels) 
   must be N*H*W, with integer values in {0, 1, ..., C-1}.

My idea is to add one label per data sample (image), and in my train.prototxt I create batches. Shouldn't this create the correct batch size?


Answer:

Since you moved from regression to classification, you need to output not a scalar to compare with "label" but rather a probability vector of length num-labels to compare with the discrete class "label". You need to change the num_output parameter of the layer before "SoftmaxWithLoss" from 1 to num-labels.

I believe currently you are accessing un-initialized memory and I would expect caffe to crash sooner or later in this case.

Update: You made two changes: num_output 1-->6, and you also changed your input label from a scalar to a vector. The first change was the only one you needed for using the "SoftmaxWithLoss" layer. Do not change the label from a scalar to a one-hot vector.

Why? Because "SoftmaxWithLoss" basically looks at the 6-vector prediction you output, interprets the ground-truth label as an index, and looks at -log(p[label]): the closer p[label] is to 1 (i.e., you predicted a high probability for the expected class), the lower the loss. If the prediction p[label] is close to zero (i.e., you incorrectly predicted a low probability for the expected class), the loss grows fast.


Using a "hot-vector" as ground-truth input label, may give rise to multi-category classification (does not seems like the task you are trying to solve here). You may find this SO thread relevant to that particular case.

Question:

I have a big dataset (300,000 examples x 33,000 features), which of course does not fit in memory. The data are saved in HDF5 format. The values are mostly zeros (sparse data). They look like this:

           Attr1    52  52  52  52  52  52  52  52 ...
           Attr2    umb umb umb umb umb umb umb umb ...
           CellID   TGC-1 TGG-1 CAG-1 TTC-1 GTG-1 GTA-1 CAA-1 CAC-1 ...

Acc     Gene                                      ...
243485  RP11-.3     0   0   0   0   0   0   0   0 ...
237613  FAM138A     0   0   0   0   0   0   0   0 ...
186092  OR4F5       0   0   0   0   0   0   0   0 ...
238009  RP11-.7     0   0   0   0   0   0   0   0 ...
239945  RP11-.8     0   0   0   0   0   0   0   0 ...
279457  FO538.2     0   0   0   0   0   0   0   0 ...
228463  AP006.2     0   0   0   0   0   0   0   0 ...
...     ...         ... ... ... ... ... ... ... ...

I have done the following, which works, to load the whole dataset in TensorFlow (loompy is just a package using hdf5 in the background):

import tensorflow as tf
import numpy as np
import loompy

batch_size = 1000

with loompy.connect(filename, 'r') as ds:
    ds_shape = (batch_size, ds.shape[0])
    ds_dtype = ds[0:1, 0:1].dtype

    labels = np.asarray([ds.ca.CellID, ds.ca.Attr1]).T
    labels_shape = (batch_size, 1)

data_placeholder = tf.placeholder(ds_dtype, ds_shape)
labels_placeholder = tf.placeholder(labels[:,1].dtype, labels_shape)

dataset = tf.data.Dataset.from_tensor_slices((data_placeholder, labels_placeholder))
dataset = dataset.prefetch(batch_size)
iterator = dataset.make_initializable_iterator()
next_element = iterator.get_next()

with tf.Session() as sess:
    with loompy.connect(filename, 'r') as ds:
        for i in range(0, ds.shape[1], batch_size):
            batch = ds[0 : ds_shape[1], i : i + batch_size].T
            batch_labels = np.asarray([ds.ca.CellID[i : i + batch_size],
                                       ds.ca.Attr1[i : i + batch_size]]).T[:,1]

            sess.run(iterator.initializer, feed_dict = {data_placeholder: batch,
                       labels_placeholder: batch_labels.reshape(batch_size, 1)})

            for _ in range(batch_size):
                print(sess.run(next_element))

Output:

(array([0, 0, 0, ..., 0, 0, 0], dtype=int32), array([b'52'], dtype=object))

(array([0, 0, 0, ..., 0, 0, 0], dtype=int32), array([b'52'], dtype=object))

...

This way, however, I am not able to split my data into train, test and evaluation sets. Also, I can only shuffle them inside each batch, which is not effective since most of the time the data in a batch belong to the same class.

How do I manipulate this kind of data so that I can load it as train, test and evaluation sets and perform shuffling, etc. (preferably utilizing my TitanX GPU as much as possible)?


Answer:

You should definitely try Dask; it allows you to work with data that does not fit in memory, and it parallelizes computation so that you can use all the cores of your CPU. I also recommend moving your data from HDF5 to Parquet, which allows concurrent reads and writes and speeds things up. Please see the link where Wes McKinney (the pandas creator) goes into depth and compares it with other formats.

You could write snippets in Dask that prepare the train, test and validation sets and read them without exceeding the available memory.
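A rough sketch of that idea (file and dataset names here are assumptions, not from the question): read the HDF5 dataset lazily as a Dask array and slice out the splits, materializing only the batches you need.

import h5py
import dask.array as da

f = h5py.File('data.h5', 'r')                          # hypothetical file
x = da.from_array(f['matrix'], chunks=(1000, 33000))   # lazy, chunked view of the data
y = da.from_array(f['labels'], chunks=(1000,))         # hypothetical labels dataset

n = x.shape[0]
n_train, n_val = int(0.8 * n), int(0.1 * n)

x_train, y_train = x[:n_train], y[:n_train]
x_val, y_val = x[n_train:n_train + n_val], y[n_train:n_train + n_val]
x_test, y_test = x[n_train + n_val:], y[n_train + n_val:]

# Only materialize what is needed, e.g. one training batch at a time
first_batch = x_train[:1000].compute()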

Question:

I have a dataset where the images have a varying number of labels. The number of labels is between 1 and 5. There are 100 classes.

After googling, it seems like an HDF5 db with a slice layer can deal with multiple labels, as in the following URL.

The only problem is that it assumes a fixed number of labels. Following this, I would have to create a 1x100 vector, where the entry value is 1 for the labeled classes and 0 for the non-labeled classes, as in the following definition:

layers {
  name: "slice0"
  type: SLICE
  bottom: "label"
  top: "label_matrix"
  slice_param {
      slice_dim: 1
      slice_point: 100
  }
}

where each image has a label looking like (1,0,0,...,1,...,0,...,0,1), where the vector size is 100 dimensions.

Now, I apologize that my question is somewhat vague, but is this a feasible idea? Is there a better approach to this problem?


Answer:

I gather that you have 5 types of labels that are not always present for each data point, and that 1 of the 5 labels is for 100-way classification. Correct so far?

I would suggest always writing all 5 labels into your HDF5 file and using a special value for when a label is missing. You can then use the ignore_label option to skip computing the loss for that label in that iteration. Using it requires adding loss_param { ignore_label: Y } to the loss layer in your network prototxt definition, where Y is a scalar.

The backpropagated error will then only be a function of the labels that are present. If input X does not have a valid value for a label, the network will still produce an estimate for that label, but it will not be penalized for it: that output has no effect on how the weights are updated in that iteration. Only the outputs for non-missing labels contribute to the error signal and the weight gradients.

It seems that only the Accuracy and SoftmaxWithLoss layers support ignore_label.

Each label is then a 1x5 vector. The first entry is for the 100-way classification (e.g., a value in [0-99]) and entries 2:5 hold scalars that reflect the values the other labels can take. The order of the columns is the same for all entries in your dataset. A missing label is marked by a special value of your choosing; this special value has to lie outside the set of valid label values. What it is will depend on what those labels represent; if a label value of -1 never occurs, you can use it to flag a missing label.
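A minimal sketch (shapes and file names are my assumptions) of writing such a 1x5 label row per sample, with -1 marking a missing label:

import h5py
import numpy as np

n_samples = 10
MISSING = -1.0  # assumed flag value; must never occur as a real label

labels = np.full((n_samples, 5), MISSING, dtype='f4')
labels[0, 0] = 42                 # sample 0 only has the 100-way class label
labels[1, :] = [7, 1, 0, 3, 2]    # sample 1 has all five labels

with h5py.File('train_labels.h5', 'w') as f:
    f.create_dataset('label', data=labels)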

Question:

I have the following h5 files listed in train.txt which I am giving to the hdf5 data layer.

/home/foo/data/h5_files/train_data1.h5
/home/foo/data/h5_files/train_data2.h5
/home/foo/data/h5_files/train_data3.h5
/home/foo/data/h5_files/train_data4.h5
/home/foo/data/h5_files/train_data5.h5

I have 3 datasets - X, Meta and Labels - in these files. Initially, I kept all of these in 1 h5 file, but since caffe can't handle h5 files bigger than 2 GB, I had to divide X (say X consists of 5000 samples) into 5 parts. In the first h5 file, I have Meta and Labels stored along with the first part, i.e., 1000 samples of X, and in the remaining 4 h5 files, I have 1000 samples each. When I start finetuning, caffe crashes with the following error message:

0111 07:46:54.094041 23981 layer_factory.hpp:74] Creating layer data
 net.cpp:76] Creating Layer data
 net.cpp:334] data -> X
 net.cpp:334] data -> Labels
 net.cpp:334] data -> Meta
 net.cpp:105] Setting up data
 hdf5_data_layer.cpp:66] Loading list of HDF5  filenames from: /home/foo/hdf5_train.txt
 hdf5_data_layer.cpp:80] Number of HDF5 files: 5
 hdf5_data_layer.cpp:53] Check failed: hdf_blobs_[i]->num() == num (5000 vs. 1000) 

*** Check failure stack trace: ***
    @     0x7f1eebcab0d0  google::LogMessage::Fail()
    @     0x7f1eebcab029  google::LogMessage::SendToLog()
    @     0x7f1eebcaaa07  google::LogMessage::Flush()
    @     0x7f1eebcad98f  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f1ef18ff045  caffe::HDF5DataLayer<>::LoadHDF5FileData()
    @     0x7f1ef18fdca4  caffe::HDF5DataLayer<>::LayerSetUp()
    @     0x7f1ef196bffc  caffe::Net<>::Init()
    @     0x7f1ef196e0b2  caffe::Net<>::Net()
    @     0x7f1ef18cf3cd  caffe::Solver<>::InitTrainNet()
    @     0x7f1ef18cfa3f  caffe::Solver<>::Init()
    @     0x7f1ef18cfe75  caffe::Solver<>::Solver()
    @           0x40a3c8  caffe::GetSolver<>()
    @           0x404fb1  train()
    @           0x405936  main
    @       0x3a8141ed1d  (unknown)
    @           0x4048a9  (unknown)

The main thing, as I see it, is 'Check failed: hdf_blobs_[i]->num() == num (5000 vs. 1000)', from which I assume that caffe is reading only the first h5 file. How can I make it read all 5 h5 files? Please help!


Answer:

How do you expect caffe to synchronize all your input data across all the files? Do you expect it to read X from the second file and Meta from the first? If you were to implement the "HDF5Data" layer yourself, how would you expect the data to be laid out for you?

The way things are implemented in caffe at the moment, ALL variables must be divided between the HDF5 files in the same manner. That is, if you decide that X will be divided into 5 files with, e.g., 1000 samples in the first file, 1234 samples in the second, etc., then you must divide Meta and Labels in the same manner: train_data1.h5 will have 1000 samples of X, Meta and Labels; train_data2.h5 will have 1234 samples of X, Meta and Labels; and so forth.

Caffe does not load all the data into memory; it only fetches the batch it needs for the current SGD iteration. Therefore, it makes no sense to split the variables across different files. Moreover, it might help to make the number of samples stored in each file an integer multiple of your batch_size.
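As an illustration, a sketch (with made-up data; only the dataset and file names roughly follow the question) of splitting X, Meta and Labels so that every file holds the same samples of all three datasets:

import h5py
import numpy as np

X = np.random.rand(5000, 128).astype('f4')                 # hypothetical data
Meta = np.random.rand(5000, 10).astype('f4')
Labels = np.random.randint(0, 2, (5000, 1)).astype('f4')

chunk = 1000                                               # samples per file
with open('hdf5_train.txt', 'w') as list_file:
    for k, start in enumerate(range(0, X.shape[0], chunk)):
        end = start + chunk
        fname = 'train_data%d.h5' % (k + 1)
        with h5py.File(fname, 'w') as f:
            f.create_dataset('X', data=X[start:end])
            f.create_dataset('Meta', data=Meta[start:end])
            f.create_dataset('Labels', data=Labels[start:end])
        list_file.write(fname + '\n')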

Question:

I have an hdf5 layer that reads the information from list.txt, as follows:

layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "./list.txt"
    batch_size: 4
    shuffle: true
  }
}

where list.txt contains two file paths:

/home/user/file1.h5
/home/user/file2.h5

while the batch size is 4. What happens with the above code? Does the data layer choose 4 files to feed the network?


Answer:

You have two hdf5 files, but each file may contain more than a single training example. Thus, effectively, you may have far more than batch_size: 4 examples.

Caffe does not really care about the actual number of training examples: when it finishes processing all the examples (i.e., an "epoch"), it simply starts reading the samples again from the beginning. Caffe cycles through all the samples until the requested number of training/testing iterations is reached.
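If in doubt, you can count how many examples the list file actually points to; a quick sketch (assuming the dataset behind the "data" top is named data, as in your layer definition):

import h5py

total = 0
with open('list.txt') as f:
    for line in f:
        path = line.strip()
        if not path:
            continue
        with h5py.File(path, 'r') as h5:
            total += h5['data'].shape[0]
print('total examples: %d' % total)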

Question:

I am trying to test my caffe model by feeding it a blob with all ones in it. So I create an hdf5 file with:

import h5py, os
import numpy as np

SIZE = 227 # fixed size to all images

X = np.ones((1, 3, SIZE, SIZE), dtype='f8')

with h5py.File('test_idty.h5','w') as H:
    H.create_dataset('img', data=X ) 
with open('test_h5_idty_list.txt','w') as L:
    L.write( '/home/wei/deep_metric/test_idty.h5' )

Then, I change my caffe prototxt to be:

layer {
  name: "data"
  type: "HDF5Data"
  top: "img"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "/home/wei/deep_metric/test_h5_idty_list.txt"
    batch_size: 1
  }
}

Then, I try to make sure my data is fed correctly by:

net = caffe.Net(Model,Pretrained,caffe.TEST)
data = net.blobs['img'].data.copy()

However, this gives me all zeros in the matrix. Any idea how to solve it?

Appreciated!


Answer:

In order for "HDF5Data" layer to read it's first batch you need to call net.forward() first. Once a forward pass is done, the tops of the layer has the data read from files.

Question:

I have the following structure in a .txt file:

/path/to/image x y
/path/to/image x y

where x and y are integers.

What I want to do now is create an hdf5 file to use in Caffe ('train.prototxt').

My Python code looks like this:

import h5py
import numpy as np
import os

text = 'train'
text_dir = text + '.txt'

data = np.genfromtxt(text_dir, delimiter=" ", dtype=None)

h = h5py.File(text + '.hdf5', 'w')
h.create_dataset('data', data=data[:1])
h.create_dataset('label', data=data[1:])

with open(text + "_hdf5.txt", "w") as textfile:
    textfile.write(os.getcwd() + '/' +text + '.hdf5')

But this does not work! Any ideas what could be wrong?


Answer:

It does not work because your 'data' is the /path/to/image string instead of the image itself.

See this answer for more information.
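To illustrate, a sketch (assuming all images share the same size and that caffe.io is available; the file names are from the question, everything else is an assumption) of storing the decoded images rather than their paths:

import h5py
import numpy as np
import caffe

with open('train.txt') as f:
    lines = [l.split() for l in f if l.strip()]

# Decode every image; they must all have the same shape to be stacked
images = np.array([caffe.io.load_image(path) for path, x, y in lines], dtype='f4')
labels = np.array([[float(x), float(y)] for path, x, y in lines], dtype='f4')

with h5py.File('train.hdf5', 'w') as h:
    h.create_dataset('data', data=images)
    h.create_dataset('label', data=labels)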

Question:

I have the following structure in a .txt file:

/path/to/image x y
/path/to/image x y

where x and y are integers.

What I want to do now is create an hdf5 file to use in Caffe ('train.prototxt').

My Python code looks like this:

import h5py, os
import caffe
import numpy as np

SIZE = 256
with open( 'train.txt', 'r' ) as T :
    lines = T.readlines()


count_files = 0
split_after = 1000
count = -1

# If you do not have enough memory split data into
# multiple batches and generate multiple separate h5 files
X = np.zeros( (split_after, 3, SIZE, SIZE), dtype='f4' )
y1 = np.zeros( (split_after, 1), dtype='f4' )
y2 = np.zeros( (split_after, 1), dtype='f4' )

for i,l in enumerate(lines):
    count += 1
    sp = l.split(' ')
    img = caffe.io.load_image( sp[0] )
    img = caffe.io.resize( img, (3, SIZE, SIZE) )

    X[count] = img
    y1[count] = float(sp[1])
    y2[count] = float(sp[2])

    if (count+1) == split_after:
        with h5py.File('train_' + str(count_files) +  '.h5','w') as H:
            H.create_dataset( 'X', data=X ) # note the name X given to the dataset!
            H.create_dataset( 'y1', data=y1 )
            H.create_dataset( 'y2', data=y2 )

            X = np.zeros( (split_after, 3, SIZE, SIZE), dtype='f4' )
            y1 = np.zeros( (split_after, 1), dtype='f4' )
            y2 = np.zeros( (split_after, 1), dtype='f4' )
        with open('train_h5_list.txt','a') as L:
            L.write( 'train_' + str(count_files) + '.h5') # list all h5 files you are going to use
        count_files += 1
        count = 0

In fact, I want to estimate angles. That means I have two label sets: one for vertical angles and one for horizontal angles. The first class ranges from 0-10 degrees, the second from 10-20, and so on (for both horizontal and vertical angles).

What would the .prototxt look like? Here are my last layers:

layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 36
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "y"
  top: "loss"
}

Answer:

You also need to modify the input layer: now you have three tops:

layer {
  type: "HDF5Data"
  name: "data"
  top: "X"
  top: "y1"
  top: "y2"
  # ... params and phase
}

Now, the top of your fc7 serves as a "high level descriptor" of your data, from which you wish to predict y1 and y2. Thus, after layer fc7 you should have:

layer {
  type: "InnerProduct"
  name: "class_y1" 
  bottom: "fc7"
  top: "class_y1"
  #... params num_output: 36 
}
layer {
  type: "SoftmaxWithLoss" # to be replaced with "Softmax" in deploy
  name: "loss_y1"
  bottom: "class_y1"
  bottom: "y1"
  top: "loss_y1"
  # optionally, loss_weight
}

And:

layer {
  type: "InnerProduct"
  name: "class_y2" 
  bottom: "fc7"
  top: "class_y2"
  #... params num_output: 36 
}
layer {
  type: "SoftmaxWithLoss" # to be replaced with "Softmax" in deploy
  name: "loss_y2"
  bottom: "class_y2"
  bottom: "y2"
  top: "loss_y2"
  # optionally, loss_weight
}
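Regarding the labels themselves, a small sketch (the 10-degree binning is my assumption from your description) of turning the two angles into the class indices y1 and y2 that the softmax branches expect:

import numpy as np

def angle_to_class(angle_deg):
    # Map an angle in [0, 360) degrees to a class index in {0, ..., 35} (10-degree bins)
    return int(angle_deg // 10) % 36

# e.g. for three hypothetical vertical angles
y1 = np.array([[angle_to_class(a)] for a in (5.0, 17.3, 359.9)], dtype='f4')
print(y1)   # class indices 0, 1 and 35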

Question:

I want to train a net to recognize RGB values in an image (input: 256x256 images and some RGB value).

I wrote a script that creates an HDF5 file for float multi-labels:

import h5py, os
import caffe
import numpy as np

SIZE = 256 # images size
with open( '/home/path/images' ) as T:
    lines = T.readlines()

X = np.zeros( (len(lines), SIZE, SIZE, 3), dtype='f4' )
r = np.zeros( (len(lines),1), dtype='f4' )
g = np.zeros( (len(lines),1), dtype='f4' )
b = np.zeros( (len(lines),1), dtype='f4' )

for i,l in enumerate(lines):
    sp = l.split(' ')
    img = caffe.io.load_image( sp[0] )
#    img = caffe.io.resize( img, (3, SIZE, SIZE) ) # resize to fixed $
    print img
    X[i] = img
#    print X[i]
    r[i] = float(sp[1])
    g[i] = float(sp[2])
    b[i] = float(sp[3])
    print "R" + str(r[i]) + "G" + str(g[i]) + "B" + str(b[i])
with h5py.File('/home/path/train.h5', 'w') as H:
    H.create_dataset('X', data=X)
    H.create_dataset('r', data=r)
    H.create_dataset('g', data=g)
    H.create_dataset('b', data=b) 

with open('/home/path/train_h5_list.txt', 'w') as L:
    L.write( 'train.h5' ) # list all h5 files

I'm using a multi-label regression net (full prototxt below). When I run TRAIN on this net with my dataset (HDF5), I get this error:

name: "FKPReg"
state {
  phase: TRAIN
}
layer {
  name: "fkp"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "/home/path/train_h5_list.txt"
    batch_size: 64
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 32
    kernel_size: 11
    stride: 2
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 7
    group: 2
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 3
    alpha: 5e-05
    beta: 0.75
    norm_region: WITHIN_CHANNEL
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  convolution_param {
    num_output: 32
    pad: 1
    kernel_size: 5
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 5
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  convolution_param {
    num_output: 32
    pad: 1
    kernel_size: 5
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 4
    stride: 2
  }
}
layer {
  name: "drop0"
  type: "Dropout"
  bottom: "pool5"
  top: "pool5"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool5"
  top: "ip1"
  inner_product_param {
    num_output: 100
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "drop1"
  type: "Dropout"
  bottom: "ip1"
  top: "ip1"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 3
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu22"
  type: "ReLU"
  bottom: "ip2"
  top: "ip2"
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I1106 11:47:52.235343 28083 layer_factory.hpp:74] Creating layer fkp
I1106 11:47:52.235384 28083 net.cpp:90] Creating Layer fkp
I1106 11:47:52.235410 28083 net.cpp:368] fkp -> data
I1106 11:47:52.235443 28083 net.cpp:368] fkp -> label
I1106 11:47:52.235481 28083 net.cpp:120] Setting up fkp
I1106 11:47:52.235496 28083 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: /home/path/train_h5_list.txt
I1106 11:47:52.235568 28083 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
HDF5-DIAG: Error detected in HDF5 (1.8.11) thread 140703305845312:
  #000: ../../../src/H5F.c line 1586 in H5Fopen(): unable to open file
    major: File accessibilty
    minor: Unable to open file
  #001: ../../../src/H5F.c line 1275 in H5F_open(): unable to open file: time = Sun Nov  6 11:47:52 2016
, name = 'train.h5', tent_flags = 0
    major: File accessibilty
    minor: Unable to open file
  #002: ../../../src/H5FD.c line 987 in H5FD_open(): open failed
    major: Virtual File Layer
    minor: Unable to initialize object
  #003: ../../../src/H5FDsec2.c line 343 in H5FD_sec2_open(): unable to open file: name = 'train.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0
    major: File accessibilty
    minor: Unable to open file
F1106 11:47:52.236398 28083 hdf5_data_layer.cpp:32] Failed opening HDF5 file: train.h5
*** Check failure stack trace: ***
    @     0x7ff809dfcdaa  (unknown)
    @     0x7ff809dfcce4  (unknown)
    @     0x7ff809dfc6e6  (unknown)
    @     0x7ff809dff687  (unknown)
    @     0x7ff80a194406  caffe::HDF5DataLayer<>::LoadHDF5FileData()
    @     0x7ff80a192c98  caffe::HDF5DataLayer<>::LayerSetUp()
    @     0x7ff80a173be3  caffe::Net<>::Init()
    @     0x7ff80a175952  caffe::Net<>::Net()
    @     0x7ff80a15bbf0  caffe::Solver<>::InitTrainNet()
    @     0x7ff80a15cbc3  caffe::Solver<>::Init()
    @     0x7ff80a15cd96  caffe::Solver<>::Solver()
    @           0x40c5d0  caffe::GetSolver<>()
    @           0x406611  train()
    @           0x404bb1  main
    @     0x7ff80930ef45  (unknown)
    @           0x40515d  (unknown)
    @              (nil)  (unknown)
Aborted (core dumped)

What am I doing wrong? Thanks!


Answer:

A few comments:

  1. caffe.io.resize( img, (3, SIZE, SIZE) ) - this is WRONG. You need to resize to (SIZE, SIZE) and then transpose to (3, SIZE, SIZE). Resize should affect only the spatial dimensions of the image, and transpose should take care of arranging the channel dimension before height and width. Consequently, the shape of X should be (len(lines), 3, SIZE, SIZE).

  2. If your HDF5 file has datasets X, r, g and b, then your "HDF5Data" layer can have "top"s "X", "r", "g" and/or "b". You cannot have "data" or "label" as "top"s since there are no such datasets in the input hdf5 file.

  3. The error message you got states (quite clearly) that

    error message = 'No such file or directory'

    This usually means that train.h5 cannot be found relative to the working directory. Try writing the full path of train.h5 in /home/path/train_h5_list.txt.
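To make point 1 concrete, a sketch of the resize-then-transpose order (I use caffe.io.resize_image here; whether that exact helper fits your pipeline is an assumption):

import caffe

SIZE = 256
img = caffe.io.load_image('image.jpg')           # H x W x 3, float values in [0, 1]
img = caffe.io.resize_image(img, (SIZE, SIZE))   # resize only the spatial dimensions
img = img.transpose(2, 0, 1)                     # -> 3 x SIZE x SIZE, ready for X[i]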

Question:

I'm preparing to train in Caffe using data in an hdf5 file. This file also contains the per-pixel mean data/image of the training set. In the 'train_val.prototxt' file, in the 'transform_param' section of the input data layer, it is possible to use a mean_file to normalize the data, usually in binaryproto format; for example, for the ImageNet Caffe tutorial:

transform_param {
  mirror: true
  crop_size: 227
  mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
}

For per-channel normalization one can use mean_value instead of mean_file.

But is there any way to use mean image data directly from my database (here hdf5) file?

I have extracted the mean from the hdf5 file to a numpy file, but I am not sure whether that can be used in the prototxt, or converted. I can't find information about this in the Caffe documentation.


Answer:

AFAIK, "HDF5Data" layer does not support transformations. You should subtract the mean values yourself when you store the data to HDF5 files.

If you want to save a numpy array in a binaryproto format, you can see this answer for more details.
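A minimal sketch (file and dataset names are assumptions) of subtracting a per-pixel mean image before writing the data to HDF5, since "HDF5Data" has no transform_param:

import h5py
import numpy as np

mean = np.load('mean.npy')   # per-pixel mean, e.g. shape (3, H, W)

with h5py.File('raw.h5', 'r') as src, h5py.File('train.h5', 'w') as dst:
    data = src['data'][...] - mean               # broadcasts over the sample axis
    dst.create_dataset('data', data=data.astype('f4'))
    dst.create_dataset('label', data=src['label'][...])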