Hot questions for Using Neural networks in deep dream


I am interested in a recent blog post by Google that describes the use of neural networks to make art.

I am particularly interested in one technique:

'In this case we simply feed the network an arbitrary image or photo and let the network analyze the picture. We then pick a layer and ask the network to enhance whatever it detected. Each layer of the network deals with features at a different level of abstraction, so the complexity of features we generate depends on which layer we choose to enhance. For example, lower layers tend to produce strokes or simple ornament-like patterns, because those layers are sensitive to basic features such as edges and their orientations.'

The post is

My question: the post describes this as a 'simple' case. Is there an open-source implementation of a neural network that could be used for this purpose in a relatively plug-and-play way? For just the technique described, does the network need to be trained?

No doubt for other techniques mentioned in the paper one needs a network already trained on a large number of images, but for the one I've described is there already some kind of open-source network layer visualization package?


UPD: Google posted more detailed instructions on how they implemented it:

There's also another project:

If you read [1], [2], [3], [4] from your link, you'll see that they used Caffe. This framework already contains the trained networks to play with. You don't need to train anything manually, just download the models using .sh scripts in the models/ folder.

You want a "plug-and-play process", but it is not that easy: besides the framework, we need the code of the scripts they used and probably a patch to Caffe. I tried to build something from their description. Caffe has Python and Matlab interfaces, but its internals offer more than those interfaces expose.

The text below describes my thoughts on how it could possibly be implemented. I'm not sure I'm right, so it's more an invitation to research with me than a "plug-and-play process". But since no one has answered yet, let me put it here. Maybe someone will correct me.


As far as I understand, they run an optimization:

[sum((net.forwardTo(X, n) - enchanced_layer).^2) + lambda * R(X)] -> min

I.e., look for an input X such that the chosen layer of the network produces the "enhanced" data instead of the "original" data.

There's also a regularization term R(X): X should look like a "natural image" (without high-frequency noise).

X is our target image. The initial point X0 is the original image. forwardTo(X, n) is what the network produces in layer n when we feed it X as input. In Caffe, you can make a full forward pass (net.forward) and look at the blob you are interested in (net.blob_vec(n).get_data()).

enchanced_layer: we take the original layer blob and "enhance" the signals in it. What exactly that means, I don't know. Maybe they just multiply the values by a coefficient, maybe something else.

Thus sum((forwardTo(X, n) - enchanced_layer).^2) becomes zero when your input image produces exactly what you want in layer n.

lambda is the regularization parameter and R(X) measures how far X is from looking natural. I didn't implement it, and my results look very noisy. As for its formula, you can look it up in [2].
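To make the objective above concrete, here is a minimal sketch under loud assumptions: the "network" is a single random linear layer followed by a ReLU (not Caffe), the target that plays the role of enchanced_layer is obtained by scaling the original activations with a hypothetical `gain`, and R(X) is a simple total-variation-style smoothness penalty used as a placeholder. It is not the regularizer from [2].

```python
import numpy as np

rng = np.random.default_rng(0)
H, W_IMG = 8, 8                                  # toy image size (assumption)
weights = rng.standard_normal((16, H * W_IMG))   # toy "layer n" (assumption)

def forward_to(x):
    """Toy stand-in for forwardTo(X, n): linear map followed by ReLU."""
    return np.maximum(weights @ x.ravel(), 0.0)

def total_variation(x):
    """Placeholder R(X): sum of squared differences between neighbouring pixels."""
    dh = np.diff(x, axis=0)
    dw = np.diff(x, axis=1)
    return float((dh ** 2).sum() + (dw ** 2).sum())

def objective(x, target, lam):
    """sum((forwardTo(X, n) - target).^2) + lambda * R(X)."""
    residual = forward_to(x) - target
    return float((residual ** 2).sum()) + lam * total_variation(x)

x0 = rng.random((H, W_IMG))        # the "original image", used as starting point
gain = 2.0                         # hypothetical enhancement coefficient
target = gain * forward_to(x0)     # plays the role of enchanced_layer

print(objective(x0, target, lam=0.0))   # data term only, at the starting point
print(objective(x0, target, lam=0.1))   # data term plus the smoothness penalty
```

With lam set to 0 the objective reduces to the data term alone, which is the setting that produced the noisy results described here.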

I used Matlab and fminlbfgs to optimize.

The key part was to find the gradient of the formula above because the problem has too many dimensions to calculate the gradient numerically.

As I said, I didn't manage to find the gradient of R(X). As for the main part of the formula, I managed to find it this way:

  • Set the diff blob at layer n to forwardTo(X, n) - enchanced_layer. (See the Caffe documentation for set_diff and set_data: set_data is used for the forward pass and expects data, while set_diff is used for backward propagation and expects the errors.)
  • Perform partial backpropagation from layer n-1 to the input.
  • The input diff blob will then contain the gradient we need.
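The three steps above can be illustrated on a toy network where the backward pass is written by hand. This is only a sketch under assumptions: two small dense layers with a ReLU stand in for the real net, and `backward_to_input` is a hypothetical helper, not a Caffe call. One detail worth noting is that the exact gradient of the squared error is twice the back-propagated residual; using the plain residual as the diff only rescales the gradient.

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.standard_normal((12, 20))   # toy layer 1 weights (assumption)
W2 = rng.standard_normal((6, 12))    # toy layer 2 weights; its output is "layer n"

def forward_to(x):
    """Forward pass up to layer n; also returns what the backward pass needs."""
    pre = W1 @ x
    hidden = np.maximum(pre, 0.0)
    return W2 @ hidden, pre

def backward_to_input(diff, pre):
    """Propagate a diff set at layer n back to the input (hand-written backprop)."""
    d_hidden = W2.T @ diff
    d_pre = d_hidden * (pre > 0)     # ReLU passes gradient only where it was active
    return W1.T @ d_pre

def loss(x, target):
    out, _ = forward_to(x)
    return float(((out - target) ** 2).sum())

x = rng.standard_normal(20)
target = 1.5 * forward_to(x)[0]      # "enhanced" activations via a hypothetical gain

# Step 1: the diff at layer n is the residual; steps 2-3: backprop it to the input.
out, pre = forward_to(x)
grad = 2.0 * backward_to_input(out - target, pre)

# Sanity check against central finite differences (feasible only for tiny inputs).
eps = 1e-6
numeric = np.array([
    (loss(x + eps * e, target) - loss(x - eps * e, target)) / (2 * eps)
    for e in np.eye(x.size)
])
print(np.max(np.abs(grad - numeric)))
```

The finite-difference check needs two forward passes per input dimension, which is why it is only usable here as a test on a 20-dimensional input and not as the gradient for a full image.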

The Python and Matlab interfaces do NOT expose partial backward propagation, but the Caffe C++ internals do. I added a patch below to make it available in Matlab.

Result of enhancing the 4th layer:

I'm not happy with the results but I think there's something in common with the article.


I am writing a bit about Google's DeepDream. DeepDream can be used to inspect trained networks; see the dumbbell example on the Google Research blog. In that example, a network is trained to recognize dumbbells. DeepDream is then used to see what the network has learned, and the result shows that the network was trained badly, because it recognizes a dumbbell plus an arm as a dumbbell.

My question is: how are networks checked in practice? With DeepDream, or with some other method?

Best greetings


Generally in machine learning you validate your learned network on a dataset you did not use in the training process (a test set). So in this case, you would have a set of examples with and without dumbbells that was used to train the model, as well as a second set (also with and without dumbbells) that was not seen during training.

When you have your model, you let it predict the labels of the withheld set. You then compare these predicted labels to the actual ones:

  • Every time it correctly predicts a dumbbell, increment the number of True Positives.
  • When it correctly predicts the absence of a dumbbell, increment the number of True Negatives.
  • When it predicts a dumbbell but there is none, increment the number of False Positives.
  • Finally, when it predicts no dumbbell but there is one, increment the number of False Negatives.

Based on these four counts, you can then compute measures such as the F1 score or accuracy to quantify the performance of the model. (Have a look at the following wiki: )
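As a small illustration of the counting described above, the following sketch tallies the four counts for binary labels and derives accuracy and F1 from them. The function names and the label lists are made up for this example; in practice a library such as scikit-learn provides equivalent metrics.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (True = dumbbell present)."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p and a:
            tp += 1
        elif not p and not a:
            tn += 1
        elif p and not a:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp, tn, fp, fn):
    """Harmonic mean of precision and recall (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Made-up held-out labels, purely for illustration.
actual    = [True, True, True, False, False, False, True, False]
predicted = [True, True, False, False, True, False, True, False]

counts = confusion_counts(actual, predicted)
print(counts, accuracy(*counts), f1_score(*counts))
```

Accuracy can look good on unbalanced data (for example, very few dumbbell images), which is one reason to also report F1 or precision and recall.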


I am trying to implement deepdream in C++ in Caffe (I want to run it on Android). GoogLeNet requires input of shape 224*224*3. In the IPython notebook of deepdream it shows src.reshape(1,3,h,w). Does this mean that only the input blob is reshaped, or is the reshape propagated through the network? I tried calling net.Reshape() in C++ and it resulted in:

F0307 01:27:24.529654 31857 inner_product_layer.cpp:64] Check failed: K_ == new_K 
(1024 vs. 319488) Input size incompatible with inner product parameters.

Shouldn't the network be reshaped too? If not, what is the implication of just reshaping the input blob? I am new to deep learning, so forgive me if it seems trivial.


Changing the shape of the input requires reshaping the entire net. Alas, some layer types do not like to be reshaped. Specifically, the "InnerProduct" layer: the number of trainable parameters of an inner product layer depends on the exact input shape and the output shape. Therefore a net with an "InnerProduct" layer cannot simply be reshaped to a different spatial input size.
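The numbers in the error message fit this explanation. As a rough sketch, assuming the failing layer is a classifier whose bottom blob has 1024 channels and a 1 x 1 spatial size at the trained input resolution, the flattened input length K is channels * height * width. The helper name and the 12 x 26 map below are hypothetical, chosen only because they reproduce the reported mismatch.

```python
def inner_product_k(channels, height, width):
    """Flattened input length K seen by an InnerProduct layer for a C x H x W bottom blob."""
    return channels * height * width

trained_k = inner_product_k(1024, 1, 1)   # 1024, the K_ in the error message
new_k = 319488                            # new_K from the error message
spatial_positions = new_k // trained_k    # spatial positions in the reshaped bottom blob

print(trained_k, spatial_positions)

# Hypothetical example: a 12 x 26 pooled map is one shape that gives this mismatch.
print(inner_product_k(1024, 12, 26) == new_k)
```

The learned weight matrix has exactly 1024 columns, so it cannot be applied to a 319488-element input, which is what the failed check reports.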

You can use methods described in the "net surgery" example to convert the inner product layers to equivalent convolutional layers (that can be reshaped).
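The idea behind that conversion can be checked numerically without Caffe. The sketch below is an illustration with toy sizes, not the net surgery example itself: the fully connected weights are reshaped into filters that cover the whole original input, and a plain NumPy loop plays the role of the convolution layer. All names and sizes are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
C, KH, KW, OUT = 4, 3, 3, 5                      # toy sizes (assumption)
fc_weights = rng.standard_normal((OUT, C * KH * KW))
fc_bias = rng.standard_normal(OUT)

def inner_product(x):
    """Fully connected layer: only accepts a C x KH x KW input."""
    return fc_weights @ x.ravel() + fc_bias

# "Surgery": the same numbers, viewed as OUT filters of shape C x KH x KW.
conv_weights = fc_weights.reshape(OUT, C, KH, KW)

def convolution(x):
    """Valid cross-correlation, stride 1; accepts any C x H x W with H >= KH, W >= KW."""
    _, h, w = x.shape
    out = np.empty((OUT, h - KH + 1, w - KW + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[:, i:i + KH, j:j + KW]
            out[:, i, j] = np.tensordot(conv_weights, patch, axes=3) + fc_bias
    return out

small = rng.standard_normal((C, KH, KW))   # the size the FC layer was built for
large = rng.standard_normal((C, 6, 8))     # a larger input the FC layer would reject

print(np.allclose(convolution(small)[:, 0, 0], inner_product(small)))
print(convolution(large).shape)
```

On the original input size the convolution yields a 1 x 1 map equal to the fully connected output; on a larger input it yields one such output per window position, which is why the converted net tolerates reshaping.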