Hot questions for using neural networks in OpenCL


While googling and doing some research, I was not able to find any serious or popular framework/SDK for scientific GPGPU computing and OpenCL on AMD hardware. Is there any literature and/or software I missed?

I am especially interested in deep learning.

As far as I know, the usual recommendation is NVIDIA hardware and CUDA frameworks. Additionally, all the big deep learning frameworks I know of, such as Caffe, Theano, Torch, DL4J, ..., are focused on CUDA and do not plan to support OpenCL/AMD.

Furthermore, one can find plenty of scientific papers and corresponding literature for CUDA-based deep learning tasks, but nearly nothing for OpenCL/AMD-based solutions.

Is there any chance that new or existing scientific frameworks will show up for OpenCL/AMD based solutions in 2015/16?

What is a good start for deep learning with OpenCL/AMD? Any literature? Tutorials? Miscellaneous sources?


Edit 1 See Mikael Rousson's answer - Amazon is now the way forwards as you can "rent" computational power from them.

Edit 2 I've created a series of guides on how to set up Amazon EC2 instances for deep learning with Theano. It's a lot more convenient than running on a personal machine.

Edit 3 It seems that TensorFlow is now far more widely accepted than Theano, so I have updated the guide accordingly.

I have been in the same situation as you, as I have a MacBook Pro with Intel Iris graphics. I have spent the best part of a week looking through all possible workarounds, and I would more than welcome alternatives to those I offer.

The best solution I currently have is to:

  1. Install the Python library TensorFlow, utilise what GPU support there is, and keep updating to the latest development versions.
  2. Use Theano, relying on its existing GPU support in the same way as with TensorFlow.
  3. Buy an NVIDIA graphics card and use it in a PC.
  4. If you absolutely need a solution in OpenCL and you are willing to code everything yourself from a high level of understanding (no tutorials), look at DeepCL and possibly PyOpenCL.
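
To give a sense of what option 4 involves, here is a minimal sketch of a single element-wise sigmoid written directly against PyOpenCL. The kernel name `sigmoid`, the helpers `sigmoid_opencl` and `sigmoid_numpy`, and the NumPy fallback are illustrative assumptions and not part of DeepCL or any framework; the fallback is only there so the snippet still runs on a machine without PyOpenCL or a usable OpenCL device.

```python
import numpy as np

# OpenCL C source for a hypothetical element-wise sigmoid kernel.
KERNEL_SRC = """
__kernel void sigmoid(__global const float *x, __global float *out) {
    int i = get_global_id(0);
    out[i] = 1.0f / (1.0f + exp(-x[i]));
}
"""


def sigmoid_numpy(x):
    """CPU reference implementation."""
    return 1.0 / (1.0 + np.exp(-x))


def sigmoid_opencl(x):
    """Run the kernel on whichever OpenCL device PyOpenCL selects."""
    import pyopencl as cl  # imported lazily so sigmoid() can catch ImportError

    x = np.ascontiguousarray(x, dtype=np.float32)
    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags
    x_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=x)
    out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, x.nbytes)
    program = cl.Program(ctx, KERNEL_SRC).build()
    # One work-item per element; None lets the runtime pick the local size.
    program.sigmoid(queue, x.shape, None, x_buf, out_buf)
    out = np.empty_like(x)
    cl.enqueue_copy(queue, out, out_buf)
    return out


def sigmoid(x):
    """Try OpenCL first; fall back to NumPy if PyOpenCL or a device is missing."""
    try:
        return sigmoid_opencl(x)
    except Exception:  # ImportError, no platform, kernel build failure, ...
        return sigmoid_numpy(np.asarray(x, dtype=np.float32))


values = np.linspace(-4.0, 4.0, 9, dtype=np.float32)
result = sigmoid(values)
print(result)
```

Even this one activation function needs explicit buffers, a kernel string, and a copy back to the host, which is the overhead the list item refers to.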

I have found that any solution using OpenCL, e.g. PyOpenCL, doesn't yet have user-friendly interfaces for deep learning, i.e. it will take longer to write the code with such an alternative than to write it quickly and run it on a CPU. With that said, here are some of the best alternative OpenCL libraries for deep learning:

In Development


After a series of pains, I have installed Theano on a machine with an AMD graphics card - Radeon HD 5450 (Cedar).

Now, consider the following code.

import numpy
import theano
import theano.tensor as T
rng = numpy.random

N = 400         #number of samples
feats = 784     #dimensionality of features
D = (rng.randn(N, feats), rng.randint(size=N, low=0, high=2))
training_steps = 10000

# theano symbolic variables
x = T.matrix("x")
y = T.vector("y")
w = theano.shared(rng.randn(784), name="w")
b = theano.shared(0., name="b")

print("Initial Model:")
print(str(w.get_value()) + " " + str(b.get_value()) )

p_1 = 1/(1 + T.exp(-T.dot(x, w) - b))       # probability of target being 1
prediction = p_1 > 0.5                      # prediction threshold
xent = -y * T.log(p_1) - (1-y)*T.log(1-p_1) # cross-entropy loss function
cost = xent.mean() + 0.01 * (w**2).sum()    # cost - to be minimized
gw, gb = T.grad(cost, [w, b])

#compile it
train = theano.function(
                        inputs = [x, y],
                        outputs = [prediction, xent],
                        updates = {w: w - 0.1*gw, b: b - 0.1*gb}    )

predict = theano.function(inputs = [x], outputs = prediction)

#train it
for i in range (training_steps):
    pred, err = train(D[0], D[1])

print("Final Model: ")
print(str(w.get_value()) + " " + str(b.get_value()) )
print("Target values for D: " + str(D[1]))
print("Predictions on D: " + str(predict(D[0])))

I think this code should work just fine. But I get a series of errors:

ERROR (theano.gof.opt): Optimization failure due to: local_gpua_hgemm
ERROR (theano.gof.opt): node: dot(x.T, Elemwise{sub,no_inplace}.0)
ERROR (theano.gof.opt): TRACEBACK:
ERROR (theano.gof.opt): Traceback (most recent call last):
  File "/home/user/anaconda3/lib/python3.5/site-packages/theano/gof/", line 1772, in process_node
    replacements = lopt.transform(node)
  File "/home/user/anaconda3/lib/python3.5/site-packages/theano/sandbox/gpuarray/", line 140, in local_opt
    new_op = maker(node, context_name)
  File "/home/user/anaconda3/lib/python3.5/site-packages/theano/sandbox/gpuarray/", line 732, in local_gpua_hgemm
    if nvcc_compiler.nvcc_version < '7.5':
TypeError: unorderable types: NoneType() < str()

And I get the same set of messages multiple times. Then at the end:

  File "/home/user/anaconda3/lib/python3.5/site-packages/pygpu-0.2.1-py3.5-linux-x86_64.egg/pygpu/", line 286, in __init__
  File "pygpu/gpuarray.pyx", line 1950, in pygpu.gpuarray.GpuKernel.__cinit__ (pygpu/gpuarray.c:24214)
  File "pygpu/gpuarray.pyx", line 467, in pygpu.gpuarray.kernel_init (pygpu/gpuarray.c:7174)
pygpu.gpuarray.UnsupportedException: ('The following error happened while compiling the node', GpuElemwise{Composite{((-i0) - i1)}}[(0, 0)]<gpuarray>(GpuFromHost<None>.0, InplaceGpuDimShuffle{x}.0), '\n', b'Device does not support operation')

Does this mean I cannot use this GPU, or have I done something wrong in my code? Moreover, from the errors, it seems there has been a search for nvcc. But I do not have CUDA, I have OpenCL.

>>> import theano
Mapped name None to device opencl0:0: Cedar


>>> from theano import config
>>> config.device
>>> config.cuda
<theano.configparser.AddConfigVar.<locals>.SubObj object at 0x7fba9dee7d30>
>>> config.nvcc
<theano.configparser.AddConfigVar.<locals>.SubObj object at 0x7fba9e5967f0>
>>> config.gpu
<theano.configparser.AddConfigVar.<locals>.SubObj object at 0x7fbaa9f61828>

So how do I go on from here? Is there a way to make sure clcc is searched for instead of nvcc?

PS_1: hello world works. PS_2: System = 14.04 64 bit


OpenCL is not yet supported by Theano. As a result, only NVIDIA GPUs are supported.

The status of OpenCL is recorded on GitHub.

You need to disable GPU operation by setting device=cpu in your Theano config. There are multiple ways to do this (e.g. via the THEANO_FLAGS environment variable or via a .theanorc file; see the documentation).

Before running the script, try setting

export THEANO_FLAGS=device=cpu,floatX=float64
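
The same setting can also be applied from inside a Python script, which may help when the shell environment cannot be changed. This sketch assumes that Theano reads THEANO_FLAGS when it is first imported, so the variable has to be set before `import theano`; the helper name `build_theano_flags` is made up for illustration.

```python
import os


def build_theano_flags(**options):
    """Join keyword options into the comma-separated THEANO_FLAGS format."""
    return ",".join("{}={}".format(key, value) for key, value in options.items())


# Set this before the first "import theano" anywhere in the process
# (assumption: the flags are only read at import time).
os.environ["THEANO_FLAGS"] = build_theano_flags(device="cpu", floatX="float64")

print(os.environ["THEANO_FLAGS"])
```

After this, `import theano` followed by `theano.config.device` should report `cpu` if the flags were picked up.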

Your situation may need additional configuration options. See the documentation for more.
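
To separate model bugs from backend problems, it can also help to run the same logistic regression without Theano at all. The following is a plain NumPy sketch of the model from the question, with the gradients of the regularized cross-entropy written out by hand; the seed, the reduced step count, and the function names are assumptions for illustration and not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so repeated runs give the same data

N, feats = 400, 784
X = rng.standard_normal((N, feats))
y = rng.integers(0, 2, size=N).astype(float)

w = rng.standard_normal(feats)
b = 0.0


def forward(X, w, b):
    """Probability that the target is 1, as in p_1 of the Theano script."""
    return 1.0 / (1.0 + np.exp(-X @ w - b))


def cost(X, y, w, b):
    """Mean cross-entropy plus the 0.01 * sum(w**2) penalty."""
    p = np.clip(forward(X, w, b), 1e-12, 1.0 - 1e-12)  # avoid log(0)
    xent = -y * np.log(p) - (1.0 - y) * np.log(1.0 - p)
    return xent.mean() + 0.01 * (w ** 2).sum()


initial_cost = cost(X, y, w, b)

training_steps = 1000  # fewer than the question's 10000 to keep the check quick
for _ in range(training_steps):
    err = forward(X, w, b) - y          # d(xent)/d(logit) for each sample
    gw = X.T @ err / N + 0.02 * w       # gradient of the cost w.r.t. w
    gb = err.mean()                     # gradient of the cost w.r.t. b
    w -= 0.1 * gw
    b -= 0.1 * gb

final_cost = cost(X, y, w, b)
predictions = forward(X, w, b) > 0.5
accuracy = (predictions == (y > 0.5)).mean()

print("cost before:", initial_cost, "after:", final_cost)
print("training accuracy:", accuracy)
```

If this runs and the cost drops, the model definition itself is sound and the remaining errors point at the GPU backend rather than at the script.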


I have recently been programming a convolutional back-propagation neural network, mainly using Java to run the program and libGDX for the graphical visualizations. After a lot of research, I found that to substantially improve performance I should perform the matrix calculations on the graphics card instead of on the CPU.

After looking through sources online, I found that the main way to perform such calculations on the graphics card is through OpenCL. After even more research, I discovered that my two main options for OpenCL support in Java are LWJGL and JOCL.

libGDX is built on LWJGL, so my first instinct was to see whether I could access the built-in OpenCL support through the libGDX library. However, after looking around, I found nothing about this at all.

My question is: can I access OpenCL through the libGDX library, and if so, how?

If I can't access LWJGL's OpenCL implementation, should I use JOCL for GPU mathematical computations, or should I add a second copy of LWJGL to my libGDX application?


I'm not sure whether it's in the Lwjgl2 backend in LibGDX, but I know the LibGDX Lwjgl3 implementation does not include it. However, Lwjgl3 is broken up into modules, so you can add the OpenCL module to your Gradle project.

In "core" dependencies, add

compile "org.lwjgl:lwjgl-opencl:3.1.0"

What I don't know is whether this OpenCL module has any dependencies on the core of Lwjgl3. If so, you might want to switch to the LibGDX Lwjgl3 backend. To switch to Lwjgl3, in the "desktop" dependencies, add 3 after lwjgl in the backend name, so:

compile "com.badlogicgames.gdx:gdx-backend-lwjgl3:$gdxVersion"

If you switch to Lwjgl3, you have to clean up some of the DesktopLauncher imports and class names, basically adding 3 after Lwjgl in the class names (scroll down here for instructions if you need them).

You may have to keep the version number in sync with the version of Lwjgl3 the LibGDX version is using.