`clEnqueueFillBuffer()` fills a buffer correctly only at random

opencl memset
enqueuereadbuffer

I'm trying to fill an OpenCL cl_int2 buffer with default values ({-1, -2}). However, clEnqueueFillBuffer() fills my buffer with different values each time I run it; the buffer contains the expected values only at random. The function returns error code 0 (CL_SUCCESS).

Examples of the snippet's output across multiple runs:

  • 0 : -268435456
  • 0 : -2147483648
  • 0 : -536870912
  • 0 : 268435456
  • 0 : 0
  • 0 : -1342177280
  • -1: -2

I'm running OS X 10.11.6 with a Radeon HD 6750M and OpenCL 1.2.

clbParticle_hashmap_lookup_table = clCreateBuffer(context,
                                                  CL_MEM_READ_WRITE,
                                                  sizeof(cl_int2)*this->CUBE_CELLS,
                                                  nullptr,
                                                  &err_code);

// ...

cl_int2 default_hashmap_pattern = { .s = {-1, -2} };

clEnqueueFillBuffer(queue,
                    clbParticle_hashmap_lookup_table,
                    &default_hashmap_pattern,
                    sizeof(cl_int2),
                    0,
                    sizeof(cl_int2)*this->CUBE_CELLS,
                    0,
                    nullptr, nullptr);

clFinish(queue);

// copy and print the data:
size_t   hashmap_lookup_table_size  = sizeof(cl_int2)*this->CUBE_CELLS;
cl_int2* hashmap_lookup_table_bytes = (cl_int2*) malloc(hashmap_lookup_table_size);

clEnqueueReadBuffer(queue,
                    clbParticle_hashmap_lookup_table,
                    CL_TRUE,
                    0,
                    hashmap_lookup_table_size,
                    hashmap_lookup_table_bytes,
                    0,
                    nullptr, nullptr);

clFinish(queue);

cout << endl << "Lookup table: " << endl;
for (int i=0; i<this->CUBE_CELLS; i++)
    cout << setw(10) << hashmap_lookup_table_bytes[i].s[0] << " : "
         << setw(10) << hashmap_lookup_table_bytes[i].s[1] << endl;

The problem is that your fill pattern is too large for your GPU. I ran into the same problem trying to fill a buffer with a cl_double pattern, which is 64 bits wide like your cl_int2. I think clEnqueueFillBuffer invokes a built-in kernel that doesn't allow patterns
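One workaround that sidesteps clEnqueueFillBuffer entirely is to build the repeated pattern on the host and upload it with clEnqueueWriteBuffer. The sketch below shows only the host-side part. `Int2` and `make_filled_staging` are hypothetical names introduced for illustration, and `Int2` is assumed to have the same layout as cl_int2, so the block compiles without the OpenCL headers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for cl_int2 so the sketch compiles without OpenCL headers
// (assumption: same layout, two 32-bit ints, 8 bytes in total).
struct Int2 {
    std::int32_t s[2];
};

// Build a host-side staging array in which every element equals `pattern`.
// In real host code the result would then be uploaded with a blocking write,
// in place of the clEnqueueFillBuffer call:
//   clEnqueueWriteBuffer(queue, buffer, CL_TRUE, 0,
//                        staging.size() * sizeof(Int2), staging.data(),
//                        0, nullptr, nullptr);
std::vector<Int2> make_filled_staging(std::size_t cell_count, Int2 pattern) {
    return std::vector<Int2>(cell_count, pattern);
}
```

For the question's buffer this would be called as `make_filled_staging(CUBE_CELLS, Int2{{-1, -2}})`. It costs a host allocation and a transfer, so it is best seen as a fallback for drivers where the fill command misbehaves.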

I can reproduce this. On a MacBook running macOS Sierra with a Radeon Pro 450, the following script:

int N = 100000;

float *a = new float[N];

cl_mem a_gpu = clCreateBuffer(context, CL_MEM_READ_WRITE, N * sizeof(float), 0, &err);
checkError(err);
for(int it = 0; it < 100; it++) {

    float value = 123.0f + it;
    err = clEnqueueFillBuffer(queue, a_gpu, &value, sizeof(value), 0, N * sizeof(float), 0, 0, 0);
    checkError(err);
    clFinish(queue);

    err = clEnqueueReadBuffer(queue, a_gpu, CL_TRUE, 0,
                                         sizeof(cl_float) * N, a, 0, NULL, NULL);
    checkError(err);
    clFinish(queue);

    cout << it << " a[N - 1]=" << a[N - 1] << endl;
}
delete[] a;

gives results like:

Using Apple , OpenCL platform: Apple
Using OpenCL device: AMD Radeon Pro 450 Compute Engine
0 a[N - 1]=-1.39445e-31
1 a[N - 1]=0
2 a[N - 1]=0
3 a[N - 1]=0
4 a[N - 1]=0
5 a[N - 1]=0
6 a[N - 1]=129
7 a[N - 1]=0
8 a[N - 1]=131
9 a[N - 1]=132
10 a[N - 1]=133
11 a[N - 1]=134
12 a[N - 1]=135
13 a[N - 1]=0
14 a[N - 1]=0
15 a[N - 1]=0
16 a[N - 1]=0
17 a[N - 1]=0
18 a[N - 1]=0
19 a[N - 1]=0
20 a[N - 1]=0
21 a[N - 1]=0
22 a[N - 1]=0
23 a[N - 1]=0
24 a[N - 1]=0
25 a[N - 1]=0
26 a[N - 1]=0
27 a[N - 1]=0
28 a[N - 1]=0
29 a[N - 1]=0
30 a[N - 1]=0
31 a[N - 1]=154
32 a[N - 1]=0

I have experienced this bug ONLY on macOS, since March 2017, when I started to learn OpenCL (I can't remember the macOS version at the time). The GPU is a GT 750M (which is probably irrelevant), and the pattern is a cl_double2. The same routine on a GTX 760 under Linux has no such problem. I suspect this is because the OpenCL 1.2 support on macOS is incomplete, as clinfo (compiled and executed on macOS) warns:

NOTE:   your OpenCL library only supports OpenCL 1.0,
        but some installed platforms support OpenCL 1.2.
        Programs using 1.2 features may crash
        or behave unexpectedly

The "corresponding" CUDA API, cudaMemset, takes only an int value (and applies it byte by byte). However, that restriction is stated in the CUDA documentation, while the OpenCL documentation clearly uses a cl_float4 (the same size as a cl_double2) as an example. So this is clearly a bug, not an undocumented feature.
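To make the "bug, not a feature" argument concrete, the sketch below mirrors the CL_INVALID_VALUE conditions that the OpenCL 1.2 specification lists for the pattern_size, offset and size arguments of clEnqueueFillBuffer. `fill_args_valid` is a hypothetical helper, not part of the OpenCL API, and it does not cover the separate requirement that offset + size stay within the buffer.

```cpp
#include <cstddef>

// Mirrors the CL_INVALID_VALUE conditions the OpenCL 1.2 specification lists
// for clEnqueueFillBuffer's pattern_size, offset and size arguments.
// Hypothetical helper for illustration; not part of the OpenCL API.
bool fill_args_valid(std::size_t pattern_size, std::size_t offset, std::size_t size) {
    // pattern_size must be one of {1, 2, 4, 8, 16, 32, 64, 128},
    // i.e. a non-zero power of two no larger than 128.
    const bool is_pow2 =
        pattern_size != 0 && (pattern_size & (pattern_size - 1)) == 0;
    if (!is_pow2 || pattern_size > 128) {
        return false;
    }
    // offset and size must both be multiples of pattern_size.
    return offset % pattern_size == 0 && size % pattern_size == 0;
}
```

By these rules an 8-byte cl_int2 pattern and a 16-byte cl_double2 pattern are both legal, so a call that returns CL_SUCCESS and still leaves wrong data in the buffer points at the implementation rather than at the arguments.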

But I guess Apple has solved this problem in macOS 10.14, because THEY ARE DEPRECATING OPENCL!

Comments
  • Can you try another way of initializing default_hashmap_pattern? I can't see any evident error in the code. Also check the error code of the ReadBuffer call.
  • @DarkZeros clEnqueueReadBuffer() returns 0. I've tried the following initializations of default_hashmap_pattern, but none of them solved the problem: 1) cl_int2 default_hashmap_pattern = { -1, -2 }; 2) cl_int2 default_hashmap_pattern; default_hashmap_pattern.s[0] = -1; default_hashmap_pattern.s[1] = -2;
  • Why don't you use an array instead of a struct? Yes, your goal might be to use a struct, but at least you can check whether the issue is related to enqueueFillBuffer or not.
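One way to act on the last suggestion is to treat both the pattern and the data read back from the device as plain arrays and compare them byte by byte, which takes the cl_int2 struct out of the picture. The sketch below is a minimal host-side check. `first_mismatch` is a hypothetical helper, and it assumes `pattern_bytes` is non-zero.

```cpp
#include <cstddef>
#include <cstring>

// Return the index of the first pattern-sized element in `data` whose bytes
// differ from `pattern`, or the element count if every element matches.
// Precondition (assumed, not checked): pattern_bytes > 0.
std::size_t first_mismatch(const void* data, std::size_t data_bytes,
                           const void* pattern, std::size_t pattern_bytes) {
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    const std::size_t count = data_bytes / pattern_bytes;
    for (std::size_t i = 0; i < count; ++i) {
        if (std::memcmp(bytes + i * pattern_bytes, pattern, pattern_bytes) != 0) {
            return i;  // first element that was not filled correctly
        }
    }
    return count;  // all elements match the pattern
}
```

Called on the array filled by clEnqueueReadBuffer with an `int pattern[2] = {-1, -2}`, a return value smaller than the element count shows where the fill first went wrong, instead of printing the whole table.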