Running OpenCL Cycles using Blender 2.71 & AMD GPUs

If you are using Blender on AMD hardware, you have probably noticed that Cycles doesn't officially support AMD GPUs. However, for some scenes, Cycles actually works just fine, so it might be worth a try. Here's a quick how-to guide.

First of all, if you are on Linux, make sure you have the AMD proprietary driver installed. You can grab it from the AMD homepage. Next, you need to set an environment variable to make Cycles "see" the AMD OpenCL device. On Windows, you can open a command window (Shift-Right-Click on your Blender installation folder and use "Open command window here"), then type in:

set CYCLES_OPENCL_TEST=all
blender

On Linux, you can use:

CYCLES_OPENCL_TEST=all ./blender

After Blender has started, you have to pick your compute device. Under "User Preferences", "System", you can find the "Compute Device" option. For AMD, pick your GPU's code name as the device. The code name for the R9 290 and R9 290X is "Hawaii"; for the R9 280, R9 280X, HD 7950 and HD 7970 it is "Tahiti". Other code names are "Pitcairn" and "Bonaire". These are the GCN-based cards; I doubt Cycles will run correctly on older cards.


Anyway, once set up, if everything works right, you'll get GPU accelerated Cycles!


However, as of August 2014, using the Catalyst 14.6 drivers and Blender 2.71, not everything works correctly. In one scene, I'm having issues with incorrect intersections which only occur on the GPU device.


Interestingly, I have also written my own tiny OpenCL ray tracer where I had similar issues. What happened there is that the compiler computed incorrect value ranges from a condition and later miscompiled some code. In my case, I had code like this:

if (cullBackface) {
    if (U<0.0f || V<0.0f || W<0.0f) return false;
} else {
    // Condition A
    if ((U<0.0f || V<0.0f || W<0.0f) &&
            (U>0.0f || V>0.0f || W>0.0f)) return false;
}

// Some more code
const float T = U*Az + V*Bz + W*Cz;

// Some time later
if (cullBackface) {
    if (T < ray->tMin * det || T > ray->tMax * det)
        return false;
} else {
    // This is not correct (should work on the sign bit directly,
    // but this is what triggered the bug)
    const bool nearClip = fabs (T) < ray->tMin * fabs (det);
    const bool farClip = fabs (T) > ray->tMax * fabs (det);
    if (nearClip || farClip) {
        return false;
    }
}

What happened was that if cullBackface was statically known to be false, nearClip and farClip would always be true, as the call to fabs (T) was never executed. It seems that condition A led the compiler to believe that U, V, or W would have some invalid value after the condition. I'm not claiming that this is the same bug that is happening in Cycles, but given that ray tracers are very sensitive to these kinds of optimizations, I wouldn't be surprised if something similar affects Cycles.

By the way, the fix is to make condition A a bit more complicated:

const bool lessZero = U < 0 || V < 0 || W < 0;
const bool greaterZero = U > 0 || V > 0 || W > 0;

if (lessZero && greaterZero) {
    return false;
}
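
The rewritten condition should be logically identical to condition A, so any difference in behaviour points at the compiler rather than the source. A minimal host-side sketch in plain C (not the OpenCL kernel itself) can confirm the equivalence by brute force over all sign combinations. The function names `condition_a_original`, `condition_a_split` and `count_mismatches` are made up for this illustration.

```c
#include <stdbool.h>

/* Condition A as first written: one combined expression. */
bool condition_a_original(float U, float V, float W)
{
    return (U < 0.0f || V < 0.0f || W < 0.0f) &&
           (U > 0.0f || V > 0.0f || W > 0.0f);
}

/* Condition A after the rewrite: two named booleans. */
bool condition_a_split(float U, float V, float W)
{
    const bool lessZero = U < 0 || V < 0 || W < 0;
    const bool greaterZero = U > 0 || V > 0 || W > 0;
    return lessZero && greaterZero;
}

/* Tries every combination of negative, zero (both signs) and positive
 * inputs and counts the cases where the two forms disagree. */
int count_mismatches(void)
{
    const float samples[] = { -1.0f, -0.0f, 0.0f, 1.0f };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    int mismatches = 0;

    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            for (int k = 0; k < n; ++k) {
                const float U = samples[i], V = samples[j], W = samples[k];
                if (condition_a_original(U, V, W) != condition_a_split(U, V, W)) {
                    ++mismatches;
                }
            }
        }
    }
    return mismatches;
}
```

On a conforming host compiler, `count_mismatches()` is expected to return zero, since both functions encode the same "mixed signs" test.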

Before you ask, the ray/triangle intersection code is based on "Watertight Ray/Triangle Intersection"; and trust me, I was really puzzled when it wasn't quite as watertight as expected :)

[Update:] There was a bug in the code. Here's the correct version of the code above:

const float detSignMask = as_float (signbit (det) << 31u);

// This is the code from the repository, which is not equivalent to
// the paper. In the paper, this line would read:
// const bool nearClip = xorf (T, detSignMask) < 0;
const bool nearClip = xorf (T, detSignMask) < (ray->tMin * fabs (det));
const bool farClip = xorf (T, detSignMask) > (ray->tMax * fabs (det));


float xorf (float a, float b)
{
    return as_float (as_uint (a) ^ as_uint (b));
}
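
To see why working on the sign bit matters, here is a rough host-side C sketch that contrasts the fabs-based clip test with the sign-bit version. It is an illustration under assumptions, not the kernel code: `float_bits` and `bits_float` stand in for OpenCL's `as_uint` and `as_float`, `abs_float` stands in for `fabs`, and `det_sign_mask`, `clipped_fabs` and `clipped_signbit` are hypothetical helper names. The ray's `tMin` and `tMax` are passed as plain parameters.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Host-side stand-ins for OpenCL's as_uint / as_float reinterpret casts. */
uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
float bits_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* XOR of the raw bit patterns of two floats. */
float xorf(float a, float b)
{
    return bits_float(float_bits(a) ^ float_bits(b));
}

/* Float whose only set bit is the sign bit of det (or 0.0f if det >= 0). */
float det_sign_mask(float det)
{
    return bits_float(float_bits(det) & 0x80000000u);
}

/* Stand-in for fabs: clears the sign bit. */
float abs_float(float x)
{
    return bits_float(float_bits(x) & 0x7fffffffu);
}

/* fabs-based test: the sign of T is thrown away, so a hit behind the
 * ray origin looks the same as one in front of it. */
bool clipped_fabs(float T, float det, float tMin, float tMax)
{
    const bool nearClip = abs_float(T) < tMin * abs_float(det);
    const bool farClip = abs_float(T) > tMax * abs_float(det);
    return nearClip || farClip;
}

/* Sign-bit test: T has its sign flipped only when det is negative, so the
 * sign of T relative to det survives the comparison. */
bool clipped_signbit(float T, float det, float tMin, float tMax)
{
    const float signedT = xorf(T, det_sign_mask(det));
    const bool nearClip = signedT < tMin * abs_float(det);
    const bool farClip = signedT > tMax * abs_float(det);
    return nearClip || farClip;
}
```

For example, with `T = -2`, `det = 1`, `tMin = 0` and `tMax = 10`, the fabs-based test lets the hit through, while the sign-bit test rejects it because the hit lies behind the ray origin. When `T` and `det` are both negative, the two tests agree.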

