OpenCL conformance

Supported & Unsupported optional OpenCL 3.0 features

This list is only related to CPU devices (cpu & cpu-minimal drivers). Other drivers (CUDA, TCE etc) only support OpenCL 1.2. Note that 3.0 support on CPU devices requires LLVM 14 or newer.

Supported 3.0 features:

  • Shared Virtual Memory

  • C11 atomics

  • 3D Image Writes

  • SPIR-V

  • Program Scope Global Variables

  • Subgroups

  • Generic Address Space

Unsupported 3.0 features:

  • Device-side enqueue

  • Pipes

  • Non-Uniform Work Groups

  • Read-Write Images

  • Creating 2D Images from Buffers

  • sRGB & Depth Images

  • Device and Host Timer Synchronization

  • Intermediate Language Programs

  • Program Initialization and Clean-Up Kernels

  • Work Group Collective Functions

How to run the OpenCL 3.0 conformance test suite

You’ll need to build PoCL with enabled ICD, and the ICD must be one that supports OpenCL version 3.0 (for ocl-icd, this is available since version 2.3.0). This is because while the CTS will run with 1.2 devices, it requires 3.0 headers and 3.0 ICD to build. You’ll also need to enable the suite in the pocl’s external test suite set. This is done by adding -DENABLE_TESTSUITES=conformance -DENABLE_CONFORMANCE=ON to the cmake command line. After this make prepare_examples fetches and prepares the conformance suite for testing. After building pocl with make, the CTS can be run with ctest -L <LABEL> where <LABEL> is a CTest label.

There are three different CTest labels for using CTS, one label covers the full set tests in CTS, the other two contain a smaller subset of CTS tests. The fastest is conformance_suite_micro_main label, which takes approx 10-30 minutes on current (desktop) hardware. The medium sized conformance_suite_mini_main can take 1-2 hours on current hardware. The full sized CTS is available with label conformance_suite_full_main. This can take 10-30 hrs on current hardware.

If PoCL is compiled with SPIR-V support, three more labels are available, where _main suffix is replaced by _spirv (e.g. conformance_suite_mini_spirv) These labels will run the same tests as the _main variant, but use offline compilation to produce SPIR-V and use that to create programs, instead of default creating from OpenCL C source.

Note that running ctest -L conformance_suite_micro will run both variants (the online and offline compilation) since the -L option takes a regexp.

Additionally, there is a new cmake label, conformance_30_only to run tests which are only relevant to OpenCL 3.0.

CPU device version 1.2 should also work with CTS 3.0 (tests will be skipped).

Conformance tests results (precision of builtin math library functions)

Note that it’s impossible to test double precision on the entire range, therefore the results may vary.

x86-64 CPU with AVX2+FMA, LLVM 4.0, tested on Nov 1, 2017

NAME

Worst ULP

WHERE

add

0.00

{0x0p+0, 0x0p+0}

addD

0.00

{0x0p+0, 0x0p+0}

assignment

0.00

0x0p+0

assignmentD

0.00

0x0p+0

cbrt

0.50

-0x1.5629d2p+116

cbrtD

0.59

0x1.0000000000136p+1022

ceil

0.00

0x0p+0

ceilD

0.00

0x0p+0

copysign

0.00

{0x0p+0, 0x0p+0}

copysignD

0.00

{0x0p+0, 0x0p+0}

cos

2.37

0x1.1338ccp+20

cosD

2.27

-0x1.d10000000074p+380

cosh

2.41

-0x1.602166p+2

coshD

1.43

-0x1.98000000003efp+5

cospi

1.94

0x1.d73b56p-2

cospiD

2.46

-0x1.adffffffffa91p-2

divide

0.00

{0x0p+0, 0x0p+0}

divideD

0.00

{0x0p+0, 0x0p+0}

exp

0.95

-0x1.762532p+2

expD

0.94

0x1.2f0000000023dp+7

exp10

0.79

-0x1.309022p+5

exp10D

0.64

-0x1.34ffffffffcc9p+8

exp2

0.79

-0x1.fa3d0ep+6

exp2D

0.75

-0x1.ff00000000417p+9

expm1

1.00

-0x1.7a0002p-25

expm1D

0.99

-0x1.26p+5

fabs

0.00

0x0p+0

fabsD

0.00

0x0p+0

fdim

0.00

{0x0p+0, 0x0p+0}

fdimD

0.00

{0x0p+0, 0x0p+0}

floor

0.00

0x0p+0

floorD

0.00

0x0p+0

fma

0.00

{0x0p+0, 0x0p+0, 0x0p+0}

fmaD

0.00

{0x0p+0, 0x0p+0, 0x0p+0}

fmax

0.00

{0x0p+0, 0x0p+0}

fmaxD

0.00

{0x0p+0, 0x0p+0}

fmin

0.00

{0x0p+0, 0x0p+0}

fminD

0.00

{0x0p+0, 0x0p+0}

fmod

0.00

{0x0p+0, 0x0p+0}

fmodD

0.00

{0x0p+0, 0x0p+0}

fract

{ 0.00, 0.00}

{0x0p+0, 0x0p+0}

fractD

{ 0.00, 0.00}

{0x0p+0, 0x0p+0}

frexp

{ 0.00, 0}

0x0p+0

frexpD

{ 0.00, 0}

0x0p+0

hypot

1.93

{0x1.17c998p-127, -0x1.5fedb8p-127}

hypotD

1.73

{0x1.5d2ebeed7663cp-1022, 0x1.67457048a2318p-1022}

ldexp

0.00

{0x0p+0, 0}

ldexpD

0.00

{0x0p+0, 0}

log10

0.50

0x1.7fee2ep-1

log10D

0.50

0x1.9100000000639p+1022

log

0.63

0x1.7fcb3ep-1

logD

0.75

0x1.7d00000000381p+0

log1p

1.00

-0x1.fa0002p-126

log1pD

1.00

-0x1.e000000000001p-1022

log2

0.59

0x1.1107a2p+0

log2D

0.72

0x1.120000000063dp+0

logb

0.00

0x0p+0

logbD

0.00

0x0p+0

mad

0.00

{0x0p+0, 0x0p+0, 0x0p+0} no ULP check

madD

0.00

{0x0p+0, 0x0p+0, 0x0p+0} no ULP check

maxmag

0.00

{0x0p+0, 0x0p+0}

maxmagD

0.00

{0x0p+0, 0x0p+0}

minmag

0.00

{0x0p+0, 0x0p+0}

minmagD

0.00

{0x0p+0, 0x0p+0}

modf

{ 0.00, 0.00}

{0x0p+0, 0x0p+0}

modfD

{ 0.00, 0.00}

{0x0p+0, 0x0p+0}

multiply

0.00

{0x0p+0, 0x0p+0}

multiplyD

0.00

{0x0p+0, 0x0p+0}

nan

0.00

0x0p+0

nanD

0.00

0x0p+0

nextafter

0.00

{0x0p+0, 0x0p+0}

nextafterD

0.00

{0x0p+0, 0x0p+0}

pow

0.82

{0x1.91237cp-1, 0x1.4da146p+8}

powD

0.80

{0x1.2bfb4b18164c9p+65, -0x1.b78438ae9c3bdp-8}

pown

0.65

{-0x1.9p+6, -2}

pownD

0.62

{-0x1.7ffffffffffffp+1, 3}

powr

0.82

{0x1.91237cp-1, 0x1.4da146p+8}

powrD

0.80

{0x1.2bfb4b18164c9p+65, -0x1.b78438ae9c3bdp-8}

remainder

0.00

{0x0p+0, 0x0p+0}

remainderD

0.00

{0x0p+0, 0x0p+0}

remquo

{ 0.00, 0}

0x0p+0

remquoD

{ 0.00, 0}

0x0p+0

rint

0.00

0x0p+0

rintD

0.00

0x0p+0

rootn

0.69

{-0x1.e2fe6ep-74, -141}

rootnD

0.68

{-0x1.8000000000001p+1, 3}

round

0.00

0x0p+0

roundD

0.00

0x0p+0

rsqrt

1.49

0x1.019566p+124

rsqrtD

1.49

0x1.01ffffffffa39p+1016

sin

2.48

-0x1.09f07ap+21

sinD

1.87

-0x1.f2fffffffffbap+32

sincos

{ 2.48, 2.37}

{0x1.09f07ap+21, 0x1.1338ccp+20}

sincosD

{ 1.87, 2.27}

{0x1.f2fffffffffbap+32, 0x1.d10000000074p+380}

sinh

2.32

0x1.e76078p+2

sinhD

1.53

-0x1.3100000000278p+4

sinpi

2.13

-0x1.45f3ep-9

sinpiD

2.50

-0x1.46000000000dap-7

sqrt

0.00

0x0p+0

sqrtD

0.00

0x0p+0

subtract

0.00

{0x0p+0, 0x0p+0}

subtractD

0.00

{0x0p+0, 0x0p+0}

tan

4.35

-0x1.b4eba2p+22

tanD

4.00

-0x1.2f000000003edp+333

tanh

1.18

-0x1.ca742ap-1

tanhD

1.19

0x1.f400000000395p-1

tanpi

4.21

-0x1.f99d16p-3

tanpiD

4.09

0x1.f6000000001d3p-3

trunc

0.00

0x0p+0

truncD

0.00

0x0p+0