Corrade::Cpu namespace (new in Git master)

Compile-time and runtime CPU instruction set detection and dispatch.

This namespace provides tags for x86, ARM and WebAssembly instruction sets, which can be used either for system introspection or for choosing a particular implementation based on the available instruction set. These tags build on top of the CORRADE_TARGET_SSE2, CORRADE_TARGET_SSE3 etc. preprocessor macros and provide runtime feature detection as well.

This library is built if WITH_UTILITY is enabled when building Corrade. To use this library with CMake, request the Utility component of the Corrade package and link to the Corrade::Utility target:

find_package(Corrade REQUIRED Utility)

# ...
target_link_libraries(your-app PRIVATE Corrade::Utility)

This namespace together with all related macros is additionally available in the form of a single-header library. See also Downloading and building Corrade and Using Corrade with CMake for more information.

Usage

The Cpu namespace contains tags such as Cpu::Avx2, Cpu::Sse2, Cpu::Neon or Cpu::Simd128. These tags behave similarly to enum values and their combination results in Cpu::Features, which is similar to the Containers::EnumSet class — they support the same bitwise operations, can be tested for subsets and supersets, and are printable with Utility::Debug.
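
For a quick illustration, here's a minimal sketch combining two tags into a Features value, testing it and printing it — the concrete tags are picked arbitrarily:

Cpu::Features features = Cpu::Sse42|Cpu::Popcnt;

/* Testing for a contained tag, like with Containers::EnumSet */
if(features & Cpu::Popcnt) {
    /* POPCNT is included in the set */
}

/* The whole set is printable */
Utility::Debug{} << "Features:" << features;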

The most advanced base CPU instruction set enabled at compile time is then exposed through the Cpu::DefaultBase variable, which is an alias to one of those tags, and it matches the architecture-specific CORRADE_TARGET_SSE2 etc. macros. Since it's a constexpr variable, it's usable in a compile-time context. The most straightforward use is shown in the following C++17 snippet:

Utility::Debug{} << "Base compiled instruction set:" << Cpu::DefaultBase;

if constexpr(Cpu::DefaultBase >= Cpu::Avx2) {
    // AVX2 code
} else {
    // scalar code
}

Dispatching on available CPU instruction set at compile time

The main purpose of these tags, however, is to provide a means for compile-time overload resolution. In other words, picking the best candidate among a set of functions implemented with various instruction sets. As an example, let's say you have three different implementations of a certain algorithm transforming numeric data: one using AVX2 instructions, a slower variant using just SSE 4.2, and a fallback with regular scalar code. To distinguish them, the functions have the same name but take a different tag type:

void transform(Cpu::ScalarT, Containers::ArrayView<float> data);
void transform(Cpu::Sse42T, Containers::ArrayView<float> data);
void transform(Cpu::Avx2T, Containers::ArrayView<float> data);

Then you can either call a particular implementation directly — for example to test it — or you can pass Cpu::DefaultBase, and it'll pick the best overload candidate for the set of CPU instruction features enabled at compile time:

transform(Cpu::DefaultBase, data);
  • If the user code was compiled with AVX2 or higher enabled, the Cpu::Avx2 overload will be picked.
  • Otherwise, if just AVX, SSE 4.2 or anything else that includes SSE 4.2 was enabled, the Cpu::Sse42 overload will be picked.
  • Otherwise (for example when compiling for generic x86-64, which has just the SSE2 feature set), the Cpu::Scalar overload will be picked. If you didn't provide this overload, compilation would fail for such a target — which is useful for example to enforce that a certain CPU feature set is enabled in order to use a certain API.
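
And, as mentioned above, a particular implementation can also be called directly, for example from a test that exercises every variant. A hypothetical sketch, reusing the data view from the snippets above:

/* Hypothetical test code calling each variant explicitly, regardless of what
   the compile-time default would pick */
transform(Cpu::Scalar, data);
transform(Cpu::Sse42, data);
transform(Cpu::Avx2, data);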

Runtime detection and manual dispatch

So far that was all compile-time detection, which is useful mainly when a binary can be optimized directly for the machine it will run on. But such an approach is not practical when shipping to a heterogeneous set of devices. Instead, the usual workflow is that the majority of code uses the lowest common denominator (such as SSE2 on x86), with the most demanding functions having alternative implementations — picked at runtime — that make use of more advanced instructions for better performance.

Runtime detection is exposed through Cpu::runtimeFeatures(). It will detect CPU features on platforms that support it, and fall back to Cpu::compiledFeatures() on platforms that don't. You can then match the returned Cpu::Features against particular tags to decide which variant to use:

Cpu::Features features = Cpu::runtimeFeatures();
Utility::Debug{} << "Instruction set available at runtime:" << features;

if(features & Cpu::Avx2)
    transform(Cpu::Avx2, data);
else if(features & Cpu::Sse42)
    transform(Cpu::Sse42, data);
else
    transform(Cpu::Scalar, data);

While such an approach gives you the most control, manually managing the dispatch branches is error-prone and the argument passthrough may also add nontrivial overhead. See below for an efficient automatic runtime dispatch.

Usage with extra instruction sets

Besides the base instruction set, which on x86 is Sse2 through Avx512f, with each tag being a superset of the previous one, there are extra instruction sets such as Popcnt or AvxFma. Basic compile-time detection for these is still straightforward, only now using Default instead of DefaultBase:

Utility::Debug{} << "Base and extra instruction sets:" << Cpu::Default;

if constexpr(Cpu::Default >= (Cpu::Avx2|Cpu::AvxFma)) {
    // AVX2+FMA code
} else {
    // scalar code
}

The process of defining and dispatching to function variants that include extra instruction sets gets moderately more complex, however. As shown in the diagram below, these are instruction sets that neither fit into the hierarchy nor are unambiguously included in a later instruction set. For example, some CPUs are known to have Avx and just AvxFma, some Avx and just AvxF16c, and there are even CPUs with Avx2 but no AvxFma.

[Diagram: x86 instruction family tree. SSE2 ← SSE3 ← SSSE3 ← SSE4.1 ← SSE4.2 ← AVX ← AVX2 ← AVX512F, with FMA3 and F16C depending only on AVX, and POPCNT, LZCNT, BMI1 and BMI2 standing apart from the hierarchy.]

While there's no possibility of having a total ordering between all possible combinations for dispatching, the following approach is chosen:

  1. The base instruction set has the main priority. For example, if both an Avx2 and a Sse2 variant are viable candidates, the Avx2 variant gets picked, even if the Sse2 variant uses extra instruction sets that the Avx2 doesn't.
  2. After that, the variant with the most extra instruction sets is chosen. For example, an Avx + AvxFma variant is chosen over plain Avx.

On the declaration side, the desired base instruction set gets ORed with as many extra instruction sets as needed, and then wrapped in a CORRADE_CPU_DECLARE() macro. For example, a lookup algorithm may have a Sse41 implementation which however also relies on Popcnt and Lzcnt, and a fallback Sse2 implementation that uses neither:

int lookup(CORRADE_CPU_DECLARE(Cpu::Sse2), …);
int lookup(CORRADE_CPU_DECLARE(Cpu::Sse41|Cpu::Popcnt|Cpu::Lzcnt), …);

And a concrete overload gets picked at compile-time by passing a desired combination of CPU tags as well — or Default for the set of features enabled at compile time — this time wrapped in a CORRADE_CPU_SELECT():

int found = lookup(CORRADE_CPU_SELECT(Cpu::Default), …);
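
To illustrate how the two rules above interact, consider a hypothetical algo() with three variants — the name and the particular tag combinations are made up for this example:

int algo(CORRADE_CPU_DECLARE(Cpu::Sse2|Cpu::Popcnt|Cpu::Lzcnt));
int algo(CORRADE_CPU_DECLARE(Cpu::Avx));
int algo(CORRADE_CPU_DECLARE(Cpu::Avx|Cpu::AvxFma));

/* Even though the SSE2 variant uses more extra instruction sets, rule 1
   prefers the AVX base; rule 2 then picks AVX + FMA over plain AVX. */
int result = algo(CORRADE_CPU_SELECT(Cpu::Avx|Cpu::AvxFma|Cpu::Popcnt|Cpu::Lzcnt));

With just Cpu::Avx|Cpu::Popcnt selected instead, the plain Avx variant would win, as neither the Lzcnt requirement of the first variant nor the AvxFma requirement of the third would be satisfied.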

Enabling instruction sets for particular functions

On GCC and Clang, a machine target has to be enabled in order to use a particular CPU instruction set or its intrinsics. While it's possible to do that for the whole compilation unit by passing for example -mavx2 to the compiler, it would force you to create dedicated files for every architecture variant you want to support. Instead, it's possible to equip particular functions with target attributes defined by CORRADE_ENABLE_SSE2 and related macros, which enable a particular instruction set just for the given function.

In contrast, MSVC doesn't restrict intrinsics usage in any way, so you can freely call e.g. AVX2 intrinsics even if the whole file is compiled with just SSE2 enabled. The CORRADE_ENABLE_SSE2 and related macros are thus defined to be empty on this compiler.

For developer convenience, the CORRADE_ENABLE_SSE2 etc. macros are defined only on matching architectures, and generally only if the compiler itself has the given feature set implemented and usable. Which means you can easily use them to #ifdef your variants to be compiled only where it makes sense, or even guard intrinsics includes with them to avoid including potentially heavy headers you won't use anyway. In comparison, using the CORRADE_TARGET_SSE2 etc. macros would only make the variant available if the whole compilation unit has a corresponding -m or /arch: option passed to the compiler.
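
For example, an include guard for a heavy intrinsics header could look like this — a minimal sketch, assuming only an AVX2 variant actually needs it:

/* Include the intrinsics header only if the AVX2 variant can be compiled at
   all on this compiler and architecture */
#ifdef CORRADE_ENABLE_AVX2
#include <immintrin.h>
#endif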

Finally, the CORRADE_ENABLE() macro allows multiple instruction sets to be enabled at the same time in a more concise way and consistently on both GCC and Clang.

Definitions of the lookup() function variants from above would then look like below, with the target attributes added. The extra instruction sets get explicitly enabled as well; a scalar variant, in contrast, would have no target-specific annotations at all.

int lookup(CORRADE_CPU_DECLARE(Cpu::Scalar), …) {
    …
}
#ifdef CORRADE_ENABLE_SSE2
CORRADE_ENABLE_SSE2 int lookup(CORRADE_CPU_DECLARE(Cpu::Sse2), …) {
    …
}
#endif
#if defined(CORRADE_ENABLE_SSE41) && \
    defined(CORRADE_ENABLE_POPCNT) && \
    defined(CORRADE_ENABLE_LZCNT)
CORRADE_ENABLE(SSE41,POPCNT,LZCNT) int lookup(
    CORRADE_CPU_DECLARE(Cpu::Sse41|Cpu::Popcnt|Cpu::Lzcnt), …)
{
    …
}
#endif

Automatic runtime dispatch

Similarly to how the best-matching function variant can be picked at compile time, there's a possibility to do the same at runtime without maintaining custom dispatch code for each case as was shown above. To avoid having to dispatch on every call and to remove the argument passthrough overhead, all variants need to have the same function signature, separate from the CPU tags. That's achievable by putting them into lambdas with a common signature and returning that lambda from a wrapper function that takes the CPU tag. After that, a runtime dispatcher function is created with the CORRADE_CPU_DISPATCHER_BASE() macro. The transform() variants from above would then look like this instead:

using TransformT = void(*)(Containers::ArrayView<float>);

TransformT transformImplementation(Cpu::ScalarT) {
    return [](Containers::ArrayView<float> data) { … };
}
TransformT transformImplementation(Cpu::Sse42T) {
    return [](Containers::ArrayView<float> data) { … };
}
TransformT transformImplementation(Cpu::Avx2T) {
    return [](Containers::ArrayView<float> data) { … };
}

CORRADE_CPU_DISPATCHER_BASE(transformImplementation)

The macro creates an overload of the same name, but taking Features instead, and internally dispatches to one of the overloads using the same rules as in the compile-time dispatch. Which means you can now call it with e.g. runtimeFeatures(), get a function pointer back and then call it with the actual arguments:

/* Dispatch once and cache the function pointer */
TransformT transform = transformImplementation(Cpu::runtimeFeatures());

/* Call many times */
transform(data);

Automatic runtime dispatch with extra instruction sets

If the variants are tagged with extra instruction sets instead of just the base instruction set like in the lookup() case shown above, you'll use the CORRADE_CPU_DISPATCHER() macro instead. There, to avoid a combinatorial explosion of cases to check, you're expected to list the actual extra tags the overloads use. Which is usually just one or two out of the whole set:

using LookupT = int(*)();

LookupT lookupImplementation(CORRADE_CPU_DECLARE(Cpu::Scalar)) {
    …
}
LookupT lookupImplementation(CORRADE_CPU_DECLARE(Cpu::Sse2)) {
    …
}
LookupT lookupImplementation(CORRADE_CPU_DECLARE(Cpu::Sse41|Cpu::Popcnt|Cpu::Lzcnt)) {
    …
}

CORRADE_CPU_DISPATCHER(lookupImplementation, Cpu::Popcnt, Cpu::Lzcnt)

If some extra instruction sets are always used together (like it is above with Popcnt and Lzcnt), you can reduce the amount of tested combinations by specifying them as a single ORed argument instead:

CORRADE_CPU_DISPATCHER(lookupImplementation, Cpu::Popcnt|Cpu::Lzcnt)

On the call side, there's no difference compared to using just the base instruction sets. The created dispatcher function takes Features as well.
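
A brief sketch of the call side, mirroring the transform() example above:

/* Dispatch once and cache the function pointer */
LookupT lookup = lookupImplementation(Cpu::runtimeFeatures());

/* Call many times */
int found = lookup();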

Automatic cached dispatch

Ultimately, the dispatch can be performed implicitly, exposing only the final function or a function pointer, with no additional steps needed from the user side. There are three possible scenarios with varying performance tradeoffs. Continuing from the lookupImplementation() example above:

  • On Linux and Android with API 30+ it's possible to use the GNU IFUNC mechanism, where the dynamic linker performs a dispatch during the early startup. This is the fastest variant of runtime dispatch, as it results in an equivalent of a regular dynamic library function call. Assuming a dispatcher was created using either CORRADE_CPU_DISPATCHER() or CORRADE_CPU_DISPATCHER_BASE(), it's implemented using the CORRADE_CPU_DISPATCHED_IFUNC() macro:

    CORRADE_CPU_DISPATCHED_IFUNC(lookupImplementation, int lookup())
  • On platforms where IFUNC isn't available, a function pointer can be used for runtime dispatch instead. It's one additional indirection, which may have a visible effect if the dispatched-to code is relatively tiny and is called from within a tight loop. Assuming a dispatcher was created using either CORRADE_CPU_DISPATCHER() or CORRADE_CPU_DISPATCHER_BASE(), it's implemented using the CORRADE_CPU_DISPATCHED_POINTER() macro:

    CORRADE_CPU_DISPATCHED_POINTER(lookupImplementation, int(*lookup)())
  • For the least amount of overhead, the compile-time dispatch can be used, with arguments passed through by hand. Similarly to IFUNC, this will also result in a regular function, but without the indirection overhead. Furthermore, since it's a direct call to the lambda inside, compiler optimizations will fully inline its contents, removing any remaining overhead and allowing LTO and other inter-procedural optimizations that wouldn't be possible with the indirect calls. This option is best suited for scenarios where it's possible to build and optimize code for a single target platform. In this case it calls directly to the original variants, so no macro is needed and CORRADE_CPU_DISPATCHER() / CORRADE_CPU_DISPATCHER_BASE() is not needed either:

    int lookup() {
        return lookupImplementation(CORRADE_CPU_SELECT(Cpu::Default))();
    }

In all three cases, you end up with either a function or a function pointer. The macro signatures are deliberately similar to each other and to the direct function declaration, making it possible to unify them under a single wrapper macro in case a practical use case needs to handle more than one variant.
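
As an illustration, such a wrapper could look like the following sketch. The MYLIB_CPU_USE_IFUNC configuration macro and the MYLIB_* names are hypothetical, not something Corrade provides:

/* Hypothetical project-specific wrappers switching between IFUNC and a
   function pointer based on a project configuration macro */
#ifdef MYLIB_CPU_USE_IFUNC
#define MYLIB_CPU_DISPATCHED(dispatcher, ...) \
    CORRADE_CPU_DISPATCHED_IFUNC(dispatcher, __VA_ARGS__)
#define MYLIB_CPU_DISPATCHED_DECLARATION(name) name
#else
#define MYLIB_CPU_DISPATCHED(dispatcher, ...) \
    CORRADE_CPU_DISPATCHED_POINTER(dispatcher, __VA_ARGS__)
#define MYLIB_CPU_DISPATCHED_DECLARATION(name) (*name)
#endif

/* Expands to either an IFUNC-dispatched lookup() function or a dispatched
   lookup function pointer */
MYLIB_CPU_DISPATCHED(lookupImplementation,
    int MYLIB_CPU_DISPATCHED_DECLARATION(lookup)())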

Finally, when exposed in a header as appropriate, both the function and the function pointer variant can be then called the same way:

#ifdef LOOKUP_USES_FUNCTION_POINTER
int (*lookup)();
#else
int lookup();
#endif

int found = lookup();

Classes

struct Avx2T
AVX2 tag type.
struct Avx512fT
AVX-512 Foundation tag type.
struct AvxF16cT
AVX F16C tag type.
struct AvxFmaT
AVX FMA tag type.
struct AvxT
AVX tag type.
struct Bmi1T
BMI1 tag type.
struct Bmi2T
BMI2 tag type.
class Features
Feature set.
struct LzcntT
LZCNT tag type.
struct NeonFmaT
NEON FMA tag type.
struct NeonFp16T
NEON FP16 tag type.
struct NeonT
NEON tag type.
struct PopcntT
POPCNT tag type.
struct ScalarT
Scalar tag type.
struct Simd128T
SIMD128 tag type.
struct Sse2T
SSE2 tag type.
struct Sse3T
SSE3 tag type.
struct Sse41T
SSE4.1 tag type.
struct Sse42T
SSE4.2 tag type.
struct Ssse3T
SSSE3 tag type.
template<class T>
struct TypeTraits
Traits class for CPU detection tag types.

Typedefs

using DefaultBaseT = ScalarT
Default base tag type.
using DefaultExtraT = Implementation::Tags<0>
Default extra tag type.
using DefaultT = Implementation::Tags<static_cast<unsigned int>(TypeTraits<DefaultBaseT>::Index)|DefaultExtraT::Value>
Default tag type.

Functions

template<class T>
auto tag() -> T constexpr
Tag for a tag type.
template<class T>
auto features() -> Features constexpr
Feature set for a tag type.
auto compiledFeatures() -> Features constexpr
CPU instruction sets enabled at compile time.
auto runtimeFeatures() -> Features
Detect available CPU instruction sets at runtime.

Variables

ScalarT Scalar constexpr
Scalar tag.
Sse2T Sse2 constexpr
SSE2 tag.
Sse3T Sse3 constexpr
SSE3 tag.
Ssse3T Ssse3 constexpr
SSSE3 tag.
Sse41T Sse41 constexpr
SSE4.1 tag.
Sse42T Sse42 constexpr
SSE4.2 tag.
PopcntT Popcnt constexpr
POPCNT tag.
LzcntT Lzcnt constexpr
LZCNT tag.
Bmi1T Bmi1 constexpr
BMI1 tag.
Bmi2T Bmi2 constexpr
BMI2 tag.
AvxT Avx constexpr
AVX tag.
AvxF16cT AvxF16c constexpr
AVX F16C tag.
AvxFmaT AvxFma constexpr
AVX FMA tag.
Avx2T Avx2 constexpr
AVX2 tag.
Avx512fT Avx512f constexpr
AVX-512 Foundation tag.
NeonT Neon constexpr
NEON tag.
NeonFmaT NeonFma constexpr
NEON FMA tag.
NeonFp16T NeonFp16 constexpr
NEON FP16 tag.
Simd128T Simd128 constexpr
SIMD128 tag.
DefaultBaseT DefaultBase constexpr
Default base tag.
DefaultExtraT DefaultExtra constexpr
Default extra tags.
DefaultT Default constexpr
Default tags.

Typedef documentation

typedef ScalarT Corrade::Cpu::DefaultBaseT

Default base tag type.

See the DefaultBase tag for more information.

typedef Implementation::Tags<0> Corrade::Cpu::DefaultExtraT

Default extra tag type.

See the DefaultExtra tag for more information.

typedef Implementation::Tags<static_cast<unsigned int>(TypeTraits<DefaultBaseT>::Index)|DefaultExtraT::Value> Corrade::Cpu::DefaultT

Default tag type.

See the Default tag for more information.

Function documentation

template<class T>
T Corrade::Cpu::tag() constexpr

Tag for a tag type.

Returns a tag corresponding to tag type T. The following two expressions are equivalent:

foo(Cpu::Avx2);
foo(Cpu::tag<Cpu::Avx2T>());

template<class T>
Features Corrade::Cpu::features() constexpr

Feature set for a tag type.

Returns Features with a tag corresponding to tag type T, avoiding a need to form the tag value in order to pass it to Features::Features(T). The following two expressions are equivalent:

Cpu::Features a = Cpu::Avx2;
Cpu::Features b = Cpu::features<Cpu::Avx2T>();

Features Corrade::Cpu::compiledFeatures() constexpr

CPU instruction sets enabled at compile time.

On x86 returns a combination of Sse2, Sse3, Ssse3, Sse41, Sse42, Popcnt, Lzcnt, Bmi1, Bmi2, Avx, AvxF16c, AvxFma, Avx2 and Avx512f based on which of the CORRADE_TARGET_SSE2 etc. preprocessor macros are defined.

On ARM, returns a combination of Neon, NeonFma and NeonFp16 based on which of the CORRADE_TARGET_NEON etc. preprocessor macros are defined.

On WebAssembly, returns Simd128 based on whether the CORRADE_TARGET_SIMD128 preprocessor macro is defined.

On other platforms or if no known CPU instruction set is enabled, the returned value is equal to Scalar, which in turn is equivalent to empty (or default-constructed) Features.
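
A brief usage sketch, assuming the binary may optionally be built with AVX2 enabled for the whole compilation unit:

if(Cpu::compiledFeatures() & Cpu::Avx2) {
    /* The AVX2 code path can be taken unconditionally, without a runtime
       check, since the whole binary already requires AVX2 to run */
}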

Features Corrade::Cpu::runtimeFeatures()

Detect available CPU instruction sets at runtime.

On x86 with GCC, Clang or MSVC, uses the CPUID builtin to check for the Sse2, Sse3, Ssse3, Sse41, Sse42, Popcnt, Lzcnt, Bmi1, Bmi2, Avx, AvxF16c, AvxFma, Avx2 and Avx512f runtime features. Avx needs OS support as well; if it's not present, none of the following flags, including Bmi1 and Bmi2, are checked either. On compilers other than GCC, Clang and MSVC the function is constexpr and delegates into compiledFeatures().

On ARM with Linux or Android API level 18+, uses getauxval(); on ARM macOS and iOS, uses sysctlbyname() to check for the Neon, NeonFma and NeonFp16 features. Neon and NeonFma are implicitly supported on ARM64. On other platforms the function is constexpr and delegates into compiledFeatures().

On WebAssembly an attempt to use SIMD instructions without runtime support results in a WebAssembly compilation error and thus runtime detection is largely meaningless. While this may change once the feature detection proposal is implemented, at the moment the function is constexpr and delegates into compiledFeatures().

On other platforms or if no known CPU instruction set is detected, the returned value is equal to Scalar, which in turn is equivalent to empty (or default-constructed) Features.

Variable documentation

ScalarT Corrade::Cpu::Scalar constexpr

Scalar tag.

Code that isn't explicitly optimized with any advanced CPU instruction set. Fallback if no other CPU instruction set is chosen or available. The next most widely supported instruction sets are Sse2 on x86, Neon on ARM and Simd128 on WebAssembly.

Sse2T Corrade::Cpu::Sse2 constexpr

SSE2 tag.

Streaming SIMD Extensions 2. Available only on x86, supported by all 64-bit x86 processors and present on the majority of contemporary 32-bit x86 processors as well. Superset of Scalar, implied by Sse3.

Sse3T Corrade::Cpu::Sse3 constexpr

SSE3 tag.

Streaming SIMD Extensions 3. Available only on x86. Superset of Sse2, implied by Ssse3.

Ssse3T Corrade::Cpu::Ssse3 constexpr

SSSE3 tag.

Supplemental Streaming SIMD Extensions 3. Available only on x86. Superset of Sse3, implied by Sse41.

Note that certain older AMD processors have SSE4a but neither SSSE3 nor SSE4.1. Both can however be treated as a subset of SSE4.1 to a large extent, and it's recommended to use Sse41 to handle those.

Sse41T Corrade::Cpu::Sse41 constexpr

SSE4.1 tag.

Streaming SIMD Extensions 4.1. Available only on x86. Superset of Ssse3, implied by Sse42.

Note that certain older AMD processors have SSE4a but neither SSSE3 nor SSE4.1. Both can however be treated as a subset of SSE4.1 to a large extent, and it's recommended to use Sse41 to handle those.

Sse42T Corrade::Cpu::Sse42 constexpr

SSE4.2 tag.

Streaming SIMD Extensions 4.2. Available only on x86. Superset of Sse41, implied by Avx.

PopcntT Corrade::Cpu::Popcnt constexpr

POPCNT tag.

POPCNT instructions. Available only on x86. This instruction set is treated as an extra, i.e. is neither a superset of nor implied by any other instruction set. See Usage with extra instruction sets for more information.

LzcntT Corrade::Cpu::Lzcnt constexpr

LZCNT tag.

LZCNT instructions. Available only on x86. This instruction set is treated as an extra, i.e. is neither a superset of nor implied by any other instruction set. See Usage with extra instruction sets for more information.

Note that this instruction has an encoding compatible with the earlier BSR instruction, which has slightly different behavior. To avoid wrong results if it isn't available, prefer to always detect its presence with runtimeFeatures() instead of a compile-time check.

Bmi1T Corrade::Cpu::Bmi1 constexpr

BMI1 tag.

BMI1 instructions, including TZCNT. Available only on x86. This instruction set is treated as an extra, i.e. is neither a superset of nor implied by any other instruction set. See Usage with extra instruction sets for more information.

Note that the TZCNT instruction has an encoding compatible with the earlier BSF instruction, which has slightly different behavior. To avoid wrong results if it isn't available, prefer to always detect its presence with runtimeFeatures() instead of a compile-time check.

Bmi2T Corrade::Cpu::Bmi2 constexpr

BMI2 tag.

BMI2 instructions. Available only on x86. This instruction set is treated as an extra, i.e. is neither a superset of nor implied by any other instruction set. See Usage with extra instruction sets for more information.

AvxT Corrade::Cpu::Avx constexpr

AVX tag.

Advanced Vector Extensions. Available only on x86. Superset of Sse42, implied by Avx2.

AvxF16cT Corrade::Cpu::AvxF16c constexpr

AVX F16C tag.

F16C instructions. Available only on x86. This instruction set is treated as an extra, i.e. is neither a superset of nor implied by any other instruction set. See Usage with extra instruction sets for more information.

AvxFmaT Corrade::Cpu::AvxFma constexpr

AVX FMA tag.

FMA3 instruction set. Available only on x86. This instruction set is treated as an extra, i.e. is neither a superset of nor implied by any other instruction set. See Usage with extra instruction sets for more information.

Avx2T Corrade::Cpu::Avx2 constexpr

AVX2 tag.

Advanced Vector Extensions 2. Available only on x86. Superset of Avx, implied by Avx512f.

Avx512fT Corrade::Cpu::Avx512f constexpr

AVX-512 Foundation tag.

AVX-512 Foundation. Available only on x86. Superset of Avx2.

NeonT Corrade::Cpu::Neon constexpr

NEON tag.

ARM NEON. Available only on ARM. Superset of Scalar, implied by NeonFma.

NeonFmaT Corrade::Cpu::NeonFma constexpr

NEON FMA tag.

ARM NEON with FMA instructions. Available only on ARM. Superset of Neon, implied by NeonFp16.

NeonFp16T Corrade::Cpu::NeonFp16 constexpr

NEON FP16 tag.

ARM NEON with ARMv8.2-a FP16 vector arithmetic. Available only on ARM. Superset of NeonFma.

Simd128T Corrade::Cpu::Simd128 constexpr

SIMD128 tag.

128-bit WebAssembly SIMD. Available only on WebAssembly. Superset of Scalar.

DefaultBaseT Corrade::Cpu::DefaultBase constexpr

Default base tag.

Highest base instruction set available on given architecture with current compiler flags. Ordered by priority, on CORRADE_TARGET_X86 it's one of Avx512f, Avx2, Avx, Sse42, Sse41, Ssse3, Sse3, Sse2 or Scalar.

On CORRADE_TARGET_ARM it's one of NeonFp16, NeonFma, Neon or Scalar.

On CORRADE_TARGET_WASM it's either Simd128 or Scalar.

In addition to the above, DefaultExtra contains a combination of extra instruction sets available together with the base instruction set, and Default is a combination of both. See also compiledFeatures() which returns a combination of base tags instead of just the highest available, together with the extra instruction sets, and runtimeFeatures() which is capable of detecting the available CPU feature set at runtime.

DefaultExtraT Corrade::Cpu::DefaultExtra constexpr

Default extra tags.

Instruction sets available in addition to DefaultBase on given architecture with current compiler flags. On CORRADE_TARGET_X86 it's a combination of Popcnt, Lzcnt, Bmi1, Bmi2, AvxF16c and AvxFma.

No extra instruction sets are currently defined for CORRADE_TARGET_ARM or CORRADE_TARGET_WASM.

In addition to the above, Default is a combination of both DefaultBase and the extra instruction sets. See also compiledFeatures() which returns these together with a combination of all base instruction sets available, and runtimeFeatures() which is capable of detecting the available CPU feature set at runtime.

DefaultT Corrade::Cpu::Default constexpr

Default tags.

A combination of DefaultBase and DefaultExtra, see their documentation for more information.