Corrade::Cpu namespace (new in Git master)
Compile-time and runtime CPU instruction set detection and dispatch.
This namespace provides tags for x86, ARM and WebAssembly instruction sets, which can be used for either system introspection or for choosing a particular implementation based on the available instruction set. These tags build on top of the CORRADE_
This library is built if WITH_UTILITY is enabled when building Corrade. To use this library with CMake, request the Utility component of the Corrade package and link to the Corrade::Utility target:
    find_package(Corrade REQUIRED Utility)

    # ...
    target_link_libraries(your-app PRIVATE Corrade::Utility)
This namespace together with all related macros is additionally available in the form of a single-header library. See also Downloading and building Corrade and Using Corrade with CMake for more information.
Usage
The Cpu namespace contains tags such as Cpu::
The most advanced base CPU instruction set enabled at compile time is then exposed through a constexpr variable (Cpu::DefaultBase in the snippet below), which makes it usable in a compile-time context. The most straightforward use is shown in the following C++17 snippet:
    Utility::Debug{} << "Base compiled instruction set:" << Cpu::DefaultBase;

    if constexpr(Cpu::DefaultBase >= Cpu::Avx2) {
        // AVX2 code
    } else {
        // scalar code
    }
Dispatching on available CPU instruction set at compile time
The main purpose of these tags, however, is to provide means for a compile-time overload resolution. In other words, picking the best candidate among a set of functions implemented with various instruction sets. As an example, let's say you have three different implementations of a certain algorithm transforming numeric data. One is using AVX2 instructions, another is a slower variant using just SSE 4.2 and as a fallback there's one with just regular scalar code. To distinguish them, the functions have the same name, but use a different tag type:
    void transform(Cpu::ScalarT, Containers::ArrayView<float> data);
    void transform(Cpu::Sse42T, Containers::ArrayView<float> data);
    void transform(Cpu::Avx2T, Containers::ArrayView<float> data);
Then you can either call a particular implementation directly — for example to test it — or you can pass Cpu::
transform(Cpu::DefaultBase, data);
- If the user code was compiled with AVX2 or higher enabled, the Cpu::Avx2 overload will be picked.
- Otherwise, if just AVX, SSE 4.2 or anything else that includes SSE 4.2 was enabled, the Cpu::Sse42 overload will be picked.
- Otherwise (for example when compiling for generic x86-64 that has just the SSE2 feature set), the Cpu::Scalar overload will be picked. If you didn't provide this overload, the compilation would fail for such a target, which is useful for example to enforce a certain CPU feature set to be enabled in order to use a certain API.
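The selection rules above can be reproduced in isolation with a small chain of tag types. The following is a minimal sketch rather than Corrade's actual implementation: the demo_tags namespace, its tag types and the transform() overloads are hypothetical stand-ins, and the sketch assumes that each tag type derives from the instruction set it is a superset of, so an argument converts to the closest available overload.

```cpp
#include <cassert>
#include <string>

namespace demo_tags {

// Hypothetical tag types. Each derives from the instruction set it is a
// superset of (an assumption of this sketch).
struct ScalarT {};
struct Sse2T: ScalarT {};
struct Sse42T: Sse2T {};
struct AvxT: Sse42T {};
struct Avx2T: AvxT {};

// Three variants distinguished only by the tag parameter. Real variants
// would process data; these just report which one got picked.
inline std::string transform(ScalarT) { return "scalar"; }
inline std::string transform(Sse42T) { return "sse4.2"; }
inline std::string transform(Avx2T) { return "avx2"; }

}
```

Calling transform(AvxT{}) in this sketch resolves to the Sse42T overload, because a conversion to a nearer base class ranks better than a conversion to a more distant one. Calling it with Sse2T{} falls through to the scalar variant, and removing the ScalarT overload would turn that call into a compile error.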
Runtime detection and manual dispatch
So far that was all compile-time detection, which is useful mainly when a binary can be optimized directly for the machine it will run on. But such an approach is not practical when shipping to a heterogeneous set of devices. Instead, the usual workflow is that the majority of code uses the lowest common denominator (such as SSE2 on x86), with the most demanding functions having alternative implementations, picked at runtime, that make use of more advanced instructions for better performance.
Runtime detection is exposed through Cpu::
    Cpu::Features features = Cpu::runtimeFeatures();
    Utility::Debug{} << "Instruction set available at runtime:" << features;

    if(features & Cpu::Avx2)
        transform(Cpu::Avx2, data);
    else if(features & Cpu::Sse42)
        transform(Cpu::Sse42, data);
    else
        transform(Cpu::Scalar, data);
While such an approach gives you the most control, manually managing the dispatch branches is error-prone and the argument passthrough may also add nontrivial overhead. See below for an efficient automatic runtime dispatch.
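To illustrate the manual branching outside of Corrade, here is a minimal sketch that swaps Cpu::runtimeFeatures() for the GCC/Clang __builtin_cpu_supports() builtin. The demo_runtime namespace, the Feature bits and the Variant enum are hypothetical names introduced for this example only. On other compilers or architectures the detection simply reports no features.

```cpp
#include <cassert>

namespace demo_runtime {

// Hypothetical feature bits and variant identifiers
enum Feature: unsigned { Sse42 = 1u << 0, Avx2 = 1u << 1 };
enum class Variant { Scalar, Sse42, Avx2 };

// Queries the CPU through the GCC/Clang builtin on x86. Returns an empty
// mask where the builtin isn't available.
inline unsigned detectFeatures() {
    unsigned features = 0;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if(__builtin_cpu_supports("sse4.2")) features |= Sse42;
    if(__builtin_cpu_supports("avx2")) features |= Avx2;
#endif
    return features;
}

// The manual dispatch chain: most capable variant first, scalar last
constexpr Variant pickVariant(unsigned features) {
    return (features & Avx2) ? Variant::Avx2 :
           (features & Sse42) ? Variant::Sse42 : Variant::Scalar;
}

}
```

A caller would evaluate pickVariant(detectFeatures()) and branch on the result. Every new variant means another branch to keep in the right order, which is the maintenance cost mentioned above.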
Usage with extra instruction sets
Besides the base instruction set, which on x86 is Sse2 through Avx512f, with each tag being a superset of the previous one, there are extra instruction sets such as Popcnt or AvxFma. Basic compile-time detection for these is still straightforward, only now using Default instead of DefaultBase:
    Utility::Debug{} << "Base and extra instruction sets:" << Cpu::Default;

    if constexpr(Cpu::Default >= (Cpu::Avx2|Cpu::AvxFma)) {
        // AVX2+FMA code
    } else {
        // scalar code
    }
The process of defining and dispatching to function variants that include extra instruction sets gets moderately more complex, however. As shown on the diagram below, those are instruction sets that neither fit into the hierarchy nor are unambiguously included in a later instruction set. For example, some CPUs are known to have Avx and just AvxFma, some Avx and just AvxF16c and there are even CPUs with Avx2 but no AvxFma.
While there's no possibility of having a total ordering between all possible combinations for dispatching, the following approach is chosen:
- The base instruction set has the main priority. For example, if both an Avx2 and a Sse2 variant are viable candidates, the Avx2 variant gets picked, even if the Sse2 variant uses extra instruction sets that the Avx2 variant doesn't.
- After that, the variant with the most extra instruction sets is chosen. For example, an Avx + AvxFma variant is chosen over plain Avx.
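The two rules can be written down as a small selection routine. This is only a runtime illustration of the ordering, under the assumption that a variant is viable when its base instruction set and all its extras are available. In Corrade the pick happens at compile time, and the Candidate struct, pickBest() and the numeric base indices below are hypothetical.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

namespace demo_priority {

// Hypothetical description of one function variant: an index of its base
// instruction set (higher means more capable) and a bitmask of extras
struct Candidate {
    unsigned base;
    unsigned extras;
};

// Returns the index of the best viable candidate, or -1 if none is viable
inline int pickBest(const std::vector<Candidate>& candidates,
                    unsigned availableBase, unsigned availableExtras) {
    int best = -1;
    for(std::size_t i = 0; i != candidates.size(); ++i) {
        const Candidate& c = candidates[i];
        // Viable only if everything the variant needs is available
        if(c.base > availableBase || (c.extras & ~availableExtras)) continue;
        if(best == -1) { best = int(i); continue; }
        const Candidate& b = candidates[std::size_t(best)];
        const std::size_t count = std::bitset<32>(c.extras).count();
        const std::size_t bestCount = std::bitset<32>(b.extras).count();
        // Rule 1: higher base wins. Rule 2: on a tie, more extras win.
        if(c.base > b.base || (c.base == b.base && count > bestCount))
            best = int(i);
    }
    return best;
}

}
```

With candidates for an Sse2 variant with two extras, a plain Avx variant, an Avx variant with one extra and a plain Avx2 variant, a machine offering everything gets the Avx2 variant, while a machine limited to Avx plus that one extra gets the Avx variant using it.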
On the declaration side, the desired base instruction set gets ORed with as many extra instruction sets as needed, and then wrapped in a CORRADE_
    int lookup(CORRADE_CPU_DECLARE(Cpu::Sse2), …);
    int lookup(CORRADE_CPU_DECLARE(Cpu::Sse41|Cpu::Popcnt|Cpu::Lzcnt), …);
And a concrete overload gets picked at compile-time by passing a desired combination of CPU tags as well — or Default for the set of features enabled at compile time — this time wrapped in a CORRADE_
int found = lookup(CORRADE_CPU_SELECT(Cpu::Default), …);
Enabling instruction sets for particular functions
On GCC and Clang, a machine target has to be enabled in order to use a particular CPU instruction set or its intrinsics. While it's possible to do that for the whole compilation unit by passing for example -mavx2 to the compiler, it would force you to create dedicated files for every architecture variant you want to support. Instead, it's possible to equip particular functions with target attributes defined by CORRADE_
In contrast, MSVC doesn't restrict intrinsics usage in any way, so you can freely call e.g. AVX2 intrinsics even if the whole file is compiled with just SSE2 enabled. The CORRADE_
For developer convenience, the CORRADE_#ifdef
your variants to be compiled only where it makes sense, or even guard intrinsics includes with them to avoid including potentially heavy headers you won't use anyway. In comparison, using the CORRADE_-m
or /arch:
option passed to the compiler.
Finally, the CORRADE_
Definitions of the lookup() function variants from above would then look like below with the target attributes added. The extra instruction sets get explicitly enabled as well; in contrast, a scalar variant would have no target-specific annotations at all.
    int lookup(CORRADE_CPU_DECLARE(Cpu::Scalar), …) {
        …
    }
    #ifdef CORRADE_ENABLE_SSE2
    CORRADE_ENABLE_SSE2 int lookup(CORRADE_CPU_DECLARE(Cpu::Sse2), …) {
        …
    }
    #endif
    #if defined(CORRADE_ENABLE_SSE41) && \
        defined(CORRADE_ENABLE_POPCNT) && \
        defined(CORRADE_ENABLE_LZCNT)
    CORRADE_ENABLE(SSE41,POPCNT,LZCNT) int lookup(
        CORRADE_CPU_DECLARE(Cpu::Sse41|Cpu::Popcnt|Cpu::Lzcnt), …) {
        …
    }
    #endif
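For readers who want to see the underlying compiler mechanism, the following sketch uses the raw GCC/Clang target attribute directly instead of the Corrade macros. It assumes a GCC-compatible compiler on x86 and compiles to just the scalar path elsewhere. The demo_target namespace, the DEMO_HAS_AVX2_VARIANT macro and the sum8*() functions are hypothetical names made up for the example.

```cpp
#include <cassert>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define DEMO_HAS_AVX2_VARIANT
#endif

namespace demo_target {

// Scalar fallback with no target-specific annotation
inline int sum8Scalar(const int* data) {
    int sum = 0;
    for(int i = 0; i != 8; ++i) sum += data[i];
    return sum;
}

#ifdef DEMO_HAS_AVX2_VARIANT
// AVX2 is enabled for this function only, the rest of the file can stay
// compiled for the baseline instruction set
__attribute__((target("avx2"))) inline int sum8Avx2(const int* data) {
    const __m256i v =
        _mm256_loadu_si256(reinterpret_cast<const __m256i*>(data));
    // Add the upper 128-bit half to the lower one, then reduce horizontally
    __m128i s = _mm_add_epi32(_mm256_castsi256_si128(v),
                              _mm256_extracti128_si256(v, 1));
    s = _mm_hadd_epi32(s, s);
    s = _mm_hadd_epi32(s, s);
    return _mm_cvtsi128_si32(s);
}
#endif

// Calls the AVX2 variant only if the CPU actually supports it
inline int sum8(const int* data) {
#ifdef DEMO_HAS_AVX2_VARIANT
    __builtin_cpu_init();
    if(__builtin_cpu_supports("avx2")) return sum8Avx2(data);
#endif
    return sum8Scalar(data);
}

}
```

The runtime check in sum8() matters: the attribute only lets the compiler emit AVX2 code, it doesn't guarantee the machine can execute it.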
Automatic runtime dispatch
Similarly to how the best-matching function variant can be picked at compile time, there's a possibility to do the same at runtime without maintaining a custom dispatch code for each case as was shown above. To avoid having to dispatch on every call and to remove the argument passthrough overhead, all variants need to have the same function signature, separate from the CPU tags. That's achievable by putting them into lambdas with a common signature, and returning that lambda from a wrapper function that contains the CPU tag. After that, a runtime dispatcher function is created with a macro, CORRADE_CPU_DISPATCHER_BASE() in the snippet below. The transform() variants from above would then look like this instead:
    using TransformT = void(*)(Containers::ArrayView<float>);

    TransformT transformImplementation(Cpu::ScalarT) {
        return [](Containers::ArrayView<float> data) { … };
    }
    TransformT transformImplementation(Cpu::Sse42T) {
        return [](Containers::ArrayView<float> data) { … };
    }
    TransformT transformImplementation(Cpu::Avx2T) {
        return [](Containers::ArrayView<float> data) { … };
    }

    CORRADE_CPU_DISPATCHER_BASE(transformImplementation)
The macro creates an overload of the same name, but taking Features instead, and internally dispatches to one of the overloads using the same rules as in the compile-time dispatch. Which means you can now call it with e.g. runtimeFeatures(), get a function pointer back and then call it with the actual arguments:
    /* Dispatch once and cache the function pointer */
    TransformT transform = transformImplementation(Cpu::runtimeFeatures());

    /* Call many times */
    transform(data);
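The pattern of tag-carrying wrappers that return captureless lambdas can be tried out without the library. The sketch below writes by hand what a dispatcher of this kind might look like. It is not the code the Corrade macro generates, and the demo_dispatch namespace, the Feature bits and implementation() are hypothetical. The variants only return their name so the choice is observable.

```cpp
#include <cassert>
#include <string>

namespace demo_dispatch {

// Common signature shared by all variants
using ImplementationT = const char*(*)();

// Hypothetical feature bits and tag types
enum Feature: unsigned { Sse42 = 1u << 0, Avx2 = 1u << 1 };
struct ScalarT {};
struct Sse42T {};
struct Avx2T {};

// Each wrapper carries the tag and returns a captureless lambda, which
// converts to a plain function pointer
inline ImplementationT implementation(ScalarT) {
    return []() -> const char* { return "scalar"; };
}
inline ImplementationT implementation(Sse42T) {
    return []() -> const char* { return "sse4.2"; };
}
inline ImplementationT implementation(Avx2T) {
    return []() -> const char* { return "avx2"; };
}

// Hand-written dispatcher overload taking a feature mask instead of a tag
inline ImplementationT implementation(unsigned features) {
    if(features & Avx2) return implementation(Avx2T{});
    if(features & Sse42) return implementation(Sse42T{});
    return implementation(ScalarT{});
}

}
```

The dispatcher runs once, the returned pointer is stored, and all subsequent calls go through that pointer with no tag argument involved.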
Automatic runtime dispatch with extra instruction sets
If the variants are tagged with extra instruction sets instead of just the base instruction set like in the lookup()
case shown above, you'll use the CORRADE_
    using LookupT = int(*)(…);

    LookupT lookupImplementation(CORRADE_CPU_DECLARE(Cpu::Scalar)) {
        …
    }
    LookupT lookupImplementation(CORRADE_CPU_DECLARE(Cpu::Sse2)) {
        …
    }
    LookupT lookupImplementation(CORRADE_CPU_DECLARE(Cpu::Sse41|Cpu::Popcnt|Cpu::Lzcnt)) {
        …
    }

    CORRADE_CPU_DISPATCHER(lookupImplementation, Cpu::Popcnt, Cpu::Lzcnt)
If some extra instruction sets are always used together (like it is above with Popcnt and Lzcnt), you can reduce the amount of tested combinations by specifying them as a single ORed argument instead:
CORRADE_CPU_DISPATCHER(lookupImplementation, Cpu::Popcnt|Cpu::Lzcnt)
On the call side, there's no difference compared to using just the base instruction sets. The created dispatcher function takes Features as well.
Automatic cached dispatch
Ultimately, the dispatch can be performed implicitly, exposing only the final function or a function pointer, with no additional steps needed from the user side. There are three possible scenarios with varying performance tradeoffs. Continuing from the lookupImplementation() example above:
On Linux and Android with API 30+ it's possible to use the GNU IFUNC mechanism, where the dynamic linker performs a dispatch during the early startup. This is the fastest variant of runtime dispatch, as it results in an equivalent of a regular dynamic library function call. Assuming a dispatcher was created using either CORRADE_CPU_DISPATCHER() or CORRADE_CPU_DISPATCHER_BASE(), it's implemented using the CORRADE_CPU_DISPATCHED_IFUNC() macro:
    CORRADE_CPU_DISPATCHED_IFUNC(lookupImplementation, int lookup(…))
On platforms where IFUNC isn't available, a function pointer can be used for runtime dispatch instead. It's one additional indirection, which may have a visible effect if the dispatched-to code is relatively tiny and is called from within a tight loop. Assuming a dispatcher was created using either CORRADE_CPU_DISPATCHER() or CORRADE_CPU_DISPATCHER_BASE(), it's implemented using the CORRADE_CPU_DISPATCHED_POINTER() macro:
    CORRADE_CPU_DISPATCHED_POINTER(lookupImplementation, int(*lookup)(…))
For the least amount of overhead, the compile-time dispatch can be used, with arguments passed through by hand. Similarly to IFUNC, this will also result in a regular function, but without the indirection overhead. Furthermore, since it's a direct call to the lambda inside, the compiler can fully inline its contents, removing any remaining overhead and allowing LTO and other inter-procedural optimizations that wouldn't be possible with the indirect calls. This option is best suited for scenarios where it's possible to build and optimize code for a single target platform. In this case it calls directly to the original variants, so no macro is needed and CORRADE_CPU_DISPATCHER() / CORRADE_CPU_DISPATCHER_BASE() is not needed either:
    int lookup(…) {
        return lookupImplementation(CORRADE_CPU_SELECT(Cpu::Default))(…);
    }
With all three cases, you end up with either a function or a function pointer. The macro signatures are deliberately similar to each other and to the direct function declaration to make it possible to unify them under a single wrapper macro in case a practical use case needs to handle more than one variant.
Finally, when exposed in a header as appropriate, both the function and the function pointer variant can be then called the same way:
    #ifdef LOOKUP_USES_FUNCTION_POINTER
    int (*lookup)(…);
    #else
    int lookup(…);
    #endif

    int found = lookup(…);
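The function pointer scenario boils down to a global pointer that is resolved once and then called like a regular function. The sketch below shows that idea in plain C++, with the assumption that resolving during static initialization is acceptable. It is not how the Corrade macro is implemented, and the demo_cached namespace, dispatchCount() and the two lookup variants are hypothetical.

```cpp
#include <cassert>

namespace demo_cached {

using LookupT = int(*)(int);

// Counts how many times the dispatch ran, only to make caching observable
inline int& dispatchCount() {
    static int count = 0;
    return count;
}

// Two stand-in variants that compute the same result in different ways
inline int lookupScalar(int x) { return x*2; }
inline int lookupOptimized(int x) { return x + x; }

// Stand-in for a runtime dispatcher. A real one would query CPU features
// instead of taking a flag.
inline LookupT lookupImplementation(bool preferOptimized) {
    ++dispatchCount();
    return preferOptimized ? lookupOptimized : lookupScalar;
}

// Resolved once during static initialization. Every later call is a
// single indirect call with no dispatch logic involved.
LookupT lookup = lookupImplementation(true);

}
```

Call sites use lookup(…) the same way they would use a regular function, which is what makes the pointer and function variants interchangeable behind a header.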
Classes
- struct Avx2T
- AVX2 tag type.
- struct Avx512fT
- AVX-512 Foundation tag type.
- struct AvxF16cT
- AVX F16C tag type.
- struct AvxFmaT
- AVX FMA tag type.
- struct AvxT
- AVX tag type.
- struct Bmi1T
- BMI1 tag type.
- struct Bmi2T
- BMI2 tag type.
- class Features
- Feature set.
- struct LzcntT
- LZCNT tag type.
- struct NeonFmaT
- NEON FMA tag type.
- struct NeonFp16T
- NEON FP16 tag type.
- struct NeonT
- NEON tag type.
- struct PopcntT
- POPCNT tag type.
- struct ScalarT
- Scalar tag type.
- struct Simd128T
- SIMD128 tag type.
- struct Sse2T
- SSE2 tag type.
- struct Sse3T
- SSE3 tag type.
- struct Sse41T
- SSE4.1 tag type.
- struct Sse42T
- SSE4.2 tag type.
- struct Ssse3T
- SSSE3 tag type.
- template<class T> struct TypeTraits
- Traits class for CPU detection tag types.
Typedefs
- using DefaultBaseT = ScalarT
- Default base tag type.
- using DefaultExtraT = Implementation::Tags<0>
- Default extra tag type.
- using DefaultT = Implementation::Tags<static_cast<unsigned int>(TypeTraits<DefaultBaseT>::Index)|DefaultExtraT::Value>
- Default tag type.
Functions
- template<class T> auto tag() -> T constexpr
- Tag for a tag type.
- template<class T> auto features() -> Features constexpr
- Feature set for a tag type.
- auto compiledFeatures() -> Features constexpr
- CPU instruction sets enabled at compile time.
- auto runtimeFeatures() -> Features
- Detect available CPU instruction sets at runtime.
Variables
- ScalarT Scalar constexpr
- Scalar tag.
- Sse2T Sse2 constexpr
- SSE2 tag.
- Sse3T Sse3 constexpr
- SSE3 tag.
- Ssse3T Ssse3 constexpr
- SSSE3 tag.
- Sse41T Sse41 constexpr
- SSE4.1 tag.
- Sse42T Sse42 constexpr
- SSE4.2 tag.
- PopcntT Popcnt constexpr
- POPCNT tag.
- LzcntT Lzcnt constexpr
- LZCNT tag.
- Bmi1T Bmi1 constexpr
- BMI1 tag.
- Bmi2T Bmi2 constexpr
- BMI2 tag.
- AvxT Avx constexpr
- AVX tag.
- AvxF16cT AvxF16c constexpr
- AVX F16C tag.
- AvxFmaT AvxFma constexpr
- AVX FMA tag.
- Avx2T Avx2 constexpr
- AVX2 tag.
- Avx512fT Avx512f constexpr
- AVX-512 Foundation tag.
- NeonT Neon constexpr
- NEON tag.
- NeonFmaT NeonFma constexpr
- NEON FMA tag.
- NeonFp16T NeonFp16 constexpr
- NEON FP16 tag.
- Simd128T Simd128 constexpr
- SIMD128 tag.
- DefaultBaseT DefaultBase constexpr
- Default base tag.
- DefaultExtraT DefaultExtra constexpr
- Default extra tags.
- DefaultT Default constexpr
- Default tags.
Typedef documentation
typedef ScalarT Corrade::Cpu::DefaultBaseT
#include <Corrade/Cpu.h>
Default base tag type.
See the DefaultBase tag for more information.
typedef Implementation::Tags<0> Corrade::Cpu::DefaultExtraT
#include <Corrade/Cpu.h>
Default extra tag type.
See the DefaultExtra tag for more information.
typedef Implementation::Tags<static_cast<unsigned int>(TypeTraits<DefaultBaseT>::Index)|DefaultExtraT::Value> Corrade::Cpu::DefaultT
#include <Corrade/Cpu.h>
Default tag type.
See the Default tag for more information.
Function documentation
#include <Corrade/Cpu.h>
template<class T> T Corrade::Cpu::tag() constexpr
Tag for a tag type.
Returns a tag corresponding to tag type T. The following two expressions are equivalent:
    foo(Cpu::Avx2);
    foo(Cpu::tag<Cpu::Avx2T>());
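Forming a tag from its type is mostly useful in generic code, where only the tag type is known as a template parameter. A minimal sketch of the idea, assuming default-constructible tag types, could look like the following. The demo_tag namespace, width() and widthFor() are hypothetical and not part of Corrade.

```cpp
#include <cassert>

namespace demo_tag {

// Hypothetical tag types and their tag values
struct Sse2T {};
struct Avx2T {};
constexpr Sse2T Sse2{};
constexpr Avx2T Avx2{};

// Returns a tag value for given tag type
template<class T> constexpr T tag() { return T{}; }

// Overloads picked by tag. They report the vector register width in bits.
constexpr int width(Sse2T) { return 128; }
constexpr int width(Avx2T) { return 256; }

// Generic code that knows only the tag type, not the tag value
template<class T> constexpr int widthFor() { return width(tag<T>()); }

}
```

Here width(Avx2) and width(tag<Avx2T>()) select the same overload, and widthFor<Sse2T>() shows the templated use.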
#include <Corrade/Cpu.h>
template<class T> Features Corrade::Cpu::features() constexpr
Feature set for a tag type.
Returns Features with a tag corresponding to tag type T, avoiding a need to form the tag value in order to pass it to Features::
    Cpu::Features a = Cpu::Avx2;
    Cpu::Features b = Cpu::features<Cpu::Avx2T>();
Features Corrade::Cpu::compiledFeatures() constexpr
#include <Corrade/Cpu.h>
CPU instruction sets enabled at compile time.
On x86 returns a combination of Sse2, Sse3, Ssse3, Sse41, Sse42, Popcnt, Lzcnt, Bmi1, Bmi2, Avx, AvxF16c, AvxFma, Avx2 and Avx512f based on what all CORRADE_
On ARM, returns a combination of Neon, NeonFma and NeonFp16 based on what all CORRADE_
On WebAssembly, returns Simd128 based on whether the CORRADE_
On other platforms or if no known CPU instruction set is enabled, the returned value is equal to Scalar, which in turn is equivalent to empty (or default-constructed) Features.
Features Corrade::Cpu::runtimeFeatures()
#include <Corrade/Cpu.h>
Detect available CPU instruction sets at runtime.
On x86 and GCC, Clang or MSVC uses the CPUID builtin to check for the Sse2, Sse3, Ssse3, Sse41, Sse42, Popcnt, Lzcnt, Bmi1, Bmi2, Avx, AvxF16c, AvxFma, Avx2 and Avx512f runtime features. Avx needs OS support as well; if it's not present, no following flags including Bmi1 and Bmi2 are checked either. On compilers other than GCC, Clang and MSVC the function is constexpr and delegates to compiledFeatures().
On ARM and Linux or Android API level 18+ uses getauxval(), or on ARM macOS and iOS uses sysctlbyname(), to check for Neon, NeonFma and NeonFp16. Neon and NeonFma are implicitly supported on ARM64. On other platforms the function is constexpr and delegates to compiledFeatures().
On WebAssembly an attempt to use SIMD instructions without runtime support results in a WebAssembly compilation error and thus runtime detection is largely meaningless. While this may change once the feature detection proposal is implemented, at the moment the function is constexpr and delegates to compiledFeatures().
On other platforms or if no known CPU instruction set is detected, the returned value is equal to Scalar, which in turn is equivalent to empty (or default-constructed) Features.
Variable documentation
ScalarT Corrade::Cpu::Scalar constexpr
#include <Corrade/Cpu.h>
Scalar tag.
Code that isn't explicitly optimized with any advanced CPU instruction set. Fallback if no other CPU instruction set is chosen or available. The next most widely supported instruction sets are Sse2 on x86, Neon on ARM and Simd128 on WebAssembly.
Sse2T Corrade::Cpu::Sse2 constexpr
#include <Corrade/Cpu.h>
SSE2 tag.
Streaming SIMD Extensions 2. Available only on x86, supported by all 64-bit x86 processors and present on the majority of contemporary 32-bit x86 processors as well. Superset of Scalar, implied by Sse3.
Sse3T Corrade::Cpu::Sse3 constexpr
#include <Corrade/Cpu.h>
SSE3 tag.
Streaming SIMD Extensions 3. Available only on x86. Superset of Sse2, implied by Ssse3.
Ssse3T Corrade::Cpu::Ssse3 constexpr
#include <Corrade/Cpu.h>
SSSE3 tag.
Supplemental Streaming SIMD Extensions 3. Available only on x86. Superset of Sse3, implied by Sse41.
Note that certain older AMD processors have SSE4a but neither SSSE3 nor SSE4.1. Both can be however treated as a subset of SSE4.1 to a large extent, and it's recommended to use Sse41 to handle those.
Sse41T Corrade::Cpu::Sse41 constexpr
#include <Corrade/Cpu.h>
SSE4.1 tag.
Streaming SIMD Extensions 4.1. Available only on x86. Superset of Ssse3, implied by Sse42.
Note that certain older AMD processors have SSE4a but neither SSSE3 nor SSE4.1. Both can be however treated as a subset of SSE4.1 to a large extent, and it's recommended to use Sse41 to handle those.
Sse42T Corrade::Cpu::Sse42 constexpr
#include <Corrade/Cpu.h>
SSE4.2 tag.
Streaming SIMD Extensions 4.2. Available only on x86. Superset of Sse41, implied by Avx.
PopcntT Corrade::Cpu::Popcnt constexpr
#include <Corrade/Cpu.h>
POPCNT tag.
POPCNT instructions. Available only on x86. This instruction set is treated as an extra, i.e. is neither a superset of nor implied by any other instruction set. See Usage with extra instruction sets for more information.
LzcntT Corrade::Cpu::Lzcnt constexpr
#include <Corrade/Cpu.h>
LZCNT tag.
LZCNT instructions. Available only on x86. This instruction set is treated as an extra, i.e. is neither a superset of nor implied by any other instruction set. See Usage with extra instruction sets for more information.
Note that this instruction has an encoding compatible with an earlier BSR instruction which has a slightly different behavior. To avoid wrong results if it isn't available, prefer to always detect its presence with runtimeFeatures() instead of a compile-time check.
Bmi1T Corrade::Cpu::Bmi1 constexpr
#include <Corrade/Cpu.h>
BMI1 tag.
BMI1 instructions, including TZCNT. Available only on x86. This instruction set is treated as an extra, i.e. is neither a superset of nor implied by any other instruction set. See Usage with extra instruction sets for more information.
Note that the TZCNT instruction has an encoding compatible with an earlier BSF instruction which has a slightly different behavior. To avoid wrong results if it isn't available, prefer to always detect its presence with runtimeFeatures() instead of a compile-time check.
Bmi2T Corrade::Cpu::Bmi2 constexpr
#include <Corrade/Cpu.h>
BMI2 tag.
BMI2 instructions. Available only on x86. This instruction set is treated as an extra, i.e. is neither a superset of nor implied by any other instruction set. See Usage with extra instruction sets for more information.
AvxT Corrade::Cpu::Avx constexpr
#include <Corrade/Cpu.h>
AVX tag.
Advanced Vector Extensions. Available only on x86. Superset of Sse42, implied by Avx2.
AvxF16cT Corrade::Cpu::AvxF16c constexpr
#include <Corrade/Cpu.h>
AVX F16C tag.
F16C instructions. Available only on x86. This instruction set is treated as an extra, i.e. is neither a superset of nor implied by any other instruction set. See Usage with extra instruction sets for more information.
AvxFmaT Corrade::Cpu::AvxFma constexpr
#include <Corrade/Cpu.h>
AVX FMA tag.
FMA3 instruction set. Available only on x86. This instruction set is treated as an extra, i.e. is neither a superset of nor implied by any other instruction set. See Usage with extra instruction sets for more information.
Avx2T Corrade::Cpu::Avx2 constexpr
#include <Corrade/Cpu.h>
AVX2 tag.
Advanced Vector Extensions 2. Available only on x86. Superset of Avx, implied by Avx512f.
Simd128T Corrade::Cpu::Simd128 constexpr
#include <Corrade/Cpu.h>
SIMD128 tag.
128-bit WebAssembly SIMD. Available only on WebAssembly. Superset of Scalar.
DefaultBaseT Corrade::Cpu::DefaultBase constexpr
#include <Corrade/Cpu.h>
Default base tag.
Highest base instruction set available on given architecture with current compiler flags. Ordered by priority, on CORRADE_
- Avx512f if CORRADE_TARGET_AVX512F is defined
- Avx2 if CORRADE_TARGET_AVX2 is defined
- Avx if CORRADE_TARGET_AVX is defined
- Sse42 if CORRADE_TARGET_SSE42 is defined
- Sse41 if CORRADE_TARGET_SSE41 is defined
- Ssse3 if CORRADE_TARGET_SSSE3 is defined
- Sse3 if CORRADE_TARGET_SSE3 is defined
- Sse2 if CORRADE_TARGET_SSE2 is defined
- Scalar otherwise
On CORRADE_
- NeonFp16 if CORRADE_TARGET_NEON_FP16 is defined
- NeonFma if CORRADE_TARGET_NEON_FMA is defined
- Neon if CORRADE_TARGET_NEON is defined
- Scalar otherwise
On CORRADE_
- Simd128 if CORRADE_TARGET_SIMD128 is defined
- Scalar otherwise
In addition to the above, DefaultExtra contains a combination of extra instruction sets available together with the base instruction set, and Default is a combination of both. See also compiledFeatures() which returns a combination of base tags instead of just the highest available, together with the extra instruction sets, and runtimeFeatures() which is capable of detecting the available CPU feature set at runtime.
DefaultExtraT Corrade::Cpu::DefaultExtra constexpr
#include <Corrade/Cpu.h>
Default extra tags.
Instruction sets available in addition to DefaultBase on given architecture with current compiler flags. On CORRADE_
- Popcnt if CORRADE_TARGET_POPCNT is defined
- Lzcnt if CORRADE_TARGET_LZCNT is defined
- Bmi1 if CORRADE_TARGET_BMI1 is defined
- Bmi2 if CORRADE_TARGET_BMI2 is defined
- AvxFma if CORRADE_TARGET_AVX_FMA is defined
- AvxF16c if CORRADE_TARGET_AVX_F16C is defined
No extra instruction sets are currently defined for CORRADE_
In addition to the above, Default is a combination of both DefaultBase and the extra instruction sets. See also compiledFeatures() which returns these together with a combination of all base instruction sets available, and runtimeFeatures() which is capable of detecting the available CPU feature set at runtime.
DefaultT Corrade::Cpu::Default constexpr
#include <Corrade/Cpu.h>
Default tags.
A combination of DefaultBase and DefaultExtra, see their documentation for more information.