OpenGL » Driver workarounds

List of OpenGL driver workarounds used by Magnum.

Driver workarounds used by a particular app are listed in the engine startup log such as here:

Renderer: GeForce GT 740M/PCIe/SSE2 by NVIDIA Corporation
OpenGL version: 4.6.0 NVIDIA 390.25
Using optional features:
Using driver workarounds:

These identifiers correspond to the strings in the below listing. For debugging and diagnostic purpose it's possible to disable particular workarounds by passing their identifier string to the --magnum-disable-workarounds command-line option. See Command-line options for more information.

/* Calling glBufferData(), glMapBuffer(), glMapBufferRange() or glUnmapBuffer()
   on ANY buffer when ANY buffer is attached to a currently bound
   GL_TEXTURE_BUFFER crashes in gleUpdateCtxDirtyStateForBufStampChange deep
   inside Apple's GLengine. This can be worked around by unbinding all buffer
   textures before attempting to do such operation.

   A previous iteration of this workaround was to remember if a buffer is
   attached to a buffer texture, temporarily detaching it, calling given
   data-modifying API and then attaching it back with the same parameters.
   Unfortunately we also had to cache the internal texture format, as
   GL_TEXTURE_INTERNAL_FORMAT query is broken for buffer textures as well,
   returning always GL_R8 (the spec-mandated default). "Fortunately" macOS
   doesn't support ARB_texture_buffer_range so we didn't need to store also
   offset/size, only texture ID and its internal format, wasting 8 bytes per
   Buffer instance. HOWEVER, then we discovered this is not enough and also
   completely unrelated buffers suffer from the same crash. Fixing that
   properly in a similar manner would mean going through all live buffer
   texture instances and temporarily detaching their buffer when doing *any*
   data modification on *any* buffer, which would have extreme perf
   implications. So FORTUNATELY unbinding the textures worked around this too,
   and is a much nicer workaround after all. */

/* glBeginQuery() with GL_TIME_ELAPSED causes a GL_OUT_OF_MEMORY error when
   running from the Android shell (through ADB). No such error happens in an
   APK. Detecting using the $SHELL environment variable and disabling
   GL_EXT_disjoint_timer_query in that case. */

/* ARB_direct_state_access on AMD Windows drivers has broken
   glTextureSubImage3D() / glGetTextureImage() on cube map textures (but not
   cube map arrays), always failing with erros like
   `glTextureSubImage3D has generated an error (GL_INVALID_VALUE)` if Z size or
   offset is larger than 1. Working around that by up/downloading
   slice-by-slice using non-DSA APIs, similarly to the
   svga3d-texture-upload-slice-by-slice workaround. The compressed image up/
   download is affected as well, but we lack APIs for easy format-dependent
   slicing and offset calculation, so those currently still fail. */

/* AMD Windows drivers have broken the DSA glCopyTextureSubImage3D(), returning
   GL_INVALID_VALUE. The non-DSA code path works. */

/* AMD Windows glCreateQueries() works for everything except
   GL_TRANSFORM_FEEDBACK_[STREAM_]OVERFLOW, probably they just forgot to adapt
   it to this new GL 4.6 addition. Calling the non-DSA code path in that case
   instead. Similar to "mesa-dsa-createquery-except-pipeline-stats". */

/* Creating core context with specific version on AMD and NV proprietary
   drivers on Linux/Windows and Intel drivers on Windows causes the context to
   be forced to given version instead of selecting latest available version */

/* On Windows Intel drivers ARB_shading_language_420pack is exposed in GLSL
   even though the extension (e.g. binding keyword) is not supported */

/* Mesa glCreateQueries() works for everything except stuff from GL 4.6
   ARB_pipeline_statistics_query, probably just forgotten. Calling the non-DSA
   code path in that case instead. Similar to
   "amd-windows-dsa-createquery-except-xfb-overflow". */

/* Forward-compatible GL contexts on Mesa still report line width range as
   [1, 7], but setting wide line width fails. According to the specs the max
   value on forward compatible contexts should be 1.0, so patching it. */

/* On Windows NVidia drivers the glTransformFeedbackVaryings() does not make a
   copy of its char* arguments so it fails at link time when the original char
   arrays are not in scope anymore. Enabling *synchronous* debug output
   circumvents this bug. Can be triggered by running TransformFeedbackGLTest
   with GL_KHR_debug extension disabled. */

/* Layout qualifier causes compiler error with GLSL 1.20 on Mesa, GLSL 1.30 on
   NVidia and 1.40 on macOS. Everything is fine when using a newer GLSL
   version. */

/* NVidia drivers (358.16) report compressed block size from internal format
   query in bits instead of bytes */

/* NVidia drivers (358.16) report different compressed image size for cubemaps
   based on whether the texture is immutable or not and not based on whether
   I'm querying all faces (ARB_DSA) or a single face (non-DSA, EXT_DSA) */

/* NVidia drivers (358.16) return only the first slice of compressed cube map
   image when querying all six slices using the ARB_DSA API */

/* NVidia drivers return 0 when asked for GL_CONTEXT_PROFILE_MASK, so it needs
   to be worked around by asking for GL_ARB_compatibility */

/* (Headless) EGL contexts for desktop GL on NVidia 384 and 390 drivers don't
   have correct statically linked GL 1.0 and 1.1 functions (such as
   glGetString()) and one has to retrieve them explicitly using
   eglGetProcAddress(). Doesn't seem to happen on pre-384 and 396, but it's not
   possible to get driver version through EGL, so enabling this unconditionally
   on all EGL NV contexts. */

/* SVGA3D (VMware host GL driver) glDrawArrays() draws nothing when the vertex
   buffer memory is initialized using glNamedBufferData() from ARB_DSA. Using
   the non-DSA glBufferData() works. */

/* SVGA3D does out-of-bound writes in some cases of glGetTexSubImage(), leading
   to memory corruption on client machines. That's nasty, so the whole
   ARB_get_texture_sub_image is disabled. */

/* SVGA3D has broken handling of glTex[ture][Sub]Image*D() for 1D arrays, 2D
   arrays, 3D textures and cube map textures where it uploads just the first
   slice in the last dimension. This is only with copies from host memory, not
   with buffer images. Seems to be fixed in Mesa 13, but I have no such system
   to verify that on. */

/* Shader sources containing UTF-8 characters are converted to empty strings
   when running on Emscripten with -s USE_PTHREADS=1. Working around that by
   replacing all chars > 127 with spaces. Relevant code: */

/* Empty EGL_CONTEXT_FLAGS_KHR cause SwiftShader to fail context
   creation with EGL_BAD_ATTRIBUTE. Not sending the flags then. Relevant code: */

/* SwiftShader crashes deep inside eglMakeCurrent() when using
   EGL_NO_SURFACE. Supplying a 32x32 PBuffer to work around that. */

/* SwiftShader 4.1.0 on ES2 contexts reports GL_ANGLE_instanced_arrays and
   GL_EXT_instanced_arrays but has no glDrawArraysInstancedANGLE /
   glDrawArraysInstancedEXT nor glDrawElementsInstancedANGLE /
   glDrawElementsInstancedEXT entrypoints, only the unsuffixed versions for
   ES3. OTOH, glVertexAttribDivisor is there for both ANGLE and EXT. Relevant
   Disabling the two extensions on ES2 contexts to avoid nullptr crashes. */

/* SwiftShader 4.1.0 on ES2 contexts reports GL_OES_texture_3D but from all its
   entrypoints only glTexImage3DOES is present, all others are present only in
   the ES3 unsuffixed versions. Relevant code:
   Disabling the extension on ES2 contexts to avoid nullptr crashes. */

/* SwiftShader 4.1.0 has special handling for binding buffers to the transform
   feedback target, requiring an XFB object to be active when a buffer is bound
   to GL_TRANSFORM_FEEDBACK_BUFFER and ignoring the glBindBuffer() call
   otherwise. No other driver does that. As a workaround, setting
   Buffer::TargetHint::TransformFeedback will make it use
   Buffer::TargetHint::Array instead, as that works okay. */

/* SwiftShader 4.1.0 does implement gl_VertexID for ES3 contexts, but in
   practice it doesn't work, returning a constant value. In order to make this
   easier to check, there's a dummy MAGNUM_shader_vertex_id extension that's
   defined on all GL 3.0+ and GLES 3.0+ / WebGL 2+ contexts *except* for
   SwiftShader. */

/* Even with the DSA variant, where GL_IMPLEMENTATION_COLOR_READ_* is passed to
   glGetNamedFramebufferParameter(), Mesa complains that the framebuffer is not
   bound for reading. Relevant code:
   Workaround is to explicitly bind the framebuffer for reading. */

/* Intel drivers on Windows return GL_UNSIGNED_BYTE for *both*
   glGetNamedFramebufferParameter() or glGetFramebufferParameter(),
   independently on what's the actual framebuffer format. Using glGetInteger()
   makes it return GL_RGBA and GL_UNSIGNED_BYTE for RGBA8 framebuffers, and
   cause an "Error has been generated. GL error GL_INVALID_OPERATION in
   GetIntegerv: (ID: 2576729458) Generic error" when it is not. Since
   glGetInteger() is actually able to return a correct value in *one
   circumstance*, it's preferrable to the other random shit the driver is
   doing. */

/* Intel drivers on Windows have some synchronization / memory alignment bug in
   the DSA glNamedBufferData() when the same buffer is set as an index buffer
   to a mesh right after or repeatedly. Calling glBindBuffer() right before or
   after the data upload fixes the issue. The above is reproducible with the
   2019.01 ImGui example, and used to be worked around in a more hopeful way.
   However, the reports about things going *bad* in heavier ImGui-based apps
   didn't stop with that and none of my tests were able to reproduce anything.
   Since I lost patience already, I'm disabling the DSA code paths for
   everything related to buffers. (Two weeks pass.) But wait! while that fixed
   all issues for *some* users, it made things completely broken elsewhere,
   causing an app to render just a clear color and nothing else. The cancer
   apparently spread further, so I'm disabling all VAO-related DSA code paths
   as well now. Workarounds listed separately, in case someone might want to
   dig further or experience the misery of only one of them being active.

   To save you time experimenting:

   - (Epilepsy warning!) With the former disabled and no matter whether the
     second is disabled or not, the ImGui example (or any other ImGui-based
     app, really), the screen will start flickering heavily under *some*
     circumstances. This is known since drivers 24 at least.
   - With the former enabled and the second disabled, you might either
     experience a total doom, where just the framebuffer clear color is
     visible, or your app is totally fine. This is reproducible with drivers 25
     or 26 at least. Note that modifying the code to enable this workaround on
     other drivers (AMD on Windows, e.g.) doesn't break anything, so it's not
     like the workaround would be incomplete with some code paths still relying
     on DSA that's not there. It's clearly Intel drivers fault.
   - With both enabled, things seem to be fine, and I hope it stays that way
     also for future driver updates. */

/* ARB_direct_state_access implementation on Intel Windows drivers has broken
   *everything* related to cube map textures (but not cube map arrays) -- data
   upload, data queries, framebuffer attachment, framebuffer copies, all
   complaining about "Wrong <func> 6 provided for <target> 34067" and similar
   (GL_TEXTURE_CUBE_MAP is 34067). Using the non-DSA code paths as a
   workaround (for the 3D image up/download as well). */

/* DSA glBindTextureUnit() on Intel Windows drivers simply doesn't work when
   passing 0 to it. Non-zero IDs work correctly except for cube maps. Using the
   non-DSA code path for unbinding and cube maps as a workaround. */

/* DSA glNamedFramebufferTexture() on Intel Windows drivers doesn't work for
   layered cube map array attachments. Non-layered or non-array cube map
   attachment works. Using the non-DSA code path as a workaround. */

/* DSA glClearNamedFramebuffer*() on Intel Windows drivers doesn't do anything.
   Using the non-DSA code path as a workaournd. */

/* Using DSA glCreateQueries() on Intel Windows drivers breaks
   glBeginQueryIndexed(). Using the non-DSA glGenQueries() instead makes it
   work properly. See TransformFeedbackGLTest for a test. */

/* DSA-ified "vertex layout" glVertexArrayAttribIFormat() is broken when
   passing shorts instead of full 32bit ints. Using the old-style
   glVertexAttribIPointer() works correctly. No idea if the non-DSA
   glVertexAttribIFormat() works or not. A test that triggers this issue is in
   MeshGLTest::addVertexBufferIntWithShort(). */

/* When using more than just a vertex and fragment shader (geometry shader,
   e.g.), ARB_explicit_uniform_location on Intel silently uses wrong
   locations, blowing up with either a non-descript
    Error has been generated. GL error GL_INVALID_OPERATION in ProgramUniformMatrix4fv: (ID: 2052228270) Generic error
   or, if you are lucky, a highly-cryptic-but-still-better-than-nothing
    Error has been generated. GL error GL_INVALID_OPERATION in ProgramUniform4fv: (ID: 1725519030) GL error GL_INVALID_OPERATION: mismatched type setting uniform of location "3" in program 1, "" using shaders, 2, "", 3, "", 8, ""
   *unless* you have vertex uniform locations first, fragment locations second
   and geometry locations last. Another case is happening with color for a
   Flat3D shader --  because a (compiled out / unused) texture matrix was at
   location 1, setting color to location 2 didn't work, ending up with a
   Generic error again (driver version 27). Because this is impossible to
   prevent, the extension is completely disabled on all Intel Windows drivers. */

/* NVidia seems to be returning values for the default framebuffer when
   glGetNamedFramebufferParameter(). Using either glGetInteger() or
   glGetFramebufferParameter() works correctly. */

/* ApiTrace needs an explicit initial glViewport() call to initialize its
   framebuffer size, otherwise it assumes it's zero-sized. */

/* While the EXT_disjoint_timer_query extension should be only on WebGL 1 and
   EXT_disjoint_timer_query_webgl2 only on WebGL 2, Firefox reports
   EXT_disjoint_timer_query on both. The entry points work correctly however,
   so this workaround makes Magnum pretend EXT_disjoint_timer_query_webgl2 is
   available when it detects EXT_disjoint_timer_query on WebGL 2 builds on
   Firefox. See also,
   and for the
   Emscripten-side part of this workaround. */

Chromium has a similar list: