Magnum::Text::AbstractShaper class new in Git master

Base for text shapers.

Returned from AbstractFont::createShaper(), provides an interface for shaping text with the AbstractFont it originated from. Meant to be (privately) subclassed by AbstractFont plugin implementations.

Shaping is a process of converting a sequence of Unicode codepoints to a visual form, i.e. a list of glyphs of a particular font, their offsets and horizontal or vertical advances. Shaping is often not a 1:1 mapping from codepoints to glyphs, but involves merging, subdividing and reordering as well.

Usage

Call AbstractFont::createShaper() to get a shaper instance. The plugin will always return a valid instance, so there's no need to check the pointer for being nullptr. Note, however, that the originating AbstractFont instance has to stay in scope for at least as long as the AbstractShaper is alive.

A text is shaped by calling shape(), retrieving the shaped glyph count with glyphCount() and then getting the glyph data with glyphIdsInto() and glyphOffsetsAdvancesInto(). Glyph IDs can then be queried in (or inserted into) an AbstractGlyphCache. Rendering the glyphs at their offsets, with the cursor moving by the advances, is what makes up the final shaped text.

Containers::Pointer<Text::AbstractFont> font = …;
Containers::Pointer<Text::AbstractShaper> shaper = font->createShaper();

/* Set text properties and shape it */
shaper->setScript(Text::Script::Latin);
shaper->setDirection(Text::ShapeDirection::LeftToRight);
shaper->setLanguage("en");
shaper->shape("Hello, world!");

/* Get the glyph info back */
struct GlyphInfo {
    UnsignedInt id;
    Vector2 offset;
    Vector2 advance;
};
Containers::Array<GlyphInfo> glyphs{NoInit, shaper->glyphCount()};
shaper->glyphIdsInto(
    stridedArrayView(glyphs).slice(&GlyphInfo::id));
shaper->glyphOffsetsAdvancesInto(
    stridedArrayView(glyphs).slice(&GlyphInfo::offset),
    stridedArrayView(glyphs).slice(&GlyphInfo::advance));
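
The way offsets and advances combine into glyph positions can be sketched as follows. This is only an illustration under stated assumptions: Vec2 and glyphPositions() are hypothetical stand-ins written with the standard library, not part of Magnum, and the additional per-glyph rectangle offset coming from the glyph cache is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical stand-in for Vector2, used only to keep the sketch
   self-contained */
struct Vec2 {
    float x, y;
};

/* Computes where each glyph is placed: a glyph goes to the current cursor
   plus its offset, and the cursor then moves by the glyph's advance */
std::vector<Vec2> glyphPositions(const std::vector<Vec2>& offsets,
                                 const std::vector<Vec2>& advances,
                                 Vec2 cursor) {
    /* Both arrays describe the same glyphs, so the sizes have to match */
    assert(offsets.size() == advances.size());

    std::vector<Vec2> positions;
    positions.reserve(offsets.size());
    for(std::size_t i = 0; i != offsets.size(); ++i) {
        positions.push_back({cursor.x + offsets[i].x,
                             cursor.y + offsets[i].y});
        cursor.x += advances[i].x;
        cursor.y += advances[i].y;
    }
    return positions;
}
```

Note that an offset affects only the glyph it belongs to, while an advance affects every glyph that follows.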

For best results, it's recommended to call (a subset of) setScript(), setLanguage() and setDirection() if at least some properties of the input text are known, as shown above. Without these, the font plugin may attempt to autodetect the properties, which might not always give a correct result. If a particular font plugin doesn't implement a given script, language or direction, or if it doesn't have any special handling for it, the given function will return false. The script() const, language() const and direction() const can be used to inspect results of autodetection after shape() has been called. The set of supported scripts, languages and directions and the exact behavior for unsupported values is plugin-specific: it may for example choose a fallback instead, or it may ignore the setting altogether. See documentation of particular AbstractFont subclasses for more information.

Enabling and disabling typographic features

In the above snippet, the whole text is shaped using typographic features that are default in the font. For example, assuming the font would support small capitals (and the particular AbstractFont plugin would recognize and use the feature), we could render the "world" part with small caps, resulting in "Hello, ᴡᴏʀʟᴅ!".

shaper->shape("Hello, world!", {
    {Text::Feature::SmallCapitals, 7, 12}
});

Similarly, features can be enabled for the whole text by omitting the begin and end parameters, or for example a feature that a particular AbstractFont plugin uses by default can be disabled by passing an explicit false argument. The range, if present, is always given in bytes of the UTF-8 input. Capabilities of typographic features are rather broad, see the Feature enum and documentation linked from it for exhaustive information.
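
Because the range is given in bytes and not in characters, it may be convenient to compute it from the text instead of counting by hand. A minimal sketch, assuming the text is held in a std::string; byteRange() is a hypothetical helper, not a Magnum API:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

/* Returns the [begin, end) byte range of the first occurrence of `part` in
   `text`. Both strings are treated as raw UTF-8 bytes, so the result counts
   bytes, not characters. */
std::pair<std::size_t, std::size_t> byteRange(const std::string& text,
                                              const std::string& part) {
    const std::size_t begin = text.find(part);
    /* The sketch expects `part` to be present in `text` */
    assert(begin != std::string::npos);
    return {begin, begin + part.size()};
}
```

For "Hello, world!" and "world" the helper gives a begin of 7 and an end of 12, the same values as in the snippet above. With non-ASCII input the byte positions differ from character positions, as each multi-byte character contributes more than one byte.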

Combining different shapers

Sometimes it's desirable to render different parts of the text with different fonts, not just different features of the same font. A variation of the above example could be rendering the "world" part with a bold font:

Containers::Pointer<Text::AbstractFont> font = …;
Containers::Pointer<Text::AbstractFont> boldFont = …;
Containers::Pointer<Text::AbstractShaper> shaper = font->createShaper();
Containers::Pointer<Text::AbstractShaper> boldShaper = boldFont->createShaper();


Containers::Array<GlyphInfo> glyphs;

/* Shape "Hello, " with a regular font */
shaper->shape("Hello, world!", 0, 7);
Containers::StridedArrayView1D<GlyphInfo> glyphs1 =
    arrayAppend(glyphs, NoInit, shaper->glyphCount());
shaper->glyphIdsInto(
    glyphs1.slice(&GlyphInfo::id));
shaper->glyphOffsetsAdvancesInto(
    glyphs1.slice(&GlyphInfo::offset),
    glyphs1.slice(&GlyphInfo::advance));

/* Append "world" shaped with a bold font */
boldShaper->shape("Hello, world!", 7, 12);
Containers::StridedArrayView1D<GlyphInfo> glyphs2 =
    arrayAppend(glyphs, NoInit, boldShaper->glyphCount());
boldShaper->glyphIdsInto(
    glyphs2.slice(&GlyphInfo::id));
boldShaper->glyphOffsetsAdvancesInto(
    glyphs2.slice(&GlyphInfo::offset),
    glyphs2.slice(&GlyphInfo::advance));

/* Finally shape "!" with a regular font again */
shaper->shape("Hello, world!", 12, 13);
Containers::StridedArrayView1D<GlyphInfo> glyphs3 =
    arrayAppend(glyphs, NoInit, shaper->glyphCount());
shaper->glyphIdsInto(
    glyphs3.slice(&GlyphInfo::id));
shaper->glyphOffsetsAdvancesInto(
    glyphs3.slice(&GlyphInfo::offset),
    glyphs3.slice(&GlyphInfo::advance));

The resulting glyphs array is usable the same way as in the above case, with the difference that the glyph IDs have to be looked up in an AbstractGlyphCache with a font ID corresponding to the range they're in. Also note that the whole text is passed every time, with a begin and end specified for it, instead of passing just the slice alone. While possibly not having any visible effect in this particular case, in general it allows the shaper to make additional decisions based on surrounding context, for example picking glyphs that are better connected to their neighbors in handwriting fonts.
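
One possible way to remember which font a glyph belongs to is to record the glyph count of every shape() call together with a font identifier. The sketch below is an assumption about how an application could do that bookkeeping; GlyphRun, its fontId field and fontIdForGlyph() are hypothetical names, not part of Magnum.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* One contiguous run of glyphs produced by a single shape() call */
struct GlyphRun {
    unsigned int fontId;     /* hypothetical ID of the font in the cache */
    std::size_t glyphCount;  /* value of glyphCount() after that call */
};

/* Finds the font ID of the run that contains the glyph at given index in the
   combined glyph array */
unsigned int fontIdForGlyph(const std::vector<GlyphRun>& runs,
                            std::size_t glyph) {
    std::size_t runEnd = 0;
    for(const GlyphRun& run: runs) {
        runEnd += run.glyphCount;
        if(glyph < runEnd) return run.fontId;
    }
    /* The index is past the last run */
    assert(!"glyph index out of range");
    return ~0u;
}
```

The run sizes have to come from glyphCount() and not from byte counts of the slices, as the two are equal only if the shaping happens to be a 1:1 mapping.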

Managing multiple instances

As shown above, a particular AbstractShaper instance is reusable, i.e. it's possible to call shape() (and potentially also setScript(), setLanguage() and setDirection()) several times to shape multiple pieces of text with it. Doing so allows the AbstractFont plugin implementation to reuse allocated buffers and other state compared to a fresh instance from AbstractFont::createShaper() having to be initialized every time.

The application may choose several strategies, for example have a single AbstractShaper instance and shape all texts with it, resetting its state every time. Or for example have a few persistent AbstractShaper instances for dynamic text that changes every frame, or have dedicated preconfigured per-font, per-script or per-language instances.

Mapping between input text and shaped glyphs

For implementing text selection or editing, mapping from screen position to concrete glyphs can be done using the advances returned from glyphOffsetsAdvancesInto(). From there however, in the general case, the text can consist of multi-byte UTF-8 characters, the shaper can perform ligature substitutions, glyph decomposition or reordering, and thus there's rarely a 1:1 mapping from the shaped glyphs back to the input text.

The mapping from glyph IDs to bytes of the text passed to shape() can be retrieved using glyphClustersInto(). In the following example, a range between glyphs 2 and 5 is mapped to the input text bytes, for example to copy it as a selection to clipboard:

Containers::StringView text = …;

shaper->shape(text);


Containers::Array<UnsignedInt> clusters{NoInit, shaper->glyphCount()};
shaper->glyphClustersInto(clusters);

Containers::StringView selection = text.slice(clusters[2], clusters[5]);

In the other direction, picking a range of glyphs corresponding to a range of input bytes involves finding cluster IDs with a lower and upper bound for the given byte positions. See the documentation of glyphClustersInto() for concrete examples of what the retrieved cluster IDs may look like depending on what operations the shaper performs.
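
The bound search could look like the following sketch. It assumes the cluster IDs were already copied into a std::vector, that they are non-decreasing (as is the default for left-to-right shaping) and that the byte positions lie on cluster boundaries; glyphRangeForBytes() is a hypothetical helper, not a Magnum API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

/* Maps a [byteBegin, byteEnd) range of the input text to a [first, last)
   range of glyph indices */
std::pair<std::size_t, std::size_t> glyphRangeForBytes(
    const std::vector<unsigned int>& clusters,
    unsigned int byteBegin, unsigned int byteEnd)
{
    assert(byteBegin <= byteEnd);
    /* Binary search is valid only on a non-decreasing sequence */
    assert(std::is_sorted(clusters.begin(), clusters.end()));

    /* First glyph whose cluster is at or after the beginning byte */
    const auto first = std::lower_bound(clusters.begin(), clusters.end(),
                                        byteBegin);
    /* First glyph whose cluster is at or after the end byte */
    const auto last = std::lower_bound(first, clusters.end(), byteEnd);
    return {std::size_t(first - clusters.begin()),
            std::size_t(last - clusters.begin())};
}
```

For a right-to-left direction the sequence is non-increasing, so the comparison would have to be reversed. Byte positions falling inside a cluster, for example in the middle of a ligature, need a policy decision that this sketch doesn't attempt to make.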

Subclassing

The AbstractFont plugin is meant to create a local AbstractShaper subclass. It implements at least doShape(), doGlyphIdsInto(), doGlyphOffsetsAdvancesInto() and doGlyphClustersInto(), and potentially also (a subset of) doSetScript(), doScript(), doSetLanguage(), doLanguage(), doSetDirection() and doDirection(). The public API does most sanity checks on its own, see documentation of particular do*() functions for more information about the guarantees.
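
The split between a checking public API and private do*() implementations can be illustrated with a heavily simplified toy. ToyAbstractShaper and ByteShaper below are hypothetical classes written only with the standard library; they are not the actual Magnum implementation and cover just shape(), glyphCount() and glyphIdsInto().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

/* Hypothetical, simplified stand-in for AbstractShaper */
class ToyAbstractShaper {
    public:
        virtual ~ToyAbstractShaper() = default;

        /* Remembers the glyph count returned by the implementation */
        unsigned int shape(const std::string& text) {
            return _glyphCount = doShape(text);
        }

        unsigned int glyphCount() const { return _glyphCount; }

        /* Checks the output size and calls the implementation only if there
           are any glyphs to retrieve */
        void glyphIdsInto(std::vector<unsigned int>& ids) const {
            assert(ids.size() == _glyphCount);
            if(_glyphCount) doGlyphIdsInto(ids);
        }

    private:
        virtual unsigned int doShape(const std::string& text) = 0;
        virtual void doGlyphIdsInto(std::vector<unsigned int>& ids) const = 0;

        unsigned int _glyphCount = 0;
};

/* Trivial implementation: one glyph per byte, glyph ID equal to byte value */
class ByteShaper: public ToyAbstractShaper {
    unsigned int doShape(const std::string& text) override {
        _text = text;
        return static_cast<unsigned int>(text.size());
    }

    void doGlyphIdsInto(std::vector<unsigned int>& ids) const override {
        /* No size check needed here, the public API already did it */
        for(std::size_t i = 0; i != _text.size(); ++i)
            ids[i] = static_cast<unsigned char>(_text[i]);
    }

    std::string _text;
};
```

With the checks living in the base class, the subclass can rely on the output being correctly sized and on never being asked for glyphs when there are none.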

Constructors, destructors, conversion operators

AbstractShaper(AbstractFont& font) explicit
Constructor.
AbstractShaper(AbstractShaper&) deleted
Copying is not allowed.
AbstractShaper(AbstractShaper&&) noexcept
Move constructor.

Public functions

auto operator=(AbstractShaper&) -> AbstractShaper& deleted
Copying is not allowed.
auto operator=(AbstractShaper&&) -> AbstractShaper& noexcept
Move assignment.
auto font() -> AbstractFont&
Font the shaper is originating from.
auto font() const -> const AbstractFont&
auto setScript(Script script) -> bool
Set text script.
auto setLanguage(Containers::StringView language) -> bool
Set text language.
auto setDirection(ShapeDirection direction) -> bool
Set direction the text is meant to be shaped in.
auto shape(Containers::StringView text, Containers::ArrayView<const FeatureRange> features = {}) -> UnsignedInt
Shape a text.
auto shape(Containers::StringView text, std::initializer_list<FeatureRange> features) -> UnsignedInt
auto shape(Containers::StringView text, UnsignedInt begin, UnsignedInt end, Containers::ArrayView<const FeatureRange> features = {}) -> UnsignedInt
Shape a slice of text.
auto shape(Containers::StringView text, UnsignedInt begin, UnsignedInt end, std::initializer_list<FeatureRange> features) -> UnsignedInt
auto glyphCount() const -> UnsignedInt
Count of glyphs produced by the last shape() call.
auto script() const -> Script
Script used for the last shape() call.
auto language() const -> Containers::StringView
Language used for the last shape() call.
auto direction() const -> ShapeDirection
Shape direction used for the last shape() call.
void glyphIdsInto(const Containers::StridedArrayView1D<UnsignedInt>& ids) const
Retrieve glyph IDs.
void glyphOffsetsAdvancesInto(const Containers::StridedArrayView1D<Vector2>& offsets, const Containers::StridedArrayView1D<Vector2>& advances) const
Retrieve glyph offsets and advances.
void glyphClustersInto(const Containers::StridedArrayView1D<UnsignedInt>& clusters) const
Retrieve glyph cluster IDs.

Private functions

auto doSetScript(Script script) -> bool virtual
Implementation for setScript()
auto doSetLanguage(Containers::StringView language) -> bool virtual
Implementation for setLanguage()
auto doSetDirection(ShapeDirection direction) -> bool virtual
Implementation for setDirection()
auto doShape(Containers::StringView text, UnsignedInt begin, UnsignedInt end, Containers::ArrayView<const FeatureRange> features) -> UnsignedInt pure virtual
Implementation for shape()
auto doScript() const -> Script virtual
Implementation for script()
auto doLanguage() const -> Containers::StringView virtual
Implementation for language()
auto doDirection() const -> ShapeDirection virtual
Implementation for direction()
void doGlyphIdsInto(const Containers::StridedArrayView1D<UnsignedInt>& ids) const pure virtual
Implementation for glyphIdsInto()
void doGlyphOffsetsAdvancesInto(const Containers::StridedArrayView1D<Vector2>& offsets, const Containers::StridedArrayView1D<Vector2>& advances) const pure virtual
Implementation for glyphOffsetsAdvancesInto()
void doGlyphClustersInto(const Containers::StridedArrayView1D<UnsignedInt>& clusters) const pure virtual
Implementation for glyphClustersInto()

Function documentation

Magnum::Text::AbstractShaper::AbstractShaper(AbstractFont& font) explicit

Constructor.

Parameters
font Font the shaper is originating from

const AbstractFont& Magnum::Text::AbstractShaper::font() const

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

bool Magnum::Text::AbstractShaper::setScript(Script script)

Set text script.

The script is used for all following shape() calls. If not called at all or if explicitly set to Script::Unspecified, the AbstractFont plugin may attempt to guess the script from the input text. The actual script used for shaping (if any) is queryable with script() const after shape() has been called.

Returns true if the plugin supports setting a script and the script is supported, false otherwise, in which case the shaping falls back to a generic behavior. See documentation of a particular plugin for more information.

bool Magnum::Text::AbstractShaper::setLanguage(Containers::StringView language)

Set text language.

The language is expected to be a BCP 47 language tag, either just the base tag such as "en" or "cs" alone, or further differentiating with a region subtag like for example "en-US" vs "en-GB".

The language is used for all following shape() calls. If not called at all or if explicitly set to an empty string, the AbstractFont plugin may attempt to guess the language from the input text or the execution environment, such as current locale. The actual language used for shaping (if any) is queryable with language() const after shape() has been called.

Returns true if the plugin supports setting a language and the language is supported, false otherwise, in which case the shaping falls back to a generic behavior. See documentation of a particular plugin for more information.

bool Magnum::Text::AbstractShaper::setDirection(ShapeDirection direction)

Set direction the text is meant to be shaped in.

The direction is used for all following shape() calls. If not called at all or if explicitly set to ShapeDirection::Unspecified, the AbstractFont plugin may attempt to guess the direction from the input text. The actual direction used for shaping (if any) is queryable with direction() const after shape() has been called.

Returns true if the plugin supports setting a direction and the direction is supported, false otherwise, in which case the shaping falls back to a generic behavior. See documentation of a particular font plugin for more information.

UnsignedInt Magnum::Text::AbstractShaper::shape(Containers::StringView text, Containers::ArrayView<const FeatureRange> features = {})

Shape a text.

Parameters
text Text in UTF-8
features Typographic features to apply for the whole text or its subranges

Expects that both begin and all FeatureRange::begin() are contained within text, and that end and all FeatureRange::end() are either contained within text or have a value of 0xffffffffu. Returns the number of shaped glyphs (which is also subsequently available through glyphCount() const) and updates the script() const, language() const and direction() const values.

Whether features are used depends on a particular AbstractFont plugin implementation and the font file itself as well — for example, a plugin may enable Feature::Kerning by default but the font may not even have appropriate tables for it included, in which case no kerning is performed. See documentation of a particular font plugin for more information.

UnsignedInt Magnum::Text::AbstractShaper::shape(Containers::StringView text, std::initializer_list<FeatureRange> features)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

UnsignedInt Magnum::Text::AbstractShaper::shape(Containers::StringView text, UnsignedInt begin, UnsignedInt end, Containers::ArrayView<const FeatureRange> features = {})

Shape a slice of text.

Parameters
text Text in UTF-8
begin Beginning byte in the input text
end (One byte after) the end byte in the input text
features Typographic features to apply for the whole text or its subranges

A variant of shape(Containers::StringView, Containers::ArrayView<const FeatureRange>) to be used when passing pieces of larger text with different shapers, for example when the script, language or direction changes in each piece or when the pieces are using a different font entirely. Compared to passing just the actually shaped slice of text this allows the implementation to perform shaping aware of surrounding context, such as picking correct glyphs for beginning, middle or end of a word or a paragraph.

UnsignedInt Magnum::Text::AbstractShaper::shape(Containers::StringView text, UnsignedInt begin, UnsignedInt end, std::initializer_list<FeatureRange> features)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

UnsignedInt Magnum::Text::AbstractShaper::glyphCount() const

Count of glyphs produced by the last shape() call.

If the last shape() call failed or it hasn't been called yet, returns 0.

Script Magnum::Text::AbstractShaper::script() const

Script used for the last shape() call.

May return Script::Unspecified if shape() hasn't been called yet or if the AbstractFont doesn't implement any script-specific behavior.

Containers::StringView Magnum::Text::AbstractShaper::language() const

Language used for the last shape() call.

May return an empty string if shape() hasn't been called yet or if the AbstractFont doesn't implement any language-specific behavior.

The returned view is generally neither Containers::StringViewFlag::Global nor Containers::StringViewFlag::NullTerminated and is only guaranteed to stay valid until the next setLanguage() or shape() call. Particular AbstractFont implementations may give better guarantees, see their documentation for more information.

ShapeDirection Magnum::Text::AbstractShaper::direction() const

Shape direction used for the last shape() call.

May return ShapeDirection::Unspecified if shape() hasn't been called yet or if the AbstractFont doesn't implement any direction-specific behavior.

The direction affects properties of advances coming from glyphOffsetsAdvancesInto() and cluster IDs coming from glyphClustersInto(), see particular ShapeDirection values for more information.

void Magnum::Text::AbstractShaper::glyphIdsInto(const Containers::StridedArrayView1D<UnsignedInt>& ids) const

Retrieve glyph IDs.

Parameters
ids out Where to put glyph IDs

The ids view is expected to have a size of glyphCount(). After calling this function, the ids are commonly looked up in or inserted into an AbstractGlyphCache. Offsets and advances corresponding to the IDs can be retrieved with glyphOffsetsAdvancesInto().

void Magnum::Text::AbstractShaper::glyphOffsetsAdvancesInto(const Containers::StridedArrayView1D<Vector2>& offsets, const Containers::StridedArrayView1D<Vector2>& advances) const

Retrieve glyph offsets and advances.

Parameters
offsets out Where to put glyph offsets
advances out Where to put glyph advances

The offsets and advances views are expected to have a size of glyphCount(). The offsets specify where to put the glyph relative to current cursor (which is then further offset for the particular glyph rectangle returned from the glyph cache) and advances specify in which direction to move the cursor for the next glyph. For direction() being ShapeDirection::LeftToRight or RightToLeft Y components of advances are 0.0f, for TopToBottom or BottomToTop X components of advances are 0.0f. Glyph IDs corresponding to the offsets and advances can be retrieved with glyphIdsInto().

void Magnum::Text::AbstractShaper::glyphClustersInto(const Containers::StridedArrayView1D<UnsignedInt>& clusters) const

Retrieve glyph cluster IDs.

Parameters
clusters out Where to put glyph clusters

The clusters view is expected to have a size of glyphCount(). The cluster IDs are used to map shaped glyphs back to the text passed to shape(). By default, the cluster ID sequence is monotonically non-decreasing or non-increasing based on direction(), with the IDs being byte positions in the original text corresponding to particular glyphs:

  • For plain ASCII text and with the shaper not performing any ligature substitutions, glyph decomposition or reordering, the glyphCount() will be equal to the shaped text byte count, with clusters being a sequence of {0, 1, 2, 3, …}, or additionally shifted if the begin parameter passed to shape() was non-zero.
  • For UTF-8 text and the shaper again not performing any ligature substitutions, glyph decomposition or reordering, the sequence will point to start bytes of multi-byte UTF-8 characters. For example {0, 1, 3, 4, 7, …}, assuming a two-byte UTF-8 character at byte 1 and a three-byte character at byte 4. The output will be similar if the shaper performs a ligature substitution (such as fi at byte 1 and ffl at byte 4 both turned into a ligature in an otherwise ASCII input).
  • If the shaper performs glyph decomposition, one character in the input may end up being multiple glyphs. For example {0, 1, 1, 3, 4, …}, assuming a two-byte UTF-8 character ě at byte 1 being decomposed into two glyphs, e and ˇ.
  • If the shaper performs glyph reordering, the cluster ID will become the whole range of bytes in which the reordering happened, to preserve monotonicity. For example {0, 1, 1, 1, 1, 4, …}, assuming glyphs corresponding to bytes 1 to 3 were swapped during shaping.

Certain shaper implementations may offer behavior where the monotonicity is not preserved or the mapping is not to the original input bytes. Such behavior is however never the default, always opt-in. See documentation of particular font plugins for more information.
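
The UTF-8 case from the list above can be reproduced without any shaper, which may be useful for checking expectations in tests. The sketch assumes a 1:1 mapping from characters to glyphs and a left-to-right direction; characterStartBytes() is a hypothetical helper, not a Magnum API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

/* Returns the byte position of every UTF-8 character start in `text`, which
   is what the cluster IDs look like if each character becomes exactly one
   glyph */
std::vector<unsigned int> characterStartBytes(const std::string& text) {
    /* Cluster IDs are 32-bit, so the byte positions have to fit */
    assert(text.size() < 0xffffffffu);

    std::vector<unsigned int> clusters;
    for(std::size_t i = 0; i != text.size(); ++i) {
        /* Continuation bytes have the form 10xxxxxx, any other byte starts a
           new character */
        if((static_cast<unsigned char>(text[i]) & 0xc0) != 0x80)
            clusters.push_back(static_cast<unsigned int>(i));
    }
    return clusters;
}
```

For an input with a two-byte character at byte 1 and a three-byte character at byte 4 the result starts with {0, 1, 3, 4, 7}, matching the second item of the list.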

bool Magnum::Text::AbstractShaper::doSetScript(Script script) virtual private

Implementation for setScript()

Default implementation does nothing and returns false.

bool Magnum::Text::AbstractShaper::doSetLanguage(Containers::StringView language) virtual private

Implementation for setLanguage()

Default implementation does nothing and returns false.

bool Magnum::Text::AbstractShaper::doSetDirection(ShapeDirection direction) virtual private

Implementation for setDirection()

Default implementation does nothing and returns false.

UnsignedInt Magnum::Text::AbstractShaper::doShape(Containers::StringView text, UnsignedInt begin, UnsignedInt end, Containers::ArrayView<const FeatureRange> features) pure virtual private

Implementation for shape()

The begin as well as all FeatureRange::begin() values are guaranteed to be within text, end as well as all FeatureRange::end() values are guaranteed to be either within text or have a value of 0xffffffffu.

Script Magnum::Text::AbstractShaper::doScript() const virtual private

Implementation for script()

Default implementation returns Script::Unspecified.

Containers::StringView Magnum::Text::AbstractShaper::doLanguage() const virtual private

Implementation for language()

Default implementation returns an empty string.

ShapeDirection Magnum::Text::AbstractShaper::doDirection() const virtual private

Implementation for direction()

Default implementation returns ShapeDirection::Unspecified.

void Magnum::Text::AbstractShaper::doGlyphIdsInto(const Containers::StridedArrayView1D<UnsignedInt>& ids) const pure virtual private

Implementation for glyphIdsInto()

The ids are guaranteed to have a size of glyphCount(). Called only if glyphCount() is not 0.

void Magnum::Text::AbstractShaper::doGlyphOffsetsAdvancesInto(const Containers::StridedArrayView1D<Vector2>& offsets, const Containers::StridedArrayView1D<Vector2>& advances) const pure virtual private

Implementation for glyphOffsetsAdvancesInto()

The offsets and advances are guaranteed to have a size of glyphCount(). Called only if glyphCount() is not 0.

void Magnum::Text::AbstractShaper::doGlyphClustersInto(const Containers::StridedArrayView1D<UnsignedInt>& clusters) const pure virtual private

Implementation for glyphClustersInto()

The clusters are guaranteed to have a size of glyphCount(). Called only if glyphCount() is not 0.