Corrade::Utility::Unicode namespace

Unicode utilities.

This library is built if CORRADE_WITH_UTILITY is enabled when building Corrade. To use this library with CMake, request the Utility component of the Corrade package and link to the Corrade::Utility target.

find_package(Corrade REQUIRED Utility)

# ...
target_link_libraries(your-app PRIVATE Corrade::Utility)

See also Downloading and building Corrade and Using Corrade with CMake for more information.

Functions

auto currentChar(Containers::StringView text, std::size_t cursor) -> Containers::Triple<char32_t, std::size_t, std::size_t> new in Git master
Current UTF-8 character.
auto nextChar(Containers::StringView text, std::size_t cursor) -> Containers::Pair<char32_t, std::size_t>
Next UTF-8 character.
auto prevChar(Containers::StringView text, std::size_t cursor) -> Containers::Pair<char32_t, std::size_t>
Previous UTF-8 character.
auto utf32(Containers::StringView text) -> Containers::Optional<Containers::Array<char32_t>>
Convert a UTF-8 string to UTF-32.
auto utf8(char32_t character, Containers::ArrayView4<char> result) -> std::size_t
Convert a UTF-32 character to UTF-8.
auto widen(Containers::StringView text) -> Containers::Array<wchar_t>
Widen a UTF-8 string for use with Windows Unicode APIs.
auto narrow(Containers::ArrayView<const wchar_t> text) -> Containers::String
Narrow a string to UTF-8 for use with Windows Unicode APIs.
auto narrow(const wchar_t* text) -> Containers::String

Function documentation

Containers::Triple<char32_t, std::size_t, std::size_t> Corrade::Utility::Unicode::currentChar(Containers::StringView text, std::size_t cursor) new in Git master

Current UTF-8 character.

Returns the Unicode codepoint of the character at cursor, the position where it starts (either cursor itself or up to three bytes before) and the position where it ends (between one and four bytes after cursor). Expects that cursor is less than text size. If the character is invalid, returns 0xffffffffu as the codepoint together with the positions of the same and the next byte. It's then up to the caller whether to treat this as a fatal error or to skip or replace the invalid character.
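The "up to three bytes before" rule follows from the UTF-8 layout: every byte after the first one of a character has the bit pattern 10xxxxxx, and a character has at most three such bytes. The following is a minimal standard-library-only sketch of that boundary search, not Corrade's implementation. The names isContinuationByte() and characterStart() are hypothetical, and the sketch performs no validation of the sequence it steps over.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of Corrade: true for bytes of the form
// 10xxxxxx, which only appear after the first byte of a character.
inline bool isContinuationByte(char c) {
    return (static_cast<unsigned char>(c) & 0xc0) == 0x80;
}

// Hypothetical helper: walks back from cursor over at most three
// continuation bytes to find where the enclosing character would start.
// Assumes cursor < text.size() and does not validate the sequence.
inline std::size_t characterStart(std::string_view text, std::size_t cursor) {
    std::size_t start = cursor;
    for(int i = 0; i != 3 && start != 0 && isContinuationByte(text[start]); ++i)
        --start;
    return start;
}
```

For the text "až€" (bytes 61, c5 be, e2 82 ac), a cursor of 2 maps to start 1 and a cursor of 5 maps to start 3.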

Containers::Pair<char32_t, std::size_t> Corrade::Utility::Unicode::nextChar(Containers::StringView text, std::size_t cursor)

Next UTF-8 character.

Returns the Unicode codepoint of the character at cursor and the position of the following character. Expects that cursor is less than text size. If the character is invalid, returns 0xffffffffu as the codepoint and the position of the next byte. It's then up to the caller whether to treat this as a fatal error or to skip or replace the invalid character. If cursor might point inside a UTF-8 character encoding, you can use currentChar() to retrieve its codepoint and the position of the next character.
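The contract above (codepoint plus next position, with 0xffffffffu and a one-byte advance on failure) lends itself to a simple loop in which the caller decides what to do with invalid input. The following is a standard-library-only sketch of such a loop, not Corrade's implementation. decodeNext(), decodeLossy() and InvalidCodepoint are hypothetical names, and for brevity the decoder assumes that overlong encodings and surrogates don't need to be rejected.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Marker for an invalid character, the value documented above.
constexpr char32_t InvalidCodepoint = 0xffffffffu;

// Hypothetical stand-in for a nextChar()-like decoder. Returns
// {codepoint, position of the following character}, or
// {InvalidCodepoint, cursor + 1} for a malformed sequence.
// Assumes cursor < text.size().
std::pair<char32_t, std::size_t> decodeNext(std::string_view text, std::size_t cursor) {
    const auto byteAt = [&](std::size_t i) {
        return static_cast<unsigned char>(text[i]);
    };
    const unsigned char first = byteAt(cursor);

    // The leading byte determines the sequence length and the top bits
    std::size_t length;
    char32_t codepoint;
    if(first < 0x80)                { length = 1; codepoint = first; }
    else if((first & 0xe0) == 0xc0) { length = 2; codepoint = first & 0x1f; }
    else if((first & 0xf0) == 0xe0) { length = 3; codepoint = first & 0x0f; }
    else if((first & 0xf8) == 0xf0) { length = 4; codepoint = first & 0x07; }
    else return {InvalidCodepoint, cursor + 1};

    // Sequence cut off by the end of the text
    if(cursor + length > text.size()) return {InvalidCodepoint, cursor + 1};

    // Each following byte has to be 10xxxxxx and contributes six bits
    for(std::size_t i = 1; i != length; ++i) {
        const unsigned char next = byteAt(cursor + i);
        if((next & 0xc0) != 0x80) return {InvalidCodepoint, cursor + 1};
        codepoint = (codepoint << 6) | (next & 0x3f);
    }
    return {codepoint, cursor + length};
}

// Decodes the whole text, replacing each invalid byte with U+FFFD.
std::u32string decodeLossy(std::string_view text) {
    std::u32string out;
    for(std::size_t cursor = 0; cursor != text.size(); ) {
        const std::pair<char32_t, std::size_t> next = decodeNext(text, cursor);
        out.push_back(next.first == InvalidCodepoint ? U'\ufffd' : next.first);
        cursor = next.second;
    }
    return out;
}
```

Replacing the U+FFFD substitution with an early return would give the strict behavior instead, where the first invalid character aborts the conversion.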

Containers::Pair<char32_t, std::size_t> Corrade::Utility::Unicode::prevChar(Containers::StringView text, std::size_t cursor)

Previous UTF-8 character.

Returns the Unicode codepoint of the character before cursor and its position. Expects that cursor is greater than 0 and less than or equal to text size. If the character is invalid, returns 0xffffffffu as the codepoint and the position of the previous byte. It's then up to the caller whether to treat this as a fatal error or to skip or replace the invalid character. If cursor might point inside a UTF-8 character encoding, you can use currentChar() to retrieve its codepoint and starting position.

Containers::Optional<Containers::Array<char32_t>> Corrade::Utility::Unicode::utf32(Containers::StringView text)

Convert a UTF-8 string to UTF-32.

If the text contains an invalid character, returns Containers::NullOpt. Iterate over the string with nextChar() instead if you need custom handling for invalid characters.

std::size_t Corrade::Utility::Unicode::utf8(char32_t character, Containers::ArrayView4<char> result)

Convert a UTF-32 character to UTF-8.

Parameters
character (in): UTF-32 character to convert
result (out): Where to put the UTF-8 result

Returns length of the encoding (1, 2, 3 or 4). If character is outside of the UTF-32 range, returns 0.
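The four possible lengths correspond to four codepoint ranges. The following is a standard-library-only sketch of such an encoder, not Corrade's implementation. encodeUtf8() is a hypothetical name, std::array stands in for Containers::ArrayView4, and the sketch assumes that "outside of the UTF-32 range" means a value above 0x10ffff.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a utf8()-like encoder. Writes the encoding of
// character into result and returns its length, or 0 if the value is above
// 0x10ffff (an assumption about what counts as out of range).
std::size_t encodeUtf8(char32_t character, std::array<char, 4>& result) {
    if(character < 0x80) {
        // 0xxxxxxx
        result[0] = static_cast<char>(character);
        return 1;
    }
    if(character < 0x800) {
        // 110xxxxx 10xxxxxx
        result[0] = static_cast<char>(0xc0 | (character >> 6));
        result[1] = static_cast<char>(0x80 | (character & 0x3f));
        return 2;
    }
    if(character < 0x10000) {
        // 1110xxxx 10xxxxxx 10xxxxxx
        result[0] = static_cast<char>(0xe0 | (character >> 12));
        result[1] = static_cast<char>(0x80 | ((character >> 6) & 0x3f));
        result[2] = static_cast<char>(0x80 | (character & 0x3f));
        return 3;
    }
    if(character < 0x110000) {
        // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        result[0] = static_cast<char>(0xf0 | (character >> 18));
        result[1] = static_cast<char>(0x80 | ((character >> 12) & 0x3f));
        result[2] = static_cast<char>(0x80 | ((character >> 6) & 0x3f));
        result[3] = static_cast<char>(0x80 | (character & 0x3f));
        return 4;
    }
    return 0;
}
```

Only the first bytes of result, up to the returned length, are written, so a caller would typically append just that prefix to its output.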

Containers::Array<wchar_t> Corrade::Utility::Unicode::widen(Containers::StringView text)

Widen a UTF-8 string for use with Windows Unicode APIs.

Converts a UTF-8 string to a wide-string (UTF-16) representation. The primary purpose of this API is easy interaction with Windows Unicode APIs, thus the function doesn't return char16_t but rather a wchar_t. If the text is not empty, the returned array contains a sentinel null terminator, i.e. one that is stored after the last element but not counted into the array size.
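In UTF-16, a codepoint above U+FFFF takes two code units (a surrogate pair), so the widened array can be longer than the number of characters. The following standard-library-only sketch shows that per-codepoint step. It is not Corrade's implementation: appendUtf16() is a hypothetical name, and char16_t with std::u16string is used instead of wchar_t so the sketch behaves the same on platforms where wchar_t is 32-bit.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends the UTF-16 code units of one codepoint.
// Assumes codepoint is at most 0x10ffff and is not itself a surrogate.
void appendUtf16(char32_t codepoint, std::u16string& out) {
    if(codepoint < 0x10000) {
        // Fits into a single code unit
        out.push_back(static_cast<char16_t>(codepoint));
    } else {
        // Split the 20 bits above 0x10000 into a high and a low surrogate
        const char32_t offset = codepoint - 0x10000;
        out.push_back(static_cast<char16_t>(0xd800 + (offset >> 10)));
        out.push_back(static_cast<char16_t>(0xdc00 + (offset & 0x3ff)));
    }
}
```

std::u16string has a terminator convention similar to the one described above: c_str()[size()] is a null code unit that isn't counted in size().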

Containers::String Corrade::Utility::Unicode::narrow(Containers::ArrayView<const wchar_t> text)

Narrow a string to UTF-8 for use with Windows Unicode APIs.

Converts a wide-string (UTF-16) to a UTF-8 representation. The primary purpose is easy interaction with Windows Unicode APIs, thus the function doesn't take char16_t but rather a wchar_t.

Containers::String Corrade::Utility::Unicode::narrow(const wchar_t* text)

This is an overloaded function, provided for convenience. It differs from the above function only in the argument it accepts. Expects that text is null-terminated.