Character sets define the mapping between numeric codes and characters. A numeric code
may use one or more bytes to represent a character.
Execution character sets
An execution character set is the set of characters available in an execution
environment. It is defined by the QNX Neutrino implementation.
The execution character sets include:
- Single-byte character set (type char) — Uses one byte to
store a character.
- Multibyte character set (type char) — Encodes characters as UTF-8; uses one or more
bytes to represent complex characters.
- Wide character set (type wchar_t) — Encodes characters as UTF-32.
- 16-bit character set (char16_t) — Stores Unicode encoded as
UTF-16.
- 32-bit character set (char32_t) — Stores Unicode encoded as
UTF-32.
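As a rough illustration of how the storage unit differs between these character sets, the following sketch stores the same two characters (U+00E9 and U+1F600) as UTF-8 bytes in a char array, as UTF-16 in a char16_t array, and as UTF-32 in a char32_t array. The array and function names are hypothetical, and the sketch assumes that the compiler encodes u"" and U"" literals as UTF-16 and UTF-32.

```c
#include <stddef.h>  /* size_t */
#include <uchar.h>   /* char16_t, char32_t */

/* The same text, U+00E9 followed by U+1F600, in three encodings.
 * The UTF-8 bytes are written out explicitly so that the result does not
 * depend on the source file encoding. */
static const char     text_utf8[]  = "\xC3\xA9\xF0\x9F\x98\x80";
static const char16_t text_utf16[] = u"\u00E9\U0001F600";
static const char32_t text_utf32[] = U"\u00E9\U0001F600";

/* Number of code units in each array, excluding the terminating zero. */
size_t utf8_units(void)
{
    return sizeof text_utf8 / sizeof text_utf8[0] - 1;
}

size_t utf16_units(void)
{
    return sizeof text_utf16 / sizeof text_utf16[0] - 1;
}

size_t utf32_units(void)
{
    return sizeof text_utf32 / sizeof text_utf32[0] - 1;
}
```

Under these assumptions, the two characters occupy six char units (two bytes plus four bytes), three char16_t units (one unit plus a surrogate pair), and two char32_t units.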
Note: QNX advises against using wide and multibyte character sets of type wchar_t because they are in an experimental state. QNX Neutrino ships the International Components for Unicode (ICU) libraries (libicu*), which you can use instead. For more information about ICU, see https://icu.unicode.org/home.
To understand how conversions between multibyte characters and characters of type
char16_t and char32_t are handled, see the functions c16rtomb() and mbrtoc16() and
their 32-bit counterparts, c32rtomb() and mbrtoc32(). These functions handle
conversions between UTF-8 and UTF-16 or UTF-32 strings.
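The following is a minimal sketch of a multibyte-to-UTF-16 conversion loop built on mbrtoc16(). The helper name mb_to_utf16() and its error convention are hypothetical and are not part of the C library. The sketch assumes that the multibyte encoding in effect is UTF-8; on other systems, you may need to select a UTF-8 locale with setlocale() before calling it with non-ASCII input.

```c
#include <stddef.h>  /* size_t */
#include <string.h>  /* memset, strlen */
#include <uchar.h>   /* char16_t, mbrtoc16 */
#include <wchar.h>   /* mbstate_t */

/* Hypothetical helper: convert a NUL-terminated multibyte string to UTF-16.
 * Returns the number of char16_t units written, not counting the
 * terminator, or (size_t)-1 if the input is invalid or incomplete, or if
 * the output buffer is too small. */
size_t mb_to_utf16(const char *in, char16_t *out, size_t out_cap)
{
    mbstate_t state;
    size_t remaining = strlen(in) + 1;  /* include the terminating NUL */
    size_t n_out = 0;

    memset(&state, 0, sizeof state);    /* start in the initial state */

    for (;;) {
        char16_t c16;
        size_t rc = mbrtoc16(&c16, in, remaining, &state);

        if (rc == (size_t)-1 || rc == (size_t)-2)
            return (size_t)-1;          /* invalid or truncated sequence */
        if (n_out >= out_cap)
            return (size_t)-1;          /* no room for the next unit */
        if (rc == 0) {                  /* converted the terminating NUL */
            out[n_out] = 0;
            return n_out;
        }
        out[n_out++] = c16;
        if (rc != (size_t)-3) {         /* -3: second half of a surrogate
                                           pair, no input was consumed */
            in += rc;
            remaining -= rc;
        }
    }
}
```

A caller might pass a fixed-size buffer, for example `char16_t buf[64]; size_t n = mb_to_utf16("Hi", buf, 64);`. The reverse direction could be sketched the same way with c16rtomb().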
Alphabetic escape sequences
Alphabetic escape sequences are strings in the execution character set that represent an
action rather than plain characters. These actions include backspace, vertical and
horizontal tab, new line, and so on. For more information, see https://en.cppreference.com/w/cpp/language/escape.
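As a small illustration, the sketch below maps each alphabetic escape sequence defined by the C language to a description of its action. The function name escape_name() and the sample string are hypothetical and chosen only for this example.

```c
#include <stddef.h>  /* NULL */

/* Each escape sequence in a literal is stored as a single character:
 * this array holds 'a', tab, 'b', new line, and the terminating NUL. */
static const char sample[] = "a\tb\n";

/* Hypothetical helper: describe the action that an alphabetic escape
 * character represents, or return NULL for any other character. */
const char *escape_name(char c)
{
    switch (c) {
    case '\a': return "alert";
    case '\b': return "backspace";
    case '\f': return "form feed";
    case '\n': return "new line";
    case '\r': return "carriage return";
    case '\t': return "horizontal tab";
    case '\v': return "vertical tab";
    default:   return NULL;
    }
}
```

Calling escape_name(sample[1]) returns "horizontal tab", while escape_name(sample[0]) returns NULL because 'a' is a plain character.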
Environment macros
When the types wchar_t, char16_t, and char32_t follow the Unicode standard, the compiler or C library can define the following environment macros for the implementation, which affect how character sets are handled:
- __STDC_ISO_10646__
- Defined when type wchar_t can hold the short identifier of a Unicode
character and the output of mbtowc() and mbrtowc() is
converted to Unicode. The <platform.h> header file defines this macro with the value
200009L.
- __STDC_MB_MIGHT_NEQ_WC__
- Indicates that a character in the basic character set (i.e., the single-byte
character set) may not have the same value when it is stored as type wchar_t. Neither the qcc compiler nor the C library defines this macro.
- __STDC_UTF_16__
- Indicates that type char16_t is UTF-16 encoded. This relates to the conversion
behavior of mbrtoc16(). The qcc compiler defines this
macro with the value of 1.
- __STDC_UTF_32__
- Indicates that type char32_t is UTF-32 encoded. This relates to the conversion
behavior of mbrtoc32(). The qcc compiler defines this
macro with the value of 1.
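One way to see which of these macros a given toolchain defines is to test them with the preprocessor. The sketch below collects the results in a bitmask. The function charset_macro_flags() and the CS_* constants are hypothetical names, and the sketch assumes that including standard headers such as <wchar.h> and <uchar.h> is enough to make the macros visible.

```c
#include <uchar.h>  /* char16_t, char32_t */
#include <wchar.h>  /* wchar_t support */

/* Hypothetical flag bits, one per environment macro. */
enum {
    CS_ISO_10646       = 1,
    CS_MB_MIGHT_NEQ_WC = 2,
    CS_UTF16           = 4,
    CS_UTF32           = 8
};

/* Return a bitmask describing which environment macros are defined
 * by the compiler and C library that built this code. */
unsigned charset_macro_flags(void)
{
    unsigned flags = 0;

#ifdef __STDC_ISO_10646__
    flags |= CS_ISO_10646;        /* wchar_t values are Unicode */
#endif
#ifdef __STDC_MB_MIGHT_NEQ_WC__
    flags |= CS_MB_MIGHT_NEQ_WC;  /* basic characters may differ in wchar_t */
#endif
#ifdef __STDC_UTF_16__
    flags |= CS_UTF16;            /* char16_t values are UTF-16 */
#endif
#ifdef __STDC_UTF_32__
    flags |= CS_UTF32;            /* char32_t values are UTF-32 */
#endif
    return flags;
}
```

Code that depends on a particular encoding can check the returned flags at run time, or test the same macros directly with #ifdef to select an implementation at compile time.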