ST::utf_conversion
Since string_theory 3.0.
#include <string_theory/utf_conversion>

| Name | Summary |
|---|---|
| utf_validation_t | Behavior for handling UTF validation in conversions | 

| Name | Summary |
|---|---|
| utf16_to_utf8 | Convert UTF-16 text to UTF-8 | 
| utf32_to_utf8 | Convert UTF-32 text to UTF-8 | 
| wchar_to_utf8 | Convert wide text to UTF-8 | 
| latin_1_to_utf8 | Convert Latin-1 text to UTF-8 | 
| utf8_to_utf16 | Convert UTF-8 text to UTF-16 | 
| utf32_to_utf16 | Convert UTF-32 text to UTF-16 | 
| wchar_to_utf16 | Convert wide text to UTF-16 | 
| latin_1_to_utf16 | Convert Latin-1 text to UTF-16 | 
| utf8_to_utf32 | Convert UTF-8 text to UTF-32 | 
| utf16_to_utf32 | Convert UTF-16 text to UTF-32 | 
| wchar_to_utf32 | Convert wide text to UTF-32 | 
| latin_1_to_utf32 | Convert Latin-1 text to UTF-32 | 
| utf8_to_wchar | Convert UTF-8 text to wide text | 
| utf16_to_wchar | Convert UTF-16 text to wide text | 
| utf32_to_wchar | Convert UTF-32 text to wide text | 
| latin_1_to_wchar | Convert Latin-1 text to wide text | 
| utf8_to_latin_1 | Convert UTF-8 text to Latin-1 | 
| utf16_to_latin_1 | Convert UTF-16 text to Latin-1 | 
| utf32_to_latin_1 | Convert UTF-32 text to Latin-1 | 
| wchar_to_latin_1 | Convert wide text to Latin-1 | 

| Name | Summary |
|---|---|
| ST_DEFAULT_VALIDATION | Default value for utf_validation_t values | 
These functions provide a standalone way to convert between string_theory's
supported character encodings, without having to go through ST::string.
- UTF-8
- UTF-16
- UTF-32 (or UCS4)
- Latin-1
- "wide" strings using the platform's native wchar_t type. These are assumed to be encoded as either UTF-16 or UTF-32, depending on the size of the wchar_t type. Other wide character encodings are not currently supported.
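
As a rough illustration of the wide-string assumption above, the sketch below reports how many wchar_t code units one code point would occupy when a 16-bit wchar_t is read as UTF-16 and a 32-bit wchar_t as UTF-32. The helper name wide_units_for is hypothetical and not part of string_theory.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of string_theory: number of wchar_t code
// units needed for one code point, assuming a 16-bit wchar_t holds UTF-16
// and a 32-bit wchar_t holds UTF-32.
inline std::size_t wide_units_for(char32_t code_point)
{
    static_assert(sizeof(wchar_t) == 2 || sizeof(wchar_t) == 4,
                  "this sketch only covers 16-bit and 32-bit wchar_t");
    assert(code_point <= 0x10FFFF);  // precondition: inside the Unicode range
    if (sizeof(wchar_t) == 2 && code_point > 0xFFFF)
        return 2;  // UTF-16 needs a surrogate pair above the BMP
    return 1;
}
```

On a platform with a 16-bit wchar_t, a code point such as U+1F600 takes two units. With a 32-bit wchar_t, every code point takes one.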
Since string_theory 3.0.
enum utf_validation_t
{
    assume_valid,
    substitute_invalid,
    check_validity
};

Options for dealing with invalid character sequences in encoding/decoding operations.

- assume_valid: Don't do any checking or substitution. Only use this value if you are certain the data is already correct for the target encoding.
- substitute_invalid: Replace invalid sequences with a substitute. For conversions to Unicode encodings, this will use the Unicode replacement character (U+FFFD). For conversions to Latin-1, this will use '?'.
- check_validity: Throw an ST::unicode_error exception if any invalid sequences are encountered in the source data. This is the default for most conversions.
- assert_validity (removed in 3.0): Call the string_theory assert handler if any invalid sequences are encountered in the source data.

Changed in 3.0:  Removed assert_validity.
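
The three remaining policies can be sketched on UTF-32 input, where "invalid" is taken to mean a value that is not a Unicode scalar value. This is a standalone illustration, not the library's implementation: the validation enum, is_scalar_value, and apply_validation are hypothetical names, std::runtime_error stands in for ST::unicode_error, and the library's own notion of an invalid sequence may differ.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for ST::utf_validation_t so the sketch compiles without string_theory.
enum class validation { assume_valid, substitute_invalid, check_validity };

// A Unicode scalar value is at most U+10FFFF and is not a surrogate
// (U+D800..U+DFFF).
constexpr bool is_scalar_value(char32_t cp)
{
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// Hypothetical helper: applies one of the three policies to UTF-32 input.
std::u32string apply_validation(const std::u32string &in, validation mode)
{
    std::u32string out;
    out.reserve(in.size());
    for (char32_t cp : in) {
        if (mode == validation::assume_valid || is_scalar_value(cp)) {
            out.push_back(cp);             // passed through unchanged
        } else if (mode == validation::substitute_invalid) {
            out.push_back(U'\uFFFD');      // Unicode replacement character
        } else {
            // std::runtime_error stands in for ST::unicode_error here.
            throw std::runtime_error("invalid code point in source data");
        }
    }
    return out;
}
```

Note that the assume_valid branch copies bad input straight through, which is why that value is only appropriate for data already known to be correct.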
| Signature | |
|---|---|
| ST::char_buffer ST::latin_1_to_utf8(const char *astr, size_t size) | (1) | 
| ST::char_buffer ST::latin_1_to_utf8(const char_buffer &astr) | (2) | 
Convert Latin-1 text to UTF-8.
Since string_theory 3.0.
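
To make the mapping concrete, here is a minimal standalone sketch of what a Latin-1 to UTF-8 conversion does. It is not the library's implementation: latin_1_to_utf8_sketch is a hypothetical name and returns std::string rather than ST::char_buffer. Every Latin-1 byte is a valid code point in U+0000..U+00FF, so a conversion of this kind has nothing to reject, which may be why the overloads above take no validation argument.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in, not the library's implementation: Latin-1 bytes
// 0x00..0x7F map to one UTF-8 byte, 0x80..0xFF map to two.
std::string latin_1_to_utf8_sketch(const char *astr, std::size_t size)
{
    assert(astr != nullptr || size == 0);
    std::string out;
    out.reserve(size * 2);  // worst case: every byte expands to two
    for (std::size_t i = 0; i < size; ++i) {
        const unsigned char ch = static_cast<unsigned char>(astr[i]);
        if (ch < 0x80) {
            out.push_back(static_cast<char>(ch));
        } else {
            out.push_back(static_cast<char>(0xC0 | (ch >> 6)));    // 110000xx
            out.push_back(static_cast<char>(0x80 | (ch & 0x3F)));  // 10xxxxxx
        }
    }
    return out;
}
```

With string_theory itself, the corresponding call would be ST::latin_1_to_utf8(astr, size), following signature (1) above.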
| Signature | |
|---|---|
| ST::utf16_buffer ST::latin_1_to_utf16(const char *astr, size_t size) | (1) | 
| ST::utf16_buffer ST::latin_1_to_utf16(const char_buffer &astr) | (2) | 
Convert Latin-1 text to UTF-16.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::utf32_buffer ST::latin_1_to_utf32(const char *astr, size_t size) | (1) | 
| ST::utf32_buffer ST::latin_1_to_utf32(const char_buffer &astr) | (2) | 
Convert Latin-1 text to UTF-32.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::wchar_buffer ST::latin_1_to_wchar(const char *astr, size_t size) | (1) | 
| ST::wchar_buffer ST::latin_1_to_wchar(const char_buffer &astr) | (2) | 
Convert Latin-1 text to wide text.  The returned buffer will be encoded as
either UTF-16 or UTF-32, depending on the size of the wchar_t type.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::char_buffer ST::utf8_to_latin_1(const char *utf8, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION, bool substitute_out_of_range = true) | (1) | 
| ST::char_buffer ST::utf8_to_latin_1(const char8_t *utf8, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION, bool substitute_out_of_range = true) | (2) | 
| ST::char_buffer ST::utf8_to_latin_1(const char_buffer &utf8, utf_validation_t validation = ST_DEFAULT_VALIDATION, bool substitute_out_of_range = true) | (3) | 
Convert UTF-8 text to Latin-1.  Any characters outside of the Latin-1 range
will be replaced by '?' if substitute_out_of_range is true, or will cause
an ST::unicode_error to be thrown otherwise.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::utf16_buffer ST::utf8_to_utf16(const char *utf8, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (1) | 
| ST::utf16_buffer ST::utf8_to_utf16(const char8_t *utf8, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (2) | 
| ST::utf16_buffer ST::utf8_to_utf16(const char_buffer &utf8, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (3) | 
Convert UTF-8 text to UTF-16.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::utf32_buffer ST::utf8_to_utf32(const char *utf8, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (1) | 
| ST::utf32_buffer ST::utf8_to_utf32(const char8_t *utf8, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (2) | 
| ST::utf32_buffer ST::utf8_to_utf32(const char_buffer &utf8, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (3) | 
Convert UTF-8 text to UTF-32.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::wchar_buffer ST::utf8_to_wchar(const char *utf8, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (1) | 
| ST::wchar_buffer ST::utf8_to_wchar(const char8_t *utf8, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (2) | 
| ST::wchar_buffer ST::utf8_to_wchar(const char_buffer &utf8, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (3) | 
Convert UTF-8 text to wide text.  The returned buffer will be encoded as either
UTF-16 or UTF-32, depending on the size of the wchar_t type.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::char_buffer ST::utf16_to_latin_1(const char16_t *utf16, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION, bool substitute_out_of_range = true) | (1) | 
| ST::char_buffer ST::utf16_to_latin_1(const utf16_buffer &utf16, utf_validation_t validation = ST_DEFAULT_VALIDATION, bool substitute_out_of_range = true) | (2) | 
Convert UTF-16 text to Latin-1.  Any characters outside of the Latin-1 range
will be replaced by '?' if substitute_out_of_range is true, or will cause
an ST::unicode_error to be thrown otherwise.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::char_buffer ST::utf16_to_utf8(const char16_t *utf16, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (1) | 
| ST::char_buffer ST::utf16_to_utf8(const utf16_buffer &utf16, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (2) | 
Convert UTF-16 text to UTF-8.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::utf32_buffer ST::utf16_to_utf32(const char16_t *utf16, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (1) | 
| ST::utf32_buffer ST::utf16_to_utf32(const utf16_buffer &utf16, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (2) | 
Convert UTF-16 text to UTF-32.
Since string_theory 3.0.
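
The validation argument matters for UTF-16 input because surrogates can appear unpaired. The sketch below decodes UTF-16 to UTF-32 and replaces any unpaired surrogate with U+FFFD, which approximates substitute_invalid behavior. It is a standalone illustration with a hypothetical function name, and the library may treat individual edge cases differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in, not the library's implementation: decode UTF-16,
// substituting U+FFFD for any surrogate that is not part of a valid pair.
std::u32string utf16_to_utf32_sketch(const std::u16string &utf16)
{
    std::u32string out;
    out.reserve(utf16.size());
    for (std::size_t i = 0; i < utf16.size(); ++i) {
        const char32_t unit = utf16[i];
        const bool is_high = unit >= 0xD800 && unit <= 0xDBFF;
        const bool is_low = unit >= 0xDC00 && unit <= 0xDFFF;
        if (!is_high && !is_low) {
            out.push_back(unit);  // BMP code point, stored as-is
        } else if (is_high && i + 1 < utf16.size()
                   && utf16[i + 1] >= 0xDC00 && utf16[i + 1] <= 0xDFFF) {
            const char32_t low = utf16[++i];  // consume the low surrogate too
            out.push_back(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00));
        } else {
            out.push_back(0xFFFD);  // unpaired surrogate
        }
    }
    return out;
}
```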
| Signature | |
|---|---|
| ST::wchar_buffer ST::utf16_to_wchar(const char16_t *utf16, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (1) | 
| ST::wchar_buffer ST::utf16_to_wchar(const utf16_buffer &utf16, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (2) | 
Convert UTF-16 text to wide text.  The returned buffer will be encoded as either
UTF-16 or UTF-32, depending on the size of the wchar_t type.  When wchar_t
is 16 bits, this will return an unmodified copy of the buffer.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::char_buffer ST::utf32_to_latin_1(const char32_t *utf32, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION, bool substitute_out_of_range = true) | (1) | 
| ST::char_buffer ST::utf32_to_latin_1(const utf32_buffer &utf32, utf_validation_t validation = ST_DEFAULT_VALIDATION, bool substitute_out_of_range = true) | (2) | 
Convert UTF-32 text to Latin-1.  Any characters outside of the Latin-1 range
will be replaced by '?' if substitute_out_of_range is true, or will cause
an ST::unicode_error to be thrown otherwise.
Since string_theory 3.0.
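
The effect of substitute_out_of_range can be sketched as follows. This is a standalone illustration rather than the library's code: utf32_to_latin_1_sketch is a hypothetical name, it returns std::string instead of ST::char_buffer, std::range_error stands in for ST::unicode_error, and the separate validation argument is left out.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in, not the library's implementation: code points up
// to U+00FF are copied as single bytes; anything larger is either replaced
// by '?' or reported as an error, depending on substitute_out_of_range.
std::string utf32_to_latin_1_sketch(const std::u32string &utf32,
                                    bool substitute_out_of_range = true)
{
    std::string out;
    out.reserve(utf32.size());
    for (char32_t cp : utf32) {
        if (cp <= 0xFF) {
            out.push_back(static_cast<char>(static_cast<unsigned char>(cp)));
        } else if (substitute_out_of_range) {
            out.push_back('?');
        } else {
            // std::range_error stands in for ST::unicode_error here.
            throw std::range_error("code point outside the Latin-1 range");
        }
    }
    return out;
}
```

Substitution is lossy: once a character has been replaced by '?', the original cannot be recovered from the Latin-1 output, so passing false is the safer choice when silent data loss is unacceptable.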
| Signature | |
|---|---|
| ST::char_buffer ST::utf32_to_utf8(const char32_t *utf32, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (1) | 
| ST::char_buffer ST::utf32_to_utf8(const utf32_buffer &utf32, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (2) | 
Convert UTF-32 text to UTF-8.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::utf16_buffer ST::utf32_to_utf16(const char32_t *utf32, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (1) | 
| ST::utf16_buffer ST::utf32_to_utf16(const utf32_buffer &utf32, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (2) | 
Convert UTF-32 text to UTF-16.
Since string_theory 3.0.
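
For readers unfamiliar with surrogate pairs, this is roughly what a UTF-32 to UTF-16 conversion has to do for code points above U+FFFF. The sketch is standalone, uses a hypothetical function name, and assumes the input is already valid (comparable to assume_valid). It does not reproduce the library's validation handling.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in, not the library's implementation: code points
// below U+10000 become one UTF-16 unit, the rest become a surrogate pair.
std::u16string utf32_to_utf16_sketch(const std::u32string &utf32)
{
    std::u16string out;
    out.reserve(utf32.size());
    for (char32_t cp : utf32) {
        assert(cp <= 0x10FFFF);  // precondition: inside the Unicode range
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            const char32_t v = cp - 0x10000;  // 20 bits to split in two
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low
        }
    }
    return out;
}
```

For example, U+1F600 becomes the pair 0xD83D 0xDE00, so the output can be up to twice as long as the input when counted in code units.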
| Signature | |
|---|---|
| ST::wchar_buffer ST::utf32_to_wchar(const char32_t *utf32, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (1) | 
| ST::wchar_buffer ST::utf32_to_wchar(const utf32_buffer &utf32, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (2) | 
Convert UTF-32 text to wide text.  The returned buffer will be encoded as either
UTF-16 or UTF-32, depending on the size of the wchar_t type.  When wchar_t
is 32 bits, this will return an unmodified copy of the buffer.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::char_buffer ST::wchar_to_latin_1(const wchar_t *wstr, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION, bool substitute_out_of_range = true) | (1) | 
| ST::char_buffer ST::wchar_to_latin_1(const wchar_buffer &wstr, utf_validation_t validation = ST_DEFAULT_VALIDATION, bool substitute_out_of_range = true) | (2) | 
Convert wide text to Latin-1.  wstr is assumed to be encoded as either
UTF-16 or UTF-32, depending on the size of the wchar_t type.  Any characters
outside of the Latin-1 range will be replaced by '?' if substitute_out_of_range
is true, or will cause an ST::unicode_error to be thrown otherwise.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::char_buffer ST::wchar_to_utf8(const wchar_t *wstr, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (1) | 
| ST::char_buffer ST::wchar_to_utf8(const wchar_buffer &wstr, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (2) | 
Convert wide text to UTF-8.  wstr is assumed to be encoded as either
UTF-16 or UTF-32, depending on the size of the wchar_t type.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::utf16_buffer ST::wchar_to_utf16(const wchar_t *wstr, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (1) | 
| ST::utf16_buffer ST::wchar_to_utf16(const wchar_buffer &wstr, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (2) | 
Convert wide text to UTF-16.  wstr is assumed to be encoded as either
UTF-16 or UTF-32, depending on the size of the wchar_t type.  When wchar_t
is 16 bits, this will return an unmodified copy of the buffer.
Since string_theory 3.0.
| Signature | |
|---|---|
| ST::utf32_buffer ST::wchar_to_utf32(const wchar_t *wstr, size_t size, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (1) | 
| ST::utf32_buffer ST::wchar_to_utf32(const wchar_buffer &wstr, utf_validation_t validation = ST_DEFAULT_VALIDATION) | (2) | 
Convert wide text to UTF-32.  wstr is assumed to be encoded as either
UTF-16 or UTF-32, depending on the size of the wchar_t type.  When wchar_t
is 32 bits, this will return an unmodified copy of the buffer.
Since string_theory 3.0.
#ifndef ST_DEFAULT_VALIDATION
#   define ST_DEFAULT_VALIDATION ST::check_validity
#endif

The default checking type for methods which do validity checking. It is possible to override the default by defining an alternate ST_DEFAULT_VALIDATION before including any string_theory headers. However, it is generally recommended to leave the default and explicitly set other values in method calls that need different behavior.
See also utf_validation_t
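
The override mechanism relies on the #ifndef guard shown above. The sketch below reproduces the pattern with stand-in names so that it compiles without string_theory: the demo namespace, DEMO_DEFAULT_VALIDATION, and effective_validation are all hypothetical.

```cpp
#include <cassert>

namespace demo {
// Stand-in for ST::utf_validation_t.
enum utf_validation_t { assume_valid, substitute_invalid, check_validity };
}

// A translation unit that wants a different default defines the macro first...
#define DEMO_DEFAULT_VALIDATION demo::substitute_invalid

// ...so the fallback definition (as a header would provide it) is skipped.
#ifndef DEMO_DEFAULT_VALIDATION
#   define DEMO_DEFAULT_VALIDATION demo::check_validity
#endif

// Hypothetical function whose default argument picks up the macro's value.
constexpr demo::utf_validation_t effective_validation(
    demo::utf_validation_t validation = DEMO_DEFAULT_VALIDATION)
{
    return validation;
}
```

With the real library, the equivalent would be a line such as #define ST_DEFAULT_VALIDATION ST::substitute_invalid placed before the first string_theory include. Defining it in one place for the whole build, for example through a compiler flag, avoids different translation units silently using different defaults.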