[LC++]How to generate UTF-8 output?
chris at cvine.freeserve.co.uk
Fri Nov 25 08:01:02 UTC 2005
On Thursday 24 November 2005 22:49, Chris Vine wrote:
> The interesting question is what wcrtomb() assumed about the wide character
> codeset it encountered - probably it did the sensible thing and assumed a
> UCS-4 codeset if you have a 4 byte wchar_t (Linux has a 4 byte wchar_t and
> Windows a 2 byte wchar_t), and that happened to match the assumption of your
> compiler in setting up the wide character string literal at compilation
> stage. To that extent, it looks to be a matter of luck that it worked.
Actually, on further thought it wasn't luck, because the compiler was using
the C library it knew about and both would have agreed on the wide character
codeset used - the wide character codeset would be implementation defined.
This use of printf() to convert wide characters to the user's current narrow
character locale is an interesting one I have not seen before, but is a way
of "hardwiring" text into source code in a way which guarantees it is
displayed in the codeset any user happens to use for narrow characters.
Normally this isn't an issue, as gettext() is used to convert between
languages and this will choose the correct narrow character representation,
but where you have a single language application, using printf() is a good
way of catering for different narrow character codesets. Did you get the
example from a textbook or someone else's code?
To use C++ streams, you would probably have to imbue a locale with a suitable
code conversion facet (std::codecvt) into a wide character stream such as
std::wcout, so that wide characters are converted to narrow characters on
output. Perhaps your compiler's library already does this - have you tried?