[LC++]How to generate UTF-8 output?
pfjan at yahoo.com.br
Sun Nov 27 17:04:02 UTC 2005
I've been playing with a morphological analyser and had a lot of trouble
using the C++ converter facet to convert between wstring (UCS-4 on
Linux) and string (UTF-8). Well, not really many problems, just one: it
simply didn't work, even after a long time spent reading and trying.
My guess is that it's just not implemented correctly yet. I went through
many C++ books on Safari (http://safari.oreilly.com/) and none gave good
(or working) descriptions of how to handle internationalization.
So I'm sticking to the C functions (the wcrtomb() and mbrtowc() family of
functions) for converting. I keep all my internal data in UCS-4 (wstring
and wchar_t) and convert back to the user's encoding just before printing.
If you have better luck, please share your experience with us :)
PS: I created a small library based on the following template:
    template <typename Target, typename Source>
    Target string_cast( const Source &src );
If you are interested, let me know and I can send you a copy (LGPL license,
I guess; I haven't copyrighted it yet).
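To give an idea of how a template with that signature might be filled in,
here is a rough sketch. Only the declaration comes from the library; the two
specializations below are an assumption about one possible implementation,
built on mbrtowc() and wcsrtombs(), and are not the library's actual code.

```cpp
#include <cstddef>    // std::size_t
#include <cwchar>     // std::mbrtowc, std::wcsrtombs, std::mbstate_t
#include <stdexcept>  // std::runtime_error
#include <string>
#include <vector>

// Primary template: declared only, each supported pair is a specialization.
template <typename Target, typename Source>
Target string_cast(const Source &src);

// Multibyte string (current locale's encoding) -> wide string.
template <>
std::wstring string_cast<std::wstring, std::string>(const std::string &src)
{
    std::mbstate_t state = std::mbstate_t();
    std::wstring out;
    const char *p = src.data();
    std::size_t left = src.size();

    while (left > 0) {
        wchar_t wc;
        std::size_t n = std::mbrtowc(&wc, p, left, &state);
        if (n == static_cast<std::size_t>(-1) ||
            n == static_cast<std::size_t>(-2))
            throw std::runtime_error("invalid or truncated multibyte sequence");
        if (n == 0)
            n = 1;  // an embedded '\0' was converted; it occupies one byte
        out.push_back(wc);
        p += n;
        left -= n;
    }
    return out;
}

// Wide string -> multibyte string (stops at the first L'\0', if any).
template <>
std::string string_cast<std::string, std::wstring>(const std::wstring &src)
{
    std::mbstate_t state = std::mbstate_t();
    const wchar_t *p = src.c_str();

    // With a null destination, wcsrtombs only reports the required length.
    std::size_t len = std::wcsrtombs(0, &p, 0, &state);
    if (len == static_cast<std::size_t>(-1))
        throw std::runtime_error(
            "character not representable in the current locale");

    std::vector<char> buf(len + 1);  // +1 for the terminating '\0'
    p = src.c_str();
    state = std::mbstate_t();
    std::wcsrtombs(&buf[0], &p, buf.size(), &state);
    return std::string(&buf[0], len);
}
```

A call then reads like a cast, e.g. string_cast<std::wstring>(line) for a
std::string named line. The argument has to be a std::string or std::wstring
object; a bare string literal would not match either specialization.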
>When I run this program, the output is as follows:
> torsten at linux3:~$ LANG=de_DE print2_utf8
>As you can see, the output stops at the German umlaut 'ö'. This is
>independent of the setting of $LANG.
>Who can show me a correct version of the above little C++ program?
> - Debian Sarge (stable)
> - gcc-3.3.5
> - libc6-2.3.2
> - Linux Kernel 2.4.24-1-686-smp
>Thanks for any hints,
>This is the Linux C++ Programming List:
>http://lists.linux.org.au/listinfo/tuxcpprogramming