[LC++]How to generate UTF-8 output?
Jan Pfeifer
pfjan at yahoo.com.br
Sun Nov 27 17:04:02 UTC 2005
hi Torsten,
I've been playing with a morphological analyser and had many problems
using the C++ conversion facet (codecvt) to convert between wstring
(UCS-4 on Linux) and string (UTF-8). Well, not many problems: it simply
didn't work, even after a long time reading and trying.
My guess is that it's just not implemented correctly yet. I went through
many C++ books on Safari (http://safari.oreilly.com/) and none gave a good
(or working) description of how to handle these internationalization
issues :(
So, I'm sticking to the C functions (the wcrtomb() and mbrtowc() family of
functions) for converting. I keep all my internal data in UCS-4 (wstring
and wchar_t) and convert back to the user's encoding just before printing.
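
To make the "convert just before printing" step concrete, here is a minimal sketch built on wcrtomb(). The function name to_multibyte is made up for this illustration and is not part of any library; the sketch assumes the program has already selected the user's locale (for example with std::setlocale(LC_ALL, "")), since wcrtomb() encodes into whatever multibyte encoding the current C locale uses.

```cpp
#include <climits>    // MB_LEN_MAX
#include <cwchar>     // std::wcrtomb, std::mbstate_t
#include <stdexcept>
#include <string>

// Hypothetical helper (name chosen for this sketch): convert a wide string
// to the multibyte encoding of the current C locale, one character at a time.
std::string to_multibyte(const std::wstring& wide)
{
    std::mbstate_t state = std::mbstate_t();  // start in the initial shift state
    std::string out;
    char buf[MB_LEN_MAX];                     // large enough for any single character

    for (std::wstring::size_type i = 0; i < wide.size(); ++i) {
        std::size_t n = std::wcrtomb(buf, wide[i], &state);
        if (n == static_cast<std::size_t>(-1))
            throw std::runtime_error(
                "character cannot be represented in the current locale");
        out.append(buf, n);                   // n bytes were written to buf
    }
    return out;
}
```

A program would typically call std::setlocale(LC_ALL, "") once at startup and then write to_multibyte(text) to std::cout. In the default "C" locale only plain ASCII is guaranteed to convert, so a non-ASCII character such as 'ö' may raise the exception above.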
If you have better luck, please share your experience with us :)
regards,
Jan
PS: I created a small library based on the following template:
template <typename Target, typename Source>
Target string_cast( const Source &src );
If you are interested, let me know, I can send you a copy (LGPL license
I guess, haven't copyrighted it yet).
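
For readers who want to see the idea without the library, here is a rough sketch of how such a template could be specialized. This is not the library mentioned above, only one guess at a possible shape; it assumes the narrow strings use the multibyte encoding of the current C locale, and the wide-to-narrow direction stops at the first embedded L'\0'.

```cpp
#include <cwchar>     // std::mbrtowc, std::wcsrtombs, std::mbstate_t
#include <stdexcept>
#include <string>
#include <vector>

// Primary template as declared in the post; only the specializations
// below are defined in this sketch.
template <typename Target, typename Source>
Target string_cast(const Source& src);

// Narrow (locale multibyte encoding) -> wide, using mbrtowc().
template <>
std::wstring string_cast<std::wstring, std::string>(const std::string& src)
{
    std::mbstate_t state = std::mbstate_t();
    std::wstring out;
    const char* p = src.data();
    std::size_t left = src.size();

    while (left > 0) {
        wchar_t wc;
        std::size_t n = std::mbrtowc(&wc, p, left, &state);
        if (n == static_cast<std::size_t>(-1) ||
            n == static_cast<std::size_t>(-2))
            throw std::runtime_error("invalid or truncated multibyte sequence");
        if (n == 0)
            n = 1;            // a '\0' was decoded; step over its single byte
        out.push_back(wc);
        p += n;
        left -= n;
    }
    return out;
}

// Wide -> narrow (locale multibyte encoding), using wcsrtombs().
template <>
std::string string_cast<std::string, std::wstring>(const std::wstring& src)
{
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* p = src.c_str();

    // With a null destination, wcsrtombs() only counts the bytes required.
    std::size_t len = std::wcsrtombs(0, &p, 0, &state);
    if (len == static_cast<std::size_t>(-1))
        throw std::runtime_error(
            "character cannot be represented in the current locale");

    std::vector<char> buf(len + 1);           // +1 for the terminating '\0'
    p = src.c_str();
    state = std::mbstate_t();
    std::wcsrtombs(&buf[0], &p, buf.size(), &state);
    return std::string(&buf[0], len);
}
```

Usage would look like string_cast<std::wstring>(some_std_string), with the Source type deduced from the argument. Passing a string literal directly would not match either specialization, so wrap it in std::string or std::wstring first.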
>When I run this program, the output is as follows:
> torsten at linux3:~$ LANG=de_DE print2_utf8
> loc1='de_DE'
> Sch
>
>As you can see, the output stops at the German umlaut 'ö'. This is
>independent of the setting of $LANG.
>
>What's wrong?
>Who can show me a correct version of the above little C++ program?
>
>I'm using:
> - Debian Sarge (stable)
> - gcc-3.3.5
> - libc6-2.3.2
> - Linux Kernel 2.4.24-1-686-smp
>
>Thanks for any hints,
>Torsten
>
>_______________________________________________
>This is the Linux C++ Programming List:
>http://lists.linux.org.au/listinfo/tuxcpprogramming
>
>