[LC++]How to generate UTF-8 output?

Torsten Rennett Torsten at rennett.de
Tue Nov 29 21:56:03 UTC 2005


Thank you, Chris, for your suggestions.

On Friday 25 November 2005 00:59, Chris Vine wrote:
> On Thursday 24 November 2005 22:49, Chris Vine wrote:
> ..., but where you have a single language
> application, using printf() is a good way of catering for different
> narrow character codesets.  Did you get the example from a textbook or
> someone else's code?

Yes, the C example is from this web page:
http://www.cl.cam.ac.uk/~mgk25/unicode.html#c

The C++ example is my own attempt to do the same thing in C++.
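
For readers without the original examples at hand, here is a minimal sketch of the C-library route written as C++. It is not the code from the page above, and the function names are made up for illustration. It assumes the C locale has been set from the environment; std::wcstombs then performs the same wide-to-multibyte conversion that printf("%ls") relies on.

```cpp
#include <clocale>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: adopt the character encoding named by the user's
// environment (LANG, LC_ALL, ...). Returns false if that locale is unknown.
bool adopt_environment_locale()
{
    return std::setlocale(LC_CTYPE, "") != nullptr;
}

// Hypothetical helper: convert a wide string to the multibyte encoding of
// the current C locale. Returns false if some character cannot be
// represented in that encoding.
bool wide_to_locale_bytes(const std::wstring& wide, std::string& out)
{
    // MB_CUR_MAX is the largest number of bytes a single character can
    // need in the current locale; +1 leaves room for the terminating NUL.
    std::vector<char> buf(wide.size() * MB_CUR_MAX + 1);
    const std::size_t n = std::wcstombs(buf.data(), wide.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1))
        return false;               // unrepresentable character
    out.assign(buf.data(), n);      // n excludes the terminating NUL
    return true;
}
```

If adopt_environment_locale() succeeds in a UTF-8 locale (de_DE.UTF-8, for example), the resulting bytes should be UTF-8. In the default "C" locale the conversion typically fails for anything outside ASCII, which is why the setlocale call matters.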

> To use C++ streams, you would probably have to imbue a code conversion
> facet into your narrow character stream to convert wide characters to
> narrow characters.  Perhaps your compiler already does this - have you
> tried?

Yes, I have. 
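
To illustrate the suggestion quoted above (this is not the code I actually tried), imbuing a conversion facet into a wide file stream could look roughly like the sketch below. The names utf8_out_facet and write_utf8_file are hypothetical, the UTF-8 encoder is hand-written rather than taken from any library, and it assumes that one wchar_t holds one complete Unicode code point, as with gcc on Linux.

```cpp
#include <cstdint>
#include <cwchar>
#include <fstream>
#include <locale>
#include <string>

// Hypothetical facet: encodes each wchar_t as UTF-8 on output.
// Assumption: a wchar_t holds one complete Unicode code point.
class utf8_out_facet : public std::codecvt<wchar_t, char, std::mbstate_t>
{
public:
    explicit utf8_out_facet(std::size_t refs = 0)
        : std::codecvt<wchar_t, char, std::mbstate_t>(refs) {}

protected:
    result do_out(std::mbstate_t&,
                  const wchar_t* from, const wchar_t* from_end,
                  const wchar_t*& from_next,
                  char* to, char* to_end, char*& to_next) const override
    {
        from_next = from;
        to_next = to;
        while (from_next != from_end) {
            const std::uint32_t c = static_cast<std::uint32_t>(*from_next);
            unsigned char bytes[4];
            int n;
            if (c < 0x80) {                       // 1 byte: plain ASCII
                bytes[0] = static_cast<unsigned char>(c);
                n = 1;
            } else if (c < 0x800) {               // 2 bytes
                bytes[0] = static_cast<unsigned char>(0xC0 | (c >> 6));
                bytes[1] = static_cast<unsigned char>(0x80 | (c & 0x3F));
                n = 2;
            } else if (c < 0x10000) {             // 3 bytes
                if (c >= 0xD800 && c <= 0xDFFF)
                    return error;                 // lone surrogate
                bytes[0] = static_cast<unsigned char>(0xE0 | (c >> 12));
                bytes[1] = static_cast<unsigned char>(0x80 | ((c >> 6) & 0x3F));
                bytes[2] = static_cast<unsigned char>(0x80 | (c & 0x3F));
                n = 3;
            } else if (c < 0x110000) {            // 4 bytes
                bytes[0] = static_cast<unsigned char>(0xF0 | (c >> 18));
                bytes[1] = static_cast<unsigned char>(0x80 | ((c >> 12) & 0x3F));
                bytes[2] = static_cast<unsigned char>(0x80 | ((c >> 6) & 0x3F));
                bytes[3] = static_cast<unsigned char>(0x80 | (c & 0x3F));
                n = 4;
            } else {
                return error;                     // beyond Unicode range
            }
            if (to_end - to_next < n)
                return partial;                   // output buffer is full
            for (int i = 0; i < n; ++i)
                *to_next++ = static_cast<char>(bytes[i]);
            ++from_next;
        }
        return ok;
    }

    // UTF-8 is stateless, so there is never a shift sequence to emit.
    result do_unshift(std::mbstate_t&, char* to, char*,
                      char*& to_next) const override
    {
        to_next = to;
        return noconv;
    }

    bool do_always_noconv() const noexcept override { return false; }
    int do_encoding() const noexcept override { return 0; }   // variable width
    int do_max_length() const noexcept override { return 4; }
};

// Hypothetical helper: write 'text' to 'path' as UTF-8 through a wide stream.
bool write_utf8_file(const char* path, const std::wstring& text)
{
    std::wofstream out;
    // Imbue before opening so the file buffer uses the facet from the start.
    out.imbue(std::locale(std::locale::classic(), new utf8_out_facet));
    out.open(path, std::ios::binary);
    if (!out)
        return false;
    out << text;
    out.close();
    return !out.fail();
}
```

Calling write_utf8_file("out.txt", L"Sch\u00F6ne Gr\u00FC\u00DFe") is then the file-stream counterpart of writing the same text to wcout. Note that this is a complete derived facet for the usual std::mbstate_t state type, which is the type a wide file buffer looks up in its locale.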

gcc-3.3.5 ships with 'class __enc_traits' and a partial specialization 
of class codecvt:
  template<typename _InternT, typename _ExternT>
  class codecvt<_InternT, _ExternT, __enc_traits>
    : public __codecvt_abstract_base<_InternT, _ExternT, __enc_traits>

You will find this in the header
/usr/include/c++/3.3/i486-linux/bits/codecvt_specializations.h

Documentation:
http://gcc.gnu.org/onlinedocs/libstdc++/22_locale/codecvt.html

This approach tries to merge C++ 'class codecvt' and the X/Open iconv(3), 
and I really think that's the way to go. The example in the documentation 
mentioned above uses the facet and 'class __enc_traits' directly, but not 
indirectly through iostreams (via imbue), as in
	wcout << L"Schöne Grüße";

As far as I have found so far, using the facet through iostreams like this 
is generally not possible with a partial specialization!  I'll comment more 
on this, and on a possible solution (which I'm currently testing), in a few 
days, as I'm quite busy at the moment.
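
For reference, the iconv(3) half of that combination can also be called on its own. The following is only a sketch under assumptions: the function name wide_to_utf8_iconv is made up, and "WCHAR_T" as an encoding name is a convention of glibc and GNU libiconv that other implementations may not accept. On some systems the second argument of iconv() is declared const char**, and linking may need -liconv.

```cpp
#include <iconv.h>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: convert a wide string to UTF-8 with iconv(3).
// Assumption: the iconv implementation knows the encoding name "WCHAR_T".
std::string wide_to_utf8_iconv(const std::wstring& wide)
{
    iconv_t cd = iconv_open("UTF-8", "WCHAR_T");   // (to, from)
    if (cd == reinterpret_cast<iconv_t>(-1))
        throw std::runtime_error("iconv_open failed");

    // iconv works on raw byte buffers; it reads but does not modify the input.
    char* in = reinterpret_cast<char*>(const_cast<wchar_t*>(wide.data()));
    std::size_t in_left = wide.size() * sizeof(wchar_t);

    // 6 bytes per wchar_t is a generous upper bound for the output.
    std::vector<char> buf(wide.size() * 6 + 1);
    char* out = buf.data();
    std::size_t out_left = buf.size();

    const std::size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);
    if (rc == static_cast<std::size_t>(-1))
        throw std::runtime_error("iconv conversion failed");

    // Bytes written = buffer size minus the space iconv reports as unused.
    return std::string(buf.data(), buf.size() - out_left);
}
```

Unlike the locale-based routes, this does not depend on the current locale at all; the target encoding is named explicitly in the iconv_open call.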

Torsten




