[LC++] How to generate UTF-8 output?

Torsten Rennett Torsten at rennett.de
Fri Nov 25 04:49:06 UTC 2005


Hi,

I want to generate UTF-8 output with C++, but it does not work.

This little C program works as expected:

    #include <locale.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
      (void)argc;
      (void)argv;

      if (!setlocale(LC_CTYPE, ""))
      {
        fprintf(stderr, "Can't set the specified locale! "
                "Check LANG, LC_CTYPE, LC_ALL.\n");
        return 1;
      }
      printf("%ls\n", L"Schöne Grüße");
      return 0;
    }

Call this program with the locale setting LANG=de_DE and the output will
be in ISO 8859-1. Call it with LANG=de_DE.UTF-8 and the output will be
in UTF-8.

    torsten@linux3:~$ LANG=de_DE print_utf8 | od -t x1
    0000000 53 63 68 f6 6e 65 20 47 72 fc df 65 0a
    0000015
    torsten@linux3:~$ LANG=de_DE.UTF-8 print_utf8 | od -t x1
    0000000 53 63 68 c3 b6 6e 65 20 47 72 c3 bc c3 9f 65 0a
    0000020
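
For reference, the bytes in the UTF-8 dump follow directly from the UTF-8
encoding rules, so they can be reproduced without any locale support. The
following is only a minimal sketch and not part of either program in this
post: to_utf8 is a hypothetical helper name, and it assumes that each
wchar_t holds one Unicode code point (true on glibc, where wchar_t is
32 bits wide).

```cpp
#include <string>

// Hypothetical helper, not part of the original programs.
// Assumes each wchar_t holds one Unicode code point (true on glibc, where
// wchar_t is 32 bits wide). Surrogates and values above U+10FFFF are not
// rejected; this is a sketch, not a validating encoder.
std::string to_utf8(const std::wstring &in)
{
  std::string out;
  for (std::wstring::size_type i = 0; i < in.size(); ++i)
  {
    const unsigned long cp = static_cast<unsigned long>(in[i]);
    if (cp < 0x80UL)
    {
      // 1 byte: 0xxxxxxx
      out += static_cast<char>(cp);
    }
    else if (cp < 0x800UL)
    {
      // 2 bytes: 110xxxxx 10xxxxxx
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000UL)
    {
      // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
      // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

For example, 'ö' is U+00F6, which falls in the two-byte range and becomes
c3 b6. Writing to_utf8(L"Sch\u00f6ne Gr\u00fc\u00dfe") to cout as a plain
byte string gives the same bytes as the LANG=de_DE.UTF-8 dump, without the
trailing newline.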


So far, so good. Now the same thing in C++:

    #include <locale>
    #include <iostream>
    using namespace std;

    int main(int argc, char *argv[])
    {
      (void)argc;
      (void)argv;

      try
      {
        // environment default (usually determined by $LANG)
        locale loc1("");
        cout << "loc1='" << loc1.name() << '\'' << endl;
        wcout.imbue(loc1);
      }
      catch (const exception &ex)
      {
        cerr << "FAILED! " << ex.what() << endl;
      }

      wcout << L"Schöne Grüße";
      cout << endl;

      return 0;
    }

When I run this program, the output is as follows:
    torsten@linux3:~$ LANG=de_DE print2_utf8
    loc1='de_DE'
    Sch

As you can see, the output stops at the German umlaut 'ö'. This happens
regardless of the setting of $LANG.

What is wrong here?
Can anyone show me a correct version of the small C++ program above?

I'm using:
    - Debian Sarge (stable)
    - gcc-3.3.5
    - libc6-2.3.2
    - Linux Kernel 2.4.24-1-686-smp

Thanks for any hints,
Torsten



