
Re: [PHPwestoz] StrÃngà chÃrÃctÃres ÂÂÂ



Adam Ashley wrote:

Nicolas Connault wrote:


I'm working on an I18n solution for my auction project, French-English. However, French is full of funny chÃrÃctÃrs which get into the database fine, but get output as basic characters when retrieved by PHP . . . (like ch?r?ct?rs).

<snip>

PHP itself is fully unicode compatible.

Actually, it's not. PHP is extremely ugly when it comes to Unicode. The manual says as much in its entry on strings:

A *string* <http://au2.php.net/manual/en/language.types.string.php> is a series of characters. In PHP, a character is the same as a byte, that is, there are exactly 256 different characters possible. This also implies that PHP has no native support of Unicode. See *utf8_encode()* <http://au2.php.net/manual/en/function.utf8-encode.php> and *utf8_decode()* <http://au2.php.net/manual/en/function.utf8-decode.php> for some Unicode support.

There are UTF-8 and Unicode helper functions, though, as listed above. (http://php.net/utf8-encode; http://php.net/utf8-decode)
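
As a rough sketch of what those two functions do (assuming a PHP version that still ships them): they only convert between ISO-8859-1 and UTF-8, so they help when your stored data is Latin-1 but not with arbitrary encodings. The sample word and variable names are my own, just for illustration:

```php
<?php
// "caractères" as ISO-8859-1 bytes: the è is the single byte 0xE8.
$latin1 = "caract\xE8res";

// utf8_encode() assumes ISO-8859-1 input and returns UTF-8,
// where è becomes the two bytes 0xC3 0xA8.
$utf8 = utf8_encode($latin1);

// strlen() counts bytes, not characters, which is the manual's point.
echo strlen($latin1), "\n"; // 10
echo strlen($utf8), "\n";   // 11

// utf8_decode() goes the other way, back to ISO-8859-1.
var_dump(utf8_decode($utf8) === $latin1); // bool(true)
```

Anything outside ISO-8859-1 comes back from utf8_decode() as a "?", which looks a lot like the ch?r?ct?rs you are seeing.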

To get the output to work you need to add the correct encoding metadata
headers to your html output.


This is important, whatever solution you use. Don't forget to add your content type and encoding manually:
header("Content-Type: text/html; charset=utf-8");
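
For what it's worth, here is a minimal sketch of how that might sit in a page, assuming the data is already UTF-8 by the time PHP has it. The render_title() helper and the sample string are hypothetical, standing in for whatever you fetch from your database:

```php
<?php
// Headers must go out before any output, so this is the first statement.
header("Content-Type: text/html; charset=utf-8");

// Hypothetical helper: escape a UTF-8 string for HTML output.
// Passing the charset stops htmlspecialchars() from guessing the encoding.
function render_title($title)
{
    return "<p>" . htmlspecialchars($title, ENT_QUOTES, "UTF-8") . "</p>\n";
}

// "Enchères" written out as UTF-8 bytes (è = 0xC3 0xA8), in place of a database row.
echo render_title("Ench\xC3\xA8res");
```

If the browser is told UTF-8 but the bytes are really ISO-8859-1 (or the other way round), you get the kind of garbage that ended up in the subject line, so the header and the data have to agree.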


Also have a look at this PHP class, which (apparently) does intelligent UTF-8 encoding:
http://www.phpclasses.org/browse/package/71.html


Personally, I've never dealt much with Unicode, but perhaps you could look at i18n packages and existing products (CMSes etc.) and see which encoding functions and methods they use?

Hope that helps,

-- Samuel Cochran
  sj26@sj26.com
  +61 4 1544 1909

P.S. Even Thunderbird had trouble with your e-mail title!
