unicode - mysql doesn't support all symbols from range U+0000..U+FFFF -
consider following table:
create table t1 (f1 varchar(255));
then, ruby
:
#!/usr/bin/env ruby require 'json' require 'sequel' require 'mysql2' db = sequel.connect( :adapter => 'mysql2', :database => 'd1', :user => '<user>', :password => '<password>', :encoding => 'utf8') v1 = '{"a":"b\ud83c\udf4ec"}' v2 = json.parse(v1) p v2['a'] db[:t1].truncate db[:t1].insert(f1: v2['a']); p db[:t1].first[:f1]
or php
:
#!/usr/bin/env php <?php $dbh = new pdo('mysql:dbname=d1', '<user>', '<password>', [ pdo::mysql_attr_init_command => 'set names utf8', pdo::attr_errmode => pdo::errmode_exception, ]); $dbh->exec('truncate table t1'); $v1 = '{"a":"b\ud83c\udf4ec"}'; $v2 = json_decode($v1); var_dump($v2->a); $sth = $dbh->prepare("insert t1 values (?)"); $sth->execute([$v2->a]); $sth = $dbh->query("select * t1"); var_dump($sth->fetch()['f1']);
what gets in database b
. i'm running mysql-5.1
, documentation says:
mysql 5.1 supports 2 character sets storing unicode data:
ucs2
, ucs-2 encoding of unicode character set using 16 bits per character.
utf8
, utf-8 encoding of unicode character set using 1 3 bytes per character.these 2 character sets support characters basic multilingual plane (bmp) of unicode version 3.0. bmp characters have these characteristics:
- their code values between 0 , 65535 (or
u+0000
..u+ffff
).
what doing wrong?
upd
$ mysql -bne 'show create table t1' d1 t1 create table `t1` (\n `f1` varchar(255) default null\n) engine=innodb default charset=utf8
it appears 2 escape sequences represent 1 character: red apple (u+1f34e). first 1 being surrogate. , surrogates are:
the ucs uses surrogates address characters outside initial basic multilingual plane without resorting more 16 bit byte representations.
so must it, resulting character outside bmp
. , not supported mysql
's utf8
character set such.
Comments
Post a Comment