| home / programming / mysql_data / 1 | [previous] |
|
|
Two forms of notation can be used to force a string literal to be interpreted with a given character set:
The _latin2 'abc' _latin2 0x616263 _latin2 X'616263' _utf8 'def' _utf8 0x646566 _utf8 X'646566' For quoted strings, whitespace is optional between the introducer and the following string. For hexadecimal values, whitespace is required. N must be followed immediately
by a quoted literal string with no intervening whitespace. |
Introducer notation works for literal quoted strings or hexadecimal constants, but not for
string expressions or column values. However, any string or string expression can be used
to produce a string in a designated character set using the CONVERT() function:
CONVERT(str USING charset);
Introducers and CONVERT() are not the same. An introducer does not change the string
value; it merely modifies how the string is interpreted. CONVERT() takes a string argument
and produces a new string in the desired character set.To see the difference
between introducers and CONVERT(), consider the following two statements that refer to
the ucs2 double-byte character set:
mysql> SET @s1 = _ucs2 'ABCD';
mysql> SET @s2 = CONVERT('ABCD' USING ucs2);
Assume that the default character set is latin1 (a single-byte character set).The first
statement interprets each pair of characters in the string 'ABCD' as a single double-byte
ucs2 character, resulting in a two-character ucs2 string.The second statement converts
each character of the string 'ABCD' to the corresponding ucs2 character, resulting in a
four-character ucs2 string.
What is the “length” of each string? It depends. If you measure with CHAR_LENGTH(),
you get the length in characters. If you measure with LENGTH(), you get the length in
bytes:

Here is a somewhat subtle point: A binary string is not the same thing as a non-binary string that has a binary collation. The binary string has no character set. It is interpreted with byte semantics and comparisons use single-byte numeric codes. A non-binary string with a binary collation has character semantics and comparisons use numeric character values that might be based on multiple bytes per character.
Here’s one way to see the difference between binary and non-binary strings with
regard to lettercase. Create a binary string and a non-binary string that has a binary collation,
and then pass each string to the UPPER() function:

Why doesn’t UPPER() convert the binary string to uppercase? This occurs because it has
no character set, so there is no way to know which byte values correspond to uppercase
or lowercase characters.To use a binary string with functions such as UPPER() and
LOWER(), you must first convert it to a non-binary string:

Excerpted from Chapter 3: Working with Data in MySQL from the MySQL, 3rd Edition by Paul Dubois. ISBN 0672326736, Copyright © 2005. Used with the permission of Sams Publishing.
| home / programming / mysql_data / 1 | [previous] |
Created: March 27, 2003
Revised: April 24, 2006
URL: http://webreference.com/programming/mysql_data/1