Firebird Documentation Index → Firebird 2.5 Language Ref. Update → Data types and subtypes → New collations

New collations


Added in: 1.0, 1.5, 1.5.1, 2.0, 2.1, 2.5

The following table lists the collations added in Firebird. The “Details” column is based on what has been reported in the Release Notes and other documents. The information in this column is probably incomplete; some collations with an empty Details field may still be case insensitive (ci), accent insensitive (ai) or dictionary-sorted (dic).

Please note that the default – binary – collations for new character sets are not listed here, as doing so would add no meaningful information.

Table 5.2. Collations new in Firebird

Character set  Collation        Language               Details  Added in
-------------  ---------------  ---------------------  -------  --------
CP943C         CP943C_UNICODE   Japanese                        2.1
GB18030        GB18030_UNICODE  Chinese                         2.5
GBK            GBK_UNICODE      Chinese                         2.1
ISO8859_1      ES_ES_CI_AI      Spanish                ci, ai   2.0
               FR_FR_CI_AI      French                 ci, ai   2.1
               PT_BR            Brazilian Portuguese   ci, ai   2.0
ISO8859_2      CS_CZ            Czech                           1.0
               ISO_HUN          Hungarian                       1.5
               ISO_PLK          Polish                          2.0
ISO8859_13     LT_LT            Lithuanian                      1.5.1
UTF8           UCS_BASIC        All                             2.0
               UNICODE          All                    dic      2.0
               UNICODE_CI       All                    ci       2.1
               UNICODE_CI_AI    All                    ci, ai   2.5
WIN1250        BS_BA            Bosnian                         2.0
               PXW_HUN          Hungarian              ci       1.0
               WIN_CZ           Czech                  ci       2.0
               WIN_CZ_CI_AI     Czech                  ci, ai   2.0
WIN1251        WIN1251_UA       Ukrainian and Russian           1.5
WIN1252        WIN_PTBR         Brazilian Portuguese   ci, ai   2.0
WIN1257        WIN1257_EE       Estonian               dic      2.0
               WIN1257_LT       Lithuanian             dic      2.0
               WIN1257_LV       Latvian                dic      2.0
KOI8R          KOI8R_RU         Russian                dic      2.0
KOI8U          KOI8U_UA         Ukrainian              dic      2.0
TIS620         TIS620_UNICODE   Thai                            2.1


A note on the UTF8 collations

The UCS_BASIC collation sorts in Unicode code-point order: A, B, a, b, á... This is exactly the same as UTF8 with no collation specified. UCS_BASIC was added to comply with the SQL standard.

The UNICODE collation sorts using UCA (Unicode Collation Algorithm): a, A, á, b, B...

UNICODE_CI is truly case-insensitive. In a search for e.g. 'Apple', it will also find 'apple', 'APPLE' and 'aPPLe'.

UNICODE_CI_AI is accent-insensitive as well. According to this collation, 'APPEL' equals 'Appèl'.
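
The four UTF8 collations described above can be compared side by side with a small script. This is only a sketch: the table FRUIT and its contents are hypothetical, and it assumes a Firebird 2.5 database, a UTF8 connection character set, and a script file saved as UTF-8 (so that the accented literal is read correctly).

```sql
-- Hypothetical demo table; with no COLLATE clause the column uses
-- the default binary ordering of UTF8.
create table fruit (
  name varchar(20) character set UTF8
);

insert into fruit (name) values ('apple');
insert into fruit (name) values ('Apple');
insert into fruit (name) values ('APPLE');
insert into fruit (name) values ('Appèl');
commit;

-- Unicode code-point order: uppercase letters sort before lowercase.
select name from fruit order by name collate UCS_BASIC;

-- Unicode Collation Algorithm (dictionary-style) order.
select name from fruit order by name collate UNICODE;

-- Case-insensitive comparison: matches 'apple', 'Apple' and 'APPLE'.
select name from fruit where name collate UNICODE_CI = 'Apple';

-- Case- and accent-insensitive comparison: matches 'Appèl'.
select name from fruit where name collate UNICODE_CI_AI = 'APPEL';
```

If a column is always going to be compared one way, the collation can be put in the column definition instead (for example `varchar(20) character set UTF8 collate UNICODE_CI`), so that the COLLATE clause is not needed in every query.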

Unicode collations for all character sets

Added in: 2.1

Firebird now comes with UNICODE collations for all the standard character sets. However, except for the ones listed in the new collations table in the previous section, these collations are not automatically available in your databases. Instead, they must be added with the CREATE COLLATION statement, like this:

create collation ISO8859_1_UNICODE for ISO8859_1

The new Unicode collations all have the name of their character set with _UNICODE added. (The built-in Unicode collations for UTF8 are the exception to the rule.) They are defined, along with the other collations, in the manifest file fbintl.conf in Firebird's intl subdirectory.
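
As a rough illustration of the whole workflow, the sketch below registers the collation, checks the system tables to confirm it is now known to the database, and then uses it in a column definition. The table CUSTOMERS and its column are hypothetical; the system table and column names (RDB$COLLATIONS, RDB$CHARACTER_SETS) are the standard Firebird metadata tables.

```sql
-- Register the Unicode collation for ISO8859_1 in the current database.
create collation ISO8859_1_UNICODE for ISO8859_1;
commit;

-- List the collations currently registered for ISO8859_1.
select c.rdb$collation_name
from rdb$collations c
join rdb$character_sets cs
  on cs.rdb$character_set_id = c.rdb$character_set_id
where cs.rdb$character_set_name = 'ISO8859_1'
order by c.rdb$collation_name;

-- Hypothetical table whose column sorts and compares with the new collation.
create table customers (
  last_name varchar(40) character set ISO8859_1 collate ISO8859_1_UNICODE
);
```

Running the SELECT before and after the CREATE COLLATION statement is a simple way to see which collations still need to be added to a given database.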

Collations may also be registered under a user-chosen name, e.g.:

create collation LAT_UNI for ISO8859_1 from external ('ISO8859_1_UNICODE')
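
A user-chosen name is mostly useful when the collation is registered with extra attributes. The sketch below assumes the optional CASE INSENSITIVE clause of CREATE COLLATION; the names LAT_UNI_CI and CUSTOMERS are made up for the example.

```sql
-- Register a case-insensitive variant of the ISO8859_1 Unicode collation
-- under a hypothetical, user-chosen name.
create collation LAT_UNI_CI for ISO8859_1
  from external ('ISO8859_1_UNICODE')
  case insensitive;
commit;

-- Apply it ad hoc in a query on a hypothetical ISO8859_1 column.
select last_name
from customers
order by last_name collate LAT_UNI_CI;

-- Remove the collation again once no column or query depends on it.
drop collation LAT_UNI_CI;
```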

See CREATE COLLATION for the full syntax.
