New collations

Added in: 1.0, 1.5, 1.5.1, 2.0, 2.1, 2.5

The following table lists the collations added in Firebird. The “Details” column reflects what the Release Notes and other documents report and is probably incomplete: some collations with an empty Details field may still be case-insensitive (ci), accent-insensitive (ai) or dictionary-sorted (dic).

Please note that the default – binary – collations for new character sets are not listed here, as doing so would add no meaningful information.

Table 5.2. Collations new in Firebird

Character set	Collation	Language	Details	Added in
CP943C	CP943C_UNICODE	Japanese		2.1
GB18030	GB18030_UNICODE	Chinese		2.5
GBK	GBK_UNICODE	Chinese		2.1
ISO8859_1	ES_ES_CI_AI	Spanish	ci, ai	2.0
	FR_FR_CI_AI	French	ci, ai	2.1
	PT_BR	Brazilian Portuguese	ci, ai	2.0
ISO8859_2	CS_CZ	Czech		1.0
	ISO_HUN	Hungarian		1.5
	ISO_PLK	Polish		2.0
ISO8859_13	LT_LT	Lithuanian		1.5.1
UTF8	UCS_BASIC	All		2.0
	UNICODE	All	dic	2.0
	UNICODE_CI	All	ci	2.1
	UNICODE_CI_AI	All	ci, ai	2.5
WIN1250	BS_BA	Bosnian		2.0
	PXW_HUN	Hungarian	ci	1.0
	WIN_CZ	Czech	ci	2.0
	WIN_CZ_CI_AI	Czech	ci, ai	2.0
WIN1251	WIN1251_UA	Ukrainian and Russian		1.5
WIN1252	WIN_PTBR	Brazilian Portuguese	ci, ai	2.0
WIN1257	WIN1257_EE	Estonian	dic	2.0
	WIN1257_LT	Lithuanian	dic	2.0
	WIN1257_LV	Latvian	dic	2.0
KOI8R	KOI8R_RU	Russian	dic	2.0
KOI8U	KOI8U_UA	Ukrainian	dic	2.0
TIS620	TIS620_UNICODE	Thai		2.1
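Because the table above may be incomplete, it can be useful to check which collations a given database actually knows. The following query is a minimal sketch that joins the system tables RDB$COLLATIONS and RDB$CHARACTER_SETS. It only lists names and says nothing about the ci, ai or dic properties of a collation.

```sql
-- List every collation registered in the current database,
-- grouped by the character set it belongs to.
select
  cs.rdb$character_set_name as charset_name,
  co.rdb$collation_name     as collation_name
from rdb$collations co
join rdb$character_sets cs
  on cs.rdb$character_set_id = co.rdb$character_set_id
order by cs.rdb$character_set_name, co.rdb$collation_name;
```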

A note on the UTF8 collations

The UCS_BASIC collation sorts in Unicode code-point order: A, B, a, b, á... This is exactly the same as UTF8 with no collation specified. UCS_BASIC was added to comply with the SQL standard.

The UNICODE collation sorts using UCA (Unicode Collation Algorithm): a, A, á, b, B...
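The difference between the two orderings can be tried out with a small test table. This is only a sketch: the table `words` and its column are hypothetical, and the connection character set is assumed to be UTF8 so that the non-ASCII literal arrives intact.

```sql
-- Hypothetical table for illustration; the column uses character set UTF8.
create table words (
  word varchar(20) character set utf8
);
commit;

insert into words (word) values ('a');
insert into words (word) values ('A');
insert into words (word) values ('b');
insert into words (word) values ('B');
insert into words (word) values ('á');
commit;

-- Code-point order, described above as A, B, a, b, á
select word from words order by word collate ucs_basic;

-- UCA order, described above as a, A, á, b, B
select word from words order by word collate unicode;
```

Specifying the collation in the ORDER BY clause leaves the column definition untouched, so both orderings can be compared on the same data.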

UNICODE_CI is case-insensitive: a search for 'Apple' also finds 'apple', 'APPLE' and 'aPPLe'.

UNICODE_CI_AI is accent-insensitive as well. According to this collation, 'APPEL' equals 'Appèl'.
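The insensitive collations can be applied per comparison with a COLLATE clause. The sketch below uses a hypothetical table `fruit` and assumes a UTF8 connection character set. UNICODE_CI requires Firebird 2.1 or higher and UNICODE_CI_AI requires 2.5 or higher, as listed in the table.

```sql
-- Hypothetical table for illustration.
create table fruit (
  name varchar(20) character set utf8
);
commit;

insert into fruit (name) values ('apple');
insert into fruit (name) values ('APPLE');
insert into fruit (name) values ('Appèl');
commit;

-- Case-insensitive comparison: per the description above,
-- this should match 'apple' and 'APPLE'.
select name from fruit
where name collate unicode_ci = 'Apple';

-- Case- and accent-insensitive comparison: per the description above,
-- 'APPEL' is considered equal to 'Appèl'.
select name from fruit
where name collate unicode_ci_ai = 'APPEL';
```

If a column should always compare this way, the collation can instead be given in the column definition, e.g. `name varchar(20) character set utf8 collate unicode_ci`.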

Unicode collations for all character sets

Added in: 2.1

Since version 2.1, Firebird comes with UNICODE collations for all the standard character sets. However, except for the ones listed in the new collations table in the previous section, these collations are not automatically available in your databases. Each one must first be registered in the database with the CREATE COLLATION statement, like this:

create collation ISO8859_1_UNICODE for ISO8859_1

The new Unicode collations all have the name of their character set with _UNICODE added. (The built-in Unicode collations for UTF8 are the exception to the rule.) They are defined, along with the other collations, in the manifest file fbintl.conf in Firebird's intl subdirectory.
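Once registered, such a collation can be used like any other. The following sketch registers the ISO8859_1 Unicode collation and makes it the collation of a column. The table `customers` is a hypothetical example, and the CREATE COLLATION statement needs to be executed only once per database.

```sql
-- Register the collation (once per database).
create collation ISO8859_1_UNICODE for ISO8859_1;
commit;

-- Hypothetical table: the column sorts and compares
-- according to the newly registered collation.
create table customers (
  name varchar(40) character set ISO8859_1 collate ISO8859_1_UNICODE
);
commit;

-- No COLLATE clause needed here; the column's collation applies.
select name from customers order by name;
```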

Collations may also be registered under a user-chosen name, e.g.:

create collation LAT_UNI for ISO8859_1 from external ('ISO8859_1_UNICODE')
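A collation registered under a user-chosen name is referenced by that name. The sketch below assumes a hypothetical table `customers` with an ISO8859_1 column, looks the new collation up in the system table RDB$COLLATIONS, and finally removes it again with DROP COLLATION.

```sql
-- Register the external collation under a name of our own choosing.
create collation LAT_UNI for ISO8859_1 from external ('ISO8859_1_UNICODE');
commit;

-- Confirm that the collation is now known to the database.
select rdb$collation_name
from rdb$collations
where rdb$collation_name = 'LAT_UNI';

-- Use it by its registered name (hypothetical table and column).
select name from customers order by name collate LAT_UNI;

-- Unregister the collation when it is no longer used by any object.
drop collation LAT_UNI;
commit;
```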

See CREATE COLLATION for the full syntax.