Convert Unicode Hex Codepoint to Unicode Character in SQL Server

The Unicode standard uses the notation U+[x]xxxx to identify each Unicode character, i.e. “U+” followed by 4 or 5 hex digits.

If we remove the first 2 characters (as in the UCD), only hex digits will remain, and we need to convert them to an integer. Using this HexStrToVarBinary function, we get the binary representation of the hex number, which can than be CONVERTed to an int or bigint.

Next, we use the built-in NCHAR() function to create an NCHAR value of the int.

        CONVERT(INT, dbo.HexStrToVarBinary(UniHex)) AS UniInt,
        NCHAR(CONVERT(INT, dbo.HexStrToVarBinary(UniHex))) AS UniNChar
FROM    UniHexTable

Note that the NCHAR() function only works with int arguments from 0 to 65535, so surrogate characters are not handled, and NCHAR() returns NULL.

SQL Server stores Unicode data encoded as UCS-2, and therefore does not support supplementary characters (surrogate pairs).

Unicode Versions supported in SQL Server

SQL Server Unicode
2000 3.0 ?
2005 3.2
2008 5.0

(I did not find an explicit statement on SQL2000, so this is a guess based on the release of SQL Server 2000 and the release date of Unicode 3.0)

One Response to Convert Unicode Hex Codepoint to Unicode Character in SQL Server

  1. […] Versions supported in SQL Server Update to a previous post on SQL Server and […]

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: