“Why can’t I store/display [insert name] language data in SQL Server?”

I came across this and similar questions so many times on Stack Overflow:

My guess is that most of these question are due to a lack of understanding of Unicode and how the various components of an application must work together to display Unicode texts correctly.


If you store Unicode data in the database, you have to use the Unicode data types NVARCHAR(n), NVARCHAR(MAX), NTEXT (obsolete) or NCHAR (fixed-length strings). Unicode data is stored in SQL Server using the UCS-2 encoding, meaning every character uses 2 bytes to store, unless the database is created using an Supplementary Characters collation, causing the database to use the UTF-16 encoding (same encoding for the BMP, but support for all Unicode planes).

Except for the _SC collations, defining a collation does not have an influence on how Unicode data is stored.

In general, a collation defines how text data is compared and sorted. The collation can be set on database level (default for all text columns), on column level, and on operand level (i.e. the value of a comparison or ORDER BY clause) using the COLLATE clause.

Unicode Values

Unicode values can either be variables (or parameters) declared with the Unicode data types listed above, or a Unicode string literals using the N” notation. Most of the times, the questions above arise from either writing literals without the N”, or from believing that variable names must be attached the N somehow.

Let’s look at some examples:

  • SELECTing string literals
SELECT N'тест', 'тест'
тест    ????
  • Assigning (non-)Unicode values
declare @nvc1 nvarchar(50), @nvc2 nvarchar(50)
declare @vc1 varchar(50), @vc2 varchar(50)

set @nvc1 = N'тест'
set @nvc2 = 'тест'
set @vc1 = N'тест'
set @vc2 = 'тест'

print @nvc1
print @nvc2
print @vc1
print @vc2

The Unicode value only “survives” if every assignment is Unicode-enabled.

  • Storing from C#

.Net by definition supports Unicode. If you store data into or retrieve data from a Unicode-enabled column using either SqlCommand or a typical DAL/ORM component, you should not even notice that you are dealing with Unicode rather than plain ASCII.

Just be aware that in case of SqlCommand, you better use parameterized statements and SqlParameter rather than concatenating the complete statement including parameter values. (This approach is also most common to avoid SQL injections)

Displaying Unicode data

Next problem: You successfully stored Unicode data, but SELECTing in SSMS only returns lots of question marks or boxes. The reason is simple: SSMS cannot display the characters.

But what is required actually necessary to display Unicode data:

  • a Unicode font covering the characters you want to retrieve
  • a font rendering engine as part of the operating system capable of rendering the character string correctly (especially for right-to-left languages, and some South-East Asian languages)
  • an application that makes use of both

I know from my experience that SSMS 2005 was not capable by default to display some Unicode characters (sorry, can’t remember which) even though fonts were installed. It probably had to do with the fonts defined for the editor or the result pane.

As a fact, SSMS 2008 on Windows 7 does not have any problems in the results pane, and can even display supplementary characters.

If you still have problems display Unicode characters, make sure to check whether the data is stored correctly by analyzing the numerical values of the Unicode string using the UNICODE function:

declare @t nvarchar(4000)
select @t = SomeUnicodeColumn from MyTable where ...
select unicode(substring( @t, 1, 1))
select unicode(substring( @t, 2, 2))
select unicode(substring( @t, 3, 3))
select unicode(substring( @t, 4, 4))

If the values are greater than 128 (or 256, depending on language), your data is stored as Unicode.

One Response to “Why can’t I store/display [insert name] language data in SQL Server?”

  1. […] should always be aware that if you translate between code pages, not every character can be mapped to a character in the target code page, and some mappings will […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: