Why do I need the [Flags] Attribute?

August 19, 2011

As a developer, you often take things for granted because they are “accepted best practices”. Then somebody asks about the most basic and simplest things, “why do I need [X]?”, and you think, well, that’s obvious, until you realize the (correct) answer was not so obvious at all, especially if the documentation is not accurate.

It happened to me when I came across the SO question, “What does the [Flags] Attribute Really do?”

Time to investigate.

[Flags] is an attribute that can be applied to enums, and enums are (basically) type-safe integers.

In C#, you can apply bitwise operators such as | (or) and & (and) to enum values, so the sentence in the documentation of the FlagsAttribute class

Indicates that an enumeration can be treated as a bit field; that is, a set of flags.

raises some questions: “can be?”, “as opposed to what?”, “is it really necessary?”

First, the enum values have to be powers of 2 to make the Flags attribute behave as expected.

Let’s take colors as an example: three different color enumerations (yes, we only have 1-bit color depth here 😉 )

enum ColorEnum
{
	None,
	Red,
	Green,
	Blue
}

enum SimpleColor
{
	Red = 1,
	Green = 2,
	Blue = 4
}

[Flags]
enum FlagsColor
{
	Red = 1,
	Green = 2,
	Blue = 4,
	White = 7
}
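Not part of the original listings, but worth noting: the same flag values are often written as shift expressions, which makes the one-bit-per-member requirement explicit. A sketch (the enum name FlagsColorShifted is mine):

```csharp
// Equivalent declaration of FlagsColor using shift expressions;
// each member occupies exactly one bit, White combines all three.
[Flags]
enum FlagsColorShifted
{
	Red   = 1 << 0,             // 1, binary 001
	Green = 1 << 1,             // 2, binary 010
	Blue  = 1 << 2,             // 4, binary 100
	White = Red | Green | Blue  // 7, binary 111
}
```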

ColorEnum can only represent a single color, SimpleColor (powers of 2) *could* hold a combination of colors but lacks the [Flags] attribute, and FlagsColor has [Flags] and defines White as the combination Red|Green|Blue.

To test the different enum definitions, we call each enum’s .ToString() method, cast the enum values to int, and Parse() the result of ToString() (using C# 4 in VS2010):

ColorEnum ce = ColorEnum.Red;
Console.WriteLine("ColorEnum Red: " + ce.ToString());
Console.WriteLine("ColorEnum Red: " + ((int)ce).ToString());
Console.WriteLine("  parse: " + 
    Enum.Parse(typeof(ColorEnum), ce.ToString(), true).ToString());

which has the unsurprising result:

ColorEnum Red: Red
ColorEnum Red: 1
  parse: Red

A bit unexpectedly, you can use bitwise operators on enums even without the [Flags] attribute:

ce = ColorEnum.Red | ColorEnum.Green;
Console.WriteLine("ColorEnum Red | Green: " + ce.ToString());
Console.WriteLine("  parse: " + 
    Enum.Parse(typeof(ColorEnum), ce.ToString(), true).ToString());

Since Red==1 and Green==2, their bitwise OR equals 3, which happens to be the value of Blue, so the result is

ColorEnum Red | Green: Blue
  parse: Blue

Let’s experiment with SimpleColor, the enum holding power-of-2 values, but without the [Flags] attribute:

SimpleColor sc = SimpleColor.Red;
Console.WriteLine("SimpleColor Red: " + sc.ToString());
Console.WriteLine("  parse: " + 
    Enum.Parse(typeof(SimpleColor), sc.ToString(), true).ToString());

sc = SimpleColor.Red | SimpleColor.Green;
Console.WriteLine("SimpleColor Red | Green: " + sc.ToString());
Console.WriteLine("  parse: " + 
    Enum.Parse(typeof(SimpleColor), sc.ToString(), true).ToString());

Note that Red|Green gives 3, and the enum does not define a symbol for value 3:

SimpleColor Red: Red
  parse: Red
SimpleColor Red | Green: 3
  parse: 3
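The fallback to the numeric string can also be detected in code: Enum.IsDefined() reports whether a value has a named constant. A small sketch using the SimpleColor enum from above:

```csharp
SimpleColor sc = SimpleColor.Red | SimpleColor.Green;  // numeric value 3
// SimpleColor defines no constant for value 3, so IsDefined returns
// False and ToString() falls back to the number.
Console.WriteLine(Enum.IsDefined(typeof(SimpleColor), sc));  // False
Console.WriteLine(sc);                                       // 3
```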

Finally, the FlagsColor enum with [Flags] attribute:

FlagsColor fc = FlagsColor.Red;
Console.WriteLine("FlagsColor Red: " + fc.ToString());
Console.WriteLine("  parse: " + 
    Enum.Parse(typeof(FlagsColor), fc.ToString(), true).ToString());
fc = FlagsColor.Red | FlagsColor.Green;
Console.WriteLine("FlagsColor Red | Green: " + fc.ToString());
Console.WriteLine("  parse: " + 
    Enum.Parse(typeof(FlagsColor), fc.ToString(), true).ToString());

Console.WriteLine("  parse FlagsColor as SimpleColor: " + 
    Enum.Parse(typeof(SimpleColor), fc.ToString(), true).ToString());
Console.WriteLine("  parse FlagsColor as ColorEnum: " + 
    Enum.Parse(typeof(ColorEnum), fc.ToString(), true).ToString());

The first part produces what is expected:

FlagsColor Red: Red
  parse: Red
FlagsColor Red | Green: Red, Green
  parse: Red, Green

But what happens to the stringified bit combination “Red, Green” when it is parsed as one of the other enum types?

  parse FlagsColor as SimpleColor: 3
  parse FlagsColor as ColorEnum: Blue

Seems that the Enum.Parse() method ignores the (lack of the) [Flags] attribute!

The .HasFlag() method also works regardless of the [Flags] attribute:

Console.WriteLine("ColorEnum has Red? " + ce.HasFlag(ColorEnum.Red));
Console.WriteLine("SimpleColor has Red? " + sc.HasFlag(SimpleColor.Red));
Console.WriteLine("FlagsColor has Red? " + fc.HasFlag(FlagsColor.Red));

Console.WriteLine("ColorEnum has Blue? " + ce.HasFlag(ColorEnum.Blue));
Console.WriteLine("SimpleColor has Blue? " + sc.HasFlag(SimpleColor.Blue));
Console.WriteLine("FlagsColor has Blue? " + fc.HasFlag(FlagsColor.Blue));

results in:

ColorEnum has Red? True
SimpleColor has Red? True
FlagsColor has Red? True
ColorEnum has Blue? True
SimpleColor has Blue? False
FlagsColor has Blue? False
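This makes sense once you know that .HasFlag() is essentially a bitwise test and does not look at the attribute at all. A sketch of the equivalent check (the helper name is mine); it also explains the surprising ColorEnum result:

```csharp
// Equivalent of x.HasFlag(flag) for an int-based enum:
static bool HasFlagEquivalent(ColorEnum x, ColorEnum flag)
{
	return ((int)x & (int)flag) == (int)flag;
}

// For ce = ColorEnum.Red | ColorEnum.Green (value 3) and
// ColorEnum.Blue (value 3): (3 & 3) == 3, hence True.
```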

Finally, testing a composite bit value:

fc = FlagsColor.Red | FlagsColor.Green | FlagsColor.Blue;
Console.WriteLine("FlagsColor RGB: " + fc.ToString());
Console.WriteLine("  parse: " + 
    Enum.Parse(typeof(FlagsColor), "Red, Green, Blue", true).ToString());
Console.WriteLine("FlagsColor has White? " + fc.HasFlag(FlagsColor.White));

results in

FlagsColor RGB: White
  parse: White
FlagsColor has White? True

If you work with VB.NET rather than C#, this entry on social.msdn may be interesting for you:

Although C# happily allows users to perform bit operations on enums without the FlagsAttribute, Visual Basic does not. So if you are exposing types to other languages, then marking enums with the FlagsAttribute is a good idea.

It also states

(The [Flags] attribute) makes it clear that the members of the enum are designed to be used together.

Some things should be described more explicitly in the documentation.


Why do I need NVARCHAR columns?

September 2, 2009

I came across an (MSSQL) database that contained almost exclusively VARCHAR columns, even though the target audience was expected to come from all over the European Union.

In my projects, I am used to defining NVARCHAR columns for all values that can be entered by users, and using VARCHAR only for strings that are known to be ASCII, such as program identifiers, URLs, or email addresses.

I tried to show the disadvantage of using VARCHAR by entering accented characters, expecting a default Western collation, only to fail: the characters were displayed correctly. I found out later that the collation had been explicitly set to a character set supporting these accented characters.

I prepared more thoroughly for the next time, and wrote a little TSQL script to illustrate my point:

create table #t (
	id int identity,
	sA varchar(100),	-- insert varchar
	sN varchar(100),	-- insert nvarchar (result same as varchar)
	sC varchar(100) collate Croatian_CI_AS,	-- insert nvarchar
	n nvarchar(100)
)

declare @i int
declare @s varchar(100)	-- implicit collation of database
declare @n nvarchar(100)
set @s = ''
set @n = ''

set @i = 32
while @i < 1000 begin
	set @s = @s + nchar(@i)
	set @n = @n + nchar(@i)
	set @i = @i + 1

	if (len(@s)=32) begin
		insert into #t (sA, n, sN, sC) values (@s, @n, @n, @n)

		print @s
		print @n

		set @s = ''
		set @n = ''
	end
end

select id as Row, sA as DBCollated, sC as HRCollated, n as Unicoded
from #t
order by id

drop table #t

Table #t contains a couple of text columns:

  • sA varchar with database collation is set by a varchar variable
  • sN varchar with database collation is set by an nvarchar variable
  • sC varchar with explicit collation is also set by an nvarchar variable
  • n nvarchar is set by an nvarchar variable

In the loop, I construct strings of 32 Unicode characters each, covering code points 32 through 999 (an arbitrary range), and insert the same value into the 4 columns.

The findings of this experiment are:

VARCHAR variables have the same collation as the database. You cannot declare a COLLATE clause in the variable definition. Thus sA and sN contain the same values, as would be expected by mapping Unicode characters onto a code page.
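To illustrate (this snippet is mine, not part of the original script): COLLATE is accepted on column definitions and on expressions, but not in a variable declaration:

```sql
-- valid: explicit collation on a column definition
create table #demo (c varchar(100) collate Croatian_CI_AS)

-- valid: explicit collation on an expression
select 'a' collate Croatian_CI_AS

-- invalid: variables always use the database collation
-- declare @v varchar(100) collate Croatian_CI_AS   -- syntax error

drop table #demo
```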

The resulting table looks like this (only 16 characters per row for better readability):

Row  VARCHAR Latin1_General_CI_AS   VARCHAR Croatian_CI_AS   NVARCHAR
     (Database Collation)           (Explicit Collation)     (Unicode)
1 !”#$%&'()*+,-./ !”#$%&'()*+,-./ !”#$%&'()*+,-./
2 0123456789:;<=>? 0123456789:;<=>? 0123456789:;<=>?
3 @ABCDEFGHIJKLMNO @ABCDEFGHIJKLMNO @ABCDEFGHIJKLMNO
4 PQRSTUVWXYZ[\]^_ PQRSTUVWXYZ[\]^_ PQRSTUVWXYZ[\]^_
5 `abcdefghijklmno `abcdefghijklmno `abcdefghijklmno
6 pqrstuvwxyz{|}~ pqrstuvwxyz{|}~ pqrstuvwxyz{|}~
7 ????????????? ??ƒ????ˆ??????? €‚ƒ„…†‡ˆ‰Š‹ŒŽ
8 ?????????????? ???????˜??????? ‘’“”•–—˜™š›œžŸ
9 ¡¢£¤¥¦§¨©ª«¬­®¯ !cL¤Y¦§¨©a«¬­®— ¡¢£¤¥¦§¨©ª«¬­®¯
10 °±²³´µ¶·¸¹º»¼½¾¿ °±23´µ¶·¸1o»113? °±²³´µ¶·¸¹º»¼½¾¿
11 ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏ AÁÂAÄAAÇEÉEËIÍÎI ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏ
12 ÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß ?NOÓÔOÖ×OUÚUÜÝ?ß ÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß
13 àáâãäåæçèéêëìíîï aáâaäaaçeéeëiíîi àáâãäåæçèéêëìíîï
14 ðñòóôõö÷øùúûüýþÿ ?noóôoö÷ouúuüý?y ðñòóôõö÷øùúûüýþÿ
15 AaAaAaCcCcCcCcDd AaĂ㥹ĆćCcCcČčĎď ĀāĂ㥹ĆćĈĉĊċČčĎď
16 ÐdEeEeEeEeEeGgGg ĐđEeEeEeĘęĚěGgGg ĐđĒēĔĕĖėĘęĚěĜĝĞğ
17 GgGgHhHhIiIiIiIi GgGgHhHhIiIiIiIi ĠġĢģĤĥĦħĨĩĪīĬĭĮį
18 Ii??JjKk?LlLlLl? Ii??JjKk?ĹĺLlĽľ? İıIJijĴĵĶķĸĹĺĻļĽľĿ
19 ?LlNnNnNn???OoOo ?ŁłŃńNnŇň???OoOo ŀŁłŃńŅņŇňʼnŊŋŌōŎŏ
20 OoŒœRrRrRrSsSsSs ŐőOoŔŕRrŘřŚśSsŞş ŐőŒœŔŕŖŗŘřŚśŜŝŞş
21 ŠšTtTtTtUuUuUuUu ŠšŢţŤťTtUuUuUuŮů ŠšŢţŤťŦŧŨũŪūŬŭŮů
22 UuUuWwYyŸZzZzŽž? ŰűUuWwYyYŹźŻżŽž? ŰűŲųŴŵŶŷŸŹźŻżŽžſ
23 b????????Ð?????? b????????Đ?????? ƀƁƂƃƄƅƆƇƈƉƊƋƌƍƎƏ
24 ?ƒƒ????I??l????O ?Ff????I??l????O ƐƑƒƓƔƕƖƗƘƙƚƛƜƝƞƟ
25 Oo?????????t??TU Oo?????????t??TU ƠơƢƣƤƥƦƧƨƩƪƫƬƭƮƯ
26 u?????z????????? u?????z????????? ưƱƲƳƴƵƶƷƸƹƺƻƼƽƾƿ
27 |??!?????????AaI |??!?????????AaI ǀǁǂǃDŽDždžLJLjljNJNjnjǍǎǏ
28 iOoUuUuUuUuUu?Aa iOoUuUuUuUuUu?Aa ǐǑǒǓǔǕǖǗǘǙǚǛǜǝǞǟ
29 ????GgGgKkOoOo?? ????GgGgKkOoOo?? ǠǡǢǣǤǥǦǧǨǩǪǫǬǭǮǯ
30 j??????????????? j??????????????? ǰDZDzdzǴǵǶǷǸǹǺǻǼǽǾǿ
31 ???????????????? ???????????????? ȀȁȂȃȄȅȆȇȈȉȊȋȌȍȎȏ
32 ???????????????? ???????????????? ȐȑȒȓȔȕȖȗȘșȚțȜȝȞȟ
33 ???????????????? ???????????????? ȠȡȢȣȤȥȦȧȨȩȪȫȬȭȮȯ
34 ???????????????? ???????????????? ȰȱȲȳȴȵȶȷȸȹȺȻȼȽȾȿ
35 ???????????????? ???????????????? ɀɁɂɃɄɅɆɇɈɉɊɋɌɍɎɏ
36 ???????????????? ???????????????? ɐɑɒɓɔɕɖɗɘəɚɛɜɝɞɟ
37 ?g?????????????? ?g?????????????? ɠɡɢɣɤɥɦɧɨɩɪɫɬɭɮɯ
38 ???????????????? ???????????????? ɰɱɲɳɴɵɶɷɸɹɺɻɼɽɾɿ
39 ???????????????? ???????????????? ʀʁʂʃʄʅʆʇʈʉʊʋʌʍʎʏ
40 ???????????????? ???????????????? ʐʑʒʓʔʕʖʗʘʙʚʛʜʝʞʟ
41 ???????????????? ???????????????? ʠʡʢʣʤʥʦʧʨʩʪʫʬʭʮʯ
42 ?????????'”?’??? ?????????'”‘’??? ʰʱʲʳʴʵʶʷʸʹʺʻʼʽʾʿ
43 ????^?ˆ?’¯´`?_?? ????^?^ˇ’Ż´`?_?? ˀˁ˂˃˄˅ˆˇˈˉˊˋˌˍˎˏ
44 ??????????°?˜??? ????????˘˙°˛~˝?? ːˑ˒˓˔˕˖˗˘˙˚˛˜˝˞˟
45 ???????????????? ???????????????? ˠˡˢˣˤ˥˦˧˨˩˪˫ˬ˭ˮ˯
46 ???????????????? ???????????????? ˰˱˲˳˴˵˶˷˸˹˺˻˼˽˾˿
47 `´^~¯¯??¨?°???”? `´^~ŻŻ˘˙¨?°?ˇ?”? ̀́̂̃̄̅̆̇̈̉̊̋̌̍̎̏
48 ???????????????? ???????????????? ̛̖̗̘̙̜̝̞̟̐̑̒̓̔̕̚
49 ???????¸???????? ???????¸???????? ̡̢̧̨̠̣̤̥̦̩̪̫̬̭̮̯
50 ?__????????????? ?__????????????? ̴̵̶̷̸̰̱̲̳̹̺̻̼̽̾̿
51 ???????????????? ???????????????? ͇͈͉͍͎̀́͂̓̈́͆͊͋͌ͅ͏
52 ???????????????? ???????????????? ͓͔͕͖͙͚͐͑͒͗͛͘͜͟͝͞
53 ???????????????? ???????????????? ͣͤͥͦͧͨͩͪͫͬͭͮͯ͢͠͡
54 ??????????????;? ??????????????;? ͰͱͲͳʹ͵Ͷͷ͸͹ͺͻͼͽ;Ϳ
55 ???????????????? ???????????????? ΀΁΂΃΄΅Ά·ΈΉΊ΋Ό΍ΎΏ
56 ???G????T??????? ???????????????? ΐΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟ
57 ???S??F??O?????? ???????????????? ΠΡ΢ΣΤΥΦΧΨΩΪΫάέήί
58 ?aß?de??????µ??? ??ß?????????µ??? ΰαβγδεζηθικλμνξο
59 p??st?f????????? ???????????????? πρςστυφχψωϊϋόύώϏ
60 ???????????????? ???????????????? ϐϑϒϓϔϕϖϗϘϙϚϛϜϝϞϟ

One can easily see that the original Unicode values are translated into VARCHAR values according to the COLLATION setting. Some characters not contained in the collation are mapped onto characters without accents, others are simply replaced by question marks.

Use this code if you need to explain the differences between VARCHAR and NVARCHAR, and why using NVARCHAR is not a bad idea.