Stripping Accents from Strings in C#

Unicode defines a concept called normalization (Unicode, Wikipedia), which specifies when composed and decomposed representations of a character are equivalent.

In .NET, the string.Normalize() method can be used to convert strings between normalization forms. If a string is in normalization form NormalizationForm.FormKD (full compatibility decomposition), the combining marks (accents and other diacritics) are stored as separate characters, and their Unicode category can be retrieved by calling the CharUnicodeInfo.GetUnicodeCategory() method.
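To make the decomposition concrete, here is a small sketch that lists each character of a string after normalization, together with its Unicode category. The class name NormalizationDemo and the method name Describe are made up for this illustration; only string.Normalize() and CharUnicodeInfo.GetUnicodeCategory() come from the framework.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class NormalizationDemo
{
    // Hypothetical helper: returns one "U+XXXX Category" entry per
    // character of the FormKD-normalized input.
    public static List<string> Describe(string s)
    {
        var result = new List<string>();
        foreach (char c in s.Normalize(NormalizationForm.FormKD))
        {
            UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(c);
            result.Add(string.Format("U+{0:X4} {1}", (int)c, category));
        }
        return result;
    }
}
```

For the single precomposed character "\u00E9" (é), Describe should return two entries: "U+0065 LowercaseLetter" for the base letter and "U+0301 NonSpacingMark" for the combining acute accent.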

Thus, to strip the accents from the characters of a string, one has to perform the following steps:

  • Normalize the string into full compatibility decomposition
  • Remove the characters belonging to a “Mark” category
  • Return the result

Here is the C# code implementing this function:

using System.Text;
using System.Globalization;

public string StripAccents(string s)
{
  StringBuilder sb = new StringBuilder();
  foreach (char c in s.Normalize(NormalizationForm.FormKD))
  {
    switch (CharUnicodeInfo.GetUnicodeCategory(c))
    {
      case UnicodeCategory.NonSpacingMark:
      case UnicodeCategory.SpacingCombiningMark:
      case UnicodeCategory.EnclosingMark:
        // Skip characters in a "Mark" category (accents and other diacritics).
        break;
      default:
        sb.Append(c);
        break;
    }
  }
  return sb.ToString();
}
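
One thing to be aware of is that FormKD also replaces compatibility characters (for example, a superscript digit becomes a plain digit). The following is a sketch of a variant, not part of the function above, that takes the normalization form as a parameter, so that NormalizationForm.FormD (canonical decomposition) can be used when only the accents should go. The class name AccentStripper and the extra parameter are assumptions made for this example.

```csharp
using System.Globalization;
using System.Text;

public static class AccentStripper
{
    // Hypothetical variant: 'form' selects the decomposition that is
    // applied before the mark characters are removed.
    public static string StripAccents(string s, NormalizationForm form)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in s.Normalize(form))
        {
            UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(c);
            bool isMark = category == UnicodeCategory.NonSpacingMark
                       || category == UnicodeCategory.SpacingCombiningMark
                       || category == UnicodeCategory.EnclosingMark;
            if (!isMark)
            {
                sb.Append(c); // keep everything that is not a combining mark
            }
        }
        return sb.ToString();
    }
}
```

With either form, "Crème brûlée" should come out as "Creme brulee". The difference shows up with a string such as "x\u00B2" (x²): FormKD should give "x2", while FormD should leave the superscript untouched.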

