Finding more information about Unicode characters using .NET’s StringInfo class

Posted: (EET/GMT+2)

 

Sometimes you need to dig a little deeper into strings and characters. I recently had this need in a project where I had to study how non-Latin characters are used and stored. The .NET platform is fully Unicode based, but the regular string and char types work with UTF-16 code units, so they are not always the best fit for finer-grained Unicode concepts such as graphemes, surrogate pairs and combining character sequences.
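To make the problem concrete, here is a minimal sketch of how the plain string and char types see such text. The class name and the two sample strings are my own picks for illustration; the calls themselves (Length, char.IsSurrogatePair, char.ConvertToUtf32) are standard .NET members.

```csharp
using System;

public static class CodeUnitDemo
{
    // "é" written as 'e' followed by U+0301 COMBINING ACUTE ACCENT.
    public const string CombiningE = "e\u0301";

    // U+1D11E MUSICAL SYMBOL G CLEF, stored as the surrogate pair D834 DD1E.
    public const string GClef = "\uD834\uDD1E";

    public static void Main()
    {
        // Length counts UTF-16 code units, not what a reader sees as characters.
        Console.WriteLine(CombiningE.Length); // 2
        Console.WriteLine(GClef.Length);      // 2

        // The char helpers can detect a surrogate pair and decode it
        // into a single Unicode code point.
        Console.WriteLine(char.IsSurrogatePair(GClef, 0));               // True
        Console.WriteLine(char.ConvertToUtf32(GClef, 0).ToString("X")); // 1D11E
    }
}
```

Both strings would be displayed as a single character, yet each reports a length of 2.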

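In case you need to venture into this area, the StringInfo class in the System.Globalization namespace is your friend. It lets you work with a string as a sequence of text elements (what a reader would perceive as single characters) instead of individual char values: you can count the text elements, enumerate them, take substrings by text element, and find the index where each one starts.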
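Here is a small sketch of what that can look like in practice. The class name, the SplitTextElements helper and the sample string are made up for this example; LengthInTextElements, ParseCombiningCharacters and GetTextElementEnumerator are the actual StringInfo members. I kept the sample simple on purpose, since more complex sequences (for instance emoji joined with zero-width joiners) may be split differently depending on the .NET version you run.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TextElementDemo
{
    // Hypothetical helper: returns each text element of the input as its own string.
    public static List<string> SplitTextElements(string text)
    {
        var elements = new List<string>();
        TextElementEnumerator enumerator = StringInfo.GetTextElementEnumerator(text);
        while (enumerator.MoveNext())
        {
            elements.Add(enumerator.GetTextElement());
        }
        return elements;
    }

    public static void Main()
    {
        // 'a', then 'e' + combining acute accent, then U+1F600 as a surrogate pair.
        string sample = "a" + "e\u0301" + "\uD83D\uDE00";

        Console.WriteLine(sample.Length);                               // 5 code units
        Console.WriteLine(new StringInfo(sample).LengthInTextElements); // 3 text elements

        // Index of the first char of each text element: 0, 1, 3
        int[] starts = StringInfo.ParseCombiningCharacters(sample);
        Console.WriteLine(string.Join(", ", starts));

        // List every text element together with the code units it is made of.
        foreach (string element in SplitTextElements(sample))
        {
            var codeUnits = new List<string>();
            foreach (char c in element)
            {
                codeUnits.Add(((int)c).ToString("X4"));
            }
            Console.WriteLine(string.Join(" ", codeUnits));
        }
    }
}
```

The last loop prints the UTF-16 code units behind each text element, which is handy when you want to see how a particular character is really stored.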

This is one of those classes that has been in the .NET Framework class library all along, but that you might never have needed before.

Hope this helps!