I am writing an application to import/generate a number of terms in a term set. I started to get some errors about my code trying to add terms that already exist – but it does check for the existence of terms with a given name.
You can see that the string comparison for the name of the existing term and the term I want to get/create returns false, and that the terms seem to be different lengths. Well, I tried looking for blank spaces, non-printing characters, carriage returns, etc., and found nothing. Then I looked really closely at the ampersand characters shown in the debugger:
Yup, they’re different characters. You can see a slightly different shape to each of them – one is wider.
Fine. Let’s see the character codes for them:
Ah. There we go: one is the fullwidth ampersand (U+FF06), a non-ASCII Unicode character, despite the fact that it went in as an ASCII ampersand (character 38). So how the hell does that happen?
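If you want to repeat that check without squinting at the debugger, something like the following should do it. It is only a minimal sketch: the class name `CharCodeDump`, the method `Describe`, and the sample name "R&D" are made up for illustration and are not part of SharePoint.

```csharp
using System;

public static class CharCodeDump
{
    // Hypothetical helper: formats every UTF-16 code unit as U+XXXX
    // so that look-alike characters stand out.
    public static string Describe(string text)
    {
        var parts = new string[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            parts[i] = string.Format("U+{0:X4}", (int)text[i]);
        }
        return string.Join(" ", parts);
    }
}
```

With this, `CharCodeDump.Describe("R&D")` gives `U+0052 U+0026 U+0044`, while the same name with the fullwidth ampersand, `"R\uFF06D"`, gives `U+0052 U+FF06 U+0044`. The two strings look nearly identical on screen but `==` returns false for them.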
Well, now that I knew what to Google for, I found this post about the problem. It seems David was trying to do almost exactly the same as me, and hit the same problem. He found that SharePoint replaces the ASCII & with the fullwidth Unicode one – but fortunately there’s a helper method that applies the same conversion to your own string, so the comparison works: TaxonomyItem.NormalizeName(). Its documentation says:
The name will be normalized to trim consecutive spaces into one and replace the & character with the wide character version of the character (\uFF06).
So, white space removal as well, as an added bonus. Okay, I can live with that.
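To make that description concrete, here is a rough local approximation of the two documented steps. It is a sketch under assumptions, not SharePoint's actual implementation: the names `TermNameSketch`, `Normalize` and `SameTermName` are hypothetical, and treating tabs and other whitespace like spaces, as well as trimming the ends, is my guess rather than something the quoted documentation states.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TermNameSketch
{
    // Approximates the documented behaviour only; the real method may do more.
    public static string Normalize(string name)
    {
        // Assumption: collapse any run of whitespace to one space and trim the ends.
        string collapsed = Regex.Replace(name, @"\s+", " ").Trim();
        // Swap the ASCII ampersand (U+0026) for the fullwidth one (U+FF06).
        return collapsed.Replace('&', '\uFF06');
    }

    // Compares two names after normalizing both sides, so a freshly typed
    // name can match one that has already been through the term store.
    public static bool SameTermName(string a, string b)
    {
        return string.Equals(Normalize(a), Normalize(b), StringComparison.Ordinal);
    }
}
```

The point of normalizing both sides is that it does not matter which of the two names has already been converted. For real code, the SharePoint method itself is the safer choice, since it is what the term store uses.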
And if anyone wants to do this through the Client Side Object Model (CSOM):

    // The result is only populated once ExecuteQuery() has run.
    ClientResult<string> normalized = Term.NormalizeName(clientContext, termName);
    clientContext.ExecuteQuery();
    string normalizedTermName = normalized.Value;
EDIT: You might be interested in the update on how to do this without CSOM – it’s much faster – http://www.novolocus.com/2014/02/01/what-does-taxonomyitem-normalizename-do/