SharePoint search can crawl managed metadata fields, and it stores the values inside a Text search property. But what are the values stored there? What format are they in? Do they include the path to the term(s)?
Well, the project I’m working on needed me to retrieve some managed metadata fields from the REST search API. The values I got back were like:
GP0|#080af9e2-2349-48eb-8da3-5aa927a0b246 L0|#0080af9e2-2349-48eb-8da3-5aa927a0b246|Property D:Db:Db1 GTSet|#ad4b9f8d-2e65-4ce4-9ab5-ab1bad98f963 GPP|#72554ab8-62bc-4181-a050-6e84c28426d0 GPP|#f074d79a-4ab2-4005-86c0-025e62142281 GP0|#4b32503b-e11e-44ea-bdf5-03467fe3c28c L0|#04b32503b-e11e-44ea-bdf5-03467fe3c28c|Property A GP0|#c77f61d6-4e22-44bc-814c-e97916b1fc42 L0|#0c77f61d6-4e22-44bc-814c-e97916b1fc42|Property C:C2 GPP|#26192dcf-8512-45b4-b20e-6c5e151b9edb
Eeek! What the heck is all that? Well, a quick search found this page (scroll up slightly), which describes the elements nicely:
To query for items tagged with a Managed Metadata field, you have to use the Unique Identifier for each label. You can find the Unique Identifier for each term in a term set in the Term Store Management Tool, on the GENERAL tab. In addition, the data format that is used in the query has to specify from which level in the term set the query should apply. This specification is set by adding one of the following prefixes to the Unique Identifier:
– To query for all items that are tagged with a term: GP0|#
– To query for all items that are tagged with a child of term: GPP|#
– To query for all items that are tagged with a term from a term set: GTSet|#
Nice, MSDN really came through this time, even if the information is a bit … buried.
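To make the prefixes a bit more concrete, here is a minimal sketch that splits a raw value like the one above into prefix/GUID pairs. The name listTaxonomyTokens is made up for illustration and is not part of any SharePoint API. It also assumes, going only by the sample value, that L0 entries carry an extra "0" in front of the GUID; the quoted passage doesn't describe L0 at all.

```javascript
// Hypothetical helper (not a SharePoint API): lists the { prefix, id } pairs
// found in a raw managed metadata search value.
function listTaxonomyTokens(raw) {
  // Prefix, "|#", an optional extra "0" (assumption: seen on the L0 entries
  // in the sample value), then the term or term set GUID.
  var tokenRegex = /(GTSet|GP0|GPP|L0)\|#0?([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})/gi;
  var tokens = [];
  var match;
  while ((match = tokenRegex.exec(raw)) !== null) {
    tokens.push({ prefix: match[1], id: match[2].toLowerCase() });
  }
  return tokens;
}

// Example: keep only the GUIDs of the terms the item is tagged with directly.
function directTermIds(raw) {
  return listTaxonomyTokens(raw)
    .filter(function (t) { return t.prefix.toUpperCase() === "GP0"; })
    .map(function (t) { return t.id; });
}
```

Going by the sample, the GUID on each L0 entry matches the GUID on the GP0 entry just before it, so the two could be paired up by id if you need both the identifier and the label.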
But then the next question – how do I get the text that I’d like to display out of that string? Well, that turned out to be a little awkward:
function GetTermStringArray( input ) {
    var results = new Array();
    // Capture the label after "#0<guid>|" in each L0 entry, up to the next prefix or line end.
    var mmDataRegex = /#0[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\|(.*?)(GTSET|GP0|GPP|L0|$)/igm;
    var match = mmDataRegex.exec(input);
    while (match != null) {
        results.push(match[1].replace(/\n/g, "")); // strip any stray newlines from the label
        match = mmDataRegex.exec(input);
    }
    return results;
}
This function accepts the string at the top of the page, and returns an array of the term labels it found (the text that follows the GUID in each L0 entry). You can then .join() them, or whatever you need to do for display. It seems to work nicely.
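If the value might arrive with line breaks between the entries, either as real newlines or as the two-character text \n, a slightly more forgiving variant could look like the sketch below. The name getTermLabels and the { id, label } return shape are my own choices for illustration. The separator handling is an assumption based on sample values, not on any documented format.

```javascript
// Hypothetical variant of GetTermStringArray: returns { id, label } pairs
// taken from the L0 entries of a raw managed metadata value.
function getTermLabels(input) {
  // Assumption: entries may be separated by spaces, real newlines, or the
  // literal two-character text "\n"; convert the literal form to real newlines.
  var normalized = String(input).replace(/\\n/g, "\n");
  // L0|#0<guid>|<label>, where the label stops before the next "PREFIX|"
  // or at the end of the line; whitespace before either is left out.
  var l0Regex = /L0\|#0([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\|(.*?)(?=\s*(?:GTSet|GP0|GPP|L0)\||\s*$)/gim;
  var results = [];
  var match;
  while ((match = l0Regex.exec(normalized)) !== null) {
    results.push({ id: match[1], label: match[2] });
  }
  return results;
}

// Usage: just the labels, joined for display.
function termLabelText(input) {
  return getTermLabels(input)
    .map(function (t) { return t.label; })
    .join(", ");
}
```

Requiring the pipe after the next prefix means a label that merely contains the letters "L0" or "GP0" should not be cut short, though a label containing something like "GP0|" would still confuse it.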
I tried this function with the following data and it failed to work. Has the format changed or can you suggest the problem?
GP0|#272b765a-abb4-4dd7-b4a8-ff06ebb00f76\n\nL0|#0272b765a-abb4-4dd7-b4a8-ff06ebb00f76|Chipping Sodbury\n\nGTSet|#b49f64b3-4722-4336-9a5c-56c326b344d4
Well the \n\n seems surprising; I don’t remember there being new lines in the text.
Ah, my code may have lost a \n somewhere in translation. In the Regex on line 7, maybe?