What happens with a large TaxonomyHiddenList?

SharePoint’s Taxonomy (or “Managed Metadata”) fields are a bit strange in how they work. In a lot of ways they’re actually like a lookup field, and part of this lookup field is that site collections that use them have a hidden list – called TaxonomyHiddenList – in the root of the site collection. You can find it at /lists/TaxonomyHiddenList/.

Hidden

Each term that is used in a Taxonomy field has it’s label (actually, labels) and other details kind of ‘cached’ in this list, and there is a timer job that will push changes to the terms out into these lists – and therefore to items that refer to the entries in the list.

Thus, the TaxonomyHiddenList is like a list that Managed Metadata fields use similarly to a lookup column. Here’s some more of the details that one shows:

Hidden list with more details

You’ll notice that it has the term’s ID and termset ID in the taxonomy service (or ‘Managed Metadata service’). These are important if you want to get information about a term from the taxonomy service.

What I needed to do in some Client Side Object Model (CSOM) code I was writing recently was, given a Term‘s details, I needed to query for that term being referred to in a large list (5000+ items). I could have tried something like:

CamlQuery qry = new CamlQuery();
qry.ViewXml = string.Format("<View><Query><Where><Eq><FieldRef Name='{0}'/><Value Type='Text'>{1}</Value></Eq></Where></Query></View>",
"MyField",
"My Value");

…but this would run into problems with the query throttle. Even if the column is indexed, you’re not using the LookupId of the field, so you’re not using the index. A working query would be:

CamlQuery qry = new CamlQuery();
qry.ViewXml = string.Format("<View><Query><Where><Eq><FieldRef Name='{0}' LookupId='true' /><Value Type='Lookup'>{1}</Value></Eq></Where></Query></View>",
"MyField",
4);

… but to do this you need the LookupId of the item you’re looking for – so you’ll have to use the TaxonomyHiddenList to resolve that Term‘s GUID to the TaxonomyHiddenList‘s integer ID.

“Fine”, I thought, “But what if the TaxonomyHiddenList itself has more than 5000 items?”. I went and checked it, assuming there’d be an index on the important fields, such as IdForTerm and IdForTermSet. I was wrong:

New Taxonomy List

What! No index? So if there are more than 5000 terms used in a Site collection there would be an exception? That can’t be right!

Okay, let’s test it. I put 5001 terms into my TermSet, and created 5001 ListItem‘s that used those terms (1 each), and checked what happened.

Everything still worked. I checked the TaxonomyHiddenList again:

Large Taxonomy List

The event receivers that populate the TaxonomyHiddenList (the one that’s added to a list if you add a ‘Managed Metadata’ field to a list) clearly also create the indexes when the 5000 item count is reached.

Okay, so there are indexes (indices?) – how would I query them? Well, I started by trying the following query on IdForTermSet and IdForTerm. This was a mistake.

Query First Attempt

There are more than 5000 items in the term set, so querying against IdForTermSet caused a query throttling exception. Instead, I queried for just the term ID:

Query Second Attempt

And that worked, despite the over 5000 terms, and I could use that ListItem‘s integer ID to query my other list using the LookupId, and therefore using that list’s index.

Advertisements
What happens with a large TaxonomyHiddenList?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.