If you set out to create a hierarchy of terms in MOSS 2007, you were likely either frustrated (because you actually tried), or disappointed (because there did not exist any OOB framework by which to accomplish such a feat).
SharePoint 2010 brought us managed metadata, the metadata term store, and the hope that enterprise terminology could be organized and tagged as easily as it is accessed. Along with it, at least in my case, came some frustration in understanding just how managed metadata is stored and accessed programmatically.
The TL;DR version of the underlying data representation behind managed metadata columns is as follows:
- Managed metadata is **cough** managed via Central Administration or Site Collection administration by modifying the Term Store. The Term Store can store hierarchies of terms in named groups and term sets, each with its own security and tagging scopes.
- SharePoint fields can be of type “Managed Metadata”; users populating the fields can choose from a specific group or term set.
- The values stored in these fields act similarly to “choice” fields, in that the choices presented to the user come from an internal list called “TaxonomyHiddenList” at the site collection (root) level. This list is only accessible via the object model. And this is where it gets tricky.
Managed Metadata in Multiple Environments
Most enterprise SharePoint implementations consist of several farms: Development, Test, Stage (sometimes left out) and Production. Security (in terms of what facets of the farm are exposed to developers) increases with each successive level up to Production. Theoretically, the Production and Stage environments should be carbon copies of one another, with the Stage environment serving as the final dry run for “staging” a deployment to Production; the Test environment should be a carbon copy of the Stage environment save any applications which are being actively tested; and the Development environment should resemble loosely-governed chaos akin to endless shelves of partially-constructed Lego sets.
Applications are typically promoted via a dual-methodology approach:
- Sandboxed solutions (read: don’t contain custom code) are promoted from one environment to the next by backing up and restoring site collections.
- Farm solutions (read: contain custom code) are packaged as .WSP SharePoint Solutions and deployed via STSADM or PowerShell.
So what is managed metadata? It contains no custom code, but as we’ll find out, it cannot be moved from one environment to another by simply backing it up and restoring it to a new environment.
The only way to migrate managed metadata from one environment to another is to build term sets in Central Administration and export them to CSV (I like to use this CSV export utility on Codeplex). This approach has a major shortcoming, however: importing a managed metadata CSV to the term store doesn’t actually import the terms themselves, complete with unique IDs– it just creates new terms with new GUIDs and the same labels. Why is this a problem? For one, fields of type “Managed Metadata,” which are mapped to term sets, contain references to the GUIDs that we can’t move…
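For reference, the import file that the Term Store Management Tool accepts looks roughly like the sketch below. The header row follows the sample import file that ships with the tool; the term set description and the branch names are made up for illustration. Notice that no column carries an ID, which is why imported terms always end up with fresh GUIDs.

```csv
"Term Set Name","Term Set Description","LCID","Available for Tagging","Term Description","Level 1 Term","Level 2 Term","Level 3 Term","Level 4 Term","Level 5 Term","Level 6 Term","Level 7 Term"
"Branches","Hypothetical Fabrikam branch offices",,TRUE,,,,,,,,
,,,TRUE,"Hypothetical branch","Seattle",,,,,,
,,,TRUE,"Hypothetical branch","Portland",,,,,,
```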
More Implications: Migrating Fields of Type “Managed Metadata”
I wish I had read the MSDN article Migrating Managed Metadata in SharePoint Server 2010 prior to wiring up managed metadata fields as parts of vital content types, because if I had, I would have found this:
As part of the development process involving taxonomies, new lists may be created that use managed metadata fields. Like their associated taxonomies, these lists may also need to be migrated from the development environment to a testing, staging, or production environment. Unfortunately, migrating such lists poses problems with globally unique identifiers (GUIDs).
Bingo. Not only can managed metadata not be migrated from one environment to another with GUIDs intact, but any fields that depend on this metadata are effectively DOA, because the field definitions contain references to the GUIDs that don’t transcend environments. Fields like this will migrate “successfully” to new environments, but will lose their connection to the managed metadata term set.
In order to migrate these fields from one environment to another:
- Migrate the fields by whatever means you prefer (site collection promotion/restoration in the new environment, feature activation, etc.).
- Migrate the managed metadata term sets that the fields reference by exporting a .CSV file from the source environment and importing it in the destination environment.
- Open the settings for the field and re-associate it with the newly-created term set.
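The last step can also be scripted rather than clicked through. The following is a minimal sketch, not a definitive implementation: it assumes the first term store in the session is the one you imported into, and that the field is a site column on the root web of the site collection. The class and method names (TaxonomyFieldRepair, Reassociate) are hypothetical.

```csharp
using System;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Taxonomy;

// Hypothetical helper: re-points a managed metadata site column at a term set
// that was re-created (with new GUIDs) in the destination environment.
public static class TaxonomyFieldRepair
{
    public static void Reassociate(SPSite site, Guid fieldId, string groupName, string termSetName)
    {
        TaxonomySession session = new TaxonomySession(site);
        if (session.TermStores == null || session.TermStores.Count == 0)
        {
            throw new InvalidOperationException("No term store is available for this site.");
        }

        // Assumption: the first term store is the one holding the imported term set
        TermStore termStore = session.TermStores[0];

        // Look the term set up by name, since its GUID differs per environment
        TermSet termSet = termStore.Groups[groupName].TermSets[termSetName];

        // Assumption: the field is defined at the root web of the site collection
        TaxonomyField field = site.RootWeb.Fields[fieldId] as TaxonomyField;
        if (field == null)
        {
            throw new ArgumentException("The field is not a managed metadata field.", "fieldId");
        }

        // Replace the stale GUIDs carried over from the source environment
        field.SspId = termStore.Id;
        field.TermSetId = termSet.Id;

        // true pushes the change down to lists that already use the site column
        field.Update(true);
    }
}
```

Because the lookup is by group and term set name, the same code can run unchanged in each environment, provided the names were kept identical during the CSV import.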
Accessing Managed Metadata in Code
If you’re like me, you’ve struggled with the mighty SharePoint to learn its many idiosyncrasies; you might have even compiled a short list of helpful best practices for writing code that runs against the server object model.
One such best practice of which I’ve made great use is to (ALMOST) always use a SharePoint field’s internal name, rather than its display name, when using said fields in custom code. The easiest way to do this is to keep track of your custom fields’ IDs and use these to obtain the internal names from SharePoint. I find it helpful to place field IDs in a separate class as static strings, then call them as such:
    public class FieldIds
    {
        // Field GUIDs live in one place so client code never hard-codes them
        public static readonly string FIELD_ID_MY_AWESOME_FIELD = "{1ea241c6-bfcd-3246-800b-3c043c57c6d4}";
    }

    //... Client Code
    string myAwesomeFieldInternalName = <<WebName>>.Fields[new Guid(FieldIds.FIELD_ID_MY_AWESOME_FIELD)].InternalName;
Similar to fields, term sets and terms can be accessed in code by their GUIDs. But we can only store these GUIDs in a config file or a separate class if we’re not planning on moving them from one environment to another.
Luckily, Microsoft warned us:
When creating solutions that use the capabilities of the Managed Metadata Service application, consider what will happen to metadata when you migrate between environments.
Phew! Good thing they considered this, because I sure didn’t. So there IS a way to cleanly export and import managed metadata GUIDs after all, right?
SharePoint Server 2010 assigns GUID values when the terms are created or imported. If a taxonomy is exported from one environment and imported into another, the system assigns new GUID values.
Well, damn it.
When migrating term sets from one environment to another, as is typical for enterprises with Dev/Test/Stage/Prod environments, it is impossible to maintain a term’s unique ID from one environment to the next.
As such, building robust, scalable SharePoint solutions that access managed metadata becomes increasingly difficult.
Example: Finding Terms in a Term Set Without Knowing Their GUIDs
Consider the following example, where a timer job that writes new items to a document library must populate a column of type managed metadata. The column, which indicates the company (let us use Microsoft’s fictitious “Fabrikam” corporation) Branch to which a document belongs, has been assigned to a term set called “Branches” under the company-specific group of terms called “Fabrikam.”
The client code for the timer job knows the branch to which the document belongs, and must place this value in the “Branch” column for the document. The code also knows that the branch term exists in the Branches term set in the Fabrikam group. In order to find the branch given these constants, we write a method called “GetTermFromTermSet” that takes an SPSite object and three strings: the groupName and termSetName in which to look, and the termName for which to search:
    private static Term GetTermFromTermSet(SPSite site, string groupName, string termSetName, string termName)
    {
        TaxonomySession session = new TaxonomySession(site);

        // Get a TermStore from the session
        TermStore termStore = null;
        if (session.TermStores != null && session.TermStores.Count > 0)
        {
            termStore = session.TermStores[0];
        }
        if (termStore == null)
        {
            // Without this guard, the Groups lookup below would throw a NullReferenceException
            throw new InvalidOperationException("No term store is available for this site.");
        }

        // Get a group from the term store
        Group group = termStore.Groups[groupName];

        // Get all term sets for the group
        TermSetCollection termSets = group.TermSets;

        // Get the term set by the supplied term set name
        TermSet termSet = termSets[termSetName];
        ...
You now have the term set containing the term you want. How do you get the term, though?
You could try this:
Term term = termSet.Terms[termName];
But executing this code throws an error at runtime.
You can try calling the GetTerm method like this:
Term term = termSet.GetTerm(<<<What do you put here?>>>);
But the GetTerm method requires knowledge of the term’s ID, a GUID– which we don’t have.
My solution was to load all of the terms in the term set into a collection, then loop through them until I found the term that I wanted:
    // Load each term into a dictionary containing the branch name as the key and its GUID as the value
    Dictionary<string, Guid> allTerms = new Dictionary<string, Guid>();
    foreach (Term term in termSet.Terms)
    {
        allTerms.Add(term.Name, term.Id);
    }

    Guid termGuid = allTerms[termName]; // Throws exception if doesn't exist in dictionary
    Term theTermWeWant = termSet.GetTerm(termGuid); // Throws exception if doesn't exist in term set
This worked. But with large term sets, the O(n) cost of copying every term into a dictionary simply to find *one* term becomes hard to justify. It turns out that GetTerm isn’t the way to find a single term in a term set by name. TermSet.GetTerms, on the other hand, finds terms by label and returns a TermCollection object that you can iterate through to find the term you want:
    TermCollection matchingTerms = termSet.GetTerms(termName, false);
    // Code to iterate through the TermCollection and/or return the first object if it contains multiple terms...
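One way to fill in that iteration, and then write the result to the “Branch” column, is sketched below. It assumes that an exact, case-insensitive match on the term's name is what you want and that the column is identified by its internal name. The class and method names (BranchTagger, FindTermByLabel, SetBranch) are hypothetical, not part of the original solution.

```csharp
using System;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Taxonomy;

// Hypothetical helper for the timer job scenario described above
public static class BranchTagger
{
    // Returns the first term whose name equals termName, or null if none matches
    public static Term FindTermByLabel(TermSet termSet, string termName)
    {
        // false = do not trim terms that are unavailable for tagging
        TermCollection matchingTerms = termSet.GetTerms(termName, false);

        foreach (Term candidate in matchingTerms)
        {
            // Assumption: names are compared exactly, ignoring case
            if (string.Equals(candidate.Name, termName, StringComparison.OrdinalIgnoreCase))
            {
                return candidate;
            }
        }
        return null;
    }

    // Writes the term into the managed metadata column of a list item
    public static void SetBranch(SPListItem item, string fieldInternalName, Term branch)
    {
        if (branch == null)
        {
            throw new ArgumentNullException("branch");
        }

        TaxonomyField field = item.Fields.GetFieldByInternalName(fieldInternalName) as TaxonomyField;
        if (field == null)
        {
            throw new ArgumentException("The field is not a managed metadata field.", "fieldInternalName");
        }

        field.SetFieldValue(item, branch);
        item.Update();
    }
}
```

Returning null rather than throwing leaves it to the caller to decide what a missing branch means, for example skipping the document or logging the name for an administrator to add to the term set.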
TL;DR: Use TermSet.GetTerms to find terms if you don’t know their GUID. But this is part of a much larger problem.
Conclusion
Microsoft took a huge step in the right direction when they debuted the term store in SharePoint 2010– but it’s not without its flaws:
- Managed metadata can’t be moved from one environment to the next without assigning it a new GUID. This creates problems for developers who need to access terms in custom code.
- Similarly, fields that rely on managed metadata won’t migrate properly to a new environment, either, as they are dependent upon the GUIDs of the terms to which they are linked.
- The whole process of importing term sets to the term store is wonky. Your only options for importing a term set are to upload a CSV file or write custom code. Even more confusingly, Microsoft writes (same article as before) that the only way to export a term set is to write custom code:
The challenge with migrating taxonomies is that no export functionality is exposed through the Term Store Management Tool. Therefore, exporting must be done entirely through custom code.
The whole thing amounts to a giant headache for developers. Here’s to hoping that SharePoint 2014 improves on the Term Store the same way that SharePoint 2010 improved on virtually everything from MOSS 2007.
Migrating term sets and managed metadata fields is supported using standard PowerShell cmdlets: http://kjellsj.blogspot.com/2012/04/sharepoint-mms-migrate-termset.html