(For the purposes of this post, I’ll use Pages with a capital P to mean items in SharePoint of a Page content type, or a child content type of Page. I’ll also refer to all content types in italics.)
Previously, I found that the document conversion service doesn’t map site column data from the Document type to the Page type. So, what are our options?
- Get users to fill in the metadata for the converted document
- Put the metadata into the Word document
- Bespoke coding
- Don’t use the conversion service
Let’s look at each in turn.
Get users to fill in the metadata for the converted document
Well, the first option is pretty obvious – get the users to fill in the Site Columns for the converted document’s Page. In my case, this would mean filling in the AWBText column on the ConvertableDocumentPage type. This will work! Unfortunately, this means that the page’s and the document’s data are not linked – a change in the AWBText field won’t be replicated between both items, or even just pushed from the ConvertableDocument the next time it’s converted. That sucks a bit, but this might be a valid option.
Put the metadata into the Word document
The second option is quite neat – Word documents can have ‘Quick Parts’ – some of which are document properties, and these can be connected to the columns of the content type:
You can put these into the document itself. They’re like document ‘Fields’ in Word pre-2007, but these are much, much better. For a start, you can actually type into the Quick Part and it’ll update the document properties – and when you save the document to SharePoint it’ll update the columns of the library! Very cool. Anyway, I updated my Word template from my previous example…
I then created a new document. Note that the AWBText field in the Document Information Panel and the Quick Part are linked – I typed the value into the document and it was reflected in the Document Information Panel.
I then converted this document. This resulted in:
Okay, so I’ve scribbled on this a bit. The area outlined in the purple-pink colour is the content of our document that we converted. You can see that this includes the value of the ConvertableDocument’s AWBText column. Hurrah! However, above this is the value of the AWBText column on the ConvertableDocumentPage – and it is still empty. In other words, the original document’s metadata is now in the page content – but it still isn’t stored against the Page as metadata. This isn’t really suitable for our customer – they need that column data against the Pages for their navigation. Bah!
Bespoke Coding
Okay, I started to wonder if I could fix this via custom code (i.e. some sort of Feature). I dug through some of the hidden properties of my source ConvertableDocument and destination ConvertableDocumentPage using SharePoint Manager. I knew that there must be some sort of connection, as if you Edit the ConvertableDocumentPage it shows you that it has a source document, and lets you edit that document instead. Therefore, they must know where they came from.
In SharePoint Manager, I found some interesting fields. The ConvertableDocument content type I’d created had a property RcaPageID, which was a GUID. ‘Rca’ stands for ‘Rich Client Authoring’, which is what they seemed to call this page authoring technique until someone decided to call it ‘Smart Client Authoring’ instead. Certainly, internally it’s normally referred to as Rich Client Authoring, or ‘Rca’.
I then checked the ConvertableDocumentPage type, which had a property RcaSourceDocID. This was a GUID, and this ID matched the RcaPageID of the document we used to create the page. Thus, and I’m pretty sure about this, it’s the connection between the source Document and destination Page.
Therefore, I could build an event handler that (when a page is updated or created) gets the Page’s source document, sees what columns they share, and copies across the values of those shared columns. Actually, it’d probably have to exclude some (like title), but you get the idea. Also, it’d have to run a query across all the documents in the site collection, but I’m pretty sure that this is possible.
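To make the idea a bit more concrete, here is a rough sketch of just the matching and copying logic, written in Python with plain dictionaries standing in for list items. This is not the SharePoint object model, and a real handler would be a .NET event receiver. The function names and the exclusion list are hypothetical; only RcaPageID, RcaSourceDocID and AWBText come from what I saw above. It also assumes the two GUIDs can be compared as plain strings.

```python
from typing import Optional

# Columns that should never be copied. This list is an illustrative
# assumption; the only exclusion suggested so far is the title.
EXCLUDED_COLUMNS = {"Title", "RcaPageID", "RcaSourceDocID"}


def find_source_document(page: dict, documents: list) -> Optional[dict]:
    """Return the document whose RcaPageID matches the page's RcaSourceDocID."""
    source_id = page.get("RcaSourceDocID")
    if source_id is None:
        return None
    for doc in documents:
        if doc.get("RcaPageID") == source_id:
            return doc
    return None


def copy_shared_columns(source: dict, page: dict,
                        excluded=EXCLUDED_COLUMNS) -> list:
    """Copy values of columns present on both items; return the names changed."""
    copied = []
    # Only columns that exist on both the document and the page are candidates.
    for column in sorted(source.keys() & page.keys()):
        if column in excluded:
            continue
        if page[column] != source[column]:
            page[column] = source[column]
            copied.append(column)
    return copied


# Stand-ins for a converted document and the page created from it.
doc = {"Title": "My document", "RcaPageID": "abc-123", "AWBText": "hello"}
page = {"Title": "My page", "RcaSourceDocID": "abc-123", "AWBText": ""}

source = find_source_document(page, [doc])
if source is not None:
    copy_shared_columns(source, page)
```

In this sketch the page keeps its own Title but picks up the document’s AWBText, which is the behaviour I’d want from the handler.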
I like this solution, and think it’s a fairly straightforward, generic candidate for a feature, but unfortunately our customer is unable to make server configuration changes – like installing new features. So that rules that idea out… damn.
Don’t use the Document Conversion Service
I know, this seems a bit crazy – but you could author your content in Word and just copy and paste the content into your pages. This is what our customer was doing. I know, it seems a little crazy to me too, but if you lock down the styles available in a Word template, then the code you’ll copy will have consistent CSS styles in it, and you can prevent any inline CSS through that restriction too. It has no server footprint and no duplication of metadata – but you still have to store the documents (which might require column data too).
So those were the options I was able to come up with. I like the coding option – an event handler could be a very elegant way of dealing with this.