Making Documents Unique

A Fundamental ECM Requirement

A fundamental tenet of any ECM solution is that it must be possible to uniquely identify a document and retrieve it based on that identifier.  The identifier is typically a numeric value that increments by 1 for each document added to the system.

The lack of such an identifier is one of the fundamental problems that many people have had with SharePoint, and it has now been addressed by the Document ID service in SharePoint 2010.  Behind the scenes, documents have always had a unique identifier in SharePoint, but this was (and still is) a globally unique identifier (GUID), which isn’t particularly user friendly and looks something like this:

{31EC2020-3AEF-1069-A2DD-08012B30309D}

Before SharePoint 2010, the other way to identify a document was the URL that pointed to the document based on its location.  For most users this wasn’t an issue, as documents were saved into a specific location and never needed to be moved.  For large scale ECM projects, this did become an issue, especially when documents were transactional in nature and tied to business processes.  In these scenarios, a document could potentially be moved from one library to another as part of the business process and therefore be assigned a new URL identifier; this made it difficult when trying to provide uniform access to the document regardless of its location.  Many organisations integrate their line of business systems with their ECM system and like to store the unique ID of the document against a business record.  An example would be linking the scanned image of an invoice to an invoice record in an accounts payable system.  Not having a unique ID makes this linking very difficult and can result in broken links if the target document moves within SharePoint.

The Document ID Service

Thankfully, SharePoint 2010 introduced the Document ID Service, which delivers this missing piece of functionality.  The Document ID service is activated as a feature at the Site Collection level and only applies to Document Libraries; it therefore only works for documents and not list items.  It assigns a unique ID to each document when the document is first created.  The Document ID is a numeric value, and SharePoint allows you to prefix it with a text value which is typically unique between Site Collections, e.g. “LEGAL”, “HRDOCS”, “FINANCE”.  The unique ID looks something like this:

LEGAL-1-101

Note that if you move a document then its Document ID does not change but if you copy a document, the new document will be assigned a new Document ID.
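
The difference between moving and copying can be sketched with a small model. This is not SharePoint code: the `SiteCollection` class, its method names, the URLs and the fixed middle segment of the ID are all hypothetical, chosen only to mirror the behaviour described above.

```python
from dataclasses import dataclass, replace
from itertools import count


@dataclass(frozen=True)
class Document:
    doc_id: str  # stable identifier, e.g. "LEGAL-1-101"
    url: str     # location-based address, changes when the document moves


class SiteCollection:
    """Toy stand-in for a Site Collection with Document IDs switched on."""

    def __init__(self, prefix: str, start: int = 101):
        self.prefix = prefix
        self._next = count(start)

    def _new_id(self) -> str:
        # The middle "1" is a fixed placeholder in this sketch (assumption).
        return f"{self.prefix}-1-{next(self._next)}"

    def create(self, url: str) -> Document:
        return Document(self._new_id(), url)

    def move(self, doc: Document, new_url: str) -> Document:
        # Moving changes the URL but keeps the Document ID.
        return replace(doc, url=new_url)

    def copy(self, doc: Document, new_url: str) -> Document:
        # Copying produces a new document with its own Document ID.
        return Document(self._new_id(), new_url)


legal = SiteCollection("LEGAL")
contract = legal.create("/sites/legal/Drafts/contract.docx")
moved = legal.move(contract, "/sites/legal/Signed/contract.docx")
copied = legal.copy(contract, "/sites/legal/Templates/contract.docx")

print(contract.doc_id, moved.doc_id, copied.doc_id)
# prints: LEGAL-1-101 LEGAL-1-101 LEGAL-1-102
```

A line of business system that stored `contract.doc_id` rather than `contract.url` would still find the document after the move, which is the linking scenario described earlier.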

How People are Using it

Document IDs are proving to be useful for organisations that have document centric processes that rely on being able to uniquely identify a document and/or provide rapid access to documents via search.

An example use case could play out as follows:

  • A legal firm creates a contract using Microsoft Word.
  • The document template automatically applies the Document ID to the document footer or as a reference.
  • The contract is emailed to a third party in PDF format for review.
  • The third party calls back to discuss the contract.
  • By asking the third party to state the document reference, the legal firm can instantly search for and view the contract.

This is a very simple example, but it shows the benefit of being able to access a document instantly via a Document ID search, rather than having to browse through sites, libraries and folders to locate the document.
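
The lookup in the last step of the use case does not have to go through the search box. To my knowledge, SharePoint 2010 also provides a redirect page, `DocIdRedir.aspx` under `_layouts`, that resolves a Document ID to the document's current location; treat the exact path as an assumption to verify in your own environment. The sketch below builds such a link; the host name and the `document_id_link` helper are hypothetical.

```python
from urllib.parse import urlencode


def document_id_link(site_collection_url: str, doc_id: str) -> str:
    """Build a location-independent link for a Document ID.

    Assumes the redirect page lives at /_layouts/DocIdRedir.aspx and
    takes the Document ID in an "ID" query parameter.
    """
    base = site_collection_url.rstrip("/")
    return f"{base}/_layouts/DocIdRedir.aspx?{urlencode({'ID': doc_id})}"


print(document_id_link("https://sharepoint.example.com/sites/legal", "LEGAL-1-101"))
# prints: https://sharepoint.example.com/sites/legal/_layouts/DocIdRedir.aspx?ID=LEGAL-1-101
```

A link of this form could be stored against a business record, or printed in a document footer alongside the reference itself.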

Potential Problem

One of the problems that I have with Document IDs is that they are only unique within the Site Collection and therefore depend upon the prefix to make them unique across the SharePoint Farm.  If you decide to use the same prefix in two Site Collections then there is a risk of duplicate IDs.  This issue won’t affect too many organisations but must be borne in mind when designing a large scale archiving or ECM solution.

One of the risks of having duplicate Document IDs is that if you perform a Document ID search then it will only return the first document that matches that ID.  There is therefore a risk of accessing the wrong document.  Again, this would probably only happen if you copied a document to another Site Collection which had the same Document ID prefix.
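
A simple design-time check is to list the prefix configured for each Site Collection and flag any that are reused. The sketch below works on a hand-written inventory, not on data read from SharePoint; the site URLs, document paths and function names are all hypothetical. The `first_match` helper only models the "first document wins" behaviour described above.

```python
from collections import Counter

# Hypothetical inventory: Site Collection URL -> configured Document ID prefix.
prefixes = {
    "/sites/legal": "LEGAL",
    "/sites/hr": "HRDOCS",
    "/sites/legal-archive": "LEGAL",  # same prefix reused by mistake
}


def duplicated_prefixes(config: dict[str, str]) -> dict[str, list[str]]:
    """Return each prefix that is used by more than one Site Collection."""
    counts = Counter(config.values())
    return {
        prefix: sorted(site for site, p in config.items() if p == prefix)
        for prefix, n in counts.items()
        if n > 1
    }


# Two documents that ended up with the same ID because of the shared prefix.
documents = [
    ("LEGAL-1-101", "/sites/legal/Contracts/nda.docx"),
    ("LEGAL-1-101", "/sites/legal-archive/2009/lease.pdf"),
]


def first_match(doc_id: str):
    """Return only the first document with this ID, or None if there is none."""
    return next((url for d, url in documents if d == doc_id), None)


print(duplicated_prefixes(prefixes))
# prints: {'LEGAL': ['/sites/legal', '/sites/legal-archive']}
print(first_match("LEGAL-1-101"))
# prints: /sites/legal/Contracts/nda.docx
```

The second document is never returned by `first_match`, which is the "wrong document" risk in miniature.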

My preference would be to be able to implement the ID at Farm level so that the Document ID is unique across all Site Collections.  Having the option to implement at the Site Collection or Farm level would be a useful addition to the feature and will hopefully be on Microsoft’s roadmap.

If you have software development skills in your organisation then it is possible to override the default Document ID behaviour with your own unique ID, which could be set Farm wide, but this is outside the scope of this article.

All in all it is a useful feature for those organisations looking to use SharePoint as a platform for ECM or archiving.

About rearcardoor

Chairman and founder of ImageFast Ltd, a leading UK ECM consultancy business and Microsoft Gold Partner. Over 20 years experience delivering successful ECM projects utilising scanning, data capture, document management, records management, workflow, BPM and SharePoint.